Skip to content

Replace shell wrapper with a Go wrapper #700

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

jfroy
Copy link

@jfroy jfroy commented Sep 18, 2024

This PR replaces the shell wrapper used by the container toolkit installed with a Go program. This allows the GPU operator to install the toolkit on distributions that do not have a shell, such as Talos Linux.

The implementation of the Go wrapper uses side files (.argv, .envv) to provide injected arguments and environment variables, which are prepended to those supplied to the wrapper.

For environment variables, special prefixes can be used to prepend or appent to a colon-delimited list variable, such a PATH. '<' is used to prepend and '>' is used to append. For example, <PATH=foo will prepend foo to PATH.

Finally, the wrapper includes special logic to detect that it is wrapping one of the runtime programs and if so check if the nvidia kernel modules are loaded. If not, the wrapper will execute runc instead.

@jfroy
Copy link
Author

jfroy commented Sep 18, 2024

An alternative to the side files would be to embed the arguments and environments in custom ELF sections. It felt more opaque than side files, but let me know if you'd prefer that approach and I'll implement it.

@elezar elezar self-requested a review September 18, 2024 19:56
@elezar elezar self-assigned this Sep 18, 2024
Copy link
Member

@elezar elezar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jfroy one thought I would have here is whether it makes sense to rather pull this functionality into the main runtime and feature gate it or obfuscate it behind a mode flag. We already produce mode-specific binaries for cdi and legacy for use on the operator and this could be an extension of that.

Not sure what @klueska's thoughts are on this though.

@jfroy
Copy link
Author

jfroy commented Sep 18, 2024

The nvidia module detection logic seems like a natural fit to be moved in the runtime program. It's less clear that the argv and envv stuff would fit there -- the purpose of that indirection is to allow the gpu-operator to customize the installation of the toolkit to the cluster, and to its parameters (e.g. host-provided driver vs driver container, etc).

--

As a third alternative implementation, the wrapper could look for a dot-config file recursively up the directory tree, à la .gitconfig. This would be a good fit as the wrappers mostly all have the same argv and envv computed by the toolkit installer.

@elezar
Copy link
Member

elezar commented Oct 30, 2024

As a third alternative implementation, the wrapper could look for a dot-config file recursively up the directory tree, à la .gitconfig. This would be a good fit as the wrappers mostly all have the same argv and envv computed by the toolkit installer.

It may not have to be up the tree. We could create a single .env file in the "toolkit" root and read it in the current folder?

By the way, I have created #763 which should allow us to simplify this implementation further since we would not have to deal with command line arguments in addition to envvironment variables.

if err := scanner.Err(); err != nil {
log.Fatalf("failed to read argv file: %v", err)
}
return append(env, os.Environ()...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't later envvars override earlier ones? Should we not reverse the ordering of this?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In golang's1 and glibc's2 implementations, the first entry wins. I assume this is true for other languages (they probably all followed glibc). POSIX3 is mute on the required behavior.

Footnotes

  1. https://cs.opensource.google/go/go/+/refs/tags/go1.23.2:src/syscall/env_unix.go;l=34

  2. https://sourceware.org/git/?p=glibc.git;a=blob;f=stdlib/getenv.c;hb=HEAD

  3. https://pubs.opengroup.org/onlinepubs/9799919799/functions/getenv.html

@jfroy
Copy link
Author

jfroy commented Oct 30, 2024

As a third alternative implementation, the wrapper could look for a dot-config file recursively up the directory tree, à la .gitconfig. This would be a good fit as the wrappers mostly all have the same argv and envv computed by the toolkit installer.

It may not have to be up the tree. We could create a single .env file in the "toolkit" root and read it in the current folder?

Assuming all the tools can use a single set of environment variables, that would work,

By the way, I have created #763 which should allow us to simplify this implementation further since we would not have to deal with command line arguments in addition to envvironment variables.

If I remember correctly, this means the patch could drop the ability to set arguments and only handle environment variables.

@jfroy jfroy force-pushed the go-wrapper branch 2 times, most recently from 753b731 to 79a3a99 Compare October 30, 2024 15:58
jfroy added 4 commits November 6, 2024 14:02
Some platforms and Kubernetes distributions do not include a shell. This
patch replaces the shell wrapper scripts with a small Go program.

Signed-off-by: Jean-Francois Roy <[email protected]>
This patch modifies the the container toolkit installer, used by the
GPU operator, to use the new Go wrapper program.

Signed-off-by: Jean-Francois Roy <[email protected]>
Signed-off-by: Jean-Francois Roy <[email protected]>
This will be handled by the runtime itself.

Signed-off-by: Jean-Francois Roy <[email protected]>
@jfroy
Copy link
Author

jfroy commented Feb 19, 2025

@elezar I'd like to get this over the finish line. How do you you feel about your other PR?

Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR replaces the existing shell-based container toolkit wrapper with a Go-based wrapper, and updates the toolkit’s installation code and tests to use sidecar .argv and .envv files for argument and environment injection.

  • Introduces wrapper.go to read .argv/.envv files and unix.Exec the real binary.
  • Refactors executable.go to install the Go wrapper and generate .argv/.envv files instead of embedding shell scripts.
  • Updates installation paths and tests (runtime.go, toolkit.go, executable_test.go, runtime_test.go) to match the new wrapper interface.

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
tools/container/wrapper/wrapper.go New Go wrapper reading .argv/.envv and executing real
tools/container/toolkit/toolkit.go Switched from shell wrapper naming and env to envm
tools/container/toolkit/runtime.go Removed embedded shell preLines; now uses envm
tools/container/toolkit/runtime_test.go Adjusted tests to validate new envm and naming logic
tools/container/toolkit/executable.go Reworked installWrapper to copy Go wrapper and write side files
tools/container/toolkit/executable_test.go Updated test suite to read and verify .argv and .envv
Comments suppressed due to low confidence (2)

tools/container/toolkit/executable.go:36

  • [nitpick] The field name envm is not immediately clear; consider renaming it to envVars or envMap for better readability.
envm   map[string]string

tools/container/wrapper/wrapper.go:63

  • There are no tests covering makeArgv and makeEnvv; adding unit tests for these functions would ensure correct parsing of .argv and .envv files.
func makeEnvv(program string) []string {

env = append(env, kv[0]+"="+kv[1])
}
if err := scanner.Err(); err != nil {
log.Fatalf("failed to read argv file: %v", err)
Copy link
Preview

Copilot AI May 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message refers to the argv file when it’s inside makeEnvv. It should mention the env file (e.g., "failed to read env file").

Suggested change
log.Fatalf("failed to read argv file: %v", err)
log.Fatalf("failed to read env file: %v", err)

Copilot uses AI. Check for mistakes.

Comment on lines +30 to +40
func main() {
program, err := os.Executable()
if err != nil {
log.Fatalf("failed to get executable: %v", err)
}
argv := makeArgv(program)
envv := makeEnvv(program)
if err := unix.Exec(program+".real", argv, envv); err != nil {
log.Fatalf("failed to exec %s: %v", program+".real", err)
}

Copy link
Preview

Copilot AI May 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original shell wrapper included logic to detect loaded NVIDIA kernel modules and fall back to runc if missing; this Go wrapper always execs the real binary and omits that runtime check.

Suggested change
func main() {
program, err := os.Executable()
if err != nil {
log.Fatalf("failed to get executable: %v", err)
}
argv := makeArgv(program)
envv := makeEnvv(program)
if err := unix.Exec(program+".real", argv, envv); err != nil {
log.Fatalf("failed to exec %s: %v", program+".real", err)
}
func checkNvidiaModules() bool {
f, err := os.Open("/proc/modules")
if err != nil {
log.Printf("failed to open /proc/modules: %v", err)
return false
}
defer f.Close()
scanner := bufio.NewScanner(f)
for scanner.Scan() {
if strings.HasPrefix(scanner.Text(), "nvidia") {
return true
}
}
if err := scanner.Err(); err != nil {
log.Printf("failed to read /proc/modules: %v", err)
}
return false
}
func main() {
program, err := os.Executable()
if err != nil {
log.Fatalf("failed to get executable: %v", err)
}
if !checkNvidiaModules() {
log.Println("NVIDIA kernel modules not detected. Falling back to runc.")
if err := unix.Exec("runc", os.Args, os.Environ()); err != nil {
log.Fatalf("failed to exec runc: %v", err)
}
return
}
argv := makeArgv(program)
envv := makeEnvv(program)
if err := unix.Exec(program+".real", argv, envv); err != nil {
log.Fatalf("failed to exec %s: %v", program+".real", err)
}

Copilot uses AI. Check for mistakes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants