fix(install): drop podman as hard dependency for RPM installation #2137
pimlock wants to merge 9 commits into
Remove Podman package and service coupling, rely on existing runtime detection, and surface actionable installer startup diagnostics. Document runtime prerequisites and VM packaging across install methods. Closes #2007 Signed-off-by: Piotr Mlocek <pmlocek@nvidia.com>
🌿 Preview your docs: https://nvidia-preview-pr-2137.docs.buildwithfern.com/openshell
Addressed the documentation feedback in
Implemented proposal: runtime-neutral installation with gateway-owned detection
The final implementation treats package installation and gateway health as separate outcomes while keeping optional compute runtimes outside OpenShell's ownership.
Contract
Detection and guidance
Gateway startup now detects all available runtimes in Kubernetes → Podman → Docker priority order. On every automatic start it:
This gateway-level warning is necessary because the environment can change after installation. For example, adding Podman to a Docker-only host leaves the running gateway unchanged, but the next restart selects Podman unless Docker is pinned.
Why this option
The approach preserves the installation invariants: packages remain runtime-neutral, detection remains observational, user configuration remains explicit, and the gateway is never exposed beyond loopback without an operator action. The installer provides immediate discoverability, while gateway startup provides durable guidance for later environment changes.
Alternatives explored
The final implementation combines the useful parts of the last two options: immediate installer guidance plus authoritative startup diagnostics, without automatic configuration changes.
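The priority order described in this comment (Kubernetes, then Podman, then Docker) can be illustrated with a small shell sketch. This is not the gateway's actual detection code; `select_driver` and `needs_pinning_hint` are hypothetical helper names, and the list of detected runtimes is passed in as arguments rather than probed.

```shell
#!/bin/sh
# Hypothetical helper: choose a driver from the runtimes given as arguments,
# walking the Kubernetes -> Podman -> Docker priority order.
select_driver() {
    for candidate in kubernetes podman docker; do
        for detected in "$@"; do
            if [ "$detected" = "$candidate" ]; then
                printf '%s\n' "$candidate"
                return 0
            fi
        done
    done
    # Nothing matched: mirror the "none" value that detect-driver reports.
    printf 'none\n'
}

# Hypothetical helper: succeeds when more than one runtime was detected,
# which is the situation where pinning guidance is worth printing.
needs_pinning_hint() {
    [ "$#" -gt 1 ]
}
```

With both Docker and Podman present, `select_driver docker podman` prints `podman`, which is the Docker-only-host-gains-Podman case mentioned above; pinning the driver is what keeps the selection stable across restarts.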
@maxamillion @drew this is now implemented as a runtime-neutral installation contract. Problem: a Docker-only RHEL host could not install OpenShell because the RPM required Podman. Implemented outcome:
The main installation invariants are: native package installation, CLI and gateway availability, a supervised user service, preserved user configuration, loopback/TLS defaults, no management of optional runtimes, explicit operator-owned driver pinning, and a clear split between package success and service health.
We explored:
The selected design combines installer detection for immediate discoverability with gateway startup warnings for long-term correctness. This covers runtimes installed after OpenShell while keeping detection observational and all configuration changes explicit.
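The split between package success and service health can be sketched as a reporting function. This is an illustrative sketch only, not the code in install.sh: `report_outcomes` is a hypothetical name, its two arguments are status codes supplied by the caller, and the specific exit codes 1 and 2 are assumptions (the PR only states that the installer returns nonzero with diagnostics).

```shell
#!/bin/sh
# Hypothetical helper: report the two outcomes independently.
#   $1 = status of the package installation step (0 means success)
#   $2 = status of the gateway health check (0 means healthy)
report_outcomes() {
    package_status=$1
    gateway_status=$2

    if [ "$package_status" -ne 0 ]; then
        echo "package installation failed" >&2
        return 1   # assumed exit code
    fi
    echo "package installation succeeded"

    if [ "$gateway_status" -ne 0 ]; then
        # The package is installed, but the service is not healthy yet.
        echo "gateway service is not healthy" >&2
        return 2   # assumed exit code
    fi
    echo "gateway service is healthy"
}
```

Calling `report_outcomes 0 1` still prints the package success line on standard output before returning nonzero, so a caller can tell "installed but not yet healthy" apart from "not installed".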
report_detected_compute_driver() {
    _gateway_bin="${OPENSHELL_GATEWAY_BIN:-openshell-gateway}"
    if ! _detected_driver="$(as_target_user "$_gateway_bin" config detect-driver)"; then
Not sure if exposing the detected driver just for this is worth it. Arguably it provides a slightly better UX, guiding the user, rather than requiring after-the-fact troubleshooting.
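For context on the UX trade-off raised here, the kind of guidance an installer could derive from the detect-driver output might look like the sketch below. It assumes only the four output values named in the PR description (kubernetes, podman, docker, none); `advise_for_driver` is a hypothetical function, and the message wording is illustrative rather than taken from install.sh.

```shell
#!/bin/sh
# Hypothetical helper: turn a detect-driver value into advisory text.
# It only prints guidance; it never changes driver selection or listeners.
advise_for_driver() {
    case "$1" in
        kubernetes|docker)
            printf 'detected compute driver: %s\n' "$1"
            ;;
        podman)
            printf 'detected compute driver: podman\n'
            printf 'note: rootless Podman needs an explicit bind change before sandboxes can call back\n'
            ;;
        none)
            printf 'no compute runtime detected; the gateway keeps retrying and can recover once one is installed\n'
            ;;
        *)
            # Anything else is treated as an unexpected value.
            printf 'unexpected detect-driver output: %s\n' "$1" >&2
            return 1
            ;;
    esac
}
```

A caller would pass the captured value, for example `advise_for_driver "$_detected_driver"`, and treat a nonzero return as "no guidance available" rather than as an installation failure.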
Stop enumerating the VM driver libexec search paths in the RPM CONFIGURATION doc; defer to the driver_dir setting and the gateway's startup error, which lists the directories it actually searched. Correct the VM README (the single detailed source) to match the four paths the gateway searches. Align the QUICKSTART detection order with the other docs (Kubernetes, then Podman, then Docker) and drop the no-op bind-address reset from the config-set example. Signed-off-by: Piotr Mlocek <pmlocek@nvidia.com>
Summary
Make package installation runtime-neutral so installing OpenShell does not require Podman, Docker, or another optional compute runtime. The RPM no longer depends on Podman, package-managed gateways retain automatic driver selection, and the installer reports package installation and gateway health as separate outcomes.
This version keeps runtime detection inside the gateway. install.sh uses the read-only detection CLI for immediate guidance, while every gateway start logs the selected driver, all detected alternatives, multi-runtime pinning guidance, and the rootless Podman bind command when applicable.
Related Issue
Closes #2007
Installation Invariants
openshell and openshell-gateway artifacts in the package manager's standard executable path.
Changes
podman.socket ordering from the RPM spec.
127.0.0.1:17670.
openshell-gateway config detect-driver with stable kubernetes, podman, docker, or none output.
install.sh advisory: it never changes driver selection or listener configuration.
openshell-gateway config set for typed, validated, comment-preserving, atomic TOML updates.
package installation succeeded separately from gateway service is healthy.
install.sh and gateway diagnostics.
Runtime Scenarios
install.sh returns nonzero with diagnostics. The service keeps retrying and can recover after a runtime is installed. Rootless Podman still requires the explicit bind change before sandboxes can call back.
openshell-gateway config set --compute-driver <driver> guidance.
Why This Option
This design follows the invariants above: package installation stays independent of optional runtimes, configuration remains operator-owned and visible in TOML, loopback is never widened implicitly, and runtime changes after installation remain observable. Keeping the warning in the gateway is important because an installer-only warning cannot cover Docker or Podman installed later.
Automatic selection remains the zero-configuration recovery path. Explicit config set commands are the stability mechanism when an operator wants a fixed driver.
Alternatives Explored
install.sh. Rejected because repository setup, rootless configuration, daemon lifecycle, licensing, and enterprise policy vary across supported systems.
install.sh --compute-driver / OPENSHELL_INSTALL_COMPUTE_DRIVER. Implemented during exploration, then removed because it duplicated gateway configuration policy and made installation persist a driver choice.
OPENSHELL_DRIVERS in service definitions. Rejected because environment values outrank TOML and make later config edits appear ineffective.
0.0.0.0 when Podman is detected. Rejected because runtime probing must not silently expose the gateway on host interfaces.
install.sh. Insufficient because runtimes can be installed or removed after installation.
detect-driver for immediate Podman guidance.
Testing
mise run pre-commit passes
mise run test passes (included by mise run pre-commit)
mise run ci passes (mise run pre-commit is the ci alias)
Focused coverage:
Checklist