Skip to content

provider refresh configure: accept secret material off-argv (file/stdin) so private keys aren't exposed in process listings #2104

Description

@hunglp6d

Problem Statement

openshell provider refresh configure accepts all refresh material — including secrets like a service-account private_key or an OAuth2 client_secret — only through --material KEY=VALUE command-line arguments. Secret values therefore appear in the host process table (ps aux), /proc/<pid>/cmdline, shell history, process accounting, and any audit/observability tooling that records command lines.

This affects every strategy carrying secret material — google_service_account_jwt (private_key), oauth2_client_credentials (client_secret), etc. — including OpenShell's own built-in GOOGLE_VERTEX_AI_SERVICE_ACCOUNT_TOKEN provider when configured via the CLI. OpenShell already treats this material as sensitive: --secret-material-key marks which keys are secret so they're redacted in storage and diagnostics — yet the value is still forced through argv, where redaction does not apply. The CLI knows the value is secret but has no way to receive it without exposing it. On the multi-admin / audited hosts where a gateway typically runs, this is a real credential exposure and contradicts OpenShell's own "secrets stay gateway-side / never exposed" posture.

Proposed Design

Add an off-argv ingestion path for secret material to provider refresh configure. The value still flows through the existing gRPC ConfigureProviderRefreshRequest.material map — only the CLI ingestion changes. Two complementary options (either/both):

  1. --secret-material-file KEY=PATH (recommended): the CLI reads the value for KEY from the file at PATH (e.g. a 0600 temp file the caller writes then deletes), merges it into the material map, and marks KEY secret. Symmetric with the existing --secret-material-key.
  2. --material-stdin: read KEY=VALUE lines (or a small JSON object) from stdin when piped. OpenShell already has a piped-stdin reader (run.rs ~2761, 4 MiB cap) for other commands, so this is consistent.

Precedence: off-argv values override/augment --material for the same key; keys supplied off-argv are auto-added to secret_material_keys. Non-secret metadata (client_email, scope) stays on --material. Example: openshell provider refresh configure <name> --credential-key K --strategy google-service-account-jwt --material client_email=... --material scope=... --secret-material-file private_key=/run/secrets/sa.pem — no secret in argv.

Alternatives Considered

  • --material KEY=@PATH value convention: smaller, but ambiguous with values legitimately starting with @ and silently changes existing --material semantics. Rejected (surprise/compat).
  • Env-var indirection (--material KEY=env:VAR): env is only marginally better than argv (/proc//environ), leaks to children, overloads the value grammar. Rejected.
  • Do nothing / document the risk: leaves a secret-in-argv anti-pattern in the product's own security-sensitive path. Rejected as the long-term answer.
  • Callers use the gRPC API directly (bypass CLI): forces every integrator to build an mTLS gRPC client; the CLI is the supported interface. Rejected.

Agent Investigation

Explored openshell-cli + openshell-server:

  • crates/openshell-cli/src/main.rs:898 — --material is Vec<String>; :902 --secret-material-key is names only. No file/stdin/env flag on the Configure subcommand.
  • crates/openshell-cli/src/run.rs:3974 parse_key_value_pairs — splits on =, stores value verbatim (no @file/env:).
  • crates/openshell-cli/src/run.rs:5298 — CLI forwards material into the gRPC request; the transport already carries it off-argv.
  • crates/openshell-server/src/provider_refresh.rs:506 / :487google_service_account_jwt / oauth2_client_credentials read secrets from the material map only.
  • crates/openshell-cli/src/run.rs:4335 read_gcloud_adc reads a file but is authorized_user-only and rejects service_account keys (:4376).
  • Stdin precedent: run.rs ~2761 (4 MiB reader).
    Fix is localized to the CLI ingestion boundary (main.rs add flag + run.rs provider_refresh_config merge file/stdin material before the gRPC call); server/gRPC/strategy unchanged.

Checklist

  • I've reviewed existing issues and the architecture docs
  • This is a design proposal, not a "please build this" request

Metadata

Metadata

Assignees

No one assigned

    Labels

    state:triage-neededOpened without agent diagnostics and needs triage

    Type

    No type

    Fields

    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions