Skip to content

fix(mcp): reject JSON-RPC batches#2083

Open
shiju-nv wants to merge 1 commit into
NVIDIA:mainfrom
shiju-nv:fix/mcp-reject-json-rpc-batches
Open

fix(mcp): reject JSON-RPC batches#2083
shiju-nv wants to merge 1 commit into
NVIDIA:mainfrom
shiju-nv:fix/mcp-reject-json-rpc-batches

Conversation

@shiju-nv

@shiju-nv shiju-nv commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Summary

This PR rejects every top-level JSON array in MCP mode before member parsing, method policy evaluation, or upstream delivery. It leaves existing generic JSON-RPC batch behavior unchanged.

Related Issue

Closes #2082

Changes

  • Return the stable unsupported_mcp_batch classification as soon as the MCP parser sees a top-level array, preserve that exact detail in dedicated, route-selected, and explicit forward-proxy denial paths, and leave ordinary JSON-RPC parse diagnostics unchanged.
  • Regression coverage: Empty, request, notification, response, and mixed MCP arrays; valid single MCP request, notification, and response controls; one valid generic JSON-RPC batch parser control; a generic JSON-RPC batch relay control; direct and route-selected MCP denial under audit and enforce modes; exact HTTP 403 detail; relay completion; and deterministic zero upstream bytes.

Testing

  • mise run pre-commit passes
  • Unit tests added/updated
  • E2E tests added/updated (if applicable)

Checklist

  • Follows Conventional Commits
  • Commits are signed off (DCO)
  • Architecture docs updated (if applicable)

Reject top-level JSON arrays on MCP Streamable HTTP endpoints before policy evaluation or upstream forwarding.

Preserve single-message MCP handling and generic JSON-RPC batch behavior. Return a stable denial detail across direct, route-selected, and forward-proxy paths.

Signed-off-by: Shiju <shiju@nvidia.com>

@krishicks krishicks left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we avoid passing unsupported_mcp_batch through the generic human-readable reason string and then recovering its semantics through string equality? A structured hard-denial/error classification would let the response boundary choose the stable API code without jsonrpc_parse_rejection_reason and forward_l7_denial_detail both special-casing this value.

Also, the parser test already exercises every top-level array shape. Could we keep one representative batch for the direct/route-selected × audit/enforce relay matrix and remove mcp_rejects_jsonrpc_batches plus mcp_batch_rejection_preserves_single_mcp_and_generic_batch_controls? The latter’s assertions all describe behavior that predates this change.

}

#[test]
fn mcp_batch_rejection_preserves_single_mcp_and_generic_batch_controls() {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this test would have failed before this change, and it seems to duplicate parses_valid_batch_without_error. I would remove it.


/// Preserve stable classifier reasons while contextualizing ordinary JSON-RPC parse failures.
pub(crate) fn jsonrpc_parse_rejection_reason(error: &str) -> String {
if error == UNSUPPORTED_MCP_BATCH_REASON {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to special-case UNSUPPORTED_MCP_BATCH_REASON, but I don't think this is a good pattern. I would prefer it if the jsonrpc_info was the only source of this info.

) -> String {
// Stable classifier reasons are machine-readable API details. Policy and
// ordinary parse denials retain request context for human diagnostics.
if force_deny && reason == crate::l7::jsonrpc::UNSUPPORTED_MCP_BATCH_REASON {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another special-casing of UNSUPPORTED_MCP_BATCH_REASON which I think could be resolved through better design.

@ddurst-nvidia

Copy link
Copy Markdown
Contributor

Important

I feel like things like this would be best aligned with an mcp.versions: [] flag, that tightens the constraints based on supported MCP versions running through this specific endpoint. That doesn't exist today, and I'm not advocating this PR add it.

  • MCP 2025-03-26: requires JSON-RPC batching.
  • MCP 2025-06-18: removed JSON-RPC batching.

I don't expect to see it's return, but we should do our best to not exclude folks from being able to use older MCP servers at the gateway/proxy level.

I would suggest, that until a mcp.versions: [] style of gating is spec'd out and delivered, that this be a configurable option under the mcp.[options] section, as mcp.disable-batch with a default behavior of blocking batches, and allowing users to opt-into allowing it (a pointer in the docs as to which versions of MCP support it, would be helpful).

Note

If we intend to expose a key of json-rpc: {} configurable to protocol: mcp configuration, it could go there, but might be confusing for folks seeking just mcp configuration. If that becomes the case, I'd recommend that for now an mcp equivalent alias.

@krishicks

Copy link
Copy Markdown
Collaborator

I think we could also choose just not to reject JSON-RPC batches for MCP in any event, given some versions of the protocol require and some remove batching. We're not in the business of validating the MCP payload against any specific protocol version. We only need to know about the batching at all so we can validate the entire batch against the allowed calls. We know there will be structural differences in the payloads across different versions that require special handling, like the upcoming stateless changes, and we'll need to support that.

@ddurst-nvidia

Copy link
Copy Markdown
Contributor

@krishicks

I think we could also choose just not to reject JSON-RPC batches for MCP in any event, given some versions of the protocol require and some remove batching. We're not in the business of validating the MCP payload against any specific protocol version.

An MCP policy engine needs an explicit set of supported wire profiles and fail-closed behavior for structures it cannot safely interpret. We can offer that as individual options within the mcp.<option> flags, or we can do that via mcp.versions: [] (I prefer the latter, but accept that we need the former for now).

I want to be clear: for a policy engine, treating every version’s wire forms as universally acceptable imposes cumulative compatibility and validation costs on every MCP policy.

The "batch on"/"batch off" argument can be most cleanly seen in the reaction to the PR that removed batching from the spec, after it was merged. It's drama that we can avoid by providing this as a single mcp.{allow,disable}_batches option (no preference), and letting a policy author decide without forming an opinion beyond current defaults. What I do not want is for every MCP endpoint to incur that compatibility and validation behavior by default when modern MCP versions exclude batches.

Specifically, 2025-03-26 says implementations MAY send batches but MUST support receiving them, and that initialize MUST NOT appear in a batch. Beginning with 2025-06-18, a POST body MUST contain a single JSON-RPC message. That's RFC 2119 language, not descriptive guidance.

We know there will be structural differences in the payloads across different versions that require special handling, like the upcoming stateless changes, and we'll need to support that.

I think that is a strong argument in favor of per-version behavior, especially given the changes in the current draft.

The policy-relevant features of messages across protocol versions need to be enunciated for a policy engine to enforce them.

The draft says that selected body fields are mirrored into HTTP headers so intermediaries can "route and inspect requests without parsing the body." However, it also says that any server processing the body MUST validate that the header values match their corresponding body values. It further requires Mcp-Param-* headers to be constructed from the tool’s most recently obtained inputSchema. Since we're doing policy enforcement here, not just routing, validating that header/body alignment is an implicit part of our security contract.

For OpenShell to use those headers safely at the Agent Security Boundary, it would have two representations to reconcile and, for parameter headers, tool-schema state to maintain or consult. I do not want endpoints that are not using that MCP version profile to incur that additional validation complexity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

MCP endpoints accept JSON-RPC batch payloads

3 participants