Intelligent Internet

First Principles Sovereign AI

Most AI companies rent intelligence and compete at the UI. We ship the control points that turn intelligence into owned, verifiable work, in the open.


Website · Blog · Hugging Face · Symbioism


🏭 The Agentic Production Line

Agents don't fail for lack of intelligence. They fail when knowledge is stale, retrieval is expensive, output is unverified, and capability never reaches a surface users can own. So we build every stage of the line, not one layer of it:

| Control point | Why it matters | Open source |
| --- | --- | --- |
| 01 Capability Foundry: create capability, not wrappers | If you only rent frontier APIs, your ceiling is someone else's roadmap | II-Medical · II-Search · II-Thought |
| 02 Governed Context: turn raw knowledge into machine-usable supply | Agents fail when knowledge is scattered, stale, or outside source boundaries | II-Commons · II-Commons-Skills |
| 03 Retrieval Fabric: make search cheap, local, inspectable | Context is useless if agents can't search before every decision, tool call, or handoff | psql_bm25s |
| 04 Work Harness: completion under gates, not just generation | Agent output isn't work until it survives validators, evidence review, and replanning | II-Agent · II-Researcher · Zenith |
| 05 Owned Surfaces: land capability where work happens | Capability only compounds when people can run, fork, and extend it | CommonGround · CG-Cardbox · opencode-a2a |

Models can be rented. UIs can be copied. Control points compound.


🏆 Zenith: #1 on Frontier SWE

The clearest proof that the harness layer matters: on the independent Frontier SWE benchmark, GPT-5.5 running inside Zenith ranks #1 overall, ahead of every frontier model paired with its own native harness. The identical model on its native harness ranks #5. Same model, better control loop.

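| # | Model | Harness | Avg rank ↓ | Dominance ↑ |
| --- | --- | --- | --- | --- |
| 1 | GPT-5.5 | 🥇 Zenith | 2.06 | 92% |
| 2 | Claude Fable | Claude Code | 2.71 | 88% |
| 3 | Claude Opus 4.8 | Claude Code | 5.06 | 71% |
| 4 | GLM-5.2 | Claude Code | 5.31 | 69% |
| 5 | GPT-5.5 | Codex (native) | 5.53 | 68% |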

Metrics as reported by the Frontier SWE leaderboard. The full 15-entry table is in the Zenith results.

Zenith is our continuous-improvement harness for missions that run for days or weeks, where the dominant failure mode is premature completion. One orchestrator session reads task state each turn and decides whether to spawn workers and testers, register reusable skills, replan, or stop, all over MCP/ACP on top of Claude Code, Codex, or Hermes. In our published ablation across eight long-horizon tasks, Zenith achieves the best mean rank at less than half of RALPH's per-task cost ($176 vs $408).
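
The per-turn decision described above can be sketched roughly as follows. This is a hypothetical illustration, not Zenith's actual code: `TaskState`, `Action`, `decide`, and every field name are invented for this example, and the priority order of the rules is an assumption chosen to show how a policy can make stopping the last option instead of the default.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Action(Enum):
    """The choices the text lists for each orchestrator turn."""
    SPAWN_WORKER = auto()
    SPAWN_TESTER = auto()
    REGISTER_SKILL = auto()
    REPLAN = auto()
    STOP = auto()


@dataclass
class TaskState:
    """Hypothetical snapshot read at the start of each turn (all fields assumed)."""
    open_subtasks: int       # planned work not yet attempted
    untested_outputs: int    # worker results with no tester verdict yet
    failed_checks: int       # tester verdicts that rejected an output
    reusable_patterns: int   # solutions worth saving as a reusable skill


def decide(state: TaskState) -> Action:
    """Pick one action per turn; STOP is reachable only when nothing is pending."""
    if state.failed_checks > 0:
        return Action.REPLAN          # rejected work means the plan needs revisiting
    if state.untested_outputs > 0:
        return Action.SPAWN_TESTER    # do not count output as done before it is tested
    if state.reusable_patterns > 0:
        return Action.REGISTER_SKILL
    if state.open_subtasks > 0:
        return Action.SPAWN_WORKER
    return Action.STOP
```

Under these assumed rules, a state with untested output can never produce `STOP`, which is one simple way to guard against premature completion.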

📄 Technical report: From RALPH to Zenith: Designing Harnesses for Long-Running Agents


🚀 Flagship Projects

| Project | What it does |
| --- | --- |
| II-Agent | Open general agent framework: browser, code, files, sandboxed execution, documents, slides, multi-model routing |
| Zenith | #1 on Frontier SWE. A continuous-improvement harness for long-running agent tasks that turns Claude Code, Codex, or Hermes into a multi-agent mission orchestrator via MCP/ACP |
| II-Researcher | Deep-research agent: query decomposition, search generation, context compression, self-critique, and cited reports. Scores 84.1 on FRAMES |
| CommonGround | From isolated agents to shared work: records, evidence, handoffs, and decisions that persist beyond one run |
| psql_bm25s | Postgres-native exact BM25: mutable indexes, crash recovery, replication-friendly storage, SQL-native permissions |
| II-Commons | The knowledge supply chain: Wikipedia, PD12M, arXiv, and PubMed, parsed, embedded, indexed, and served with provenance |
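
For readers unfamiliar with BM25, the ranking function that psql_bm25s is named after can be sketched in a few lines. This is the textbook Okapi BM25 formula with a smoothed, non-negative IDF, written in plain Python for illustration. It is not the extension's code or SQL interface, and the function name `bm25_scores` and the defaults `k1=1.5`, `b=0.75` are assumptions for this example.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document (a list of terms) against the query terms."""
    if not docs:
        return []
    n_docs = len(docs)
    # Average document length; fall back to 1.0 if every document is empty.
    avg_len = sum(len(d) for d in docs) / n_docs or 1.0
    # Document frequency: how many documents contain each term at least once.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        # Length normalization: longer-than-average documents are penalized.
        norm = k1 * (1 - b + b * len(d) / avg_len)
        score = 0.0
        for term in query:
            if tf[term] == 0:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + norm)
        scores.append(score)
    return scores
```

Calling `bm25_scores(["bm25"], [["postgres", "bm25", "index"], ["agent", "harness"]])` gives the first document a positive score and the second a score of zero, since only the first contains the query term.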

🧠 Open Models & Datasets

Everything is on the 🤗 Hugging Face hub, with weights, data, and benchmark traces included.

| Release | Type | Highlight |
| --- | --- | --- |
| II-Medical-8B | Model | Specialist medical reasoning with SFT, RL, and safety stages |
| II-Search-4B | Model | Multi-hop search and tool-use behavior in a small model |
| II-Thought-RL-v0 | Dataset | 341,795 verified, machine-checkable RL problems across math, code, science, medicine |
| II-Medical-Reasoning-SFT | Dataset | Part of 2.2M medical reasoning rows behind the II-Medical series |
| wikipedia_en · arxiv · pd12m | Datasets | Public knowledge, processed for agents, with citations and source boundaries |
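
To make "machine-checkable" concrete, here is a minimal sketch of the kind of verifier that can turn a problem with a reference answer into a binary RL reward. It is a generic illustration, not the checker used for II-Thought-RL-v0: the `normalize` and `reward` functions and their normalization rules are assumptions made for this example.

```python
from fractions import Fraction


def normalize(answer: str) -> str:
    """Canonicalize a final answer so that equivalent strings compare equal."""
    text = answer.strip().rstrip(".").replace(",", "").lower()
    try:
        # Numeric answers such as "0.5", "1/2", and "0.50" map to one fraction.
        return str(Fraction(text))
    except (ValueError, ZeroDivisionError):
        # Non-numeric answers: collapse runs of whitespace and compare as text.
        return " ".join(text.split())


def reward(model_answer: str, reference: str) -> float:
    """Return 1.0 when the answers match after normalization, else 0.0."""
    return 1.0 if normalize(model_answer) == normalize(reference) else 0.0
```

A checker like this only covers short final answers. Code or proof-style problems would need a different verifier, such as running unit tests.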

📊 At a Glance

| 🏆 #1 | 5,000+ | 🧪 341K | 🏥 2.2M | 🤗 9 + 20 | 🏭 5/5 |
| --- | --- | --- | --- | --- | --- |
| on Frontier SWE (Zenith) | GitHub stars across the org | verified RL problems, open | medical reasoning rows | open models + datasets | production-line stages shipped, all open |

🧭 Why Open?

We publish the research, the data pipelines, the retrieval infrastructure, the harnesses, and the philosophy, because an intelligence economy only compounds when its production line is inspectable and forkable. Our long-form thesis lives at Symbioism: A Third Path for the Intelligence Age (source, naturally).

Earlier experiments like CoT-Lab, Common Chronicle, and CommonGround-legacy are archived in public. Every stage of the line started as an open experiment; the ones that worked became infrastructure.


Intelligence is our greatest resource. Together, we make it abundant.

ii.inc · Blog · 🤗 Hugging Face · Symbioism

Pinned

  1. ii-agent (Public)

    II-Agent: a new open-source framework to build and deploy intelligent agents

    Python · 3.4k stars · 517 forks

  2. CommonGround (Public)

    From isolated agents to shared work

    Python · 141 stars · 20 forks

  3. zenith (Public)

    Zenith: a continuous-improvement harness for long-running agent tasks. Turns Claude Code, Codex, or Hermes into a multi-agent mission orchestrator via MCP/ACP.

    Python · 193 stars · 23 forks

  4. psql_bm25s (Public)

    PostgreSQL BM25S extension

    PLpgSQL · 142 stars · 4 forks

  5. ii-researcher (Public)

    II-Researcher: a new open-source framework designed to aid building search / research agents

    Python · 494 stars · 71 forks
