The Symbiotic Alignment paper is among the most serious attempts yet to ground AI alignment in a relational rather than individualist account of mind. Its formal core — a collective regularization term that cannot be decomposed into any sum of individual incentives — is a genuine advance, and its critique of RLHF as a monopolization of ground truth is sharper than most of the field’s self-examination. Read across the FP1 stack, three things hold at once: the mathematics is sound, the societal-scale application is a research agenda rather than a result, and the question on which the whole program turns — who gardens the commons, and by what right — is the one the elegant formalism is structured to look away from. We offer this as a stress test, and a collaboration.

Executive Signal Summary

The math is the contribution; the application is a program. The non-decomposable collective regularization term is a real formal advance — the precise signature of behavior no single agent can produce alone. The leap from that term to a working account of polarization, filter bubbles, and democratic repair is named across the paper’s research agenda, not yet built.

The critique of RLHF is sharper than the proposed cure. The diagnosis — that reinforcement learning from human feedback concentrates the authority to define “preferred” answers in a handful of developers, and that this is a structural democratic risk — is the paper’s strongest and most independently defensible claim. It survives whether or not collective predictive coding is the right remedy.

The framework is incentive-incompatible with the actors who own the substrate. Symbiotic alignment asks systems that optimize for engagement — that is, for individual reward — to adopt a collective objective that lowers that reward. Manticus reads the binding constraint as ownership of the coordination layer, not the choice of training algorithm. A correct diagnosis paired with a cure the patient is paid not to take is not yet a treatment.

“Bridging AI” shares a mechanism with influence operations, and the safeguard is unbuilt. A system that reshapes the collective belief distribution to lower polarization uses the same machinery as one that reshapes it to manufacture consensus. The paper cites the manipulation-versus-inaction trap directly and leaves it open. A benevolent mandate is handed to systems whose method is indistinguishable from persuasion, with the brake admittedly not yet designed.

The gardener has no garden, and the mathematics forbids one. The paper’s governing metaphor is horticultural: cultivate conditions, do not impose control. But a gardener who decides which divisions to bridge is the privileged supervisor the framework’s own math was built to eliminate. Darśan reads the frame as in tension with the formalism it sits atop.

Minimizing collective surprise is not the same as flourishing. Applied naively, the framework cannot tell a wall from a frontier — cannot distinguish manufactured polarization from the dissent that is moral progress. Every moral advance begins as a minority breaking the shared reality. The paper half-knows this and reaches for care ethics; the principle that would actually do the work is still missing.

The deepest significance is metaphysical — and it is our last prediction arriving. The paper grounds alignment in a collective, emergent, relational account of intelligence drawn substantially from East Asian and continental sources: the “WE-turn,” the Kyoto School, plurality. In NCR-2026-006 we forecast that a non-Anglophone tradition would produce an ontologically distinct alignment framework. It is here, ahead of schedule.

I. The Argument in Brief

The paper’s premise is that the dominant alignment paradigm is built for the wrong problem. RLHF, Direct Preference Optimization, Constitutional AI, and the scalable-oversight program all treat alignment as the task of making one model’s outputs conform to human-specified criteria: a unilateral, top-down relationship in which a supervisor holds a fixed ground truth and the model is trained to obey it. The authors argue this is structurally unable to address the harms that actually matter now, which are not single-model errors but emergent properties of the human–AI ecosystem — filter bubbles, affective polarization, the erosion of any shared reality. You cannot fix a collective dynamic by controlling one model’s tokens.

Their proposal, symbiotic alignment, reframes the relationship from control to co-evolution. AI is not a tool to be constrained but a participant in what they call a symbol emergence system — a population of humans and machines negotiating shared meaning from the bottom up. The computational substrate is collective predictive coding: the claim that the emergence of shared symbols, language, and norms can be modeled as decentralized Bayesian inference performed by a group of agents jointly minimizing a quantity they term collective free energy.

The formal move is precise and worth stating exactly, because it is the paper’s most durable contribution. Standard multi-agent reinforcement learning, read through a control-as-inference lens, decomposes cleanly: the group objective is just a sum, over agents, of each agent’s private reward and its own information gain. Nothing in it couples the agents except through the environment. Collective predictive coding adds one term that does not decompose — a collective regularization term measuring the divergence between the population’s shared symbol distribution and a common prior. This term cannot be written as any sum of per-agent utilities, because the quantity it depends on is defined only at the level of the whole group. Its gradient points not toward any individual’s benefit but toward the coherence of the group’s shared representation. That irreducibility is the formal signature of genuine sociality, and it is what the authors argue standard reward maximization — and by extension RLHF — structurally lacks.

The term that does the work
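\[ G_{\mathrm{MARL}} = \sum_{k=1}^{K} \big[\, \text{individual reward}_k + \text{individual information gain}_k \,\big] \]
Factorizes completely: a sum of separate, self-interested agents.

\[ G_{\mathrm{CPC}} = D_{\mathrm{KL}}\big[\, q(w \mid z_1, \dots, z_K) \,\big\|\, p(w) \,\big] + G_{\mathrm{MARL}} \]
The leading KL term is the collective regularization term; it cannot be split across agents.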

The bracketed term couples every agent through a single shared symbol w. It cannot be rewritten as a sum of per-agent utilities, because the posterior it depends on is defined only at the population level. That irreducibility is the formal signature of behavior no individual can produce alone — the mathematical content of “social.” It is the paper’s real contribution, and the part that survives nearly every disagreement below.
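A small numerical check can make the non-decomposability claim tangible. The sketch below is illustrative only and rests on assumptions not taken from the paper: the population posterior q(w | z1…zK) is stood in for by a product-of-experts pooling of two agents' beliefs, and `per_agent_sum` is a hypothetical stand-in for any objective of the G_MARL kind. It uses the fact that any function of the form g(a) + h(b) has a zero "interaction contrast" f(a1,b1) - f(a1,b2) - f(a2,b1) + f(a2,b2), so a nonzero contrast rules out every additive split, not just one particular one.

```python
import math

def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]

def pooled_posterior(agent_beliefs):
    # ASSUMPTION: product-of-experts pooling stands in for q(w | z_1..z_K).
    pooled = [math.prod(column) for column in zip(*agent_beliefs)]
    return normalize(pooled)

def kl_divergence(q, p):
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

def collective_term(belief_1, belief_2, prior):
    # KL between the pooled (population-level) posterior and the common prior.
    return kl_divergence(pooled_posterior([belief_1, belief_2]), prior)

def per_agent_sum(belief_1, belief_2, prior):
    # Hypothetical separable objective: one private term per agent, then summed.
    return kl_divergence(belief_1, prior) + kl_divergence(belief_2, prior)

def interaction(objective, a1, a2, b1, b2, prior):
    # Zero for every objective that can be written as g(agent 1) + h(agent 2).
    return (objective(a1, b1, prior) - objective(a1, b2, prior)
            - objective(a2, b1, prior) + objective(a2, b2, prior))

prior = [1 / 3, 1 / 3, 1 / 3]
a1, a2 = [0.7, 0.2, 0.1], [0.1, 0.2, 0.7]   # two candidate beliefs for agent 1
b1, b2 = [0.6, 0.3, 0.1], [0.1, 0.3, 0.6]   # two candidate beliefs for agent 2

separable_gap = interaction(per_agent_sum, a1, a2, b1, b2, prior)
collective_gap = interaction(collective_term, a1, a2, b1, b2, prior)
print(f"separable objective contrast:  {separable_gap:.6f}")
print(f"collective term contrast:      {collective_gap:.6f}")
```

Under these assumptions the separable objective's contrast vanishes up to floating-point error while the collective term's does not, which is the sense in which the KL term couples the agents. A different pooling rule would change the numbers but not the logic of the test.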

A natural objection is that computing a population-level posterior requires central coordination. The paper’s answer rests on a prior, proven result: the Metropolis-Hastings naming game — a simple turn-taking exchange in which a speaker proposes a symbol and a listener accepts or rejects it — is mathematically equivalent to Markov-chain Monte Carlo sampling of the collective posterior. Agents performing local language games are, without knowing it, doing decentralized Bayesian inference. From this the paper recovers a striking reinterpretation of plurality: a healthy diverse society is a stable multimodal distribution of shared beliefs; polarization is that distribution trapped in separate peaks divided by high energy barriers; and the work of alignment is not to collapse the peaks into one consensus but to lower the barriers enough that the modes can negotiate and coexist.
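The equivalence described above can be illustrated with a deliberately stripped-down simulation. This is a sketch under stated assumptions, not the paper's construction: each agent holds a fixed categorical belief over three symbols (no learning of internal states), the speaker samples a proposal from its own belief, and the listener accepts with probability min(1, p_listener(proposal) / p_listener(current)). That acceptance rule is the standard Metropolis-Hastings ratio for an independence proposal when the target is the normalized product of the two beliefs; the function names and the product-form target are choices made for this example.

```python
import random
from collections import Counter

def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]

def mh_naming_game(belief_a, belief_b, steps, seed=0):
    """Alternate speaker and listener roles; return empirical symbol frequencies.

    Beliefs must be strictly positive so the acceptance ratio is well defined.
    """
    rng = random.Random(seed)
    symbols = list(range(len(belief_a)))
    beliefs = [belief_a, belief_b]
    current = 0                      # arbitrary initial shared symbol
    counts = Counter()
    for t in range(steps):
        speaker, listener = (0, 1) if t % 2 == 0 else (1, 0)
        # Speaker proposes a symbol drawn from its own belief.
        proposal = rng.choices(symbols, weights=beliefs[speaker])[0]
        # Listener judges the proposal using only its own belief.
        ratio = beliefs[listener][proposal] / beliefs[listener][current]
        if rng.random() < min(1.0, ratio):
            current = proposal
        counts[current] += 1
    return [counts[s] / steps for s in symbols]

def product_of_experts(belief_a, belief_b):
    # ASSUMED collective target: normalized product of the two beliefs.
    return normalize([a * b for a, b in zip(belief_a, belief_b)])

belief_a = [0.6, 0.3, 0.1]
belief_b = [0.2, 0.3, 0.5]
empirical = mh_naming_game(belief_a, belief_b, steps=100_000)
target = product_of_experts(belief_a, belief_b)
tv_distance = 0.5 * sum(abs(e - t) for e, t in zip(empirical, target))
print("empirical:", [round(x, 3) for x in empirical])
print("target:   ", [round(x, 3) for x in target])
```

Neither agent ever computes the joint distribution, yet the long-run symbol frequencies approach it. What the toy does not show is exactly what Vera flags later: whether anything like this holds for open-ended agents at societal scale.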

The paper closes with a research agenda in three parts: designing AI agents that contribute to the symbol system rather than merely consuming from it (“co-creative learning”); designing AI that bridges human divides (a “Habermas Machine” that maps differing internal states onto shared symbols and lowers the barriers between worldviews); and designing the ecosystem’s mechanisms — a role the authors borrow from Tang and Weyl and call gardening rather than architecting. It also confronts, honestly, the objection that minimizing collective free energy might favor totalitarian uniformity — the “dystopia of optimization” — and reaches toward care ethics and virtue ethics as the evaluative frame most native to the collective term. The stated conclusion is appropriately modest: this is a paradigm proposal and a research program, an invitation to the field rather than a finished method. We take it in that spirit and test it accordingly.
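The barrier-lowering picture that the bridging agenda relies on can be made concrete in one dimension. The sketch below is a toy under assumptions of its own: beliefs live on a single opinion axis, the collective distribution is an equal mixture of two Gaussians, energy is taken as the negative log density, and "lowering the barrier" is modeled simply by widening each camp's spread. The paper's proposed mechanism (communicative variables generated by a bridging agent) is not modeled here.

```python
import math

def mixture_density(x, centers, sigma):
    # Equal-weight Gaussian mixture; centers and sigma are illustrative values.
    norm = len(centers) * sigma * math.sqrt(2 * math.pi)
    return sum(math.exp(-((x - c) ** 2) / (2 * sigma ** 2)) for c in centers) / norm

def find_modes(density):
    # Indices of local maxima on the grid (ties resolved to the left point).
    return [i for i in range(1, len(density) - 1)
            if density[i] > density[i - 1] and density[i] >= density[i + 1]]

def barrier_height(density):
    """Energy gap between the shallower mode and the saddle, with E = -log p."""
    modes = find_modes(density)
    left, right = modes[0], modes[-1]
    saddle = min(density[left:right + 1])
    shallower_mode = min(density[left], density[right])
    return math.log(shallower_mode) - math.log(saddle)

grid = [i * 0.01 - 5.0 for i in range(1001)]
centers = [-2.0, 2.0]

polarized = [mixture_density(x, centers, sigma=0.5) for x in grid]
bridged = [mixture_density(x, centers, sigma=1.0) for x in grid]

barrier_polarized = barrier_height(polarized)
barrier_bridged = barrier_height(bridged)
print(f"modes: {len(find_modes(polarized))} vs {len(find_modes(bridged))}")
print(f"barrier: {barrier_polarized:.2f} vs {barrier_bridged:.2f}")
```

In this toy both configurations keep two modes, so plurality is preserved while the barrier between camps drops. Note what the calculation cannot say: nothing in the barrier height indicates whether a given barrier ought to be lowered, which is the evaluative gap the later sections press on.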

II. The Scorecard

Seven elements of the paper, scored on the structural support FP1 reads in each — not on whether we find the vision attractive, but on what the work as written actually establishes.

Element | FP1 Reading | Notes
The collective term (non-decomposability) | Strong | The decomposition is clean, checkable, and largely definitional. Identifying the non-factorizable collective regularization term as the formal signature of genuine sociality is the paper’s most durable contribution and survives almost every disagreement that follows.
Decentralization (naming game = MCMC) | Strong / Weak | The equivalence is a proven theorem, but only for constrained settings (category formation, small agent counts). Its extension to open-ended LLM agents at societal scale, which is what the paper is actually about, is asserted as agenda, not shown. Strong in domain; unestablished at the target scale.
RLHF as “monopolization of ground truth” | Strong | The sharpest prose in the paper. The claim that a few developers hold unaccountable authority to define “preferred” answers, and that this is a structural democratic risk, is correct and well-sourced. It survives independently of whether CPC is the right cure.
CPC as a model of real polarization | Weak | As delivered, entirely prospective. The extensions needed (co-evolving network topology, semantic divergence, affective dynamics) are each named as something the framework “must be extended to” capture. None is built. A promissory note, and an honest one, but a note.
Plurality as stable multimodal distribution | Contested | An elegant formalization of a genuine tension. But the stability of multimodality, as opposed to its collapse toward domination or fission, is assumed, not demonstrated. The formalism represents the tension cleanly without showing it can be held.
Evaluative layer (does CFE-min imply “good”?) | Contested | The paper honestly wrestles with the “dystopia of optimization,” but its rebuttal (totalitarianism raises peripheral prediction error) is asserted and sits in tension with its own later admission that ASI could collapse the collective posterior. Care ethics is a hypothesized fit, not a worked criterion. The gap between “symbiotic” and “good” stays open.
The “gardener” mechanism-design program | Weak | Communication topology, protocol, and incentive structure are the right objects. But the program never identifies who holds these levers or by what authority: the controller is missing, mirroring the substrate-ownership problem Manticus flags.

III. The Three Readings

Each FP1 correspondent operates within an enforced method and reads the others by name. Vera owns the evidence grades; Manticus and Darśan consume them and may not restate them. At least one substantive disagreement is required. Two dated, falsifiable claims per correspondent are logged to the FP1 Call Ledger and scored on their resolution dates. We run the canonical order: evidence first, then strategy, then orientation.
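For readers who want to track the ledger themselves, the record structure is simple enough to sketch. The fields below mirror what each ledger entry in this review states (identifier, confidence, resolution date, falsifier). The scoring rule is an assumption: the review says claims are scored on resolution but does not name a rule, so the Brier score is used here as a common choice for probabilistic forecasts. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class LedgerClaim:
    claim_id: str
    confidence: float               # stated probability that the claim holds
    resolves: date
    falsifier: str
    outcome: Optional[bool] = None  # None until resolved; True if the claim held

def brier_score(confidence: float, outcome: bool) -> float:
    # ASSUMED scoring rule: squared error between forecast and 0/1 outcome.
    return (confidence - (1.0 if outcome else 0.0)) ** 2

def mean_brier(claims):
    """Average Brier score over resolved claims, or None if none have resolved."""
    resolved = [c for c in claims if c.outcome is not None]
    if not resolved:
        return None
    return sum(brier_score(c.confidence, c.outcome) for c in resolved) / len(resolved)

ledger = [
    LedgerClaim("NC-007-V-01", 0.75, date(2027, 12, 31),
                "a validated scale estimator is published"),
    LedgerClaim("NC-007-M-01", 0.85, date(2027, 12, 31),
                "a named lab ships such a production objective"),
]
print(mean_brier(ledger))  # no claim has resolved yet
```

Lower is better under this rule: a 75% call that holds scores 0.0625, and the same call that fails scores 0.5625, so overconfidence is penalized more heavily than hedged error.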

Vera · Evidence & Indicators · What is proven; what is proposed

This is a position paper, and it should be read as one. Its claims occupy three very different epistemic registers, and the paper’s persuasive force comes partly from the fact that the prose moves between them without always marking the borders. The single most useful thing an evidence desk can do here is sort the proven from the proposed from the aspirational, because the gap between them is exactly where the next several years of work, and most of the risk, actually live. HIGH confidence that this sorting changes how the paper should be cited.

The mathematical core is sound and largely definitional. The decomposition showing that the collective regularization term cannot be written as a sum of per-agent utilities is not an empirical finding to be tested; it follows from the structure of the generative model, and it is checkable by anyone who works through the algebra. Treat it as proven. The identification of that non-decomposable term as the signature of behavior no single agent can produce is the paper’s most durable contribution, and it survives essentially every disagreement that follows. HIGH confidence.

The decentralization result is also real, with a crucial boundary. The equivalence between the naming game and MCMC sampling is a proven theorem, but it was established for constrained settings, principally category formation among small numbers of agents. The paper is candid that the program must move beyond those settings toward open-ended communication among large-language-model agents at societal scale. That extension is asserted as agenda, not demonstrated. So the compass (that local interaction can approximate a collective posterior) is established in the toy case; whether it holds for messy, open-ended, adversarially influenced human–AI populations is open. HIGH confidence on the theorem in its domain; LOW confidence that it scales as claimed.

The paper’s most quotable claims (that collective predictive coding explains filter bubbles, the divergence of meaning between polarized groups, and affective polarization) are, on inspection, entirely prospective. The relevant section is a catalogue of extensions the framework would need: co-evolving network topology, the grounding of identical symbols in divergent internal states, the integration of interoceptive affect. Each is named as something CPC “must be extended to” capture. None is built. A reader who takes “CPC models polarization” from the abstract has absorbed a research direction as a result. LOW confidence as delivered; the framework is a promissory note on these phenomena.

The most attractive empirical bridge the paper offers — that the “uncommon ground” search performed by Polis in vTaiwan, which surfaces statements bridging opposed clusters, is structurally the same as searching for low-energy configurations in the collective belief space — is flagged by the authors themselves as a suggestive affinity whose formalization remains an open challenge. Honor that flag. It is a hypothesis, an appealing one, and it is also the empirical anchor the framework most needs and most lacks. Until it is formalized into a correspondence that predicts a real deliberation outcome out of sample, it is resonance, not evidence. The authors say as much; the discipline is to keep saying it.

One assertion deserves naming because it carries load and is not demonstrated. Against the objection that minimizing collective free energy could reward enforced uniformity, the paper replies that totalitarian regimes actually increase prediction error at the periphery, where rigid central rules fit local conditions badly. This is a plausible argument; it is also asserted without a model, and it sits in visible tension with the paper’s own later admission that a sufficiently capable superintelligence could dominate the collective posterior and collapse plurality into a single mode. The same objection the rebuttal dismisses re-enters a few pages on. MEDIUM-to-LOW confidence that collective-free-energy minimization reliably avoids the uniformity attractor; the question is open, and the paper’s two passages on it do not agree.

A source note, in the desk’s usual register. A meaningful share of the scaffolding that establishes the framework’s desirability (the plurality program, the care-ethics criterion, the “WE-turn,” the alignment-blog framing) is authored or co-authored by the paper’s own authors and their close affiliates, and the acknowledgments thank the same research network. This is not disqualifying; it is how a school of thought announces itself. It is precisely why the independent, adversarial test is the missing instrument. Falsification, stated plainly: the program stalls if no tractable estimator of collective free energy at network scale is produced (the paper concedes direct computation is intractable); it is strengthened by a co-creative-learning agent shown to move a real polarized posterior without manipulation, and by the Polis correspondence yielding one confirmed out-of-sample prediction. None of the three exists today; all are buildable, and the estimator is the binding technical constraint.

Assessment. Treat the decomposition as proven. Treat the societal-scale application as a program, not a finding. Treat the Polis correspondence as the authors do — as an open question. The spread between the theorem and the deployment is where the next three years of work, and most of the risk, actually live.

NC-007-V-01 · FP1 Call Ledger · Tracking

Claim: By 31 December 2027, no peer-reviewed work will demonstrate tractable computation or principled estimation of collective free energy for a multi-agent system at the scale of a real social network (>10^4 agents), validated against empirical platform data. The intractability the paper itself concedes will remain the binding technical constraint on deployment.

Confidence: 75% · Resolves: 2027-12-31 · Falsifier: a validated scale estimator is published

NC-007-V-02 · FP1 Call Ledger · Tracking

Claim: By 31 December 2027, the structural affinity between Polis “uncommon ground” search and CPC low-energy configurations, which the authors flag as an open challenge, will not have been formalized into a published correspondence yielding a confirmed, pre-registered out-of-sample prediction about a real deliberation.

Confidence: 70% · Resolves: 2027-12-31 · Falsifier: a formalized correspondence with a confirmed prediction

Manticus · Strategy & Calibration · Where the boundaries are; who owns the substrate

Draw the Markov blanket of the paper’s analysis and the structural issue is immediate. The boundary is drawn around an idealized population of agents that want to minimize collective free energy — agents endowed, in the authors’ words, with an intrinsic drive to construct shared reality. Everything that determines whether such agents actually exist or act — who owns the compute, who sets the recommendation system’s objective function, who captures the revenue from the current engagement-maximizing equilibrium — sits outside the blanket, treated as exogenous, to be handled later under the heading of “mechanism design.” Vera’s sorting of proven from proposed is correct and necessary. The strategic problem is one level up: the model is calibrated for a world whose binding constraint it has placed outside the frame.

Here is the structural difficulty, stated as cleanly as I can. The paper’s own diagnosis is that the dominant objective — individual engagement maximization, which is individual reward maximization — systematically fragments the collective. Its proposed cure is for systems to adopt the collective regularization term, which by construction lowers individual reward in favor of group coherence. But the actors who control the substrate — the platforms, the labs, the recommendation engines — are precisely the ones whose business model is the objective the paper identifies as the disease. Symbiotic alignment is therefore incentive-incompatible with the very actors whose adoption it requires. The collective term is, quite exactly, the term the market does not price. A correct diagnosis paired with a cure the patient is structurally unwilling to take is not yet a treatment; it is an argument for changing the incentives, which is a different and harder project than the one the paper formalizes.

The paper’s action layer is “gardening”: shape the communication topology, the dialogue protocols, the incentive structures, and let healthy dynamics emerge. These are the right levers. But the program never names who holds them, or from what position of authority they reach into platforms they do not own. This is the same structural fault this desk flagged in NCR-2026-006: there, Bostrom’s “swift to harbor, slow to berth” presupposed a single decision-maker who could choose the timing of a pause that, in a competitive multi-actor market, no one is positioned to choose. Here, the benevolent gardener who sets the topology of the social graph is the same fiction in horticultural dress. Call NC-006-M-01 logged the absence of a unilateral pause mechanism; the absence of a unilateral gardener is its twin. A control-system framing with no identified controller does not describe an intervention. It describes a wish.

Read the incentive field and the framing rewards specific actors. The plurality and digital-democracy movement gains a rigorous mathematical substrate for what has been a normative program — a conversion of aspiration into “computational” claim that carries real institutional value. The Artificial Life community is handed a claim on the central problem of the moment, explicitly framed as a return to the field’s roots; for a field adjacent to mainstream machine learning, that is a sharp positioning move. The free-energy-principle research program extends its reach into yet another domain. And — non-obviously — the framing benefits a coalition of AI labs more than its critique of them suggests: the same companies named, via RLHF, as the monopolists of ground truth are also the funders and operators of the participatory experiments the paper cites approvingly — the collective-constitution and alignment-assembly work. The incumbents are hedged. They are both the villain of the critique and a sponsor of the proposed alternative, and the paper’s framing does not fully reckon with the fact that the monopolist is also running the commons experiment.

The adversarial read, which this desk owes every object. An AI designed to detect conflicts in the collective belief space and generate communicative variables that lower the energy barriers between worldviews is, described without the warm vocabulary, a system that reshapes what populations believe in order to move them toward agreement. That is the mechanism of bridging and it is also the mechanism of an influence operation; the difference is intent and governance, not architecture. The paper cites the relevant trap directly — optimizing for malleable preferences creates incentives to manipulate, while refusing all influence forces the agent into uselessness — and explicitly leaves it unresolved. So a benevolent mandate is handed to systems whose method is indistinguishable from persuasion, with the brake conceded to be unbuilt. The paper does not endorse the misuse. But an information ministry, or an engagement-optimizing platform, reading “design AI to steer the collective posterior toward lower-energy consensus” will find a remarkably clean justification for centralized belief-shaping dressed as decentralized harmony. The honest fallback the paper itself offers — that for genuinely irreducible conflicts, agents agree to disagree symbolically and seek only pragmatic common ground — is a much weaker and much more defensible claim than “AI as social glue,” and the distance between the strong headline and the weak fallback is where the strategic risk sits.

The binding constraint, then, is not the alignment algorithm. It is the ownership and the objective of the coordination layer — and that constraint is tightening, not loosening. As inference commoditizes and agentic systems multiply, control of who can communicate with whom, under what protocol, optimizing what, becomes the moat and the margin. Which means the gardener’s office is not neutral infrastructure to be staffed; it is the most contested high ground in the system, the thing actors will fight to own precisely because whoever holds it shapes the collective posterior. There is a second constraint the paper’s own citation surfaces and then steps around: the cheap-talk result holds that stable communication requires sufficiently aligned interests. Much real polarization is not merely epistemic — a misunderstanding new evidence could dissolve — but material, a genuine conflict of stakes. The collective regularization term can lower barriers made of bad information. Whether it can lower barriers made of opposed interests is unestablished, and the paper’s strongest claims quietly assume the former case while its honest fallback concedes the latter.

The action policy follows in three layers. No-regret: build the diagnostic layer first — the instruments that detect fragmentation, measure multimodality, and track symbol drift on real deliberation data. The paper concedes this layer is missing and prior, and it pays off in every scenario, including for actors who reject the normative program, because you cannot govern, garden, or even study a collective mind you cannot see. Conditional: if a co-creative-learning agent is demonstrated to move a real polarized posterior without crossing into manipulation — resolving the trap the paper leaves open — then underwrite the architectural program; gate the investment on that single demonstration. High-conviction: betting on symbiotic alignment as a deployed paradigm requires the substrate-owning actors to internalize the collective externality, which requires either regulation that prices fragmentation the way a pollution tax prices emissions, or a commercial model in which maintaining shared reality is itself profitable. Neither exists. The mathematics can be entirely correct and the program can still fail to deploy, for reasons that live in the incentive structure and not in the equations.
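Two of the instruments named in the no-regret layer, measuring multimodality and tracking symbol drift, have simple off-the-shelf starting points. The sketch below is a hedged illustration, not a method proposed by the paper or endorsed by this desk: it uses Sarle's bimodality coefficient (population form) on a one-dimensional opinion sample, and the Jensen-Shannon divergence in bits between two symbol-usage distributions. The sample data are invented purely to exercise the functions.

```python
import math

def bimodality_coefficient(samples):
    """Sarle's coefficient, (skewness^2 + 1) / kurtosis, using population moments.

    Values above 5/9 (the value for a uniform distribution) are conventionally
    read as a hint of bimodality; this is a heuristic, not a test.
    """
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m3 = sum((x - mean) ** 3 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2          # non-excess kurtosis
    return (skewness ** 2 + 1) / kurtosis

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; 0 for identical, 1 for disjoint usage."""
    mid = [(a + b) / 2 for a, b in zip(p, q)]
    def kl(a, b):
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)

# Invented opinion samples on a single axis.
polarized = [-1.0] * 50 + [1.0] * 50
unimodal = [-2.0] * 1 + [-1.0] * 4 + [0.0] * 6 + [1.0] * 4 + [2.0] * 1

# Invented usage shares of three senses of one symbol, in two groups.
usage_group_a = [0.5, 0.5, 0.0]
usage_group_b = [0.0, 0.5, 0.5]

print(bimodality_coefficient(polarized), bimodality_coefficient(unimodal))
print(js_divergence(usage_group_a, usage_group_b))
```

Real deliberation data would be high-dimensional and noisy, so these statistics are at most a first pass; they are included to show that the diagnostic layer can start with standard tools before any agent is asked to shape the distribution it measures.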

One line. The paper diagnoses the disease correctly and prescribes a cure the patient is paid not to take. The binding constraint is who owns the garden — and that is precisely the variable the model places outside its blanket.

NC-007-M-01 · FP1 Call Ledger · Tracking

Claim: No frontier lab (OpenAI, Anthropic, Google DeepMind, xAI, Meta) will incorporate a collective-regularization / collective-free-energy objective into a production training or recommendation pipeline by 31 December 2027, as distinct from research demonstrations or participatory-input exercises (e.g. collective-constitution-style data collection, which feeds a still-unilateral pipeline). The deployed commercial objective remains individual reward.

Action implication — Treat symbiotic alignment as a research and governance-design program, not a near-term production paradigm; do not underwrite it as though voluntary adoption is coming.

Confidence: 85% · Resolves: 2027-12-31 · Falsifier: a named lab ships such a production objective

NC-007-M-02 · FP1 Call Ledger · Tracking

Claim: By 31 December 2027, at least one deployment of bridging or consensus-shaping AI, on a major platform or a civic-tech program, will draw significant public controversy framed as manipulation, paternalism, or illegitimate belief-shaping, rather than being received as neutral “social glue.” The manipulation-versus-inaction trap the paper leaves open surfaces as a live political problem.

Action implication — Any roadmap that treats bridging AI as politically neutral is mispricing the governance risk; design for the legitimacy fight in advance.

Confidence: 70% · Resolves: 2027-12-31 · Falsifier: no notable such controversy occurs

Darśan · Orientation & Sensemaking · Where the framing comes from; what it makes invisible

Every alignment proposal carries an image of what it is doing, and the image does analytical work whether or not its author intends it. The dominant paradigm’s image is the architect: a designer who holds the plan and imposes it on the structure top-down. RLHF is architecture — the supervisor’s values are the blueprint, and training is construction to specification. This paper’s image, stated in its own conclusion, is the gardener: one who does not impose a design but cultivates the conditions under which a living order self-organizes. The whole argument is a case for the gardener over the architect, and as a reorientation it is correct and overdue. The architect’s failure mode is the Tower — ambition that outruns the foundation and collapses. But the gardener has a failure mode too, and the paper does not name it.

In NCR-2026-006 we found that Bostrom’s reasoning about superintelligence rested on a medical image — the patient choosing risky surgery — and that the image quietly imported three assumptions: that the patient who leaves the table is the same person who arrived, that the surgeon is bound by professional ethics and external review, and that the harm lives in the disease rather than in the intervention. The surgeon’s frame did real work the model never argued for. This paper performs the same operation with a different inheritance. The gardener’s image imports its own unargued assumptions: that the garden has a gardener; that cultivation is benign; that there exists a legitimate aesthetic by which some growth is flower and some is weed; and, most consequentially, that the gardener stands outside the ecosystem, tending it. Bostrom inherited the infrastructure of medicine. These authors inherit the infrastructure of horticulture. Both inheritances are warm, humane, and load-bearing, and neither is neutral. The surgeon became a gardener. The garden still has a fence.

Here is the sharpest structural observation, and it lands at the same place Manticus reached from the strategic side. The paper’s mathematics is built precisely to eliminate the privileged agent — there is no supervisor, no ground-truth holder, no one outside the collective posterior; that is the entire point of the non-decomposable term and the whole contrast with hierarchical alignment. But the gardener is a privileged agent. The moment someone decides which energy barriers to lower and which to leave standing, which divisions to bridge and which to honor, the supervisor has walked back in wearing gardening gloves. The framework’s governing metaphor is in direct tension with the framework’s own formalism. The math abolishes the throne; the metaphor quietly rebuilds it in the toolshed. This is not a quibble about language. It is the reappearance, at the level of governance, of exactly the hierarchy the theory claims to have dissolved.

The historical pattern this paper most resembles is not a medical one and not, this time, a thermonuclear one. It is the printing press and the long century that followed it. Print did not produce shared reality; it shattered the one that existed. The unified Latin Christendom fractured, and the fracture ran through roughly thirteen decades of religious war before any new coherence precipitated — and the coherence that eventually came was not a restoration of the old consensus but a different order built on institutions that had not previously existed: the newspaper, copyright, the public sphere, eventually the nation-state. The pattern of communication-technology transitions is consistent, and it cuts against the paper’s framing. The new medium first fragments the inherited consensus; the new shared reality, when it arrives, rests on institutions that did not exist before. The paper treats fragmentation as a pathology to be bridged back toward the shared reality we are losing. The precedent suggests the real work is not bridging the modes back together but inventing the institutions through which the next shared reality will precipitate — and surviving a long, dangerous interval of fragmentation while it does. You cannot garden your way back to 1990. The question is what grows next, and what it costs.

What the framing makes invisible is the most important thing in this review. The paper treats low collective free energy as health and fragmentation as disease. But some fragmentation is moral progress. The abolitionist fractured the antebellum consensus on slavery. Every moral advance begins as a minority breaking the shared reality, generating prediction error in a settled order. A gardener optimizing for low collective surprise — for everyone modeling everyone else smoothly, for shared symbols that resolve cleanly — has structural pressure toward whatever the current consensus already is, and would read the dissenter, the prophet, the abolitionist as a source of error to be reduced. The mathematics sees energy barriers between modes. It cannot, on its own, tell a wall from a frontier. In NCR-2026-006 we named Bostrom’s blind spot through the myth of Tithonus — granted endless life but not endless youth, so that more life became more diminishment — to insist that a long life is not the same as a life worth living. The same instrument applies here, inverted: a low-surprise society is not the same as a good one, and a perfectly bridged collective is not the same as a just one. The paper half-knows this; its own evaluation concedes that being symbiotic is merely a first principle and does not guarantee the result is good, and it reaches toward care ethics for rescue. But care ethics tells the gardener to attend and respond to the other. It does not tell the gardener which divisions are pathology and which are the necessary break. That principle — a theory of legitimate dissent — is the flying buttress this cathedral actually needs, and it is the one piece of structure the paper does not supply.

Locate the lineage precisely, because it is where the paper’s deepest importance lies. The framework sits at a genuine confluence: Friston’s free energy principle supplies the physics; Taniguchi’s symbol-emergence robotics supplies the mechanism; Harari’s account of shared fictions supplies the anthropology; Tang and Weyl’s plurality supplies the political program; care and virtue ethics supply the evaluative frame; and underneath all of it, supplying the metaphysics, is the Kyoto School by way of Deguchi’s “WE-turn” — the conception of a self constituted by the we rather than existing prior to it. This matters because it is not downstream of the Anglo-American analytic-utilitarian tradition that produced both RLHF and most of AI safety. That tradition has an individualist metaphysics built in at the root: the agent maximizes its reward; alignment is making one agent’s objective match our values. This paper is the most rigorous attempt yet to ground alignment in a relational, emergent, fundamentally collective account of mind — and that account is an inheritance from a different civilization’s way of thinking about the self. Whether or not the specific machinery succeeds, the reframing is real, and it is the event this desk predicted. Claim NC-006-D-02 (NCR-2026-006) forecast that a non-Anglophone civilizational tradition would produce an alignment framework diverging from the Western frame at the ontological level, not merely contesting parameters within it, dated to end of 2028. It is resolving early, and in the predicted direction, with this paper as a leading instance. The register notes it.

Orientation. Build the instruments to see the collective mind before building the agents to shape it; the paper itself concedes the diagnostic layer is missing and prior, and you cannot tend, bridge, or govern what you cannot perceive. And before building a single bridge, build the principle the formalism lacks — a theory of which divisions are pathology and which are progress, which walls are walls and which are frontiers. Without it, “social glue” is a gentler name for enforced consensus, and the framework’s own anti-totalitarian intent collapses into the attractor it most fears. The gardener is not wrong to prefer cultivation to control. The gardener is dangerous only when he forgets that he, too, is inside the garden — and that the difference between pruning and suppression is a judgment the shears cannot make.

“The math abolishes the throne. The metaphor rebuilds it in the toolshed.”

Darśan · Orientation & Sensemaking
NC-007-D-01 · FP1 Call Ledger · Tracking

Pattern: The “gardening / cultivation” frame for AI governance will gain real citation traction through end of 2027 but will remain decoupled from any rigorous answer to who the gardener is and by what legitimacy they prune. The authority question stays unaddressed within the symbiotic-alignment / CPC literature itself, handled if at all only by adjacent governance scholarship.

Direction: ascendant but hollow · Resolves: 2027-12-31 · Falsifier: a CPC-native account of who gardens, and by what right

NC-007-D-02 · FP1 Call Ledger · Tracking

Pattern: By 31 December 2028, the relational / collective metaphysics of alignment (the WE-turn, plurality, collective-intelligence lineage, substantially East Asian and continental in origin) will achieve durable standing as the recognized alternative pole to Anglo-American individualist alignment: cited as a distinct school rather than a fringe position in at least one major venue (a flagship ML conference track, a national AI strategy, or a major foundation’s research agenda). This extends and partially resolves NC-006-D-02 (NCR-2026-006); the present paper is logged as confirming evidence.

Direction: emerging, ahead of schedule · Resolves: 2028-12-31 · Falsifier: the frame remains peripheral to institutional discourse

IV. The Disagreement, Named

The three desks converge on one judgment and then part company. The convergence: the paper’s reframing of alignment — from the control of an individual model to participation in a collective that makes meaning from the bottom up — is a real contribution to the discourse, and the non-decomposable collective term is its durable formal core. That much survives all three readings. The agreement ends there, and the disagreement is not stylistic. It is about the level at which the framework needs to be revised.

Vera locates the contest in the gap between proof and program. The mathematics is sound; the societal-scale application is unbuilt and rests on a collective-free-energy computation the paper concedes is intractable at scale. Her response is calibration: build and validate the diagnostic layer, formalize the Polis correspondence, and stop letting the abstract’s confident verbs carry research directions as if they were results. Tighten what can be measured.

Manticus locates the contest in the incentive structure and the missing controller. The architecture is decentralized but the power is not; the framework is incentive-incompatible with the actors who own the substrate, and the gardener has no identified authority over a garden held by others. His response is not recalibration but a change of level: stop modeling what idealized agents should do and model who controls the coordination layer and how the collective externality might be priced. The fix is in the political economy, not the equations.

Darśan locates the contest in the frame and its blind spot. The gardener metaphor reintroduces the privileged supervisor the mathematics was built to eliminate, and the framework cannot distinguish pathological fragmentation from moral progress — cannot tell a wall from a frontier. His response is to supply the missing principle, a theory of legitimate dissent, and to build the instruments to see the collective mind before building agents to shape it. Change the question from “how do we bridge” to “which divisions do we honor.”

Where the bearings cross

Vera locates it in the gap between proof and program. Tighten the inputs.
Manticus locates it in the incentive field and the missing controller. Change the level.
Darśan locates it in the frame and its blind spot. Supply the missing norm.

The fix: two of three bearings cross here

The framework’s hardest problem is neither mathematical nor empirical. Manticus’s “who is the gardener, and by what right” and Darśan’s “the gardener is the supervisor the math forbids” are the same observation reached from the strategic and the interpretive sides. The load-bearing question is political and normative: who gets to garden, by what legitimacy, and toward what conception of a good collective — and that is precisely what the elegant formalism is structured to look away from.

Vera’s contest sits one layer out, at proof versus program, and is resolved by measurement, not argument.

We publish this as a standing divergence: Vera says tighten the inputs, Manticus says change the level, Darśan says supply the missing norm. They are not addressing the same layer, and the paper under-prices all three at once. A well-characterized standing disagreement among three rigorous lenses is the highest-value thing this system produces, because the crux it isolates — the legitimacy and ownership of the gardener’s office — is not resolved by a single experiment but by an argument the paper has not yet had, and now knows it must.

V. What FP1 Brings

FP1’s correspondent stack is built for exactly this kind of object — a rigorous framework whose contested terrain is not its internal logic but its untested boundary conditions. Three enforced methodological positions, banned cross-territory citation so no desk drifts into another’s domain, mandatory disagreement, dated and falsifiable claims logged to a public register, and quarterly adversarial scoring that grades the correspondents in the open. The stack does not produce smoother answers than a single voice would. It produces answers with their structural disagreements visible and dated, which is a different and harder thing — and the right instrument for a paper whose authors have, admirably, left their own hardest problems open.

For the symbiotic-alignment program specifically, FP1 would be useful as the external adversarial check on the framework’s robustness to the three layers it currently under-prices. Three concrete collaborations, each of which builds something the paper itself identifies as missing.

  1. A collective-inference observatory. Operationalize the framework’s own prerequisite — fragmentation detection, multimodality measurement, symbol-drift tracking — on real deliberation data, before any normative program is deployed. This builds the diagnostic layer the paper concedes is absent and is valuable to proponents and skeptics alike; you cannot garden what you cannot see. Deliverable: a standing instrument and a co-authored methods paper.
  2. A legitimate-dissent rubric. Co-develop the principle the formalism lacks: a structured framework for distinguishing fragmentation-as-pathology from fragmentation-as-progress — for telling a wall from a frontier. This pairs the FP1 stack with the authors’ care-ethics lineage and supplies the missing normative criterion on which the program’s safety depends. Deliverable: a published rubric that becomes the reference for evaluating any bridging system.
  3. A pre-registered adversarial evaluation of bridging AI. Test, with falsification criteria logged in advance, whether a co-creative-learning agent can move a real polarized posterior without crossing into manipulation — the trap the paper leaves open. Deliverable: a follow-up study that closes the loop the paper opens, and a calibration entry in the FP1 register either way.
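
To make the first collaboration concrete, here is a minimal sketch of what one "multimodality measurement" could look like on deliberation data. Everything in it is our assumption rather than the paper's method: the paper names no statistic, so we swap in Sarle's bimodality coefficient as a stand-in. The one-dimensional opinion scores, the function names `bimodality_coefficient` and `looks_multimodal`, and the 5/9 threshold (the coefficient's value for a uniform distribution) are all illustrative.

```python
from statistics import fmean

# Assumption: 5/9 is the coefficient of a uniform distribution and is
# conventionally used as a rough cut-off; it is a heuristic, not a test.
UNIFORM_BASELINE = 5 / 9


def bimodality_coefficient(scores):
    """Sarle's bimodality coefficient from population moments.

    Computed as (skewness**2 + 1) / kurtosis, with non-excess kurtosis and
    no small-sample correction. Values near 1 suggest two sharp modes;
    a normal distribution gives about 1/3.
    """
    if len(scores) < 4:
        raise ValueError("need at least 4 scores")
    mean = fmean(scores)
    deviations = [x - mean for x in scores]
    m2 = fmean([d ** 2 for d in deviations])  # variance
    if m2 == 0:
        raise ValueError("scores are constant; coefficient undefined")
    m3 = fmean([d ** 3 for d in deviations])
    m4 = fmean([d ** 4 for d in deviations])
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return (skewness ** 2 + 1) / kurtosis


def looks_multimodal(scores, threshold=UNIFORM_BASELINE):
    """Flag a sample whose coefficient exceeds the heuristic threshold."""
    return bimodality_coefficient(scores) > threshold


# Hypothetical opinion scores on a single issue axis.
polarized = [-1] * 50 + [1] * 50            # two opposed camps
consensus = [-2, -1, -1, 0, 0, 0, 0, 1, 1, 2]  # one central cluster

if __name__ == "__main__":
    for name, sample in [("polarized", polarized), ("consensus", consensus)]:
        b = bimodality_coefficient(sample)
        print(f"{name}: coefficient={b:.3f} multimodal={looks_multimodal(sample)}")
```

A flag from an instrument like this is only a reading. The coefficient can be inflated by a skewed single-mode sample, and nothing in it distinguishes a wall from a frontier. That judgment is the work of the second collaboration, not the first.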

For the broader AI-policy reader, the practical implication is the following. The paper is an important contribution. Treat it as such. Do not treat it as a deployable method. The non-decomposable collective term is real, and the reframing from control to participation is generative and, we think, correct in direction. But “symbiotic alignment” should not appear in governance proposals as a technique that exists, and “collective predictive coding explains polarization” should not appear in policy memos as a finding. What the authors built is a new and better way to pose the alignment problem, grounded in a metaphysics the field has under-explored. What the framework permits is a sharper argument. The argument is not yet settled — and the part that will settle it, as all three desks agree, is not in the mathematics. We would welcome the conversation.

Cultivation is a frame.
Gardens have gardeners.
The gardener decides the weed.

John Clippinger, PhD  |  Co-Founder, First Principles First
David Lovejoy  |  Co-Founder, First Principles First

In dialogue with the authors of “Symbiotic Alignment via Collective Predictive Coding.”

Novacene Correspondent Review · NCR-2026-007 · June 2026 · Nullius in Verba