Five Ways to Fail at One-Marking: A Research Diary

2026-03-25

This post documents five approaches I tried (and mostly failed) to solve the one-marking problem for population protocol implementation via chemical reaction networks. Each failure has a precise reason. That’s the point of writing this.

The Problem

A population protocol (PP) on $n$ species can be implemented as a chemical reaction network (CRN) through a pipeline: $\lambda$-trick → $r^2$-trick → variable substitution → bimolecular rewriting. The standard approach uses a self-product variable system ($z_{ij} = x_i x_j$), which works beautifully but requires $O(n^2)$ variables — each species $x_i$ gets two markings (one for each index in $z_{ij}$).

The question: can we do it with one marking per species? That is, $O(n)$ variables where each $x_i$ is tracked by a single variable.

The answer must satisfy seven simultaneous constraints:

Bimolecular (C1): Each derivative is degree $\leq 2$ in the new variables.
Formal conservation (C2): $\sum \dot{z}_\alpha = 0$ as a polynomial identity.
C3: Every negative term in $\dot{z}\alpha$ contains $z\alpha$.
C4: No positive self-square $z_\alpha^2$ in $\dot{z}_\alpha$.
Simplex: All variables $\geq 0$, sum $= 1$.
One marking: $O(n)$ variables.
Generic: Works for all PPs.

Self-product satisfies (1)–(5) and (7), but not (6). Every approach below satisfies (6) but fails somewhere else.

Attempt 1: The $\frac{1}{2}$-Trick (Note 17)

Idea: Apply the $\frac{1}{2}$-identity $a = \frac{1}{2}a(1 + (r + \sum x_i))$ repeatedly to split each variable. This produces a six-type system:

$v_i = x_i$ (one per species — the marking)
$\beta = \frac{1}{2}r^2$, $\eta = \frac{1}{2}r^3$ (shared)
$u_i = \frac{1}{2}rx_i$, $\gamma_i = r^2 x_i$ (one per species)
$\delta_{i,j} = \frac{1}{2}rx_i x_j$ (cross-terms)

Total: $n^2 + 3n + 2$ variables. Still $O(n^2)$ because of $\delta_{i,j}$, but each $x_i$ is tracked by a single degree-1 variable $v_i$.

Result on CF'24: C1 ✓, C3 ✓, C4 ✓, simplex conservation (on manifold) ✓.

Where it fails: Formal conservation (C2). The residual is:

$$\sum_\alpha \dot{z}_\alpha = -(r + x - 1) \cdot q(r,x)$$

This vanishes on the simplex ($r + x = 1$) but is not formally zero. Self-product achieves formal conservation because $\sum z_{ij} = (\sum x_i)^2$ is an algebraic identity. The $\frac{1}{2}$-trick has no such algebraic origin.

Why it can’t be fixed: The formal residual is structural. No choice of reaction decomposition or species splitting eliminates it. The 18 out of 21 forced assignments in the flow network leave only 3 choice points, none of which affect the conservation residual.

Lesson: Formal conservation requires an algebraic identity relating the variable sum to conserved quantities. The $\frac{1}{2}$-trick doesn’t have one.

Attempt 2: The Sqrt Method

Idea: Set $Y_i = \sqrt{x_i}$. Then $\sum Y_i^2 = \sum x_i = 1$ — a formally conserved quadratic! And each species has one variable.

Where it fails: Simplex overflow. By Jensen’s inequality,

$$\sum_i \sqrt{x_i} \leq \sqrt{n \sum x_i} = \sqrt{n}$$

with equality at $x_i = 1/n$. For $n > 1$, the $Y_i$ values don’t fit in $[0,1]$ with sum $\leq 1$.

Attempted rescues:

Rescale: $Z_i = \sqrt{x_i}/\sqrt{n}$. Now $\sum Z_i \leq 1$, but $\sum Z_i$ is not constant — varies from $1/\sqrt{n}$ to 1.
Auxiliary variable: $\sum Z_i + \text{aux} = 1$. The auxiliary derivative involves $1/\sqrt{x_i}$ — not polynomial, can’t be implemented as CRN.
Quadratic conservation: $\sum Z_i^2 = 1/n$ is constant, but CRNs conserve linear quantities, not quadratic ones.

Lesson: Formal conservation of $\sum f(x_i)$ for non-affine $f$ requires $\sum f(x_i)$ to depend only on $\sum x_i$. For univariate $f$, this holds only if $f$ is affine. Sqrt is not affine.

Attempt 3: Cubing Construction for NAP (Notes 19–21)

Idea: Instead of designing one-marking variables directly, start with the existing cubing construction (which produces NAP reactions) and check whether it supports one-marking inputs.

What I found: The cubing construction works by replacing each monomial $\mathbf{x}^\alpha$ of degree 3 with a species, then decomposing derivatives as bimolecular reactions. It was previously verified that each monomial individually has a non-autocatalytic split.

Where it fails (Note 19): Local feasibility ≠ global feasibility. Each monomial having a valid split is necessary but not sufficient — the supply-request flow network must also be globally feasible under forbidden-edge constraints.

Systematic testing (Note 21): I ran the NAP LP on 500+ random PPs:

System type	Feasible
3-var, $r$-free	34%
3-var, $r$-dependent	0%
4-var, $r$-free	0%

Any monomial involving $r$ in $f_i$ makes the LP infeasible. Since general PPs have $r$-dependent reactions, the direct cubing of a general PP fails.

Cross-square conjecture (500/500 verified): For $r$-free 3-var systems, NAP feasibility $\iff$ $\text{coeff}(x_j^2, f_i) \geq 0$ for all $i \neq j$.

Lesson: The cubing pipeline has a narrow feasibility window. The $r$-dependence obstruction is not a coefficient issue — it’s structural.

Attempt 4: The $\frac{1}{3}$-Trick (Note 23, v1)

Idea: In any PP, $S = r + \sum x_i$ satisfies $S’ = 0$ formally. So any $F(S)$ is formally conserved. Use the monomials of $\frac{1}{3}(S + S^2 + S^3)$ as variables.

What works:

Formal conservation ✓ (by construction — $S’ = 0$)
One marking ✓ ($\frac{1}{3}x_i$ tracks species $i$)
CF'24 ($n = 1$): LP feasible ✓, 21 reactions, all verified

Where it fails: Scale. For general $n$, the variable count is $O(n^3)$ (all degree-1, 2, 3 monomials). Worse, systematic testing shows:

$n$	Feasible (random)
1	69%
2	12%
3	0%

The C3 constraint (negative terms must contain the variable) becomes impossible to satisfy for larger $n$ — the product catalog doesn’t cover enough monomials.

Lesson: $S^k$-based formal conservation is elegant but the bimolecular decomposition breaks down as $n$ grows.

Attempt 5: The $\frac{1}{2}$-Trick with $r$-Balancing (Note 23, v2)

Idea: Simplify the $\frac{1}{3}$-trick. Multiply by $r$ instead of $r^2$, and use $\frac{1}{2}(S + S^2)$ — no cubic terms needed.

Variables (for $n = 1$):

Name	Definition
$a_r$	$\frac{1}{2}r$
$a_x$	$\frac{1}{2}x$
$b_{rr}$	$\frac{1}{2}r^2$
$b_{rx}$	$rx$
$b_{xx}$	$\frac{1}{2}x^2$

5 variables, 10 reactions. Formal conservation ✓ (via $S’ = 0$). LP feasible for CF'24 ✓.

Status: This is the current best result. It works for the running example. But:

It marks $\frac{1}{2}x$, not $x$ directly. The readout requires dividing by 2.
Extension to general $n$ is untested.
The $r$-balancing (multiply by $r$ instead of $r^2$) changes the pipeline in ways that need careful analysis.

This is where we are. Not solved, but closer than before.

What I Learned

Formal conservation is the hardest constraint. Everything else (C1, C3, C4, simplex) can be satisfied by local constructions. Conservation requires a global algebraic identity.
The gap between “works on the simplex” and “works formally” is precisely the gap between one-marking and two-marking. Self-product bridges it with $\sum z_{ij} = (\sum x_i)^2$. One-marking has no such bridge — until the $S^k$-trick, which uses $S’ = 0$ instead.
Computational testing is essential. Every “this should work” intuition was wrong at least once. The LP doesn’t lie.
Dead ends have structure. Each failure points to a specific algebraic obstruction. These obstructions are not accidents — they reflect the geometry of the problem.
The problem is not solved. But the landscape is now mapped. The next person who works on this will know exactly where not to dig.