The machine makes a claim before the program does.

On the machines we use today, computation appears as a single trace. Fetch an instruction. Touch memory. Mutate state. Move to the next instruction. The trace is so familiar that it starts to feel like computation itself.

But the trace is not the computation. It is one way of operating a substrate.

That distinction is the reason feynpath exists. The von Neumann machine carried computing this far, but nature does not look like a single global instruction pointer. Physical systems evolve locally, reversibly, noisily, thermodynamically, and sometimes quantum mechanically. They expose different resources and different costs. If we want algorithms for those machines, we have to stop confusing the operational trace with the underlying structure.

Ryan Williams’s paper Simulating Time With Square-Root Space is not a paper about quantum computers, thermodynamic hardware, or reversible chips. It stays inside classical complexity theory. That is why it is useful here.

It shows that even inside the classical model, changing the representation of a computation can reveal room that the sequential trace concealed.

The Trace Is A Choice

Complexity theory is powerful because it abstracts. A multitape Turing machine is not a CPU, but it gives us a robust way to ask how time and space relate. P and PSPACE are not hardware roadmaps, but they give us durable language for computation.

The cost of that abstraction is that many physical resources disappear. Energy does not appear. Locality is flattened. Reversibility becomes a modeling choice. Noise is usually an error source, not a resource. The model tells us what can be done with bounded time or bounded memory, but it does not tell us which physical paths nature makes cheap.

For about fifty years, the best general simulation of time by space for multitape Turing machines was the Hopcroft-Paul-Valiant bound:

$$\mathrm{TIME}[t] \subseteq \mathrm{SPACE}[t / \log t].$$

Williams improves this to:

$$\mathrm{TIME}[t] \subseteq \mathrm{SPACE}[\sqrt{t \log t}].$$

That is a large asymptotic change. More importantly for feynpath, it changes the story of why the old bound felt natural. The direct way to simulate a time-bounded computation is to replay its trace. Williams does something else: he decomposes the run, exposes the dependencies between blocks, and reduces the simulation to structured evaluation problems that can be solved in much less space.

This is the point worth keeping: the same computation can have more than one useful representation. One representation makes the trace central. Another makes dependencies central.

[Figure: the two generic bounds on a log scale. At t = 2^30, the HPV bound t / log t is about 3.58e7 and the Williams bound sqrt(t log t) is about 1.79e5, a ratio of roughly 199x. The plot shows the generic bounds, not a claim about practical wall-clock performance.]
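To make the gap concrete, here is a minimal arithmetic sketch. It drops constants and takes logarithms base 2; both choices are assumptions of this toy comparison, not part of the theorem.

```python
import math

def hpv_space(t: float) -> float:
    """Hopcroft-Paul-Valiant bound, t / log t (constants ignored, log base 2)."""
    return t / math.log2(t)

def williams_space(t: float) -> float:
    """Williams bound, sqrt(t log t) (constants ignored, log base 2)."""
    return math.sqrt(t * math.log2(t))

for exp in (30, 40, 50, 60):
    t = 2.0 ** exp
    hpv, wil = hpv_space(t), williams_space(t)
    print(f"t = 2^{exp}: HPV ~ {hpv:.2e}, Williams ~ {wil:.2e}, ratio ~ {hpv / wil:.0f}x")
```

At t = 2^30 this reproduces the readouts above, and the ratio keeps widening as t grows.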
Theorem 1.1, stated formally

For every function t(n) >= n, Williams proves TIME[t(n)] subset SPACE[sqrt(t(n) log t(n))].

Here TIME[t(n)] and SPACE[s(n)] are defined over multitape Turing machines. The theorem is a generic simulation: it does not rely on special structure in the language being decided.

Seeing The Structure

The high-level picture is simple enough to keep in your head.

Start with a Turing machine running for $t$ steps. Instead of treating those steps as $1, 2, 3, \ldots, t$, partition the run into blocks of length $b$. A block is not just a slice of time; it touches a small number of tape blocks, carries local state, and depends on earlier blocks that produced the data it needs.

Once you look at the run this way, the simulation problem is no longer only “replay every step.” It becomes “evaluate the pieces that determine the final state.”

The formal proof reduces the simulation to implicitly defined Tree Evaluation instances, using the Cook-Mertz space-efficient Tree Evaluation algorithm as the engine. That matters because it avoids a misleading picture: Williams is not just drawing a prettier DAG over the same trace. He is reducing a time-bounded computation to a structured evaluation problem with parameters that can be optimized.

The dependency view is not magic. It does not make the original computation faster. In fact, the simulation may take far more time than the original run. The win is space. Williams’s construction shows that the information needed to decide the final result can be organized so the simulator does not store a near-linear amount of the trace.

There are three moving parts.

First, use block-respecting Turing machines. This old idea from Hopcroft, Paul, and Valiant arranges the computation so that tape heads cross tape-block boundaries only at time-block boundaries. That makes the dependency structure controlled enough to encode; a toy sketch after the third part below shows the pattern.

Second, reduce the computation to Tree Evaluation. A Tree Evaluation instance has leaves containing values and internal nodes containing functions. The root value is what we need. In Williams’s reduction, time blocks and tape blocks become the objects whose values must be evaluated.

Third, use Cook and Mertz’s algorithm. The obvious recursive evaluation of a tree stores too much along the path. Cook-Mertz gives a much more space-efficient way to evaluate the tree. Williams’s contribution is to make generic time-bounded computation fit that engine.
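To make the block-respecting idea concrete, here is a toy sketch. It is not Williams’s construction: the single tape, the visit pattern, and the names are invented for illustration. The point is only that once heads change tape blocks solely at time-block boundaries, each time block depends on a handful of earlier blocks, namely the last ones to touch the tape blocks it reads.

```python
# Toy model: one tape, time split into blocks of b steps, tape split into
# blocks of b cells. In a block-respecting machine the head stays inside a
# single tape block for the whole time block, so a run can be summarized by
# the tape block each time block visits.
visits = [0, 1, 1, 2, 0, 2, 1, 3]  # hypothetical visit pattern, time blocks 0..7

def dependencies(visits):
    """For each time block, list the earlier time blocks it depends on:
    the previous time block (for machine state) and the last time block
    that touched the tape block it is about to read."""
    last_writer = {}  # tape block -> last time block that visited it
    deps = []
    for i, tape_block in enumerate(visits):
        d = set()
        if i > 0:
            d.add(i - 1)                    # state flows from the previous block
        if tape_block in last_writer:
            d.add(last_writer[tape_block])  # contents flow from the last visitor
        deps.append(sorted(d))
        last_writer[tape_block] = i
    return deps

for i, d in enumerate(dependencies(visits)):
    print(f"time block {i} (tape block {visits[i]}) depends on {d}")
```

The specific visit pattern is made up; any pattern produces the same kind of sparse dependency structure once heads respect block boundaries.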

Why Tree Evaluation matters

In the Tree Evaluation problem, leaves hold b-bit strings. Internal nodes hold functions that map the values of their children to a new b-bit string. The task is to determine the root value.

The Cook-Mertz result used by Williams evaluates trees of bit-length b, height h, and fan-in at most d in O(d * b + h log(d * b)) space.

Williams sets up the Turing-machine simulation so h = Theta(t/b) and d = O(1).
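Here is a minimal sketch of the Tree Evaluation problem itself, under simplifying assumptions chosen only for illustration: a complete binary tree (d = 2), one-byte values, and XOR as the node function. The naive recursion below gets the right answer but holds child values along the whole root-to-leaf path, roughly h * d * b bits, which is exactly the cost the Cook-Mertz algorithm avoids.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class Node:
    # Internal nodes hold a function of their children's b-bit values;
    # leaves hold a b-bit value directly.
    children: List["Node"]
    func: Optional[Callable[[List[bytes]], bytes]] = None
    leaf_value: Optional[bytes] = None

def evaluate(node: Node) -> bytes:
    """Naive depth-first evaluation. Correct, but it keeps child values along
    the current path: about h * d * b bits for height h, fan-in d, b-bit
    values. Cook-Mertz evaluates the same tree in O(d*b + h*log(d*b)) space."""
    if node.leaf_value is not None:
        return node.leaf_value
    child_values = [evaluate(c) for c in node.children]
    return node.func(child_values)

# Tiny example: b = 1 byte, height 2, fan-in 2, nodes XOR their children.
xor = lambda vals: bytes(a ^ b for a, b in zip(*vals))
leaves = [Node([], leaf_value=bytes([v])) for v in (3, 5, 9, 17)]
root = Node([Node(leaves[:2], func=xor), Node(leaves[2:], func=xor)], func=xor)
print(evaluate(root))  # b'\x1e': (3 ^ 5) ^ (9 ^ 17) = 30
```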

The block size $b$ is the knob. If $b$ is too small, the tree gets tall. If $b$ is too large, each node value gets expensive. The useful balance is near:

$$b \approx \sqrt{t \log t}.$$

That is where the square root bound comes from.

[Figure: the same tradeoff numerically. The simplified model fixes t = 2^30 and plots b + (t / b) log b. A chosen block size of b = 2^17 gives space about 2.70e5; the numerical minimum of the curve sits near b = 1.29e5 with space about 2.70e5. Asymptotically the minimizing block size scales like sqrt(t log t), here about 1.79e5.]
Where the square root comes from

Ignoring constants and lower-order terms, the Cook-Mertz space bound becomes b + (t/b) log b.

The first term is the cost of handling a block value. The second term is the cost induced by the height of the evaluation problem.

Balancing the two terms gives b approx (t/b) log b, hence b^2 approx t log b. At the asymptotic scale used in the theorem, this yields b = Theta(sqrt(t log t)).
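The balancing act is easy to check numerically. The sketch below uses the same simplified model as the figure above, b + (t / b) * log2(b), with constants dropped; the grid search and the fixed t are assumptions of the toy model, not part of the proof.

```python
import math

t = 2 ** 30

def model_space(b: float) -> float:
    """Simplified space model: block cost plus height-induced cost."""
    return b + (t / b) * math.log2(b)

# Coarse grid search for the minimizing block size, b from 2^10 to 2^25.
candidates = [2 ** (e / 8) for e in range(8 * 10, 8 * 25)]
best_b = min(candidates, key=model_space)

print(f"numeric minimum near b ~ {best_b:.2e}, space ~ {model_space(best_b):.2e}")
print(f"sqrt(t log t)            ~ {math.sqrt(t * math.log2(t)):.2e}")
print(f"space at that b          ~ {model_space(math.sqrt(t * math.log2(t))):.2e}")
```

The numeric minimizer and sqrt(t log t) differ by a constant factor, but the resulting space is the same order either way, which is all the asymptotic statement needs.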

The result also has consequences beyond simulation. By diagonalization, Williams obtains a generic polynomial separation between time and space in a robust model:

$$\mathrm{SPACE}[s] \not\subset \mathrm{TIME}[s^{2-\varepsilon}]$$

for space-constructible $s(n) \ge n$ and every $\varepsilon > 0$. That is not P versus PSPACE. It is not close to resolving the whole question. But it is real movement on a problem where generic progress has been scarce.

Corollary 1.2 in plain terms

The simulation says anything decidable in time t can be decided in about sqrt(t log t) space. Diagonalization then gives explicit space-bounded problems that cannot be solved too quickly by multitape Turing machines.

The formal statement is SPACE[s(n)] not-subset TIME[s(n)^(2-epsilon)] for space-constructible s(n) >= n and all epsilon > 0.
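For readers who want the shape of the argument, here is a hedged sketch of the standard contradiction: assume the separation fails, push the assumption through Theorem 1.1, and collide with the space hierarchy theorem. This is the usual route to such corollaries, not a line-by-line account of the paper.

```latex
\begin{aligned}
\text{Assume } \mathrm{SPACE}[s] &\subseteq \mathrm{TIME}[s^{2-\varepsilon}].\\
\text{By Theorem 1.1: } \mathrm{TIME}[s^{2-\varepsilon}]
  &\subseteq \mathrm{SPACE}\!\left[\sqrt{s^{2-\varepsilon}\,\log\!\big(s^{2-\varepsilon}\big)}\right]
   = \mathrm{SPACE}\!\left[O\!\big(s^{1-\varepsilon/2}\sqrt{\log s}\big)\right]
   \subseteq \mathrm{SPACE}[o(s)].\\
\text{So } \mathrm{SPACE}[s] &\subseteq \mathrm{SPACE}[o(s)],
  \text{ contradicting the space hierarchy theorem.}
\end{aligned}
```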

What The Result Changes

The immediate technical change is clear: the old $t/\log t$ generic simulation was not tight for multitape Turing machines. The new bound is $\sqrt{t \log t}$.

The cultural change is more interesting. Williams notes that it was common to believe that time $t$ could not generally be simulated in $t^{1-\varepsilon}$ space. That belief supported intuitions around derandomization and time-space tradeoffs. The theorem refutes that particular intuition for this model.

This does not mean the community was foolish. It means the trace was persuasive. If the natural representation of a computation is sequential, it is easy to believe the cost of carrying that sequence is fundamental.

Feynpath’s interest is not to overread the theorem. Williams does not prove that physical computers should be quantum, reversible, optical, neuromorphic, or thermodynamic. He proves something narrower and sharper: in one of the cleanest classical settings we have, representation can change the resource story.

That is the right kind of lesson to export carefully.

The export is not:

Williams found a graph trick, therefore nature will beat von Neumann machines.

The export is:

Do not assume the dominant operational trace is the only faithful representation of the computation.

That is a design principle. It is also a research posture.

The Same Discipline, In Atoms

Post-von-Neumann computing is often described as a hardware story. That is only half true. New substrates matter only if we know what algorithms they want.

A substrate is not just a faster box. It changes which operations are natural, which resources are scarce, and which failure modes dominate. Energy may matter more than time. Locality may matter more than instruction count. Reversibility may change thermodynamic costs. Noise may become something to shape rather than something to remove.

This is where the Williams result fits feynpath’s philosophy. It is not a proof that these substrates win. It is a clean reminder that changing the representation can expose structure the old model hid.

| Substrate | Structure | Axis | Tradeoff | Example |
| --- | --- | --- | --- | --- |
| Quantum | Amplitudes over many histories | time / sampling | Interference is powerful, but useful speedups require careful structure. | Factoring, phase estimation, quantum simulation |
| Neuromorphic | Local dependency updates | energy / locality | Memory and compute move closer, but programming models are less universal. | Event streams, sparse sensing, adaptive control |
| Reversible | Bidirectional state evolution | energy | Avoids some erasure costs, but bookkeeping and error control get harder. | Low-power arithmetic, reversible simulation |
| Thermodynamic | Stochastic movement through state space | search / energy | Noise becomes a resource only when the landscape is engineered well. | Annealing, sampling, combinatorial landscapes |
| Optical | Broadcast and wave interference | latency / fan-out | Propagation is cheap, but precision and nonlinear control are expensive. | Fourier optics, matrix operations, signal routing |

The word “graph” should also be handled carefully. Williams’s proof uses a precise reduction to Tree Evaluation. Quantum hardware does not simply evaluate the same tree. Reversible hardware does not automatically give the same bound. Thermodynamic systems do not solve arbitrary optimization problems by existing.

The connection is conceptual, not formal.

Still, the concept matters. A von Neumann machine serializes. A quantum system evolves amplitudes. A reversible circuit preserves enough information to run backward. A thermodynamic system moves through a state landscape. An optical system propagates and interferes. Each substrate asks us to stop writing algorithms as if the only real computation is a linear instruction stream over passive memory.

That is the feynpath bet: the next useful machines will not just need ports of today’s algorithms. They will need algorithms that respect their physics.

We are not replacing the machines the world runs on. Not yet. What we can do is prepare the math: find the representations, reductions, and abstractions that make other substrates programmable.

Williams gives a classical exhibit of that habit. Look at the same computation differently. Measure the new costs honestly. Keep the result narrow. Then ask what other paths were hidden by the trace.

Nature was never sequential. Our algorithms do not have to be either.