
Why Your Brain Is Shaped Like That

The 20-watt computer in your skull runs on a handful of design principles. They explain everything from why neurons whisper instead of shouting, to why you can't just make a bigger brain.

You have about 86 billion neurons. Your brain is 2% of your body mass but burns about 20% of your calories. A single spike, counting all of its downstream synaptic effects, costs several billion ATP molecules. Axons — the brain's wires — fill roughly half of cortical volume. And yet the whole thing fits in a box the size of a grapefruit, boots up every morning from nothing, and somehow writes Hamlet.

Not by magic. By ruthless engineering. Every gram of grey matter is under the same constraints a chip designer faces: energy, space, noise, latency. Evolution has been iterating on this design for 600 million years, and the solutions it landed on aren't arbitrary — they are nearly forced by physics. This is an explainer about those solutions.

I.

Your brain runs on a light bulb

Let's start with the headline number: a human brain runs on about 20 watts. That is less than a MacBook charger. It is less than a ceiling fan. It is a little more than a USB power bank trickle-charging a phone.

For that 20 watts you get: 86 billion neurons, roughly a quadrillion synapses, most of the heavy lifting of vision, hearing, language, motor control, memory, and whatever-it-is that happens when you daydream. The thing boots up every morning with no recalibration, runs for decades with no spare parts, and is small enough to carry around on top of your shoulders.

Contrast that with the computers we build. A single NVIDIA H100 GPU — the kind of chip used to train large language models — draws about 700 watts. Training one frontier model consumes gigawatt-hours. Even a mid-range laptop CPU burns twice the brain's budget just browsing the web.

The comparison isn't perfectly fair. Brains and GPUs solve different problems (one controls a body in real time; the other predicts tokens from a prompt). But the power gap is real, and it is enormous. The brain is doing something the GPU is not: computing under a brutal energy budget.

So how does this work? How does 20 watts of wet meat outcompete 700 watts of silicon? That's the mystery this explainer is about. And the short answer is: the brain is a machine shaped, by natural selection, to stay within a strict budget, under exactly the constraints laid out above.

The principles we will uncover aren't a list of cute tricks. They're the near-inevitable consequences of trying to compute anything useful on a budget this small. Once you see them, the brain's anatomy stops looking like a pile of pink oatmeal and starts looking like a contract drawn up with physics.

Figure 1 · Power budget, on a log scale
[Chart: continuous power draw on a log scale, 1 W to 100 MW. Human brain, 20 W (86 billion neurons, runs for ~80 years on autopilot); laptop CPU, 45 W (≈2.3× the brain, just browsing the web); one NVIDIA H100, 700 W (35× the brain, one GPU inference card); one LLM training cluster, ~10 MW (500,000× the brain, for a few weeks of training).]

The brain is the cheapest thing on the chart. By orders of magnitude. The rest of this explainer is about how.

II.

The currency of thought

Before we talk about design principles, we need to understand the budget. Every decision the brain makes — how many neurons to grow, how thick to make an axon, when to fire, whom to connect to — is a trade against one resource: ATP, the universal molecular battery.

ATP powers pretty much everything in a cell. In neurons, its single biggest customer is a protein called the sodium-potassium pump. Every time a neuron spikes, sodium rushes in and potassium rushes out. To fire again, the neuron has to pump those ions back across the membrane, against their concentration gradients, and that costs ATP — lots of it.

Here's the ugly truth, in round numbers (Attwell & Laughlin 2001; Harris & Attwell 2012):

  - A single spike, counting all of its downstream synaptic effects, costs on the order of a few billion ATP molecules.
  - Roughly half the brain's budget goes to signaling (spikes and synaptic transmission); the rest goes to housekeeping (resting potentials, protein turnover, vesicle recycling).
  - Within the signaling share, synaptic transmission rivals or exceeds the cost of the spikes themselves.

And here is the hard ceiling: if the average cortical firing rate crept up to even 10 Hz, the brain would demand more power than cerebral blood flow can supply. So it can't. The average firing rate has to stay well under 1 Hz, across the entire cortical population (Lennie 2003).

This is the constraint. Everything downstream — sparse firing, short wires, analog retinas, adaptive receptors, dendritic computation — is a strategy for squeezing more bits per ATP.

To make it concrete, try the calculator. You set the average firing rate and what fraction of the 86 billion cortical neurons are actively participating. The widget computes the brain's total power draw and tells you whether your design is biologically plausible, or whether you've just cooked your brain.
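The arithmetic behind the widget is simple enough to sketch. This is a back-of-envelope model under loudly stated assumptions (roughly 3 × 10⁹ ATP per spike, the "several billion" above; roughly 8 × 10⁻²⁰ J of usable free energy per ATP; a flat 10 W of housekeeping), not the finer-grained budgets of the cited papers:

```python
# Back-of-envelope cortical power budget.
# Assumptions (hedged): ~3e9 ATP per spike (incl. synaptic effects),
# ~8e-20 J of free energy per ATP hydrolysis, a flat 10 W housekeeping cost.

N_NEURONS = 86e9           # total neurons
ATP_PER_SPIKE = 3e9        # "several billion ATP" per spike
J_PER_ATP = 8e-20          # joules liberated per ATP molecule
HOUSEKEEPING_W = 10.0      # resting potentials, protein turnover, ...
BLOOD_FLOW_LIMIT_W = 25.0  # roughly what cerebral blood flow can supply

def brain_power(rate_hz, active_fraction=1.0):
    """Total power draw (watts) for a given average firing rate."""
    spikes_per_s = N_NEURONS * active_fraction * rate_hz
    signaling_w = spikes_per_s * ATP_PER_SPIKE * J_PER_ATP
    return HOUSEKEEPING_W + signaling_w

for rate in (0.3, 1.0, 10.0):
    p = brain_power(rate)
    verdict = "plausible" if p <= BLOOD_FLOW_LIMIT_W else "COOKED"
    print(f"{rate:5.1f} Hz -> {p:7.1f} W  ({verdict})")
```

At the widget's default of 0.3 Hz this lands near the real ~20 W; by 1 Hz it already breaches the blood-flow ceiling, which is the sense in which sparse firing is forced rather than chosen.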

Interactive 1 · Metabolic Budget Calculator
[Interactive widget: sliders set the average firing rate (default 0.30 Hz) and the fraction of neurons active (default 100%); bars show housekeeping ~10 W and signaling ~10 W against a ~25 W blood-flow limit, with a verdict such as "✓ Plausible design — blood flow can supply this much power."]

Move the sliders. At the default (0.3 Hz average rate across all neurons) you land near the real 20 W. Push the rate up and the power bar turns red — at some point, no blood supply could feed the brain, and the design fails.

Well ackshually… The exact split between "signaling" and "housekeeping" varies across studies and species, and not all authors agree on where to draw the line. The qualitative fact — that signaling cost scales with spike rate fast enough to force sparse firing — is robust across every budget that has been published since Attwell & Laughlin 2001.
III.

Send only what surprises

Given the budget, the first principle writes itself: don't send information the receiver already has.

This is the insight Horace Barlow turned into a research program in 1961. His claim was radical for its time: the goal of early sensory processing isn't to represent the world faithfully. It's to re-encode the world so the message is as short as possible. A neuron that transmits redundant information is burning ATP for no new bits.

And natural signals are wildly redundant. Neighboring pixels in a photograph are almost identical. A blue sky sends the same message a million times over. Textures repeat. Movies change slowly from frame to frame. Sending the raw data would be like mailing someone the entire Wikipedia dump every time they ask for the weather.

So the retina doesn't. It subtracts the predictable part.

The retina as a compression engine

The retina has roughly 100 million photoreceptors but only about 1 million optic-nerve fibers leaving the eye for the brain. That's a 100× compression ratio at the very first processing stage. Somehow the retina takes all those pixel-level measurements and squeezes them into a much smaller stream of signals without losing much that matters.

The trick — discovered in cat retina by Stephen Kuffler in 1953 — is the center-surround receptive field. A retinal ganglion cell looks at a small patch of photoreceptors in its "center" and compares their average brightness to an annulus of photoreceptors around it (the "surround"). It fires when the center is brighter than the surround (or the reverse, depending on the cell type). Effectively, each ganglion cell is computing:

output = (center brightness) − (local average brightness)

What gets through is the difference between a pixel and its neighborhood — in other words, the part the neighborhood didn't predict. On a uniform surface the output is zero: no spikes needed, no ATP spent. On an edge, where the center and surround disagree sharply, the output is large: the spike is earning its keep.

This is the same move JPEG uses (local averages + difference coding). The same move video codecs use (send only what changed from the last frame). The same move modern LLMs use (predict the next token and keep only the residual surprise). The retina figured it out first, by about half a billion years.
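A minimal sketch of the center-surround computation, using a crude difference-of-boxes filter in place of the retina's smoother difference-of-Gaussians (the image and kernel size are arbitrary illustrations):

```python
import numpy as np

def center_surround(img, radius=1):
    """output = pixel - local average: a crude ganglion-cell model."""
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    # Local average over a (k x k) neighborhood, via shifted sums.
    local_avg = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            local_avg += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    local_avg /= k * k
    return img - local_avg

# A flat field with one vertical edge:
img = np.zeros((5, 8))
img[:, 4:] = 1.0
out = center_surround(img)
# Uniform regions -> ~0 (no spikes to pay for); the edge columns light up.
print(np.round(out, 2))
```

On the flat regions the output is exactly zero, so a spiking readout would stay silent there; only the two columns straddling the edge produce a signal worth paying for.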

Interactive 2 · The Retina's Eye
Raw input
After center-surround
6 px
Active pixels
Bits saved vs raw

Uniform regions vanish into grey — no signal, no spikes. Edges light up. The retina sends roughly the complement of what you'd expect: not the image, but the places the image is about to surprise you.

Aside The precise framing of early sensory processing is debated. "Redundancy reduction" à la Barlow is one reading; "predictive coding" (send only the prediction error) is a closely related one; and "sparse coding" over a learned dictionary is yet another. They agree more than they disagree: the retina, cochlea, and early cortex all seem to push the signal toward a representation where each active unit is rare, informative, and cheap. This explainer uses the term efficient coding to cover all three.
IV.

Whisper instead of shout

If each spike costs several billion ATP — counting the downstream synaptic work — then every spike you don't send is money in the bank. So the next principle follows immediately:

Send information at the lowest spike rate that gets the job done.

There are two ways to pull this off, and the brain uses both:

  1. Lower the rate directly. Transmit at a few Hertz, not hundreds. Let most time slots be silent.
  2. Sparsify the population. Instead of 1,000 neurons each firing at 10 Hz, have 100 neurons fire at 100 Hz and the other 900 stay quiet. The average firing rate is the same, but activity is now a rarer event per neuron, and by Shannon's logic a rarer event carries more bits. You've made each spike earn its keep.
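The Shannon logic in step 2 is one line of arithmetic. Model each small time bin as a binary symbol with spike probability p: the entropy is H(p) bits per bin, so the code can deliver at most H(p)/p bits per spike, a ceiling that grows as spikes get rarer:

```python
from math import log2

def bits_per_spike(p):
    """Max information per spike for a binary channel with spike prob p."""
    h = -p * log2(p) - (1 - p) * log2(1 - p)  # entropy per time bin
    return h / p                              # bits carried per spike

for p in (0.5, 0.1, 0.01, 0.001):
    print(f"p = {p:<6} -> {bits_per_spike(p):5.2f} bits/spike")
```

Roughly one extra bit per spike for every halving of p. This is an upper bound from the channel statistics, not a rate any real decoder is guaranteed to achieve.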

Real cortex uses both, aggressively. Recordings from awake monkey V1 during natural viewing (Vinje & Gallant 2000) and from rodent auditory cortex (Hromádka et al. 2008) find that only a small minority of neurons respond above baseline at any given moment. Across cortex as a whole, average firing rates well under 1 Hz are routine (Lennie 2003, Shoham et al. 2006). Most neurons, most of the time, are silent.

A worked example: the fly's H1 neuron

A beautiful case study comes from an insect. The H1 neuron sits in the fly's visual lobe and reports horizontal motion — the kind of signal a fly needs to stabilize flight. H1 has been measured to carry about 1 bit of information per spike under natural conditions (de Ruyter van Steveninck & Bialek). That's astonishingly efficient; many engineered channels do worse.

How does H1 pull that off? Part of the answer is spike timing. If you rate-code (average spike count over a fixed window), you throw away the temporal pattern. H1 doesn't. Its downstream readout cares about precisely when each spike arrives. A small number of well-timed spikes beats a large number of sloppily-placed ones.

Noise is the hidden reason this matters

Every spike train is jittery. Voltage-gated channels are stochastic, synapses release vesicles probabilistically, membranes have thermal noise. If you want a reliable readout from a rate code, you have to average over many spikes to beat the noise down — and averaging costs spikes. A sparse, temporally precise code hands the downstream decoder the most useful bits in the fewest events. The noise argument and the energy argument point the same way.
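The averaging tax is easy to put numbers on. For Poisson-like spiking, a count of N spikes fluctuates with standard deviation √N, so the relative error of a rate estimate falls only as 1/√N (an idealization; real spike trains are not exactly Poisson):

```python
# Relative error of a rate estimate built from a Poisson spike count.
# Observing N spikes gives std sqrt(N), so relative error = 1/sqrt(N).

for n_spikes in (4, 16, 64, 256):
    rel_error = 1 / n_spikes ** 0.5
    print(f"{n_spikes:4d} spikes -> rate known to about +/-{100 * rel_error:.1f}%")
```

Halving the readout noise costs four times the spikes, which is exactly the pressure that pushes codes toward timing precision instead of brute-force averaging.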

Interactive 3 · H1 on a Rainy Day
[Interactive widget: a true motion (velocity) trace encoded three ways: ① dense rate code (always firing, rate tracks velocity), ② sparse rate code (fires only when velocity is high), ③ temporal code (one well-timed spike per event), with spike-count and bits-per-spike readouts for each.]

All three codes report the same motion signal. The dense rate code burns around 80 spikes per one-second window; the sparse code fires in the tens; the temporal code makes do with around a dozen — each one carrying more information than a whole barrage of dense-code spikes. On a strict ATP budget, only ③ is sustainable.

Aside This is why grandmother cells aren't quite the myth they're sometimes said to be. Extremely sparse codes ("one neuron fires for Jennifer Aniston") are metabolically cheap. The real brain probably sits somewhere between fully distributed and fully localist — sparse but not singular.
V.

Analog where you can, spikes only when you must

We've been talking about spikes as if the brain's whole vocabulary were ones and zeros. It isn't. Spikes are only the part of the story that we, as recording neuroscientists, find easiest to measure. Inside a single neuron, signals are analog — continuous, graded voltages on the membrane, continuous currents through synapses, continuous concentrations of calcium in a dendrite. These analog operations are where most of the actual computation happens, and they are astonishingly cheap.

An analog integration — summing inputs on a dendrite, for instance — costs roughly whatever it costs to let ions flow passively through a patch of membrane. There is no discrete event to pay for, no pump needing to run at full tilt. As long as the signal stays small and local, analog is close to free.

The catch is that analog signals degrade with distance. A membrane is both resistive and capacitive: it has the electrical shape of a leaky cable. A subthreshold voltage launched at one end of a long dendrite doesn't arrive at the other end unchanged — it smears out, attenuates, and drifts into the noise floor. As a rule of thumb: analog is beautiful over a few hundred micrometers. Over millimeters, it's a mess. Over centimeters or meters, it's impossible.
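The passive-cable picture can be put in numbers. At steady state a graded potential decays as V(x) = V₀ · e^(−x/λ); the length constant λ = 0.5 mm below is an assumed, dendrite-ish value, not a measured one:

```python
from math import exp

LAMBDA_MM = 0.5  # length constant: a few hundred micrometers (assumed)

def attenuation(distance_mm, lam=LAMBDA_MM):
    """Fraction of a graded potential surviving a passive cable run."""
    return exp(-distance_mm / lam)

for d in (0.1, 0.5, 2.0, 10.0):
    print(f"{d:5.1f} mm -> {100 * attenuation(d):10.6f}% of the signal left")
```

Most of the signal survives a few hundred micrometers; at a centimeter essentially nothing does. Hence spikes for anything long-haul.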

This is exactly the problem a digital spike solves. An action potential is self-restoring: as long as the membrane is excitable, each patch regenerates the full voltage swing as the wave passes through. A spike that leaves your lumbar spine can arrive at your big toe with the same amplitude a meter later. Digital is robust because each repeater cleans up the signal on the way.

So the design rule is:

Analog where you can, spikes only when you must.

The retina gives us a beautiful demonstration. Its first several layers — photoreceptors, bipolar cells, horizontal cells, and most amacrine cells — run entirely on graded voltages. No action potentials. Massive parallel computation (contrast enhancement, gain control, direction-selectivity, motion detection) happens in pure analog, inside a structure only a fraction of a millimeter thick. It's only when the retinal ganglion cells have to send their output a long way — down the optic nerve, to the thalamus — that spikes finally show up. The retina defers digitization until the last possible moment.

The same pattern shows up elsewhere. Cochlear hair cells communicate with their auditory-nerve partners via graded voltage, though the nerve itself spikes. Many invertebrate neurons get by with no spikes at all. Spikes are expensive and you only spend on them when geometry forces your hand.

Interactive 4 · The Analog-Digital Tradeoff
[Interactive widget: a distance slider (default 1 mm) compares an analog channel (graded voltage) with a digital channel (spikes), showing whether each arrives intact and their energy costs (analog ~1× baseline, digital ~1000× baseline).]

Move the slider. At short distances (inside a dendrite, up to a few hundred micrometers), analog wins on every axis. Crank the distance up and analog falls apart — the signal dies out before it arrives. Spikes cost vastly more to generate but stay legible over any distance.

VI.

Every wire is a tax

Your cortex is mostly wire. Axons and dendrites together occupy roughly half of cortical volume (Braitenberg & Schüz). Cell bodies, synapses, blood vessels, and glia share the remaining half. You are, anatomically speaking, walking around with the better part of a kilogram of neural cabling sloshing inside a braincase.

Every wire costs:

  - Space. Axons and dendrites compete for a fixed cranial volume, and they already fill half of it.
  - Energy. Membrane has to be built, maintained, and repolarized along the wire's entire length.
  - Time. A longer wire means a later-arriving signal, and latency matters to a body acting in real time.

The conclusion is inescapable:

Minimize wire.

The brain uses three big tricks to obey this rule:

Trick 1: Topographic maps

Adjacent points on the retina project to adjacent neurons in V1. Adjacent fingers project to adjacent columns in somatosensory cortex. Adjacent frequencies project to adjacent regions in auditory cortex. These topographic maps aren't there because the brain likes neatness — they're there because related neurons need to talk to each other, and putting them next to each other minimizes wire.

Trick 2: Modularity

The brain is carved into dozens of functionally distinct areas — V1, V2, V4, MT, IT, M1, S1, and so on. Why? Because most communication within any one function is local. Keeping all the parts of "visual processing" in the occipital lobe means most of visual cortex's axons stay within a few millimeters of each other. Only a handful need to be long-range.

Trick 3: Small-world connectivity

Most connections in cortex are local. A small number are long-range shortcuts between distant areas. This is the small-world pattern that Chklovskii and colleagues (2002) showed is close to what you would get if you solved the connectivity layout as a formal wire-minimization problem. In well-mapped nervous systems — C. elegans, primate visual cortex — the real layout sits remarkably close to the wiring-optimal one.

Below, try your hand at the problem. You have eight brain "modules" connected by a fixed pattern of edges. Drag them around. See if you can beat the random layout — and then try to beat the layout a computer finds by gradient-descent on wire length.
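The widget's optimizer can be sketched as gradient descent on total wire length. Everything here is illustrative: the eight-module graph is invented (two dense clusters plus one shortcut), and a soft repulsion term stands in for the fact that real modules occupy area and can't collapse onto one point:

```python
import numpy as np

rng = np.random.default_rng(0)

# Eight "modules"; edges list which pairs are connected.
# Hypothetical graph: two densely wired clusters plus one shortcut.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3),   # cluster A
         (4, 5), (4, 6), (5, 6), (5, 7), (6, 7),   # cluster B
         (3, 4)]                                   # long-range shortcut

def total_wire(pos):
    """Sum of Euclidean edge lengths for a layout."""
    return sum(np.linalg.norm(pos[i] - pos[j]) for i, j in edges)

pos = rng.uniform(0, 10, size=(8, 2))   # random starting layout
start = total_wire(pos)

lr = 0.05
for _ in range(2000):
    grad = np.zeros_like(pos)
    # Attraction: each edge pulls its endpoints together.
    for i, j in edges:
        d = pos[i] - pos[j]
        g = d / (np.linalg.norm(d) + 1e-9)
        grad[i] += g
        grad[j] -= g
    # Soft repulsion: modules occupy area, so they can't all collapse.
    for i in range(8):
        for j in range(i + 1, 8):
            d = pos[i] - pos[j]
            r = np.linalg.norm(d) + 1e-9
            push = d / r * min(4.0, 4.0 / r ** 2)
            grad[i] -= push
            grad[j] += push
    pos -= lr * grad

print(f"random layout: {start:.1f}  optimized: {total_wire(pos):.1f}")
```

Densely connected modules drift together while the lone shortcut stays long, which is the small-world layout the Chklovskii result predicts.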

Interactive 5 · Minimize the Wire
[Interactive widget: drag the eight modules; readouts compare your total wire length with a random layout and a computer-found optimum.]

Drag the modules. The highlighted edges tell you which pairs are connected, and the total wire length updates live. Notice how clusters of densely-connected modules want to be near each other — exactly the layout principle that organizes real cortex.

Well ackshually… The cortex is folded — gyri and sulci — and one reason often cited is that folding reduces white-matter volume by keeping connected areas close. This is probably partly right, but also partly mechanical: the cortical sheet grows faster than the skull, and has to buckle. As with most things in the brain, multiple pressures matter at once.
VII.

The axon triangle: speed, size, energy

Zoom in on a single wire — one axon. The designer (evolution) wants three things from it:

  - Speed: signals should arrive fast, because latency is behavior.
  - Size: the wire should be thin, because volume is scarce.
  - Energy: conduction should be cheap, because ATP is scarce.

You can't have all three. The physics sets up a triangle where improving any two corners makes the third worse. This is one of the most elegant design tradeoffs in all of biology.

Unmyelinated axons

In an unmyelinated axon — the kind you find in invertebrates and in many thin vertebrate fibers — the conduction velocity scales as the square root of the diameter:

velocity ∝ √(diameter)

Want to double the speed? You need to quadruple the diameter. The membrane area per unit length goes up 4×, and the volume of the axon (proportional to diameter squared) goes up a whopping 16×. Unmyelinated speed is ruinously expensive to buy.

This is why the giant squid axon is giant. It's about a millimeter thick, so it can conduct at ~25 m/s — fast enough to trigger an escape jet when a predator looms. The squid can afford the volume only because it's a huge animal and the axon is a single wire to its mantle. A human trying to do the same thing would need an optic nerve the thickness of a telephone pole.

Myelin: the hack that changed everything

Vertebrates invented a fix. Wrap the axon in an insulating sheath — myelin — leaving small gaps (nodes of Ranvier) where the spike is regenerated. The signal effectively jumps between nodes, a process called saltatory conduction. The velocity now scales linearly with diameter:

velocity (myelinated) ∝ diameter

This is huge. At a given diameter, a myelinated axon is roughly 5–10× faster than an unmyelinated one. Or equivalently, for a given speed, you can use a much thinner (and cheaper) wire.

But myelin isn't free. The oligodendrocytes that make it cost metabolic energy to grow and maintain. The sheath itself takes up space. And there's a minimum diameter below which it doesn't pay off: below about 0.2 µm (Waxman & Bennett 1972), the space and energy the sheath consumes outweigh the speed it buys. Real thin axons, like many cortical local-circuit fibers, don't myelinate. Real thick axons, like motor neurons projecting to muscles, always do.
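The two scaling laws are easy to lay side by side. The constants are rough calibrations, not gospel: a √d law pinned to the squid giant axon (about 25 m/s at about 1,000 µm) and the classic rule of thumb of about 6 m/s per µm for myelinated fibers; the 0.2 µm cutoff encodes the Waxman & Bennett overhead argument rather than anything in the speed curves:

```python
from math import sqrt

def v_unmyelinated(d_um):
    """~sqrt(d) law, pinned to the squid giant axon (25 m/s at 1000 um)."""
    return 25.0 * sqrt(d_um / 1000.0)

def v_myelinated(d_um):
    """~linear law, about 6 m/s per um of diameter (rule of thumb)."""
    return 6.0 * d_um

MIN_MYELIN_UM = 0.2  # below this, sheath overhead outweighs the speed gain

for d in (0.1, 0.2, 1.0, 10.0):
    u, m = v_unmyelinated(d), v_myelinated(d)
    best = "myelinate" if d >= MIN_MYELIN_UM else "stay bare"
    print(f"d = {d:5.1f} um: bare {u:5.2f} m/s, myelinated {m:6.1f} m/s -> {best}")
```

Below the cutoff the bare fiber wins by default; above it, the myelinated curve is faster at every diameter on the chart.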

The brain has both, each axon in the cheapest configuration for its job. Try to build one.

Interactive 6 · Design Your Own Axon
[Interactive widget: a diameter slider (default 1.0 µm) with a side view of your axon and a velocity-vs-diameter plot (0.1 to 20 µm, 0 to 120 m/s) showing the unmyelinated (∝√d) and myelinated (∝d) curves with real biological axons as gray dots; readouts give conduction speed, cross-section area, and a design verdict.]

The pink dashed curve is the unmyelinated option: slow to gain speed, thick to be fast. The teal curve is the myelinated option: faster for any given diameter, but only worth it above about 0.2 µm. The grey dots are real biological axons — from cortical thin fibers to motor neurons. They cluster tightly along one curve or the other. Evolution is not making arbitrary choices.

VIII.

Compute in the dendrites

Here's a generalization of the previous section: if wire is expensive, pack more computation into each neuron, because local computation is cheaper than extra neurons plus the axons to connect them. A neuron that can solve a hard sub-problem inside its own dendrites is saving you a whole circuit.

For a long time, textbooks treated the neuron as a point — a little weighted-sum-plus-threshold unit, the classic McCulloch-Pitts model, which is also what artificial neural networks imitate. Dendrites, in this view, were passive antennae: they just collected inputs and funneled them to the soma to be summed.

That picture is wrong, and has been for decades.

Real cortical dendrites are studded with voltage-gated ion channels (NMDA, Ca²⁺, Na⁺). They generate their own local regenerative events called dendritic spikes. Different branches act like semi-independent subcomputers, each one applying a nonlinearity to its own set of inputs before passing the result along. Poirazi & Mel (2003) showed that a single cortical pyramidal cell can, in principle, implement the computation of a small two-layer artificial neural network. One neuron; two layers.
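A toy version makes the two-layer claim concrete. A single linear-threshold "point neuron" cannot compute XOR, but a unit with two thresholded branches feeding a summing soma can; the branch wiring below is hand-picked for the demo, not a model of any real cell:

```python
def step(x, theta=0.5):
    """Threshold nonlinearity, standing in for a dendritic spike."""
    return 1.0 if x >= theta else 0.0

def dendritic_neuron(x1, x2):
    """Two nonlinear branches, then a summing soma: computes XOR."""
    branch_a = step(x1 - x2)   # fires for input (1, 0)
    branch_b = step(x2 - x1)   # fires for input (0, 1)
    return step(branch_a + branch_b)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", int(dendritic_neuron(x1, x2)))
```

No single weighted sum of the two inputs pushed through one threshold can produce this truth table, which is precisely the sense in which nonlinear dendrites buy a neuron an extra layer.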

A concrete case: direction selectivity in the retina

The retina has cells called starburst amacrine cells that detect the direction of motion. A starburst has a radial dendritic tree, like a starfish, with inputs arriving along the whole spread of each branch. When light moves outward along a dendrite, from soma toward tip, it excites the proximal inputs first, and the excitation sweeps along with the stimulus, summing up as it travels toward the tip. Motion in this "preferred" direction therefore builds a strong depolarization and a strong output. Motion in the opposite ("null") direction works against that summation, and inhibitory inputs along the way cancel what remains, so the output stays silent.

The punchline: the starburst cell computes direction selectivity inside its own dendritic tree, before a single spike leaves the cell. An equivalent network built from simple point-neurons would need around a dozen cells, several layers of synapses, and a cobweb of axons running between them. The dendrite-based version, in contrast, is one cell. Massively cheaper.

Figure 2 · One fancy neuron vs ten plain ones
[Diagram: ① a starburst amacrine cell (1 cell, ~24 inputs, 8 direction-selective outputs, motion arrow) beside ② an equivalent point-neuron network with input, delay-and-gate, and direction-cell layers (~14+ neurons, ~40+ synapses, extra axons).]

Both circuits compute the same function: given a moving stimulus, report its direction. On the left, one clever cell does it in analog, inside its own dendrites, for roughly the energy cost of one cell. On the right, a point-neuron network achieves the same computation using more neurons, more synapses, more axons — and more spikes. Local compute wins.

The same story plays out in the hippocampus, where pyramidal-cell dendrites combine grid-cell inputs with sensory cues to produce place-specific firing; in the cerebellum, where Purkinje cells integrate hundreds of thousands of inputs with branch-specific gain control; and in the cortex, where layer-5 pyramidal neurons use NMDA spikes and dendritic calcium plateaus to implement nonlinear combinations of feedback and feedforward signals. The common theme: whenever you can trade a little biophysical cleverness for a lot of extra neurons and axons, take the trade.

IX.

Adapt, match, track reality

Efficient coding (§III) isn't just about space. It's also about time. The world at noon is not the world at midnight; a code tuned for daylight is wasted on a moonless forest. So the "match your code to the statistics of the input" principle has to be reapplied continuously. That's adaptation.

Every neuron has a limited dynamic range — perhaps five or six bits of effective information per spike train, in a good case. The world, on the other hand, has a dynamic range that's embarrassing. Starlight to noon sunshine is a factor of 10⁹. Sound intensities span twelve orders of magnitude from quietest whisper to painful loudness. A single-scale neuron would saturate half the time and be lost in the dark the other half.

The way out is to rescale the encoding continuously. This is why:

  - your eyes take minutes to regain sensitivity when you step from sunlight into a dark cinema: the retina is re-tuning its operating point;
  - the pupil, the photoreceptors, and the retinal circuitry each apply their own layer of gain control;
  - a constant pressure on your skin fades from awareness: adapting receptors stop reporting what isn't changing.

Laughlin's fly, and the information-theoretic optimum

In 1981 Simon Laughlin (yes, same Laughlin) measured the input-output curve of the blowfly's large monopolar cells — the first interneurons in the fly visual system — and compared it to the distribution of contrasts the fly actually encounters in a natural scene. The comparison was beautiful:

The neuron's response curve was almost exactly the cumulative distribution function of natural contrast.

Why is that the right answer? Because if you match the response curve to the CDF of the input, each output level is used with equal probability — which, by a classic result in information theory, is the encoding that maximizes transmitted bits per spike for a bounded output. The fly has, in effect, tuned itself to the world it lives in. A fly that evolved in a different visual environment would have a slightly different curve.

This is also why perceptual laws like Weber–Fechner exist: perceived intensity often goes as the log of physical intensity. Not a mystery — close to the optimal compressive coding when inputs span many orders of magnitude with a roughly log-distributed probability.
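Laughlin's rule is easy to verify numerically: use the input's own empirical CDF as the response curve and every output level gets used equally often, which maximizes entropy for a bounded output. The log-normal "contrast" distribution below is a made-up stand-in for real scene statistics:

```python
import numpy as np

rng = np.random.default_rng(1)
# Stand-in for natural contrasts: skewed, heavy-tailed, log-normal-ish.
contrasts = rng.lognormal(mean=0.0, sigma=0.75, size=100_000)

# The Laughlin move: response curve = empirical CDF of the input.
sorted_c = np.sort(contrasts)
def response(x):
    """Map input values through the empirical CDF, into [0, 1]."""
    return np.searchsorted(sorted_c, x) / len(sorted_c)

out = response(contrasts)
counts, _ = np.histogram(out, bins=10, range=(0, 1))

# Every response level is occupied about equally -> max bits per symbol.
print("occupancy per response bin:", counts)
```

A generic straight-line curve would waste output levels on inputs that rarely occur; the CDF spends dynamic range exactly where the probability mass is.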

Your turn to match a histogram.

Interactive 7 · Match the Histogram
[Interactive widget: an input histogram from a natural scene beside a draggable response curve (violet dots, 0 to max response over the input range); readouts show information transmitted and the match to the optimal CDF.]

Drag the violet dots to shape your neuron's response curve. The widget computes how much information the curve transmits about the input. The maximum — revealed by the "optimum" button — is always the cumulative distribution of the input. A neuron that knows the world's statistics carries the most bits. A neuron with a fixed generic curve (say, a straight line) does not.

X.

The principles compose

Take a breath. Here is the list, in seven words each:

  1. Send only what surprises the receiver downstream.
  2. Whisper: use the fewest spikes that work.
  3. Stay analog locally; spike only for distance.
  4. Minimize wire; keep neurons that talk close.
  5. Myelinate only when speed justifies the overhead.
  6. Compute in dendrites before spending a spike.
  7. Adapt the code to the world's statistics.

These are not seven independent tips. They all fall out of the same recurring pressure: intelligence has to run on a metabolically limited, physically embedded substrate. Evolution has stress-tested an enormous number of nervous systems — from jellyfish nerve nets to human cortex — and the ones that survived tend to respect these constraints in overlapping ways. The principles are, more or less, the shape of the survivors.

Why you can't just make a bigger brain

An obvious question: if a larger brain is smarter, why aren't we all dolphins? Suzana Herculano-Houzel and colleagues answered this using a technique called isotropic fractionation — literally dissolving brains and counting the nuclei in the soup. The results were startling:

  - Neuron count doesn't simply track brain size. Primate brains pack neurons at nearly constant density as they grow; rodent brains dilute them, so a rodent brain with our 86 billion neurons would have to weigh tens of kilograms.
  - The human brain is no outlier: it is a linearly scaled-up primate brain, with about the neuron count you'd predict for a primate of our brain mass.
  - Bigger isn't denser where it counts. An elephant brain is three times the mass of ours, yet its cerebral cortex holds roughly a third as many neurons as a human's; most of its neurons sit in the cerebellum.

The moral: the brain's constraints are not arbitrary. They're the reason bigger isn't automatically better, and they're part of why there is no obvious path to a brain ten times the size of a human's that would still fit in a skull with enough blood supply to run.

What these principles help explain

Not fully, but substantially:

  - why average cortical firing rates sit far below 1 Hz;
  - why the retina computes in analog and compresses 100× before the optic nerve;
  - why cortex is organized into topographic maps and local modules;
  - why some axons are myelinated and others are left bare;
  - why perception is compressive (Weber–Fechner) and ceaselessly adaptive.

A harder comparison: brains vs LLMs

You can't talk about neural design in the 2020s without the obvious question: if evolution built a 20-watt thinking machine, why does it take 700 watts to run GPT-4 inference, and gigawatt-hours to train it?

The honest answer isn't that LLMs are "doing it wrong." It's more nuanced:

  - GPUs keep memory and compute apart, so every operation pays for round-trips to HBM; a neuron stores its "weights" in the very synapses that do the computing.
  - Transformers activate densely and advance on a global clock; cortex is sparse and event-driven, spending energy only where something happened.
  - An LLM's weights are frozen at inference time; the brain re-tunes itself continuously.
  - And the head start is absurd: evolution has iterated on neural hardware for 600 million years, large-scale silicon for a few decades.

Here's a scorecard, with the caveat that the comparison is imperfect:

Principle                       | Brain                | Transformer on GPU  | Neuromorphic target
Sparse activity                 | ✓ <1 Hz avg          | ✘ dense             | ~ partial (MoE)
Analog local compute            | ✓ graded voltages    | ✘ all digital       | ✓ in-memory compute
Co-located memory + compute     | ✓ same cell          | ✘ HBM round-trip    | ✓ the whole point
Event-driven operation          | ✓ spikes             | ✘ tick-synchronous  | ✓ spike-based
Wire-aware layout               | ✓ small-world cortex | ~ chiplet placement | ~ mesh interconnects
Online adaptation               | ✓ continuous         | ✘ frozen weights    | ~ work in progress
Low precision / noise tolerance | ✓ noise-native       | ~ bfloat16, quant   | ✓ noise-native

The scorecard shows the efficiency gap isn't mysterious — it's a design gap, and it's one that hardware people are actively trying to close. The principles aren't just a story about biology. They are, increasingly, a roadmap for silicon.

Closing thought

The brain is not a general-purpose computer that happens to run on biology. It is a machine whose structure has been heavily sculpted by a few hard physical limits — energy, space, latency, noise. Evolution didn't "choose" these principles so much as stumble onto designs that don't go extinct. The principles are the shape of the survivors.

So the next time someone asks why a brain can do so much with so little — tell them it's because it had to. A more generous budget would have produced a different, lazier machine. The 20 watts is why the design is interesting at all.

The universe's stingiest engineer is the one you should learn from. Thermodynamics is a teacher who never grades on a curve.

Further reading

  - Barlow (1961), "Possible Principles Underlying the Transformations of Sensory Messages."
  - Laughlin (1981), "A Simple Coding Procedure Enhances a Neuron's Information Capacity."
  - Attwell & Laughlin (2001), "An Energy Budget for Signaling in the Grey Matter of the Brain."
  - Chklovskii, Schikorski & Stevens (2002), "Wiring Optimization in Cortical Circuits."
  - Lennie (2003), "The Cost of Cortical Computation."
  - Poirazi, Brannon & Mel (2003), "Pyramidal Neuron as Two-Layer Neural Network."
  - Herculano-Houzel (2009), "The Human Brain in Numbers: A Linearly Scaled-Up Primate Brain."
  - Sterling & Laughlin (2015), Principles of Neural Design.

Built with a lot of help from evolution.