Shepherd and Flock at the Approach of a Storm, by Jean-François Millet. 1902

Read the Reflection, written 26 August 2021, below the following original Transmission.

COVID-19 is revealing invisible ingredients in our contact networks: breath, touching, and physical surfaces. Understanding these networks is key to assessing how infectious diseases spread and how measures to extinguish epidemics are best deployed. The problem is that these same networks are a manifestation of our social freedoms and economy, and are both hard and dangerous to relinquish. This is why voluntary physical distancing is generally proving insufficient to protect public health and health services.

Simple epidemiological “toy” models explain why, and to what degree, more restrictive confinement is necessary, and how communities and nations can be eased out towards normalcy. These models can be revealing and can help with policy, but they come with operating instructions that even specialists can easily overlook.

The central concept in epidemiology found in toy models is the “basic reproductive number,” usually referred to as R0. R0 is the expected number of new infections directly caused by a single infected person at the start of an epidemic (time t=0), and as the epidemic proceeds, the analogous measure is the “effective reproduction number,” or R(t) (hereafter R). R is an average. So, an R=6 in a population of two infected and tens of susceptible individuals could mean that each infected six other people, or possibly that one infected zero and the other, 12. Calculating R for real epidemics can be complex. [1] Notably, this key number may differ from one place to another, change during the course of an epidemic, and, fortunately, be reduced by disease-control measures.

Estimates of R0 for COVID-19 are consistently about 2.5, and, given infectious periods and times from infection to symptoms, this results in the number of cases currently doubling about every three days. [2] The power of exponential growth is clear: one month (10 doublings) after the 100th case, about 100,000 confirmed cases would have occurred, had physical distancing and confinement measures not been put into place.
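The doubling arithmetic can be checked in a few lines of Python. This is a minimal sketch that assumes uninterrupted exponential growth with a fixed doubling time; the function name and its parameters are illustrative and not taken from the cited sources.

```python
def cases_after(initial_cases: float, days: float, doubling_time_days: float) -> float:
    """Cases after `days` of unchecked growth with a fixed doubling time (assumed)."""
    doublings = days / doubling_time_days
    return initial_cases * 2 ** doublings


# 100 cases, 30 days, doubling every 3 days: 10 doublings.
print(cases_after(100, 30, 3))  # 102400.0
```

Ten doublings multiply the starting count by 1,024, which is where the figure of about 100,000 comes from.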

R is useful not only in understanding epidemics, but in developing control measures. Unsurprisingly, maintaining R below 1 for a sufficient period will progressively reduce case numbers to zero. The problem for COVID-19 is that measures that lower R, but not below 1, will mean a more protracted epidemic, many people unnecessarily coming down with the disease, and a progressive strain on (if not the complete exhaustion of) health services. Such measures do indeed “flatten” the epidemic curve, [3] and, with ever-fewer susceptible individuals in the population, eventually reduce R below 1, the condition associated with “herd immunity” (formally, this occurs when the susceptible fraction of the population is less than 1/R0, that is, when the immune fraction exceeds 1-1/R0).
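The herd-immunity condition can be made concrete with a short Python sketch. It assumes the simplest freely mixing model, in which the effective reproduction number is R0 multiplied by the susceptible fraction. Under that assumption R falls below 1 once the susceptible fraction drops below 1/R0, or equivalently once the immune fraction exceeds 1-1/R0. The function names are hypothetical.

```python
def effective_r(r0: float, susceptible_fraction: float) -> float:
    """Effective reproduction number under free mixing (assumed): R0 times the susceptible fraction."""
    return r0 * susceptible_fraction


def herd_immunity_threshold(r0: float) -> float:
    """Immune fraction above which the effective reproduction number is below 1."""
    return 1 - 1 / r0


r0 = 2.5  # estimate quoted in the text
print(f"immune fraction needed: {herd_immunity_threshold(r0):.0%}")
print(f"R with 40% still susceptible: {effective_r(r0, 0.4):.2f}")
```

With R0 of about 2.5, the threshold sits at roughly 60% immune, which corresponds to 40% of the population remaining susceptible.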

This simple insight is very powerful for yet another reason. Simply touting reductions in R as the ultimate objective of epidemic control overlooks the fact that the timing of control measures determines the effectiveness of a given reduction in R. As a general rule, R needs to be reduced early to prevent an epidemic, but should the epidemic take off (as seen in virtually every country where COVID-19 cases have occurred), strict measures are necessary to reset to near-zero cases, as has recently been reported from China.

For example, starting from 100 infected individuals in a population of 50 million, lowering R from 2.5 to 1 will result in about 100 new infections about every two weeks, based on realistic epidemic parameters. [4,5] This would mean on the order of several thousand cases in a year. But the same measures starting with a population with 100,000 infected people will produce several million infections over the same period. This assumes that people mix freely; real contact networks will result in far fewer new cases. [6] Nevertheless, the take-home message is that the size of the infectious population when measures are engaged is important in determining the degree of confinement required, that is, the extent to which measures must lower R from the baseline R0. The near-complete lockdown in parts of China is a case in point: R approaching 0 was maintained for about two months, resulting in the virtual collapse of the epidemic there. The question now for China, and soon for other nations easing out of strict confinement, is how physical distancing can be tuned to R≈1 or less, with the risk that failure to achieve this will produce a new epidemic.
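The comparison between the two starting points can be reproduced with a simple generation-by-generation tally in Python. This is a minimal sketch, not the model behind references [4,5]. It assumes a fixed 14-day generation interval (read from "about every two weeks"), a constant R, free mixing, and no depletion of susceptibles. The function name and parameters are hypothetical.

```python
def cumulative_new_infections(
    initial_infected: float,
    r: float,
    generation_days: int = 14,   # assumed generation interval
    horizon_days: int = 364,     # roughly one year, 26 generations
) -> float:
    """Total new infections over the horizon when each generation infects r times its size."""
    generations = horizon_days // generation_days
    current = float(initial_infected)
    total = 0.0
    for _ in range(generations):
        current *= r        # size of the next generation of infections
        total += current
    return total


print(cumulative_new_infections(100, 1.0))      # 2600.0
print(cumulative_new_infections(100_000, 1.0))  # 2600000.0
```

Holding R at 1 keeps each generation the same size as the one before, so the yearly total scales directly with the number of people infected when measures begin.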

Michael Hochberg
Centre National de la Recherche Scientifique
Santa Fe Institute


  1. Heffernan et al. 2005. J R Soc Interface 2, 281–293. doi: 10.1098/rsif.2005.0042
  3. Anderson et al. 2020. The Lancet 395, 931–934. doi: 10.1016/S0140-6736(20)30567-5
  4. Bjornstad. 2018. Epidemics: Models and Data Using R. Springer
  6. Keeling & Eames. 2005. J R Soc Interface 2, 295–307. doi: 10.1098/rsif.2005.0051



Read more posts in the Transmission series, dedicated to sharing SFI insights on the coronavirus pandemic.

Listen to SFI President David Krakauer discuss this Transmission in episode 27 of our Complexity Podcast.


August 26, 2021

Two Easy Pieces

When I wrote my Transmission essay in April 2020, scientists, particularly those with some background in virology and epidemiology, were just beginning to apply their knowledge to this never-before-seen situation. True, we were warned by Bill Gates. Also true, there had been pandemics before, but the two most notable in modern times (the 1918 Influenza and AIDS) were different from COVID-19. The 1918 Influenza, or Spanish Flu, eventually killed tens of millions worldwide, in part because public education, communications, and epidemiological models were not what they are today. Major breakthroughs in epidemiological modeling1 in the decades following the 1918 Influenza provided a basis for understanding subsequent influenza pandemics and the AIDS pandemic.

More than any infectious disease to date, COVID-19 stands alone in the massive mobilization of the mathematical modeling community. Models of all shapes and sizes were poised, and within weeks all baseline epidemiological parameters had been estimated for the disease and its causal virus, SARS-CoV-2. But perhaps most singularly, thanks to communication networks (social media, traditional media, and the lightning-fast dissemination of research preprints), information flowed freely, fostering progress. But information also flowed in huge quantities, which, along with inaccurate or poor communication between the scientific community and the public at large, generated confusion and mistrust.2

I’ve always championed first principles and parsimonious theory, and COVID-19 made clear that computational technology can go too far in the justification of data-driven, sometimes mind-bogglingly complicated models. John von Neumann said,3 “With four parameters I can fit an elephant, and with five I can make him wiggle his trunk,” which could be construed to imply that data justifies truth, which, in the end, is little more than a compilation of details.4

Two central variables derived from the most basic epidemiological models go a long way toward understanding COVID-19. They are the effective reproduction number, Reff, and the infectious fraction of the non-immune population, I. True, Reff can be decomposed into further parameters, foremost among them being the basic reproduction number R0, the immune fraction of the population, and contact network heterogeneity. True too, the baseline epidemiological model has other important parameters (e.g., incubation and infectious periods, fatality ratio), but they wind up influencing one or both of Reff and I.

This has both conceptual and practical implications, but above all, it is compellingly simple. In the early stages of an epidemic when the susceptible fraction is large, growth is Reff and the population burden is the product of I and Reff. I cringe when I hear “this is so important because it’s growing by X%.” No! Or rather, yes, it can be: 100% interest on $1 is a pittance compared to the same on $1M, and 0.001% on the latter is still far more lucrative than 100% on the former. Two pieces are needed: growth rate and capital.5 The same logic goes for COVID-19.
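The interest comparison is easy to verify in Python, and the same multiplication carries over to the epidemic case. This is a minimal sketch. The function names are hypothetical, and treating next-generation infections as Reff multiplied by the current number of infectious people is a simplifying assumption.

```python
def absolute_gain(capital: float, rate_percent: float) -> float:
    """Absolute return: the growth rate applied to the capital it acts on."""
    return capital * rate_percent / 100


def next_generation_infections(infectious: float, r_eff: float) -> float:
    """Epidemic analogue (assumed): Reff applied to the current infectious count."""
    return infectious * r_eff


print(absolute_gain(1, 100))            # 100% on $1
print(absolute_gain(1_000_000, 0.001))  # 0.001% on $1M

# A high Reff acting on few cases versus a lower Reff acting on many cases.
print(next_generation_infections(10, 2.5))
print(next_generation_infections(100_000, 1.1))
```

The first pair returns $1 and about $10 respectively. Neither the rate nor the base is informative alone, which is the point of asking for both pieces.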

So, this is what you need to know:6

Data-driven statistical models now routinely estimate R0 and Reff. R^C_eff is the value of Reff as reduced by measures such as social distancing and lockdowns. Low numbers of infectious cases “buy time” in exponential growth. Higher numbers mean that health services are stressed and risk future collapse, making lockdowns necessary. Capping R^C_eff at 1.0 makes most sense if active case numbers are low (since otherwise, although the curve is flattened, new case numbers remain high). But capping at low case numbers is problematic, because it stymies the growth of natural immunity. Capping is also a difficult sell for governments, whose constituents see that freedoms remain limited despite the virus apparently being under control.

Thus, importantly, there is a trade-off between minimizing morbidity and mortality and negative externalities to individuals and society.7 A recent study8 of optimal COVID-19 control is consistent with the above schema, and in particular the endpoint at R^C_eff ≈ 1, which in turn is supported at least for some countries.9 R^C_eff ≈ 1 is, however, unsustainable due to the negative externalities associated with a long waiting time to achieve herd immunity. Until only very recently, vaccination in conjunction with physical distancing and natural immunity was knocking R^C_eff below 1 in many countries, promising local virus endemicity or extinction . . . but now as I write, the Delta variant is surging, suggesting that simple models are limited when it comes to longer-term evolutionary dynamics.

Nevertheless, data-driven, computationally complex models are essential for determining what, when, and how much in decision-making. However, we shouldn’t be blinkered into thinking that their “truth” makes them invariably superior to coarse-grained, toy models. Due to the importance of how science is communicated, and the centrality of collective behavior in both preventing the spread of SARS-CoV-2 and in vaccination campaigns, I believe that simple models will be pivotal in vanquishing the COVID-19 pandemic.

Read more thoughts on the COVID-19 pandemic from complex-systems researchers in The Complex Alternative, published by SFI Press.

Reflection Footnotes

1 F. Brauer, C. Castillo-Chavez, and Z. Feng, 2019, “Introduction: A Prelude to Mathematical Epidemiology,” in Mathematical Models in Epidemiology. Texts in Applied Mathematics, Vol 69, New York City, NY: Springer, doi: 10.1007/978-1-4939-9828-9_1

2 M. Hochberg, “COVID-19 in the Information Commons,” 30000’ blog, March 1, 2021,

3 Attributed to von Neumann by Enrico Fermi, as quoted by Freeman Dyson in “A meeting with Enrico Fermi” in Nature 427 (22 January 2004), p. 297.

4 “What a useful thing a pocket-map is!” I remarked.

“That’s another thing we’ve learned from your Nation,” said Mein Herr, “map-making. But we’ve carried it much further than you. What do you consider the largest map that would be really useful?”

“About six inches to the mile,” I said.

“Only six inches!” exclaimed Mein Herr. “We very soon got to six yards to the mile. Then we tried a hundred yards to the mile. And then came the grandest idea of all! We actually made a map of the country, on the scale of a mile to the mile!”

“Have you used it much?” I enquired.

—Lewis Carroll, Sylvie and Bruno Concluded

5 Famously the basis of Thomas Piketty’s 2014 book, Capital in the Twenty-First Century.

6 This schema shows the procession from initial attempts to mitigate outbreaks, either to optimization if successful, or suppression if unsuccessful. Once through a cycle, subsequent strategies will depend on active case numbers (and the correlated impact on health systems). Low case numbers are more likely to err toward baseline physical distancing and self-isolating, whereas high numbers meet with more restrictive curfews and lockdowns. Definitions of “low” and “high” and how “mitigation” is demarcated from “suppression” are somewhat arbitrary. “Low” and “high” depend on how decision-makers decide on thresholds of action, whereas mitigation and suppression are the packages of measures considered sufficient to either slow or reverse growth. See M.E. Hochberg, 2020, “Importance of Suppression and Mitigation Measures in Managing COVID-19 Outbreaks,” medRxiv; M.E. Hochberg, 2020, “Countries Should Aim to Lower the Reproduction to 1.0 for the Short-Term Mitigation of COVID-19 Outbreaks,” medRxiv, doi: 10.1101/2020.04.14.20065268

7 Prioritizing the former, as so many countries have, sacrifices the latter, and as it turns out, also generates negative externalities on social welfare, psychology, and the economy.

8 G. Li, S. Shivam, et al., 2020, “Disease-Dependent Interaction Policies to Support Health and Economic Outcomes during the COVID-19 Epidemic,” available at SSRN, doi: 10.2139/ssrn.3709833

9 M.T. Sofonea, C. Boennec, et al., 2021, “Two Waves and a High Tide: The COVID-19 Epidemic in France,” Anaesthesia Critical Care & Pain Medicine 40:100881, doi: 10.1016/j.accpm.2021.100881