Program Overview
Many challenges in the world today – disease dynamics, collective and artificial intelligence, belief propagation, financial risk, national security, and ecological sustainability – exceed traditional academic disciplinary boundaries and demand a rigorous understanding of complexity. Complexity science aims to quantitatively describe and understand the adaptive, evolvable and thus hard-to-predict behaviors of complex systems. For more than 30 years, SFI's Complex Systems Summer School has provided early-career researchers with formal and rigorous training in complexity science and integrated them into a global research community. Through this transdisciplinary, highly collaborative experience, participants are equipped to address important questions in a range of topics and find patterns across diverse systems. This program took place June 11 – July 7, 2023 in Santa Fe, New Mexico, USA.
Group Projects
Jan Hurt, Complexity Science Hub Vienna
Ravi Ranjan, Helmholtz Institute of Functional Marine Biodiversity (HIFMB)
Helen Scott, Boston University
Lateral root development is an excellent model for studying plant organogenesis due to its well-defined stages and cellular processes. A central challenge in studying lateral root development is to understand how mechanisms at one level of biological scale (i.e., cell-level) interact to produce higher-level (i.e., tissue-level) phenomena. In the root, the plant maintains a supply of quiescent stem cells – stem cells that can be converted into actively dividing stem cells when needed. The stem cells undergo asymmetric cell division, resulting in a large semi-differentiated cell and a smaller stem cell that continues to grow and divide. Under gradients of plant hormones such as auxin and cytokinin, the semi-differentiated cells enter another phase known as endoreduplication, where they synthesize multiple copies of their genome without dividing. The interconversion of these cell types results in a gradient of cell types in the root tissue, from actively dividing stem cells at the growing tip to elongated, differentiated cells at the end. While these processes have been characterized biochemically, they have primarily been studied at the cellular level, and their interaction to generate root tissue has not been investigated. In this project, we plan to use agent-based models to understand how the cellular processes play out in a spatial setting to produce a well-organized tissue. Agent-based modeling (ABM) is a computational technique that can be used to model collections of individual biological cells and compute their interactions, which generate emergent tissue-level results. While ABMs of bacterial or animal cells have been developed, in plant science ABMs have predominantly operated at larger scales, where agents represent individual plants or plant building blocks (also known as metamers). Here we present the core of an agent-based model of growing plant cells, which must follow different physical rules than bacterial or animal cells.
Future work will develop this core further to incorporate various factors in lateral root formation, such as hormonal regulation and explicit spatial structure.
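A minimal sketch of such a cell-level core might look like the following. The state names, transition probabilities, and update rules are purely illustrative assumptions for this sketch, not measured values or the project's actual model:

```python
import random

# Simplified lateral-root cell lineage states (illustrative).
QUIESCENT, DIVIDING, ENDO = "quiescent", "dividing", "endoreduplicating"

class Cell:
    def __init__(self, state=QUIESCENT, ploidy=2):
        self.state = state
        self.ploidy = ploidy   # genome copies; doubles on endoreduplication

def step(cells, p_activate=0.1, p_divide=0.3, p_endo=0.2):
    """One update sweep: quiescent stem cells may activate; active stem
    cells divide asymmetrically (the stem cell persists and sheds a
    semi-differentiated daughter); semi-differentiated cells
    endoreduplicate, copying their genome without dividing."""
    daughters = []
    for c in cells:
        if c.state == QUIESCENT and random.random() < p_activate:
            c.state = DIVIDING
        elif c.state == DIVIDING and random.random() < p_divide:
            daughters.append(Cell(ENDO, c.ploidy))
        elif c.state == ENDO and random.random() < p_endo:
            c.ploidy *= 2
    cells.extend(daughters)

random.seed(0)
cells = [Cell() for _ in range(10)]
for _ in range(100):
    step(cells)
n_cells = len(cells)
max_ploidy = max(c.ploidy for c in cells)
```

Spatial structure and hormone gradients, as described above, would enter by making the transition probabilities depend on each cell's position.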
Lydia Reader, Washington University in St. Louis
Anja Janischewski, Chemnitz University of Technology
Pablo Geraldo, University of California, Los Angeles
Bernise Ang, Zeroth Labs
Carla Coburger, University of Bayreuth
Shabaz Sultan, Radboud University Nijmegen
Jana Lasser, Graz University of Technology
Sophisticated agent-based models (ABMs) are increasingly used to model complex dynamical systems. However, ABMs face strong criticism for inadequate validation and calibration practices. ABMs tend to have many free parameters that need to be set. A standard approach to calibrating the parameters of an agent-based model is to compare the moments of the distribution of some outcome of interest, as simulated by the model, to the moments observed in the real world. However, this approach does not guarantee that the simulation behaves like the real world at the micro level, as a number of different micro-behaviours might result in the same macro outcome. A way to improve on this situation is to use micro-level observations to calibrate the micro-parameters of the system, such as individual agent attributes [1]. In addition, there might be important mechanisms that are ignored in the simulation to reduce complexity. The choice of which mechanisms to model and which to ignore can be arbitrary, threatening the external validity of the model. Multi-agent inverse reinforcement learning [2] can provide a principled way to select mechanisms for a model from a pool of probable mechanisms given data.
In our project we set out to learn about these methods for calibrating or discovering agent-based models given observations from a system of interest. We discovered that there is a substantial gap between the approaches used in research applications and the available learning resources. Therefore, alongside learning about these approaches, the main aim of the project became compiling learning resources on modern ABM approaches for students and teachers alike. We provide a growing collection of literature on recent attempts to address challenges with ABMs [3]. We created a simple model (“A Bee Model”) with one free parameter that can be used as a benchmark for testing different approaches [4]. We provide a translation of the model into a differentiable form and a mechanism to “learn” the model’s parameter from data [4]. Finally, we started to compile a list of materials for learning ABMs from data [4].
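As a toy illustration of the moment-matching idea (a stand-in surrogate with one free parameter, not the Bee Model itself), one can calibrate a parameter by matching the simulated mean of an outcome to the observed mean:

```python
import numpy as np

def simulate(theta, n=1000, rng=None):
    """Toy ABM surrogate: each agent's outcome is theta plus unit noise."""
    if rng is None:
        rng = np.random.default_rng(0)   # fixed seed -> deterministic surrogate
    return theta + rng.normal(0.0, 1.0, size=n)

def calibrate(observed, thetas):
    """Moment matching: choose the theta whose simulated mean is
    closest to the observed mean."""
    target = observed.mean()
    losses = [abs(simulate(t).mean() - target) for t in thetas]
    return float(thetas[int(np.argmin(losses))])

# Pretend these are real-world micro-level observations.
observed = simulate(2.5, rng=np.random.default_rng(42))
best = calibrate(observed, np.linspace(0.0, 5.0, 51))   # recovers ~2.5
```

In a differentiable formulation, the grid search over `thetas` would be replaced by gradient descent on the loss.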
References
[1] Monti et al., On learning agent-based models from data, Scientific Reports (2023).
[2] Bergerson et al., Multi-Agent Inverse Reinforcement Learning: Suboptimal Demonstrations and Alternative Solution Concepts, arXiv (2021).
[3] Zotero library “ABM from data”: https://www.zotero.org/groups/5097156/abm_from_data
[4] GitHub repository with growing collection of learning resources: https://github.com/lydiareader/abm-from-data
Jana Lasser, Graz University of Technology
Julian Manieson, Baillie Gifford
Guram Mikaberidze, University of California, Davis
Urvish Parikh, Nirvana Health
Asael (Ace) Sorensen, Sandia National Laboratories
Alexius Wadell, Carnegie Mellon University
Mathieu Baltussen, Radboud University (NL)
Werewolf requires players to originate and execute elaborate coordination and deception strategies in a dynamic context. In our view, evaluating LLMs' performance in such strategic games could provide a novel benchmark for assessing intelligence and the capacity for coordination, which has been largely overlooked in the prevailing literature.
We constructed a framework that allows AI vs AI and AI vs human games under the eye of a moderator. The AI agents were given prompts adhering to Microsoft’s guidance framework, enabling chain-of-thought prompting at every stage of the game. We experimented with various prompts and chains of thought, but the results at this juncture remain inconclusive as to the optimal prompt layout.
A significant improvement (p = 0.034) relative to random guessing was observed in the villagers' performance with the use of GPT-3.5-turbo. This is evidence of coordination between agents, as the villager role benefits when the 'Seer' and 'Doctor' characters collaborate. However, a thorough analysis of the conversations and voting behaviour among players is necessary for a comprehensive understanding of the underlying drivers. Despite some open questions, this assessment of villager performance suggests that inter-agent cooperation is possible with current technology, and it provides a robust platform for evaluating various large language models going forward.
Joris Bücker, University of Oxford
Damla Akoluk, Delft University of Technology
Indoor bouldering is a climbing sport in which route setters design the routes (or "problems") that climbers attempt. Bouldering route setting requires creativity as well as a deep understanding of the physiological feasibility of a climbing route. This paper proposes a novel methodology for assessing route creativity through analysis of the hold positions of climbing routes in a corpus of over 30,000 routes. For each climbing problem, we extract the most likely route and divide it into triangular shapes that mimic body positions on the wall. We create a network of these triangles in which triangles that often follow each other sequentially in bouldering routes are more strongly connected.
We find that route setters employ larger and more complex shapes for bouldering routes that have a higher difficulty level. However, we find that the most central shapes by eigenvector centrality are remarkably consistent across grades, pointing to uniformity underlying different creative styles of route setting. Similarly, we find a large variety of triangles that are used by individual route setters. However, using the much more regular triangles that are most central, we propose a methodology that can identify styles of route setting and assign different route setters to these styles.
Our work touches on deeper questions about the meaning of creativity and how creativity can be quantified. Other sports, such as golf, have similar pressures for creativity, and we hypothesize that our techniques can be generalized more widely.
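The centrality ranking described above can be sketched with a small example. The transition counts between four hypothetical triangle shapes below are invented for illustration; eigenvector centrality is computed by power iteration:

```python
import numpy as np

# Hypothetical transition counts between four triangle "shapes":
# entry (i, j) counts how often shape j follows shape i in a route.
A = np.array([[0, 5, 1, 0],
              [2, 0, 6, 1],
              [1, 3, 0, 4],
              [0, 1, 2, 0]], dtype=float)

def eigenvector_centrality(adj, iters=1000, tol=1e-12):
    """Eigenvector centrality via power iteration on the symmetrised
    adjacency matrix (co-occurrence treated as undirected)."""
    M = adj + adj.T
    x = np.ones(M.shape[0])
    for _ in range(iters):
        x_new = M @ x
        x_new /= np.linalg.norm(x_new)
        if np.linalg.norm(x_new - x) < tol:
            break
        x = x_new
    return x_new

centrality = eigenvector_centrality(A)
most_central = int(np.argmax(centrality))   # the most "canonical" shape
```

In the project, the nodes would be the triangle shapes extracted from the 30,000-route corpus, and the same ranking would identify the shapes shared across difficulty grades.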
Cheyenne Jarman, Oregon State University
Golnar Gharooni Fard, University of Colorado Boulder
Virginia Domínguez-García, Estación Biológica de Doñana - CSIC
F. N. (Faith) Masibili, Old Dominion University
Yuanmo He, London School of Economics and Political Science
Joris Bücker, University of Oxford
Annie Stephenson, Princeton University
We introduce the use of image compression to analyze the complexity of an artist’s discography over time. The ratio of a compressed file’s size to the size of the raw file is a proxy for the entropy in the file. We represent a song as an image using chromographs, which bin the frequencies in the song into the twelve pitches of the Western musical scale and show the intensity of each pitch as a function of time. We chose to focus on one well-known musical group, The Beatles, as a case study for proof of concept. While the mean entropy across songs in each album did not vary significantly over time, the standard deviation of the entropy increased over time. We apply the same compression algorithm to quantify the similarity between songs. Within-album song similarity shows a similar increase in standard deviation for later albums. The results do not indicate that songs released in close succession are more similar. In addition to this metric, we use Diffusion Entropy Analysis [Nardelli et al. 2022] to measure the complexity of pitch over time within a song. However, since this method does not include percussion, the results differ from those measured on the chromographs. We also measure the complexity of lyrics. We first use established text-complexity measures of lexical diversity and readability. Then we use the average Euclidean distance or cosine similarity of all the words in a song as a novel measure of the semantic complexity of the lyrics. The different measures correlate with each other. The mean text complexity, especially measured as the average semantic distance between words, appears to increase over time.
Buongiorno Nardelli, M., Culbreth, G., & Fuentes, M. (2022). Towards a measure of harmonic complexity in Western classical music. Advances in Complex Systems, 25(05n06), 2240008. https://doi.org/10.1142/S0219525922400082
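The compression-ratio entropy proxy used above can be illustrated with a toy example; the synthetic arrays below stand in for real chromographs:

```python
import zlib
import numpy as np

def compression_ratio(arr):
    """Compressed-to-raw byte ratio: a crude proxy for the entropy of
    the array's contents (lower ratio = more ordered)."""
    raw = arr.astype(np.uint8).tobytes()
    return len(zlib.compress(raw, level=9)) / len(raw)

rng = np.random.default_rng(0)
T = 2000  # time frames; 12 rows stand in for the twelve pitch classes
repetitive = np.tile(rng.integers(0, 256, size=(12, 10)), (1, T // 10))
noisy = rng.integers(0, 256, size=(12, T))

r_rep = compression_ratio(repetitive)    # highly ordered: small ratio
r_noise = compression_ratio(noisy)       # near-random: ratio close to 1
```

Concatenating two songs' chromographs and comparing the joint compressed size with the individual compressed sizes yields the compression-based similarity measure mentioned above.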
Omar Aguilar, University of California, Santa Cruz
Vidyesh Rao Anisetti, Syracuse University
Golnar Gharooni, University of Colorado Boulder
Aditi Kathpalia, Institute of Computer Science
Outline for Future Work
Much of natural science is dedicated to discovering the mechanisms or models underlying observed phenomena. Computational mechanics is a recent interdisciplinary field that brings together ideas from natural science, information theory, and theoretical computer science for this purpose. It provides an approach for determining the minimal abstract machines underlying information processing in natural systems and for describing the structure and complexity of a process based on the machine. These abstract machines are called ε-machines and have been shown to serve as the smallest maximally predictive models of a given process. Despite its great promise and potential, computational mechanics is currently not very practical. One reason is that it can reconstruct an ε-machine only for systems consisting of an ensemble from a single variable. Further, the reconstruction of these machines requires the identification of ‘causal states’, which are based on estimates of conditional probabilities and therefore require large amounts of data. In this work, we investigate the potential of incorporating concepts from causal discovery in time-series analysis to address the limitations of computational mechanics. By illustrating these limitations with an example, we propose how causal discovery methods such as Transfer Entropy and Compression-Complexity Causality can be employed to overcome the challenges discussed.
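As an illustration of one of the causal discovery tools mentioned above, a plug-in estimate of transfer entropy for binary time series with history length 1 (a textbook construction, not the project's code) can be written as:

```python
import numpy as np
from collections import Counter

def transfer_entropy(x, y):
    """Plug-in transfer entropy TE(X -> Y) in bits, history length 1:
    TE = sum p(y', y, x) * log2[ p(y'|y, x) / p(y'|y) ]."""
    triples = Counter(zip(y[1:], y[:-1], x[:-1]))   # (y', y, x) counts
    pairs = Counter(zip(y[1:], y[:-1]))             # (y', y) counts
    ctx2 = Counter(zip(y[:-1], x[:-1]))             # (y, x) context counts
    ctx1 = Counter(y[:-1].tolist())                 # y context counts
    n = len(y) - 1
    te = 0.0
    for (y1, y0, x0), c in triples.items():
        p = c / n
        te += p * np.log2((c / ctx2[(y0, x0)]) / (pairs[(y1, y0)] / ctx1[y0]))
    return te

rng = np.random.default_rng(1)
x = rng.integers(0, 2, 5000)
y = np.roll(x, 1)                 # y copies x with a one-step lag
te_xy = transfer_entropy(x, y)    # close to 1 bit: x drives y
te_yx = transfer_entropy(y, x)    # close to 0: no influence back
```

The data-hunger noted above is visible here: the plug-in estimate needs enough samples to populate every (y', y, x) cell reliably.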
Aanjaneya Kumar, Indian Institute of Science Education and Research (IISER)
Henry Secaira, Arizona State University
Kate Bubar, University of Colorado Boulder
Max Olivier, The MITRE Corporation
Samuel Ropert, Universidad San Sebastian
Languages harbor an astonishing diversity of pronunciations and variants of words. This richness can eventually lead languages to diverge into dialects, such as the regional variants of Spanish. Conversely, convergence of languages is also possible. These dynamics may depend on the connectedness of populations. The broad goal of this project is to investigate the evolution of language and whether and how different dialects might split or converge over time depending on the network structure that connects agents in a population. As a starting point, we considered the Utterance Model of language change.
The model looks at how the distribution of variations on a specific utterance evolves over time. The utterance can be a word, a phrase, or even an entire dialect. In the model, individuals have two-person “conversations” with each other in which they may use several variants of the word, each at different frequencies. After the conversation, each individual updates their distribution of word usage, increasing (or decreasing) the probabilities of the variants used during the conversation, also taking into account how frequently their conversation partner used each variant of the word. As such, the evolution of the word distributions in the population is a direct result of pairwise conversations. Because we wanted to allow for network structures other than a well-mixed population, we implemented an ABM version of the model. Starting with all agents having random distributions over the frequency variants of a word, we considered how those distributions evolve over time depending on the network structure (small world, complete, cycle, tree), the number of agents, and the weights given to one's own versus others' word usage. We specifically look at whether the distributions converge to a state where each agent uses the same word variant with high frequency, and how long that convergence takes. Going forward, we want to consider further parameter modifications such as larger vocabulary vectors, heterogeneous values for the self/other weightings that determine the distribution update rules, and more clustered/homophilic network structures.
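A minimal well-mixed sketch of these pairwise-conversation dynamics (illustrative parameter values; not the project's full networked implementation) is:

```python
import numpy as np

def run_utterance_model(n_agents=20, n_variants=3, steps=20000,
                        self_weight=0.9, utterances=10, seed=0):
    """Well-mixed sketch: two random agents 'converse'; each samples a
    handful of utterances from its own variant distribution, and both
    nudge their distributions toward the partner's observed usage."""
    rng = np.random.default_rng(seed)
    # each row: one agent's probability distribution over word variants
    P = rng.dirichlet(np.ones(n_variants), size=n_agents)
    for _ in range(steps):
        i, j = rng.choice(n_agents, size=2, replace=False)
        u_i = rng.multinomial(utterances, P[i]) / utterances
        u_j = rng.multinomial(utterances, P[j]) / utterances
        P[i] = self_weight * P[i] + (1 - self_weight) * u_j
        P[j] = self_weight * P[j] + (1 - self_weight) * u_i
    return P

P = run_utterance_model()
spread = P.std(axis=0).max()   # dispersion of agents' distributions
```

On a network, the pair (i, j) would instead be drawn from the edge list; the spread statistic then measures whether the population has converged on a shared variant distribution.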
Jan Hurt, Complexity Science Hub Vienna
Munjung Kim, Indiana University Bloomington
Henry Secaira, Arizona State University
There is substantial anecdotal and qualitative evidence suggesting a cap on the amount of information that societies can process. For example, a major event or topic, such as the COVID-19 pandemic that emerged in March 2020, dominates public discourse and eclipses other significant issues like climate change. This hints at a finite "bandwidth" of public attention, where the surfacing of a novel issue decreases attention to pre-existing subjects. Similarly, in the scientific domain, the expansion of a scientific field and the subsequent increase in topics can overwhelm researchers, fostering the rise of new sub-disciplines. These sub-disciplines effectively filter information and enable researchers to manage the influx of new publications. In our study, we used the metadata from 1.7 million arXiv articles to calculate a probabilistic embedding of each abstract. Each abstract is represented as a probability distribution over latent topics. The entropy rate was determined by computing the Shannon entropy of the cumulative distribution of all documents published within a year in each sub-discipline. Plotting the entropy rate against the number of papers published in each field over time (see Figure) reveals no direct correlation between the entropy rate and the number of preprints published in the respective timeframe. Furthermore, contrary to our initial hypothesis, the entropy appears to decrease for some sub-disciplines. For future research, we intend to extend this methodology to other datasets (APS, patents, Reddit, etc.) and examine the correlations between the entropy rate and other indicators, such as the number of unique authors.
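The per-year entropy computation can be sketched as follows, with invented document-topic distributions standing in for the arXiv embeddings:

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy (bits) of a probability distribution."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def yearly_entropy(doc_topic, years):
    """Entropy of the mean topic distribution of all documents
    published in each year."""
    years = np.asarray(years)
    return {y: shannon_entropy(doc_topic[years == y].mean(axis=0))
            for y in sorted(set(years.tolist()))}

# Illustrative data: six documents over three topics in two years.
doc_topic = np.array([[0.8, 0.1, 0.1], [0.7, 0.2, 0.1], [0.6, 0.3, 0.1],
                      [0.3, 0.4, 0.3], [0.2, 0.4, 0.4], [0.3, 0.3, 0.4]])
years = [2020, 2020, 2020, 2021, 2021, 2021]
h = yearly_entropy(doc_topic, years)   # higher in 2021: topics more even
```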
Bernise Ang, Zeroth Labs
Joris Bücker, University of Oxford
Virginia Domínguez-García, Estación Biológica de Doñana - CSIC
Anna Jon-And, Stockholm University
Ravi Ranjan, Helmholtz Institute of Functional Marine Biodiversity (HIFMB)
Languages are complex systems. Each language is built up of a set of structural traits that interact with each other, forming the grammar of the language. A grammatical trait can describe, for example, word order, tense marking on verbs, or plural marking on nouns. Relationships between grammatical traits have been studied mainly between pairs of traits and usually based on limited sets of languages. In the study of mechanisms of language change, linguists have predominantly focused either on universal cultural selective pressures, such as learnability, expressivity, and ease of production/perception, or on extralinguistic conditions such as speaker population size and the proportion of second-language speakers. However, factors intrinsic to a language, such as its present grammatical structures, can regulate language change, making some changes easier than others. Since grammatical structures vary widely across languages, their influence on future grammatical change tends to be overlooked. The aim of this project is to study path-dependency in linguistic systems by analysing relationships between linguistic traits at a global systems level.
We use Grambank, a database of 195 traits for 2,467 languages, to establish a network of traits and languages. We find that quantitative trade-offs seem to play an important role, both in the total number of traits in a language and in the distribution of traits between the verbal and the nominal domains (see Figure 1, left and right respectively). Qualitative or functional trade-offs seem to have a more limited influence. We also find that hierarchy between traits seems less important than usually assumed.
We plan to carry out a series of further analyses, especially looking into specific trait groups that tend to co-occur and the role of phylogenetics in language change. When complete, our findings will provide a holistic picture of the path-dependencies in language at an unprecedented global scale.
Alexius Wadell, Carnegie Mellon University
F. N. (Faith) Masibili, Old Dominion University
Hanna Isaksson, Umeå University
Katarzyna (Kasia) Goch, Institute of Geography and Spatial Organization
Lydia Reader, Washington University in St. Louis
Tingting Ji, The Hong Kong Polytechnic University
Summary of Project Findings:
Urban systems are dynamic, complex, and self-adapting human-modified environments exhibiting emergent properties: nonlinear dynamics, feedback loops, high interconnectivity, and unpredictability. At the same time, they are the result of hierarchical, top-down planning processes, specified in local development plans that define spatial development for a given area in detail. In our study, we aim to understand how planning-constrained location decisions of individual humans and their interactions shape urban land-use patterns.
We specify the aim of the study with two objectives. First, we investigate how citizens' subjective well-being is influenced by the trade-offs between top-down urban planning policies, specified as built-up suitability factors of planning zones, and bottom-up citizen decisions driven by travel time to points of interest (POIs) and preferred built-up density. Second, we explore how urbanization patterns, interpreted as morphological properties of urban structures, are influenced by, or sensitive to, the micro-level interactions of individual citizens affecting travel time and built-up density. We build an agent-based model in which citizen agents decide where to build houses in an artificial urban environment.
Agents' decision-making on home locations is modeled using utility functions that consider land-use zoning constraints and citizens' perceptions of travel congestion and neighborhood overcrowding as more agents move into the city over time. The results of our study will help us gain a better understanding of the underlying baseline processes of urban growth.
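A sketch of such a utility function, with hypothetical weights and inputs normalised to [0, 1], could look like this:

```python
import numpy as np

def location_utility(suitability, travel_time, density,
                     pref_density=0.5, weights=(0.4, 0.3, 0.3)):
    """Illustrative utility of candidate cells for a citizen agent:
    zoning suitability (top-down) rewarded, travel time and deviation
    from the preferred built-up density (bottom-up) penalised."""
    w_zone, w_travel, w_dens = weights
    return (w_zone * np.asarray(suitability)
            - w_travel * np.asarray(travel_time)
            - w_dens * np.abs(np.asarray(density) - pref_density))

# Pick the best of three candidate cells on a toy strip of land.
suit = [0.9, 0.6, 0.8]   # planning-zone suitability
tt   = [0.7, 0.2, 0.4]   # normalised travel time to POIs
dens = [0.9, 0.5, 0.3]   # current built-up density
best = int(np.argmax(location_utility(suit, tt, dens)))
```

In the full model, `travel_time` and `density` would themselves change as agents settle, producing the feedback loops described above.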
Urvish Parikh, Nirvana Health
Carlos Calvo-Hernandez, RAND Corporation
Jean-Luc Kortenaar, Dalla Lana School of Public Health, University of Toronto
Damla Akoluk, Delft University of Technology
Climate misinformation can hinder public support for climate action and impede climate policy. Misinformation can spread through formal and informal social networks. We aim to understand how misinformation – the spreading of false or inaccurate information – can affect individuals' beliefs and their relationships with others in a network.
We simulated an agent belief network in which agents' beliefs, and their relative proximity, are affected by the content they consume. We initialized 12 agents with news article headlines from a corpus of climate change news articles (N = 10,509, from 2014 to 2023) drawn from sources across the political spectrum, including The Washington Post, BBC, and Fox News. Nine of the agents received climate change news articles that were scientifically accurate, while the remaining three were given articles containing misinformation. Article headlines were embedded using MiniLM-L6-v2 (384 dimensions) and represented the initial beliefs of the agents. The network topology was determined using a Watts–Strogatz random graph model. An additional 2,000 reputable news article headlines were subsequently added to the network one by one; each article was either accepted or rejected by each agent based on its similarity to the agent's current belief embedding. The consumption of new media had an impact on both the agents themselves and the distances between them. The agents' beliefs were updated as a result of this consumption, and their beliefs were then diffused to all neighboring agents. Over the 2,000 iterations, we tracked eigenvector centrality, a measure of relative node importance. The node with the highest eigenvector centrality in the network changed during the experiment.
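The accept-or-reject step can be sketched as follows (illustrative threshold and learning rate; real article embeddings would replace the random vectors):

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def consume(belief, article, threshold=0.5, lr=0.1):
    """Accept an article embedding only if it is similar enough to the
    agent's current belief; on acceptance, nudge the belief toward it."""
    if cosine(belief, article) >= threshold:
        return belief + lr * (article - belief)
    return belief

rng = np.random.default_rng(0)
belief = rng.normal(size=8)                  # stand-in for a 384-d embedding
aligned = belief + 0.1 * rng.normal(size=8)  # headline close to current belief
opposed = -belief                            # headline diametrically opposed

b1 = consume(belief, aligned)   # accepted: belief shifts
b2 = consume(belief, opposed)   # rejected: belief unchanged
```

This acceptance rule is what lets agents initialized on misinformation filter out corrective content whose embeddings lie far from their beliefs.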
Yannick Oswald, University of Leeds
Anja Janischewski, Chemnitz University of Technology
Aditi Kathpalia, Czech Academy of Sciences
Munjung (MJ) Kim, Indiana University Bloomington
Aanjaneya Kumar, Indian Institute of Science Education and Research (IISER)
Omar Aguilar, University of California, Santa Cruz
The Collatz conjecture, one of the most notorious problems in number theory, posits that any positive integer eventually reaches 1 when iteratively subjected to a specific pair of operations: even numbers are halved, and odd numbers n are mapped to 3n + 1. Despite verification for all integers up to 2^68, a formal proof, or disproof, remains out of reach.
Understanding the mechanisms that determine total stopping times – the number of iterations a starting value takes to reach 1 – could offer insights into this conjecture and its underlying mathematical structure. The total stopping times of certain starting integers, for example the powers of two, are obvious: because all their prime factors equal two, and any even number is divided by two, the orbit simply decays to 1 in n steps, where n is the exponent. This led us to hypothesize that, given enough data about a positive integer, the seemingly chaotic trajectories of most integers – and specifically their lengths – might be predictable from features such as their prime factorization, before any computation. We tested hierarchical clustering approaches and autoencoders, a type of neural network, but have so far found no significant relationship between the starting points and the orbit length. For future work, we suggest (i) systematically expanding data collection on orbit features, as we have worked with a rather limited set of features so far, and (ii) working with trajectory data from a 2-adic and 3-adic perspective, which might capture information about the “closeness”, and hence trajectory length, of distinct integers that goes undetected in the decimal numeral system.
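The total stopping time itself is straightforward to compute:

```python
def total_stopping_time(n):
    """Number of Collatz steps for n to reach 1."""
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps

# A power of two decays straight down: 2**10 takes exactly 10 steps,
# while a nearby small odd number such as 27 wanders for 111 steps.
t_pow2 = total_stopping_time(2 ** 10)
t27 = total_stopping_time(27)
```

The contrast between these two cases is exactly the unpredictability the project tried, so far unsuccessfully, to explain from pre-computed features of the starting integer.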
Jean-Luc Kortenaar, Dalla Lana School of Public Health, University of Toronto
Aanjaneya Kumar, Indian Institute of Science Education and Research (IISER)
Julian A. Manieson, Baillie Gifford
Asael (Ace) Sorensen, Sandia National Laboratories
Max Olivier, The MITRE Corporation
Daniel Torren Peraire, Autonomous University of Barcelona
Meritocracy is the idea that people are selected for positions based on their merits and abilities rather than their inherited status. Researchers have proposed many ways in which inherited advantages might persist in a meritocracy, primarily through investment in education, connections, and resources. Through these mechanisms, meritocracy might perpetuate existing inequalities by propagating the advantages of those who are already privileged while giving the illusion of a fair society. To investigate the impact of a meritocratic system (where those with the highest ability accumulate the most wealth) on inequality, we built two separate models. First, an agent-based model examined the effect of assortative matching in a meritocracy on wealth inequality; this model produced a bifurcation of the population by wealth, with the degree of growing economic inequality limited only by the fixed stock of total wealth. Second, an evolutionary game theory model studied how the composition of employees at a workplace evolves. By interpreting the payoffs of a given type of individual (considering four types: high or low ability crossed with high or low resources) as their performance (or fitness), we use replicator dynamics to study the evolution of the composition of the working population. While our model suggests that meritocratic selection favours individuals with abundant resources, the general framework allows us to probe questions about social mobility and possible interventions. Our models demonstrate that meritocracy can lead to inequality and an entrenched aristocracy. Future research could explore wider parameter ranges and investigate which policies (e.g., a wealth or inheritance tax) might lessen inequality, how immigration affects the wealth distribution, and what the macro-level outputs of the system are.
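The replicator dynamics over the four types can be sketched with fixed, hypothetical payoffs (the project's payoffs may be frequency-dependent; these numbers are invented for illustration):

```python
import numpy as np

def replicator_step(x, payoffs, dt=0.01):
    """One Euler step of replicator dynamics with fixed (frequency-
    independent) payoffs: dx_i/dt = x_i * (f_i - mean fitness)."""
    mean_fit = x @ payoffs
    return x + dt * x * (payoffs - mean_fit)

# Hypothetical performance of the four types:
# (high/low ability) crossed with (high/low resources).
payoffs = np.array([1.0, 0.7, 0.6, 0.3])   # HH, HL, LH, LL
x = np.full(4, 0.25)                        # start from equal shares
for _ in range(5000):
    x = replicator_step(x, payoffs)
# the highest-payoff type (high ability, high resources) takes over
```

With these payoffs, the fittest type fixates, mirroring the entrenchment result: under meritocratic selection, high-resource types crowd out the rest.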
Hanna Isaksson, Umeå University
Anthony Ogbesor, Morehouse College
Shabaz Sultan, Radboud University Nijmegen
Among large multicellular eukaryotes, such as plants and animals, sexual reproduction is a widespread reproductive strategy, while it is less common among smaller and simpler life forms. This observation leads us to believe that there is a connection between the ability to reproduce sexually and the evolution of large and complex organisms.
However, the evolutionary trajectories that link these two characteristic traits are not clear. Although sexual multicellularity comes with various costs, there are also significant benefits. Common arguments for the evolution of sexual reproduction include an enhanced adaptation rate and the ability to more efficiently suppress deleterious mutations.
Moreover, placing sexual reproduction in a multicellular context enhances the ability to evolve complexity. This includes opportunities for cells to divide tasks, such as motility, nutrient acquisition, and reproduction. In a multicellular setting, cells also face coordination challenges, meaning that cells in a group must agree on who is doing what.
Taken together, this raises an interesting question: How did sexual reproduction evolve in novel multicellular life cycles, and how was it coordinated?
In our study, we investigate the differences between unicellular and simple multicellular life cycles with clonal or sexual reproduction and evaluate costs and benefits in terms of fitness and adaptation. We are particularly interested in how sexual reproduction in multicellular group settings can improve the suppression of deleterious mutations that negatively affect population growth.
We study these differences among sexual and asexual reproduction in unicellular and multicellular life cycles using differential equations and agent-based modeling. Our goal is to identify conditions that are determining factors for the evolution of simple sexually reproducing multicellular life cycles.
František (FranČesko) Kalvas, University of West Bohemia
Guram (Guga) Mikaberidze, University of California, Davis
In this project, we explore the evident and worrying polarization of public opinion. We extend a cusp-catastrophe model of opinion dynamics by Han van der Maas to produce a more parsimonious mathematical model with intuitive dynamical rules and straightforward ways of testing its predictions.
The cusp-catastrophe model examines the interconnected network of an individual's beliefs, feelings, and habits, which together shape their overall opinion on a specific topic. In situations of heightened attention, cognitive dissonance emerges as a force aligning these nodes, whereas lower attention levels render misalignment less significant. The model simplifies to the familiar Ising model, with magnetization replaced by opinion, the external field by information bias, and temperature, counterintuitively, by inverse attention. This yields a cusp catastrophe of opinions, effectively explaining opinion radicalization when attention is high.
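For reference, the standard cusp normal form (our notation; identifying the control parameters with information bias and attention follows the interpretation above) writes the opinion $x$ as a minimiser of a potential with two control parameters:

```latex
V(x) = \tfrac{1}{4}x^{4} - \tfrac{1}{2}\,b\,x^{2} - a\,x,
\qquad
\frac{\partial V}{\partial x} = x^{3} - b\,x - a = 0,
```

where $a$ plays the role of the information bias (external field) and $b$ of attention (inverse temperature). For $b > 0$, the cubic equilibrium condition admits up to three roots, producing the bistability and hysteresis characteristic of radicalized opinions.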
We propose a directly testable, parsimonious cusp-catastrophe model of opinions. This is achieved by a 45-degree rotation of the horizontal axes, which are subsequently interpreted as measures of positive and negative information.
Experiments conducted on ChatGPT yield mixed results, validating the path dependence of opinions as predicted by the cusp-catastrophe model, yet the anticipated bistability is not observed.
Our implementation of the dynamics for interacting agents enables analysis of the public polarization as a function of several important parameters. We replicate various public opinion phenomena under the additional assumption of the latitude of acceptance from the Hegselmann-Krause model.
Helen Scott, Boston University
Vidyesh Rao Anisetti, Syracuse University
Estelle Janin, Arizona State University
The origins of life remain one of the most intriguing areas of scientific inquiry, with many open questions yet to be fully answered [1]. A key aspect of this exploration involves understanding how prebiotic conditions gave rise to organic molecules, a process that is fundamental to the emergence of life [2]. This can be achieved by exploring the vast and complex chemical space that encompasses all possible metabolic reactions.
Our approach to this exploration commences with the construction of a comprehensive universe encapsulating all metabolic reactions, which is depicted as a global network. Within this expansive network, we delve into the metabolic reaction networks of individual organisms. Leveraging the data from these individual metabolic reactions, we assign a score to each reaction in the global network. This score is determined based on the probability of the reaction's occurrence, providing a quantifiable measure to assess the likelihood of each metabolic reaction within the global network. This allows us to identify possible modules [3], which are subnetworks that occur with high probability. We are also interested in analyzing the distribution of these probabilities and the Laplacian spectrum of the metabolic network.
In addition, we plan to integrate the metabolic networks of Archaea and Bacteria to look for their common modules. These common modules could potentially correspond to the Last Universal Common Ancestor (LUCA), providing valuable insights into the origins of life.
To further investigate the partial ordering of network modules through evolutionary time – i.e., which modules emerged before others – we use a newly developed measure of complexity called Assembly Theory [4]. This method captures signatures of selection and evolution by estimating the minimum number of steps required to build an object based on patterns, symmetries and recurring structures. The larger this number (called the Assembly Index), the more evolutionary time and selection pressure have been required to bring the object into existence, and hence the later it emerged.
Altogether, this allows us to place powerful constraints on the relative emergence of different metabolic abilities and agnostically approach the problem of the origin of life.
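The scoring step described above can be sketched with a toy example (hypothetical reaction and metabolite names, not real metabolic data): each reaction in the global network is scored by the fraction of organisms whose network contains it, and candidate modules are connected sets of high-scoring reactions.

```python
# Toy sketch of reaction scoring and module extraction (illustrative
# data only). Two reactions are linked when they share a metabolite.

GLOBAL_NET = {  # reaction -> metabolites it touches
    "R1": {"A", "B"}, "R2": {"B", "C"}, "R3": {"C", "D"}, "R4": {"X", "Y"},
}
ORGANISMS = [{"R1", "R2"}, {"R1", "R2", "R3"}, {"R2", "R4"}]

# Score = fraction of organisms whose metabolic network contains the reaction.
scores = {r: sum(r in org for org in ORGANISMS) / len(ORGANISMS)
          for r in GLOBAL_NET}

def high_probability_modules(threshold):
    """Connected components of reactions with score >= threshold."""
    keep = [r for r, s in scores.items() if s >= threshold]
    modules, seen = [], set()
    for r in keep:
        if r in seen:
            continue
        stack, module = [r], set()
        while stack:
            cur = stack.pop()
            if cur in module:
                continue
            module.add(cur)
            stack.extend(o for o in keep
                         if o not in module and GLOBAL_NET[o] & GLOBAL_NET[cur])
        seen |= module
        modules.append(module)
    return modules

print(scores)                       # R2 occurs in every organism
print(high_probability_modules(0.5))
```

On the toy data, R1 and R2 exceed the threshold and share metabolite B, so they form a single high-probability module.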
Kate Bubar, University of Colorado Boulder
Sam Carter, George Mason University
Pablo Geraldo, University of California, Los Angeles
Marilena Hohmann, University of Copenhagen
Julian Manieson, Baillie Gifford
Samuel Ropert, Universidad San Sebastian
In recent years, the algorithmically-driven exposure to information on digital platforms has led to growing concerns about polarization in digital communities. While there is extensive research measuring people's opinions and their level of political sorting in online social networks, there has been comparatively less emphasis on understanding people's perception of overall polarization. Consequently, our study aims to address the question: How can we model perceived polarization in a social network?
To build our model, we identify two factors that might contribute to distorting perceptions of the level of polarization in a social network: (1) loudness, which refers to how strongly individuals express their opinions, and (2) incomplete information, indicating that each individual only has immediate access to a limited portion of the network.
We consider different approaches to modeling loudness. One interpretation takes loudness as a local property: each neighbor's opinion is multiplied by a loudness factor, so one loud node can be equivalent to multiple quiet neighbors, and each node only registers the loudness of its immediate neighbors. A second interpretation refers to the volume at which nodes express their opinion: louder nodes can reach more distant connections, with the perceived volume of a node decreasing as a function of network distance. To account for the interplay between a node's loudness and its attenuation with distance, we will calculate an effective degree of separation as the product of loudness and the inverse degree of separation. In terms of incomplete information, we consider several possibilities: in the simplest model, we will assume that nodes have perfect access to the opinions of their connections; future iterations will include biased perceptions of others' opinions.
The crucial point is that each focal node observes only a partial and weighted version of the entire network, potentially deviating from the actual polarization observed in the full graph. We plan to test our model in experiments with synthetic data and real-world social media data.
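A minimal sketch of loudness combined with distance attenuation follows; the weighting form loudness/(1 + distance) and the toy network are illustrative assumptions, not the project's final specification.

```python
# Toy sketch of perceived polarization: each focal node sees others'
# opinions weighted by their loudness and attenuated by network
# distance, so a loud node can dominate the local picture of the debate.
from collections import deque

GRAPH = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}   # small toy network
OPINION = {0: -1.0, 1: 1.0, 2: -0.2, 3: 0.9}     # opinions in [-1, 1]
LOUDNESS = {0: 1.0, 1: 3.0, 2: 1.0, 3: 1.0}      # node 1 is loud

def distances(src):
    """Breadth-first network distances from src."""
    d, queue = {src: 0}, deque([src])
    while queue:
        u = queue.popleft()
        for v in GRAPH[u]:
            if v not in d:
                d[v] = d[u] + 1
                queue.append(v)
    return d

def perceived_mean(focal):
    """Loudness-weighted, distance-attenuated mean opinion seen by focal."""
    d = distances(focal)
    weights = {j: LOUDNESS[j] / (1 + d[j]) for j in d if j != focal}
    return (sum(weights[j] * OPINION[j] for j in weights)
            / sum(weights.values()))

true_mean = sum(OPINION.values()) / len(OPINION)
print(true_mean, perceived_mean(2))  # the loud node skews node 2's view
```

Even in this four-node example, the loud node pulls the focal node's perceived mean opinion well away from the network's actual mean, which is the distortion the model is built to study.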
Damla Akoluk, Delft University of Technology
Carlos Calvo-Hernandez, RAND Corporation
Krešimir Jakšić, University of Zadar
Ruth Magreta, Lilongwe University of Agriculture and Natural Resources
Greta Savitsky, University of Vermont
The transmission and evolution of pathogens are mediated by the spatial distribution and immunological diversity of the host population. Among other factors, these influence both the risk and the size of potential outbreaks.
Our preliminary results suggest that, under diffusive epidemic dynamics, vaccine-resistant mutants are more likely to emerge and establish when immune types are well mixed rather than spatially clustered.
This research study explores the conflicting findings within environmental psychology concerning how climate change and the associated extreme weather and political events shape global citizens' views and beliefs about climate change. While one might expect direct experience of extreme weather to amplify belief in climate change, the reality is not so straightforward. Immediate weather experiences, or even the temperature of an experimental room, can influence beliefs about climate change, and certain weather anomalies such as floods, droughts, or air pollution-related breathing problems can attract public attention and perhaps shape climate beliefs; broader temperature increases, however, are often less perceptible or do not lead to differences in climate change beliefs. Political affiliation, by contrast, appears to be the most reliable predictor of climate change beliefs, superseding the impact of weather extremes.
This study further investigates how climatic events influence public narratives on climate change. Two primary research questions guide our ongoing investigation: How do major climate-related events act as tipping points in shaping climate change narratives, and what are the significant tipping points altering these narratives? By exploring these questions, we hope to shed light on the complex dynamics between direct experience, political affiliation, and climate change narratives from 2015 to 2023.
We applied a topical space embedding technique on climate-related news articles from the "Climate News DB" (https://www.climate-news-db.com/) using BERTopic. By clustering these articles based on their topics and visualizing the frequency of these topics over time, we traced the evolution of climate narratives. Then, we superimposed major climate-related events on this frequency distribution to assess their influence on these evolving narratives. Our preliminary findings reveal that some global climate-related events, such as COP 26, COP 27, and heatwaves, are evident in the topic time series, whereas others are more elusive, potentially due to topic selection or regionalized impacts of these events.
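The superimposition step of the topic analysis can be sketched with a standard-library stand-in (toy headlines and a keyword count in place of the actual BERTopic pipeline): count a topic's frequency per period, then flag the periods containing major climate events.

```python
# Stdlib stand-in for the topic-frequency / event-superimposition step
# (toy corpus with hypothetical headlines; not the BERTopic pipeline).
from collections import Counter

ARTICLES = {  # period -> headlines
    "2021-11": ["COP26 opens in Glasgow", "heatwave risk debated at COP26"],
    "2022-07": ["record heatwave across Europe", "drought warnings issued"],
    "2022-11": ["COP27 begins in Egypt"],
}
EVENTS = {"2021-11": "COP26", "2022-11": "COP27"}  # events to superimpose

def topic_frequency(keyword):
    """Headlines per period mentioning the keyword (case-insensitive)."""
    return Counter({period: sum(keyword in a.lower() for a in arts)
                    for period, arts in ARTICLES.items()})

series = topic_frequency("cop2")          # matches both COP26 and COP27
for period in sorted(ARTICLES):
    marker = f"  <- {EVENTS[period]}" if period in EVENTS else ""
    print(period, series[period], marker)
```

Spikes in the per-period counts that coincide with the marked events are the kind of signal our preliminary analysis found for COP 26 and COP 27.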
Carla Coburger, University of Bayreuth
Kate Greene, Unaffiliated
Sam Carter, George Mason University
Yuanmo He, London School of Economics and Political Science
Greta Savitsky, University of Vermont
Link to Zine: AI is a Queer Intelligence
https://sfieducation.s3.amazonaws.com/2023+CSSS/Queer-AI-Zine.pdf
Daniel Torren Peraire, Autonomous University of Barcelona
Estelle Janin, Arizona State University
Annie Stephenson, Princeton University
This project looked for the presence of temporal patterns in civil strife events leading up to major conflicts. We analysed the Social, Political and Economic Event Database (SPEED), which gathers nearly 10,000 events across three countries: Liberia, the Philippines and Sierra Leone. In total, 142 variables were extracted from news sources between 1979 and 2008 (e.g. type of event, number of initiators, number of victims, weapon type, etc.).
We aimed to investigate the generalizability of the temporal structure of events prior to a major conflict, and how conserved this structure (if any) is across countries and timescales. This required the formalisation of an intensity measure based on available variables and a thorough comparative study across the entire database. We also generated networks to study how various event indicators evolve, e.g. through initiator-target or event-location networks. Moreover, focusing on the variables with the most causal power, we investigated what the time-series data revealed about causal relationships between the different variables. We attempted to answer this question using compression-complexity causality, allowing for a bivariate estimation of causality between pairs of variables. Finally, we considered meaningful comparisons between this work and early warning signals in ecological systems, in order to provide insights into the major phase transitions of the system.
Overall, this project investigated meaningful quantitative approaches to the social, political and economic evolution of these countries. The analysis sought to ground interpretations in a fine-grained network of correlated variables and to build on strategies that generalize across regions of the world and time periods.
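The compression idea underlying the causality analysis can be illustrated with a rough standard-library proxy (zlib, rather than the effort-to-compress estimator used in compression-complexity causality): if one series adds little to the compressed description of another, the two share dynamical structure.

```python
# Rough compression proxy (zlib; NOT the effort-to-compress measure of
# compression-complexity causality) for the intuition that a related
# series adds little to another's compressed description.
import zlib

def csize(symbols):
    """Compressed size (bytes) of a small integer sequence."""
    return len(zlib.compress(bytes(symbols)))

def extra_size(y, x):
    """How much appending y to x adds to the compressed description."""
    return csize(x + y) - csize(x)

y = [1, 2, 1, 2, 1, 2, 1, 2] * 8          # a simple periodic series
x_related = list(y)                        # shares y's dynamics
x_noise = [7, 3, 0, 5, 9, 4, 1, 6] * 8    # an unrelated pattern
print(extra_size(y, x_related), extra_size(y, x_noise))
```

The related series leaves y almost free to describe, while the unrelated one does not; the published measure replaces zlib with an effort-to-compress complexity and conditions on time-lagged windows.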
Estelle Janin, Arizona State University
Munjung Kim, Indiana University Bloomington
Omar Aguilar, University of California, Santa Cruz
In the realm of scientific exploration, the choice of data sampling strategies can wield significant influence on knowledge acquisition. Recent investigations by Dubova et al. and Musslick et al. assessed the epistemic success of various experimental choice strategies, finding that agents who choose experiments at random tend to develop more accurate theories, thus enabling a better reconstruction of the ground truth (modeled as a multivariate Gaussian distribution and a set of statistical equations in cognitive science, respectively) [1,2]. Building upon these studies, we introduce two novel approaches to simulate the scientific process and investigate the influence of different steps on the validation or rejection of a theory. The first approach uses a transformer-based model called AgentNet to simulate multi-agent models specifically looking for predictive patterns in complex systems [3]. The second approach combines an autoencoder and a neurally-informed symbolic regression [4] pipeline to simulate the scientific intuition and law-making components of the theorization process. In both approaches, we compare three distinct data sampling strategies: confirmation, disagreement, and random sampling, and determine which strategy achieves the most accurate reconstruction of the chosen ground truth(s). Finally, our research delves into the complexities of social learning, exploring how the exchange of knowledge among agents can enhance individual understanding and influence the choice of optimal sampling strategies. By shedding light on the multifaceted dynamics of knowledge acquisition and synthesis, especially the interplay between data sampling, some of the cognitive processes behind theory-making, and social interactions, we contribute to a more comprehensive understanding of scientific discovery and its future prospects.
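The advantage of random over confirmation-style sampling can be illustrated with a toy simulation (a linear ground truth and least-squares recovery, far simpler than the AgentNet or symbolic-regression pipelines above): sampling only near a favoured region inflates the variance of the recovered law.

```python
# Toy comparison of sampling strategies on a linear ground truth
# y = 2x + noise (illustrative stand-in, not the project's pipelines).
import random

random.seed(0)
TRUE_SLOPE = 2.0

def observe(x):
    """Noisy measurement of the ground truth."""
    return TRUE_SLOPE * x + random.gauss(0, 1.0)

def fit_slope(xs, ys):
    """Ordinary least-squares slope."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

def mean_slope_error(sampler, trials=200, n=20):
    """Average absolute error of the recovered slope for a sampling rule."""
    total = 0.0
    for _ in range(trials):
        xs = [sampler() for _ in range(n)]
        total += abs(fit_slope(xs, [observe(x) for x in xs]) - TRUE_SLOPE)
    return total / trials

random_err = mean_slope_error(lambda: random.uniform(0, 10))     # broad
confirm_err = mean_slope_error(lambda: random.uniform(4.5, 5.5)) # narrow
print(random_err, confirm_err)
```

Because the narrow, confirmation-style sampler barely varies x, its slope estimates are roughly an order of magnitude noisier, mirroring the epistemic advantage of random experiment choice reported in [1,2].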
Estelle Janin, Arizona State University
Samuel Ropert, Universidad San Sebastian
Anna Jon-And, Stockholm University
In an effort to advance the conceptual groundwork behind the search for extraterrestrial intelligence and the detection of interstellar communication, this project seeks to uncover universal structural properties of Language, characteristic of its emergence and evolution. The recent improvements of Large Language Models make them ideal substrates for the study of text generation, semantic embeddings, and the interplay between Cognition and Meaning on an unprecedented quantitative scale. Word embeddings are of particular interest since they efficiently map human language to a high-dimensional vector space that we can consistently compare across texts and content. We studied the linguistic features differentially highlighted by two types of embeddings, in Euclidean space (e.g., Word2Vec) and Poincaré space. When coupled with a platform like WordNet (leveraging hyponymy, hypernymy and synonymy among words), we find that Poincaré maps uniquely capture the hierarchy and modularity of language by embedding words in a hyperbolic geometric space. Future work will focus on determining appropriate tokenization choices, developing a meaningful metric to study the trajectory of a Large Language Model in a semantic embedding as it generates text, and assessing its fundamental structural properties. The latter will be achieved by using a newly developed measure of combinatorial complexity called Assembly Theory, which captures signatures of selection and evolution by estimating the minimum number of steps required to build an object based on patterns, symmetries and recurring structures. Altogether, these insights will form the basis for a better understanding and formalization of Language as a major phase transition in the trajectory of Life, and will open new avenues for its detection elsewhere in the universe.
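The hyperbolic geometry behind Poincaré embeddings can be sketched directly from the distance function of the Poincaré ball; the coordinates below are illustrative placements, not trained embeddings.

```python
# Distance in the Poincaré ball, the metric underlying Poincaré
# embeddings. Points near the boundary are exponentially far apart,
# which is what lets the origin act as a natural root of a hierarchy.
import math

def poincare_distance(u, v):
    """d(u, v) = arcosh(1 + 2|u - v|^2 / ((1 - |u|^2)(1 - |v|^2)))."""
    sq_norm = lambda w: sum(c * c for c in w)
    diff = sq_norm([a - b for a, b in zip(u, v)])
    return math.acosh(1 + 2 * diff / ((1 - sq_norm(u)) * (1 - sq_norm(v))))

root = (0.0, 0.0)     # a general term placed near the origin
leaf_a = (0.9, 0.0)   # specific terms placed near the boundary
leaf_b = (0.0, 0.9)
print(poincare_distance(root, leaf_a))    # root stays close to everything
print(poincare_distance(leaf_a, leaf_b))  # boundary siblings are far apart
```

This asymmetry (a central node is moderately close to all leaves, while leaves are mutually distant) is exactly the hierarchy-and-modularity signal that Euclidean embeddings like Word2Vec cannot express as compactly.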
Yannick Oswald, Complexity Science Hub Vienna
Asael (Ace) Sorensen, Helmholtz Institute of Functional Marine Biodiversity (HIFMB)
Anja Janischewski, Boston University
Spatial heterogeneity in prices is frequently found in real-world economic markets, such as housing prices or fuel prices at gas stations. Geographical differences in prices can cause challenges such as the exacerbation of economic inequalities or market distortions. To address these complex dynamics inherent in real-world economic markets, agent-based modeling emerges as a powerful and flexible tool, capable of capturing and analyzing the nuanced impact of spatial heterogeneity.
In the tradition of earlier work on the “Tâtonnement process”, we build an agent-based model of a spatially organized market with a homogeneous good, where traders follow an aspiration level heuristic influenced by their past trading experience. Sellers articulate prices in an adaptive manner and buyers search for the cheapest price within a limited search radius.
We then analyze the convergence dynamics of the spatial and non-spatial models. Furthermore, we compare the results to the corresponding market equilibrium derived from the constructed demand and supply curves.
There are several possible extensions to this work. First, we intend to analyze meta-stable states, specifically, spatially local equilibria, and determine the conditions under which a global equilibrium emerges. Second, we plan to explore meso-level model aggregations, such as equation-based models like partial differential equations. Such models might provide an analytically tractable tool while still maintaining some of the emergent properties of the agent-based model. Third, we aim to advance the model to perform basic welfare analysis, including the examination of taxation effects.
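The adaptive pricing loop can be sketched in a stylized form (all parameter values and the one-unit-per-round capacity are illustrative assumptions, not the model's calibration): sellers on a line nudge an aspiration price up after a sale and down after failing to sell, while buyers purchase from the cheapest seller within their search radius at or below their reservation value.

```python
# Stylized sketch of the spatial market with aspiration-level pricing
# (illustrative parameters; one unit of the homogeneous good per seller
# and per buyer each round).
import random

random.seed(1)
N, RADIUS, STEP = 20, 3, 0.2          # sellers on a line, search radius
RESERVATION = 10.0                    # buyers' maximum willingness to pay
price = [random.uniform(1, 15) for _ in range(N)]  # seller i sits at x = i

for _ in range(500):
    sold = [False] * N
    available = set(range(N))
    buyers = list(range(N))
    random.shuffle(buyers)
    for b in buyers:                  # buyer b also sits at x = b
        nearby = [s for s in available
                  if abs(s - b) <= RADIUS and price[s] <= RESERVATION]
        if nearby:
            s = min(nearby, key=lambda j: price[j])  # cheapest in reach
            sold[s] = True
            available.discard(s)
    # Aspiration update: raise the price after a sale, lower it otherwise.
    for s in range(N):
        price[s] = max(0.01, price[s] + (STEP if sold[s] else -STEP))

spread = max(price) - min(price)
print(round(sum(price) / N, 2), round(spread, 2))
```

With balanced demand and capacity, prices converge to a narrow band near the buyers' reservation value; spatial price dispersion in this sketch survives only as small oscillations, which is the baseline against which heterogeneous extensions can be compared.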
Sam Carter, George Mason University
Pablo Geraldo, University of California, Los Angeles
Yuanmo He, London School of Economics and Political Science
Marilena Hohmann, University of Copenhagen
Cheyenne Jarman, Oregon State University
Annie Stephenson, Princeton University
Social conflicts can have a wide range of consequences, from civic engagement to political polarization. However, the factors determining whether these conflicts result in increased engagement and participation or political apathy remain unclear. Therefore, this project seeks to address the question: when does conflict lead to engagement? We hypothesize that there is an intermediate range of disagreement between individuals in a social network that increases engagement, in between complete apathy and unsolvable disagreement. To investigate this claim, we develop a network model that examines social influence and contagion when people exchange opinions. We plan to use this model to derive hypotheses that we can subsequently contrast with dynamics observed in social media data, and eventually test in an experimental study. The findings are not only relevant for understanding (political) opinion dynamics but also extend to other settings involving group interactions and conflict, such as business, arts, or science.
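The hypothesized intermediate range of disagreement can be sketched as an inverted-U response curve (the Gaussian form and its parameters are an illustrative choice, not a fitted model):

```python
# Illustrative inverted-U engagement curve: the probability of engaging
# with an opposing view peaks at intermediate opinion distance, between
# complete apathy (no disagreement) and withdrawal (unsolvable
# disagreement). The Gaussian form is a hypothetical choice.
import math

def engagement_probability(opinion_distance, peak=1.0, width=0.5):
    """Highest at `peak`, vanishing at zero and at large distances."""
    return math.exp(-((opinion_distance - peak) ** 2) / (2 * width ** 2))

distances = [0.0, 0.5, 1.0, 1.5, 2.0]
probs = [engagement_probability(d) for d in distances]
print([round(p, 3) for p in probs])  # engagement peaks at distance 1.0
```

In the network model, a response of this shape would govern whether an exchange of opinions triggers further interaction, making the engagement-maximizing range of disagreement an emergent, testable quantity.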
Director
Project Coordinators
Tamara van der Does | Santiago Guisasola
Faculty
Liz Bradley • nonlinear dynamics | Aaron Clauset • networks | Jen Dunne • food webs | Mirta Galesic • opinion dynamics | David Krakauer • complexity & emergence | Melanie Mitchell • abstraction | Brandon Ogbunu • structural inequalities | André de Roos • population structure | Porter Swentzell • indigenous culture | Andreas Wagner • fitness & evolution | Geoffrey West • singularities | Hyejin Youn • innovation - Incomplete (more to come soon)