Alexander Ortiz

Northwestern University

Mentor: Cristopher Moore

A Summer of Manifolds

Smooth manifold theory is a foundational theory for many branches of mathematics including differential geometry, mathematical physics, and differential topology. In this paper I present a brief reflection on my work in the REU at the Santa Fe Institute and some of the fundamental notions of this theory, including smooth atlases, maps, tangent spaces, immersions, submersions, and Lie groups.

Presentation Video Link


Alyssa Romero Johnson

New Mexico Highlands University

Mentors: Chris Kempes and Geoffrey West

Scaling in Housing Availability and Affordability

When discussing housing affordability and access, we often refer to a community’s average income or median home price, without taking into account the actual distribution, complete with skew and outliers, of these measures. This project aims to create a new measure of availability of affordable housing by quantitatively comparing the shapes of the frequency distributions of income and housing cost. I will then explore whether this measure of availability exhibits a scaling relationship with respect to city size, the way that so many other urban attributes do.


Benjamin Anker

Central New Mexico Community College

Mentors: Mirta Galesic and Joshua Garland

Inferring Prejudice based upon Part of Speech Differences

In this paper we examine the ways in which identical words can have different semantic meanings and contexts, both across time and across political spectra. We propose that consistently using the same words as different parts of speech may suggest an individual’s or community’s political leanings, and that examining the semantic contexts of these words may illuminate how the beliefs of individuals and communities evolve. For example, “the gays” (gay used as a noun) might be more prejudicial than “gay people” (gay used as an adjective).

The approach taken is similar to the strategies employed by Kulkarni et al. (2014)[1] in that we examine frequency, syntax, and semantics. Frequency analysis involves the ratio of a given word's count to corpus size, while syntactic analysis examines the part of speech in which a word is used, as in the example above. Finally, semantic analysis involves training separate instances of word2vec on corpora divided by time or community, then comparing the resulting vectors to find words that carry different meanings in different times or communities.
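The frequency and syntactic measures described above can be sketched in a few lines. This is a minimal illustration only, not the project's code: the function names, the toy corpora, and the hand-assigned part-of-speech tags are all assumptions made for the example.

```python
from collections import Counter

def relative_frequency(tokens, word):
    """Ratio of a word's count to corpus size (number of tokens)."""
    if not tokens:
        return 0.0
    return Counter(tokens)[word] / len(tokens)

def pos_share(tagged_tokens, word, tag):
    """Fraction of a word's occurrences that carry a given part-of-speech tag.

    tagged_tokens is a list of (word, tag) pairs; how the tags are produced
    is outside this sketch.
    """
    tags = [t for w, t in tagged_tokens if w == word]
    if not tags:
        return 0.0
    return tags.count(tag) / len(tags)

# Toy corpora, invented for illustration only.
corpus_a = "the market is free and the press is free".split()
corpus_b = "the free market".split()

# Hypothetical hand-tagged tokens, invented for illustration only.
tagged = [("gay", "NOUN"), ("gay", "ADJ"), ("gay", "ADJ"), ("people", "NOUN")]

freq_a = relative_frequency(corpus_a, "free")
freq_b = relative_frequency(corpus_b, "free")
noun_share = pos_share(tagged, "gay", "NOUN")
```

Comparing `freq_a` with `freq_b`, or the noun share of a word across two communities, gives the kind of per-corpus contrast the frequency and syntactic analyses rely on.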

Data used are news articles and comments published on the websites of Breitbart, The Atlantic, Mother Jones, and The Hill, in the period from 2015 to 2017. We believe that this approach may enable the early detection of individuals or communities drifting towards more radical views, especially those under the influence of demagogues with distinctive patterns of rhetoric.


Edward Greg Huang

University of California, Berkeley

Mentor: David Wolpert

Algorithmic Information and Inference Devices

There has been much interest surrounding what properties of the universe can be derived from applying a mathematical formalization of inference and knowledge. Previous work by Wolpert used the theory of "inference devices" (IDs) to demonstrate bounds on knowledge in any physical universe that allows agents to hold information concerning that universe. We extend previous work on the capacity and limitations of IDs to infer physical variables. Our results impose conditions on the inference of singular functions and sets of functions. We pursue analogues between IDs and their relation to Turing machine theory and algorithmic information theory. In particular, we show that any Turing machine can be strongly inferred and build upon that to demonstrate incompressibility of strong inference complexity. This incompressibility result has led to several analogues between Kolmogorov complexity and inference complexity that suggest further similarities between algorithmic information theory and the theory of inference devices.

Presentation Video Link


Jaeweon Shin

Rice University

Mentors: David Wolpert and Michael Price

Analysis on Seshat Dataset

In 2017, a group of researchers (Turchin et al. 2017) published a large database of historical records called Seshat. Using this dataset, the same researchers concluded that there is a major axis of social complexity, which represents how complex a society is at a given time period. By expanding on this analysis, we show that there are, in fact, two clusters within the major axis that display different characteristics. Furthermore, we show that these two clusters could imply that all societies can largely be classified into two stages based on their relative complexity.


Keming Zhang

St. John's College

Mentor: Sidney Redner

Junction Problems in Asymmetrical Exclusion Process

The Asymmetric Exclusion Process has long been an interesting subject in statistical physics. The classical model studies the density profile of particles moving in one lane. This model has generated many interesting phenomena, which leads us to consider the following question: What is the density profile if multiple lanes are merged into one lane, or if one lane is expanded into multiple lanes? We ran computer simulations many times and averaged the resulting density profiles. The outcome is intriguing and worthy of further analytical study.
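The simulate-and-average procedure can be sketched for the classical one-lane model. This is a minimal sketch under stated assumptions, not the project's simulation: it uses a totally asymmetric process with open boundaries and random sequential updates, the entry rate `alpha`, exit rate `beta`, and all function names are hypothetical, there is no burn-in period, and the junction geometries studied in the project are not modeled.

```python
import random

def simulate_tasep(n_sites, alpha, beta, n_steps, rng):
    """One run of a one-lane, open-boundary exclusion process.

    Particles enter at the left with probability alpha, hop right into
    empty sites, and leave at the right with probability beta.
    Returns the time-averaged occupancy of each site.
    """
    lattice = [0] * n_sites
    totals = [0] * n_sites
    for _ in range(n_steps):
        # One sweep: n_sites + 1 randomly chosen update attempts.
        for _ in range(n_sites + 1):
            i = rng.randrange(-1, n_sites)  # -1 stands for an injection attempt
            if i == -1:
                if lattice[0] == 0 and rng.random() < alpha:
                    lattice[0] = 1
            elif i == n_sites - 1:
                if lattice[i] == 1 and rng.random() < beta:
                    lattice[i] = 0
            elif lattice[i] == 1 and lattice[i + 1] == 0:
                lattice[i], lattice[i + 1] = 0, 1
        for j in range(n_sites):
            totals[j] += lattice[j]
    return [t / n_steps for t in totals]

def average_density_profile(n_sites, alpha, beta, n_steps, n_runs, seed=0):
    """Average the per-site density over several independent runs."""
    rng = random.Random(seed)
    mean = [0.0] * n_sites
    for _ in range(n_runs):
        run = simulate_tasep(n_sites, alpha, beta, n_steps, rng)
        mean = [m + r / n_runs for m, r in zip(mean, run)]
    return mean

profile = average_density_profile(n_sites=30, alpha=0.3, beta=0.7,
                                  n_steps=200, n_runs=5, seed=1)
```

A merging or splitting junction would replace the single `lattice` with several lane arrays that share a boundary site; that extension is left out here because the abstract does not specify the junction rules.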


Maddie Barrie

Michigan Technological University

Mentors: Elizabeth Hobson and Scott E. Page

The Ideal Diversity to Cultivate Successful Humanitarian Engineering

Diversity bonuses occur when a group with diverse cognitive skills works together inclusively on cognitive non-routine tasks. The ideal cognitive repertoire and type of inclusion differ depending on the task that needs to be completed. The goal of this project is to understand to what extent diversity bonuses occur in philanthropic engineering, specifically looking at projects done by Engineers Without Borders (EWB). Furthermore, the project will examine variances in cognitive repertoires and types of inclusion present in EWB projects using surveys sent out to the student international project teams of the organization. By examining the structure and composition of student chapters and comparing them with the successes and failings of their respective projects, insights into the ideal configuration can be deduced.


Megan Bromley

Arizona State University (ASU)

Mentors: Manfred Laubichler and Elizabeth Hobson

The field of astrobiology is notoriously interdisciplinary, requiring the combined efforts of experts in comparatively more defined disciplines (physics, astrophysics, chemistry, biology, geology, sociology, engineering, and philosophy, to name a few). The degree to which each of these disciplines is incorporated into the body of work that we call astrobiology is yet undefined. It is also unclear whether the community that we call astrobiology is robust and clear enough to be called a discipline or a field in its own right. I created a set of networks using two corpora of astrobiology texts in order to make steps toward answering these questions. Results are preliminary, but indicate that the term “astrobiology” can be clustered into two broad categories of study, while “astrobiology” in turn can be regarded as a subset of other broader research interests.


Michael Neuder

University of Colorado, Boulder

Mentors: Joshua Garland and Andrew Berdahl

Animal Tracking using Deep Learning

Collective animal behavior is one of the canonical examples of complex systems, and with the immense development of machine learning, computer vision, and big data, we now have a new way to explore this phenomenon. Using video footage taken from drones flying above herds of caribou in the Yamal Peninsula in Russia, we seek to extract the trajectory data of each animal in the herd. This is a complicated image tracking problem because there are many caribou that are small relative to the frame, and they are camouflaged well with the ground. Once we have the trajectory data, there are several questions we seek to answer about the behavior of the group. The most fundamental aspect is the local interaction rules of the individuals in the group (e.g. individuals tend to repel each other at a close distance, while attracting each other at a long distance to keep the herd together). The follow-up and long-term goal of this project is to learn how animal herds behave in a resource gradient. We have the technology and capability to get extremely accurate data about the plant life of the land, which would allow us to create a resource map detailing where the plant life is concentrated. Overlaying the herd trajectory on this resource map would allow us to see how the collection of animals interacts with the gradient of the map and give us great insight into the capabilities of herds.


Naika Dorilas

Florida Atlantic University

Mentors: Jacopo Grilli and Andy Rominger

Inferring Characteristics of Interaction Matrices in an Ecological Context

In ecology, an interaction matrix describes how species in an ecosystem interact with one another and affect each other's growth in a given model of population growth. In an attempt to find a method for inferring species interactions, many ecologists have tried to infer the entire interaction matrix from time series data on the population fluctuations of species. In this paper, the objective is instead to infer, from a set of data, three key statistical properties describing how the entries of the matrix are drawn: the mean m, the variance s, and the average diagonal element d. That is, for a specified ecosystem we seek a method of inferring the statistical properties of the interaction matrix between species based on a simple model of population fluctuation. We will use maximum likelihood: we compute the probability of observing our data x given m, s, and d, and take the values of m, s, and d that make the data (the fluctuations in the species' population sizes) most likely. If this method is successful, any randomly generated matrix with the inferred statistical properties could be used to describe how the species in the data set interact.
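To make the three statistics concrete, the sketch below draws a random interaction matrix with chosen m, s, and d and then recovers those statistics from it. It is an illustration under assumptions that are not in the abstract: off-diagonal entries are taken to be independent Gaussian draws, every diagonal entry is set to d, m and s are taken to describe the off-diagonal entries only, and the function names are hypothetical. It does not perform the inference from time series data that the project targets.

```python
import random
from statistics import fmean, pvariance

def random_interaction_matrix(n, m, s, d, rng):
    """n x n matrix: off-diagonal entries ~ Normal(mean m, variance s), diagonal = d."""
    sd = s ** 0.5
    return [[d if i == j else rng.gauss(m, sd) for j in range(n)]
            for i in range(n)]

def summary_statistics(matrix):
    """Return (off-diagonal mean, off-diagonal variance, mean diagonal element)."""
    n = len(matrix)
    off_diagonal = [matrix[i][j] for i in range(n) for j in range(n) if i != j]
    diagonal = [matrix[i][i] for i in range(n)]
    return fmean(off_diagonal), pvariance(off_diagonal), fmean(diagonal)

matrix = random_interaction_matrix(n=50, m=0.1, s=0.04, d=-1.0,
                                   rng=random.Random(0))
m_hat, s_hat, d_hat = summary_statistics(matrix)
```

If the matrix entries were observed directly, the sample mean and population variance computed here would be the Gaussian maximum likelihood estimates of m and s. The harder problem described above is that only population fluctuations are observed, so the likelihood has to be expressed through the population model.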


Nicolas Gort Freitas

Minerva Schools at KGI

Energetic and informational trade-offs in biochemical networks

Mentors: Chris Kempes and Artemy Kolchinsky

Scaling of Information in Biochemical Systems

Cellular environments are complex and dynamic. The temporal and spatial concentration profiles of external molecules encode information that cells can harness through networks of interacting molecules, which allow them to learn about their environment. Despite the numerous sources of intrinsic and extrinsic noise that limit the capacity of cells as information channels, cells manage to thrive, replicate and maintain chemical equilibrium. In the following work, I will study how the inherent stochasticity in protein synthesis limits the transmission of the compositional information of the cell over time, and how this stochasticity scales with respect to cell size.


Oluwasunmisola (Sunmi) Ojewumi

Minerva Schools at KGI

Mentors: Mirta Galesic and Michael Price

The Effect of Heterogeneous Susceptibility on the Speed of Disease Spread and the Final Epidemic Size

An assumption often made when designing epidemic models is that all people in a population have the same susceptibility to a disease. However, that assumption is not entirely accurate: people have different susceptibilities to a disease because of differences in genes, lifestyles, and other factors. Nonetheless, not much work has been done to show how this heterogeneous susceptibility can affect the predictions made from disease models. Here we show that, in the SIR model, it is important to take heterogeneous susceptibility into account, as it affects the final epidemic size of a disease, although it does not significantly affect the speed of the spread of the disease.
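One way to see the effect on final epidemic size is a deterministic SIR model in which the susceptible population is split into groups with different susceptibility multipliers. The sketch below is an assumption-laden illustration, not the project's model: the group structure, parameter values, forward-Euler integration, and function name are all chosen for the example.

```python
def final_epidemic_size(susceptibilities, weights, beta=0.3, gamma=0.1,
                        i0=1e-4, dt=0.05, t_max=2000.0):
    """Integrate a multi-group SIR model and return the final epidemic size.

    susceptibilities: relative susceptibility multiplier of each group
    weights: fraction of the population in each group (should sum to 1)
    The returned value is the fraction ever infected, including the seed i0.
    """
    s = [w * (1.0 - i0) for w in weights]  # susceptible fraction per group
    i = i0                                 # infected fraction
    t = 0.0
    while t < t_max and i > 1e-9:
        force = beta * i
        new_infections = [force * sigma * s_k * dt
                          for sigma, s_k in zip(susceptibilities, s)]
        s = [s_k - x for s_k, x in zip(s, new_infections)]
        i += sum(new_infections) - gamma * i * dt
        t += dt
    return 1.0 - sum(s)

# Same mean susceptibility (1.0) in both scenarios.
homogeneous = final_epidemic_size([1.0], [1.0])
heterogeneous = final_epidemic_size([0.2, 1.8], [0.5, 0.5])
```

Comparing `homogeneous` with `heterogeneous` shows how spreading susceptibility around the same mean changes the fraction of the population that is eventually infected.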


Sahana Subramanyam

Azim Premji University (India)

Mentors: Samuel Bowles, Wendy Carlin, and Michael Price

Continuity and Change in Economics Pedagogy: A Computational Linguistic Analysis of Textbooks

This project explores the discipline of economics using tools from computational linguistic analysis to identify novelty and persistence in economics pedagogy. Key observations from this computational approach to the history of economic thought reveal novelty and persistence in the Samuelsonian paradigm, and a challenge to that paradigm by CORE after the financial crisis.


Seung Yeon (Yona) Han

St. John's College

Mentors: Michael Price and Andy Rominger

Economic Inequality and Professional Diversity

Income inequality and occupational diversity have each attracted considerable interest separately, but little research has examined them together. Both affect society, and society is itself complex. It is therefore natural to ask how income inequality and occupational diversity are correlated. This research seeks to answer that question.


Terran Mott

Grinnell College

Mentor: David Feldman

Building Intuition in Higher Dimensions: A Combinatorial Approach to Higher Dimensional Fractal Geometry

Visual intuition is limited to three dimensions. We can’t trick our imagination into accurately picturing the structure of a four or five dimensional object. Despite this intuitive blindfold, plenty of reasonable geometry lives comfortably in higher dimensions. In this paper, we will build intuition about higher dimensional material in a non-visual way. We will develop tools to explore patterns in any dimension without the need for visualization. This exercise offers great practice at strengthening one’s imagination and at building intuition about a non-intuitive concept.

Our exploration is restricted to the study of N-dimensional cubes, also called hypercubes. The cube is a simple Platonic solid that gets along well with simple notions of dimension. Many common fractals require some sort of N-cube as an originator: a line segment, a square, or a cube. We will explore the higher dimensional analogs of such fractals. In the process, we’ll reveal clever patterns about the boundaries and innards of non-fractal hypercubes.


The Santa Fe Institute REU Program is supported by the National Science Foundation under Grant Number ACI-1757923, the ASU-SFI Center for Biosocial Complex Systems and private donors. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation or Arizona State University.