Diana Avila Padilla
Universidad Nacional Autónoma de México & Instituto Tecnológico Autónomo de México (MX)
Mentor: Chris Kempes
A comprehensive understanding of cell physiology can be a starting point for understanding the origins of life and how we might find cell-like systems beyond Earth. One way to approach this question is by analyzing the different functions performed by a living cell. How are the different functions of the cell related to each other, and how do volume constraints affect this relationship? By understanding the dependency between the different functions of the cell, the volume required to perform each function, and the total volume of the cell, we can explore fundamental questions in biology, such as what life is and how we can look for life beyond Earth. This understanding is relevant to astrobiology, since the relationship between the amount of material and the cell function can be used as an agnostic signature when looking for life beyond Earth, and to understand the origin of life.
In this project, we showed how the functions of information storage (genetics), functionality (metabolism), and translation (intermediates) interact with each other using two mathematical models. We used a system of differential equations for each model, and we solved both models using Wolfram Mathematica. For the first model, we found analytical solutions that express the relation between the variables involved in the functions of information storage, metabolism, translation, and energy. For the second model, we proposed a simpler version of the cell that takes into account only the functions of information storage and metabolism, and we identified two variables of interest (α and γ), since their behavior changes between the two functions. These findings are the beginning of the project. The next steps will use them to propose a new pathway to look for life beyond Earth that is based not on the chemicals that living organisms have on Earth, but on the functions they fulfill and the right combinations of chemical percentages needed to accomplish those functions.
University of Washington (US)
Mentor: Michael Lachmann
Sexual reproduction is widespread across life on Earth. Yet it is unclear why sexual reproduction remains stable in a population, since a female that discards the genetic material she gets from the male and instead uses only her own genetic material will transfer twice as much genetic material to the next generation. J.B.S. Haldane showed that in asexual species there is a limit to the rate at which selection can accumulate information, a limit known as “Haldane’s dilemma”. Later, John Maynard Smith showed that under some circumstances, this limit can be surpassed. However, it is unknown how much information from the environment enters the genome of a sexually reproducing species per generation. We hypothesize that sexual reproduction can transmit more information about the environment for a given amount of selection than asexual reproduction. This project shows the first steps toward using simulations and theoretical models from information theory to determine if sexually reproducing species cross the asexual information threshold. To do this, we measured the mutual information between an ideal genotype defined by the environment and the genotypes of the individuals in the population. Here we show that the speed at which the environment changes has different effects on the amount of mutual information in asexual and sexual populations. The results from this project could then be used to understand how quickly specific species can respond to changes in their environment.
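The mutual-information measurement described above can be illustrated with a minimal sketch. It assumes genotypes are bit strings and that the joint distribution is estimated by pooling (ideal allele, individual allele) pairs over all loci and individuals. The function names and toy data are hypothetical and are not taken from the project's simulations.

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Plug-in estimate of I(X;Y) in bits from a list of (x, y) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y) written in terms of raw counts
        mi += p_xy * math.log2(count * n / (px[x] * py[y]))
    return mi

def genotype_information(ideal, population):
    """Assumption: pool allele pairs across all loci and all individuals."""
    pairs = [(i, g) for genome in population for i, g in zip(ideal, genome)]
    return mutual_information(pairs)

# Toy data (hypothetical): an ideal genotype set by the environment.
ideal = [0, 1, 1, 0, 1, 0, 0, 1]
adapted = [list(ideal) for _ in range(4)]                    # perfectly matched
unrelated = [[0, 0, 0, 0, 1, 1, 1, 1] for _ in range(4)]     # independent of ideal

print(genotype_information(ideal, adapted))
print(genotype_information(ideal, unrelated))
```

In this toy setting a perfectly adapted population carries the full entropy of the ideal genotype, while a population whose alleles are statistically independent of it carries none. Tracking this quantity over generations as the ideal genotype shifts would be one way to compare sexual and asexual populations.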
Florida International University (US)
Mentors: Melanie Mitchell, Arseny Moskvichev
Feedforward Neural Networks (FNNs) and Genetic Algorithms (GAs) have proved to be powerful tools in machine learning (ML). However, FNNs and GAs can be resource-intensive to implement and brittle. Coupling FNNs with GAs and introducing a spatial structure for parameter optimization can help create more robust ML models. This project aims to optimize an FNN using a GA instead of back-propagation (BP), together with a grid structure that contains a neural network (NN) in each cell. Each NN is tasked with correctly identifying the examples in the MNIST handwritten digit data set. Using this model, we will investigate the following:
(i) We will determine the effects of population size on the maximum validation accuracy achieved by varying the size of our population in our non-spatial GA and comparing the performance of each population size.
(ii) We will measure and compare the diversity between generations by comparing every individual of the current generation to all individuals in the previous generation using cosine similarity and taking the average of their similarities.
(iii) We will gauge the spatial GA’s ability to generalize by reducing the number of data samples the NNs train on in both the spatial and non-spatial GA. We will then compare their validation accuracy and measure of over-fitting on each new data set. This machine learning model can be used as a computational framework in fields such as collective computation, population dynamics, and evolutionary biology.
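The diversity measure in item (ii) can be sketched as follows. This is a minimal illustration that assumes each individual is represented by its flattened, non-zero weight vector. The function names and toy vectors are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mean_cross_generation_similarity(current, previous):
    """Average similarity of every current individual to every previous one."""
    sims = [cosine_similarity(c, p) for c in current for p in previous]
    return sum(sims) / len(sims)

# Toy weight vectors (hypothetical); real individuals would hold all NN weights.
previous_generation = [[1.0, 0.0], [0.0, 1.0]]
current_generation = [[1.0, 0.0]]

print(mean_cross_generation_similarity(current_generation, previous_generation))
```

A value near 1 means the new generation points in nearly the same direction in weight space as the old one (low diversity between generations), and lower values indicate more change.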
University of New Mexico (US)
Mentors: Chris Kempes, Jack Shaw
Molecular and fossil evidence suggest that complex multicellularity evolved during the late Neoproterozoic era, coincident with the Snowball Earth glaciations, when ice sheets covered most of the globe. During this period, environmental conditions (such as sea water temperature and the availability of photosynthetically active light in the oceans) likely changed dramatically. Such changes would have had significant effects on both nutrient availability and optimal phenotypes, creating selective pressures. Recent work identified an increase in viscosity (driven by decreasing water temperatures during glaciation) as a possible driver for an increase in size in heterotrophic eukaryotes, and identified this increase as a path toward animal multicellularity. Here we explore how changes in light and temperature during Snowball Earth may have impacted organisms and their environments. First, to test whether decreasing temperatures led to a selective pressure for larger organisms (and, perhaps, for multicellularity), we modeled the dependence of nutrient uptake and metabolism on temperature and body size. Second, to test how the ice sheet coverage impacted nutrient availability, we modeled how changes in temperature and the level of photosynthetically active light affected primary productivity. By testing a series of alternate, and commonly debated, hypotheses, we can explore putative selective pressures that led to multicellularity. Understanding the evolution of multicellularity is vital for elucidating the explosion of animal life following the Neoproterozoic.
Howard University (US)
Mentors: Vicky Chuqiao Yang, Michael Price
Under what conditions will United States Senators with diametrically opposed views collaborate? How can knowledge of these circumstances, and possibly even their manipulation, inform policymakers’ coalition-building strategies? Answering these questions will help policymakers make decisions both about framing and about which Senators they should approach as possible allies. As polarization makes it increasingly difficult to pass legislation that deals with the major issues facing this country (e.g. the climate crisis, rising inequality, growing authoritarian sentiment, etc.), it will become necessary to explore unusual coalitions to successfully pass laws. Here we begin the work of building a probabilistic model to predict which Senators should collaborate by modeling Senators’ ideologies based on information about their constituents. In evaluating these models of Senator ideologies, we discovered that little to no linear relationship exists between information about a state’s demographics and its Senators’ ideology. We are currently working on constructing a neural network architecture that may provide insight into non-linear dynamics at play between demographic data and ideology. Successful creation and implementation of the probabilistic model would represent an advanced tool in the analysis of American politics, and could possibly even be used in other nations with similar legislative structures.
Colorado School of Mines (US)
Mentors: Miguel Fuentes, Marco Buongiorno Nardelli, Chris Kempes
How does a composer use new chords and transitions over the course of their career? Are there certain periods in a composer’s career where they are more exploratory? These questions are of interest to musicologists and historians who hope to understand the evolution of a composer. However, there are currently no quantitative approaches to answer these questions. Using Chopin as a case study, we determined the discovery rate for distinct chords and transitions. Through this analysis, we found that Chopin does not have any punctuated periods of exploration and that these results were invariant to random ordering of the corpus. Because of these findings, we developed a new model of musical exploration based on a diffusive process. To do this, we created a space of all possible chords and transitions and then simulated diffusion on it. We then compared the predicted discovery rates for chords and transitions with the data. We found that Chopin’s exploration of the chord and transition spaces is not strictly diffusive. Future research will address augmenting the structure on which the process diffuses to account for differences in the rates of exploration.
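The discovery-rate analysis described above can be sketched in a few lines. The example assumes the corpus is a chronologically ordered list of pieces, each a list of chord labels, that transitions are counted only within a piece, and that the random-ordering check shuffles whole pieces. The data and function names are hypothetical.

```python
import random

def discovery_curve(items):
    """Cumulative number of distinct items seen after each position."""
    seen = set()
    curve = []
    for item in items:
        seen.add(item)
        curve.append(len(seen))
    return curve

def transitions(piece):
    """Consecutive chord pairs within one piece."""
    return list(zip(piece, piece[1:]))

def corpus_items(pieces, use_transitions=False):
    """Flatten a corpus into a sequence of chords or of transitions."""
    items = []
    for piece in pieces:
        items.extend(transitions(piece) if use_transitions else piece)
    return items

def shuffled_pieces(pieces, seed=0):
    """Assumption: the null model reorders whole pieces at random."""
    rng = random.Random(seed)
    shuffled = list(pieces)
    rng.shuffle(shuffled)
    return shuffled

# Toy corpus (hypothetical chord labels, not Chopin's music).
corpus = [["C", "G", "C"], ["Am", "G", "F"], ["C", "F", "G"]]
chord_curve = discovery_curve(corpus_items(corpus))
transition_curve = discovery_curve(corpus_items(corpus, use_transitions=True))
shuffled_curve = discovery_curve(corpus_items(shuffled_pieces(corpus, seed=1)))

print(chord_curve)
print(transition_curve)
print(shuffled_curve)
```

Comparing the shape of the chronological curve with curves from many shuffled orderings is one way to check whether bursts of discovery depend on when pieces were written.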
Minerva University (US)
Mentors: Tyler Millhouse, Yuanzhao Zhang
Over the years, applications of machine learning and neural networks have become increasingly ubiquitous in tasks from voice-to-text transcription to image recognition, generative modeling, and beyond. Although the function of neural networks (NNs) is understood on a mechanistic level, e.g. transmission of signals across neurons/layers and pathfinding strategies like gradient descent, little is understood about the higher-level loss landscapes neural networks navigate or the intricate functions they approximate. Current deep learning research focuses more on industry use cases and optimizing or adding new functionality to NN architectures than it does on understanding or interpreting how existing methods work. As in other systems of collective dynamics, characterizing the emergent rules that govern a system of nodes on a higher level proves difficult even if we know how individual perceptrons fire. This project aims to clarify how neural networks converge to solutions in convoluted high-dimensional loss landscapes. We hypothesize that, as in other coupled high-dimensional systems, the energy/loss landscapes of NNs will follow a complex octopus-like structure with tentacle-like basins of attraction winding through the state space as opposed to smooth/straight paths towards a global optimum. To understand the complex structures of high-dimensional neural network loss landscapes, we apply different tricks to explore the state space such as plotting convergence and distance until convergence along different rays projected out from the local optima of a neural network. We aim to map how basins of attraction spread across a landscape and characterize differences in generalization performance versus basin structure across competing minima. We further study the loss landscapes of NNs of various architectures to interpret how choices of activation function, loss function, depth, etc. affect performance. 
Visualizing and understanding the loss landscapes of neural networks offers insights into how NNs learn and how we might improve generalization performance.
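The ray-based probing mentioned above can be illustrated on a toy problem. This sketch stands in a two-dimensional double-well function for a neural network loss, which is a strong simplification. It measures how far one can move from a known minimum along a ray before gradient descent no longer returns to that minimum. The loss function, step sizes, and function names are all assumptions made for illustration.

```python
import math

def loss_gradient(x, y):
    # Toy loss f(x, y) = (x^2 - 1)^2 + y^2 with minima at (1, 0) and (-1, 0).
    return 4.0 * x * (x * x - 1.0), 2.0 * y

def descend(x, y, learning_rate=0.01, steps=2000):
    """Plain gradient descent from (x, y); returns the final point."""
    for _ in range(steps):
        gx, gy = loss_gradient(x, y)
        x -= learning_rate * gx
        y -= learning_rate * gy
    return x, y

def basin_extent(minimum, direction, step=0.25, max_distance=2.0, tol=1e-3):
    """Largest probed distance along the ray that still converges to `minimum`."""
    norm = math.hypot(direction[0], direction[1])
    dx, dy = direction[0] / norm, direction[1] / norm
    for k in range(1, int(max_distance / step) + 1):
        t = k * step
        end_x, end_y = descend(minimum[0] + t * dx, minimum[1] + t * dy)
        if math.hypot(end_x - minimum[0], end_y - minimum[1]) > tol:
            return (k - 1) * step  # first probe that leaves the basin
    return max_distance

minimum = (1.0, 0.0)
for direction in [(-1.0, 0.0), (0.0, 1.0), (1.0, 0.0)]:
    print(direction, basin_extent(minimum, direction))
```

Repeating the probe over many random directions in a high-dimensional weight space would give a rough picture of how a basin of attraction extends around a minimum.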
Willamette University (US)
Mentor: Jack Shaw
Human activities have large-scale repercussions on ecosystem function, as evidenced by the recent increase in species extinction rates and the decrease in biodiversity worldwide. Food webs illustrate the connections and interactions between species on different trophic levels. As such, the structure of food webs allows for the evaluation of emergent patterns that result from such alterations to entire ecosystems. While previous studies have considered the impacts of extinction on community structure, the impacts of rarity (the relatively sudden removal of a large number of individuals of a particular species) are understudied. In “Rarity in mass extinctions and the future of ecosystems,” Hull et al. claim that mass rarity (as opposed to mass extinction) more accurately reflects function in ecological networks. Hence, greater comprehension of the impacts of rarity as compared to those of alternate types of species extinction or reduction may have implications for the drivers of mass disappearance of fossils in deep time. An understanding of the processes that led to mass disappearance may then inform predictions for the fate of current species as anthropogenic factors continue to impact ecosystems. This project aims to simulate and evaluate the impact of rarity on the evolution of model food webs. As a starting point, the niche model food web is evaluated after being perturbed by random extinction. The stability of the model before and after random extinction is measured using modularity, which gives the degree to which a network is clustered. Current findings show that modularity is effectively zero for all of the networks generated by the model, regardless of any changes to connectance, species richness, or the magnitude of random extinction imposed on the model. We believe this is due to faults in the construction of the model, which will be resolved in future work on the project.
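The starting point described above, a niche model food web perturbed by random extinction, can be sketched as follows. This is a simplified version of the niche model of Williams and Martinez: it assumes a target connectance below 0.5 and omits refinements such as forcing the lowest-niche species to be basal. It does not compute modularity, and it is not the project's implementation. All function names are hypothetical.

```python
import random

def niche_model(num_species, connectance, rng):
    """Return a set of (consumer, resource) links from a simplified niche model."""
    beta = 1.0 / (2.0 * connectance) - 1.0  # requires connectance < 0.5
    niches = sorted(rng.random() for _ in range(num_species))
    links = set()
    for i, n_i in enumerate(niches):
        r_i = n_i * rng.betavariate(1.0, beta)   # feeding range width
        c_i = rng.uniform(r_i / 2.0, n_i)        # feeding range center
        low, high = c_i - r_i / 2.0, c_i + r_i / 2.0
        for j, n_j in enumerate(niches):
            if low <= n_j <= high:
                links.add((i, j))                # species i eats species j
    return links

def random_extinction(num_species, links, fraction, rng):
    """Remove a random fraction of species and every link that touches them."""
    removed = set(rng.sample(range(num_species), int(fraction * num_species)))
    survivors = [s for s in range(num_species) if s not in removed]
    kept = {(i, j) for i, j in links if i not in removed and j not in removed}
    return survivors, kept

def connectance(num_species, links):
    """Directed connectance: realized links over all possible links."""
    return len(links) / num_species ** 2

rng = random.Random(0)
web = niche_model(40, 0.1, rng)
survivors, remaining = random_extinction(40, web, 0.25, rng)
print(connectance(40, web), connectance(len(survivors), remaining))
```

A structural measure such as modularity could then be computed on the link sets before and after the perturbation and compared across levels of extinction.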
Minerva University (US)
Mentor: Melanie Mitchell
Coevolutionary learning with the genetic algorithm is an evolutionarily inspired computational method that seeks to improve the performance of individuals in a population by putting both the host population (individuals) and the parasite population (training dataset) in constant competition (coevolution). The literature has shown that structured spatial configurations of the competition have a favorable influence on the algorithm’s training trajectory. Mitchell hypothesized that coevolution of neighboring sets of algorithms preserves higher diversity compared to training the whole population across the whole space, which would homogenize the fitness of the population. This project aims to test this hypothesis by using neural networks and the MNIST dataset to understand how spatial coevolution can outperform other training methods. Specifically, I plan to run a multi-fold experiment: (1) evolutionary training of neural networks in a non-spatial setting, (2) evolutionary training of neural networks in a spatial configuration, (3) coevolving neural networks and MNIST data in a spatial configuration, and (4) extending to GANs in a spatial configuration. Each step is a stepping stone for the next experiment, concretely testing how the addition of each feature contributes to the improvement of the neural network on the MNIST benchmark. Since the training is based on the random crossover of algorithmic features (i.e., weights and biases for neural networks), the model will provide a framework with salient coevolutionary features that can be applied to non-analytical optimizations. Furthermore, I hope to use the Maximum Entropy Theory of Ecology to analyze how evolutionary arms races and preservation of diversity improve the overall fitness of a species across systems.
This program was supported, in part, by the National Science Foundation Research Experiences for Undergraduates (REU) program, NSF grant number 1757923 (PI Cristopher Moore). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
Additional support for the UCR program came from SFI's generous donors, including R. Martin Chavez, the Albuquerque Community Foundation: Kimmersteinling Fund, and Eugene & Jean Stark.