Syeda Abeera Amir, Minerva University (US)
Mentors: Maell Cullen, Chris Kempes
Artificial Neural Networks (ANNs) have revolutionized the field of machine learning, yet their capacity for inferring causality remains a significant limitation. This limitation is critical in applying ANNs to fields where causal relationships are essential, such as medicine, economics, and legal systems. Although substantial progress has been made in developing more robust machine learning models, the inherent structure and functioning of ANNs, which capture correlation rather than causation, present a notable challenge. Spiking Neural Networks (SNNs), a class of models that mimic the behavior of biological neurons more closely than traditional ANNs do, have been proposed as a potential solution to this problem.
In this study, we explored the use of Spiking Neural Networks (SNNs), specifically the Leaky Integrate-and-Fire (LIF) neuron model, for modeling causal reasoning. We used the Abstract Causal Reasoning (ACRE) dataset, whose tasks are inspired by the blicket detector experiment. The dataset consists of 10,000 trials totaling 100,000 images; each trial is a sequence of 10 images. The model's goal is to infer which object in the images causes the blicket detector to activate. To test this, we created a multi-level machine learning pipeline consisting of a Spiking Convolutional Neural Network (SCNN) to process the images and a Spiking Recurrent Neural Network (SRNN) to evaluate the causal relationships.
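For intuition, here is a minimal sketch of the discrete-time LIF dynamics underlying such a pipeline; the decay factor, threshold, and input currents are illustrative placeholders, not the study's trained parameters.

```python
import numpy as np

# Minimal discrete-time Leaky Integrate-and-Fire neuron. At each step
# the membrane potential decays by `beta`, integrates the input
# current, and emits a spike (then resets) when it crosses `threshold`.
def lif(currents, beta=0.9, threshold=1.0):
    v, spikes, potentials = 0.0, [], []
    for i in currents:
        v = beta * v + i
        spike = v >= threshold
        if spike:
            v = 0.0  # hard reset after spiking
        spikes.append(int(spike))
        potentials.append(v)
    return np.array(spikes), np.array(potentials)

spikes, _ = lif(np.full(20, 0.3))
print(spikes)  # regular spiking once integration outpaces the leak
```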
This research could potentially enhance our understanding of the limitations of machine learning models in inferring causality. Furthermore, by examining the spike patterns in SNNs, we can identify the information-processing correlates that allow the model to make certain predictions.
Andrew Geyko, University of New Mexico (US)
Mentors: Melanie Mitchell, Arseny Moskvichev
Capturing abstract conceptual reasoning is an unsolved problem in artificial intelligence. Recently, large language models (LLMs) have caused a stir in the field with their ability to solve problems requiring reasoning by generating text. We seek to find out to what extent LLMs have abstract reasoning capabilities and generative problem-solving skills. We develop a new test suite, 1D ConceptARC, and evaluate the performance of ChatGPT on it. We find that ChatGPT's performance on our test suite is greatly improved relative to its performance on the original ConceptARC tasks, and we elucidate the reasoning capabilities of the system. Furthermore, we introduce our own artificial intelligence solver for 1D ConceptARC and compare its performance to that of ChatGPT. Using this comparison, we highlight fundamental roadblocks preventing the development of generalized artificial intelligence and discuss future steps in creating such systems.
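To make the evaluation setup concrete, the sketch below shows one plausible way a one-dimensional ARC-style task could be rendered as a few-shot text prompt; the sequence encoding, the toy rule, and the example grids are hypothetical illustrations, not the actual 1D ConceptARC format.

```python
# Hypothetical 1D ConceptARC-style prompt builder. The digit encoding
# of colors and the sample task are illustrative only.
def build_prompt(train_pairs, test_input):
    lines = ["Each task maps an input sequence to an output sequence."]
    for x, y in train_pairs:
        lines.append(f"Input: {' '.join(map(str, x))}")
        lines.append(f"Output: {' '.join(map(str, y))}")
    lines.append(f"Input: {' '.join(map(str, test_input))}")
    lines.append("Output:")
    return "\n".join(lines)

# Toy rule: move the single colored block one cell to the right.
train = [([0, 3, 3, 0, 0], [0, 0, 3, 3, 0]),
         ([4, 4, 0, 0, 0], [0, 4, 4, 0, 0])]
print(build_prompt(train, [0, 0, 5, 5, 0]))
```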
Nathan Hasegawa, Harvey Mudd College (US)
Mentors: James Holehouse, Sid Redner
Suppose that people gradually come into a party and join conversations. How large will conversations get over time, and what will the distribution of conversation sizes look like? We examine this using the island growth model, a stochastic process where clusters of mass 1 (monomers) are added over time and combine at random with clusters of mass k to form clusters of mass k + 1. We expand on previous work on this model by introducing preferential attachment, a “rich get richer” effect where more massive clusters grow faster than less massive ones. This accounts for the fact that people are, in theory, more likely to join larger conversations than smaller ones. For an arbitrary amount of preferential attachment, we use Monte Carlo simulations to predict the number of monomers over time, and show that we can use the number of monomers over time to approximate the number of clusters of any arbitrary mass. We also find an analytical solution for the distribution of cluster sizes at large time for a particular degree of preferential attachment, and an approximate solution for any arbitrary degree of preferential attachment. These results allow for insightful predictions about conversation sizes at parties and could be applied to other social systems with preferential attachment, such as the growth of new academic fields.
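As an illustration of the simulation approach, here is a minimal Monte Carlo sketch of monomer-driven growth with a tunable preferential-attachment exponent `a`; the event rates, the handling of monomer-monomer merging, and all parameter values are simplifying assumptions rather than the exact model analyzed in this work.

```python
import numpy as np

# Toy Monte Carlo for island growth with preferential attachment:
# monomers (mass-1 clusters) enter at rate `inject`, and a monomer
# attaches to another cluster of mass k at a rate proportional to k**a.
def island_growth(n_steps, a=1.0, inject=1.0, merge=0.01, seed=0):
    rng = np.random.default_rng(seed)
    clusters = [1, 1]
    for _ in range(n_steps):
        masses = np.array(clusters, dtype=float)
        n_mono = int((masses == 1).sum())
        r_inject = inject
        r_merge = (merge * n_mono * (masses ** a).sum()
                   if n_mono and len(clusters) > 1 else 0.0)
        if rng.random() < r_inject / (r_inject + r_merge):
            clusters.append(1)                  # a new monomer arrives
        else:
            clusters.remove(1)                  # one monomer is consumed...
            w = np.array(clusters, dtype=float) ** a
            i = rng.choice(len(clusters), p=w / w.sum())
            clusters[i] += 1                    # ...and joins a cluster: k -> k+1
    return clusters

sizes = island_growth(50_000)
print(np.bincount(sizes))  # cluster-size distribution at large time
```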
Shloka Janapaty, Columbia University (US)
Mentors: Mingzhen Lu, Chris Kempes
Biocrusts, terrestrial communities of cyanobacteria, lichen, and bryophytes, extensively cover global land surfaces and play a crucial regulatory role in atmospheric nitrogen fixation. Despite their importance, global variation in biocrust biogeochemical flux is not well understood. In particular, current spatiotemporal models of nitrogen flux lack a metabolic picture of biocrust community dynamics in stochastic environments. Recent advances in population-level scaling and new, updated biocrust flux datasets can fill this gap. In this work, we propose a model of biocrust growth and death with competitive constraints, incorporating competition on a background of fluctuating resources and stochastic disturbance. We suggest that biomass and functional group diversity constrain variation in nitrogen flux across productive zones.
First, we develop a reaction-diffusion PDE of biocrust growth and mortality with three resources (N, Fe, and H2O) and four functional groups (non-diazotrophic cyanobacteria, diazotrophic cyanobacteria, lichen, and bryophytes). Functional groups were aggregated into distinct maturity classes. New biocrusts enter the community through spore dispersal. Non-diazotrophic cyanobacteria deposit iron, facilitating the emergence of nitrogen-fixing cyanobacteria, while lichen and bryophytes appear in response to water availability. We also derive first-principles rules for the relationship between functional groups and competition, disturbance, allometry, and resource consumption, which exert probabilistic dependencies on class mortality rates. Our analysis successfully predicts both biocrust spatial aggregation and ecological succession patterns and suggests constraints on nitrogen flux. This extends ecological insight into biogeochemical feedback loops in terrestrial ecosystems.
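For concreteness, here is a minimal one-dimensional sketch of the reaction-diffusion structure, reduced to a single functional group B and a single resource W (water) with periodic boundaries; all rate forms and parameter values are illustrative assumptions, not those of the full four-group, three-resource model.

```python
import numpy as np

# Explicit-Euler step for a toy 1-D reaction-diffusion system:
# biocrust biomass B grows on water W and dies at a constant rate,
# while both fields diffuse. Periodic boundaries via np.roll.
def step(B, W, dx=1.0, dt=0.01, D_B=0.05, D_W=0.5,
         growth=0.8, mortality=0.1, uptake=0.6, rain=0.2):
    lap = lambda u: (np.roll(u, 1) + np.roll(u, -1) - 2.0 * u) / dx**2
    dB = D_B * lap(B) + growth * W * B - mortality * B
    dW = D_W * lap(W) + rain - uptake * W * B
    return B + dt * dB, W + dt * dW

B = 0.1 + 0.01 * np.random.default_rng(0).random(200)  # noisy initial biomass
W = np.full(200, 0.5)                                   # uniform initial water
for _ in range(10_000):
    B, W = step(B, W)
```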
Carina Kane, University of Chicago (US)
Mentors: David Krakauer, Simon DeDeo, Maell Cullen
Large language models are challenging our understanding of the mechanisms required, in humans and computers, to interpret and generate conceptually complex information. This kind of information is highly context-dependent and deeply rooted in our systems of knowledge and understanding. Generative Pre-trained Transformers (GPTs) have an astounding capacity to produce not just coherent language, but language that is highly abstract and nuanced in its meaning. For these deep concepts to be captured, they must be represented somewhere in the model, albeit entirely numerically. We ask how and where abstract concept representations emerge in transformer models, and whether these representations share attributes with those of humans. We probe the hierarchical layers of the transformer to analyze which layers are most sensitive to which kinds of concepts. Understanding the layers of the transformer as a hierarchy of abstract transformations of word embeddings, we can elucidate the process by which abstract concepts are encoded and decoded in the model by analyzing their numerical representations.
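A toy sketch of one common layer-probing recipe, assuming a small open model (GPT-2 here) as a stand-in and entirely illustrative texts and labels; the actual models, concepts, and probe design in the study may differ.

```python
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

# Embed texts, keep one mean-pooled vector per layer, then fit a
# linear probe per layer to see where a concept label becomes decodable.
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True)

def layer_embeddings(text):
    inputs = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    # hidden_states = embedding layer + one tensor per transformer block
    return [h.mean(dim=1).squeeze(0).numpy() for h in out.hidden_states]

texts = ["justice delayed is justice denied", "liberty is a moral concept",
         "the cat sat on the mat", "rain fell on the roof"]
labels = [1, 1, 0, 0]  # toy labels: abstract vs. concrete

per_layer = list(zip(*[layer_embeddings(t) for t in texts]))
for i, feats in enumerate(per_layer):
    probe = LogisticRegression(max_iter=1000).fit(feats, labels)
    print(f"layer {i}: train accuracy = {probe.score(feats, labels):.2f}")
```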
Ky-Vinh Mai, University of California, Irvine (US)
Mentors: Melanie Mitchell, Arseny Moskvichev, Maell Cullen
With the explosion in popularity of Large Language Models (LLMs), artificial intelligence has come closer to mimicking human intelligence. However, recent scrutiny of LLMs' capabilities has revealed a divergence between formal competence, the ability to produce and comprehend language, and functional competence, the other cognitive abilities not tied to the language faculty. Despite their incredible formal competence, LLMs' functional competence, particularly on logic problems such as the Concept Abstraction and Reasoning Corpus (ConceptARC), falls dramatically short of human performance. Results show that GPT-4 succeeded on only about a third of the problems, compared to near-complete accuracy for humans.
In this project, we investigate whether GPT-3 models can perform better on ConceptARC through the applied use of language. More specifically, can adding verbal instructions to each problem improve model performance? Evidence that different applications of language can change model behavior is demonstrated by in-context learning methods such as chain-of-thought and tree-of-thought prompting, both of which have improved model reasoning.
Through fine-tuning, a supervised learning technique, we developed three methods, each applying a different variation of language injection. In the default format, the prompt consists of the task demonstrations and the test input, and the intended output contains only the test output. The experiment proceeds as follows (a data-format sketch appears after the list):
(i) Method 1 - Applied Instruction: The instruction is included at the beginning of the prompt. Here we hope the model can understand the instructions and apply them to the tasks.
(ii) Method 2 - Instruction Generation: The model instead generates the instructions as part of the intended output. Perhaps this will induce it to reason about the problems more inductively.
(iii) Method 3 - Verbal Inference Multi-Model: A combination of the previous methods using a two-model approach. A first model looks at the task demonstrations and generates an explanation, which is passed to a second model that applies it to the test input.
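To make the formats concrete, the sketch below builds legacy prompt/completion fine-tuning records for Methods 1 and 2 (Method 3 chains two such models); the sequence encoding, field layout, and toy rule are hypothetical illustrations, not our exact schema.

```python
import json

# Hypothetical fine-tuning records in the legacy prompt/completion
# JSONL style. Tasks are rendered as IN/OUT sequence pairs.
def make_example(demos, test_in, test_out, instruction, method):
    demo_str = "\n".join(f"IN: {i} OUT: {o}" for i, o in demos)
    base = f"{demo_str}\nIN: {test_in} OUT:"
    if method == 1:  # applied instruction: prepend to the prompt
        return {"prompt": f"{instruction}\n{base}",
                "completion": f" {test_out}"}
    if method == 2:  # instruction generation: model must produce it
        return {"prompt": base,
                "completion": f" {instruction} -> {test_out}"}
    raise ValueError("Method 3 chains two models and needs two records")

demos = [("0120", "0220"), ("0103", "0203")]  # toy rule: replace 1 with 2
print(json.dumps(make_example(demos, "0100", "0200",
                              "replace 1 with 2", method=1)))
```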
Anish Pandya, University of Texas, Austin (US)
Mentors: James Holehouse, Chris Kempes
Cells grow and divide. Their growth reflects an irreversible thermodynamic process, and this macroscopic asymmetry gives rise to properties necessary for the functions of life. In this project, our goal is to understand how cells maintain the macroscopic functions characteristic of life from microscopic reaction networks. We take the case of concentration homeostasis: how cells regulate an internal chemical environment conducive to life. The internal protein concentration can be thought of as governed by two quantities: protein number and cell volume. We therefore look for volume dependence in a minimal model of protein production and degradation. Single-cell experimental (mother machine) data give temporal resolution amenable to analytical methods. Using this model, we derive expressions for the mean and variance of time-dependent protein metabolism. Next, we use a modified stochastic simulation algorithm to simulate data from mother machine experiments in E. coli, where the parameters are pre-determined. Using maximum-likelihood estimation, we hope to recover the parameters from the synthetic data produced by the simulation to verify the utility of the inference procedure. Finally, we plan to apply the inference method to experimental mother machine data to determine whether stochastic gene expression in growing E. coli is volume-dependent. In doing so, we hope to relate the microscopic mechanisms of protein production to the macroscopic emergence of concentration homeostasis.
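As a sketch of the simulation step, the code below runs a birth-death process for protein number with a production rate that scales with an exponentially growing cell volume. It uses a quasi-static Gillespie update (rates treated as constant between events) and omits division for brevity; all rate forms and parameter values are illustrative assumptions, not the study's modified algorithm.

```python
import numpy as np

# Toy volume-dependent protein birth-death process, assuming
# exponential growth V(t) = V0 * exp(g * t) between divisions.
def gillespie(t_end, k_prod=10.0, k_deg=0.1, g=0.02, V0=1.0, seed=0):
    rng = np.random.default_rng(seed)
    t, n = 0.0, 0
    ts, ns = [0.0], [0]
    while t < t_end:
        V = V0 * np.exp(g * t)
        rates = np.array([k_prod * V, k_deg * n])  # production scales with volume
        total = rates.sum()
        t += rng.exponential(1.0 / total)          # waiting time to next event
        if rng.random() < rates[0] / total:
            n += 1                                 # production event
        else:
            n -= 1                                 # degradation event
        ts.append(t); ns.append(n)
    return np.array(ts), np.array(ns)

ts, ns = gillespie(50.0)
print(f"final protein count after t = {ts[-1]:.1f}: {ns[-1]}")
```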
Juniper Rodriguez, Purdue University (US)
Mentor: Mirta Galesic
This study delves into the dynamics of belief evolution within online political discourse. The digital sphere serves as fertile ground for political discourse to flourish, often confined to spaces whose members share political perspectives. Within these spaces, the views expressed are often extreme and divisive, posing a potential threat to constructive and nonpartisan discourse. Though both models of belief dynamics and analyses of online political discourse exist in the literature, no significant efforts have been made to synthesize these frameworks. For this project we examined the comment sections of five prominent U.S. news websites spanning the political spectrum: Mother Jones, The Atlantic, The Hill, Breitbart, and Gateway Pundit.
Utilizing BERTopic modeling, we extracted 200-300 topics representative of the discourse on each platform. We employed c-TF-IDF vectors to build networks of the topics on each platform in any given month. More specifically, we wanted to observe the behavior of the three most prolific, well-formed topics across platforms: abortion, climate, and vaccines. Before analyzing belief formation and propagation within these platforms, we first wanted to examine how these networks of political discourse themselves behave, on both micro and macro scales. The research is rooted in three central inquiries:
Cross-Platform Discourse on Core Topics: To answer this question, we created ego networks for the topics of interest, examined which topics were most closely related to them over time, and compared these relationships across platforms using a typology of political discourse configurations.
Platform-Specific Metanarratives: By metanarratives, we refer to the overarching narratives created within digital spaces that provide a pattern or structure for people's beliefs. We explored this by decomposing the network of all topics into communities using the Louvain method; latent themes then emerged within each community of the network (a sketch of this network-and-community step appears after the list).
Core Topics as Embedded within Metanarratives: Finally, we synthesize these results by examining where each topic of interest is embedded within this network over time.
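A minimal sketch of the network-and-community step, using random stand-ins for the per-topic c-TF-IDF vectors; the matrix dimensions and similarity threshold are illustrative placeholders, not the study's settings.

```python
import numpy as np
import networkx as nx
from sklearn.metrics.pairwise import cosine_similarity

# Given per-topic c-TF-IDF vectors (random stand-ins here), build a
# topic-similarity graph and extract Louvain communities as candidate
# metanarratives.
rng = np.random.default_rng(0)
ctfidf = rng.random((30, 200))      # 30 topics x 200 vocabulary terms
sim = cosine_similarity(ctfidf)

G = nx.Graph()
n = sim.shape[0]
for i in range(n):
    for j in range(i + 1, n):
        if sim[i, j] > 0.8:         # keep only strongly similar topic pairs
            G.add_edge(i, j, weight=float(sim[i, j]))

communities = nx.community.louvain_communities(G, weight="weight", seed=0)
print(f"{len(communities)} communities among {G.number_of_nodes()} linked topics")
```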
This program was supported in part by:
(1) The Research Experiences for Undergraduates (REU) supplement to the National Science Foundation (NSF) grant "EAGER: Developing data and evaluation methods to assess the generality and robustness of AI systems for abstraction and analogy-making" (PI Melanie Mitchell; award 2139983)
(2) NSF grant "BIGDATA: Collaborative Research: Mining for Patterns in Graphs and High-Dimensional Data: Achieving the Limits" (PI Cristopher Moore; award 1838251)
(3) NSF grant "The Role of Individual and Social Networks in the Formation and Change of Beliefs" (PI Henrik Olsson; award 1918490)
(4) The Emergent Political Economies program as part of a grant from the Omidyar Network (PI David Krakauer).
(5) Additional support for the UCR program came from SFI's generous donors, including R. Martin Chavez, the Albuquerque Community Foundation: Kimmersteinling Fund, and Eugene & Jean Stark.
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.