What algorithms can’t tell us about community detection

“Illusory” communities: two equally good ways to divide a network into communities that have nothing to do with each other.

July 27, 2017

Many who study networks care about groups of interconnected nodes. These groups, called “communities” or “modules,” represent real-world relationships like friend groups on Facebook, businesses in a supply chain, and even species within a food web. The challenge is to identify whether, and ultimately where, these structures exist within a mass of data.

In a recent paper, Jess Banks, a Ph.D. candidate in mathematics at UC Berkeley and a former Santa Fe Institute undergraduate intern, Robert Kleinberg, Associate Professor of Computer Science at Cornell, and SFI Professor Cristopher Moore set out to test under what conditions a computer algorithm can verify the absence of community structure in a network. Without an algorithm that can do this, network scientists can’t tell whether the communities they find are statistically significant — that is, they can’t tell real communities from fake ones.

Banks posed the research as a thought experiment: “If I generate a random network with no community structure ‘baked in,’ will it have communities by chance? If not, can an algorithm certify that it doesn’t?”

After generating random networks with no real community structure, the researchers put one particular algorithm to the test — the simplest algorithm in a popular class called “the Sum of Squares hierarchy.” They decided to investigate the algorithm’s ability to verify the absence of dissasortative community structures, which, like competitive businesses, are characterized by a lack of connections with each other. In computer science, this corresponds to the classic Graph Coloring problem, where nodes connected by an edge are required to have different colors.

By studying the behavior of this algorithm, the researchers uncovered a blind spot. If a network is too sparse, with too few connections, the algorithm cannot tell whether or not it has communities. Using some clever mathematics, they proved that the algorithm can be fooled into thinking that communities exist even when they don’t.

“If we care about doing good science, and honestly testing our hypotheses, then verifying the absence of structure in the data is just as important as being able to find it when it is there,” Banks says.

“We’re all looking for patterns in data,” Moore adds. “But just like humans, our algorithms often find patterns that aren’t really there. We need to understand the fundamental limits on our ability to tell whether patterns truly exist, so we’ll know when we need more and better data before we can draw any conclusions.”

Going forward, the researchers’ method could be used to test other, more powerful algorithms in the same hierarchy.

Read the paper, "The Lovasz Theta Function for Random Regular Graphs and Community Detection in the Hard Regime," on the arXiv server. (May 2, 2017)

More SFI News

View All News

What algorithms can’t tell us about community detection

July 27, 2017

Share

News Media Contact

Santa Fe Institute

Tags

Related Projects

More SFI News

Mirta Galesic receives prestigious ERC Advanced Grant

Carlo Rovelli receives 2024 Lewis Thomas Prize

Research News Brief: Defining a city using cell-phone data

Complexity tools for USDA nutritional guidelines

Quantifying the potential value of data

Carlo Rovelli joins SFI's Fractal Faculty

New book offers thoughtful approach to modeling complex social systems

Research News Brief: A test of AI “personalities” and behavior

Study: To make sense of history, embrace uncertainty

Study: Predicting steps in a random process

Embodied intelligence & a sense of self

How to track important changes in a dynamic network

African and South Asian students build new connections during inaugural Complexity Global School

New gifts support SFI Education and Postdoctoral programs

The cultural evolution of collective property rights

Applications for Complexity Global School are now open

Life as a planetary regulator: an experimental test

Sam Bowles named IEA Fellow

Modeling the UN’s biodiversity goals

How a city is organized can create less-biased citizens