Ramón Ferrer, Reinhard Köhler, Ricard Solé

Paper #: 03-06-042

Many languages are spoken on Earth. Despite their diversity, many robust language universals are known to exist. All languages share syntax, i.e. the ability to combine words to form sentences. The origins of this trait are still debated. Most linguistic universals are defined in a way that strictly confines them to a linguistic context. This is not the case for the previously unreported potential syntactic universals presented here. Using recent developments from the statistical physics of complex networks, we show that different syntactic dependency networks (from Czech, German, and Romanian) share many non-trivial statistical patterns, such as the small-world phenomenon, scaling in the degree distribution, and disassortative mixing. These features of syntactic organization are not a trivial consequence of the structure of sentences, but an emergent trait at the global scale. Our results strongly suggest that existing languages might belong to the same universality class, in the sense in which that term is defined in physics.
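The three network measures named in the abstract can be illustrated on a toy example. The sketch below is not the authors' method or data: the word pairs in `TOY_DEPENDENCIES` are hypothetical, the function names are invented for illustration, and the graph is treated as undirected for simplicity. It computes the mean clustering coefficient and average shortest path length (the two quantities usually compared against a random graph when testing for the small-world phenomenon) and the Pearson degree correlation across links, whose negative values indicate disassortative mixing.

```python
from collections import deque
from itertools import combinations

# Hypothetical (head, dependent) word pairs; NOT taken from the paper's corpora.
TOY_DEPENDENCIES = [
    ("sat", "cat"), ("cat", "the"), ("sat", "on"), ("on", "mat"),
    ("mat", "the"), ("ran", "dog"), ("dog", "the"),
    ("sang", "bird"), ("bird", "the"),
]


def build_graph(pairs):
    """Build an undirected adjacency map: word -> set of linked words."""
    graph = {}
    for head, dependent in pairs:
        graph.setdefault(head, set()).add(dependent)
        graph.setdefault(dependent, set()).add(head)
    return graph


def clustering_coefficient(graph):
    """Mean fraction of a node's neighbour pairs that are themselves linked."""
    total = 0.0
    for neighbours in graph.values():
        k = len(neighbours)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        links = sum(1 for a, b in combinations(neighbours, 2) if b in graph[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(graph)


def average_path_length(graph):
    """Mean shortest-path length over all reachable ordered node pairs (BFS)."""
    total, pair_count = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pair_count += len(dist) - 1
    return total / pair_count


def degree_assortativity(graph):
    """Pearson correlation of the degrees found at the two ends of each link."""
    xs, ys = [], []
    for u, neighbours in graph.items():
        for v in neighbours:  # each link is visited once per direction
            xs.append(len(graph[u]))
            ys.append(len(graph[v]))
    n = len(xs)
    mean = sum(xs) / n  # xs and ys share the same mean by symmetry
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    if var == 0:
        raise ValueError("all nodes have the same degree; correlation undefined")
    return cov / var


if __name__ == "__main__":
    g = build_graph(TOY_DEPENDENCIES)
    print("degrees:", {word: len(nbrs) for word, nbrs in g.items()})
    print("clustering:", clustering_coefficient(g))
    print("average path length:", average_path_length(g))
    print("degree assortativity:", degree_assortativity(g))
```

In this toy graph the hub word "the" links mostly to low-degree words, so the degree correlation comes out negative. A graph this small says nothing about small-world structure or scaling; those claims rest on the corpus-scale networks analysed in the paper.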
