Starostin, George; N. Ezgi Altinsik; Mikhail Zhivlov; Piya Changmai; Olga Flegontova; Sergey A. Spirin; Andrei Zavgorodnii; Pavel Flegontov and Alexei S. Kassian

Relationships between universally recognized language families represent a hotly debated topic in historical linguistics, and the same is true for the correlation between signals of genetic and linguistic relatedness. We developed a weighted permutation test and applied it to basic vocabularies for 31 pairs of languages and reconstructed proto-languages to show that three groups of circumpolar language families in the Northern Hemisphere show evidence of relationship through either borrowing in the basic vocabulary or common descent: [Chukotko-Kamchatkan and Nivkh]; [Yukaghir and Samoyedic]; [Yeniseian, Na-Dene, and Burushaski]. The first two pairs showed the most significant signals of language relationship, and the same pairs demonstrated parallel signals of genetic relationship, implying common descent or substantial gene flow. To detect the genetic signals, we used genome-wide genetic data for present-day groups and a bootstrapping model comparison approach for admixture graphs or, alternatively, haplotype sharing statistics. Our findings further support some hypotheses of long-distance language relationship that were put forward on the basis of linguistic methods but lack universal acceptance.

Significance statement: Indigenous people inhabiting polar and sub-polar regions in the Northern Hemisphere speak diverse languages belonging to at least seven language families that are traditionally thought of as unrelated entities. We developed a weighted permutation test and applied it to basic vocabularies of a number of languages and reconstructed proto-languages to show that at least three groups of circumpolar language families show evidence of relationship through either borrowing in the basic vocabulary or common descent: Chukotko-Kamchatkan and Nivkh; Yukaghir and Samoyedic; Yeniseian, Na-Dene, and Burushaski. The first two pairs showed the most significant signals of language relationship, and the same pairs demonstrated parallel signals of genetic relationship, implying common descent or substantial gene flow.