Berkemer, Sarah J.; Lisa-Katharina Maier; Fabian Amman; Stephan H. Bernhart; Julia Woerta; Pascal Maerkle; Friedhelm Pfeiffer; Peter F. Stadler and Anita Marchfelder

Archaeal genomes are densely packed; thus, correct transcription termination is an important factor for orchestrated gene expression. A systematic analysis of RNA 3′ termini to identify transcription termination sites (TTS) using RNAseq data has hitherto been performed in only two archaea, Methanosarcina mazei and Sulfolobus acidocaldarius. In both of those studies, only regions directly downstream of annotated genes were analysed, and thus only part of the genome was investigated. Here, we developed a novel algorithm (Internal Enrichment-Peak Calling) that allows an unbiased, genome-wide identification of RNA 3′ termini independent of annotation. In an RNA fraction enriched for primary transcripts by terminator exonuclease (TEX) treatment, we identified 1,543 RNA 3′ termini. Approximately half of these were located in intergenic regions, and the remainder were found in coding regions. A strong sequence signature consistent with known termination events at intergenic loci indicates a clear enrichment for native TTS among them. Using these data, we determined distinct putative termination motifs for intergenic regions (a T stretch) and coding regions (AGATC). In vivo reporter gene tests of selected TTS that exemplify the different motifs confirmed termination at these sites. For several genes, more than one termination site was detected, resulting in transcripts with different lengths of the 3′ untranslated region (3′ UTR).
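
As an illustration only, the two putative motif classes named above (a T stretch and AGATC) could be looked for in the sequence immediately upstream of a detected 3′ terminus roughly as follows. This is a minimal Python sketch and not the Internal Enrichment-Peak Calling algorithm. The function name `classify_terminus`, the minimum run length of four consecutive T residues, and the order in which the motifs are checked are all assumptions made for this example and are not taken from the study.

```python
import re

# Assumption: a "T stretch" is modelled here as four or more consecutive T's.
# The threshold of four is illustrative, not a value taken from the study.
T_STRETCH = re.compile(r"T{4,}")

# Motif reported for 3' termini located in coding regions.
CODING_MOTIF = "AGATC"


def classify_terminus(upstream_seq: str) -> str:
    """Label the sequence upstream of a 3' terminus by the motif it contains.

    Accepts DNA or RNA letters in any case; U is treated as T.
    Returns "T-stretch", "AGATC", or "none". The T stretch is checked
    first, which is an arbitrary choice for this sketch.
    """
    seq = upstream_seq.upper().replace("U", "T")
    if T_STRETCH.search(seq):
        return "T-stretch"
    if CODING_MOTIF in seq:
        return "AGATC"
    return "none"


if __name__ == "__main__":
    # Made-up example sequences, not data from the study.
    for example in ("GCGCTTTTTTGC", "GGAGATCGG", "GCGCGCGC"):
        print(example, classify_terminus(example))
```

A simple substring and regular-expression scan like this only flags the presence of a motif; it says nothing about enrichment, which would require comparison against background sequence.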