Yazbeck, A. M., Tout, K. R., Stadler, P. F., Hertel, J.

miRBase currently reports more than 25,000 microRNAs in several hundred genomes, belonging to more than 1,000 families of homologous sequences. Quantitative investigations of miRNA gene evolution require the construction of data sets that are consistent in their coverage and include the genomes of interest in a given study. Given the size and structure of the data, this can be achieved only with the help of a fully automatic pipeline that improves the available seed alignments, extends the set of available sequences by homology search, and reliably identifies true positive homology search results. Here we describe the current progress towards such a system, emphasizing the task of improving and completing the initial seed alignment.