Databases
Database sources
leBIBI DB now uses the rDNA database of riboDB release 17.0.
rDNA sequences are extracted as follows:
From the genomes of Bacteria present in RefSeq, plus GenBank genomes whose species is absent from the RefSeq DB
From the genomes of Archaea present in RefSeq and GenBank
For 16S rDNA, it contains 138,286 genomes of Bacteria and 4,779 of Archaea, representing 22,781 species names.
In the common case of multiple operons, only one rDNA is retained, chosen on the basis of its centrality.
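The text does not define how centrality is measured. As an illustration only, the sketch below assumes it means the medoid: the rDNA copy with the smallest total edit distance to the other copies of the same genome. The function names `edit_distance` and `most_central` and the toy sequences are hypothetical, and the method actually used by riboDB may differ.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def most_central(copies: Sequence[str]) -> str:
    """Return the copy whose summed distance to all copies is smallest (ASSUMED definition)."""
    if not copies:
        raise ValueError("no rDNA copies given")
    totals = [sum(edit_distance(c, other) for other in copies) for c in copies]
    return copies[totals.index(min(totals))]


# Toy example: three short, made-up operon copies from one genome.
copies = ["ACGTACGT", "ACGTACGA", "ACGTTCGA"]
representative = most_central(copies)
```

With these toy sequences the middle copy is selected, because it differs from each of the other two at a single position.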
Taxonomy DB
EMBL-ENA taxonomy (XML format). A dictionary written in Julia is built from it and used to access the normalized taxonomy/nomenclature hierarchy.
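The idea of a taxonomy dictionary can be sketched as follows. This is an illustration in Python rather than the Julia code used by leBIBI. The `Taxon` record, the `lineage` function and the three-entry table are hypothetical, and the numeric identifiers are made up, not real taxonomy IDs. Each entry points to its parent, so the hierarchy is recovered by walking upward.

```python
from typing import Dict, List, NamedTuple, Tuple


class Taxon(NamedTuple):
    name: str
    rank: str
    parent: int  # identifier of the parent taxon; 0 means "no parent"


# Hypothetical, heavily truncated table; identifiers are illustrative only.
taxa: Dict[int, Taxon] = {
    1: Taxon("Bacteria", "domain", 0),
    10: Taxon("Escherichia", "genus", 1),
    100: Taxon("Escherichia coli", "species", 10),
}


def lineage(taxid: int, table: Dict[int, Taxon]) -> List[Tuple[str, str]]:
    """Return (rank, name) pairs from the root down to the given taxon."""
    path: List[Tuple[str, str]] = []
    seen = set()  # guards against accidental parent cycles
    while taxid in table and taxid not in seen:
        seen.add(taxid)
        taxon = table[taxid]
        path.append((taxon.rank, taxon.name))
        taxid = taxon.parent
    path.reverse()
    return path


ecoli_lineage = lineage(100, taxa)
```

An unknown identifier yields an empty list, which lets the caller decide how to report a missing taxon.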
Type Strains DB
The source of information on type-strain sequences is now RefSeq.
Available DB
A stringent DB
This is the default DB. It contains the NCBI reference genomes (R) and the type-strain genomes (T). If a species is missing, we try to find one or more genomes present in Ensembl Bacteria (E).
If no T/R/E genome is found, we select a genome according to assembly completeness (Complete is better than Scaffold, which is better than Unassembled) and the longest RNA. This is the TRECS_16SrRNA.fst DB, also called the prototype low-redundancy DB.
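The selection rule described above can be sketched per species as below. This is a minimal illustration, not the leBIBI code: the `Genome` record, its field names and the function `select_for_species` are hypothetical, and the way ties are broken is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Lower value = better assembly, following the order given in the text.
ASSEMBLY_RANK = {"Complete": 0, "Scaffold": 1, "Unassembled": 2}


@dataclass(frozen=True)
class Genome:
    accession: str    # hypothetical identifier
    tag: str          # "T", "R", "E", or "" when none applies
    assembly: str     # "Complete", "Scaffold" or "Unassembled"
    rrna_length: int  # length of the retained 16S rRNA sequence


def select_for_species(genomes: Iterable[Genome]) -> List[Genome]:
    """Apply the T/R, then E, then completeness/length fallback to one species."""
    pool = list(genomes)
    # 1. Reference (R) and type-strain (T) genomes are kept when available.
    preferred = [g for g in pool if g.tag in ("T", "R")]
    if preferred:
        return preferred
    # 2. Otherwise keep the genomes found in Ensembl Bacteria (E).
    ensembl = [g for g in pool if g.tag == "E"]
    if ensembl:
        return ensembl
    # 3. Otherwise keep a single genome: best assembly level, then longest RNA.
    if not pool:
        return []
    best = min(pool, key=lambda g: (ASSEMBLY_RANK[g.assembly], -g.rrna_length))
    return [best]


# Made-up candidates for one species with no T/R/E genome.
candidates = [
    Genome("g1", "", "Scaffold", 1540),
    Genome("g2", "", "Complete", 1500),
    Genome("g3", "", "Complete", 1520),
]
chosen = select_for_species(candidates)
```

Here `g3` is kept: it is a complete assembly, and its RNA is longer than that of the other complete genome.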