Databases
Database sources
leBIBI now uses the rDNA database of riboDB release 17.0.
rDNA sequences are extracted as follows:
From the genomes of Bacteria present in RefSeq, plus GenBank genomes of species absent from the RefSeq DB
From the genomes of Archaea present in RefSeq and GenBank
For 16S rDNA, the database covers 138,286 genomes of Bacteria and 4,779 of Archaea, representing 22,781 species names.
In the common case of multiple operons, only one rDNA is retained, chosen on the basis of its centrality.
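The page does not define "centrality", so the following is only a minimal sketch under one plausible reading: the retained copy is the medoid, i.e. the copy with the smallest summed distance to all copies in the genome. The function names, the Hamming distance, and the toy sequences are assumptions for illustration; a real pipeline would presumably use an alignment-based distance.

```python
def hamming(a: str, b: str) -> int:
    """Number of differing positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def most_central(copies: list[str]) -> str:
    """Return the copy with the smallest summed distance to all copies (the medoid).

    Assumption: this is one possible meaning of "centrality", not the
    documented riboDB rule.
    """
    if not copies:
        raise ValueError("no rDNA copies given")
    return min(copies, key=lambda s: sum(hamming(s, t) for t in copies))


# Toy aligned fragments standing in for the rDNA copies of one hypothetical genome.
copies = ["ACGTTCGTAC", "ACGTACGTAC", "ACGAACGTAC", "TCGTACGTAC"]
chosen = most_central(copies)
print(chosen)
```

Here the second copy differs from each of the others at a single position, while the others differ from one another at two positions, so it is the one selected.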
Taxonomy DB
EMBL-ENA taxonomy (XML format). A dictionary written in Julia is built from it and used to access the normalized taxonomy/nomenclature hierarchy.
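As a rough illustration of how such a dictionary can give access to a hierarchy, here is a sketch in Python (not the Julia code used by leBIBI). The identifiers, field names, and three-level toy lineage are made up for the example, and no attempt is made to reproduce the ENA XML schema.

```python
# Toy entries only: the identifiers are invented, and the real ENA XML
# has its own schema that is not reproduced here.
TAXA = {
    1: {"name": "Bacteria", "rank": "domain", "parent": None},
    2: {"name": "Escherichia", "rank": "genus", "parent": 1},
    3: {"name": "Escherichia coli", "rank": "species", "parent": 2},
}


def lineage(taxon_id, taxa=TAXA):
    """Walk parent links from a taxon up to the root.

    Returns (rank, name) pairs ordered from the root down to the taxon.
    """
    path = []
    current = taxon_id
    while current is not None:
        node = taxa[current]
        path.append((node["rank"], node["name"]))
        current = node["parent"]
    return list(reversed(path))


for rank, name in lineage(3):
    print(f"{rank}: {name}")
```

Storing only a parent pointer per taxon keeps the dictionary small, and any full lineage can be rebuilt on demand by following the links.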
Type Strains DB
The source of information for type-strain sequences is now RefSeq.
Available DB
Archaea and Bacteria 16S stringent DB
This is the default DB. It contains the NCBI reference genomes (R) plus the genomes of type strains (T). If a species is missing, we try to find one or more genomes in Ensembl Bacteria (E).
If no T/R/E genome is found, we select a genome according to assembly completeness (Complete is better than Scaffold, which is better than Unassembled) and the longest RNA. This is the TRECS_16SrRNA.fst DB, also called the prototype low-redundancy DB.
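The selection rules above can be sketched per species as follows. This is a hedged illustration, not the riboDB implementation: the `Genome` record, its field names, and the choice to rank assembly completeness before RNA length are assumptions; only the T/R/E categories and the Complete/Scaffold/Unassembled ordering come from the text.

```python
from dataclasses import dataclass

# Lower rank is better; the ordering follows the text.
ASSEMBLY_RANK = {"Complete": 0, "Scaffold": 1, "Unassembled": 2}


@dataclass(frozen=True)
class Genome:
    """Hypothetical record for one genome of a species."""
    accession: str
    category: str        # "T", "R", "E", or "" when none applies
    assembly_level: str  # a key of ASSEMBLY_RANK
    rrna_length: int     # length of the 16S rRNA sequence


def select_for_species(genomes):
    """Apply the T/R, then E, then best-remaining-genome fallback."""
    if not genomes:
        return []
    reference_or_type = [g for g in genomes if g.category in ("T", "R")]
    if reference_or_type:
        return reference_or_type
    ensembl = [g for g in genomes if g.category == "E"]
    if ensembl:
        return ensembl
    # Assumed tie-break: most complete assembly first, then the longest RNA.
    best = min(genomes, key=lambda g: (ASSEMBLY_RANK[g.assembly_level], -g.rrna_length))
    return [best]


candidates = [
    Genome("g1", "", "Scaffold", 1540),
    Genome("g2", "", "Complete", 1500),
    Genome("g3", "", "Complete", 1530),
]
picked = select_for_species(candidates)
print([g.accession for g in picked])
```

With no T, R, or E genome among the candidates, the two Complete assemblies outrank the Scaffold one, and the longer RNA decides between them.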
Archaea and Bacteria 16S relaxed DB
This is the "all named sequences" DB, available in the "expert" version. It corresponds to the complete riboDB release 17.0 (here named BiBi_16SrRNA.fst).
Bacteria DNA-directed RNA polymerase subunit beta and beta/beta' DB
This DB is built during the riboDB release 17.0 construction process and follows the stringency rules of the 16S stringent DB. Note that it also covers taxa in which the rpoB and rpoC genes are fused.