BioDataome: a collection of uniformly preprocessed and automatically annotated datasets for data-driven biology

    Recent News

    BioDataome Database Update 5/6/2020

    A total of 690 new datasets are now fully available in the BioDataome archive. These data were retrieved from GEO, processed with the latest BioDataome pipeline (see "Documentation"), and manually annotated by an expert biologist. The newly added molecular datasets are human microarray profiles acquired on three platforms: GPL570, GPL6244, and GPL96.

Welcome

BioDataome is a database of uniformly preprocessed and disease-annotated genomic and epigenomic data that aims to promote and accelerate the reuse of public data. Within each biological mart (microarray gene expression, RNA-Seq gene expression, DNA methylation), every dataset goes through the same preprocessing pipeline, which yields datasets ready for downstream analysis, and is automatically annotated with Disease Ontology terms. We also flag datasets that share samples and automatically identify control samples in case-control studies. BioDataome currently includes 6289 datasets and 296047 samples spanning 801 diseases, and is suited to large-scale experiments and meta-analyses. All datasets are publicly available for querying and download.
Technologies: 6
Homo sapiens: 274795 out of 296047 samples
Mus musculus: 21252 out of 296047 samples
GSE Species Entity Technology Type Samples Duplicates Disease ParentNode ChildNode Analyses Annotation Version Release Date
Download metadata
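
As a rough illustration of how the downloadable metadata might be queried offline, the sketch below filters dataset records by species, disease, and sample count. It assumes the metadata is exported as a CSV file whose header uses the column names GSE, Species, Samples, and Disease shown in the table above; the file format, the file name, and both function names are assumptions made for this example and are not part of BioDataome itself.

```python
import csv
from typing import Iterable, Iterator


def select_datasets(
    rows: Iterable[dict],
    species: str,
    disease_term: str,
    min_samples: int = 0,
) -> Iterator[dict]:
    """Yield rows matching a species, a disease substring, and a minimum sample count.

    Hypothetical helper: it only assumes each row has "Species", "Samples",
    and "Disease" keys, as in the metadata table header.
    """
    for row in rows:
        try:
            n_samples = int(row["Samples"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows without a usable sample count
        row_species = (row.get("Species") or "").strip().lower()
        row_disease = (row.get("Disease") or "").lower()
        if (
            row_species == species.strip().lower()
            and disease_term.lower() in row_disease
            and n_samples >= min_samples
        ):
            yield row


def load_metadata(path: str) -> list:
    """Read a metadata CSV (assumed format) into a list of dict rows."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


# Example usage, with a hypothetical local file name:
# rows = load_metadata("biodataome_metadata.csv")
# for row in select_datasets(rows, "Homo sapiens", "breast cancer", min_samples=20):
#     print(row["GSE"], row["Samples"], row["Disease"])
```

Matching the disease by case-insensitive substring keeps the sketch simple; a query that should also include child terms would need the ParentNode and ChildNode columns, whose exact format is not described here.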