[go: up one dir, main page]

Skip to content
@biominer-lab

BioMiner

BioMiner for Managing and Mining Cancer Associated Omics Data.

BioMiner Project

BioMiner is dedicated to building a data mining platform that integrates high-quality multi-omics data management, distribution and exploratory analysis.

For Users

Please access Metadata System and Indexd to get metadata and multi-omics data.

For Data Collaborators

Full Lifecycle Management of Metadata

Lifecycle

The DataHub repository is used to manage the metadata and specifications for the TCOA project.

|- .github                   -> Scripts for Continuous Integration and Delivery (Metadata QA & QC, Syncing the Metadata to the BioMiner System)
|- data                      -> Metadata Tables (Each sub directory (named with the project name) contains several metadata tables for every entity, please read the specifications for more details)
|    |- FUSCC_LUAD
|    |       |- project/
|    |       |- ...
|    |       |- README.md
|    |- FUSCC_TNBC
|    |- FUSCC_CBCGA
|- docs                      -> Specifications and How-To
|- README.md                 -> Quick Start
|- CHANGELOG                 -> Change Log
|- LICENSE                   -> License File

The SEQC DataHub repository is used to manage the metadata and specifications for the SEQC project.

|- .github/                          -> Scripts for Continuous Integration and Delivery (Metadata QA & QC, Syncing the Metadata to the BioMiner System)
|- data/                             -> Metadata Tables (Each sub directory (named with the project name) contains several metadata tables for every entity, please read the specifications for more details)
|    |- <PROJECT_NAME>_RNA/
|    |       |- project/
|    |       |- donor/
|    |       |- biospecimen/
|    |       |- reference_materials/
|    |       |- library/
|    |       |- sequencing/
|    |       |- datafile/
|    |       |- README.md
|    |       |- CHANGELOG
|- docs/                             -> Specifications and How-To
|- README.md                         -> Quick Start
|- CHANGELOG                         -> Change Log
|- LICENSE                           -> License File

A centralized location for storing curated data ready for inclusion in cBioPortal.

For Pipeline Developers

flowchart LR
  subgraph HPC [Testing on Local HPC]
    publications([Publications]) --Summarized by Pipeline Developer--> scripts[Pipeline with PBS Scripts]
    scripts --Evaluated by Pipeline Developer--> tested_pipeline[The Best Pipeline]
  end
  subgraph AliCloud [Running on AliCloud]
    app[Usable and Low-cost App] --Deployed by System Developer--> platform[ClinicoOmics Web System]
  end
  HPC -- Pipeline >>> App --> AliCloud
  HPC -. Developed by App Developer .-> AliCloud
Loading

For System Developers

The whole system mainly consists of the following parts:Omics Datasets' Repo, Metadata QC & QA System, Workflows & Bioinformatics Workflow Management System, Omics Data Commons(Similar to Genomics Data Commons), A Web Platform for Collaborative Multi-omic Data Visualization and Exploration(Simimar to cBioportal).

Components Temporary Solution Production Description
Omics Datasets' Repo TCOA DataHub
SEQC DataHub
Same as the Temporary Solution GitHub Repo, For Metadata Collaboration and Version Control
Metadata QC & QA System Metabase
Metadata Validator
Same as the Temporary Solution For Metadata QC & QA
Workflows & Bioinformatics Workflow Management System ClinicoOmics Same as the Temporary Solution For Bioinformatics Workflow Management
Omics Data Commons Metabase
BioPoem
Metabase
BioPoem
BioMiner Indexd
Metabase - For Metadata Management of the Omics Datasets

Biopoem - For High-speed Data Transfering (Sequencing Center -> Local HPC -> NODE -> AliCloud/Local HPC)

BioMiner Indexd - A hash-based data indexing and tracking service providing globally unique identifiers.

NODE/GSA/SRA - For Level 1/2 Data Files

OSS - For Level3 Data Files
A Web Platform for Collaborative Multi-omic Data Visualization and Exploration cBioportal
cBioportal DataHub
BioMiner Studio cBioportal - Omics Data Exploration

cBioportal DataHub - A centralized location for storing curated data ready for inclusion in cBioPortal.

BioMiner Studio - Support Online Query and Exploration Analysis Modules(Similar to cBioportal, Stats and Machine Learning Modules
graph TD
    subgraph BioMiner's Management View
      BioMiner -- Manage <> Similar with GDC --> Metadata[Metadata];
      BioMiner -- Manage <> Similar with cBioportal --> AnalysisModules[Analysis Modules];
      Metadata -.- AnalysisModules;
    end

    Metadata -- Contains --> Info[Patient/Sample Information, etc.];
    Metadata -- File Link to --> Level1[Level1 Data - NODE/GSA/SRA];
    Metadata -- File Link to --> Level3[Level3 Data - Cloud Storage];
    SeqSite[Sequencing Center] -- Sync by Biopoem --> HPC[PGx HPC];
    HPC -- Sync by Biopoem --> Level1[Level1 Data - NODE/GSA/SRA];
    Level1 -- Transfer to Cloud Storage by BioPoem & Convert Level1 Data to Level3 Data by SeqPipe  --> Level3;
    AnalysisModules --> DownstreamAnalysis[Downstream Analysis];
    Info -- Part of Samples' Metadata --> Dataset[New Dataset];
    Level3 -- Part of Samples --> Dataset;
    Dataset -- As Input Data --> DownstreamAnalysis(Mining Dataset with Several Analysis Modules on BioMiner);
    DownstreamAnalysis --> Kownledges;
    
    style Level1 fill:#00758f;
    style Level3 fill:#00758f;
    style Info fill:#00758f;
    style Metadata fill:#dafbe1;
    style AnalysisModules fill:#dafbe1;
Loading

Pinned Loading

  1. seqc-datahub seqc-datahub Public

    Datahub for SEQC Project

    Shell

Repositories

Showing 6 of 6 repositories
  • omics-workflow-images Public

    Docker images for omics workflow.

    biominer-lab/omics-workflow-images’s past year of commit activity
    Python 0 0 0 0 Updated Mar 20, 2023
  • biominer-lab/www.3steps.cn’s past year of commit activity
    0 0 0 0 Updated Aug 13, 2022
  • .github Public

    README for All Parteners

    biominer-lab/.github’s past year of commit activity
    0 0 0 0 Updated Aug 3, 2022
  • rnaseq-exp-workflow Public

    RNA Sequencing (RNA-seq) Pipeline

    biominer-lab/rnaseq-exp-workflow’s past year of commit activity
    WDL 0 0 0 0 Updated Jul 21, 2022
  • biominer-lab/conda-builder’s past year of commit activity
    0 0 0 0 Updated Jun 24, 2022
  • seqc-datahub Public

    Datahub for SEQC Project

    biominer-lab/seqc-datahub’s past year of commit activity
    Shell 0 0 0 0 Updated Apr 4, 2022

Top languages

Loading…

Most used topics

Loading…