What is PerturbSeq.db

Perturbation experiments are essential in systems biology for understanding cellular networks. Combining genetic or chemical perturbations with techniques such as scRNA-seq or scATAC-seq enables analysis at single-cell resolution, revealing cellular heterogeneity that bulk studies often miss. Large-scale screens provide a deeper look into complex cellular behaviors.
PerturbSeq.db is a database that compiles 189 datasets from 77 studies, covering more than 50 cell types across different technologies. It includes 165 scRNA-seq and 24 scATAC-seq datasets from human and mouse sources, spanning both genetic and chemical perturbations. This resource supports the study of cell responses and interactions at single-cell resolution.


Overview of PerturbSeq.db



Please refer to the following publication
Tongxin He, Wenhui Wang, Xiaoxiao Yang, Yang Tong, Xiaochuan Liu, Yuting Wang, Jiapei Yuan, Yang Yang*. PerturbSeq.db: a comprehensive resource for interactive investigation of single-cell perturbation data. 2024
News and updates
- PerturbSeq.db 1.00 released (2024-06-20)
- PerturbSeq.db home page and help page modified (2024-06-12)
- PerturbSeq.db contents related to new datasets added (2024-05-28)
- PerturbSeq.db new single-cell perturbation data curation (2024-05-12)
- PerturbSeq.db query mediators module finished (2024-03-12)
- PerturbSeq.db query targets module added (2024-03-29)
- PerturbSeq.db query perturbation module added (2024-03-28)
- PerturbSeq.db dataset module finished (2024-03-12)
- PerturbSeq.db first version data analysis finished (2023-01-30)
- PerturbSeq.db uniform data analysis pipeline finished (2023-09-30)
- PerturbSeq.db single-cell perturbation data curation finished (2023-08-30)
- PerturbSeq.db single-cell perturbation data curation started (2023-06-15)







Quality control metrics before denoising
Quality control metrics after denoising
Single-cell clustering
Ratio of each cluster in each perturbation
Ratio of each perturbation in each cluster
Differentially expressed genes in each perturbation (Wilcoxon test)
Differentially expressed genes in each perturbation (t-test)
Gene ontology enrichment (Wilcoxon DEGs)
Disease enrichment (Wilcoxon DEGs)
Gene ontology enrichment (t-test DEGs)
Disease enrichment (t-test DEGs)
Similarity (Pearson correlation)
Similarity (E-distance)

Query Target/Perturbation













Query Mediator



















About PerturbSeq.db

Perturbation experiments are a cornerstone of systems biology, allowing researchers to probe the intricate networks within cells. Technologies such as CRISPR-Cas9, CRISPRi, and CRISPRa have revolutionized how cells are manipulated and studied. CRISPR-Cas9 enables precise genome editing to introduce specific mutations or deletions, while CRISPRi and CRISPRa repress or activate gene expression, respectively, without altering the DNA sequence. In contrast, small-molecule chemicals typically interact directly with proteins such as receptors and enzymes. Together, these tools provide a multi-tiered approach to modulating cellular functions and dissecting genetic and protein interactions. By combining genetic and chemical perturbations with techniques such as scRNA-seq or scATAC-seq, researchers can assess cellular responses at the single-cell level. Single-cell analysis has revealed heterogeneity within cell populations, a facet often overlooked in bulk studies, and large-scale screens provide insights into complex cellular behaviors that are not apparent in traditional bulk measurements. Large perturbation screens are tailored to study specific systems under a set of perturbations of interest.

PerturbSeq.db is a curated database that consolidates 189 single-cell perturbation datasets from 77 studies. These datasets comprise 165 scRNA-seq and 24 scATAC-seq molecular readouts. The collection covers approximately 52 different cell lines or tissues, such as embryonic stem cells, PBMCs, tumor-infiltrating immune cells, and brain organoids. Of the 189 datasets, 147 involve genetic perturbations, while the rest involve chemical perturbations. Most of the data come from Homo sapiens (156 datasets) and Mus musculus (29 datasets). All datasets were analyzed with a uniform processing pipeline. An interactive user interface lets users browse, search, visualize, and download single-cell perturbation data conveniently.

  • Data Browsing: Users can select datasets based on species, cell/tissue type, perturbation type, and associated publications. Each dataset includes quality control metrics, clustering results, target identification, functional assessments, and perturbation similarity analyses.
  • Data Querying: Users can search for perturbations and their effects on a gene of interest. The database will display the results in both network and tabular formats, showing how the gene’s expression is influenced by various perturbations. Additionally, it identifies target genes affected by the perturbation of the gene in question. Users can also explore potential mediators by inputting pairs of genes.
  • Perturbation: Users can analyze the results of a perturbation of interest across multiple datasets in a specific species. The database shows the quality control metrics, DEG counts and intersections, as well as detailed visualizations of expression changes, in terms of both magnitude and direction, for DEGs shared across datasets.
  • Data Downloading: Users can access and download processed single-cell perturbation data by specifying the species, cell/tissue type, and perturbation type, which provides a direct link for data retrieval.
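
The dataset pages list Pearson correlation and E-distance as perturbation similarity measures, but this page does not describe how they are computed. The Python sketch below is only an illustration under stated assumptions: Pearson correlation is taken between two per-gene expression-change profiles, and E-distance is taken to be the standard energy distance between two groups of cells using Euclidean distances. The function names `pearson_similarity` and `energy_distance` are hypothetical and are not part of PerturbSeq.db.

```python
import numpy as np


def pearson_similarity(profile_a, profile_b) -> float:
    """Pearson correlation between two per-gene expression-change profiles."""
    return float(np.corrcoef(profile_a, profile_b)[0, 1])


def energy_distance(cells_x, cells_y) -> float:
    """Energy distance between two cell populations.

    Rows are cells, columns are features (e.g., principal components).
    Assumption: Euclidean distances; the database may use another variant.
    """
    x = np.asarray(cells_x, dtype=float)
    y = np.asarray(cells_y, dtype=float)

    def mean_pairwise(a, b):
        # Mean Euclidean distance over all pairs (one row from a, one from b).
        diff = a[:, None, :] - b[None, :, :]
        return np.sqrt((diff ** 2).sum(axis=-1)).mean()

    # 2 * E|X - Y| - E|X - X'| - E|Y - Y'|
    return float(2 * mean_pairwise(x, y) - mean_pairwise(x, x) - mean_pairwise(y, y))
```

Under this definition, identical populations have a distance of zero, and populations whose cells lie far apart relative to their internal spread have a large positive distance. The all-pairs computation is kept simple for readability and would need chunking for large numbers of cells.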


Perturbation Description

PerturbSeq.db includes a comprehensive collection of perturbations, comprising 19,646 genetic and 775 chemical perturbations. These perturbations are categorized based on their type and mode of action.

Genetic perturbations in PerturbSeq.db are classified into two main categories: those targeting specific genes and those targeting chromosomal loci. Perturbations targeting specific genes are identified by gene symbol (e.g., TP53). Those targeting chromosomal loci are annotated with chromosome coordinates, formatted as chromosome ID:Start-End (e.g., chr8:99763530-99763606). sgRNAs targeting different genes or chromosomal loci are considered distinct perturbations because of their distinct biological effects. Multiple sgRNAs targeting the same gene or chromosomal locus, which are used to ensure effective gene knockdown, knockout, or overexpression, are classified as a single perturbation. The database also includes cases where cells are subjected to multiple perturbations simultaneously. These combinations are denoted using a comma (e.g., “41BB, TGFBR2” indicates perturbation by both 41BB and TGFBR2). To maintain consistency across datasets, the order of genes in combined perturbations is standardized. For example, both “41BB, TGFBR2” and “TGFBR2, 41BB” are sorted to “41BB, TGFBR2”.
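
The labeling conventions above can be sketched in Python as follows. This is a minimal illustration, not the database's actual code: the function names `normalize_perturbation` and `parse_locus` are hypothetical, plain string sorting is assumed for the standardized order, and the locus pattern is inferred from the single example given.

```python
import re

# Pattern for locus labels such as "chr8:99763530-99763606"
# (assumption: inferred from the example format chromosome ID:Start-End).
LOCUS_RE = re.compile(r"^(chr[0-9A-Za-z]+):(\d+)-(\d+)$")


def normalize_perturbation(label: str) -> str:
    """Sort the targets of a (possibly combined) perturbation label."""
    targets = [t.strip() for t in label.split(",") if t.strip()]
    # Assumption: plain string sorting defines the standardized order.
    return ", ".join(sorted(targets))


def parse_locus(label: str):
    """Return (chromosome, start, end) for a locus label, or None for a gene symbol."""
    match = LOCUS_RE.match(label.strip())
    if match is None:
        return None
    chrom, start, end = match.groups()
    return chrom, int(start), int(end)
```

With this sketch, `normalize_perturbation("TGFBR2, 41BB")` and `normalize_perturbation("41BB, TGFBR2")` produce the same label, so the two spellings of a combined perturbation are grouped together when datasets are compared.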

For chemical compound perturbations, different treatment conditions, such as varying doses or durations, can significantly alter cellular effects. Each chemical compound perturbation is therefore annotated with detailed treatment information, as provided in the literature (e.g., Erlotinib_Day1, Dacinostat_0.01μM).
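
A label of this form could be separated into compound and treatment with a short Python helper like the one below. It is a sketch that assumes the treatment annotation always follows the last underscore, which holds for the two examples given but is not confirmed for every entry in the database. The function name `split_chemical_label` is hypothetical.

```python
def split_chemical_label(label: str):
    """Split "Compound_Treatment" into (compound, treatment).

    Assumption: the treatment annotation follows the last underscore.
    Labels without an underscore are returned with treatment set to None.
    """
    compound, sep, treatment = label.rpartition("_")
    if not sep:
        return label, None
    return compound, treatment
```

Splitting at the last underscore rather than the first keeps compound names that themselves contain underscores intact, at the cost of misreading any treatment annotation that contains one.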



Tutorials of PerturbSeq.db