MaGIC Differential Expression Tool

Welcome to the Differential Expression Tool by the Molecular and Genomics Informatics Core (MaGIC).

What This Tool Does

This is the DE engine of the MaGIC bulk-expression pipeline. Upload a raw count matrix and sample metadata, build a design, and run DESeq2, edgeR-GLM, or limma-voom. It produces a results table per contrast — gene, baseMean, log2FoldChange, lfcSE, stat, pvalue, padj — that drops straight into magic-volcano, magic-heatmap, and magic-setcomparison.
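As a rough illustration of working with the per-contrast results table outside the downstream tools, the sketch below reads a table with the columns listed above and keeps genes that pass a padj and fold-change threshold. The example rows, the cutoffs (0.05 and 1.0), and the function name are assumptions made for illustration, not output or defaults of this tool. Rows whose padj is not numeric (for example NA) are skipped.

```python
import csv
import io

# Made-up rows for illustration only; the column names follow the list above.
EXAMPLE_RESULTS = """gene,baseMean,log2FoldChange,lfcSE,stat,pvalue,padj
G0001,183.2,1.85,0.31,5.97,2.4e-09,7.2e-07
G0002,202.5,-0.12,0.28,-0.43,0.67,0.81
G0003,12.4,2.40,1.10,2.18,0.029,NA
"""

def significant_genes(handle, padj_cutoff=0.05, min_abs_lfc=1.0):
    """Return gene IDs with padj below the cutoff and |log2FoldChange| at or above the minimum."""
    hits = []
    for row in csv.DictReader(handle):
        try:
            padj = float(row["padj"])
            lfc = float(row["log2FoldChange"])
        except ValueError:
            # Non-numeric entries such as "NA" cannot be compared; skip the row.
            continue
        if padj < padj_cutoff and abs(lfc) >= min_abs_lfc:
            hits.append(row["gene"])
    return hits

hits = significant_genes(io.StringIO(EXAMPLE_RESULTS))
print(hits)
```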

Which DE method should I use?

DESeq2

Best for small-to-moderate sample sizes (~3–10 per group). Provides shrunken fold-change estimates and is the most permissive of the three methods about low counts. The default choice for most experiments.

edgeR-GLM

Flexible designs, robust to complex multi-factor models, with power comparable to DESeq2 in most settings.

limma-voom

Larger sample sizes (>10 per group). Fast and well-calibrated when you trust the mean-variance assumption.

All three normalize raw counts internally — DESeq2 uses median-of-ratios; edgeR and limma-voom use TMM. Feed this tool raw integer counts, not the normalized output of magic-qc.
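To make the median-of-ratios idea concrete, here is a minimal sketch in plain Python. It is a simplified illustration of the general technique, not the DESeq2 implementation: it takes the per-gene geometric mean across samples, drops genes with a zero in any sample, and uses the median ratio per sample as that sample's size factor. The function name and the toy counts are assumptions.

```python
import math
import statistics

def size_factors(counts):
    """counts: list of gene rows, each a list of raw counts (one per sample)."""
    n_samples = len(counts[0])
    usable_rows = []
    log_geo_means = []
    for row in counts:
        # Genes with a zero in any sample have no finite log geometric mean.
        if all(c > 0 for c in row):
            usable_rows.append(row)
            log_geo_means.append(sum(math.log(c) for c in row) / n_samples)
    if not usable_rows:
        raise ValueError("every gene has at least one zero count")
    factors = []
    for j in range(n_samples):
        # Log ratio of each gene's count to its geometric mean, for sample j.
        log_ratios = [math.log(row[j]) - lg
                      for row, lg in zip(usable_rows, log_geo_means)]
        factors.append(math.exp(statistics.median(log_ratios)))
    return factors

# Toy matrix: sample 2 was sequenced twice as deeply as sample 1.
toy_counts = [[10, 20], [100, 200], [5, 10], [0, 7]]
factors = size_factors(toy_counts)
print(factors)
```

Dividing each sample's counts by its size factor puts the samples on a common scale, which is why the tool expects raw counts rather than counts that were already normalized.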

When should I correct for batch?

First-line: add batch as a covariate in your design

In Setup & Run → Design, add your batch variable as a covariate. The model then estimates the batch effect alongside the condition effect: counts are left unmodified, fewer assumptions are needed, and Type I error is not inflated. Try this first for any DE analysis.

Alternative: pre-correct counts with ComBat-seq

Pre-correcting the counts (Setup & Run → Batch Correction) is appropriate only when batch is severely confounded with biology, when the covariate approach produces unstable fits, or when you need a batch-corrected count matrix as an output artifact for visualization / clustering tools that cannot model batch themselves. ComBat-seq discards information and can inflate false positives if used unnecessarily — don't reach for it by default.
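Before choosing between the two approaches, it can help to look at how conditions are distributed across batches. The sketch below is a hypothetical helper (the function name and the three labels are assumptions, not part of this tool) that classifies the layout. If every batch contains a single condition, batch and condition cannot be separated by the model.

```python
from collections import defaultdict

def batch_layout(samples):
    """samples: iterable of (condition, batch) pairs. Returns a layout label."""
    conditions_in_batch = defaultdict(set)
    all_conditions = set()
    for condition, batch in samples:
        conditions_in_batch[batch].add(condition)
        all_conditions.add(condition)
    # Every batch holds exactly one condition: the two factors move together.
    if all(len(found) == 1 for found in conditions_in_batch.values()):
        return "fully confounded"
    # Every batch holds every condition: the two factors are crossed.
    if all(found == all_conditions for found in conditions_in_batch.values()):
        return "crossed"
    return "partially confounded"

crossed = [("Control", "B1"), ("TreatA", "B1"), ("Control", "B2"), ("TreatA", "B2")]
confounded = [("Control", "B1"), ("TreatA", "B2")]
partial = [("Control", "B1"), ("TreatA", "B1"), ("TreatA", "B2")]
print(batch_layout(crossed), batch_layout(confounded), batch_layout(partial))
```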

Required Input Data Formats

Raw Count Matrix

File format: CSV or TSV
Rows: genes (one gene per row)
Columns: samples (one sample per column)
First column: gene identifiers
Values: raw (un-normalized) integer counts

GeneID, Control1, Control2, Treat1
G0001,  149,      122,      218
G0002,  409,      151,      46

Sample Metadata

File format: CSV or TSV
Rows: samples (one sample per row)
First column: sample names — must match matrix column names
Additional columns: condition, batch, sex, time, ...

Sample,    Condition, Batch
Control1,  Control,   B1
Treat1,    TreatA,    B2
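The two format requirements that most often cause trouble (sample names that do not match, and non-integer counts) can be checked before uploading. The following is a minimal sketch assuming comma-separated input. The function names and the toy tables (samples S1 to S3) are hypothetical, and the checks this tool itself performs may differ.

```python
import csv
import io

def read_table(text):
    """Parse CSV text into (header, rows), trimming padding around each cell."""
    rows = [[cell.strip() for cell in row]
            for row in csv.reader(io.StringIO(text)) if row]
    return rows[0], rows[1:]

def check_inputs(counts_text, metadata_text):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    count_header, count_rows = read_table(counts_text)
    _, meta_rows = read_table(metadata_text)
    matrix_samples = count_header[1:]          # first column holds gene IDs
    meta_samples = [row[0] for row in meta_rows]
    for sample in matrix_samples:
        if sample not in meta_samples:
            problems.append(f"no metadata row for sample {sample}")
    for sample in meta_samples:
        if sample not in matrix_samples:
            problems.append(f"no count column for sample {sample}")
    for row in count_rows:
        for sample, value in zip(matrix_samples, row[1:]):
            # Raw counts must be non-negative whole numbers.
            if not value.isdigit():
                problems.append(f"non-integer count {value!r} for {row[0]} in {sample}")
    return problems

toy_counts = "GeneID, S1, S2, S3\nG0001, 149, 122, 218\nG0002, 409, 15.5, 46\n"
toy_metadata = "Sample, Condition, Batch\nS1, Control, B1\nS3, TreatA, B2\n"
problems = check_inputs(toy_counts, toy_metadata)
print(problems)
```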

Gene Info Table (optional)

Maps gene IDs to symbols so the results tables carry a Symbol column for downstream annotation.

GeneID,  Symbol
G0001,   TP53
G0002,   EGFR

1 · Data

Upload your own data

Raw count matrix (CSV/TSV)

Browse...

Sample metadata (CSV/TSV)

Browse...

Gene info (optional, CSV/TSV)

Browse...

Demo data is loaded by default: 600 genes × 12 samples in 3 conditions (Control / TreatA / TreatB), with a partially confounded batch and ~100 truly DE genes per treatment. Configure the design below and click Run.

Flip the switch above to upload your own counts and metadata.

2 · Design

Primary condition column:

Reference level:

Fold-changes are reported relative to the reference level.

Additive covariates / blocking factors:

Covariates enter the model as additive terms (interactions are not supported in v1). Add batch here as the first-line way to correct for it.
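To show what an additive design amounts to, the sketch below builds a formula string and a treatment-coded (0/1 indicator) design matrix from sample metadata. It is an illustration under assumptions: the function name, the column-naming scheme, the choice of the alphabetically first level as the covariate baseline, and the placement of the condition term last are all choices made here, not a description of this tool's internals.

```python
def additive_design(samples, condition, reference, covariates=()):
    """samples: list of dicts keyed by metadata column name.

    Returns (formula, column_names, rows) for an intercept-plus-indicators design.
    """
    condition_levels = {s[condition] for s in samples}
    if reference not in condition_levels:
        raise ValueError(f"reference level {reference!r} not found in {condition}")
    columns = ["Intercept"]
    indicators = []
    # Covariates first, condition of interest last.
    for factor in list(covariates) + [condition]:
        levels = sorted({s[factor] for s in samples})
        base = reference if factor == condition else levels[0]
        for level in levels:
            if level != base:
                columns.append(f"{factor}_{level}_vs_{base}")
                indicators.append((factor, level))
    rows = [[1] + [int(s[factor] == level) for factor, level in indicators]
            for s in samples]
    formula = "~ " + " + ".join(list(covariates) + [condition])
    return formula, columns, rows

toy_samples = [
    {"Sample": "Control1", "Condition": "Control", "Batch": "B1"},
    {"Sample": "Control2", "Condition": "Control", "Batch": "B2"},
    {"Sample": "Treat1", "Condition": "TreatA", "Batch": "B1"},
    {"Sample": "Treat2", "Condition": "TreatA", "Batch": "B2"},
]
formula, columns, rows = additive_design(
    toy_samples, condition="Condition", reference="Control", covariates=["Batch"])
print(formula)
print(columns)
for row in rows:
    print(row)
```

Each covariate contributes its own indicator columns and nothing more, so the batch effect is assumed to be the same in every condition. That is what "additive" means here.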

Batch Correction (ComBat-seq)

Pre-correct for batch with ComBat-seq

Prefer adding batch as a covariate above. Use ComBat-seq only when batch is severely confounded with biology, when covariate fits are unstable, or when you need a corrected count matrix as an output artifact.

Batch column:

Biological variable(s) to preserve:

Passed to ComBat-seq's group / covar_mod so real biological differences are not removed.

3 · Run

Method:

Test:

Wald / LRT

Wald tests each contrast directly. LRT tests the condition term as a whole (reduced model = covariates only).

Independent filtering

TMM normalization

Quasi-likelihood F-tests (glmQLFit / glmQLFTest) on a filtered count matrix.

Sample quality weights

voomWithQualityWeights down-weights lower-quality samples. TMM is always applied for limma-voom.

Design formula

Contrasts to test

By default, every non-reference level of the primary condition is compared against the reference. Add custom level-vs-level comparisons below.

Numerator level:

Denominator level:
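The default contrast set described above can be written down in a few lines. This is a sketch with hypothetical function names. It only enumerates (numerator, denominator) pairs and does not run any test.

```python
def default_contrasts(levels, reference):
    """Every non-reference level versus the reference, as (numerator, denominator)."""
    if reference not in levels:
        raise ValueError(f"reference level {reference!r} is not among the levels")
    return [(level, reference) for level in levels if level != reference]

def add_custom_contrast(contrasts, numerator, denominator):
    """Return a new list with one extra level-vs-level comparison, skipping duplicates."""
    if numerator == denominator:
        raise ValueError("numerator and denominator must differ")
    pair = (numerator, denominator)
    return contrasts if pair in contrasts else contrasts + [pair]

levels = ["Control", "TreatA", "TreatB"]
contrasts = default_contrasts(levels, reference="Control")
contrasts = add_custom_contrast(contrasts, "TreatB", "TreatA")
print(contrasts)
```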

Replicates per level of the primary condition — a quick check on the power available for each contrast.
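The same replicate count is easy to reproduce from the metadata file. The sketch below assumes the metadata has been read into a list of dicts. The function names and the toy rows are hypothetical.

```python
from collections import Counter

def replicates_per_level(metadata_rows, condition_column):
    """Count samples in each level of the primary condition."""
    return Counter(row[condition_column] for row in metadata_rows)

def smallest_group(replicates, numerator, denominator):
    """The smaller of the two group sizes involved in a contrast."""
    return min(replicates[numerator], replicates[denominator])

toy_metadata = [
    {"Sample": "S1", "Condition": "Control"},
    {"Sample": "S2", "Condition": "Control"},
    {"Sample": "S3", "Condition": "Control"},
    {"Sample": "S4", "Condition": "TreatA"},
    {"Sample": "S5", "Condition": "TreatA"},
]
replicates = replicates_per_level(toy_metadata, "Condition")
print(dict(replicates))
print(smallest_group(replicates, "TreatA", "Control"))
```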

Diagnostics

Contrast

MA plot & p-value histogram for:

Fonts

Title size:

X axis label size:

X axis font size:

Y axis label size:

Y axis font size:

Legend

Legend location:

Legend size:

Point size:

MA highlight

padj cutoff:

Significant points:

Non-significant points:

Histogram

Bar fill:

Styling

Text size:

This plot is drawn for the differential expression method you ran.

Size

Resize plots

Plot height (px):

Plot width (px):

Choose download format

Download the MA plot

Download the p-value histogram

Download the plot

Download the PCA plot