Multi-Omics Integration¶
The Multi-Omics module combines two or more datasets from the same project to perform joint analysis. Datasets are linked by a shared sample identifier.
Requirements¶
- At least two datasets in the same project
- A common sample ID that appears in both datasets (e.g. cell line name, patient ID, sample barcode)
- Both datasets must be preprocessed
If your datasets do not share sample IDs directly, you may need to add a mapping column to your sample metadata before uploading.
Linking datasets¶
When you open the Multi-Omics module, GrAndMA shows all datasets in the current project and attempts to detect the shared sample ID automatically. If detection fails, use the Link by dropdowns to specify which column in each dataset's sample metadata contains the shared identifier.
Samples present in one dataset but not the other are excluded from joint analysis. The intersection size is shown before you proceed.
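The intersection rule above can be illustrated with a minimal sketch. The function name `shared_samples` and the sample IDs are hypothetical and chosen only for illustration; this is not GrAndMA's internal code.

```python
def shared_samples(ids_a, ids_b):
    """Return sample IDs present in both datasets, in the order of the first."""
    in_b = set(ids_b)
    return [s for s in ids_a if s in in_b]


# Hypothetical sample IDs from two datasets in the same project
metabolomics_ids = ["A549", "HeLa", "MCF7", "K562"]
rnaseq_ids = ["MCF7", "A549", "U2OS"]

common = shared_samples(metabolomics_ids, rnaseq_ids)
# Samples found in only one dataset are dropped from joint analysis
excluded = sorted(set(metabolomics_ids) ^ set(rnaseq_ids))

print(len(common), common)
print(excluded)
```

Here the intersection size is 2, and the three samples that appear in only one dataset would be excluded.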
Combined Statistics¶
Runs a supervised (PLS-DA) or unsupervised (PCA) method on the concatenated feature matrix of all selected datasets.
PCA on concatenated matrix¶
Each dataset is scaled independently, the feature matrices are concatenated column-wise, and a single PCA is run on the result. Use Color by and Shape by in Plot Options to annotate samples post-hoc; no group column is required at run time.
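As a rough sketch of this procedure, the NumPy example below z-scores each feature within its own block, concatenates the blocks column-wise, and computes PCA via SVD. It assumes that "scaled independently" means per-feature standardisation within each dataset, and that rows are already restricted to shared samples in the same order. The function names and random data are hypothetical.

```python
import numpy as np


def scale_block(X):
    """Z-score every feature (column) within one dataset."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # leave constant features at zero instead of dividing by 0
    return (X - mean) / std


def concatenated_pca(blocks, n_components=2):
    """PCA on the column-wise concatenation of independently scaled blocks."""
    Z = np.hstack([scale_block(X) for X in blocks])  # samples x all features
    # Columns are already centred, so the SVD of Z gives the principal axes
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    explained = (s ** 2) / (s ** 2).sum()
    return scores, Vt[:n_components], explained[:n_components]


rng = np.random.default_rng(0)
metabolomics = rng.normal(size=(12, 6))             # 12 samples, 6 features
transcriptomics = rng.normal(size=(12, 9)) * 100.0  # same samples, different scale

scores, loadings, explained = concatenated_pca([metabolomics, transcriptomics])
print(scores.shape, loadings.shape)
```

Scaling before concatenation keeps a block with large raw values from dominating the components purely because of its units.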
PLS-DA on concatenated matrix¶
Requires a Group column (a metadata column shared across datasets) as the response variable. Useful for identifying features that discriminate between groups across omics layers.
Unsupervised Integration¶
Methods that find shared latent structure across omics layers without requiring a group label. Group colouring is applied post-hoc via Color by in Plot Options.
MCIA — Multiple Co-Inertia Analysis¶
Finds axes of maximum co-inertia between datasets. Produces joint sample scores (all datasets), per-block sample scores, and a co-inertia-by-component bar chart.
| Parameter | Default | Description |
|---|---|---|
| Components | 2 | Number of co-inertia axes to compute |
JIVE — Joint and Individual Variation Explained¶
Separates variation into joint structure (shared across omics) and individual structure (specific to each dataset). Produces joint structure scores, per-block individual scores, and a variance decomposition chart.
| Parameter | Default | Description |
|---|---|---|
| Components | 2 | Number of components for the full decomposition |
| Joint components | 2 | Rank of the joint structure |
| Individual components | 2 | Rank of each block's individual structure |
MOFA+ — Multi-Omics Factor Analysis¶
Decomposes all datasets into shared latent factors explaining co-variation across omics layers. Each factor captures a coordinated pattern — e.g. a factor driven by both metabolite levels and gene expression.
| Parameter | Default | Description |
|---|---|---|
| Factors | 10 | Number of latent factors to learn |
Results include factor scores per sample, feature weights per factor per dataset, and variance explained per factor.
SNF — Similarity Network Fusion¶
Builds a sample similarity network for each dataset and fuses the networks iteratively, then assigns samples to clusters based on the fused network. Useful when datasets are very heterogeneous.
| Parameter | Default | Description |
|---|---|---|
| Clusters | 3 | Number of clusters in the fused network |
| K neighbours | 20 | Nearest neighbours used for each sample graph |
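To make the role of K neighbours concrete, the following is a simplified sketch of the cross-diffusion idea behind SNF. It is not GrAndMA's implementation: the kernel form, the row normalisation after each iteration, and the function names (`affinity`, `knn_kernel`, `fuse`) are assumptions made for illustration, and the final clustering step on the fused matrix is omitted.

```python
import numpy as np


def affinity(X, k):
    """Gaussian affinity with a bandwidth based on k-nearest-neighbour distances."""
    d = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    # Column 0 of each sorted row is the sample itself (distance 0), so skip it
    knn_mean = np.sort(d, axis=1)[:, 1:k + 1].mean(axis=1)
    eps = np.maximum((knn_mean[:, None] + knn_mean[None, :] + d) / 3.0, 1e-12)
    return np.exp(-(d ** 2) / (0.5 * eps ** 2))


def row_normalise(W):
    return W / W.sum(axis=1, keepdims=True)


def knn_kernel(W, k):
    """Keep each sample's k strongest neighbours (excluding itself), row-normalised."""
    W = W.copy()
    np.fill_diagonal(W, 0.0)
    idx = np.argsort(-W, axis=1)[:, :k]
    rows = np.arange(W.shape[0])[:, None]
    S = np.zeros_like(W)
    S[rows, idx] = W[rows, idx]
    return row_normalise(S)


def fuse(blocks, k=20, iterations=10):
    """Fuse per-dataset similarity networks by iterative cross-diffusion."""
    W = [affinity(X, k) for X in blocks]
    P = [row_normalise(w) for w in W]   # full similarity per dataset
    S = [knn_kernel(w, k) for w in W]   # sparse local similarity per dataset
    for _ in range(iterations):
        P_next = []
        for v in range(len(blocks)):
            others = [P[u] for u in range(len(blocks)) if u != v]
            mean_other = sum(others) / len(others)
            # Diffuse the other datasets' similarities through this dataset's local graph
            P_next.append(row_normalise(S[v] @ mean_other @ S[v].T))
        P = P_next
    fused = sum(P) / len(P)
    return (fused + fused.T) / 2.0      # symmetrise


# Hypothetical data: 60 shared samples in two groups, measured in two datasets
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 30)
block_a = rng.normal(size=(60, 5)) + 6.0 * labels[:, None]
block_b = rng.normal(size=(60, 8)) + 6.0 * labels[:, None]

fused = fuse([block_a, block_b], k=20, iterations=10)
same_group = labels[:, None] == labels[None, :]
print(fused[same_group].mean() > fused[~same_group].mean())
```

A larger K makes each local graph denser and smooths the fused network, while a smaller K emphasises each sample's closest neighbours. K must be smaller than the number of shared samples.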
Combined Network¶
Builds a cross-dataset correlation network from the top N most variable features across all datasets.
| Parameter | Default | Description |
|---|---|---|
| Top features | 20 | Number of highest-variance features to include |
| Threshold | 0.6 | Minimum absolute correlation \|r\| for an edge to appear in the network |
Two views are produced:
- Correlation heatmap — all pairwise correlations among selected features
- Network graph — spring-layout graph; blue edges = positive correlation, red = negative; nodes coloured and shaped by source dataset
A table of edges that pass the threshold (sorted by |r|) is available below each plot.
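The feature selection and thresholding steps can be sketched as follows. This is a minimal illustration that assumes Pearson correlation and variance computed on the concatenated matrix as supplied; the function name `correlation_edges`, the feature names, and the simulated data are hypothetical.

```python
import numpy as np


def correlation_edges(X, names, top_n=20, threshold=0.6):
    """Return (feature_a, feature_b, r) for top-variance features with |r| >= threshold."""
    variances = X.var(axis=0, ddof=1)
    keep = np.argsort(variances)[::-1][:top_n]      # indices of the most variable features
    r = np.corrcoef(X[:, keep], rowvar=False)       # pairwise Pearson correlations
    edges = []
    for a in range(len(keep)):
        for b in range(a + 1, len(keep)):
            if abs(r[a, b]) >= threshold:
                edges.append((names[keep[a]], names[keep[b]], float(r[a, b])))
    edges.sort(key=lambda e: abs(e[2]), reverse=True)  # strongest edges first
    return edges


# Hypothetical concatenated matrix: one metabolite and three genes across 50 samples
rng = np.random.default_rng(0)
base = rng.normal(size=50)
X = np.column_stack([
    base * 3.0,                                   # met_1
    base * 3.0 + rng.normal(scale=0.3, size=50),  # gene_1, tracks met_1
    -base * 3.0 + rng.normal(scale=0.3, size=50), # gene_2, opposes met_1
    rng.normal(size=50),                          # gene_3, low-variance noise
])
names = ["met_1", "gene_1", "gene_2", "gene_3"]

edges = correlation_edges(X, names, top_n=3, threshold=0.6)
for a, b, r in edges:
    print(a, b, round(r, 2))
```

In the network graph, an edge with positive `r` would be drawn in blue and one with negative `r` in red.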
Plot options¶
All score plots (PCA, MCIA joint scores, JIVE joint scores, MOFA+ factor scores) support post-hoc styling via the Plot Options panel:
| Option | Description |
|---|---|
| Marker size | Adjust point size across all scatter plots |
| Color by | Colour samples by any metadata column from any linked dataset |
| Shape by | Use marker shape to encode a second metadata variable |
Metadata is merged from all selected datasets — overlapping columns with identical values are deduplicated; conflicting values appear as separate suffixed columns (e.g. condition__dataset1, condition__dataset2).
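The merge rule can be illustrated with plain dictionaries. This sketch assumes each dataset's metadata is already restricted to the shared samples; the function name `merge_metadata` and the example values are hypothetical, and only the `column__dataset` suffix pattern is taken from the description above.

```python
def merge_metadata(tables):
    """Merge per-dataset metadata given as {dataset: {column: {sample_id: value}}}.

    Columns whose values agree in every dataset that has them are kept once;
    conflicting columns are kept separately as '<column>__<dataset>'.
    """
    merged = {}
    columns = sorted({col for table in tables.values() for col in table})
    for col in columns:
        sources = {name: table[col] for name, table in tables.items() if col in table}
        values = list(sources.values())
        if all(v == values[0] for v in values):
            merged[col] = values[0]                # identical everywhere: deduplicate
        else:
            for name, v in sources.items():
                merged[f"{col}__{name}"] = v       # conflict: one suffixed column each
    return merged


tables = {
    "dataset1": {
        "condition": {"s1": "treated", "s2": "control"},
        "batch": {"s1": "A", "s2": "A"},
    },
    "dataset2": {
        "condition": {"s1": "treated", "s2": "treated"},  # disagrees for s2
        "batch": {"s1": "A", "s2": "A"},                  # identical to dataset1
    },
}

merged = merge_metadata(tables)
print(sorted(merged))
```

With this input, `batch` appears once, while `condition` is split into `condition__dataset1` and `condition__dataset2`.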
Saving and reloading analyses¶
After a run completes, click Save in the results header to store the analysis with an optional name. Saved (and recent unsaved) analyses appear in the Load recent analysis dropdown above each section's results panel, marked with ★ if saved.
Example: CCLE paired analysis¶
The built-in CCLE demo includes bulk metabolomics and pseudo-bulk scRNA-seq for 188 matched cancer cell lines. To explore:
- Open the CCLE project from the home page
- Go to Multi-Omics
- Confirm that both datasets are linked by cell line name (this is detected automatically)
- Run MOFA+ to identify transcriptomic–metabolomic co-variation factors
- Use Color by → primary_disease in Plot Options to see cancer-type structure
Example: microbiome–metabolome studies¶
All 14 paired microbiome–metabolome demo studies have microbiome and metabolomics datasets under the same project, linked by sample ID. Use the Combined Network view to find metabolites that associate with specific bacterial genera.