Data Formats

File types

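GrAndMA accepts CSV (comma-separated) files. TSV (tab-separated) files are not currently supported; convert them before uploading, either in Excel or with pandas, e.g. pandas.read_csv(..., sep="\t").to_csv(..., index=False). Note that read_csv assumes commas unless sep is set.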
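The conversion mentioned above can be sketched in a few lines of pandas. This is a minimal example, not part of GrAndMA: the helper name tsv_to_csv and the sample data are made up for illustration. Reading every cell as text (dtype=str, keep_default_na=False) is a choice made here so that the values are written back exactly as they appeared.

```python
import io
import pandas as pd

def tsv_to_csv(tsv_source, csv_target):
    """Read a tab-separated table and write it back comma-separated (hypothetical helper)."""
    # sep="\t" is required because read_csv assumes commas by default.
    # dtype=str and keep_default_na=False keep every cell as the text it was.
    df = pd.read_csv(tsv_source, sep="\t", dtype=str, keep_default_na=False)
    # index=False stops pandas from adding its own row numbers as an extra column.
    df.to_csv(csv_target, index=False)

# Illustrative data only; in practice pass file paths instead of StringIO objects.
tsv_text = "SampleID\tglucose\tlactate\nS001\t10.4\t7.5\nS002\t10.8\t8.1\n"
out = io.StringIO()
tsv_to_csv(io.StringIO(tsv_text), out)
csv_text = out.getvalue()
print(csv_text)
```

Both arguments also accept ordinary file paths, so the same helper works on files on disk.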

The matrix file

The matrix is the core data file. It must be in samples × features orientation (one row per sample, one column per feature) unless you specify otherwise during configuration.

SampleID,glucose,lactate,citrate,alanine
S001,10.4,7.5,12.6,11.5
S002,10.8,8.1,11.4,11.7
S003,9.9,6.8,13.1,10.2
  • The first row must be a header.
  • One column must contain unique sample identifiers. This can be the row index (specify __axis__ as the sample ID column) or a named column.
  • All other columns are treated as features and must be numeric.
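The three requirements above can be checked locally before uploading. The sketch below is an assumption about what such a pre-check might look like, not GrAndMA's own validation; check_matrix and the id_col parameter are hypothetical names, and it only covers the case where the sample IDs are in a named column.

```python
import io
import pandas as pd

def check_matrix(df, id_col="SampleID"):
    """Return a list of problems found in a samples x features matrix (hypothetical helper)."""
    if id_col not in df.columns:
        return [f"missing sample ID column {id_col!r}"]
    problems = []
    # Sample identifiers must be unique.
    dup = df[id_col][df[id_col].duplicated()].tolist()
    if dup:
        problems.append(f"duplicate sample IDs: {dup}")
    for col in df.columns.drop(id_col):
        # errors="coerce" turns unparseable cells into NaN; a cell that was
        # present before conversion but is NaN afterwards is non-numeric.
        converted = pd.to_numeric(df[col], errors="coerce")
        bad = df[col][converted.isna() & df[col].notna()].tolist()
        if bad:
            problems.append(f"non-numeric values in {col!r}: {bad}")
    return problems

good = pd.read_csv(io.StringIO(
    "SampleID,glucose,lactate\nS001,10.4,7.5\nS002,10.8,8.1\n"))
# Illustrative bad input: a repeated ID and a stray percent sign.
bad = pd.read_csv(io.StringIO(
    "SampleID,glucose,lactate\nS001,10.4,7.5%\nS001,10.8,8.1\n"))

print(check_matrix(good))
print(check_matrix(bad))
```

Empty cells and NA are not reported as non-numeric, since they are read as missing values rather than text.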

The sample metadata file

One row per sample. The values in the sample ID column must match the sample IDs used in the matrix.

SampleID,Group,TimePoint,Age,Sex
S001,Case,1,45,F
S002,Case,2,45,F
S003,Control,1,38,M
  • Columns can be categorical (Group, TimePoint, Treatment) or continuous (Age, BMI, Score).
  • Categorical columns are used for grouping in plots and differential abundance tests.
  • Continuous columns can be used as covariates in regression models.
  • A batch column can be included for batch correction.
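Because the sample IDs have to agree between the two files, it can help to compare them before uploading. The following is a rough sketch under the assumption that both files use a column named SampleID; compare_sample_ids is a hypothetical helper, and the tables are illustrative.

```python
import io
import pandas as pd

def compare_sample_ids(matrix, metadata, id_col="SampleID"):
    """Report sample IDs that appear in only one of the two tables (hypothetical helper)."""
    matrix_ids = set(matrix[id_col])
    meta_ids = set(metadata[id_col])
    return {
        "only_in_matrix": sorted(matrix_ids - meta_ids),
        "only_in_metadata": sorted(meta_ids - matrix_ids),
    }

matrix = pd.read_csv(io.StringIO(
    "SampleID,glucose\nS001,10.4\nS002,10.8\nS003,9.9\n"))
# Illustrative metadata with one mismatched ID (S004 instead of S003).
metadata = pd.read_csv(io.StringIO(
    "SampleID,Group\nS001,Case\nS002,Case\nS004,Control\n"))

report = compare_sample_ids(matrix, metadata)
print(report)
```

Two empty lists mean every sample in the matrix has a metadata row and vice versa.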

The feature metadata file

One row per feature. Feature IDs must match the column headers in the matrix.

feature_id,hmdb_id,pathway,annotation
glucose,HMDB0000122,Glycolysis,Glucose
lactate,HMDB0000190,Glycolysis,Lactic acid
citrate,HMDB0000094,TCA cycle,Citric acid

Feature metadata is optional but enriches network analysis and pathway mapping.
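A similar check applies to feature IDs. The sketch below lists matrix columns without a feature metadata row, and metadata rows without a matrix column. Whether GrAndMA tolerates partial coverage is not stated here, so treat the output as information rather than as errors; compare_feature_ids is a hypothetical name.

```python
import io
import pandas as pd

def compare_feature_ids(matrix_columns, feature_ids, id_col="SampleID"):
    """Compare matrix column headers with feature metadata IDs (hypothetical helper)."""
    matrix_features = [c for c in matrix_columns if c != id_col]
    in_matrix = set(matrix_features)
    annotated = set(feature_ids)
    return {
        "unannotated": [c for c in matrix_features if c not in annotated],
        "not_in_matrix": [f for f in feature_ids if f not in in_matrix],
    }

# nrows=0 reads only the header row of the matrix.
matrix_header = pd.read_csv(
    io.StringIO("SampleID,glucose,lactate,citrate,alanine\n"), nrows=0).columns
feature_meta = pd.read_csv(io.StringIO(
    "feature_id,pathway\nglucose,Glycolysis\nlactate,Glycolysis\ncitrate,TCA cycle\n"))

result = compare_feature_ids(list(matrix_header), list(feature_meta["feature_id"]))
print(result)
```

With the example files shown in this page, alanine is the only matrix column without a feature metadata row.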

Orientations

If your data has features as rows and samples as columns (e.g. MetaPhlAn output, some expression matrices), select features × samples orientation during configuration. GrAndMA will transpose it automatically.
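For reference, the effect of transposing can be reproduced with pandas. This is only an illustration of what a features × samples file looks like after it is flipped, not GrAndMA's internal code; the column name feature_id and the data are assumptions.

```python
import io
import pandas as pd

# Illustrative features x samples input: one row per feature, one column per sample.
features_by_samples = "feature_id,S001,S002\nglucose,10.4,10.8\nlactate,7.5,8.1\n"

# index_col=0 makes the feature names the row index so they become column headers after .T
df = pd.read_csv(io.StringIO(features_by_samples), index_col=0)
transposed = df.T
transposed.index.name = "SampleID"   # label the new first column
transposed.columns.name = None       # drop the leftover "feature_id" axis label

print(transposed.to_csv())
```

The result has one row per sample and one column per feature, matching the default orientation described for the matrix file.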

Missing values

Missing values should be represented as empty cells or NA. Do not use 0 to represent a missing measurement. Enter 0 only when the feature was actually measured and was genuinely not detected.
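To see how a file encodes missing values versus true zeros, the counts can be inspected with pandas, which by default reads both empty cells and NA as missing. This is a small sketch with made-up data; it shows pandas behaviour and makes no claim about how GrAndMA parses the file.

```python
import io
import pandas as pd

# Illustrative matrix: one empty cell, one NA, and one genuine zero.
csv_text = (
    "SampleID,glucose,lactate,citrate\n"
    "S001,10.4,,12.6\n"
    "S002,NA,8.1,0\n"
    "S003,9.9,6.8,13.1\n"
)
df = pd.read_csv(io.StringIO(csv_text), index_col="SampleID")

# Empty cells and NA are both parsed as NaN; the 0 stays a number.
missing_per_feature = df.isna().sum()
zeros_per_feature = (df == 0).sum()

print(missing_per_feature)
print(zeros_per_feature)
```

A feature with an unexpectedly high zero count is worth a second look, since zeros may have been used as placeholders for missing measurements.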

Common pitfalls

| Problem | Fix |
|---|---|
| First column unnamed in Excel export | Add a column header (e.g. SampleID), or set index_label="SampleID" in to_csv when exporting from pandas (index_col=0 is the matching read_csv option for loading such a file) |
| Extra blank rows at end of file | Remove in Excel or with dropna(how='all') before export |
| Mixed numeric and text in a feature column | Check for stray characters (%, units, notes) in the matrix |
| Sample IDs don't match between matrix and metadata | Check for trailing spaces, case differences, or different ID formats |
| BOM character at start of file from Excel | Save as "CSV UTF-8 (comma delimited)" not "CSV (comma delimited)" |
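Several of these fixes can also be applied in pandas before uploading. The sketch below handles a byte-order mark, trailing spaces in sample IDs, and an all-empty trailing row. It assumes the ID column is called SampleID, and the input bytes are fabricated for illustration.

```python
import io
import pandas as pd

# Illustrative file content: "utf-8-sig" encoding prepends a BOM, as some Excel exports do.
raw_bytes = "SampleID,glucose\nS001 ,10.4\nS002,10.8\n,\n".encode("utf-8-sig")

# encoding="utf-8-sig" strips the BOM so the first header is exactly "SampleID".
df = pd.read_csv(io.BytesIO(raw_bytes), encoding="utf-8-sig", dtype={"SampleID": str})

# Drop rows in which every cell is empty (the trailing "," line above).
clean = df.dropna(how="all").copy()
# Remove leading/trailing whitespace that would break ID matching.
clean["SampleID"] = clean["SampleID"].str.strip()

print(clean.to_csv(index=False))
```

Case differences in IDs are deliberately left alone here, since silently changing case could merge samples that are meant to be distinct.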