This repository contains code for a study evaluating whether gut microbiome profiles can improve prediction of infection and mortality in critically ill ICU patients. Data were sourced with the help of Dr. Daniel Freedberg at CUMC.

korem-lab/ICU_Analysis

Project Overview

This repository contains code and results for the analysis of gut microbiome profiles and their ability to predict ICU outcomes, including infection and mortality. The analysis includes two cohorts of ICU patients profiled with 16S rRNA sequencing, combined with clinical metadata.

Summary of Findings

Microbiome-based models did not consistently improve prediction of ICU-acquired infection or mortality compared to standard clinical scores (e.g., SOFA). External validation revealed limited generalizability, with predictive performance varying by outcome, cohort, and time horizon. These results suggest that, in their current form, gut microbiome features offer limited incremental value for clinical risk prediction in the ICU.

Reproduction Steps

  1. Access Raw Data
  • Original cohort: /manitou/pmg/projects/korem_lab/Data/Freedberg_inulin_trial/
  • Validation cohort: /manitou/pmg/projects/korem_lab/Data/Freedberg_inulin_trial/validation_data/
  2. Preprocess 16S Sequencing Data (on manitou)

Follow the first two steps from the pipeline in /burg/pmg/users/se2481/scripts/16S_pipeline/README.md:

(a) Human genome filtering

Remove human reads with the MMMBP pipeline:

pybatch run_mmmbp.py

The output directory tmp/HGF2 contains the filtered FASTQ files and a df_path table of human-read counts.

(b) Primer trimming

conda activate shared
python /burg/pmg/users/se2481/scripts/16S_pipeline/trim_primers.py \
  --reads /manitou/pmg/users/mc5672/orig_data/hgf2_filtered/tmp/HGF2 \
  --fwd CCTACGGGNGGCWGCAG \
  --rev GACTACHVGGGTATCTAATCC \
  --batch 20 \
  --out /manitou/pmg/users/mc5672/orig_data/primer_trimmed \
  --exclude m014,m015 \
  --paired
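
The --fwd and --rev primers contain degenerate IUPAC bases (N, W, H, V), so a single primer sequence matches several concrete sequences. The sketch below illustrates that matching rule only; it is not how trim_primers.py is implemented, which this README does not show. The read sequence and the helper name primer_to_regex are made up for illustration.

```python
import re

# Standard IUPAC nucleotide codes; only the codes that occur in the
# two primers above are listed here.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]",    # weak: A or T
    "H": "[ACT]",   # not G
    "V": "[ACG]",   # not T
    "N": "[ACGT]",  # any base
}


def primer_to_regex(primer):
    """Translate a degenerate primer into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer))


fwd = primer_to_regex("CCTACGGGNGGCWGCAG")

# Hypothetical read that starts with one concrete variant of the primer.
read = "CCTACGGGAGGCAGCAGTGGGGAATATTG"
match = fwd.match(read)
trimmed = read[match.end():] if match else read
print(trimmed)
```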

Verify trimming:

cat *.log | grep 'with adapter'

The percentage of reads with an adapter should be high (e.g., 98-99%).
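
The same check can be scripted so that low trimming rates are flagged automatically. This is a minimal sketch that assumes the logs contain cutadapt-style summary lines such as "Read 1 with adapter: 49,250 (98.5%)"; the README does not state which tool writes the logs, so adjust the pattern if the format differs. The function names and the sample log text are hypothetical, and the 98% threshold is taken from the range quoted above.

```python
import re

# Assumed log format: "... with adapter(s):   <count> (<percent>%)".
ADAPTER_LINE = re.compile(r"with adapters?:\s+[\d,]+\s+\(([\d.]+)%\)")


def adapter_percentages(log_text):
    """Return every 'with adapter' percentage found in the log text."""
    return [float(p) for p in ADAPTER_LINE.findall(log_text)]


def low_trim_rates(log_text, threshold=98.0):
    """Return the percentages that fall below the expected trimming rate."""
    return [p for p in adapter_percentages(log_text) if p < threshold]


# Made-up log excerpt for illustration.
sample_log = """\
  Read 1 with adapter:                  49,250 (98.5%)
  Read 2 with adapter:                  45,000 (90.0%)
"""
print(low_trim_rates(sample_log))
```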

  3. (Optional) Transfer Trimmed Files to Local Machine (and gunzip)

  4. Process Data

(a) DADA2 + Taxonomy + SCRuB

Run Data_Processing.ipynb

This notebook:

  • Performs denoising with DADA2
  • Assigns taxonomy
  • Removes contaminants using SCRuB
  • Computes α- and β-diversity

(b) Enrich with Metadata

Run Data_Enriching.ipynb

This joins ASV tables with clinical reference data, performs CLR-transformation, and adds derived features (e.g., SOFA scores, infection timing).
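
The CLR (centered log-ratio) transformation mentioned above can be sketched as follows: each count is log-transformed and the mean log value of the sample is subtracted, so the transformed values of a sample sum to zero. Adding a pseudocount of 1 to handle zero counts is an assumption here; the notebook may treat zeros differently.

```python
import math


def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform of one sample's ASV counts.

    The pseudocount (an assumed value) avoids taking log(0).
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Made-up counts for one sample.
transformed = clr([10, 0, 5, 85])
print(transformed)
```

Because CLR values are relative to the sample's own geometric mean, they are comparable across samples with different sequencing depths.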

  5. Run Predictive Models

Choose one of the models in prediction_models/, e.g.:

pybatch Death_Next_7_SOFA.py

Model Naming Convention: Models are named using the format {target}_{timepoint}_{features}, as in Death_Next_7_SOFA.py above:

  • target is the outcome (e.g., infection, death).
  • timepoint is the prediction window: 0 for ICU admission samples only, any for samples from the full ICU stay, and next_7 or next_10 for events within 7 or 10 days of sample collection.
  • features is the input data: asv, sofa, or sofaasv for the two combined.
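
A small helper can split a script name into these three fields. This sketch assumes the fields are joined by underscores, as the example Death_Next_7_SOFA.py suggests, and that the feature label is always a single token (e.g., sofaasv). The function name parse_model_name is hypothetical and not part of the repository.

```python
def parse_model_name(filename):
    """Split a model script name into target, timepoint, and features."""
    stem = filename[:-3] if filename.endswith(".py") else filename
    parts = stem.lower().split("_")
    if len(parts) < 3:
        raise ValueError(f"unexpected model name: {filename!r}")
    return {
        "target": parts[0],
        # The timepoint may itself contain an underscore (e.g., next_7).
        "timepoint": "_".join(parts[1:-1]),
        "features": parts[-1],
    }


print(parse_model_name("Death_Next_7_SOFA.py"))
```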

  6. Evaluate and Plot Results

To evaluate model AUROCs and plot ROC curves, run Evaluate_Model.ipynb

(Helpful utility) To generate plots for all models in prediction_models/, run: Generate_Plots.ipynb
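
As a reminder of what the AUROC measures, the sketch below computes it directly from its definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. It is a dependency-free illustration, not the code in Evaluate_Model.ipynb; in practice a library routine such as scikit-learn's roc_auc_score returns the same quantity. The labels and scores are made up.

```python
def auroc(labels, scores):
    """AUROC from pairwise comparisons of positive and negative scores."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")
    # A positive outranking a negative counts 1, a tie counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical outcomes (1 = event) and model scores.
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise loop is quadratic in the number of samples, which is fine for a sanity check but slower than rank-based implementations on large cohorts.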
