Running CellFie in Matlab
- A Matlab version newer than R2014b
Download CellFie from the command line
# in a bash shell
git clone https://github.com/LewisLabUCSD/CellFie.git <desired path to cellfie>/CellFie
Add CellFie to the Matlab path
% in matlab
cd PathToCellfie % Example - cd C:\User\CellFie-master
addpath(genpath('PathToCellfie')) % Example - addpath(genpath('C:\User\CellFie-master'))
%% Load an example dataset (expression matrix: entrez ids x samples)
load 'test/suite/dataTest.mat'
%% Define the number of samples (equal to the number of columns of the expression matrix)
SampleNumber=3;
%% Define the reference genome-scale model you want to use (the available models are listed in test/suite)
ref='MT_recon_2_2_entrez.mat';
%% Define the thresholding parameters of the method
param.ThreshType='local';
param.LocalThresholdType='minmaxmean';
param.percentile_or_value='percentile';
param.percentile_low=25;
param.percentile_high=75;
[score, score_binary ,taskInfos, detailScoring]=CellFie(data,SampleNumber,ref,param);
You should obtain the following results for the score, the score_binary and the taskInfos
USAGE:
[score, score_binary ,taskInfos, detailScoring]=CellFie(data,SampleNumber,ref,param)
INPUTS:
- data.gene - cell array containing GeneIDs in the same format as model.genes
- data.value - mRNA expression matrix (genes x samples) associated with each gene listed in data.gene
- SampleNumber - Number of samples
- ref - Reference model used to compute the metabolic task scores (e.g., 'MT_recon_2_2_entrez.mat')
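As an illustration of the required inputs, here is a minimal sketch of how the `data` structure could be assembled by hand. The gene IDs and expression values are invented for the example, and storing the IDs as character vectors is an assumption; they should follow whatever format `model.genes` uses in the chosen reference model.

```matlab
% Hypothetical toy dataset: 4 genes x 3 samples (all values are made up)
data.gene  = {'10'; '100'; '1000'; '10000'};   % gene IDs (format assumed, must match model.genes)
data.value = [ 5.1  0.0 12.3;
               2.0  2.5  1.8;
               0.4  9.9  7.2;
              30.0 28.5 31.1];

% SampleNumber should equal the number of columns of data.value
SampleNumber = size(data.value, 2);

% Basic consistency checks before calling CellFie
assert(iscell(data.gene), 'data.gene should be a cell array');
assert(numel(data.gene) == size(data.value, 1), ...
       'data.value needs one row per entry of data.gene');
```

Deriving `SampleNumber` from the matrix rather than typing it in avoids a mismatch between the two.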
OPTIONAL INPUTS:
- param.ThreshType - Type of thresholding approach used, either 'global' or 'local' (default - 'local')
Parameters related to the use of a GLOBAL thresholding approach - the threshold value is the same for all the genes
- param.percentile_or_value - the threshold can be defined using a value provided by the user ('value') or based on a percentile of the distribution of expression values for all the genes and across all samples of your dataset ('percentile')
- param.percentile - percentile of the distribution of expression values for all the genes and across all samples that will be used to define the threshold value
- param.value - expression value used as the threshold to decide whether a gene is considered active or not (e.g., 5)
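The two ways of setting a global threshold can be sketched as follows. This is only an illustration of the idea, not CellFie's internal code: the expression matrix is invented, the percentile is computed with a simple nearest-rank rule (CellFie may use a different percentile definition), and the strict `>` comparison is an assumption.

```matlab
% Toy expression matrix (genes x samples); values are made up
value = [ 1  2  3;
          4  5  6;
          7  8  9;
         10 11 12];

% Option 1: user-defined value
param.ThreshType = 'global';
param.percentile_or_value = 'value';
param.value = 5;
thr_value = param.value;

% Option 2: percentile of ALL expression values (all genes, all samples),
% computed here with a nearest-rank rule (an assumption for this sketch)
p = 50;
s = sort(value(:));
thr_pct = s(max(1, ceil(p/100 * numel(s))));

% With a global threshold, every gene is compared to the same number
active = value > thr_value;
```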
Parameters related to the use of a LOCAL thresholding approach - each gene has its own threshold value
- param.percentile_or_value - the threshold can be defined using a value provided by the user ('value') or based on a percentile of the distribution of expression values of a specific gene across all samples of your dataset ('percentile', default)
- param.LocalThresholdType - type of local thresholding approach to use:
  - 'minmaxmean' (default) - the threshold for a gene is the mean of the expression values observed for that gene across all the samples, tissues, or conditions, BUT the threshold (i) must be higher than or equal to a lower bound and (ii) must be lower than or equal to an upper bound
  - 'mean' - the threshold for a gene is the mean expression value of this gene across all the samples, tissues, or conditions
- param.percentile_low - lower percentile used to define which genes are always inactive when using the 'minmaxmean' local thresholding approach (default = 25)
- param.percentile_high - upper percentile used to define which genes are always active when using the 'minmaxmean' local thresholding approach (default = 75)
- param.value_low - lower expression value used to define which genes are always inactive when using the 'minmaxmean' local thresholding approach (e.g., 5)
- param.value_high - upper expression value used to define which genes are always active when using the 'minmaxmean' local thresholding approach (e.g., 5)
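The 'minmaxmean' rule described above amounts to clamping each gene's mean expression between a lower and an upper bound. A minimal sketch of that idea, with an invented matrix and hypothetical bounds standing in for param.value_low and param.value_high (this is not CellFie's own implementation):

```matlab
% Toy expression matrix (genes x samples); values are made up
value = [ 1  1  1;
          4  6  8;
         20 22 24];

% Hypothetical bounds, playing the role of param.value_low / param.value_high
low  = 3;
high = 10;

% Per-gene mean across samples
gene_mean = mean(value, 2);

% 'minmaxmean': the mean, forced to stay within [low, high]
thr_minmaxmean = min(max(gene_mean, low), high);

% 'mean': the plain per-gene mean, with no bounds applied
thr_mean = gene_mean;
```

In this toy case the first gene's threshold is raised to the lower bound, the third gene's threshold is capped at the upper bound, and the second gene keeps its own mean.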
OUTPUTS:
- score - relative quantification of the activity of a metabolic task in a specific condition based on the availability of data for multiple conditions
- score_binary - binary version of the metabolic task score to determine whether a task is active or inactive in specific conditions
- taskInfos - Description of the metabolic tasks assessed
- detailScoring - Matrix detailing the scoring:
  - 1st column = sample ID
  - 2nd column = task ID
  - 3rd column = task score for this sample
  - 4th column = task score in binary version for this sample
  - 5th column = essential reaction associated with this task
  - 6th column = expression score associated with the reaction listed in the 5th column
  - 7th column = gene used to determine the expression of the reaction listed in the 5th column
  - 8th column = original expression value of the gene listed in the 7th column
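To show how the eight columns might be used, here is a small sketch that pulls out all rows belonging to one task. It works on a hand-made stand-in rather than real CellFie output: the cell-array type, the one-row-per-entry layout, and every ID and value below are assumptions for illustration only.

```matlab
% Hypothetical stand-in for detailScoring (cell array assumed; all values invented)
detail = { 1, 12, 0.8, 1, 'RXN_A', 4.2, '1234', 10.5;
           2, 12, 0.1, 0, 'RXN_A', 0.3, '1234',  0.9;
           1, 40, 0.5, 1, 'RXN_B', 2.0, '5678',  6.1 };

% Column positions, following the column description of detailScoring
COL_SAMPLE = 1;
COL_TASK   = 2;
COL_RXN    = 5;

% Select every row that belongs to a given task ID
taskID   = 12;
rows     = cell2mat(detail(:, COL_TASK)) == taskID;
taskRows = detail(rows, :);

% Samples and essential reactions contributing to that task
samples   = cell2mat(taskRows(:, COL_SAMPLE));
reactions = unique(taskRows(:, COL_RXN));
```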