Incorrect evaluation workflow for essential genes in estimateEssentialGenes
#970
Comments
I've noticed that currently, the gene shortNames in the model and the function
idMapping = [model.genes, model.geneShortNames];
[grRules,genes,rxnGeneMat] = replaceGrRules(model.grRules,idMapping);
model.grRules = grRules;
model.genes = genes;
model.rxnGeneMat = rxnGeneMat;
Currently, one
These examples could lead to errors in the essentiality assessment of some genes in the model, making the overall evaluation of essential genes inaccurate.
I believe rather than determining which I have organized and prepared the conversion results of the Hart2015 experimental dataset. You can refer to here:
I have already organized the results of the two methods. First is about changing
Second is about keeping the
Obviously, the second result is superior to the first one. This indicates that the gene
@johan-gson Johan, could you check this? Thanks!
Hi Jiahao, Nice work, I'm sure you are right, thank you for a thorough investigation! I have been away from this for quite some time, and actually never worked with the Hart dataset. @feiranl, is there any way we can place this new file somewhere where it is accessible and update any code/docs to refer to this file instead? I also recommend looking at the DepMap data, there are many more cell lines there.
Current behavior:
Recently, I have been working on evaluating essential genes. I've found that there are issues with the current evaluation workflow (also in the auto-tasks on GitHub) in estimateEssentialGenes. The output context-specific models were very strange, with only a small amount of content.
Further investigation revealed that the cause is the fourth parameter of the estimateEssentialGenes function, useGeneSymbol, which defaults to true and converts the genes in the template model into geneSymbol format. However, the genes in the Hart2015_RNAseq.txt data are in the 'ENSG0000' format, so no genes match and no gene expression is detected by default.
So I manually changed the fourth parameter to false, and the content of the resulting model was much more normal. However, the essential gene evaluation then turned out to be all zeros, because the genes in Hart2015_TableS2.xlsx (the experimental results) are in geneSymbol format. So, I believe that after the model is generated, all genes in the model (including the template model) need to be converted into geneSymbol format before performing the essential gene evaluation.
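The two mismatches described above can be illustrated with a small overlap check. This is only a sketch with made-up data, not the repository's code: the gene IDs and symbols are hypothetical, and the variables exprGenes and essentialGenes merely stand in for the gene columns of Hart2015_RNAseq.txt and Hart2015_TableS2.xlsx (reading those files is omitted).

```matlab
% Toy model genes (hypothetical): Ensembl-style IDs with matching symbols.
modelGenes      = {'ENSG00000000001'; 'ENSG00000000002'; 'ENSG00000000003'};
modelShortNames = {'GENEA'; 'GENEB'; 'GENEC'};

% Hypothetical stand-ins for the gene columns of the two data files:
% expression data keyed by Ensembl-style IDs, essentiality data by symbols.
exprGenes      = {'ENSG00000000001'; 'ENSG00000000003'; 'ENSG00000000009'};
essentialGenes = {'GENEA'; 'GENEC'; 'GENEZ'};

% Fraction of the model's gene identifiers that occur in a data set.
overlapFrac = @(a, b) nnz(ismember(a, b)) / numel(a);

exprOverlapIds     = overlapFrac(modelGenes, exprGenes);
exprOverlapSymbols = overlapFrac(modelShortNames, exprGenes);
essOverlapIds      = overlapFrac(modelGenes, essentialGenes);
essOverlapSymbols  = overlapFrac(modelShortNames, essentialGenes);

fprintf('Expression data vs model IDs:      %.2f\n', exprOverlapIds);
fprintf('Expression data vs model symbols:  %.2f\n', exprOverlapSymbols);
fprintf('Essentiality data vs model IDs:    %.2f\n', essOverlapIds);
fprintf('Essentiality data vs model symbols: %.2f\n', essOverlapSymbols);
```

In this toy setup, an overlap of zero in one of the four comparisons corresponds to the symptoms reported here: a nearly empty model when symbols are compared against Ensembl-keyed expression data, and an all-zero evaluation when Ensembl IDs are compared against symbol-keyed essentiality data. A check of this kind before each step would make the mismatch visible early.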