Post by DAVID user on Jul 22, 2020 11:09:08 GMT -5
To whom it may concern:
I have two questions regarding the analysis of my RNA-seq result that I hope you can help me with.
1. I know that pathway enrichment analysis can be done either by separating the up- and down-regulated genes or by keeping them together, and there are publications showing that separating the two may be better in many instances. I get a "better" result when I keep them together. Is there a reason why that is not permissible? Thank you!
Specifically, when I keep the up-regulated and down-regulated genes together, I get 3 ECM molecules and one ECM receptor. Two of them are up-regulated and two are down-regulated, in a way that makes sense. The fact that the ECM-receptor pathway is affected also makes sense in the context of my scientific question.
2. I need help interpreting my pathway enrichment analysis result from DAVID. One thing I did find online was that DAVID reports FDR values that are the actual value*100. Are the Bonferroni and Benjamini values also modified (e.g. *10)?
Specifically, the FDR value below is 0.03 (3.13/100), which means it is a statistically significant result. Do Bonferroni and Benjamini agree with this (if they are modified), and if not, what is the logic of preferring one statistical value over the other?
Thank you very much!
Last Edit: Jul 24, 2020 15:25:37 GMT -5 by DAVID user
Post by DAVID user on Jul 22, 2020 11:11:26 GMT -5
1) We generally look at it both ways (together and separate). By keeping the up/down genes together, for the terms that you mentioned, you are finding that the significantly differentially regulated genes from your experiment are enriched for these pathways more than would be expected by chance alone. That enrichment calculation is based on four numbers: the number of genes (Count) in your list that are annotated to the specific pathway (mmu04512:ECM-receptor interaction), the number of genes (List Total) in your list that are annotated to any pathway in KEGG, the number of genes (Pop Hits) in the background (i.e. all mouse genes, microarray, etc.) that are annotated to the specific pathway, and the number of genes (Pop Total) in the background that are annotated to any pathway in KEGG. These four numbers make up the 2x2 contingency table used for the modified Fisher Exact score (p-value). When you separate up/down-regulated genes, you reduce the Count and List Total values in this table but not the background numbers, thereby changing the p-value. There could be some biological relevance to both the up- and down-regulated genes being involved in a specific pathway, but that depends on the specific experiment, the specific pathway, and the role those genes play in that pathway.
2) FDR is multiplied by 100, but the Bonferroni and Benjamini values are not. We feel that the p-value, and even more so the multiple test corrections (because these corrections are conservative), should be used in conjunction with your biological knowledge and not as a hard cutoff. As DAVID is a discovery tool, if the enriched annotation makes sense for your study then it may be worth exploring further.
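To make point 1 more concrete, here is a rough Python sketch of how the four numbers (Count, List Total, Pop Hits, Pop Total) turn into a p-value, and why splitting the list changes it. This is not DAVID's code. It computes a plain one-sided Fisher exact p-value as a hypergeometric upper tail, and it approximates the "modified" score by removing one gene from the list hits before recomputing, so the output may not match DAVID exactly. The function names and all of the gene counts are made up for illustration.

```python
from math import comb

def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K are in the pathway."""
    if k <= 0:
        return 1.0
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)

def enrichment_pvalues(count, list_total, pop_hits, pop_total):
    """Return (plain Fisher p-value, more conservative 'count minus one' p-value)."""
    fisher_p = hypergeom_upper_tail(count, list_total, pop_hits, pop_total)
    # Assumed approximation of the modified score: drop one pathway gene
    # from the list (Count - 1, List Total - 1) and recompute.
    modified_p = hypergeom_upper_tail(count - 1, list_total - 1, pop_hits, pop_total)
    return fisher_p, modified_p

# Hypothetical numbers: up and down genes kept together.
together = enrichment_pvalues(count=4, list_total=150, pop_hits=85, pop_total=8000)

# Hypothetical numbers: only the up-regulated half. Count and List Total
# shrink, but Pop Hits and Pop Total (the background) stay the same.
up_only = enrichment_pvalues(count=2, list_total=75, pop_hits=85, pop_total=8000)

print("together:", together)
print("up only: ", up_only)
```

Running it with your own Count, List Total, Pop Hits and Pop Total from the DAVID chart shows how much of the difference between the combined and separated analyses comes simply from the smaller list.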
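For point 2, the following Python sketch shows the textbook Bonferroni and Benjamini-Hochberg adjustments on a list of raw p-values, plus the conversion of a percent-style FDR such as 3.13 back to a proportion. It is only meant to illustrate why Bonferroni is the more conservative of the two. The p-values are invented, the function names are hypothetical, and DAVID's own columns (in particular its FDR column) may be computed differently.

```python
def bonferroni(pvals):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]

def benjamini_hochberg(pvals):
    """Standard Benjamini-Hochberg step-up adjusted p-values."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is
    # never larger than the one ranked above it.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def percent_to_proportion(fdr_percent):
    """Convert a value reported as a percentage (e.g. 3.13) to a proportion."""
    return fdr_percent / 100.0

raw = [0.01, 0.04, 0.03, 0.5]  # made-up p-values for four pathway terms
print("Bonferroni:", bonferroni(raw))
print("Benjamini: ", benjamini_hochberg(raw))
print("FDR 3.13 as a proportion:", percent_to_proportion(3.13))
```

In this toy example the Bonferroni values are never smaller than the Benjamini-Hochberg ones, which is the sense in which it is the stricter correction.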