Post by Priyali on Jun 30, 2023 22:44:34 GMT -5
Hi
I am having issues getting DAVID to recognise my genes of interest.
Some background: I have a newly annotated Citrobacter rodentium genome. Because it was annotated recently (using software called Bakta), it contains annotations that are not present in the existing C. rodentium genome annotation. Many of the gene names have been pulled from E. coli K12. In an attempt to make the IDs as uniform as possible, I used UniProt ID mapping on all of my genes of interest and have tried inputting the results into DAVID.
As I understand DAVID, this is where I think the problems lie:
- DAVID does not recognise the longer (10-character) UniProt accession numbers. There are no alternative IDs for these particular genes.
- Not all of my genes have a recognisable gene name, so DAVID cannot convert them this way.
- Not all of my accession numbers appear to be in the same format (e.g. A0A060VEK4 vs D2TGJ)
- I have had to use gene IDs from different organisms as not all are annotated within C. rodentium ICC168. This means that DAVID cannot convert them all at once.
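To make the format issue above concrete, here is a minimal sketch (not part of DAVID) that sorts a mixed ID list into 6-character UniProt accessions, 10-character UniProt accessions, and everything else, so each group could be checked or submitted separately. The regular expression follows UniProt's published accession pattern; the function names `classify_id` and `group_ids` are made up for illustration, and whether DAVID accepts any particular group is an assumption that still has to be tested.

```python
import re

# UniProt's published accession pattern: 6 or 10 characters.
UNIPROT_ACCESSION = re.compile(
    r"^(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})$"
)


def classify_id(identifier):
    """Return 'uniprot_6', 'uniprot_10' or 'other' for one identifier."""
    token = identifier.strip()
    if UNIPROT_ACCESSION.match(token):
        return "uniprot_6" if len(token) == 6 else "uniprot_10"
    # Gene names, locus tags, truncated accessions, etc. end up here.
    return "other"


def group_ids(identifiers):
    """Group identifiers by format (hypothetical helper, not a DAVID API)."""
    groups = {"uniprot_6": [], "uniprot_10": [], "other": []}
    for identifier in identifiers:
        groups[classify_id(identifier)].append(identifier.strip())
    return groups


if __name__ == "__main__":
    # The two example IDs from the list above.
    for name, members in group_ids(["A0A060VEK4", "D2TGJ"]).items():
        print(name, members)
```

With the two example IDs, A0A060VEK4 lands in the 10-character group, while D2TGJ (only five characters as written) falls into "other", which suggests it may have been truncated somewhere along the way.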
As far as I can tell, these issues mean that my entire gene list cannot be submitted together, which in turn means that the genes cannot be analysed together. If I use only the genes that can be submitted, I will get a limited output that will not give me an accurate overview of the pathways the genes are part of.
Can anyone tell me whether I have understood DAVID correctly and, if so, whether there are solutions that would allow me to analyse my entire gene set?
Many thanks