haotu : an open lab notebook

2011/09/20

KEGG in R

Filed under: Bioconductor, Genomics, R, RNAseq — S @ 14:37

I was dealing with a data set containing a list of unigenes from a de novo assembly analysis that had been BLASTed against the KEGG database (it was a bioinformatics database sent from BGI Shenzhen, built on RNAseq data). I had a list of KEGG class ids (e.g., ko00300), each associated with a pathway (e.g., Lysine biosynthesis), and for each pathway I knew which enzymes matched at least one of the unigenes (e.g., K00291, K00290).

I used the KEGG.db and KEGGSOAP packages from Bioconductor to parse this database. For example, the following code lists all the enzymes, by KEGG id, that are in the ko00300 reference pathway:

library(KEGGSOAP)  # provides get.ko.by.ko.class()
unlist(sapply(get.ko.by.ko.class(substring("ko00300",3)),substring,first=4))
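To make the two substring() steps in the call above easier to follow, here is a minimal offline sketch that needs no KEGG connection. The raw_ids list is a hypothetical stand-in for what get.ko.by.ko.class() returns; I am assuming each entry carries a three-character "ko:" prefix, which is what first=4 would be stripping. The names class_id, raw_ids and ko_ids are mine, for illustration only.

```r
# Step 1: drop the two-letter "ko" prefix from the class id,
# keeping everything from the 3rd character onward.
class_id <- substring("ko00300", 3)

# Hypothetical stand-in for the result of get.ko.by.ko.class(class_id).
# ASSUMPTION: entries look like "ko:Kxxxxx"; the real return shape may differ.
raw_ids <- list("ko:K00290", "ko:K00291")

# Step 2: strip the assumed "ko:" prefix from each entry by keeping
# everything from the 4th character onward, then flatten to a vector.
ko_ids <- unlist(sapply(raw_ids, substring, first = 4))

print(class_id)
print(ko_ids)
```

The resulting vector of K numbers can then be compared directly against the enzyme ids that matched the unigenes, for example with ko_ids %in% c("K00291", "K00290").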

 
