GOseq – a new method for Gene Ontology analysis of RNA-seq data

Until recently, microarrays were the method of choice for transcriptional profiling.  The advent of next-generation sequencing technologies, however, has seen the rise of direct sequencing of mRNA (RNA-seq) as a new method for such profiling.  In a recent publication in Genome Biology, Alicia Oshlack and colleagues at the Walter and Eliza Hall Institute in Melbourne, Australia, describe a new method for performing Gene Ontology analysis of RNA-seq data, called GOseq.
GOseq identifies whether transcripts associated with specific biological processes are over-represented in a given transcriptional profile. Until now, the statistical methods used for this kind of analysis of RNA-seq data have generally been modifications of methods developed for microarray data.  Oshlack and colleagues, however, show that statistical methods are not interchangeable between the two techniques; in particular, there is a bias inherent in RNA-seq data whereby long or highly expressed transcripts are more likely to be called as differentially expressed than short or less highly expressed ones.  The GOseq algorithm takes this into account, correcting the bias and providing a more reliable readout.  As well as providing a useful new tool, the paper highlights the need for statistical analysis techniques tailored specifically to RNA-seq.
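To make the idea of bias correction concrete, here is a minimal Python sketch of one way a category test can account for transcript length. It is an illustration under stated assumptions, not the published GOseq implementation: the function names (`binned_weights`, `weighted_sample`, `enrichment_p_value`), the binned estimate of how likely a gene of a given length is to be called differentially expressed, and the resampling null are all hypothetical stand-ins, and the toy data are constructed by hand rather than taken from any experiment.

```python
import math
import random


def binned_weights(lengths, is_de, n_bins=5):
    """Estimate each gene's chance of being called DE from its length bin.

    Assumption: a crude stand-in for a smooth length-versus-DE trend.
    """
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    size = math.ceil(len(order) / n_bins)
    weights = [0.0] * len(lengths)
    for start in range(0, len(order), size):
        members = order[start:start + size]
        frac = sum(is_de[i] for i in members) / len(members)
        for i in members:
            weights[i] = max(frac, 1e-6)  # floor keeps every weight positive
    return weights


def weighted_sample(weights, k, rng):
    """Draw k distinct gene indices, favouring genes with larger weights.

    Each gene gets an exponential waiting time with rate equal to its
    weight; the k shortest times form a weighted sample without replacement.
    """
    keys = [(-math.log(1.0 - rng.random()) / w, i) for i, w in enumerate(weights)]
    keys.sort()
    return [i for _, i in keys[:k]]


def enrichment_p_value(in_category, is_de, weights, n_samples=2000, seed=0):
    """Resampling p-value for over-representation of a category among DE genes."""
    rng = random.Random(seed)
    n_de = sum(is_de)
    observed = sum(1 for c, d in zip(in_category, is_de) if c and d)
    hits = 0
    for _ in range(n_samples):
        drawn = weighted_sample(weights, n_de, rng)
        if sum(in_category[i] for i in drawn) >= observed:
            hits += 1
    return (hits + 1) / (n_samples + 1)


# Hypothetical toy data: 200 genes in 5 length bins of 40, where the share of
# genes called DE rises with length purely as a technical artefact.
n_genes = 200
lengths = [100 * (i + 1) for i in range(n_genes)]
de_per_bin = [2, 4, 8, 16, 32]
is_de = [(i % 40) < de_per_bin[i // 40] for i in range(n_genes)]

# A category made up only of long genes, with no real link to DE status.
in_category = [i >= 120 and i % 2 == 0 for i in range(n_genes)]

p_uniform = enrichment_p_value(in_category, is_de, [1.0] * n_genes)
p_weighted = enrichment_p_value(in_category, is_de, binned_weights(lengths, is_de))
print("ignoring length:", p_uniform)
print("length-aware:   ", p_weighted)
```

In this toy setting the category contains only long genes, so a test that treats every gene as equally likely to be called differentially expressed flags it as strongly enriched, while the length-aware null gives a much larger p-value. That is the kind of false positive the correction is meant to avoid.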
Given the extent to which RNA-seq is being embraced by the genomics community, for example in defining alternative transcripts, this method is a welcome addition to a growing field.

Andrew Cosgrove

Andrew obtained his PhD in molecular biology from the University of Dundee in 2005. He joined Genome Biology in 2009 after a postdoctoral research position at the University of Sheffield investigating chromosome positioning during meiosis in yeast.