Description & Credits

Genie predictions are based on Affymetrix's Genie gene-finding software. Genie is a generalized HMM that accepts constraints based on mRNA and EST data.

Method

The MGSCv3 genome sequence was partitioned at large assembly gaps into 377 pieces averaging 7 Mb. mRNAs were aligned to the human genome using pslayout, and only very high stringency alignments were accepted. Alignments were merged and extended into a set of alternative transcripts (AltMerge). One alignment was chosen per gene. EST mate pairs that aligned with high stringency to a common genomic region and agreed in order and orientation were identified as putative transcribed regions ("clone bounds"). Gene regions were identified as the maximal overlapping region of AltMerge, clone bounds, and a statistical ab initio gene-finder ("Genie"). For each gene region, one or more transcripts were predicted by a generalized HMM gene-finder ("AltGenie"). The AltMerge transcript and the clone bounds were used as constraints on the gene-finder, such that scores for states were modified according to splice, exon, and clone bounds information.
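
The gene-region step can be illustrated with a minimal sketch. It assumes that "maximal overlapping region" means the union of all mutually overlapping intervals drawn from the three evidence sources, and that intervals are half-open coordinates on a single strand of one sequence piece. The function name, the coordinate convention, and the sample data are hypothetical and are not taken from the Genie software.

```python
from typing import List, Tuple

# Half-open interval [start, end) on one strand of one sequence piece (assumed convention).
Interval = Tuple[int, int]


def merge_gene_regions(*evidence: List[Interval]) -> List[Interval]:
    """Merge overlapping intervals from all evidence sources into maximal regions."""
    # Pool the intervals from every source and sort them by start coordinate.
    intervals = sorted(iv for source in evidence for iv in source)
    regions: List[Interval] = []
    for start, end in intervals:
        if regions and start < regions[-1][1]:
            # Overlaps the current region: extend its end if needed.
            regions[-1] = (regions[-1][0], max(regions[-1][1], end))
        else:
            # No overlap: start a new region.
            regions.append((start, end))
    return regions


# Hypothetical coordinates for the three evidence sources.
altmerge = [(100, 500), (2000, 2600)]
clone_bounds = [(450, 900)]
genie = [(850, 1200), (2500, 3000), (5000, 5400)]

regions = merge_gene_regions(altmerge, clone_bounds, genie)
print(regions)
```

In this sketch, intervals that only touch end to start are kept as separate regions; whether the original pipeline treated adjacency as overlap is not stated in the description above.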