Description
This track shows the locations of non-protein-coding RNA genes and
pseudogenes.
Feature types include:
- tRNA: Transfer RNA (or pseudogene)
- rRNA: Ribosomal RNA (or pseudogene)
- scRNA: Small cytoplasmic RNA (or pseudogene)
- snRNA: Small nuclear RNA (or pseudogene)
- snoRNA: Small nucleolar RNA (or pseudogene)
- miRNA: MicroRNA (or pseudogene)
- misc_RNA: Miscellaneous other RNA, such as Xist (or pseudogene)
Methods
Eddy-tRNAscanSE (tRNA genes, Sean Eddy):
tRNAscan-SE 1.23 with default parameters.
Score field contains tRNAscan-SE bit score; >20 is good, >50 is great.
Eddy-BLAST-tRNAlib (tRNA pseudogenes, Sean Eddy):
WUBLAST 2.0, with options "-kap wordmask=seg B=50000 W=8 cpus=1".
Score field contains % identity in BLAST-aligned region.
Used each of 602 tRNAs and pseudogenes predicted by tRNAscan-SE
in the human oo27 assembly as queries. Kept all nonoverlapping
regions that hit one or more of these with P <= 0.001.
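
The filtering step above can be sketched as follows. This is a minimal illustration, not the pipeline actually used: it assumes that "nonoverlapping regions" means the union of overlapping significant hits, and the function name, the tuple layout, and the half-open coordinates are all hypothetical.

```python
def nonoverlapping_regions(hits, max_p=0.001):
    """Collapse significant BLAST hits into nonoverlapping regions.

    hits: iterable of (chrom, start, end, p_value) tuples with half-open
    coordinates. This layout is an assumption made for the example.
    """
    # Keep only hits at or below the P-value cutoff, sorted by position.
    kept = sorted((c, s, e) for c, s, e, p in hits if p <= max_p)
    merged = []
    for chrom, start, end in kept:
        if merged and merged[-1][0] == chrom and start < merged[-1][2]:
            # Overlaps the previous region on the same chromosome: extend it.
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged
```

A region hit by several of the queries is reported once, which matches "hit one or more of these" in the text.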
Eddy-BLAST-snornalib (known snoRNAs and snoRNA pseudogenes, Steve Johnson):
WUBLASTN 2.0, with options "-V=25 -hspmax=5000 -kap wordmask=seg
B=5000 W=8 cpus=1".
Score field contains BLAST score.
Used each of 104 unique snoRNAs in snorna.lib as a query.
Any hit >=95% full length and >=90% identity is annotated as a
"true gene".
Any other hit with P <= 0.001 is annotated as a "related sequence"
and interpreted as a putative pseudogene.
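
The snoRNA annotation rule above can be written as a small decision function. This is a sketch under stated assumptions: "full length" is taken to mean aligned length as a percentage of query length, and the function and parameter names are hypothetical. Only the thresholds come from the text.

```python
def classify_snorna_hit(aligned_len, query_len, pct_identity, p_value):
    """Label one BLAST hit against a snoRNA query.

    Returns "true gene", "related sequence", or None (hit not annotated).
    """
    # Assumed definition of "% full length": aligned length over query length.
    coverage = 100.0 * aligned_len / query_len
    if coverage >= 95.0 and pct_identity >= 90.0:
        return "true gene"
    if p_value <= 0.001:
        # Interpreted in the text as a putative pseudogene.
        return "related sequence"
    return None
```

For example, a hit covering 96 of 100 query bases at 92% identity would be labeled a "true gene", while a half-length hit with P = 1e-5 would be a "related sequence".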
Eddy-BLAST-otherrnalib
(non-tRNA, non-snoRNA noncoding RNAs with GenBank entries
for the human gene):
WUBLASTN 2.0 [15 Apr 2002]
with options: "-kap -cpus=1 -wordmask=seg -W=8 -E=0.01 -hspmax=0
-B=50000 -Z=3000000000". Exceptions to this are:
- Large ncRNAs (LSU & SSU rRNA, H19, Xist):
"-W=8" is changed to "-W=11" and "-maskextra=50" is added.
Xist contains repetitive elements and was masked with
RepeatMasker, Library version 6.8.
- microRNAs:
"-kap -cpus=1 -S=70 -hspmax=0 -B=100" replaces all
above parameters.
The score field contains the BLASTN score.
Used 41 unique miRNAs and 29 other ncRNAs as queries.
Any hit >=95% full length and >=95% identity is annotated as a
"true gene".
Any other hit with P <= 0.001 and >= 65% identity is annotated
as a "related sequence". Exception: miRNAs, which are all 16-26 bp
sequences in GenBank, are annotated only if the hit is 100% full
length and 100% identity.
The miRNA queries consist of Let-7 from Pasquinelli et al.,
Nature (2000) 408:86, and 40 from Mourelatos et al., Genes & Dev (2002)
16:720.
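
The Eddy-BLAST-otherrnalib annotation rules, including the miRNA exception, can be sketched as below. The names are hypothetical, "full length" is assumed to mean aligned length relative to query length, and labeling an exact miRNA match as a "true gene" is also an assumption, since the text says only that such hits are annotated.

```python
def classify_other_rna_hit(aligned_len, query_len, pct_identity, p_value,
                           is_mirna=False):
    """Label one BLASTN hit against a miRNA or other ncRNA query.

    Returns "true gene", "related sequence", or None (hit not annotated).
    """
    if is_mirna:
        # miRNA queries are short (16-26 bp), so only exact,
        # full-length matches are annotated.
        if aligned_len >= query_len and pct_identity >= 100.0:
            return "true gene"  # label assumed; see lead-in
        return None
    # Assumed definition of "% full length": aligned length over query length.
    coverage = 100.0 * aligned_len / query_len
    if coverage >= 95.0 and pct_identity >= 95.0:
        return "true gene"
    if p_value <= 0.001 and pct_identity >= 65.0:
        return "related sequence"
    return None
```

Handling the miRNA case first keeps the short-query exception from falling through to the looser "related sequence" rule.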
Credits
These data were kindly provided by Sean Eddy at Washington University.