Description

This track was adapted from the Rat Homology data for Ensembl mouse, generated using a new genome-genome alignment method called PhusionBlast.

Methods

All unique 17mers from the mouse genome were used as probes into the recent rat assembly downloaded from BCM, using Phusion [Jim Mullikin and Zemin Ning, Genome Research, in press for Jan 2003 publication].
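The probing step can be illustrated with a minimal Python sketch. This is not the Phusion implementation: the function names `unique_kmers` and `probe_hits` are hypothetical, the sequences are held in memory as plain strings, and only the forward strand is considered. Handling of reverse complements and genome-scale memory use is left out.

```python
from collections import Counter


def unique_kmers(seq, k=17):
    """Return the set of k-mers that occur exactly once in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {kmer for kmer, n in counts.items() if n == 1}


def probe_hits(probes, target, k=17):
    """Map each probe found exactly once in target to its 0-based position."""
    target = target.upper()
    first_pos = {}      # probe -> position of its first occurrence in target
    repeated = set()    # probes seen more than once in target
    for i in range(len(target) - k + 1):
        kmer = target[i:i + k]
        if kmer not in probes:
            continue
        if kmer in first_pos:
            repeated.add(kmer)
        else:
            first_pos[kmer] = i
    return {kmer: pos for kmer, pos in first_pos.items() if kmer not in repeated}
```

With real data, `k` would stay at 17, `seq` would be a mouse sequence, and `target` a rat sequence. Probes that hit the target more than once are discarded, matching the "found once" criterion described in the text.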

There are 927,616,344 unique 17mers in the MGSC v3 mouse genome, of which 224,179,792 (about 24%) were found exactly once in the rat genome. The mouse genome was partitioned into 45,457 non-overlapping 60 kb contigs, and the Phusion Assembler clustering algorithm linked these to the rat assembly contigs. This created 8,780 clusters, each containing an average of 17 contigs drawn from a mix of the mouse and rat sets. The clustering effectively diagonalizes the genome-genome comparison step, which was performed using wublastn with default settings. Neither mouse nor rat was repeat masked at any stage of this process. The output of wublastn was post-processed to keep only scores above 400 and to identify diagonals of blast alignments. The end result was 2,267,035 aligned segments between mouse and rat. The alignment data are available from ftp://ftp.ensembl.org/pub/repository/PhusionBlast/ .
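The post-processing step can be sketched roughly as follows. The source does not describe how diagonals were identified, so everything here is an assumption made for illustration: the `Hsp` record layout, the function names, the grouping rule (forward-strand hits whose offset `t_start - q_start` stays within a tolerance of the previous hit), and the `tolerance` value. Only the score cutoff of 400 comes from the text.

```python
from collections import namedtuple

# Hypothetical record for one blast hit (query = mouse, target = rat).
Hsp = namedtuple("Hsp", "q_start q_end t_start t_end score")


def keep_high_scoring(hsps, min_score=400):
    """Keep only hits whose score is above min_score."""
    return [h for h in hsps if h.score > min_score]


def group_by_diagonal(hsps, tolerance=50):
    """Group hits lying on roughly the same diagonal.

    A hit's diagonal is t_start - q_start. Hits are sorted by diagonal,
    and a hit joins the current group when its diagonal is within
    `tolerance` of the previous hit's diagonal (assumed rule).
    """
    groups = []
    last_diag = None
    for h in sorted(hsps, key=lambda h: (h.t_start - h.q_start, h.q_start)):
        diag = h.t_start - h.q_start
        if last_diag is not None and diag - last_diag <= tolerance:
            groups[-1].append(h)
        else:
            groups.append([h])
        last_diag = diag
    return groups
```

A typical call would be `group_by_diagonal(keep_high_scoring(hits))`, where `hits` is a list of `Hsp` records parsed from the wublastn output. Each resulting group is a candidate aligned segment.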

Credits

Contact Jim Mullikin for questions about PhusionBlast. Thanks to Ensembl for making these data available.