Chaining/Netting README

This file describes how to make chain and net tracks. These are
higher-level structures built on top of blastz mouse/human alignments.
This document is based on a README written by Jim Kent and refers
specifically to the mouse/human alignment. The programs referred to
here are currently in source control in the directory
src/hg/mouseStuff.

I. Do a little cluster run that, for each axt file in the axtChrom
   directory, runs

        axtFilter -notQ=chrUn chrN.axt | axtChain stdin humanMixedNibDir mouseMixedNibDir chrN.chain

   Check carefully that there are no errors on chr19, as this is very
   close to running out of memory. See the 'run1' directory for how
   this was set up. [If chr19 fails in the run1/all directory then
   you'll need to execute the stuff in run1/19, and then chainMergeSort
   the results into run1/all/chain.]

   Note this only takes 4 hours to do serially, so you could do it as
   an overnight run rather than a little cluster run.
   Note: this actually took 8 hours for rn3/hg15.
   Note since it's just on the little cluster and there are only 24
   jobs, it's ok not to have the input on the node local disks.

II. In order to assign unique IDs to each chain genome-wide, do the
    following on the file server:

        chainMergeSort run1/all/chain/*.chain > all.chain
        chainSplit chain all.chain

    These two steps will take about 20 minutes. If all looks good at
    this stage you can delete run1/chain/*.

III. Load the chains into the database as follows:

        cd chain
        foreach i (*.chain)
            set c = $i:r
            hgLoadChain hg13 ${c}_mouseChain $i
            echo done $c
        end

     You can go on to step IV while this is proceeding.

IV. Convert the chains to nets as follows, while still on the file
    server:

        mkdir preNet
        cd chain
        foreach i (*.chain)
            echo preNetting $i
            chainPreNet $i ~/oo/chrom.sizes ~/mm/chrom.sizes ../preNet/$i
        end
        cd ..

    This foreach loop will take about 15 min to execute.

        mkdir n1
        cd preNet
        foreach i (*.chain)
            set n = $i:r.net
            echo primary netting $i
            chainNet $i -minSpace=1 ~/oo/chrom.sizes ~/mm/chrom.sizes ../n1/$n /dev/null
        end
        cd ..
        cat n1/*.net | netSyntenic stdin hNoClass.net

    This next step requires the database, so it needs to be done on
    hgwdev:

        netClass hNoClass.net hg13 mm2 human.net -tNewR=$HOME/oo/bed/linSpecRep -qNewR=$HOME/mm/bed/linSpecRep

    If things look good do

        rm -r n1 hNoClass.net

    Make a 'syntenic' subset of these with

        netFilter -syn human.net > humanSyn.net

V. Load the nets into the database as follows:

        netFilter -minGap=10 human.net | hgLoadNet hg13 mouseNet stdin
        netFilter -minGap=10 humanSyn.net | hgLoadNet hg13 mouseSynNet stdin

VI. Make syntenic and regular subsets of the axt files and chain
    files, and make gap files:

        netSplit human.net humanNet
        netSplit humanSyn.net humanSynNet
        mkdir ../axtSyn ../axt
        mkdir humanSynGap humanSynChain humanNetGap humanNetChain

        cd humanSynNet
        foreach i (*.net)
            set c = $i:r
            netToAxt $i ../chain/$c.chain ~/oo/mixedNib ~/mm/mixedNib ../../axtSyn/$c.axt
            echo done ../../axtSyn/$c.axt
        end
        foreach i (*.net)
            set c = $i:r
            netChainSubset $i ../chain/$c.chain ../humanSynChain/$c.chain -gapOut=../humanSynGap/$c.gap
            echo done $c chain and gap
        end
        cd ..

        cd humanNet
        foreach i (*.net)
            set c = $i:r
            netToAxt $i ../chain/$c.chain ~/oo/mixedNib ~/mm/mixedNib ../../axt/$c.axt
            echo done ../../axt/$c.axt
        end
        foreach i (*.net)
            set c = $i:r
            netChainSubset $i ../chain/$c.chain ../humanNetChain/$c.chain -gapOut=../humanNetGap/$c.gap
            echo done $c chain and gap
        end

VII. Do much of this again for mouse.

        #Sort chains into mouse chromosomes.
        mkdir mouse
        chainSplit -q mouse/chain all.chain

        #PreNet mouse
        cd mouse
        mkdir preNet
        cd chain
        foreach i (*.chain)
            echo preNetting $i
            chainPreNet $i ~/oo/chrom.sizes ~/mm/chrom.sizes ../preNet/$i
        end
        cd ..

        #Net mouse
        mkdir n1
        cd preNet
        foreach i (*.chain)
            set n = $i:r.net
            echo primary netting $i
            chainNet $i -minSpace=1 ~/oo/chrom.sizes ~/mm/chrom.sizes /dev/null ../n1/$n
        end
        cd ..

        #Add synteny info
        cat n1/*.net | netSyntenic stdin mNoClass.net

        #Move to hgwdev and add repeat and other classifications.
        netClass mNoClass.net mm2 hg12 mouse.net -tNewR=$HOME/mm/bed/linSpecRep -qNewR=$HOME/oo/bed/linSpecRep

        #clean up
        rm -r n1 mNoClass.net
        rm all.chain

        #load database with everything but the small gaps
        netFilter -minGap=10 mouse.net | hgLoadNet mm2 humanNet stdin
        netFilter -minGap=10 -syn mouse.net | hgLoadNet mm2 humanSynNet stdin
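
Step I gives the per-chromosome command but leaves the actual job list to the 'run1' directory, which is not shown here. As a rough sketch only, the job list could be generated along the following lines. This is written in POSIX sh rather than the csh used in the rest of this file, so that it can be run as a dry run anywhere. The function name make_chain_jobs and its two arguments are hypothetical, and humanMixedNibDir and mouseMixedNibDir are the same placeholders used in step I, not real paths. The function only prints command lines and runs none of the tools.

```shell
#!/bin/sh
# Dry run: print one "axtFilter | axtChain" command line per axt file.
# make_chain_jobs is a hypothetical helper, not part of the kent tools.
#   $1 = directory holding chrN.axt files (e.g. axtChrom)
#   $2 = directory the chrN.chain files should be written to
make_chain_jobs() {
    axt_dir=$1
    chain_dir=$2
    for axt in "$axt_dir"/*.axt; do
        # If the glob matched nothing, the pattern is left unexpanded; skip it.
        [ -e "$axt" ] || continue
        # Strip directory and .axt extension (the sh analogue of csh's $i:r).
        c=$(basename "$axt" .axt)
        # Nib directory names are the placeholders from step I.
        printf '%s\n' "axtFilter -notQ=chrUn $axt | axtChain stdin humanMixedNibDir mouseMixedNibDir $chain_dir/$c.chain"
    done
}
```

Something like `make_chain_jobs axtChrom run1/all/chain > jobList` would then write one line per chromosome, which can be inspected before being handed to the cluster or run serially overnight. The chr19 line is the one to watch for memory errors.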