Chaining/Netting README

This file describes how to make chain and net tracks. These are
higher-level structures built on top of blastz mouse/human alignments.
This document is based on a README written by Jim Kent and refers
specifically to the mouse/human alignment. The programs referred to
here are currently in source control in the directory
src/hg/mouseStuff.

I. Do a little cluster run that does

    axtFilter -notQ=chrUn chrN.axt | axtChain stdin humanMixedNibDir mouseMixedNibDir chrN.chain

for each axt file in the axtChrom directory. Check carefully that
there are no errors on chr19, as this is very close to running out of
memory. See the 'run1' directory for how this was set up. [If chr19
fails in the run1/all directory then you'll need to execute the stuff
in run1/19, and then chainMergeSort the results into run1/all/chain.]

Note this only takes 4 hours to do serially, so you could do it as an
overnight run rather than a little cluster run.
Note: this actually took 8 hours for rn3/hg15.
Note that since it's just on the little cluster and there are only 24
jobs, it's ok not to have the input on the node local disks.

II. In order to assign unique ids to each chain genome-wide, do the
following on the file server:

    chainMergeSort run1/all/chain/*.chain > all.chain
    chainSplit chain all.chain

These two steps will take about 20 minutes. If all looks good at this
stage you can delete run1/chain/*.

III. Load the chains into the database like so:

    cd chain
    foreach i (*.chain)
        set c = $i:r
        hgLoadChain hg13 ${c}_mouseChain $i
        echo done $c
    end

You can go on to step IV while this is proceeding.

IV. Convert the chains to nets like so, still on the file server:

    mkdir preNet
    cd chain
    foreach i (*.chain)
        echo preNetting $i
        chainPreNet $i ~/oo/chrom.sizes ~/mm/chrom.sizes ../preNet/$i
    end
    cd ..

This foreach loop will take about 15 min to execute.
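
After a per-chromosome loop like the one above (or after the cluster
run in step I), it can help to confirm that every input file produced
a non-empty output before moving on. The following is only a sketch,
not part of the original pipeline: it is written for a Bourne-style
shell (sh) rather than the csh used in this README, and the function
name check_outputs is made up for illustration.

```sh
#!/bin/sh
# check_outputs SRC_DIR SRC_EXT DST_DIR DST_EXT   (hypothetical helper)
# Prints the base name of every SRC_DIR/*.SRC_EXT that has no non-empty
# DST_DIR/<base>.DST_EXT, and returns non-zero if any are missing.
check_outputs() {
    src_dir=$1; src_ext=$2; dst_dir=$3; dst_ext=$4
    missing=0
    for f in "$src_dir"/*."$src_ext"; do
        [ -e "$f" ] || continue              # glob matched nothing
        base=$(basename "$f" ".$src_ext")    # chr19.chain -> chr19
        if [ ! -s "$dst_dir/$base.$dst_ext" ]; then
            echo "$base"                     # output missing or empty
            missing=1
        fi
    done
    return $missing
}
```

For example, run from the directory that holds chain/ and preNet/,
"check_outputs chain chain preNet chain" would list any chromosome
whose preNet file is missing or empty.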

    mkdir n1
    cd preNet
    foreach i (*.chain)
        set n = $
        echo primary netting $i
        chainNet $i -minSpace=1 ~/oo/chrom.sizes ~/mm/chrom.sizes ../n1/$n /dev/null
    end
    cd ..
    cat n1/*.net | netSyntenic stdin

This next step requires the database, so it needs to be done on hgwdev:

    netClass hg13 mm2 -tNewR=$HOME/oo/bed/linSpecRep -qNewR=$HOME/mm/bed/linSpecRep

If things look good do

    rm -r n1

Make a 'syntenic' subset of these with

    netFilter -syn >

V. Load the nets into the database like so:

    netFilter -minGap=10 | hgLoadNet hg13 mouseNet stdin
    netFilter -minGap=10 | hgLoadNet hg13 mouseSynNet stdin

VI. Make syntenic and regular subsets of the axt files and chain
files, and make gap files.

    netSplit humanNet
    netSplit humanSynNet
    mkdir ../axtSyn ../axt
    mkdir humanSynGap humanSynChain humanNetGap humanNetChain
    cd humanSynNet
    foreach i (*.net)
        set c = $i:r
        netToAxt $i ../chain/$c.chain ~/oo/mixedNib ~/mm/mixedNib ../../axtSyn/$c.axt
        echo done ../../axtSyn/$c.axt
    end
    foreach i (*.net)
        set c = $i:r
        netChainSubset $i ../chain/$c.chain ../humanSynChain/$c.chain -gapOut=../humanSynGap/$
        echo done $c chain and gap
    end
    cd ..
    cd humanNet
    foreach i (*.net)
        set c = $i:r
        netToAxt $i ../chain/$c.chain ~/oo/mixedNib ~/mm/mixedNib ../axt/$c.axt
        echo done ../axt/$c.axt
    end
    foreach i (*.net)
        set c = $i:r
        netChainSubset $i ../chain/$c.chain ../humanNetChain/$c.chain -gapOut=../humanNetGap/$
        echo done $c chain and gap
    end

VII. Do much of this again for mouse.

    #Sort chains into mouse chromosomes.
    mkdir mouse
    chainSplit -q mouse/chain all.chain

    #PreNet mouse
    cd mouse
    mkdir preNet
    cd chain
    foreach i (*.chain)
        echo preNetting $i
        chainPreNet $i ~/oo/chrom.sizes ~/mm/chrom.sizes ../preNet/$i
    end
    cd ..

    #Net mouse
    mkdir n1
    cd preNet
    foreach i (*.chain)
        set n = $
        echo primary netting $i
        chainNet $i -minSpace=1 ~/oo/chrom.sizes ~/mm/chrom.sizes /dev/null ../n1/$n
    end
    cd ..

    #Add synteny info
    cat n1/*.net | netSyntenic stdin

    #Move to hgwdev and add repeat and other classifications.
    netClass mm2 hg12 -tNewR=$HOME/mm/bed/linSpecRep -qNewR=$HOME/oo/bed/linSpecRep

    #clean up
    rm -r n1
    rm all.chain

    #load database with everything but the small gaps
    netFilter -minGap=10 | hgLoadNet mm2 humanNet stdin
    netFilter -minGap=10 -syn | hgLoadNet mm2 humanSynNet stdin
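
The clean-up steps above (rm -r n1, rm all.chain) throw away
intermediates that take a while to rebuild, so one option is to delete
them only once the file derived from them is known to exist and be
non-empty. The following is only a sketch under that assumption: it
is written for a Bourne-style shell (sh) rather than csh, and the
function name cleanup_if_ok and the file name final.net in the usage
note are placeholders, not names used by this pipeline.

```sh
#!/bin/sh
# cleanup_if_ok RESULT INTERMEDIATE...   (hypothetical helper)
# Removes the listed intermediates only when RESULT exists and is
# non-empty; otherwise leaves everything in place and returns 1.
cleanup_if_ok() {
    result=$1
    shift
    if [ ! -s "$result" ]; then
        echo "not cleaning up: $result is missing or empty" >&2
        return 1
    fi
    for path in "$@"; do
        rm -r -- "$path"    # handles both directories and plain files
    done
}
```

For example, "cleanup_if_ok final.net n1 all.chain" would remove n1
and all.chain only if final.net (a placeholder for whichever net file
you intend to keep) is present and non-empty.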