This directory contains various programs to help map the coding
region inside of a transcript.  In the Known Genes III build
process it is applied to the bed file/fa file output of txWalk.

The overall pipeline is to convert various lines of evidence to
a common cdsEvidence format with:
   txCdsEvFromRna - handles refSeq and genbank mRNA alignments
   txCdsEvFromProtein - handles uniProt and refSeq protein alignments
   txCdsPredict - does predictions
These are then all concatenated together, and the best CDS for each
transcript is picked with
   txCdsPick
Then the transcript bed and sequence are turned into a GTF gene
prediction file and a protein sequence file with
   txCdsToGene
Further information on these genes is put together with
   txInfoAssemble.  
The coding regions within the gene sets are clustered using
   txCdsCluster
Then we pick the strongest looking protein in these clusters,
and flag cases where it looks like a CDS prediction in the
other proteins is just in an unprocessed intron, or is in the
3' UTR of one of the nice proteins with:
   txCdsSuspect
We then strip the weakest CDSs including the ones flagged
as suspect above, and the NMD targets with
   txCdsStripWeak
and then run again the programs
   txCdsToGene
   txInfoAssemble
   txCdsCluster
At this point we have the final CDS assignments including the
final coding/noncoding decision. The next step is to separate
the noncoding transcripts into those that seem to just be
byproducts of the coding clusters from the truly noncoding
transcripts with
   txSeparateNoncoding
The noncoding ones are clustered again with
   txBedToGraph


