[Bioperl-l] Genbank2gff3 script update
Brian Osborne
bosborne11 at verizon.net
Wed Mar 28 02:20:59 UTC 2007
Don,
I took the file http://eugenes.org/gmod/genbank2chado/bin/genbank2gff3.PLS
and replaced the script of the same name with it, in scripts/Bio-DB-GFF.
Brian O.
On 3/27/07 7:42 PM, "Don Gilbert" <gilbertd at cricket.bio.indiana.edu> wrote:
>
> Dear Bioperl developers,
>
> Here is an improved bp_genbank2gff3.pl script, with bug fixes
> and enhancements. Changes that alter existing behavior are
> enabled only through non-default command-line flags. I've updated these
> against current Bioperl CVS. Would one of you care to add this to your
> CVS repository?
>
> Thanks, Don Gilbert
>
> Find at http://eugenes.org/gmod/genbank2chado/
>
> =item Bioperl bp_genbank2gff3.pl
>
> bin/genbank2gff3.PLS (Bioperl CVS scripts/Bio-DB-GFF/genbank2gff3.PLS)
> lib/Bio-new/SeqFeature/Tools/TypeMapper.pm (required for genbank2gff3 update)
> lib/Bio-new/SeqFeature/Tools/Unflattener.pm (minor change suggested for genbank2gff3)
> (put into your Bioperl lib/Bio/... directories)
>
> There is also this unrelated patch:
> lib/Bio-new/Graphics/Glyph/processed_transcript.pm
> -- new flag to ignore excess subfeatures from Chado's
> gene-mrna-polypeptide-exon model.
>
> =item Genbank2gff3 changes
>
> * Polypeptide alternate gene model added (--noCDS option)
> Standard gene model: gene > mRNA > (UTR,CDS,exon)
> G-R-P-E alternate model: gene > mRNA > polypeptide > exon
> Polypeptide contains all the important protein info (IDs, translation, GO
> terms)
>
> * IO pipes:
> curl ftp://ncbigenomes/... | genbank2gff3 --in stdin --out stdout | gff2chado ...
>
> * GenBank main record fields are added to the source feature,
> and the sourcetype, commonly chromosome for genomes, is used.
>
> * Gene model handling for ncRNA and pseudogenes is added.
>
> * GFF header is cleaner and more informative, with a GFF_VERSION option.
>
> * GFF ##FASTA inclusion is improved, and translation sequences are stored there.
>
> * FT -> GFF attribute mapping is improved.
>
> * --format choice of SeqIO input formats (GenBank default).
> Uniprot/Swissprot and EMBL produce useful GFF.
>
> * SeqFeature::Tools::TypeMapper has a few FT -> SOFA additions and
> more flexible usage.
>
> -- d.gilbert--bioinformatics--indiana-u--bloomington-in-47405
> -- gilbertd at indiana.edu--http://marmot.bio.indiana.edu/