[Bioperl-l] genbank contig parses

Wed Feb 12 13:03:36 EST 2003

Hello,

    I'm trying to work with some of the large chromosomal contig RefSeq
files (e.g., NT_006316) that contain only the annotations and the
component clone listing but no sequence. It's possible to download them
in FASTA and GenBank format and then merge them, but the files are huge
(multi-MB) and therefore slow to deal with.

    I'd like, for example, to extract subsequences spanning genes, with
their annotation coordinates adjusted so that they stay correct relative
to the subsequence, in order to have smaller and faster files to work
with.
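
    To make the coordinate adjustment concrete, here is a minimal sketch
in plain Python (not BioPerl). The function names parse_location and
remap_to_subseq are made up for illustration, and the parser is an
assumption-laden simplification: it handles only plain ranges, join(...)
and a single outer complement(...), not fuzzy ends or remote references.

```python
import re


def parse_location(loc):
    """Parse a simple GenBank location string into (start, end, strand) tuples.

    Sketch only: strand is taken from an outer complement(...) wrapper;
    complement() nested inside join(), fuzzy ends (<, >) and remote
    accessions are not handled.
    """
    strand = -1 if loc.startswith("complement(") else 1
    return [(int(s), int(e), strand)
            for s, e in re.findall(r"(\d+)\.\.(\d+)", loc)]


def remap_to_subseq(segments, sub_start, sub_end):
    """Shift 1-based inclusive segments into the coordinates of the
    subsequence sub_start..sub_end.

    Segments entirely outside the window are dropped; segments that
    straddle a boundary are clipped to it.
    """
    offset = sub_start - 1
    remapped = []
    for start, end, strand in segments:
        if end < sub_start or start > sub_end:
            continue  # no overlap with the extracted region
        start = max(start, sub_start)
        end = min(end, sub_end)
        remapped.append((start - offset, end - offset, strand))
    return remapped


# Hypothetical gene annotated on the contig, extracted as 1000..2000.
segments = parse_location("complement(join(1200..1350,1500..1720))")
print(remap_to_subseq(segments, 1000, 2000))
```

    The strand is carried through unchanged because the subsequence is
taken in the same orientation as the contig; extracting the reverse
strand would need the coordinates flipped as well.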

    Is there now BioPerl code to deal with join/complement tags and to
make subseqs with annotation transfer? I've had a look but can't find
any, but maybe I'm looking in the wrong modules.

    Thanks

    Richard Adams
    University of Edinburgh
    UK