Splitting GenBank

Peter Rice peter.rice at uk.lionbioscience.com
Mon Oct 14 11:27:34 UTC 2002


Kenneth Geisshirt wrote:
> I have a local copy of GenBank, and I wish to split it into four
> databases: one for humans, one for rats, one for mice and one for the
> rest. The applications seqret and seqretsplit can help me with the first
> three by specifying the organism in the USA, but how do I specify "not
> human and not rat and not mouse"?

In EMBOSS:

Split the gbrod file into rat, mouse and other rodents (a simple Perl
script would do).

Index GenBank and define it as an EMBOSS database.

Then define subsets that use the same index files, and exclude the files
you don't want using, for example:

exclude: "*pri* *rat* *mus*"

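... in copies of your EMBOSS database definition for GenBank. (That
example is for the "rest" subset: it skips the primate, rat and mouse
files.)

EMBOSS simply checks the list of excluded files when it uses the index
files, so the subsets do not need to be indexed separately.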
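
For the splitting step, here is a rough sketch in Python rather than Perl.
It is only an illustration, not a tested tool. It assumes each record ends
with a "//" line and carries an "  ORGANISM" line, that rat and mouse
entries are labelled exactly "Rattus norvegicus" and "Mus musculus", and
that anything before the first LOCUS line (the release file header) can be
dropped. The function names and the rat_/mus_/other_ output prefixes are
made up for this example; they are chosen so that the exclude patterns
above would match.

```python
import os

# Assumed organism names; adjust to match your GenBank release.
TARGETS = {
    "Rattus norvegicus": "rat",
    "Mus musculus": "mus",
}
LABELS = ("rat", "mus", "other")


def classify(record_lines):
    """Return 'rat', 'mus' or 'other' from the record's ORGANISM line."""
    for line in record_lines:
        if line.startswith("  ORGANISM"):
            organism = line[len("  ORGANISM"):].strip()
            return TARGETS.get(organism, "other")
    return "other"


def split_records(infile, outfiles):
    """Copy each record from infile to outfiles[label].

    Lines before the first LOCUS line (the file header) are skipped.
    """
    record = []
    in_record = False
    for line in infile:
        if line.startswith("LOCUS"):
            in_record = True
        if not in_record:
            continue
        record.append(line)
        if line.startswith("//"):
            # End of record: write it to the matching output file.
            outfiles[classify(record)].writelines(record)
            record = []
            in_record = False


def split_file(path):
    """Split one flat file into rat_/mus_/other_ files in the same directory."""
    directory, name = os.path.split(path)
    outs = {
        label: open(os.path.join(directory, f"{label}_{name}"), "w")
        for label in LABELS
    }
    try:
        with open(path) as infile:
            split_records(infile, outs)
    finally:
        for handle in outs.values():
            handle.close()
```

Calling split_file() on each gbrod file would leave three new files next to
it. Move the original gbrod file out of the database directory before
indexing, otherwise its entries would be indexed twice.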

regards,

Peter Rice

-- 
------------------------------------------------
Peter Rice, LION Bioscience Ltd, Cambridge, UK
peter.rice at uk.lionbioscience.com +44 1223 224723



