[Bioperl-l] Bio::SeqIO::genbank

Fri Apr 9 20:06:47 UTC 2010

Is it the same as regular ol' eutils?

http://eutils.ncbi.nlm.nih.gov/corehtml/query/static/efetchseq_help.html

Note that rettype is gbc (or gpc for protein), but retmode is text, not
xml.  This worked for me:

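use strict;
use warnings;
use feature 'say';          # say() needs Perl 5.10 or later
use Bio::DB::EUtilities;

# IDs (accessions or GIs) come from the command line
my @ids = @ARGV;

my $eutil = Bio::DB::EUtilities->new(-eutil   => 'efetch',
                                     -db      => 'nuccore',
                                     -id      => \@ids,
                                     -rettype => 'gbc',
                                     -retmode => 'text');

# Print the raw response body as returned by NCBI
say $eutil->get_Response->content;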
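
For anyone who wants to check this outside BioPerl, the same request can
be written as a plain efetch URL.  This is only a rough sketch:
efetch_url and escape are hypothetical helpers (not part of BioPerl), the
https base URL is the current public endpoint rather than the one in the
help page above, and NCBI may change which rettype/retmode combinations
it accepts:

```perl
use strict;
use warnings;

# Percent-encode anything outside a conservative safe set
# (commas are kept so the ID list stays readable).
sub escape {
    my ($s) = @_;
    $s =~ s/([^A-Za-z0-9_.~,-])/sprintf('%%%02X', ord($1))/ge;
    return $s;
}

# Hypothetical helper: build an efetch URL from the same four
# parameters passed to Bio::DB::EUtilities->new above.
sub efetch_url {
    my (%arg) = @_;
    my $base = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi';
    my @pairs = (
        [ db      => $arg{db} ],
        [ id      => join(',', @{ $arg{ids} }) ],
        [ rettype => $arg{rettype} ],
        [ retmode => $arg{retmode} ],
    );
    my @query;
    for my $pair (@pairs) {
        push @query, $pair->[0] . '=' . escape($pair->[1]);
    }
    return $base . '?' . join('&', @query);
}

# Only print when IDs were given on the command line
if (@ARGV) {
    print efetch_url(db      => 'nuccore',
                     ids     => [@ARGV],
                     rettype => 'gbc',
                     retmode => 'text'), "\n";
}
```

Pasting the printed URL into a browser or curl should show the same
content the BioPerl call prints.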

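On Wayne's suggestion quoted below (fall back to 'DNA' when $mol is not
in the fixed vocabulary), here is a minimal sketch of the idea.  It is
not the actual Bio::SeqIO::genbank code: normalize_mol and %KNOWN_MOL are
hypothetical names, and the list is an illustrative subset, not the full
set of GenBank molecule types:

```perl
use strict;
use warnings;

# Illustrative subset only; the authoritative list is in the
# GenBank release notes (LOCUS line format).
my %KNOWN_MOL;
@KNOWN_MOL{qw(DNA RNA mRNA rRNA tRNA)} = ();

# Return $mol unchanged if it is in the known vocabulary,
# otherwise fall back to the default ('DNA' unless overridden).
sub normalize_mol {
    my ($mol, $default) = @_;
    $default = 'DNA' unless defined $default;
    return $default unless defined $mol;
    return exists $KNOWN_MOL{$mol} ? $mol : $default;
}
```

So normalize_mol('mRNA') stays 'mRNA', while something like
normalize_mol('genomic DNA') comes back as 'DNA'.  A smarter mapping
would be nicer, but this is the simple default Wayne described.
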
chris 

On Thu, 2010-04-08 at 20:15 +0000, Mark A. Jensen wrote: 
> FWIW - In my SoapE investigations I found an NCBI-hosted XML schema for INSDC, but never ran across a format descriptor that returns data in that format. -- MAJ
> 
> >-----Original Message-----
> >From: Chris Fields [mailto:cjfields at illinois.edu]
> >Sent: Thursday, April 8, 2010 04:09 PM
> >To: 'Dave Messina'
> >Cc: 'bioperl-l', 'Wayne Davis'
> >Subject: Re: [Bioperl-l] Bio::SeqIO::genbank
> >
> >On Thu, 2010-04-08 at 21:39 +0200, Dave Messina wrote: 
> >> Hi Wayne,
> >> 
> >> > if $mol is not in the fixed list of genbank molecule types it should
> >> > be set to the default value of 'DNA', or some other smarter way of
> >> > forcing the molecule type into the fixed vocabulary would be a help.
> >> 
> >> Sounds good to me. Did you modify your local copy of Bio::SeqIO::genbank and try it out?
> >> 
> >> I will say, though, that Genbank is a tricky format, both to read and to write. Even if BioPerl would write Genbank records that are fully compliant with the spec, I'm pretty sure they would not be round-trippable*. That is, if you read a Genbank record into BioPerl and then wrote it back out, the output wouldn't exactly match the input.
> >
> >This is true.  Jason and I talked about this recently and arrived pretty
> >much at the same conclusion.  We're mainly interested in parsing data
> >into a usable framework for manipulation.  Recreating data isn't our top
> >priority.
> >
> >> I think that NCBI is trying to nudge people toward their XML format. I know it won't help this particular situation, but it might be an option to consider for the future.
> >
> >The only problem I had with the XML spit out from eutils has been it was
> >an on-the-fly conversion of the ASN.1.  Not sure what the status of it
> >is now.
> >
> >What's going on with the INSDC XML format?  That was supposed to be an
> >international standard and appeared more lightweight (if such a thing
> >can be said about XML).
> >
> >> Speaking of which, what is the current status of the BioPerl Genbank XML parser? Jay, did you ever release that?
> >> 
> >> 
> >> Dave
> >> 
> >> 
> >> 
> >> * not that they were designed to be: http://www.bioperl.org/wiki/HOWTO:SeqIO#Caveats
> >
> >I think it was in a branch, can't recall.
> >
> >chris
> >
> >
> >
> >_______________________________________________
> >Bioperl-l mailing list
> >Bioperl-l at lists.open-bio.org
> >http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >