[Bioperl-l] [How to add features in genbank flat file]
Sebastien Moretti
sebastien.moretti at igs.cnrs-mrs.fr
Thu Mar 24 06:05:27 EST 2005
Hello,
No one seems to have a solution to the problem I posted a month ago.
So I changed my approach and now use 'wget' to fetch the GenBank sequences.
This gives me the full GenBank entry, with most of the features.
It also avoids another bug: the BioPerl script I used does not format
COMMENT lines the way NCBI does, and it removes blank lines.
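#!/usr/bin/perl -w
use strict;
use diagnostics;
use File::Cat;

# Accession number to fetch, e.g. NM_178432
my $acc = $ARGV[0]
    or die "\n\tThe accession number you are looking for is missing.\n\tTry something like: $0 NM_178432\n\n";

# dopt=gbwithparts asks for the full flat file; the ef_* parameters
# switch on the extra feature sets (SNP, CDD, MGC, HPRD).
my $url = "http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&qty=1&c_start=1&val=$acc&dopt=gbwithparts&send=Send&sendto=t&from=begin&to=end&extrafeatpresent=1&ef_SNP=1&ef_CDD=8&ef_MGC=16&ef_HPRD=32";

# The list form of system() bypasses the shell, so the '&' characters
# in the URL need no quoting; -q keeps wget silent.
system('wget', '-q', '-O', 'output_file.tmp', $url) == 0
    or die "wget failed for $acc\n";

# Copy the downloaded entry to STDOUT, then remove the temporary file.
cat('output_file.tmp', \*STDOUT);
unlink('output_file.tmp');

# Equivalent command line:
# wget -O output_file 'http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&qty=1&c_start=1&val=NM_178432&dopt=gbwithparts&send=Send&sendto=t&from=begin&to=end&extrafeatpresent=1&ef_SNP=1&ef_CDD=8&ef_MGC=16&ef_HPRD=32'
exit;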
Sorry, I no longer use BioPerl to query GenBank (I still use it for other
applications), but BioPerl 1.5 has not fixed the COMMENT bug or the missing
features.
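
For anyone who would rather not call an external wget or leave a temporary
file behind, here is a minimal sketch of the same download done inside Perl.
It is not part of the script above: it assumes the same viewer.fcgi URL and
parameters as the wget call, it uses HTTP::Tiny (shipped with Perl 5.14 and
later) as the HTTP client, and the sub names genbank_url and fetch_genbank
are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Tiny;

# Build the request URL for one accession. The parameters are copied
# unchanged from the wget command; only val= varies.
sub genbank_url {
    my ($acc) = @_;
    return 'http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi'
         . '?db=nucleotide&qty=1&c_start=1'
         . "&val=$acc"
         . '&dopt=gbwithparts&send=Send&sendto=t&from=begin&to=end'
         . '&extrafeatpresent=1&ef_SNP=1&ef_CDD=8&ef_MGC=16&ef_HPRD=32';
}

# Download the flat file and return it as a string (hypothetical helper).
sub fetch_genbank {
    my ($acc) = @_;
    my $response = HTTP::Tiny->new->get(genbank_url($acc));
    die "Could not retrieve $acc: $response->{status} $response->{reason}\n"
        unless $response->{success};
    return $response->{content};
}

# Print the entry only when an accession is given on the command line.
print fetch_genbank($ARGV[0]) if @ARGV;
```

Saved as, say, fetch_gb.pl (a hypothetical name), it would be run as
"perl fetch_gb.pl NM_178432 > NM_178432.gb". Because the entry is printed
as received, COMMENT lines and blank lines stay exactly as the server sends
them.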
> Hello,
> I saw that the GenBank web site has changed:
> Now, features like 'SNPs' are no longer included in the EST flat files.
> On the NCBI web site, we must click on 'features: SNP' to add them to our flat
> file.
>
> With BioPerl 1.4 or 1.5 it is the same: the variation features are no longer
> included in the EST flat files that I download.
>
> Here is the script I use:
> #!/usr/bin/perl -w
>
> use strict;
> use Bio::DB::GenBank;
> use Bio::DB::Query::GenBank;
> use Bio::SeqIO;
> my $acc=$ARGV[0] or die "\n\tThe accession number you are looking for is missing.\n\tTry something like: $0 NM_178432\n\n";
>
> $acc=$acc."[Accession]";
>
> my $query_string = "$acc";
> my $query = Bio::DB::Query::GenBank->new(-db=>'nucleotide',
> -query=>$query_string);
>
> my $gb = new Bio::DB::GenBank;
> my $stream = $gb->get_Stream_by_query($query);
>
> my $out=Bio::SeqIO->new(-format=>'genbank');
> my $seq = $stream->next_seq();
>
> # With no -file or -fh argument, write_seq() prints the entry to STDOUT
> # itself and returns a true value, so no separate print is needed.
> $out->write_seq($seq);
>
> exit;
>
> How can I add most of the features to my nucleotide flat files?
>
> Thanks
--
Sébastien Moretti
http://igs.cnrs-mrs.fr/
CNRS - IGS
31 chemin Joseph Aiguier
13402 Marseille cedex