[Bioperl-l] getting data from ncbi
aditi gupta
aditi9783 at yahoo.co.in
Sun Jun 13 06:14:48 EDT 2004
Hi all,
I have a file that contains the following data:
# BLASTN 2.2.9 [May-01-2004]
# Query: gi|37182815|gb|AY358849.1| Homo sapiens clone DNA180287 ALTE (UNQ6508) mRNA, complete cds
# Database: nr
# Fields: Query id, Subject id, % identity, alignment length, mismatches, gap openings, q. start, q. end, s. start, s. end, e-value, bit score
gi|37182815|gb|AY358849.1| gi|28592069|gb|U63637.2|BTU63637 100.00 17 0 0 552 568 3218 3234 1.1 34.19
gi|37182815|gb|AY358849.1| gi|14318385|gb|AC089993.2| 95.24 21 1 0 435 455 56604 56624 1.1 34.19
gi|37182815|gb|AY358849.1| gi|14318385|gb|AC089993.2| 100.00 16 0 0 260 275 89982 89967 4.2 32.21
gi|37182815|gb|AY358849.1| gi|7385112|gb|AF222766.1|AF222766 100.00 17 0 0 345 361 242 226 1.1 34.19
But I need only some of the fields, and with the help of members of this mailing list I succeeded and obtained the following output:
gi|28592069|gb|U63637.2|BTU63637 100.00 17 0 552 568
gi|14318385|gb|AC089993.2| 95.24 21 1 435 455
gi|14318385|gb|AC089993.2| 100.00 16 0 260 275
gi|7385112|gb|AF222766.1|AF222766 100.00 17 0 345 361
The code is:
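#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long;

# The input file is given with -f or --filename.
my $file;
GetOptions("f|filename=s" => \$file) or die "Usage: $0 -f <blast_table>\n";
defined $file or die "Usage: $0 -f <blast_table>\n";

open(my $in,  '<',  $file)       or die "Error opening $file: $!\n";
open(my $out, '>>', "$file.txt") or die "Error opening $file.txt: $!\n";

# Slurp the whole file, then split it on the query id so that each
# chunk begins with the subject id of one hit.
my $list = do { local $/; <$in> };
my @seqs = split /gi\|37182815\|gb\|AY358849\.1\|/, $list;

foreach my $seq (@seqs) {
    if ($seq =~ /^\s*
                 (gi\|\d+\|gb\|[0-9A-Z.]+\|(?:[0-9A-Z.]+)?)   # subject id
                 \s+ ([0-9.]+)   # percent identity
                 \s+ (\d+)       # alignment length
                 \s+ (\d+)       # mismatches
                 \s+ \d+         # gap openings, not captured
                 \s+ (\d+)       # q. start
                 \s+ (\d+)       # q. end
                /x)
    {
        my ($id, $identity_percentage, $align_length,
            $mismatches, $q_start, $q_end) = ($1, $2, $3, $4, $5, $6);
        # Print only when the chunk matched, so the header chunks
        # do not produce empty or repeated rows.
        print $out "$id\t$identity_percentage\t$align_length\t$mismatches\t$q_start\t$q_end\n";
    }
}

close $in;
close $out or die "Error closing $file.txt: $!\n";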
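Since the tabular BLAST output is line-oriented, the same six fields could also be picked out by reading one line at a time and splitting on whitespace, which avoids hard-coding the query id. This is only a minimal sketch of that alternative; the helper names wanted_fields and filter_table are made up for illustration, and the column positions assume the 12 columns listed in the "# Fields:" line above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the six wanted fields from one line of
# tabular BLAST output, or an empty list for comment and blank lines.
sub wanted_fields {
    my ($line) = @_;
    return if $line =~ /^\s*(?:#|$)/;   # skip "# ..." headers and blank lines
    my @f = split ' ', $line;           # split on any run of whitespace
    return unless @f >= 12;             # expect the 12 columns from "# Fields:"
    # 0-based columns: 1 subject id, 2 % identity, 3 alignment length,
    # 4 mismatches, 6 q. start, 7 q. end (5, gap openings, is dropped)
    return @f[1, 2, 3, 4, 6, 7];
}

# Hypothetical helper: copy the wanted fields of every hit line from one
# filehandle to another, tab-separated, one hit per line.
sub filter_table {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        my @fields = wanted_fields($line) or next;
        print {$out} join("\t", @fields), "\n";
    }
    return;
}
```

With the input opened as $in, a call such as filter_table($in, \*STDOUT) would print the reduced table to the screen; passing a filehandle opened for writing would store it in a file instead.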
But I also have to feed the GI number (from the first field) into the NCBI Entrez Nucleotide site:
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Nucleotide
and retrieve the gene and chromosome name, if available, from the resulting web page.
Is it possible to get the gene and chromosome info in the output along with the other fields? What changes to the code are required?
Please help! I have no idea how to use the internet with Perl.
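For reference, one possible approach can be sketched as follows. It does not scrape the query.fcgi page; it assumes instead that NCBI's E-utilities efetch service is reachable at the URL shown, that it still accepts a GI number as the id, and that the gene and chromosome names appear as /gene="..." and /chromosome="..." qualifiers in the returned GenBank flat file. All sub names are hypothetical, and HTTP::Tiny needs IO::Socket::SSL installed to fetch https URLs.

```perl
use strict;
use warnings;
use HTTP::Tiny;

# Assumed endpoint: NCBI E-utilities efetch, not the query.fcgi page above.
my $EFETCH = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi';

# Pull the numeric GI out of an id such as "gi|28592069|gb|U63637.2|BTU63637".
sub gi_from_id {
    my ($id) = @_;
    return $id =~ /^gi\|(\d+)\|/ ? $1 : undef;
}

# Build the efetch URL for one GI, asking for a GenBank flat file as text.
sub efetch_url {
    my ($gi) = @_;
    return "$EFETCH?db=nucleotide&id=$gi&rettype=gb&retmode=text";
}

# Return the first /gene and /chromosome qualifier values found in a
# GenBank flat file; either value is undef when the record lacks it.
sub gene_and_chromosome {
    my ($genbank) = @_;
    my ($gene)       = $genbank =~ m{/gene="([^"]+)"};
    my ($chromosome) = $genbank =~ m{/chromosome="([^"]+)"};
    return ($gene, $chromosome);
}

# Fetch one record over HTTP and return (gene, chromosome).
sub lookup_gi {
    my ($gi) = @_;
    my $res = HTTP::Tiny->new(timeout => 30)->get(efetch_url($gi));
    die "efetch failed for GI $gi: $res->{status} $res->{reason}\n"
        unless $res->{success};
    return gene_and_chromosome($res->{content});
}
```

Inside the loop that prints each hit, a call such as my ($gene, $chr) = lookup_gi(gi_from_id($id)) would give two extra values to append to the row, with a placeholder like "-" wherever a value is undefined. Many subject sequences (whole clones, for example) carry no gene qualifier at all, and NCBI asks clients to limit their request rate, so a short sleep between lookups is advisable.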
Thanks a lot in advance,
regards,
aditi.