[Bioperl-l] Parsing the accession numbers in Refseq
Heikki Lehvaslaiho
heikki@ebi.ac.uk
Tue, 08 May 2001 15:32:58 +0100
Suraj,
If you have the entry (or the interesting part of it)
in variable $s, then the following line will put the accession number
into variable $a:
($a) = $s =~ /ACCESSION +(\w+)\W+COMMENT +REVIEWED/;
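As a small worked example of the line above, here is a sketch that runs the same pattern on one accession/comment pair taken from the example in the quoted message. The variable name $acc is my own choice for illustration; it replaces $a only because $a and $b are the special variables used by sort. The test for an undefined value covers the case where the pattern does not match.

```perl
use strict;
use warnings;

# Sample text: one ACCESSION line directly followed by a REVIEWED comment.
my $s = "ACCESSION NM_001158\nCOMMENT REVIEWED REFSEQ: This record has been\n";

# In list context the match returns the captured groups,
# or an empty list when the pattern does not match.
my ($acc) = $s =~ /ACCESSION +(\w+)\W+COMMENT +REVIEWED/;

print defined $acc ? "$acc\n" : "no reviewed accession found\n";
```

The \W+ between the accession number and COMMENT is what lets the pattern span the line break, so it only matches when the COMMENT line comes straight after the ACCESSION line.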
If you want to parse this straight from the file input, you'll have to
play with $INPUT_RECORD_SEPARATOR (the long name needs 'use English'),
more commonly known as $/. Set it to the entry delimiter (// ?) and
write something like:
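    # Assumes GenBank-style entries that end with a line containing only //
    $/ = "//\n";
    while (<>) {
        ($a) = $_ =~ /ACCESSION +(\w+)\W+COMMENT +REVIEWED/;
        print "$a\n" if defined $a;   # skip entries where nothing matched
    }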
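The loop above reports at most one accession number per record. If the file has already been reduced to ACCESSION and COMMENT lines, as in the example quoted below, it may contain no // delimiters at all. In that case one option is to read the text in one piece and use a global match. This is only a sketch under that assumption; the helper name reviewed_accessions and the inline sample data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return every accession number whose ACCESSION line
# is immediately followed by a "COMMENT REVIEWED" line.
# Call it in list context; with /g the match returns one capture per hit.
sub reviewed_accessions {
    my ($text) = @_;
    return $text =~ /ACCESSION +(\w+)\W+COMMENT +REVIEWED/g;
}

# Sample data copied from the example in the quoted message.
my $text = <<'END';
ACCESSION NM_021640
ACCESSION NM_001158
COMMENT REVIEWED REFSEQ: This record has been
curated by NCBI staff. The
ACCESSION NM_018607
COMMENT REVIEWED REFSEQ: This record has been
curated by NCBI staff. The
END

my @accs = reviewed_accessions($text);
print "$_\n" for @accs;
```

On this sample the first accession (NM_021640) is skipped because the next line is another ACCESSION line rather than a COMMENT line. To run it on a real file, the sample string could be replaced by slurping the input, for example with my $text = do { local $/; <> };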
-Heikki
Suraj Peri wrote:
>
> Hi,
> I took the RefSeq database and used Perl regular expressions to
> extract only the accession numbers and the entries marked as
> Reviewed RefSeq. Now I want only the accession numbers that
> directly precede a REVIEWED comment, that is, pairs of lines like
> ACCESSION NM_021640
> COMMENT REVIEWED REFSEQ:
> and not the accession numbers that are not followed by a
> comment line.
>
> How can I do this using regular expressions? Please
> help me ASAP.
> Thank you in advance.
>
>
> Example:
> ACCESSION NM_021640
> ACCESSION NM_001158
> COMMENT REVIEWED REFSEQ: This record has been
> curated by NCBI staff. The
> ACCESSION NM_018607
> COMMENT REVIEWED REFSEQ: This record has been
> curated by NCBI staff. The
>
--
______ _/ _/_____________________________________________________
_/ _/ http://www.ebi.ac.uk/mutations/
_/ _/ _/ Heikki Lehvaslaiho heikki@ebi.ac.uk
_/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute
_/ _/ _/ Wellcome Trust Genome Campus, Hinxton
_/ _/ _/ Cambs. CB10 1SD, United Kingdom
_/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________