[Bioperl-l] Bio::DB::EUtilities and RefSeq

Senthil Kumar M senthil.debian at gmail.com
Thu Jun 9 02:25:34 UTC 2011


I tried the Bio::DB::EUtilities example mentioned in this discussion:
with " -term => 'Escherichia coli BW2952[Orgn] AND rpoB[Gene/Protein
Name]' and -db => 'protein' ". This retrieves three identical rpoB
amino acid sequences. For brevity, I provide only the FASTA headers
below, not the actual sequences:

>gi|259494181|sp|C5A0S7.1|RPOB_ECOBW RecName: Full=DNA-directed RNA polymerase subunit beta; Short=RNAP subunit beta; AltName: Full=RNA polymerase subunit beta; AltName: Full=Transcriptase subunit beta
>gi|238863495|gb|ACR65493.1| RNA polymerase, beta subunit [Escherichia coli BW2952]
>gi|238903043|ref|YP_002928839.1| RNA polymerase, beta subunit [Escherichia coli BW2952]
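
For reference, the query described above might look roughly like the sketch below. It is not the exact script from the referenced discussion: the e-mail address and the -retmax value are placeholders, the helper name fetch_fasta is made up for illustration, and NCBI is contacted only when the script is started with a (hypothetical) --run argument.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Search parameters as quoted in the post; -email and -retmax are assumptions.
my %search = (
    -eutil  => 'esearch',
    -db     => 'protein',
    -term   => 'Escherichia coli BW2952[Orgn] AND rpoB[Gene/Protein Name]',
    -email  => 'you@example.org',   # placeholder, replace with your own address
    -retmax => 100,                 # assumed upper bound on the number of hits
);

# Hypothetical helper: run esearch, then efetch the hits as FASTA text.
sub fetch_fasta {
    my (%params) = @_;
    require Bio::DB::EUtilities;    # loaded lazily, only when a fetch is requested
    my @ids = Bio::DB::EUtilities->new(%params)->get_ids;
    return '' unless @ids;
    my $fetch = Bio::DB::EUtilities->new(
        -eutil   => 'efetch',
        -db      => $params{-db},
        -rettype => 'fasta',
        -email   => $params{-email},
        -id      => \@ids,
    );
    return $fetch->get_Response->content;
}

# Only contact NCBI when explicitly asked to.
print fetch_fasta(%search) if @ARGV && $ARGV[0] eq '--run';
```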

I am only interested in the RefSeq entry, i.e. YP_002928839.1, and not
the other two. I can filter out such duplicate entries after I download
them from NCBI, but it would be nicer if there were a way to retrieve
ONLY the RefSeq entries that match my query and download just those.
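
The post-download filtering mentioned above could be sketched as follows. It assumes that RefSeq records can be recognised by the "ref" database tag in the gi-style header, as in the third header shown earlier. The helper name keep_refseq is made up for illustration, and the sequence lines are placeholders, not real data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Keep only FASTA records whose header carries the "ref" database tag
# (>gi|<number>|ref|<accession>|...). Assumes gi-style headers.
sub keep_refseq {
    my ($fasta) = @_;
    # Split in front of every '>' that starts a line; drop empty pieces.
    my @records = grep { length } split /^(?=>)/m, $fasta;
    return join '', grep { /\A>gi\|\d+\|ref\|/ } @records;
}

# Headers taken from the post; SEQUENCE_OMITTED stands in for the residues.
my $fasta = <<'END';
>gi|259494181|sp|C5A0S7.1|RPOB_ECOBW RecName: Full=DNA-directed RNA polymerase subunit beta
SEQUENCE_OMITTED
>gi|238863495|gb|ACR65493.1| RNA polymerase, beta subunit [Escherichia coli BW2952]
SEQUENCE_OMITTED
>gi|238903043|ref|YP_002928839.1| RNA polymerase, beta subunit [Escherichia coli BW2952]
SEQUENCE_OMITTED
END

print keep_refseq($fasta);
```

This only hides the duplicates locally; all three records are still downloaded first.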

I am aware that it is easier to do this online at the NCBI protein
site, where there is a filter option
(http://www.ncbi.nlm.nih.gov/protein?term=escherichia coli
BW2952[organism] AND rpoB[Gene%2FProtein Name]), but I would like to
know if the same is achievable through EUtilities since I have many
sequences to download from NCBI.

Reading "$ perldoc
/usr/share/perl5/Bio/Tools/EUtilities/EUtilParameters.pm" and
searching Google did not provide any clues, but I might have missed
something that was blindingly obvious. Any help would be much
appreciated.

Thanks in advance,


For I am Vader, Darth Vader, Lord Vader. I can kill you with a single thought."
"Well, you'll still need a tray."
"No, I will not need a tray. I do not need a tray to kill you. I can
kill you without a tray, with the power of the Force, which is strong
within me. Even though I could kill you with a tray if I so wished,
for I would hack at your neck with the thin bit until the blood flowed
across the canteen floor."
"No, the food is hot. You'll need a tray to put the food on."
"Oh, I see, the food is hot. I'm sorry, I did not realize."
               -- Eddie Izzard
