[Bioperl-l] Bio::DB::EUtilities and RefSeq

Chris Fields cjfields at illinois.edu
Thu Jun 9 02:40:29 UTC 2011


If you find the 'details' link on the NCBI search page, it shows the query the site actually runs, which gives you a hint as to how the filter is done (in a fairly archaic way :).  Here is your term:

"Escherichia coli BW2952"[Organism] AND rpoB[Gene] AND srcdb_refseq[PROP]

So, adding 'srcdb_refseq[PROP]' to your search terms should limit the results to RefSeq records only.
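
As a rough sketch of how that might look with Bio::DB::EUtilities (not run against NCBI here): the refseq_term helper, the placeholder email address, and the -retmax of 100 are illustrative assumptions, not part of the module. It runs an esearch with the filtered term, then passes the resulting IDs to efetch as FASTA.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: build an Entrez term restricted to RefSeq records
# by appending the srcdb_refseq[PROP] filter.
sub refseq_term {
    my ($organism, $gene) = @_;
    return sprintf '"%s"[Organism] AND %s[Gene] AND srcdb_refseq[PROP]',
        $organism, $gene;
}

sub main {
    # Loaded here so the helper above can be used without BioPerl installed.
    require Bio::DB::EUtilities;

    my $email = 'you@example.org';    # placeholder; use your own address
    my $term  = refseq_term('Escherichia coli BW2952', 'rpoB');

    # Step 1: esearch returns the IDs of records matching the term.
    my $search = Bio::DB::EUtilities->new(
        -eutil  => 'esearch',
        -db     => 'protein',
        -term   => $term,
        -email  => $email,
        -retmax => 100,               # arbitrary cap for this sketch
    );
    my @ids = $search->get_ids;
    die "No RefSeq hits for: $term\n" unless @ids;

    # Step 2: efetch retrieves those records in FASTA format.
    my $fetch = Bio::DB::EUtilities->new(
        -eutil   => 'efetch',
        -db      => 'protein',
        -rettype => 'fasta',
        -email   => $email,
        -id      => \@ids,
    );
    print $fetch->get_Response->content;
}

# Run only when executed as a script, not when loaded from another file.
main() unless caller;
```

If the filter works as expected, only the ref|YP_... entry should come back rather than all three records.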

chris


On Jun 8, 2011, at 9:25 PM, Senthil Kumar M wrote:

> Hi,
> 
> I tried the Bio::DB::EUtilities example mentioned in this discussion:
> http://biostar.stackexchange.com/questions/3043/how-can-i-get-protein-sequence-in-fasta-format-using-taxon-id,
> with " -term => 'Escherichia coli BW2952[Orgn] AND rpoB[Gene/Protein
> Name]' and -db => 'protein' ". This retrieves three identical rpoB
> amino acid sequences, for brevity I provide only the fasta headers and
> not the actual sequences below:
> 
> >gi|259494181|sp|C5A0S7.1|RPOB_ECOBW RecName: Full=DNA-directed RNA polymerase subunit beta; Short=RNAP subunit beta; AltName: Full=RNA polymerase subunit beta; AltName: Full=Transcriptase subunit beta
> >gi|238863495|gb|ACR65493.1| RNA polymerase, beta subunit [Escherichia coli BW2952]
> >gi|238903043|ref|YP_002928839.1| RNA polymerase, beta subunit [Escherichia coli BW2952]
> 
> I am only interested in the RefSeq entry, ie YP_002928839.1 and not
> the other two. I can filter such duplicate entries after I download
> them from NCBI, but it would be nicer if there is a way to retrieve
> ONLY the RefSeq entries that match my query and download just them.
> 
> I am aware that it is easier to do this online at the NCBI protein
> site, where there is a filter option
> (http://www.ncbi.nlm.nih.gov/protein?term=escherichia coli
> BW2952[organism] AND rpoB[Gene%2FProtein Name]), but I would like to
> know if the same is achievable through EUtilities since I have many
> sequences to download from NCBI.
> 
> Reading "$ perldoc
> /usr/share/perl5/Bio/Tools/EUtilities/EUtilParameters.pm" and
> searching google did not provide any clues, but I might have missed
> something that was blindingly obvious. Any help would be much
> appreciated.
> 
> Thanks in advance,
> 
> Senthil