[Bioperl-l] From Blast hits to Taxonomy lineage for Short DNA Sequences (reads)

Thu Mar 10 17:42:51 UTC 2011

Hey Ian,
             Writing that kind of script would not take much time, but
I am not very familiar with the XML format.
Still, you can do this without writing a complicated script.

Once you have the BLAST output, you can extract all the hit IDs (use grep or
a simple Perl script to parse them out).
Then use that list of IDs to fetch the sequences from your database with
"fastacmd".
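
If fetching from the formatted database is not convenient, the same two steps
can be done directly against the FASTA file. Below is only a rough sketch in
Python, not a tested pipeline. It assumes the first word of each <Hit_def> in
the BLAST XML matches the first word of the FASTA header, which is what I
would expect when the database was formatted without ID parsing; if your IDs
show up in <Hit_id> instead, change the tag name. The function names are made
up for illustration.

```python
import xml.etree.ElementTree as ET


def hit_ids_from_blast_xml(xml_path, tag="Hit_def"):
    """Collect the first word of every <Hit_def> element in a BLAST XML report.

    Assumption: that first word is the read ID used in the FASTA header.
    Pass tag="Hit_id" if the IDs are stored there instead.
    """
    ids = set()
    # iterparse streams the file, so a report with thousands of hits
    # does not have to be held in memory all at once.
    for _, elem in ET.iterparse(xml_path):
        if elem.tag == tag and elem.text:
            words = elem.text.split()
            if words:
                ids.add(words[0])
        elem.clear()  # free elements that have already been processed
    return ids


def write_matching_records(fasta_path, wanted_ids, out_path):
    """Copy every FASTA record whose header's first word is in wanted_ids.

    Returns the number of records written.
    """
    written = 0
    keep = False
    with open(fasta_path) as fasta, open(out_path, "w") as out:
        for line in fasta:
            if line.startswith(">"):
                words = line[1:].split()
                keep = bool(words) and words[0] in wanted_ids
                if keep:
                    written += 1
            if keep:
                out.write(line)
    return written
```

With placeholder file names, you would call it as
write_matching_records("reads.fasta", hit_ids_from_blast_xml("blast.xml"), "hits.fasta").
Collecting the IDs in a set means each read is written only once, even if it
is hit by several queries.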

-Shalabh

On Thu, Mar 10, 2011 at 12:31 PM, Ian Mc Dowell <ian.mcdowell at gmail.com> wrote:

> I have a deeply sequenced transcriptome, so when I look for a gene of
> interest in my short reads (108 bp) I get thousands of hits at very low
> E-values, and I want to do this for many sequences of interest. That's why I
> want a script to automate the process: grab the accession numbers from
> the BLAST XML output, then search the FASTA file that corresponds to my
> local database and pull out the relevant reads.
>
> On Thu, Mar 10, 2011 at 12:24 PM, shalabh sharma <
> shalabh.sharma7 at gmail.com> wrote:
>
>> Hey Ian,
>>             I am not sure I understood your problem completely.
>> But if you have the IDs of the BLAST hits, you can use 'fastacmd' to fetch
>> the sequences from the database you used for BLAST.
>>
>> -Shalabh Sharma
>> -----------------------------------------
>> Shalabh Sharma
>> Scientific Computing Professional Associate (Bioinformatics Specialist)
>> Department of Marine Sciences
>> University of Georgia
>> Athens, GA 30602-3636
>>
>> On Thu, Mar 10, 2011 at 12:11 PM, Ian Mc Dowell <ian.mcdowell at gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I would like to take local blast hit sequences, i.e. hsp_hseq, and extract
>>> the full sequences of those hits from the original fasta file and put them
>>> in a fasta file of all hits that I can use later.
>>>
>>> This should be a widely performed task but I can't find any scripts that
>>> directly address this issue.  I have not acquired the skills to make my own
>>> scripts for this task.
>>>
>>> Thanks so much if you have anything that can help me out,
>>>
>>> Ian McDowell
>>> Aquatic Pathology
>>> University of Rhode Island
>>>
>>
>>
>

-- 
Shalabh Sharma
Scientific Computing Professional Associate (Bioinformatics Specialist)
Department of Marine Sciences
University of Georgia
Athens, GA 30602-3636