[Bioperl-l] Bioperl-l Digest, Vol 117, Issue 13

Wed Jan 30 23:13:49 UTC 2013

We certainly welcome help with keeping the code up to date.  In fact, if you are familiar with git/GitHub, the process is fairly straightforward:

1) Fork the code to your GitHub account
2) Make and commit changes
3) Submit a pull request
4) Post something to the list just in case.

We also accept code patches; the best way to submit these is as a bug report in our Redmine tracker (it doesn't hurt to post here as well):

https://redmine.open-bio.org/

chris

On Jan 30, 2013, at 3:40 PM, Dan kilburn <dr_kilburn59 at yahoo.com> wrote:

> Hi Jason,
> 
> Are there any plans to keep SearchIO up to date with NCBI BLAST? I know they change formats ridiculously often, but I had to write my own parser to get sequence identity, which I would rather not have done. I realize that this job would be a big load on anyone who takes it on, but it's so fundamental. Maybe I can help.
> 
> --Dan
> 
> On Jan 30, 2013, at 12:00 PM, bioperl-l-request at lists.open-bio.org wrote:
> 
>> 
>> Today's Topics:
>> 
>>  1. Re:  Parsing Blast-Report extracting "Features flanking    .."
>>     (Jason Stajich)
>> 
>> 
>> ----------------------------------------------------------------------
>> 
>> Message: 1
>> Date: Tue, 29 Jan 2013 11:00:16 -0800
>> From: Jason Stajich <jason.stajich at gmail.com>
>> Subject: Re: [Bioperl-l] Parsing Blast-Report extracting "Features
>>   flanking    .."
>> To: buschj at hhu.de
>> Cc: bioperl-l at lists.open-bio.org
>> Message-ID: <6E83E3F3-C304-4DC4-9A11-FE1CA90F207D at gmail.com>
>> Content-Type: text/plain;    charset=us-ascii
>> 
>> We don't parse the NCBI feature info from BLAST reports, so SearchIO cannot give you what you are asking for. To look up a specific feature, you can use Bio::DB::GenBank to retrieve the sequence for that feature by accession number; see the HOWTOs for that.
>> 
>> However, most people working with short reads use tools that generate SAM/BAM files; you can then use a tool like BEDTools to find overlaps between the reads and the locations of features.
>> 
>> Basically:
>> - download the genome and GFF for Arabidopsis
>> - align your sRNA reads to the genome with a short-read aligner (Bowtie, BWA, or others)
>> - convert your SAM file to a BAM file with SAMtools or Picard
>> - compare the locations of features with the reads using BEDTools to get expression summaries or individual reads
>> 
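As a rough illustration of the last step above, here is a minimal Python sketch of the kind of interval comparison involved. It is not how BEDTools is implemented and is no substitute for it on real data; the Feature type, the count_overlaps function, and the toy coordinates are all made up for this example, and it assumes 1-based, inclusive coordinates on a single chromosome.

```python
from collections import namedtuple

# Hypothetical record for an annotated feature on one chromosome.
Feature = namedtuple("Feature", "name start end")


def count_overlaps(features, reads):
    """Count how many reads overlap each feature.

    Coordinates are assumed to be 1-based and inclusive, and all
    intervals are assumed to lie on the same chromosome.
    """
    counts = {f.name: 0 for f in features}
    for r_start, r_end in reads:
        for f in features:
            # Two closed intervals overlap when each one starts
            # no later than the other one ends.
            if r_start <= f.end and r_end >= f.start:
                counts[f.name] += 1
    return counts


# Toy data, invented purely for illustration.
features = [Feature("geneA", 100, 200), Feature("geneB", 500, 650)]
reads = [(90, 110), (150, 170), (300, 320), (640, 660)]
counts = count_overlaps(features, reads)
```

The nested loop is quadratic, which is fine for a toy example but is one reason dedicated tools are preferable for millions of reads.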
>> 
>> On Jan 25, 2013, at 2:20 AM, jobu <buschj at hhu.de> wrote:
>> 
>>> On 22.01.2013 19:03, Mgavi Brathwaite wrote:
>>>> What upstream and downstream elements are you interested in?
>>> 
>>> 
>>> I've got a huge pile of short RNA reads.
>>> Part of the question now is whether those RNA fragments originate from
>>> siRNA events
>>> or may represent miRNAs / parts of pre-miRNAs.
>>> 
>>> So I ran an online BLAST search against the nt database.
>>> The resulting report quite often gives only subject information like this:
>>> 
>>> -----
>>>> gb|CP002686.1| Arabidopsis thaliana chromosome 3, complete sequence
>>> Length=23459830
>>> -----
>>> 
>>> Now I would like to get the hit's neighbouring regions for further
>>> analysis.
>>> Preferably I would like to do that in an automated way, but I guess the
>>> only possible action with this kind of subject gi | description would be
>>> to fetch the entire chromosomal sequence?
>>> 
>>> However,
>>> right below the line above, the report states more precisely:
>>> 
>>> ------
>>> Features flanking this part of subject sequence:
>>> 8872 bp at 5' side: cytochrome P450 90B1
>>> 402 bp at 3' side: U1 small nuclear ribonucleoprotein-70K
>>> ------
>>> 
>>> Still, I would like to be able to fetch the
>>> subject's sequence(s) automatically.
>>> As of now I think parsing the report with SearchIO won't let me acquire
>>> that information, because SearchIO does not recognize report sections
>>> like those.
>>> 
>>> I hope I did not miss any of SearchIO's capabilities, but I could not
>>> find any method covering my wish.
>>> 
>>> Right now, maybe the only way to get the information I want is to
>>> write my own parser and save its output to a separate file, which in
>>> turn I could read into a hash before processing the BLAST report
>>> with SearchIO, combining both data sets for further automated work.
>>> 
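A minimal sketch of the stand-alone parser described above, written in Python rather than Perl and not part of SearchIO. It assumes the plain-text report looks like the excerpts quoted in this message (a subject line starting with ">" followed by "N bp at 5'/3' side: description" lines); the real layout may differ between BLAST versions. The names FLANK_RE and parse_flanking_features are invented for this example.

```python
import re

# Matches lines such as "8872 bp at 5' side: cytochrome P450 90B1".
# Assumption: this layout is taken from the report excerpt quoted above.
FLANK_RE = re.compile(r"^\s*(\d+)\s+bp at ([35])' side:\s*(.+?)\s*$")


def parse_flanking_features(report_text):
    """Return {subject_id: [(distance_bp, side, description), ...]}."""
    flanks = {}
    current = None
    for line in report_text.splitlines():
        if line.startswith(">"):
            # Subject header; the first whitespace-delimited token is the ID.
            current = line[1:].split()[0]
            flanks.setdefault(current, [])
        elif current is not None:
            m = FLANK_RE.match(line)
            if m:
                distance, side, description = m.groups()
                flanks[current].append((int(distance), side + "'", description))
    return flanks


# Sample text assembled from the report excerpts quoted in this message.
sample = """\
>gb|CP002686.1| Arabidopsis thaliana chromosome 3, complete sequence
Length=23459830

Features flanking this part of subject sequence:
   8872 bp at 5' side: cytochrome P450 90B1
   402 bp at 3' side: U1 small nuclear ribonucleoprotein-70K
"""
flanks = parse_flanking_features(sample)
```

The resulting dictionary plays the role of the hash mentioned above: it can be keyed by the same subject ID that the report parser returns for each hit. If a subject has several HSPs, their flanking features all end up in the same list, so a real parser would probably need to track HSPs as well.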
>>> I am aware, though, that even successfully getting the flanking features
>>> would leave me with the more or less wide intergenic gap my HSP is
>>> located in.
>>> 
>>> However, I need a way to get the flanking features, including
>>> their annotation, and the region spanning between them.
>>> I hope I do not have to fetch complete sequences to accomplish that,
>>> as this would be overkill.
>>> 
>>> with kind regards
>>> Jochen
>>> 
>>> 
>>> 
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> 
>> Jason Stajich
>> jason.stajich at gmail.com
>> jason at bioperl.org
>> 