[Bioperl-l] embl sequences 2 fasta

Pedro Antonio Reche reche at research.dfci.harvard.edu
Tue Aug 9 11:48:47 EDT 2005


Hi,
I am interested in finding all sequences in EMBL that match a given
feature in subcellular location, and then creating a single file for
each of them in FASTA format. Any help will be appreciated.
Regards,

pedro

On Jun 30, 2005, at 5:55 PM, Josh Lauricha wrote:

> On Thu 06/30/05 16:48, Joseph Bedell wrote:
>> You can calculate the score given the bit score (from the tabular
>> output) and Lambda (calculated from the matrix). The equation is
>> Score = (Bits)/(Lambda in bits).
>>
>> Lambda is only dependent upon the matrix. Did you use NCBI-blast or
>> WU-BLAST? Which flavor of blast (blastn, blastp, etc)? In any case,
>> you can just run a single blast and look at the stats at the bottom
>> of the report to get the value of lambda. For example, a default
>> NCBI-blastn (+1/-3) search has a lambda of 1.37:
>>
>> ============================
>> Lambda     K      H
>>     1.37    0.711     1.31
>>
>> Gapped
>> Lambda     K      H
>>     1.37    0.711     1.31
>> ===============================
>>
>> But, what is difficult to discover is that this lambda is in nats.
>> To convert it to bits, divide it by the natural log of 2, or in perl:
>>
>> perl -e 'print 1.37/log(2),"\n"'
>> 1.97649220601788
>>
>> So, now you can divide all of your bit scores by 1.97649220601788
>> to get the Score.
>>
>> HTH,
>> Joey
>
> Cool, thanks. That'll save me a bunch of time ;) This was NCBI blastp,
> so I've already got it calculated ;)
>
> Thanks.
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l


