[Biopython] Write IncRNA Output with Biopython

Peter Cock p.j.a.cock at googlemail.com
Thu Oct 31 15:44:19 UTC 2019


Hi Zirui,

Please avoid screenshots for this kind of data - your input and output
are plain text, so you can easily share two or three FASTA records and
the matching output lines directly, even in a plain text email.

Getting the length of the sequence in Python is easy - but how would
you know if it was translated or not? What is in your starting FASTA
file - hopefully it is non-coding nucleotide sequences, but your
example looks like proteins. Depending on the organism, you might be
better off starting with an annotated genome in a format like EMBL or
GenBank which has the RNA genes explicitly labelled (your script would
then look for all the RNA features longer than 200 nucleotides).
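
To make both ideas concrete, here is a minimal sketch, not tested
against your data. The first function reports records longer than a
cutoff, the second walks annotated records looking for long RNA
features. The function names, the 200 cutoff, and the feature types
"ncRNA" and "misc_RNA" are my assumptions - check which feature types
your annotation actually uses. The file names in the closing comment
are placeholders too.

```python
from Bio import SeqIO

# Assumed cutoff, taken from the lncRNA definition quoted in this thread.
MIN_LENGTH = 200


def long_records(records, min_length=MIN_LENGTH):
    """Yield (id, length) for each record longer than min_length.

    Length alone cannot tell you whether a sequence is non-coding.
    """
    for record in records:
        if len(record.seq) > min_length:
            yield record.id, len(record.seq)


def long_rna_features(records, rna_types=("ncRNA", "misc_RNA"),
                      min_length=MIN_LENGTH):
    """Yield (record id, feature type, length) for long RNA features.

    The default rna_types are an assumption, adjust them to match
    the feature keys used in your annotation.
    """
    for record in records:
        for feature in record.features:
            length = len(feature.location)
            if feature.type in rna_types and length > min_length:
                yield record.id, feature.type, length


# Example usage (hypothetical file names):
#   for name, length in long_records(SeqIO.parse("rna.fasta", "fasta")):
#       print(name, length, sep="\t")
#   for hit in long_rna_features(SeqIO.parse("genome.gbk", "genbank")):
#       print(*hit, sep="\t")
```

Both functions take any iterable of SeqRecord objects, so the same
filter works whichever file format you parse with SeqIO.parse.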

Peter

On Thu, Oct 31, 2019 at 3:17 PM Zirui Zhou <zirui at vt.edu> wrote:
>
> Hi Peter,
>
> Thanks for your reply and help.
>
> Long non-coding RNAs (long ncRNAs, lncRNA) are a type of RNA, defined as being transcripts with lengths exceeding 200 nucleotides that are not translated into protein. This is the definition I googled.
>
> I have attached the input file and the desired output file. However, I have no idea what to do, haha.
>
>
>
> On Thu, Oct 31, 2019 at 9:03 AM Peter Cock <p.j.a.cock at googlemail.com> wrote:
>>
>> Ah - I see now that you'd tried sending a version of this email with a
>> screenshot (which got held for moderation), which gave a hint of the
>> output - but my questions remain.
>>
>> Peter
>>
>> On Thu, Oct 31, 2019 at 1:00 PM Peter Cock <p.j.a.cock at googlemail.com> wrote:
>> >
>> > What's an IncRNA list in this context? Do you have an example input
>> > FASTA file, and the desired output?
>> >
>> > The Biopython Tutorial covers parsing FASTA sequence files (and other formats):
>> >
>> > http://biopython.org/DIST/docs/tutorial/Tutorial.html
>> >
>> > Peter
>> >
>> > On Thu, Oct 31, 2019 at 2:07 AM Zirui Zhou <zirui at vt.edu> wrote:
>> > >
>> > > Dear All,
>> > >
>> > > I am looking for help with a Python script in a Jupyter Notebook.
>> > >
>> > > Its function is to take a protein FASTA file and output an IncRNA list.
>> > > I really have no idea how to do it, and I would greatly appreciate any help.
>> > > Thanks a lot.
>> > > Best,
>> > >
>> > > --
>> > > Zirui Zhou
>> > > Ph.D. Student
>> > > Department of Chemical Engineering
>> > > 392 Goodwin Hall, Virginia Tech
>
>
>
> --
> Zirui Zhou
> Ph.D. Student
> Department of Chemical Engineering
> 392 Goodwin Hall, Virginia Tech

