[Biopython] sort fasta file
Eric Talevich
eric.talevich at gmail.com
Wed Mar 17 18:32:44 UTC 2010
xyz <mitlox at op.pl> wrote:
>
> Hello,
> I would like to sort a multi-FASTA file by sequence length,
> i.e. from the read with the longest sequence to the read with the
> shortest sequence.
>
> I have tried to do it, but I do not know how to sort the records by
> sequence length.
>
> [...]
>
> If I cannot hold all the records in memory at once, what can I do?
>
There's also a program called uclust which can sort reads by sequence length
very quickly:
http://www.drive5.com/uclust/
It's designed for clustering short reads, but it includes a feature to sort
sequences by decreasing length. I think it can handle files larger than
available RAM, too, though I haven't tested that.
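
If you would rather stay in Python, here is a minimal sketch of one way to
sort by decreasing length without loading every sequence into memory. It is
plain standard-library Python, not uclust's method and not a Biopython API:
the function name sort_fasta_by_length is made up for illustration. It assumes
the input is an uncompressed FASTA file on disk, and that a small
(length, offset, size) tuple per record does fit in RAM even if the sequences
themselves do not. Biopython's Bio.SeqIO.index() follows a similar
index-by-offset idea if you prefer to stay within Biopython.

```python
def sort_fasta_by_length(in_path, out_path):
    """Copy FASTA records from in_path to out_path, longest sequence first.

    Hypothetical helper for illustration. Only one small tuple per record
    is held in memory; the sequence data is re-read from disk when writing.
    Returns the number of records written.
    """
    index = []  # (sequence_length, byte_offset, byte_size) for each record

    # Pass 1: scan the file once, noting where each record starts and
    # how many sequence characters it contains.
    with open(in_path, "rb") as handle:
        start = None   # byte offset of the current record's ">" line
        seq_len = 0    # residues counted so far in the current record
        pos = 0        # byte offset of the line being read
        for line in handle:
            if line.startswith(b">"):
                if start is not None:
                    index.append((seq_len, start, pos - start))
                start, seq_len = pos, 0
            elif start is not None:
                # strip() drops the newline so it is not counted as sequence
                seq_len += len(line.strip())
            pos += len(line)
        if start is not None:
            index.append((seq_len, start, pos - start))

    # Longest first. Python's sort is stable, so records of equal length
    # keep their original relative order.
    index.sort(key=lambda rec: rec[0], reverse=True)

    # Pass 2: copy the raw text of each record in sorted order.
    with open(in_path, "rb") as src, open(out_path, "wb") as dst:
        for _, offset, size in index:
            src.seek(offset)
            chunk = src.read(size)
            # Make sure a last record without a trailing newline still
            # ends cleanly before the next header.
            dst.write(chunk if chunk.endswith(b"\n") else chunk + b"\n")

    return len(index)
```

Calling sort_fasta_by_length("reads.fasta", "reads_sorted.fasta") (both file
names are placeholders) would write the sorted copy. Because each record is
copied as raw bytes, the original line wrapping and header text are preserved.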
-Eric