[Biopython] remove list redundancy
ferreirafm at usp.br
Fri Mar 23 23:35:54 UTC 2012
Thanks everyone for helping.
Have a nice weekend.
Fred
Quoting ferreirafm at usp.br:
> Hi Biopy users,
> I have a multi-sequence FASTA file which I've read as a list. Is
> there a clever way/method to remove redundant sequences?
> Thanks in advance,
> Fred
>
> ### CODE:
> def redundancy(fastafile):
>     f=open(fastafile, 'r')
>     record = list(SeqIO.parse(f,"fasta"))
>     new_rec = record
>     f.close
>     print len(record)
>     for i in range(len(record)):
>         for j in range(len(record)):
>             if i < j:
>                 if record[i].seq == record[j].seq:
>                     del new_rec[j]
>     print len(new_rec)
>
>
> ### RESULTS:
> $ redundancy.py -run all_emm_fake.fasta
> 823
> /usr/lib64/python2.7/site-packages/Bio/Seq.py:197: FutureWarning: In
> future comparing Seq objects will use string comparison (not object
> comparison). Incompatible alphabets will trigger a warning (not an
> exception). In the interim please use id(seq1)==id(seq2) or
> str(seq1)==str(seq2) to make your code explicit and to avoid this
> warning.
> "and to avoid this warning.", FutureWarning)
> 823
>
> ### EXPECTING:
> Worse, the function above is not working. I was expecting 823 before
> and 822 after running it.
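
For the archive, a note on why the count likely stays at 823. The
FutureWarning in the output says that, in this Biopython version, == on
Seq objects compares object identity and not the sequence string, so two
distinct records with identical residues never compare equal. Separately,
new_rec = record binds a second name to the same list instead of copying
it, f.close without parentheses never closes the file, and deleting from
a list while indexing over it can skip items or raise an IndexError.

Below is a minimal sketch of one common alternative: compare sequences as
strings and keep the first record seen for each distinct string. The names
remove_redundant and FakeRecord are hypothetical; FakeRecord only stands in
for a SeqRecord so the sketch runs without Biopython installed.

```python
from collections import namedtuple

# Hypothetical stand-in for Bio.SeqRecord.SeqRecord: only .id and .seq are used.
FakeRecord = namedtuple("FakeRecord", ["id", "seq"])


def remove_redundant(records):
    """Return a new list without records whose sequence was already seen.

    Sequences are compared as plain strings via str(rec.seq), which is
    what the FutureWarning above recommends. The first record carrying
    a given sequence is kept; input order is preserved.
    """
    seen = set()      # sequence strings encountered so far
    unique = []       # records to keep, built as a separate list
    for rec in records:
        key = str(rec.seq)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Small demo with made-up data: "b" duplicates the sequence of "a".
demo = [
    FakeRecord("a", "ACGT"),
    FakeRecord("b", "ACGT"),
    FakeRecord("c", "TTGA"),
]
unique_demo = remove_redundant(demo)  # keeps the records with ids "a" and "c"
```

With Biopython, the same function should accept the output of
list(SeqIO.parse(fastafile, "fasta")), and the result can be written
back with SeqIO.write(unique, "unique.fasta", "fasta"), where the output
file name is only an example. Because the set lookup takes roughly constant
time per record, this avoids the nested loop over all pairs. Note that it
treats sequences as case-sensitive and only removes exact duplicates.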
>
> _______________________________________________
> Biopython mailing list - Biopython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython
>