[Bioperl-l] Very basic Perl/BioPerl Help
Colin Erdman
cerdman2 at du.edu
Thu Apr 14 11:03:06 EDT 2005
Hello all,
I certainly pounded away at this one last night. I thought this part would
be easy, but after spending so much time getting my Entrez Gene data parsed,
etc., my brain was a bit rubbery.
What I am trying to do is take either A) two FASTA files with RefSeq/GenBank
data OR B) two text files with one accession number per line, and compare
them, outputting only those FASTA seqs or accession numbers that are not
present in both.
So is it easier to just use Perl somehow to compare the two raw
accession-number text files?
Or should I keep them as FASTA seqs and compare using Bio::Seq
objects somehow?
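For the plain-text option (B), one common approach is to load each file into
a hash keyed by accession and then report the keys missing from the other
hash. The following is only a minimal sketch under stated assumptions: the
function names read_accessions and only_in_first are made up for this
example, each file holds one accession per line, and a trailing version
suffix such as ".2" is stripped so that different versions of the same
accession compare as equal. Drop that substitution if versions matter.

```perl
use strict;
use warnings;

# Read a file with one accession per line and return a hash reference
# whose keys are the accessions. (Hypothetical helper for illustration.)
sub read_accessions {
    my ($path) = @_;
    my %seen;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    while (my $line = <$fh>) {
        $line =~ s/^\s+|\s+$//g;    # trim surrounding whitespace and newline
        next if $line eq '';        # skip blank lines
        $line =~ s/\.\d+$//;        # ASSUMPTION: ignore the ".N" version suffix
        $seen{$line} = 1;
    }
    close $fh;
    return \%seen;
}

# Return the sorted accessions that are keys of the first hash but not
# of the second.
sub only_in_first {
    my ($first, $second) = @_;
    my @only = grep { !exists $second->{$_} } sort keys %$first;
    return @only;
}

# Run as: perl compare_acc.pl old_list.txt new_list.txt
if (@ARGV == 2) {
    my ($old, $new) = map { read_accessions($_) } @ARGV;
    print "Only in $ARGV[0]:\n";
    print "  $_\n" for only_in_first($old, $new);
    print "Only in $ARGV[1]:\n";
    print "  $_\n" for only_in_first($new, $old);
}
```

The hash lookup (exists) is what makes this quick: each accession is checked
once rather than scanning the other list for every line. The same idea should
carry over to option (A) by collecting sequence IDs from the FASTA files
first, for example with Bio::SeqIO, and comparing the IDs the same way.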
The idea is to update a list of Chromosome 21 genes last revised in 2003 by
comparing the accession numbers in our list with all of the accession
numbers that I pulled from an Entrez Gene query (21[CHR] AND Homo
sapiens[ORGN] NOT pseudogene), whose output I saved as an ASN.1 file. I have
all the accession numbers.
I will then need to match up the accession numbers NOT currently in our list
with the appropriate Entrez Gene records using gene2accession, but I am not
sure how to do that either. I assume it involves a hash. Hashes have had a
steep learning curve for me, but I'd like to learn them now; I will just
need some intuitive support.
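A hash can also drive the gene2accession step: keep the wanted accessions as
hash keys, stream through the file once, and record the GeneID whenever an
accession on the current line is one of those keys. This is a rough sketch,
not a tested pipeline. It assumes the tab-delimited NCBI gene2accession
layout with tax_id in column 1, GeneID in column 2, and the RNA, protein and
genomic accession.version values in columns 4, 6 and 8; please check those
positions against the header line of your copy of the file. The function name
map_accessions_to_genes is made up for this example, and the wanted
accessions are assumed to be stored without version suffixes.

```perl
use strict;
use warnings;

# Map each wanted accession to the GeneIDs listed for it in a
# gene2accession-style file. $wanted is a hash reference whose keys are
# unversioned accessions. (Hypothetical helper for illustration.)
sub map_accessions_to_genes {
    my ($g2a_path, $wanted, $tax_id) = @_;
    $tax_id = '9606' unless defined $tax_id;    # 9606 = Homo sapiens
    my %genes_for;
    open my $fh, '<', $g2a_path or die "Cannot open $g2a_path: $!";
    while (my $line = <$fh>) {
        next if $line =~ /^#/;                  # skip the header line
        chomp $line;
        my @f = split /\t/, $line;
        next unless @f >= 8 && $f[0] eq $tax_id;
        my $gene_id = $f[1];
        # ASSUMPTION: columns 4, 6, 8 hold RNA, protein, genomic accessions
        for my $acc (@f[3, 5, 7]) {
            next if $acc eq '-';                # '-' marks a missing value
            (my $bare = $acc) =~ s/\.\d+$//;    # drop the version suffix
            $genes_for{$bare}{$gene_id} = 1 if exists $wanted->{$bare};
        }
    }
    close $fh;
    return \%genes_for;
}

# Run as: perl map_genes.pl gene2accession NM_000000 NM_000001 ...
if (@ARGV >= 2) {
    my ($path, @accs) = @ARGV;
    my %wanted = map { $_ => 1 } @accs;
    my $genes_for = map_accessions_to_genes($path, \%wanted);
    for my $acc (sort keys %wanted) {
        my @ids = sort keys %{ $genes_for->{$acc} || {} };
        print "$acc\t", (@ids ? join(',', @ids) : 'no match'), "\n";
    }
}
```

The inner hash is used as a set so that an accession appearing on several
lines for the same gene is reported only once.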
Thanks all!
Colin