[Bioperl-l] Very basic Perl/BioPerl Help

Stefan Kirov skirov at utk.edu
Thu Apr 14 11:44:47 EDT 2005


Sorry Colin,
I was thinking of sort/diff, but this may not work well as there will be 
insertions/deletions... You can just use Perl to cycle through both lists
and keep the accessions from the first file that are missing from the second:
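use strict;
use warnings;

my ($f1, $f2) = @ARGV;
open(my $fh1, '<', $f1) or die "Cannot open $f1: $!";
open(my $fh2, '<', $f2) or die "Cannot open $f2: $!";
# chomp so trailing newlines do not take part in the comparison
chomp(my @accn1 = <$fh1>);
chomp(my @accn2 = <$fh2>);
my @unique1;
foreach my $accn (@accn1) {
    # exact string match; a regex would match substrings and treat '.' as a wildcard
    push @unique1, $accn unless grep { $_ eq $accn } @accn2;
}
# accessions found only in the first file
print "$_\n" for @unique1;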
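
Since you asked about hashes: the loop above rescans the whole second list for
every accession, and it only reports what is missing from the second file. A
hash lookup avoids the rescan and makes it easy to report both directions. The
following is only a minimal sketch; the sub name not_in_both and the accession
strings are made-up placeholders, not your real data:

```perl
use strict;
use warnings;

# Return two array refs: accessions only in the first list,
# and accessions only in the second list.
sub not_in_both {
    my ($list1, $list2) = @_;
    my %in1 = map { $_ => 1 } @$list1;   # accession => 1 for fast lookup
    my %in2 = map { $_ => 1 } @$list2;
    my @only1 = grep { !$in2{$_} } @$list1;
    my @only2 = grep { !$in1{$_} } @$list2;
    return (\@only1, \@only2);
}

# Placeholder accessions for illustration only.
my @old = qw(NM_000001 NM_000002 NM_000003);
my @new = qw(NM_000002 NM_000003 NM_000004);

my ($only_old, $only_new) = not_in_both(\@old, \@new);
print "only in first list:  @$only_old\n";
print "only in second list: @$only_new\n";
```

With real files you would fill @old and @new the same way @accn1 and @accn2
are read above (after chomp).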
Sorry for the confusion
Stefan
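
P.S. For matching the new accessions to Entrez Genes via gene2accession, a hash
keyed by accession is probably the simplest route. This is a rough sketch and
not tested against the real file: I am assuming tab-delimited lines with the
GeneID in column 2 and the RNA accession.version in column 4, with '-' for a
missing value, so please check that against the header of your copy. The sub
name build_lookup and the sample rows are made up for illustration.

```perl
use strict;
use warnings;

# Build accession => GeneID from gene2accession-style lines.
# ASSUMPTION: column 2 = GeneID, column 4 = RNA accession.version.
sub build_lookup {
    my ($fh) = @_;
    my %gene_for;
    while (my $line = <$fh>) {
        next if $line =~ /^#/;            # skip the header line
        chomp $line;
        my @col = split /\t/, $line;
        my ($gene_id, $acc) = @col[1, 3];
        next unless defined $acc && $acc ne '-';
        $acc =~ s/\.\d+$//;               # drop the version suffix
        $gene_for{$acc} = $gene_id;
    }
    return \%gene_for;
}

# Made-up sample rows standing in for the real file.
my $sample = join '', map { join("\t", @$_) . "\n" } (
    ['#tax_id', 'GeneID', 'status', 'RNA_acc'],
    [9606, 1001, 'REVIEWED', 'NM_000004.2'],
    [9606, 1002, 'REVIEWED', '-'],
);
open(my $fh, '<', \$sample) or die "Cannot open sample: $!";
my $gene_for = build_lookup($fh);

# Look up each accession that is not yet in your list.
for my $acc (qw(NM_000004 NM_999999)) {
    my $gene = exists $gene_for->{$acc} ? $gene_for->{$acc} : 'not found';
    print "$acc\t$gene\n";
}
```

For the real data you would open the gene2accession file instead of the
in-memory sample and loop over your list of new accessions.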

Colin Erdman wrote:

>Hello all,
>
> 
>
>I certainly pounded away at this one last night, I thought this part would
>be easy, but after spending so much time getting my Entrez gene data parsed
>etc my brain was a bit rubbery. 
>
> 
>
>What I am trying to do is take either A) Two fasta files with refseq/genbank
>data OR B) Two text files with 1 accession# per line and compare them,
>outputting only those fasta seqs or accession #'s that are not present in
>both.
>
>So is it easier to just use perl somehow to compare the two raw
>acc# text files?
>
>Or should I keep them as FASTA seqs and compare using Bio::Seq
>objs somehow?
>
> 
>
>The idea is to update a list of Chromosome 21 genes last revised in 2003 by
>comparing those accession numbers in our list with all of those accession
>#'s that I pulled from an entrezgene 21[CHR] AND Homo sapiens[ORGN] NOT
>pseudogene query and then saved the output as an ASN.1 file. I have all the
>accession #'s. 
>
> 
>
>I just will need to match up those accession #'s NOT currently in our list
>with the appropriate Entrez Genes using gene2accession, but I am not sure
>how to do that either. I am assuming using a hash, but they have been steep
>for me in terms of learning curve, but I'd like to learn them now, I will
>just need some intuitive support.
>
> 
>
>Thanks all!
>
>Colin 

-- 
Stefan Kirov, Ph.D.
University of Tennessee/Oak Ridge National Laboratory
5700 bldg, PO BOX 2008 MS6164
Oak Ridge TN 37831-6164
USA
tel +865 576 5120
fax +865-576-5332
e-mail: skirov at utk.edu
sao at ornl.gov

"And the wars go on with brainwashed pride
For the love of God and our human rights
And all these things are swept aside"


