[Biojava-l] GCG Checksums

Russell Smithies russell.smithies@xtra.co.nz
Tue, 15 Oct 2002 09:08:12 +1300


Hi,

Here's a bit of code I wrote last year to calculate the GCG checksums:

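 public int GCGChecksum(String seq) {
     int count = 0;
     int check = 0;
     // Upper-case once, rather than on every loop iteration.
     String upper = seq.toUpperCase();

     for (int i = 0; i < upper.length(); i++) {
         count++;
         // Weight each character code by a counter that cycles from 1 to 57.
         check += count * upper.charAt(i);
         if (count == 57) {
             count = 0;
         }
     }
     // GCG check values are usually four digits (BioPerl reduces modulo
     // 10000), so confirm this modulus against real GCG output.
     return check % 100000;
 }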

As you can see, it's only a hashing checksum so I don't think you'll be
breaking anyone's copyright.
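
For anyone who wants to try the calculation by hand, here is a minimal, self-contained sketch with a small worked example. The class name GcgChecksumDemo and the use of a long accumulator are assumptions added for illustration (an int sum can overflow on very long sequences); the weighting scheme and the modulus are taken unchanged from the method above.

```java
// Hypothetical demo class; its name and the long accumulator are assumptions,
// not part of the original snippet.
class GcgChecksumDemo {

    // Each upper-cased character code is multiplied by a counter that
    // cycles from 1 to 57, and the products are summed.
    static int gcgChecksum(String seq) {
        String upper = seq.toUpperCase();
        long check = 0;   // long, so very long sequences cannot overflow
        int count = 0;
        for (int i = 0; i < upper.length(); i++) {
            count++;
            check += (long) count * upper.charAt(i);
            if (count == 57) {
                count = 0;
            }
        }
        return (int) (check % 100000);  // modulus copied from the method above
    }

    public static void main(String[] args) {
        // A=65, C=67, G=71, T=84: 1*65 + 2*67 + 3*71 + 4*84 = 748
        System.out.println(gcgChecksum("ACGT"));  // prints 748
        System.out.println(gcgChecksum("acgt"));  // prints 748 (case is ignored)
    }
}
```

Because the weights depend on position, swapping two different residues changes the result, which is what makes the value useful as a quick integrity check on a sequence.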

Hope this helps,

Russell





> Message: 2
> Subject: RE: [Biojava-l] GCG format...
> Date: Mon, 14 Oct 2002 12:05:29 +1300
> From: "Schreiber, Mark" <mark.schreiber@agresearch.co.nz>
> To: "Wiepert, Mathieu" <Wiepert.Mathieu@mayo.edu>,
>        "Andrew Macgregor" <andrew@anatomy.otago.ac.nz>,
>        "BioJava" <biojava-l@biojava.org>
>
> One of the issues with GCG format is the checksum. I have seen methods
> that calculate and verify the checksum but I'm not sure that they are
> supposed to be used. I.e., GCG won't tell you how to do it, so doing it
> anyway might be breaking some annoying little copyright law.
>
> - Mark
>
>
> > -----Original Message-----
> > From: Wiepert, Mathieu [mailto:Wiepert.Mathieu@mayo.edu]
> > Sent: Saturday, 12 October 2002 4:54 a.m.
> > To: 'Andrew Macgregor'; BioJava
> > Subject: RE: [Biojava-l] GCG format...
> >
> >
> > Hi,
> >
> > Have you considered doing a system call to seqret (from
> > emboss)?  Is that a possibility for you?  Not sure what
> > system you are running on, etc.  May be a hack, but it would
> > work.  Or else system calls to GCG, which will also do some
> > formatting for you.  Again a hack, but workable...
> >
> > -Mat
> >
> > > -----Original Message-----
> > > From: Andrew Macgregor [mailto:andrew@anatomy.otago.ac.nz]
> > > Sent: Thursday, October 10, 2002 10:58 PM
> > > To: BioJava
> > > Subject: [Biojava-l] GCG format...
> > >
> > >
> > > Hi all,
> > >
> > > > I see from the mailing list archive that there was mention of
> > > > someone creating a GCG format for BioJava. Did this happen? Is it
> > > > necessary? I'm interested in seeing how to convert from one format
> > > > to another, a bit like I could with this bioperl script. Is
> > > > something like this possible? I can see how to use Embl, Genbank
> > > > and Fasta but not GCG.
> > >
> > > TIA for any pointers.
> > >
> > >
> > > use Bio::SeqIO;
> > >
> > > > my $in  = Bio::SeqIO->new(-file => "$ARGV[0]" , '-format' => 'Fasta');
> > >
> > > # number for files
> > > my $i=1;
> > >
> > > while ( my $seq = $in->next_seq() ) {
> > >
> > > >     my $out = Bio::SeqIO->new(-file => ">$i.seq" , '-format' => 'gcg');
> > >     $out->write_seq($seq);
> > >
> > >     $i++;
> > > }
> > >
> > >
> > > Cheers, Andrew.
> > >