[Bioperl-l] SeqIO; writing to custom out format?

Charles Hauser chauser at duke.edu
Fri Feb 21 10:04:31 EST 2003


Hi Brian,

My primary aim is to parse genbank data into a chado schema (postgres).

In addition, I really need a better way to manage the data locally. I
have been swamped of late and have not had time to look into biosql, but
it may be well suited for the task.  My impression was that it is a
MySQL DB; is there a postgres port?

I have a postgres EST/contig DB on the web server for web-based queries.

In addition to the genbank data I have lots of EST sequences, their
associated quality data, and several sets of contigs assembled from the
ESTs to store/manage/retrieve.

I know I am depending too heavily on data retrieval from flat files -
just have not made the time to get it done yet.

All suggestions/pointers are most welcome.

regards,

Charles


On Thu, 2003-02-20 at 20:39, Brian Osborne wrote:
> Charles,
> 
> Do you mean you'd like to load your Genbank files into postgres? Do you 
> need to use your own schema or can you use the biosql database? Are you 
> simply going to discard the fasta files after? Excuse the many 
> questions but the answers are slightly different depending on what you 
> want to do.
> 
> Brian O.
> 
> 
> On Friday, February 21, 2003, at 05:22 AM, Charles Hauser wrote:
> 
> > All,
> >
> > I think there is a clean way using SeqIO to write to a custom format,
> > but am missing it.
> >
> > Parsing genbank files, I would like to write a modified fasta outfile
> > which includes (or uses) the gene name as the top line in lieu of the
> > default header, e.g. using
> > 	$name = format_name($feat->_tag_value('gene'));
> > to generate:
> >
> > >'gene name'	'accession'
> > seq
> >
> >
> > Or am I better off outputting a GFF file?
> >
> > I am going to be using these to load a database(postgres).
> >
> > regards,
> >
> > Chuck
> >
> >
> > my %outfile = ('Cr' => {
> >                         'Fasta' => Bio::SeqIO->new('-file'   => '>Cr.fa',
> >                                                    '-format' => 'fasta')
> >                        }
> >                );
> >
> >
> > FEATURES             Location/Qualifiers
> >      source          1..5131
> >                      /organism="Chlamydomonas reinhardtii"
> >                      /strain="2137"
> >                      /db_xref="taxon:3055"
> >                      /dev_stage="vegetative"
> >      gene            join(21..117,199..264,618..685,1031..1123,2513..2578,
> >                      2892..3023,3355..3505,3906..4109,4383..4498)
> >                      /gene="Pgp1"
> >      CDS             join(21..117,199..264,618..685,1031..1123,2513..2578,
> >                      2892..3023,3355..3505,3906..4109,4383..4498)
> >                      /gene="Pgp1"
> >                      /codon_start=1
> >                      /product="phosphoglycolate phosphatase precursor"
> >                      /protein_id="BAC56941.1"
> >
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at bioperl.org
> > http://bioperl.org/mailman/listinfo/bioperl-l
> >
> 
> 
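
For the original question (a fasta-like file whose top line is the gene
name followed by the accession), here is a minimal sketch of one possible
approach. It assumes a reasonably recent BioPerl and reads the records
with Bio::SeqIO, but prints the custom header directly rather than going
through the fasta writer. The names custom_header,
genbank_to_custom_fasta, gb2fa.pl and Cr.gb are made up for illustration
and are not part of BioPerl or of the scripts quoted above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: build the header line as
# '>gene name<TAB>accession'.
sub custom_header {
    my ($gene, $accession) = @_;
    return ">$gene\t$accession";
}

# Hypothetical helper: read GenBank records from $infile and print one
# entry per record that has a feature carrying a /gene qualifier.
# BioPerl is loaded lazily so custom_header() can be used without it.
sub genbank_to_custom_fasta {
    my ($infile, $out_fh) = @_;
    require Bio::SeqIO;
    my $in = Bio::SeqIO->new(-file => $infile, -format => 'genbank');
    while (my $seq = $in->next_seq) {
        # Use the first feature that has a /gene tag (assumption: one
        # gene per record, as in the Pgp1 example above).
        my ($feat) = grep { $_->has_tag('gene') } $seq->get_SeqFeatures;
        next unless $feat;
        my ($gene) = $feat->get_tag_values('gene');
        print {$out_fh} custom_header($gene, $seq->accession_number), "\n";
        # The sequence is written on a single line, without wrapping.
        print {$out_fh} $seq->seq, "\n";
    }
}

# Hypothetical usage: perl gb2fa.pl Cr.gb > Cr.fa
genbank_to_custom_fasta($ARGV[0], \*STDOUT) if @ARGV;
```

Records with several genes would need a loop over the features instead
of taking only the first match, and the header format may have to be
adjusted to whatever the database loader expects.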