[Bioperl-l] Output a subset of FASTA data from a single large file
Chris Fields
cjfields at uiuc.edu
Fri Jun 9 17:59:18 UTC 2006
No; I saw the same thing here. It's not FASTA in the traditional sense:
http://www.bioperl.org/wiki/FASTA_sequence_format
though he did get it to build a database successfully. Well, 'success' in
the sense that no errors were thrown. I've learned the absence of error
messages does not necessarily mean that everything went as planned; it
depends on how much error handling has been added to the module by the
submitting author.
It's possible that the second annotation line was ignored completely. I
suppose it's also possible that two sequences are entered into the database,
an empty sequence for the first '>' line and the full sequence for the
second. It's all dependent on how the parser handles this.
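To make the second possibility concrete, here is a minimal sketch in Python
(not BioPerl, and not the parser that was actually used to build the
database) of a naive reader that treats every '>' line as the start of a new
record. The function name parse_fasta_naive is made up for illustration; a
real parser may well behave differently.

```python
def parse_fasta_naive(lines):
    """Naive FASTA reader: every line starting with '>' opens a new record.

    Hypothetical helper for illustration only; it does not mirror any
    particular BioPerl module.
    """
    records = []
    header, seq = None, []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Close the previous record, even if it collected no sequence.
            if header is not None:
                records.append((header, "".join(seq)))
            header, seq = line[1:].strip(), []
        elif header is not None:
            seq.append(line.strip())
    # Flush the final record.
    if header is not None:
        records.append((header, "".join(seq)))
    return records


# The entry exactly as it appeared in the original post.
entry = [
    ">probe:HG_U95Av2:1138_at:395:301; Interrogation_Position=2631;",
    ">Antisense;",
    "TGGCTCCTGCTGAGGTCCCCTTTCC",
]

for header, seq in parse_fasta_naive(entry):
    print(repr(header), len(seq))
```

With this reader the entry comes out as two records: the first header with
an empty sequence, and 'Antisense;' carrying the 25-base probe. A parser
that instead discards the second '>' line would yield a single record.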
Chris
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of M Senthil Kumar
> Sent: Friday, June 09, 2006 5:21 PM
> To: simon andrews (BI)
> Cc: bioperl-l at lists.open-bio.org; Michael Oldham
> Subject: Re: [Bioperl-l] Output a subset of FASTA data from a single large
> file
>
>
>
> On Fri, 9 Jun 2006, simon andrews (BI) wrote:
> |
> |
> |> -----Original Message-----
> |> From: bioperl-l-bounces at lists.open-bio.org
> |> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of
> |> Michael Oldham
> |> Sent: 09 June 2006 03:08
> |> To: bioperl-l at lists.open-bio.org
> |> Subject: [Bioperl-l] Output a subset of FASTA data from a
> |> single large file
> |>
> |> Dear all,
> |>
> |> I am a total Bioperl newbie struggling to accomplish a
> |> conceptually simple task. I have a single large fasta file
> |> containing about 200,000 probe sequences (from an Affymetrix
> |> microarray), each of which looks like this:
> |>
> |> >probe:HG_U95Av2:1138_at:395:301; Interrogation_Position=2631;
> |> >Antisense;
> |> TGGCTCCTGCTGAGGTCCCCTTTCC
> |
> |Unfortunately that's not FASTA format (which has only a single header
> |line starting with a '>'). I'd imagine that most programs which read
> |FASTA would see that entry as two sequences, the first of which is
> |empty.
> |
>
> [snipped]
>
> hi,
>
> I think the file is in FASTA format; you probably saw it differently
> because of your mail transport agent.
>
> Senthil
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l