[Biopython] Paired-End Read Splitting & Joining
Peter Cock
p.j.a.cock at googlemail.com
Thu Nov 17 12:31:30 UTC 2011
On Thu, Nov 17, 2011 at 11:53 AM, Yaqiang Cao <caoyaqiang0410 at gmail.com> wrote:
>
> Thanks for replying.
>
> Yes, I have a .fastq file converted from .sra using fastq-dump, one of the
> NCBI SRA tools. The file is over 1 GB. I want to split it into two
> FASTQ files because TopHat requires two files for paired-end sequences.
> A screenshot of the first 20 lines of the .fastq file is in the attached
> picture file:
Looking at the names, that file seems not to have both parts of each pair.
I looked on the NCBI SRA page, and the library is described as paired:
http://www.ncbi.nlm.nih.gov/sra?term=srr100235
There seems to be only one SRA file for this accession, under
ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByExp/sra/SRX/SRX042/SRX042254/SRR100235/
i.e. this file:
ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByExp/sra/SRX/SRX042/SRX042254/SRR100235/SRR100235.sra
I'd look more but the SRA website tells me "Our database is
temporarily unavailable. Please come back later."
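
To make the read-name check above repeatable on the whole file rather than
the first 20 lines, here is a minimal sketch in plain Python. It assumes
plain four-line FASTQ records, and it assumes that paired reads would be
marked with the common "/1" and "/2" name suffixes; your file may use a
different convention, so adjust the suffixes to match. The function names
are made up for this example and are not part of Biopython.

```python
import io
from collections import Counter


def read_names(handle):
    """Yield the read identifier from each four-line FASTQ record.

    Assumes unwrapped records: header, sequence, "+" line, qualities.
    """
    for i, line in enumerate(handle):
        if i % 4 == 0:
            if not line.startswith("@"):
                raise ValueError("Expected '@' at line %i" % (i + 1))
            # Identifier is the text after "@" up to the first whitespace
            fields = line[1:].split()
            yield fields[0] if fields else ""


def count_pair_suffixes(names):
    """Count names ending in /1, ending in /2, or neither (assumed convention)."""
    counts = Counter()
    for name in names:
        if name.endswith("/1"):
            counts["forward"] += 1
        elif name.endswith("/2"):
            counts["reverse"] += 1
        else:
            counts["unlabelled"] += 1
    return counts


# Tiny made-up example; for a real file use open("your_file.fastq") instead
example = (
    "@read1/1\nACGT\n+\nIIII\n"
    "@read1/2\nTTGA\n+\nIIII\n"
    "@read2\nGGCC\n+\nIIII\n"
)
counts = count_pair_suffixes(read_names(io.StringIO(example)))
print(dict(counts))
```

If the forward and reverse counts are equal and non-zero, the file probably
holds both parts of each pair. If everything is unlabelled, the names alone
do not tell you where the second read of each pair is.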
Peter