
Simple Python FastQ Parser


UPDATED (Sun Feb 19 14:56:28 PST 2012)

High-throughput sequencing (HTS) is rapidly advancing our ability to understand how the genome responds to its environment.  It also presents a challenge to those tasked with analyzing the results: the files produced can be massive, large enough to overwhelm a modest computer’s available memory.  The simplest way around this problem is to work with only a small part of the file at a time.  I have provided an example of a very simple, easy-to-extend, stand-alone Python parser that returns a single FastQ record at a time, giving memory-efficient access to these commonly massive files.  It is small and does not depend on other packages.
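To make the one-record-at-a-time idea concrete, here is a minimal sketch of such a parser. It is not the code from the full post; the names `FastqRecord` and `parse_fastq` are made up for illustration, and it assumes the common four-lines-per-record FastQ layout (header, sequence, `+` separator, quality) with no line-wrapped sequences.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    """One FastQ record (hypothetical container for this sketch)."""
    name: str      # header line without the leading '@'
    sequence: str  # base calls
    quality: str   # quality string, same length as sequence


def parse_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records one at a time from a four-line-per-record FastQ stream.

    Only the current record is held in memory, so file size does not matter.
    """
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        header = header.rstrip("\r\n")
        if not header:
            continue  # tolerate blank lines between records or at the end

        # Assumption: each record is exactly four lines (no wrapped sequences).
        sequence = handle.readline().rstrip("\r\n")
        separator = handle.readline().rstrip("\r\n")
        quality = handle.readline().rstrip("\r\n")

        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FastQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")

        yield FastqRecord(header[1:], sequence, quality)


# Small in-memory demo; with a real file, pass the object returned by open().
demo = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+read2\n!!!!\n")
for record in parse_fastq(demo):
    print(record.name, len(record.sequence))
```

Because `parse_fastq` is a generator, a loop such as `for record in parse_fastq(open(path)):` reads four lines per iteration and never loads the whole file. Reading fixed groups of four lines also avoids misreading a quality string that happens to begin with `@` as a new header.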
