Quality control
HTS machines read thousands to millions of sequences in parallel.
This usually produces large fastq files containing millions of lines.
Manually inspecting the quality of each read is out of the question.
Specialized software has been developed to provide quality measures for fastq files generated by HTS machines.
FastQC is a popular program to generate quality reports on fastq data.
In fact, running FastQC is usually the first thing you should do once you receive a new dataset.
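As a minimal sketch of how a FastQC run could be scripted, the Python snippet below builds and launches the command. It assumes the fastqc executable is installed and on your PATH; the function names build_fastqc_command and run_fastqc are hypothetical helpers introduced here for illustration, while -o (output directory) and -t (threads) are FastQC command-line options.

```python
import shutil
import subprocess
from pathlib import Path


def build_fastqc_command(fastq_files, outdir, threads=1):
    """Return the argument list for a FastQC run (nothing is executed here)."""
    # -o sets the output directory, -t the number of files processed in parallel.
    return ["fastqc", "-o", str(outdir), "-t", str(threads), *map(str, fastq_files)]


def run_fastqc(fastq_files, outdir, threads=1):
    """Run FastQC on the given files (hypothetical wrapper, assumes fastqc is installed)."""
    if shutil.which("fastqc") is None:
        raise RuntimeError("fastqc executable not found on PATH")
    # Create the output directory up front so FastQC can write into it.
    Path(outdir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_fastqc_command(fastq_files, outdir, threads), check=True)
```

Calling run_fastqc(["sample.fastq.gz"], "qc_reports") with a placeholder file name would then write the report for that file into the qc_reports directory.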
FastQC reports provide a series of plots that allow the user to assess the overall quality of their raw data and detect potential biases and problems.
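To give a feel for the kind of measure behind such plots, here is a rough sketch that computes the mean quality score at each read position. It is not FastQC's implementation. It assumes four-line fastq records (no wrapped sequences) and Phred+33 quality encoding, and mean_quality_per_position is a hypothetical name chosen for this example.

```python
import io


def mean_quality_per_position(handle, offset=33):
    """Return the mean Phred score at each read position of a fastq stream."""
    totals = []  # sum of the scores seen at each position
    counts = []  # number of reads long enough to reach each position
    for line_number, line in enumerate(handle):
        if line_number % 4 != 3:  # the quality string is the 4th line of each record
            continue
        quality = line.rstrip("\r\n")
        for position, char in enumerate(quality):
            if position == len(totals):
                totals.append(0)
                counts.append(0)
            # Assumed Phred+33 encoding: score = ASCII code minus the offset.
            totals[position] += ord(char) - offset
            counts[position] += 1
    return [total / count for total, count in zip(totals, counts)]


# Two toy reads: 'I' encodes a score of 40 and '5' a score of 20.
example = io.StringIO(
    "@read1\nACGT\n+\nIIII\n"
    "@read2\nACG\n+\n5II\n"
)
print(mean_quality_per_position(example))  # [30.0, 40.0, 40.0, 40.0]
```

A drop in these averages towards the end of the reads is one of the typical problems such a per-position summary can reveal.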