R package - ABS filter

Identification and removal of low-complexity sites in allele-specific analysis of ChIP-seq data

Sebastian M. Waszak1,3, Helena Kilpinen2,3,4, Andreas Gschwind3,5, Andrea Orioli5, Sunil K. Raghav1, Robert M. Witwicki5, Eugenia Migliavacca3,5, Alisa Yurovsky2,3,4, Tuuli Lappalainen2,3,4, Nouria Hernandez5, Alexandre Reymond5, Emmanouil T. Dermitzakis2,3,4, and Bart Deplancke1,3

1Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Federale de Lausanne (EPFL), Lausanne, Switzerland
2Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland
3Swiss Institute of Bioinformatics, Lausanne, Switzerland
4Institute of Genetics and Genomics in Geneva, University of Geneva, Geneva, Switzerland
5Center for Integrative Genomics, Faculty of Biology and Medicine, University of Lausanne, Lausanne, Switzerland

Bioinformatics 2013 Nov 18, doi:10.1093/bioinformatics/btt667


The development of high-throughput sequencing technologies has enabled the genome-wide analysis of the impact of genetic variation on molecular phenotypes at single-base-pair resolution. However, while powerful, these technologies can also introduce unexpected artifacts into the data, leading to significant biases in downstream analyses. We investigated the impact of library amplification bias on the identification of allele-specific molecular events from high-throughput sequencing data derived from chromatin immunoprecipitation assays (ChIP-seq). Using RNA polymerase II ChIP-seq data from six cell lines from the 1000 Genomes Project, we identified putative allele-specific (AS) DNA binding events. We found that many of the sites showing a significant AS effect suffered from an amplification bias, as evidenced by a larger number of clonal reads carrying one of the two alleles. To eliminate such false-positive allele-specific DNA binding signals, we devised an amplification-bias detection strategy, which filters out sites with low read complexity as well as sites featuring a significant excess of clonal reads. This method should prove useful for allele-specific analyses involving ChIP-seq and other functional sequencing applications.
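The two filtering criteria named in the abstract (low read complexity, and an excess of clonal reads on one allele) can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the ABS filter code or its interface. The function names, the definition of a clonal read as one sharing allele, start position, and strand with another read, the complexity threshold of 0.5, the significance level of 0.05, and the use of Fisher's exact test are all assumptions made for this example.

```python
from collections import Counter
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def site_flags(reads, min_complexity=0.5, alpha=0.05):
    """Evaluate one heterozygous site (hypothetical helper, not the package API).

    reads: list of (allele, start, strand) tuples, allele being "ref" or "alt".
    Reads sharing allele, start, and strand are treated as clonal copies
    (an assumption of this sketch).
    """
    if not reads:
        return {"complexity": 0.0, "p_clonal": 1.0, "keep": False}

    copies = Counter(reads)
    # Read complexity: distinct fragments divided by all reads at the site.
    complexity = len(copies) / len(reads)

    unique, clonal = Counter(), Counter()
    for (allele, _start, _strand), n in copies.items():
        unique[allele] += 1      # one distinct fragment
        clonal[allele] += n - 1  # extra copies of that fragment

    # Is the clonal/unique split different between the two alleles?
    p_clonal = fisher_exact_two_sided(clonal["ref"], unique["ref"],
                                      clonal["alt"], unique["alt"])
    keep = complexity >= min_complexity and p_clonal >= alpha
    return {"complexity": complexity, "p_clonal": p_clonal, "keep": keep}


# Hypothetical site dominated by one amplified reference-allele fragment.
biased = [("ref", 100, "+")] * 10 + [("alt", 90 + i, "+") for i in range(3)]
# Hypothetical site where every read comes from a distinct fragment.
clean = ([("ref", 100 + i, "+") for i in range(6)]
         + [("alt", 200 + i, "-") for i in range(6)])

biased_result = site_flags(biased)
clean_result = site_flags(clean)
```

In this toy input the `biased` site fails both criteria, while the `clean` site passes. In practice the thresholds would need to be tuned to the sequencing depth and library at hand.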

ABS filter 1.0

1. Requirements

2. Download

3. Usage

4. Notes

Please visit the ABS Google group for news on updates, and feel free to report bugs and post questions that might be of general interest to other users. For further information, please contact Sebastian Waszak or Bart Deplancke.

License: GNU General Public License (Version 3.0). Last update: 11.12.2013