EpiShell is a tool for detecting epistatic interactions between pairs of SNPs in association with a binary trait. It is based on the popular Boolean Operation-based Screening and Testing (BOOST) method (1), with enhancements to the handling of missing data. Specifically, the significance of the association test statistic is estimated more accurately by accounting for missing genotypes through estimation of the degrees of freedom (df) for each candidate SNP pair. EpiShell offers several ways to calculate statistical epistasis, including score and log-likelihood ratio tests, available through its BOOST-like and MB-MDR-like modes. Another advantage of the BOOST method lies in its Boolean representation of genotype data: SNP x SNP contingency tables are obtained with bitwise operations, which map directly onto computer hardware and substantially reduce the overall run time. In addition to the serial mode (1 CPU), the software offers a parallel mode (> 1 CPU).
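The Boolean representation described above can be illustrated with a minimal sketch. It is not EpiShell's code: the function names are hypothetical, Python integers stand in for the machine words a compiled implementation would use, and missing genotypes (given here as None) are assumed to be simply left out of every bit mask.

```python
from itertools import product

def encode_snp(genotypes):
    """Return three bit masks, one per genotype (0, 1, 2).
    Bit i of a mask is set when sample i carries that genotype.
    Missing calls (None) set no bit, so they drop out of every count."""
    masks = [0, 0, 0]
    for i, g in enumerate(genotypes):
        if g is not None:
            masks[g] |= 1 << i
    return masks

def contingency_table(snp_a, snp_b, case_mask, n_samples):
    """Build the 3 x 3 x 2 table (genotype A, genotype B, status)
    with bitwise AND plus a population count per cell.
    table[i][j] is [controls, cases]."""
    all_mask = (1 << n_samples) - 1
    control_mask = all_mask & ~case_mask
    a, b = encode_snp(snp_a), encode_snp(snp_b)
    table = [[[0, 0] for _ in range(3)] for _ in range(3)]
    for i, j in product(range(3), repeat=2):
        both = a[i] & b[j]  # samples with genotype i at A and j at B
        table[i][j][0] = bin(both & control_mask).count("1")
        table[i][j][1] = bin(both & case_mask).count("1")
    return table

# Illustrative data: six samples, two of them with a missing genotype.
snp_a = [0, 1, 2, 1, None, 0]
snp_b = [0, 1, 1, 2, 0, None]
status = [1, 0, 1, 0, 1, 0]  # 1 = case, 0 = control
case_mask = sum(1 << i for i, s in enumerate(status) if s)
table = contingency_table(snp_a, snp_b, case_mask, len(status))
```

Only the four samples genotyped at both SNPs contribute to the table, which is why the effective sample size, and potentially the number of populated cells, differs from one SNP pair to the next.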
In the BOOST-like mode, EpiShell handles binary traits and fits a full generalized linear model with main SNP effects (2 df for each main effect) and SNP x SNP interaction effects (4 df). EpiShell identifies significant (specific) interactions via a log-likelihood ratio test (LRT) with 4 df. A Bonferroni correction can be applied a posteriori as a multiple-testing correction.
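The 4 df interaction test can be sketched as follows. This is an illustration under stated assumptions, not EpiShell's implementation: the model without interaction is fitted by iterative proportional fitting on the 3 x 3 x 2 table (equivalent to a logistic model with main effects only), the table is assumed to have all nine genotype combinations populated so that the test has exactly 4 df, and all function names are hypothetical. The p-value uses the closed-form chi-square tail for 4 df, exp(-x/2)(1 + x/2).

```python
import math
from itertools import product

CELLS = list(product(range(3), range(3), range(2)))  # (geno A, geno B, status)

def margin_key(cell, dropped):
    """Index of the two-way margin obtained by summing over axis `dropped`."""
    return cell[:dropped] + cell[dropped + 1:]

def fit_without_interaction(n, max_iter=500, tol=1e-10):
    """Expected counts under the model with main effects only,
    obtained by matching each two-way margin in turn."""
    m = {c: 1.0 for c in CELLS}
    for _ in range(max_iter):
        previous = dict(m)
        for dropped in (2, 1, 0):
            obs, fit = {}, {}
            for c in CELLS:
                k = margin_key(c, dropped)
                obs[k] = obs.get(k, 0.0) + n[c]
                fit[k] = fit.get(k, 0.0) + m[c]
            for c in CELLS:
                k = margin_key(c, dropped)
                m[c] = m[c] * obs[k] / fit[k] if fit[k] > 0 else 0.0
        if max(abs(m[c] - previous[c]) for c in CELLS) < tol:
            break
    return m

def interaction_lrt(table):
    """Return (G2, p) for the SNP x SNP interaction, assuming 4 df."""
    n = {(i, j, k): float(table[i][j][k]) for i, j, k in CELLS}
    m = fit_without_interaction(n)
    g2 = 2.0 * sum(n[c] * math.log(n[c] / m[c]) for c in CELLS if n[c] > 0)
    g2 = max(g2, 0.0)  # guard against tiny negative rounding error
    p = math.exp(-g2 / 2.0) * (1.0 + g2 / 2.0)  # chi-square tail, 4 df
    return g2, p

def bonferroni_threshold(n_snps, alpha=0.05):
    """Per-test significance level when every SNP pair is tested."""
    n_tests = n_snps * (n_snps - 1) // 2
    return alpha / n_tests

# Illustrative table: cases ([controls, cases]) pile up on the diagonal.
table = [[[10, 40] if i == j else [40, 10] for j in range(3)] for i in range(3)]
g2, p = interaction_lrt(table)
significant = p < bonferroni_threshold(n_snps=1000)
```

When missing genotypes empty some cells of the table, a fixed 4 df is no longer appropriate, which is the situation the per-pair df estimation is meant to address.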
EpiShell successfully runs on datasets with >300,000 markers in parallel mode. The software also provides total run-time estimates.
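To give a sense of scale, the number of SNP pairs grows quadratically with the number of markers. The sketch below counts the pairs and extrapolates a total run time from a timed pilot batch. How EpiShell computes its own estimate is not described here, so the function names, the pilot-based extrapolation, and the assumption of perfectly linear scaling across CPUs are all illustrative.

```python
def n_pairs(n_snps):
    """Number of distinct SNP pairs in an exhaustive scan."""
    return n_snps * (n_snps - 1) // 2

def estimate_runtime_seconds(n_snps, pilot_pairs, pilot_seconds, n_cpus=1):
    """Extrapolate total run time from a timed pilot batch of pairs,
    assuming (optimistically) linear speed-up with the number of CPUs."""
    seconds_per_pair = pilot_seconds / pilot_pairs
    return n_pairs(n_snps) * seconds_per_pair / n_cpus

# 300,000 markers give 44,999,850,000 pairs to test.
total_pairs = n_pairs(300_000)
```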
Reference:
(1) Wan X, Yang C, Yang Q, Xue H, Fan X, et al. (2010) BOOST: A fast approach to detecting gene-gene interactions in genome-wide case-control studies. The American Journal of Human Genetics 87: 325-340.