Neale Lab, Lander Lab, Broad Institute; Analytic and Translational Genetics Unit, Massachusetts General Hospital
Primer: The multiple testing problem
How do we control false positives when testing multiple hypotheses simultaneously? I will review the family-wise error rate (FWER), the false discovery rate (FDR) and related quantities, and discuss some classical and recent methods to control them. Highlights will include the (in)famous genome-wide significance threshold for association studies and Stephens's "New Deal."
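As a rough illustration of the two error rates named above, the sketch below contrasts a Bonferroni cutoff (which controls the FWER) with the Benjamini-Hochberg step-up rule (which controls the FDR under independence). The function names and p-values are made up for illustration and are not taken from the talk. The last line reproduces the common motivation for the genome-wide significance threshold, assuming roughly one million independent tests.

```python
import numpy as np

def bonferroni(pvals, alpha=0.05):
    """Reject H_i when p_i <= alpha / m; controls the FWER at level alpha."""
    p = np.asarray(pvals, dtype=float)
    return p <= alpha / p.size

def benjamini_hochberg(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up rule; controls the FDR under independence."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Compare the i-th smallest p-value with alpha * i / m.
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])  # largest rank passing its threshold
        reject[order[: k + 1]] = True     # reject everything up to that rank
    return reject

# Hypothetical p-values, for illustration only.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
n_bonferroni = int(bonferroni(pvals).sum())
n_bh = int(benjamini_hochberg(pvals).sum())

# Bonferroni at alpha = 0.05 over an assumed 1,000,000 independent tests.
genome_wide_threshold = 0.05 / 1_000_000
```

On these numbers the Benjamini-Hochberg rule rejects more hypotheses than Bonferroni. That is the usual trade: the FDR is a weaker guarantee than the FWER, and it buys power in return.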
Paninski Lab, Blei Lab, Dept. of Statistics, Data Science Institute, Columbia University
Smoothed nested testing on directed acyclic graphs
Frequentists can use prior knowledge, too! Testing many hypotheses at once is difficult (the "multiple testing problem"). Here we explore a special kind of prior knowledge that can make it easier: a logical nested structure among the hypotheses. When one hypothesis is nested "inside" another, the outer hypothesis must be false if the inner hypothesis is false. These nestings can be represented as the edges of a directed acyclic graph whose nodes are the hypotheses. Intuitively, this kind of information ought to help us find more true discoveries at lower false discovery rates, and indeed it proves useful under multiple nesting logics. We will dive into the details for the case where the logical structure corresponds to a simple chain graph; the same ideas carry through when the structure corresponds to a more complicated graph. We will discuss the latest methods for making the most of prior knowledge in multiple testing scenarios. Throughout, we will be guided by examples from drug discovery and differential gene expression analysis.
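One way to see why nesting helps is sketched below for a chain. If an outer null hypothesis is true, every hypothesis nested inside it is also null, so the p-values of a node and its inner descendants can be merged into a single valid p-value for that node. The sketch uses the Simes combination, which assumes independent (or positively dependent) p-values. It is an illustrative assumption and not necessarily the procedure presented in the talk. The function names and data are hypothetical.

```python
import numpy as np

def simes(pvals):
    """Simes combination: min over i of m * p_(i) / i, with p_(i) sorted ascending."""
    p = np.sort(np.asarray(pvals, dtype=float))
    m = p.size
    return float(np.min(m * p / np.arange(1, m + 1)))

def smooth_chain(pvals):
    """Smooth p-values along a chain ordered from outermost to innermost.

    Node i is merged with all nodes nested inside it (indices i, i+1, ...),
    which are all null whenever node i is null.
    """
    p = np.asarray(pvals, dtype=float)
    return np.array([simes(p[i:]) for i in range(p.size)])

# Hypothetical chain: weak evidence at the outer node, strong evidence inside it.
raw = [0.2, 0.001, 0.002]
smoothed = smooth_chain(raw)
```

In this example the outermost node borrows strength from its descendants, so its smoothed p-value is far smaller than its raw one, while the innermost node is unchanged. The smoothed values would then be passed to a multiple testing procedure that respects the graph structure.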