Hail Team, Neale Lab, Broad Institute

Broad Institute

How can multiple biobanks perform a genome-wide association study (GWAS) without sharing the underlying data? Jon will review multi-party linear regression as a two-stage process: compressing big data within each party, and combining small data between parties. From a geometric perspective, he'll then derive a simple, efficient distributed algorithm for testing millions of variants in multi-party GWAS (https://github.com/jbloom22/DASH). To add provable security to stage two, Hoon will introduce key concepts from cryptography (secret sharing and secure multiparty computation) that enable privacy-preserving linear algebra. He'll then describe his work on developing a practical and secure pipeline for principal component analysis, as needed to control for population stratification in GWAS.
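
As a rough illustration of the two-stage idea (not the DASH algorithm itself), the sketch below fits a simple one-predictor regression across two parties. Each party compresses its raw samples into five sums, and only those sums are combined. The `Summary` class, the `compress` and `fit` helpers, and the toy data are all hypothetical names and values chosen for this example; a real GWAS would include covariates and test many variants, and exchanging the sums in the clear provides none of the cryptographic protection discussed in the talk.

```python
from dataclasses import dataclass


@dataclass
class Summary:
    """Sufficient statistics for regressing y on a single predictor x."""
    n: int
    sx: float   # sum of x
    sy: float   # sum of y
    sxx: float  # sum of x*x
    sxy: float  # sum of x*y

    def __add__(self, other):
        # Stage two: combining parties is element-wise addition of small summaries.
        return Summary(self.n + other.n, self.sx + other.sx, self.sy + other.sy,
                       self.sxx + other.sxx, self.sxy + other.sxy)


def compress(x, y):
    """Stage one: reduce a party's raw samples to five numbers."""
    return Summary(len(x), sum(x), sum(y),
                   sum(a * a for a in x),
                   sum(a * b for a, b in zip(x, y)))


def fit(s):
    """Ordinary least-squares slope and intercept from the summary alone."""
    denom = s.n * s.sxx - s.sx ** 2
    slope = (s.n * s.sxy - s.sx * s.sy) / denom
    intercept = (s.sy - slope * s.sx) / s.n
    return slope, intercept


# Hypothetical toy data: x is a genotype dosage (0, 1, 2), y is a phenotype.
party_a = ([0.0, 1.0, 2.0], [1.1, 2.9, 5.2])
party_b = ([1.0, 2.0, 2.0, 0.0], [3.0, 4.8, 5.1, 0.9])

# Each party shares only its summary, never its raw samples.
combined = compress(*party_a) + compress(*party_b)
slope, intercept = fit(combined)

# Reference fit on pooled raw data, shown only to check the result.
pooled_x = party_a[0] + party_b[0]
pooled_y = party_a[1] + party_b[1]
pooled_slope, pooled_intercept = fit(compress(pooled_x, pooled_y))
```

The combined fit matches the pooled fit up to floating-point rounding because the least-squares solution depends on the data only through these sums.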
