Primer: Introduction to the tree sequence toolchain

April 24, 2019
Dept. of Statistics, University of Oxford

The succinct tree sequence data structure is a concise and efficient encoding of whole-genome ancestry and sequence data, with a rapidly maturing software ecosystem. The tskit (tree sequence toolkit) library provides a comprehensive framework for working with tree sequences through Python and C APIs. The ecosystem growing around this central technology now includes several genome simulators as well as tsinfer, our highly scalable method for inferring ancestry from data. In this primer session, we will introduce tskit and the tree sequence data structure, and demonstrate both the simulation and inference of genomic datasets in real time using downloadable Jupyter Notebooks.
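
For orientation, the idea behind the encoding can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not tskit's API: the names Edge, local_tree and breakpoints and the toy data are hypothetical. The one thing borrowed from tskit is that each edge records a parent, a child, and the genomic interval [left, right) over which that relationship holds, so an edge shared by neighbouring trees is stored once.

```python
from typing import Dict, List, NamedTuple


class Edge(NamedTuple):
    """A parent-child link that holds over the half-open interval [left, right)."""
    left: float
    right: float
    parent: int
    child: int


# Hypothetical toy data: samples 0, 1, 2 on a genome of length 10.
# Node 3 is the parent of samples 0 and 1 along the whole genome, so those
# two edges are stored once even though they appear in both local trees.
EDGES = [
    Edge(0, 10, 3, 0),
    Edge(0, 10, 3, 1),
    Edge(0, 5, 4, 2),   # left half: the root is node 4
    Edge(0, 5, 4, 3),
    Edge(5, 10, 5, 2),  # right half: the root is node 5
    Edge(5, 10, 5, 3),
]


def local_tree(edges: List[Edge], position: float) -> Dict[int, int]:
    """Return a {child: parent} map for the tree covering `position`."""
    return {e.child: e.parent for e in edges if e.left <= position < e.right}


def breakpoints(edges: List[Edge]) -> List[float]:
    """Return the sorted positions at which the local tree may change."""
    return sorted({e.left for e in edges} | {e.right for e in edges})


tree_a = local_tree(EDGES, 2.0)
tree_b = local_tree(EDGES, 7.0)
print(breakpoints(EDGES))
print(tree_a)
print(tree_b)
```

Here six edges describe two trees that would need eight parent-child links if stored separately. The saving grows with the number of samples and trees, because adjacent trees along a genome typically differ by only a few edges.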