Opinion: For all this science, what did we learn?
Ed. note: The Broad Institute notes that the study "Large-scale GWAS reveals insights into the genetic architecture of same-sex sexual behavior" by Ganna et al. raises important social, ethical, and scientific issues that are worth considering and discussing. Because we believe it is important to represent a range of perspectives about this work, we invited members of the Broad community to provide their thoughts on the study, the process, the implications, and lessons we might learn. We hope these perspectives will inform a needed discussion.
Since the first genome-wide association study (GWAS) nearly two decades ago, scientists have successfully correlated diseases with thousands of genetic variants (statistical ‘hits’). The field has increasingly developed high standards of methodological rigor, learning how study design and non-genetic confounders can overpower genetic signals. A caveat to GWAS hides directly in its acronym: association studies identify correlations, and correlation does not necessarily equal causation. It is especially difficult to bring rigor to the analysis and interpretation of highly complex phenotypes with large societal implications.
A recent GWAS by Ganna et al., published in Science, attempted to identify specific genetic variants influencing ‘same-sex sexual behavior’ using genomic data and self-reported sexual behavior from a cohort of 40- to 80-year-olds living in the UK; these data were originally collected primarily for disease research. Twin studies have long established that sexual orientation and behavior have a strong heritable (genetic) component.1,2 As with most complex traits, thousands of genetic variants, each with tiny effects, likely underlie this heritability.3,4 This article explores the scientific cautions and limits a geneticist may see in the variants identified by Ganna et al. Skepticism is a crucial part of interpreting scientific results, especially correlations with a complex phenotype designating a minority population such as the LGBTQIA+ community.
Phenotype definition matters
One of the first lessons learned in GWAS was that the success or failure of these statistical tests hinges on how close the phenotype used in the analysis is to the actual phenotype of scientific interest. Ganna et al. stratify sexual behavior using the criterion of ‘having ever participated in non-heterosexual intercourse.’ In effect, a single one-off experience is conflated with a life-long identity. Given the small number of statistically significant signals found by Ganna et al., contrasted with similarly powered studies of comparable heritability2, it seems likely that either the phenotype used is extremely heterogeneous or that sexual orientation is highly polygenic. For example, early GWAS researchers were successful in identifying a small number of variants with large effects contributing to macular degeneration, but the genetic nature and heterogeneous phenotypes of high blood pressure proved more elusive initially.5
Socio-economic status and the environment influence phenotype
Many studies have focused on how large socio-economic and environmental influences can affect the interpretation of GWAS for even highly heritable traits.6 The self-reported phenotype of same-sex sexual behavior is strongly conditioned on social context, which is difficult, if not impossible, to account for accurately. Religiosity, economic status, age, gender, political persuasion, education, and birthplace can all influence ‘observation’ of this phenotype, introducing possibilities for spurious associations. Ganna et al. used participants from UK Biobank born between approximately 1940 and 1970. Nearly all participants were born in the UK prior to the decriminalization of (male) homosexuality in 1967. Each of these individuals, a data point in this analysis, lived most of their life under the World Health Organization’s classification of homosexuality as a mental disease (reversed in 1990).
Mental health and risky phenotypes
Ganna et al. identify a genetic predisposition to “risk taking” as one of the main correlates of the variants associated with ‘non-heterosexual behavior.’ As skeptical scientists, we can reasonably question whether this correlation is more easily explained by the social context of the dataset.7 As recently as 2018, 14 percent of LGBTQIA+ Britons avoided healthcare out of fear of discrimination.8 Because merely reporting this behavior is perceived as a risk by many in this community, self-reporters are likely enriched for individuals who engage in “risk taking” relative to the community at large. Thus, it is premature and possibly irresponsible to conclude a shared biology between homosexuality and “risk taking.”
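The reporting-bias argument above can be illustrated with a toy simulation. This is only a sketch under stated assumptions: every rate in it is hypothetical, the names `simulate_people` and `risk_taker_share` are invented for illustration, and nothing here reproduces the analysis of Ganna et al. By construction, the behavior is drawn independently of risk taking; only the willingness to report it differs.

```python
import random

# All rates below are hypothetical, chosen only to make the effect visible.
RISK_TAKER_RATE = 0.30       # share of risk takers in the population
BEHAVIOR_RATE = 0.10         # true rate of the behavior, independent of risk taking
REPORT_IF_RISK_TAKER = 0.80  # chance a risk taker discloses the behavior
REPORT_IF_NOT = 0.40         # chance anyone else discloses it


def simulate_people(n=200_000, seed=1):
    """Return (risk_taker, has_behavior, reported) tuples for n simulated people."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        risk_taker = rng.random() < RISK_TAKER_RATE
        # Drawn independently of risk_taker: no shared cause by construction.
        has_behavior = rng.random() < BEHAVIOR_RATE
        p_report = REPORT_IF_RISK_TAKER if risk_taker else REPORT_IF_NOT
        reported = has_behavior and rng.random() < p_report
        people.append((risk_taker, has_behavior, reported))
    return people


def risk_taker_share(people, column):
    """Share of risk takers among people whose flag at `column` is True."""
    selected = [person for person in people if person[column]]
    return sum(person[0] for person in selected) / len(selected)


if __name__ == "__main__":
    people = simulate_people()
    print("risk takers among everyone with the behavior:",
          round(risk_taker_share(people, 1), 3))
    print("risk takers among self-reporters:",
          round(risk_taker_share(people, 2), 3))
```

Under these assumed rates, risk takers make up about 30 percent of everyone with the behavior but, in expectation, about 46 percent of those who report it. An analysis that sees only self-reports would therefore find an association with risk taking even though none exists in the underlying behavior.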
Ganna et al. also report genetic correlations with personality traits and mental health, identifying depression, loneliness, and schizophrenia, among others. Signals of sexual orientation may exist in genetic studies of mental health, because the prevalence of mental health conditions is markedly increased in the LGBTQIA+ community (52 percent report depression and 12 percent report a suicide attempt in the previous year).8 As the authors effectively articulate in their supplementary data, an increase in psychiatric disorders resulting from discrimination may ‘generate a genetic correlation despite an environmental cause.’ Thus, any speculation of a causal genetic link between mental illness and LGBTQIA+ identity may be confounded and must be reported with extreme caution.
Residual population stratification
Population structure — non-random genotypes extant in data otherwise believed to vary independently — can give rise to statistical artifacts in GWAS. One recent study showed that many supposed genetic associations with height differences in Europeans were better explained by unrelated underlying population differences between countries.9 Similarly, people self-identifying as LGBTQIA+ are unequally distributed around the UK.10 Furthermore, the dataset used by Ganna et al. shows a three- to six-fold increase in self-reported same-sex behavior over 30 years (almost certainly reflective of reduced stigma and not due to a biological change), representing a major stratification based on age. Regional or temporal differences in self-reporting, migration, admixture, and acting on sexual preferences are then also problematic, even within a seemingly homogeneous country such as the UK. Although the authors took steps to ameliorate several of these concerns, it remains difficult to assess how successful these attempts were.
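A small simulation can show how stratification alone produces an apparent association. This is a minimal sketch under stated assumptions: the two regions, their allele frequencies, and their self-report rates are all hypothetical, and the helper names are invented for illustration. Within each region, genotype and phenotype are drawn independently, so any pooled signal comes from the regional differences alone.

```python
import random

# Hypothetical regions: (allele frequency, self-report rate). Within each
# region, genotype and phenotype are drawn independently of each other.
REGIONS = {"region_a": (0.10, 0.02), "region_b": (0.50, 0.10)}


def simulate_cohort(n_per_region=50_000, seed=0):
    """Return (region, allele_count, phenotype) rows for both regions."""
    rng = random.Random(seed)
    rows = []
    for region, (allele_freq, report_rate) in REGIONS.items():
        for _ in range(n_per_region):
            # Two independent allele draws give a count of 0, 1, or 2.
            allele_count = int(rng.random() < allele_freq) + int(rng.random() < allele_freq)
            phenotype = rng.random() < report_rate
            rows.append((region, allele_count, phenotype))
    return rows


def allele_count_gap(rows):
    """Mean allele count in cases minus mean allele count in controls."""
    cases = [count for _, count, phenotype in rows if phenotype]
    controls = [count for _, count, phenotype in rows if not phenotype]
    return sum(cases) / len(cases) - sum(controls) / len(controls)


if __name__ == "__main__":
    rows = simulate_cohort()
    print("pooled gap:", round(allele_count_gap(rows), 3))
    for region in REGIONS:
        in_region = [row for row in rows if row[0] == region]
        print(region, "gap:", round(allele_count_gap(in_region), 3))
```

With these assumed numbers, the pooled case-control gap is roughly 0.28 alleles in expectation, while the gap within each region is near zero. Real analyses adjust for structure with tools such as principal components, but the sketch shows why any uncorrected remainder matters.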
Caution for broad conclusions
Considering the large sample size and power of Ganna et al.’s dataset, the small number and weak effect sizes of the GWAS hits identified seem to reflect the limitations discussed above. With only five hits identified at genome-wide significant p-values (< 5 × 10⁻⁸), reproducibility of “top hits” is difficult to assess (three of five hits replicate in small, independent datasets) and genetic correlations across datasets suggest substantial heterogeneity. Recent work proposes, perhaps counter-intuitively, that the most significant GWAS hits with the largest effect sizes in highly polygenic traits are often peripheral to the underlying genes and pathways for a phenotype.4 Claims of gender differences in sexual behavior or links to olfaction and hormones would require conjecture based upon three loci. While speculation might be harmless for other traits, it is too soon to spin stories about a marginalized population from a few small effect size “hits” that may have only peripheral biological importance.
So what have we learned?
Ganna et al. aimed to identify specific genetic loci contributing to same-sex sexual behavior. The study identified few such loci, which collectively have <1 percent power to accurately predict a phenotype. If the goal was to understand the deeper biology behind sexual behavior, this study design is likely not ideal, as the authors note, because the complexities of societal, cultural, and environmental effects on human behavior are hard to account for in post-hoc GWAS interpretation. Any one of the specific biological insights, such as correlation with mental illness or a specific variant near an olfactory gene, has a chance of being spurious.
In a society without LGBTQIA+ discrimination, understanding the biology of sexual orientation would be a reasonable goal, and the stakes of those findings low. Scientists knew prior to this study that sexual behavior has a strong heritable component and is influenced by many genes. At best, our understanding remains essentially the same, despite the study’s publication in a top-tier journal. At worst, the public will be misinformed and confused about why scientists would study this trait over thousands of serious diseases, all while a historically marginalized group has been left more vulnerable.
Steven Reilly is a postdoctoral associate at the Broad Institute.
1. Bailey JM and Pillard RC. A genetic study of male sexual orientation. Arch Gen Psychiatry. 1991. DOI: 10.1001/archpsyc.1991.01810360053008.
2. Långström N, et al. Genetic and environmental effects on same-sex sexual behavior: a population study of twins in Sweden. Arch Sex Behav. 2010. DOI: 10.1007/s10508-008-9386-1.
3. Ripke S, et al. Genome-wide association analysis identifies 13 new risk loci for schizophrenia. Nature Genetics. Online August 25, 2013. DOI: 10.1038/ng.2742.
4. O'Connor LJ, et al. Extreme polygenicity of complex traits is explained by negative selection. American Journal of Human Genetics. Online August 8, 2019. DOI: 10.1016/j.ajhg.2019.07.003.
5. MacRae CA and Vasan RS. Next generation GWAS: Time to focus on phenotype? Circ Cardiovasc Genet. 2011. DOI: 10.1161/CIRCGENETICS.111.960765.
6. Belsky DW, et al. Genetics and the geography of health, behaviour and attainment. Nature Human Behaviour. Online April 8, 2019. DOI: 10.1038/s41562-019-0562-1.
7. Linnér RK, et al. Genome-wide association analyses of risk tolerance and risky behaviors in over 1 million individuals identify hundreds of loci and shared genetic influences. Nature Genetics. Online January 14, 2019. DOI: 10.1038/s41588-018-0309-3.
8. Bachman CL and Gooch B. LGBT in Britain: Health report. Stonewall and YouGov. 2018.
9. Sohail M, Maier RM, et al. Polygenic adaptation on height is overestimated due to uncorrected stratification in genome-wide association studies. eLife. Online March 21, 2019. DOI: 10.7554/eLife.39702.
10. UK Government Equalities Office. National LGBT Survey: Summary report. July 2018.
Perspectives from the Broad community
Community engagement strengthens science
A Broadminded blog opinion piece by Broad Institute member and study co-senior author Benjamin Neale
Seeking justice in the age of genomics and a call for higher ethical standards for research involving human populations
A Broadminded blog opinion piece by Broad Institute research associate Bryan Ferguson
Unintended, but not unanticipated: the consequences of human behavioral genetics
A Broadminded blog opinion piece by Broad Institute bioinformatics analyst Carino Gurjao
Discovery or discrimination? Starting the conversation about the potential outcomes of a LGBTQIA+ targeted study
A Broadminded blog opinion piece by Broad Institute operations specialist Meagan Olive
Weighing the positive and negative impacts of studies regarding sexual minorities
A Broadminded blog opinion piece by Broad community members Liam Spurr, Julian Avila-Pacheco, Meagan Olive, and Denisse Rotem
Big data scientists must be ethicists too
A Broadminded blog opinion piece by Broad Institute postdoctoral fellow Joseph Vitti
What Genetics Is Teaching Us About Sexuality
A New York Times opinion piece by Broad Institute and Harvard University postdoctoral fellow and study co-author Robbee Wedow and UT Austin integrative biologist Steven Phelps
Same-sex sexual behavior and genes: like love, the answer is complicated
A STAT opinion piece by 23andMe vice president of business development Emily Drabant Conley