Identifying biologically active compound classes using phenotypic screening data and sampling statistics.

J Chem Inf Model
Authors
Keywords
Abstract

Scoring the activity of compounds in phenotypic high-throughput assays presents a unique challenge because of the limited resolution and inherent measurement error of these assays. Techniques that leverage the structural similarity of compounds within an assay can be used to improve the hit-recovery rate from screening data. A technique is presented that uses clustering and sampling statistics to predict likely compound activity by scoring entire structural classes. A set of phenotypic assays performed against a commercially available compound library was used as a test set. With class scoring, the resulting activity prediction scores were more reproducible than individual assay measurements. Class scoring also recovered known active compounds more efficiently than individual assay measurements because it produced fewer false positives. Known biologically active compounds were recovered 87% of the time using class scores, suggesting a low false-negative rate that compared well to individual assay measurements. In addition, class scoring suggested many weak and potentially novel classes of active compounds that individual assay measurements overlooked.
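
The abstract does not state which sampling statistic is used, so the following is only an illustrative sketch of the general idea of scoring a whole structural class rather than single compounds. It assumes, as a stand-in, a hypergeometric tail probability: how likely is it to see at least this many hits in a class if hits were scattered across the library at random? The function names `hypergeom_tail` and `score_classes`, the boolean hit flags, and the class labels (taken here from any prior clustering step) are hypothetical and not from the paper.

```python
from math import comb


def hypergeom_tail(total, total_hits, class_size, class_hits):
    """P(X >= class_hits) when class_size compounds are drawn without
    replacement from `total` compounds, of which `total_hits` are hits.

    Assumption: a hypergeometric model stands in for the paper's
    unspecified sampling statistic.
    """
    upper = min(class_size, total_hits)
    favourable = sum(
        comb(total_hits, k) * comb(total - total_hits, class_size - k)
        for k in range(class_hits, upper + 1)
    )
    return favourable / comb(total, class_size)


def score_classes(hit_flags, class_of):
    """Return {class label: tail probability}; smaller means more enriched.

    hit_flags: {compound_id: bool}, True if the single measurement was a hit.
    class_of:  {compound_id: label}, structural class from a clustering step.
    """
    total = len(hit_flags)
    total_hits = sum(hit_flags.values())

    # Group compound ids by their structural class.
    members = {}
    for compound_id, label in class_of.items():
        members.setdefault(label, []).append(compound_id)

    return {
        label: hypergeom_tail(
            total, total_hits, len(ids), sum(hit_flags[c] for c in ids)
        )
        for label, ids in members.items()
    }
```

Under this assumed model, a class whose members are hits far more often than the library-wide hit rate gets a small probability, so one noisy measurement matters less than the agreement of structurally related compounds.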

Year of Publication
2005
Journal
J Chem Inf Model
Volume
45
Issue
6
Pages
1824-36
Date Published
2005 Nov-Dec
ISSN
1549-9596
DOI
10.1021/ci050087d
PubMed ID
16309290