Combinatorial Chemistry in Drug Discovery

Author manuscript; available in PMC: 2018 Jun 1.

Published in final edited form as: Curr Opin Chem Biol. 2017 May 8;38:117–126. doi: 10.1016/j.cbpa.2017.03.017

Abstract

Several combinatorial methods have been developed to create focused or diverse chemical libraries with a wide range of linear or macrocyclic chemical molecules: peptides, non-peptide oligomers, peptidomimetics, small-molecules, and natural product-like organic molecules. Each combinatorial approach has its own unique high-throughput screening and encoding strategy. In this article, we provide a brief overview of combinatorial chemistry in drug discovery with emphasis on recently developed technologies for the design, synthesis, screening and decoding of combinatorial libraries. Examples of successful application of combinatorial chemistry in hit discovery and lead optimization are given. The limitations and strengths of combinatorial chemistry are also briefly discussed. We are now in a better position to truly leverage the power of combinatorial technologies for the discovery and development of next-generation drugs.

Keywords: combinatorial chemistry, combinatorial library, drug discovery, high-throughput screening, computer-assisted drug design, one-bead one-compound library, DNA-encoded chemical library

Introduction

Combinatorial chemistry involves the generation of a large array of structurally diverse compounds, called a chemical library, through systematic, repetitive and covalent linkage of various “building blocks”. Once prepared, the compounds in the chemical library can be screened, concurrently, for individual interactions with biological targets of interest. Positive compounds can then be identified, either directly (in position-addressable libraries) or via decoding (using genetic or chemical means).

The concept of combinatorial chemistry was developed in the mid-1980s, with Geysen’s multi-pin technology [1] and Houghten’s tea-bag technology [2] to synthesize hundreds of thousands of peptides on solid support in parallel. In 1991, Lam et al. [3] introduced the one-bead one-compound (OBOC) combinatorial peptide libraries and Houghten et al. [4] described the solution-phase mixtures of combinatorial peptide libraries. In 1992, Bunin and Ellman reported the first example of a small-molecule combinatorial library [5]. In addition to being displayed on microbeads, peptides and other synthetic compounds can be displayed on planar surfaces or solid supports, such as glass, to form planar microarrays [6]. In 1985, Smith described the phage-display peptide library method [7]. Similar to OBOC libraries, each M13 phage displays one unique peptide entity (five copies); i.e., one-phage one-peptide. Positive phages can then be isolated for amplification, re-panning, and eventually decoding with DNA sequencing. Unlike synthetic library methods, early biological libraries (phage-display, yeast-display, polysome-display peptide libraries) are restricted to the use of the 20 natural L-amino acids and simple cyclization with disulfide bonds. In the mid-2000s, Frankel et al. [8], Josephson et al. [9], and Murakami et al. [10] reported mRNA-display macrocyclic peptide libraries using unnatural and D-amino acids as building blocks. In 2009, Heinis et al. introduced the method of post-translational chemical modification of phage-displayed peptide libraries [11]. The latter approaches enable the generation of libraries of conformationally constrained peptides with greater chemical diversity and resistance to proteolysis, which are thus potentially more useful as drugs. Recent advances in DNA-encoded chemical libraries (DECLs) have allowed investigators to create and decode highly diverse small-molecule organic, peptide or macrocyclic libraries.

Combinatorial chemistry has been used for both drug lead discovery and optimization [12,13,14•]. Figure 1 summarizes the various combinatorial library methods, the nature of the library compounds involved and the screening methods available to each of the technologies. As shown in Figure 1 (orange boxes), most of the combinatorial library methods have the ability to generate hugely diverse chemical libraries (e.g. >1 million). These include the phage-display, yeast-display, bacteria-display, mRNA-display, OBOC, DECL, and solution phase mixture libraries. In addition to generating a huge number of compounds, these combinatorial library methods also allow rapid concurrent screening against specific drug targets (see below). The parallel synthesis library and synthetic planar microarray library methods (black boxes, Figure 1) are much lower throughput, and the resultant libraries far more focused, than the aforementioned methods. The planar microarray method has mostly been used as a tool for peptide research, although, in theory, other types of compounds can be chemically prepared in situ via automation. The highly focused parallel synthesis small-molecule libraries (hundreds to thousands of compounds), when developed in conjunction with computational chemistry, are particularly useful for optimization of drug leads (see below). The subject of combinatorial chemistry has been extensively documented and reviewed [14–16]; as such, this short review covers only recent advances in combinatorial library design, synthesis and high-throughput screening methods. Selected examples that utilize combinatorial library approaches for drug discovery will also be briefly discussed; however, nucleic acid-based combinatorial libraries (e.g. aptamer library [17]) will not be discussed here.

Figure 1.


Overview of combinatorial technologies. The various combinatorial technologies are shown in orange (diverse and focused libraries) and black (focused small library), the nature of chemical compounds is shown in blue, and the two broad groups of screening assays are shown in green. Depicted within the red ovals are the screening assays and nature of library compounds pertaining to each technology. The question mark indicates that, in practice, synthetic planar microarray is limited to peptides and simple oligomers.

Computational Chemistry for Combinatorial Library Design

As the fields of combinatorial chemistry and computational chemistry began to mature, it became clear that combining the two would lead to higher hit rates. It is more cost-effective to design and screen virtual chemical libraries in silico, such that subsets of the chemical space of likely hits can be defined, prior to the actual synthesis and screening of the libraries. Computer-assisted drug design, including generation of virtual libraries, analogue docking and in silico screening, has now become a standard procedure in drug discovery programs. Fragment-based drug design (FBDD) involves the experimental screening of libraries of small chemical fragments, via nuclear magnetic resonance (NMR) spectroscopy or other biophysical technologies such as surface plasmon resonance (SPR) for low affinity hits (low mM to high μM), or in silico screening of virtual fragments if the structural information of the target is available. Proper linkers are then used to connect the fragment hits while maintaining their relative positions in the sub-pockets. High-affinity ligands have been found with these approaches [18,19]. Vemurafenib is the first drug discovered via FBDD to gain FDA approval [20]. To enhance the probability of obtaining hits that are more drug-like, ADMET (absorption, distribution, metabolism, excretion and toxicity) filters have also been included in the algorithm for library design [21]. Examples of other library design methods include multi-objective optimization methods [22], the “adaptive” library approach with a simulated evolutionary process [23], and the multiple copy simultaneous search method which uses active site mapping and a de novo structure-based design tool [24]. A rapid and simple Python-based method for target-focused combinatorial library design was recently developed by Li et al. [25]. This method utilizes flexible SMILES strings, which are concatenated in Python, to encode the structures of molecules and create the library at a rate of approximately 70,000 molecules per second. The authors used the hybrid 3D similarity calculation software SHAFTS to help refine the size of the libraries and improve hit rates. Although the aforementioned computational methods can be applied to both diverse and focused library design, they are particularly important for the development of focused libraries of limited diversity, so that the hit rate can be increased.
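The idea of building a virtual library by concatenating SMILES strings can be illustrated with a minimal sketch. This is not the implementation of Li et al. [25]; the fragment lists and the function name `enumerate_library` are hypothetical, chosen so that every left-to-right concatenation happens to be a valid SMILES string (acyl cap + amino acid core + C-terminal group). No structure validation or 3D similarity filtering is attempted here.

```python
from itertools import product

# Hypothetical building blocks (assumptions for illustration only).
# Each fragment is written so that cap + core + tail reads as one valid SMILES.
CAPS = ["CC(=O)", "O=C(c1ccccc1)"]                          # acetyl, benzoyl
CORES = ["NCC(=O)", "NC(C)C(=O)", "NC(Cc1ccccc1)C(=O)"]     # Gly, Ala, Phe backbones
TAILS = ["O", "N", "NC"]                                    # acid, amide, N-methylamide


def enumerate_library(*positions):
    """Yield one SMILES string per combination, taking one fragment per position."""
    for combo in product(*positions):
        yield "".join(combo)


library = list(enumerate_library(CAPS, CORES, TAILS))

# Library size is the product of the number of choices at each position.
print(len(library), "virtual compounds, e.g.", library[0])
```

Because enumeration is plain string concatenation, the cost per molecule is tiny, which is consistent with the high enumeration rates reported for this style of method. In practice the resulting strings would presumably be checked with a cheminformatics toolkit before any scoring step.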

Generation of Combinatorial Libraries

Parallel synthesis of combinatorial libraries can be achieved manually or robotically, in solution or on solid support. Diversity of these libraries tends to be small (hundreds to a few thousand) but the choice of coupling chemistry is not limiting, and each library compound can be purified via automatic chromatography if needed. The intended structures of each of the library compounds are known. In contrast, the OBOC libraries are synthesized on microbeads using the split-pool synthesis strategy [3,4,26], resulting in greater diversity (thousands to millions) of bead-bound library compounds. However, these library compounds are non-addressable, and the positive bead isolated from screening must be decoded via a chemical or physical barcode, which can be constructed during library synthesis. Solution-phase positional scanning libraries can be prepared on solid support via split-pool synthesis, and later cleaved off the beads into a compound mixture in solution. Methods for the generation of biological peptide libraries such as phage-display, yeast-display, mRNA-display, and chemically modified phage-display libraries have been well described in the literature [14,27] and will not be discussed here. DECLs can be assembled via proximity ligation of DNA-tagged building blocks to form peptides, small-molecules or macrocycles. The available coupling chemistries for DECLs, however, are more limited because they must be mild and compatible with the oligonucleotide tags. For reviews on the synthesis of chemical libraries, please refer to references [28–30] and the series of “Comprehensive Survey of Combinatorial Library Synthesis” in the Journal of Combinatorial Chemistry (currently ACS Combinatorial Science). Here, we would like to highlight several recently developed chemical approaches and technologies in the preparation of combinatorial libraries.
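The split-pool strategy used for OBOC libraries can be sketched as a small simulation. This is a toy model under stated assumptions: the building blocks are hypothetical single-letter codes, every coupling is assumed to go to completion, and the function name `split_pool` is invented for illustration. It shows why each bead ends up carrying a single compound while the pooled library approaches the full combinatorial diversity.

```python
import math
import random


def split_pool(n_beads, rounds, seed=0):
    """Simulate split-pool synthesis; return the sequence carried by each bead."""
    rng = random.Random(seed)
    beads = [""] * n_beads
    for blocks in rounds:
        rng.shuffle(beads)                     # pool and mix all beads
        coupled = []
        for i, block in enumerate(blocks):     # split into one vessel per building block
            vessel = beads[i::len(blocks)]
            # every bead in a vessel receives the same building block
            coupled.extend(seq + block for seq in vessel)
        beads = coupled                        # beads are pooled again for the next round
    return beads


# Hypothetical building blocks: three coupling rounds with four choices each.
ROUNDS = ["ACDE", "FGHI", "KLMN"]
beads = split_pool(n_beads=2000, rounds=ROUNDS)

max_diversity = math.prod(len(blocks) for blocks in ROUNDS)   # 4 * 4 * 4
print("distinct compounds observed:", len(set(beads)), "of", max_diversity, "possible")
```

The theoretical diversity grows as the product of the number of building blocks per round, so the number of beads, not the number of reaction vessels, becomes the practical limit on library size.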

Huang and Bode recently reported a “synthetic fermentation” method that does not require the use of organisms, enzymes or reagents to generate a combinatorial library of complex organic molecules “grown” from small building blocks in water [31••]. In this method, the authors adapted ketoacid ligation, which produces β-amino acid linkages. By adjusting the reaction conditions and the building blocks, the sequences, structures and compositions of the products can be modulated. The authors prepared a 6,000-membered library from 23 simple building blocks and discovered a 1.0-μM inhibitor against hepatitis C virus NS3/4A protease.

Litovchick et al. developed a chemical ligation method for the construction of DECLs [32•]. The method relies on the ability of the Klenow fragment of DNA Polymerase I to translocate to a DNA backbone through triazole linkages via click cycloaddition. The authors have developed a strategy that allows for repetitive and specific installation of multiple oligonucleotide tags. This chemical ligation method represents an advance over previous DECL methods and could expand the scope and diversity of chemistry suitable for DECLs.

Many bioactive peptidic natural products contain macrocyclic structures. Suga and Bashiruddin recently published a review article [33] on the construction and screening of large libraries of natural product-like macrocyclic peptides using reconstituted translation systems where designated codons are made vacant and then reassigned to unnatural amino acids. Ribosomal synthesis of macrocyclic peptides can be achieved with a custom-made in vitro translation system containing flexizymes, amino acids (natural and unnatural), as well as unnatural amino acids capable of crosslinking with other amino acids. Fasan et al. recently reported a novel and versatile method for generating side chain-to-tail cyclic peptide macrocycles from ribosomally derived polypeptides in vitro in a pH-triggered manner or directly in living bacterial cells [34••]. Unnatural amino acids bearing a side chain of 1,3-aminothiol (AmmF) or 1,2-aminothiol (MeaF) are first ribosomally inserted into intein-containing precursor proteins (Figure 2). Spontaneous post-translational cyclization via a C-terminal ligation/ring contraction is then achieved through an intein-catalyzed intramolecular transthioesterification, followed by ring closure through an irreversible S,N-acyl transfer rearrangement. More recently, the Suga group reported a strategy for efficient post-translational modification of a library of ribosomally translated peptides by introducing exogenous free thiols, followed by ligation of carbohydrates to generate proteolytically stable thioglycopeptides [35].

Figure 2.


Strategy for generating side chain-to-tail macrocyclic peptides in vitro in a pH-triggered manner or directly in living cells.

Screening of Combinatorial Libraries

The screening of a combinatorial library can be divided into two categories: virtual screening and experimental (real) screening. Virtual screening uses computational methods to predict or simulate how a particular compound interacts with a given target protein. Three virtual screening methods used in modern drug discovery are molecular docking, pharmacophore mapping, and quantitative structure-activity relationships. The disadvantages of virtual screening are that it cannot replace real screening, and that the generated hits may be very difficult to synthesize chemically. Real screening approaches, such as high-throughput screening (HTS), can test the activity of hundreds of thousands of compounds experimentally, providing real results; however, these methods are far more expensive and slower than virtual screening methods.

The most common assay used to screen a combinatorial library determines the binding of the library compounds to the target protein. Other common assays are functional assays, such as biochemical and enzymatic assays, or cell-based assays. Cell-based assays can be direct cytotoxic assays, receptor-binding assays, or cell-signaling assays using cell lines with specific genetic reporter systems. Selection of screening methods greatly depends on the nature of the combinatorial libraries to be screened. Position-addressable soluble libraries prepared from parallel synthesis can be screened with automated HTS methods in 96-, 384-, and 1536-well plates. Libraries on solid supports (e.g. OBOC library) can be easily screened against a variety of biological targets (proteins, cells, viruses, etc.) for binding or functional activities [14], or released in situ for solution phase functional assays [36]. Phage-display peptide libraries can be screened with bio-panning [37] or limited cell-based functional assays, such as cell-binding and cellular uptake assays [37]. Structure-based virtual libraries are screened in silico. Several new screening approaches for combinatorial libraries have recently been developed; below are some examples.

Heusermann et al. recently reported the use of a standard wide-field fluorescence microscope, equipped with LED-based excitation and a modern CMOS camera [38], to detect signals associated with target proteins bound to beads in an OBOC library. The autofluorescence issue was overcome by an optical image subtraction approach. The screening system is ultra-high throughput: >200,000 bead-bound compounds can be screened in 1.5 h. Perez-Pineiro et al. reported a direct label-free ultra-fast method for the identification and spectroscopic classification of hits from OBOC peptide libraries [39]. They synthesized peptides directly on TentaGel beads decorated with bimetallic Au/Ag clusters on the surface, and subsequently used surface-enhanced Raman scattering analysis to detect the signals of the peptide on each bead. Because the Raman scattering intensity is closely associated with the distance to the surface, this method is limited to short peptides with lengths of 7 to 10 amino acids. MacConnell et al. described a microfluidic circuit that enables automated and quantitative functional screening of DNA-encoded compound beads [40]. The device sequentially carries out the following steps: distribution of the library beads into picoliter-scale assay reagent droplets, photo-cleavage of compound from the bead, assay incubation, laser-induced fluorescence-based assay detection, and fluorescence-activated droplet sorting to isolate hits.

Agnew et al. reported the use of in situ click chemistry as a screening approach to assemble multi-ligand protein-capture agents on an OBOC library [41]. This method has several advantages, including: 1) the production of the capture agent does not require prior knowledge of affinity agents against the target protein; 2) the in situ click screening covers a very large chemical space; and 3) the process can be repeated until ligands with the desired affinity and specificity are identified. For example, once a bi-ligand has been identified, it can serve as the anchor ligand to click back to the same OBOC library for discovery of a tri-ligand, and so forth. Upon the addition of each ligand to the capture agent, the affinity and the selectivity of the capture agent for its target protein increase rapidly.

We have recently developed a screening platform to identify death ligands (pro-apoptotic agents) via the screening of one-bead two-compound (OB2C) libraries [42–44]. In an OB2C library, a fixed cell-capturing ligand and a random library compound are co-displayed on each bead surface, and a coding tag resides inside the bead to exclude screening interference (Figure 3A). When live cells bind to the capturing ligand on the bead surface, the cells are forced to expose their cell membrane proteins to the OB2C library compounds (Figure 3B). After incubation, dead cells or cells undergoing apoptosis can be readily detected using propidium iodide (PI) or anti-cleaved caspase 3 antibody staining (Figure 3C). Peptide (LWK1) [42], peptidomimetic (S7-Y) [43] and small-molecule (LLS2) [44] death ligands have been identified through the OB2C library approach (Figure 3D).

Figure 3. OB2C combinatorial library technology for the discovery of death ligands.


A: Structure of an OB2C combinatorial library bead (an example). B: A cartoon illustrates the OB2C concept. C: A snapshot of a positive bead (indicated by a red arrow, stained with anti-cleaved caspase 3 antibody) from an OB2C library. D: Structures of representative death ligands identified from OB2C libraries. LWK1: peptide; S7-Y: peptidomimetic; LLS2: small-molecule.

Several approaches have been used to generate DECLs with different library-encoding methods and assembly of chemical building blocks [45••,46••]. As all compounds in the library can be identified by their DNA tags, a very large number of compounds (up to billions of molecules) can be screened simultaneously in mixture in affinity-capture experiments on target proteins. The screening process involves three steps: 1) physical isolation of the binder using automated affinity selection; 2) PCR-amplification and sequencing of the DNA codes of the isolated binders; and, 3) evaluation of the obtained sequencing data using a computer program to eliminate false binders. DECL technology can yield specific binders to a variety of target proteins and is a very useful tool for hit discovery and lead expansion.
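The third step of the DECL screening process, evaluating sequencing data to eliminate false binders, can be sketched as a simple enrichment calculation. This is one plausible analysis, not the method of any cited report: it compares tag frequencies from a target selection against a no-target control. The function names, the pseudocount, and the `min_fold` and `min_reads` thresholds are all assumptions made for illustration.

```python
from collections import Counter


def enrichment(selected_reads, control_reads, pseudocount=1.0):
    """Return, per DNA tag, the ratio of its frequency in selection vs. control."""
    sel, ctl = Counter(selected_reads), Counter(control_reads)
    tags = set(sel) | set(ctl)
    # The pseudocount keeps ratios finite for tags absent from one experiment.
    n_sel = sum(sel.values()) + pseudocount * len(tags)
    n_ctl = sum(ctl.values()) + pseudocount * len(tags)
    return {
        tag: ((sel[tag] + pseudocount) / n_sel) / ((ctl[tag] + pseudocount) / n_ctl)
        for tag in tags
    }


def call_hits(scores, selected_reads, min_fold=3.0, min_reads=10):
    """Keep tags that are both enriched and supported by enough reads."""
    counts = Counter(selected_reads)
    return sorted(t for t, s in scores.items() if s >= min_fold and counts[t] >= min_reads)


# Toy data: tag "AAC" is enriched on the target; the other tags are not.
selected = ["AAC"] * 60 + ["GGT"] * 20 + ["TTA"] * 20
control = ["AAC"] * 10 + ["GGT"] * 45 + ["TTA"] * 45

scores = enrichment(selected, control)
hits = call_hits(scores, selected)
print(hits)
```

Tags that are abundant in both experiments, for example compounds that stick to the capture matrix, score near or below 1 and are filtered out, which is the sense in which the comparison removes false binders.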

Encoding and Decoding of Combinatorial Libraries

Since the chemical structures of individual compounds in conventional addressable combinatorial libraries or planar microarray libraries are known, there is no need to encode and decode the chemical hits. For mixture libraries in solution, such as positional-scanning libraries, deconvolution is needed to determine the identity of the hits. Biologically displayed peptide libraries (e.g., phage, yeast or mRNA-display) are genetically encoded and can be decoded with PCR and DNA sequencing. Similarly, DECL decoding can be easily achieved through PCR-amplification of the DNA barcode, followed by high-throughput DNA sequencing. Buller et al. reported another approach named “Illumina sequencing of DECLs” which can yield over 10 million DNA sequence tags per flow-lane [47]. This technology can be used in a multiplex format, allowing the encoding and subsequent sequencing of multiple selections in the same experiment.
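Decoding DECL reads in a multiplex format can be sketched as a lookup against known code tables. The read layout below (a 4-base selection code followed by two 3-base building-block codes) and every code and name in the tables are hypothetical; real libraries use their own tag designs, which the text above does not specify.

```python
from collections import Counter, defaultdict

# Hypothetical code tables (assumptions for illustration only).
SELECTION_CODES = {"ACGT": "target_A", "TGCA": "no_target_control"}
BB1_CODES = {"AAA": "amine_1", "CCC": "amine_2"}
BB2_CODES = {"GGG": "acid_1", "TTT": "acid_2"}


def decode_read(read):
    """Map one read to (selection, (building block 1, building block 2)), or None."""
    if len(read) != 10:
        return None
    selection = SELECTION_CODES.get(read[:4])
    bb1 = BB1_CODES.get(read[4:7])
    bb2 = BB2_CODES.get(read[7:10])
    if None in (selection, bb1, bb2):
        return None                      # unreadable or unknown code: discard the read
    return selection, (bb1, bb2)


def demultiplex(reads):
    """Count decoded compounds separately for each selection in a pooled run."""
    counts = defaultdict(Counter)
    for read in reads:
        decoded = decode_read(read)
        if decoded is not None:
            selection, compound = decoded
            counts[selection][compound] += 1
    return counts


reads = ["ACGTAAAGGG", "ACGTAAAGGG", "TGCACCCTTT", "ACGTNNNGGG", "ACG"]
counts = demultiplex(reads)
print(dict(counts))
```

Prefixing each read with a selection code is what allows several selections to share one sequencing run and still be separated afterwards.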

Many encoding and decoding strategies have previously been developed for OBOC libraries [48], with chemical barcodes usually decoded using automatic Edman microsequencing of bead-bound peptide tags [49] or mass spectrometry of released coding tags [50,51]. Marcon et al. recently reported a fluorescence-based encoding method called “on-the-fly” encoding using colloidal barcoding [52]. In this method, 10–20 μm beads were encoded during a split-pool synthesis with smaller 0.6–0.8 μm silica colloids that contain specific and identifiable combinations of fluorescent dyes. After screening, the colloidal barcode can be decoded with confocal microscopy. Recently, Lee et al. reported a simple and efficient surface-enhanced Raman spectroscopic (SERS) barcoding method using highly sensitive SERS nanoparticles (SERS ID) [53]. More than one million codes can be generated by using combinations of 44 different SERS IDs, which are highly stable and reliable under bioassay conditions.
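The figure of more than one million codes from 44 SERS IDs can be checked with a short calculation. The text does not state how the IDs are combined, so this sketch assumes, purely for illustration, that each bead carries an unordered subset of a fixed number of distinct IDs; under that assumption the code count is a binomial coefficient.

```python
from math import comb  # available in Python 3.8+

N_SERS_IDS = 44  # number of distinct SERS IDs given in the text

# Assumption: a code is an unordered set of k distinct IDs on one bead.
codes_by_subset_size = {k: comb(N_SERS_IDS, k) for k in range(1, 7)}

# Smallest subset size whose code count exceeds one million.
smallest_k = next(k for k, n in codes_by_subset_size.items() if n > 1_000_000)

for k, n in codes_by_subset_size.items():
    print(f"{k} IDs per bead -> {n:,} codes")
print("smallest subset size exceeding one million codes:", smallest_k)
```

Under this assumption, four IDs per bead give 135,751 codes and five give 1,086,008, so subsets of five already exceed one million.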

Applications of Combinatorial Chemistry for Drug Discovery – Examples

Over the last decade, the combinatorial library approach has been applied successfully to various applications including drug discovery. Table 1 summarizes some of the published applications of different combinatorial library approaches. Below is an account of two recent reports on using DECLs for drug development.

Table 1.

Examples of recent application of combinatorial chemistry for drug discovery

Blakskjaer et al. reported a screening method called “binder trap enrichment”, which allows libraries to be screened robustly in a homogeneous manner [62]. In this method, building blocks are spatially confined at the center of the DNA junction (called Yoctoreactor), facilitating both the chemical reaction between building blocks and library encoding. The screening of DECLs can be performed in a single tube for binding. This approach has increasingly been applied as a viable technology for the identification of small-molecule modulators to protein targets. Wichert et al. recently reported using dual-pharmacophore DECLs to efficiently identify synergistic ligand pairs that bind to a target protein [63••]. In this method, small-molecules were first conjugated to the 3′ and 5′ ends of complementary DNA strands that contain a unique identifying code, followed by DNA hybridization and subsequent inter-strand code-transfer. The authors identified a low micromolar binder to alpha-1-acid glycoprotein from a dual-pharmacophore DECL containing 111,100 unique small molecules. The authors also applied dual-display technology to affinity maturation of a known inhibitor of carbonic anhydrase IX (CAIX). They successfully developed a high affinity bidentate ligand of CAIX (KD = 0.2 nM) which showed efficient in vivo tumor targeting in a SK-RC-52 kidney cancer xenograft mouse model.

Conclusion and Perspectives

Combinatorial chemistry has accelerated the development of a whole set of combinatorial tools comprising combinatorial library design, efficient synthetic methods, reagents for library synthesis (including solid supported reagents), linkers, bilayer beads, library encoding and decoding strategies, HTS methods and equipment, etc. The large-diversity combinatorial bead and planar microarrays of the early 1990s inspired investigators in fields beyond chemistry to think “combinatorially”; this change in thinking led to the development of oligonucleotide bead and planar microarrays, genomics and many other “-omics” technologies that involve the concurrent interrogation of thousands to hundreds of thousands of analytes or biomolecules. Indeed, a recent report on single-cell RNA-seq analysis with nanodroplets uses the “split-pool” synthesis approach to prepare sets of DNA barcodes on microbeads, for subsequent tracking of sequences derived from the same cell [64]. Many investigators, particularly in the pharmaceutical industry, are now working on smaller target-focused solution-phase libraries of compounds with drug-like properties, and incorporating ADMET filters and structure-based drug design approaches into library development [65]. However, for novel lead discovery against a large number of therapeutic targets, particularly for those targets with little structural information, the various high diversity library methods outlined in this mini-review will undoubtedly be invaluable.

Many macrocyclic natural products are non-peptides, and some of them are polyketide-based. There is a great need to develop novel and efficient chemistry for the generation of macrocycles that mimic such structures [33]. Incorporating chemical features of such molecules into the design of “easy-to-couple” building blocks will enable the development of large, diverse natural product-like macrocyclic libraries for the discovery of novel drug leads. Another promising method in combinatorial chemistry is the use of nature’s highly stable peptides, such as cyclotides [66], as scaffolds [67] for library design. Random peptide loops can be grafted, chemically [68] or recombinantly [69], into cysteine knots to form cyclotide libraries.

Although the initial high expectations of combinatorial chemistry for drug discovery have yet to be realized, much has been learned over the last 30 years. Many new chemical, biological, computational, and screening tools have been developed. The limitations and strengths of combinatorial chemistry are better understood. We are now in a better position to truly leverage the power of combinatorial technologies for the discovery and development of next-generation drugs. The future of utilizing combinatorial chemistry for drug discovery is bright.

Acknowledgments

We want to thank Jonathan S. Huynh for proofreading the manuscript.

Funding

This work was supported by the National Institutes of Health (R21 CA135345 for Liu and R01CA115483, R33CA196445 and U01EB021230 for Lam).

Footnotes


Papers of particular interest, published within the period of review, have been highlighted as:

• of special interest

•• of outstanding interest