JComputAidedMolDes(2017)31:29–44 DOI10.1007/s10822-016-9956-6 A combined treatment of hydration and dynamical effects for the modeling of host–guest binding thermodynamics: the SAMPL5 blinded challenge Rajat Kumar Pal1,5 • Kamran Haider2 • Divya Kaur6 • William Flynn4,7 • Junchao Xia4 • Ronald M Levy4 • Tetiana Taran3 • Lauren Wickstrom3 • Tom Kurtzman2,5,6 • Emilio Gallicchio1,5,6 Received:17June2016/Accepted:25August2016/Publishedonline:30September2016 (cid:2)SpringerInternationalPublishingSwitzerland2016 Abstract As part of the SAMPL5 blinded experiment, we charged guests. Binding free energies obtained with computed the absolute binding free energies of 22 host– Debye–Hu¨ckel treatment of salt effects were in good guestcomplexesemployinganovelapproachbasedonthe agreement with experimental measurements. Water dis- BEDAM single-decoupling alchemical free energy proto- placement effects contributed favorably and very signifi- colwithparallelreplicaexchangeconformationalsampling cantly to the observed binding affinities; without it, the and the AGBNP2 implicit solvation model specifically modeling predictions would have grossly underestimated customized to treat the effect of water displacement as binding. The work validates the implicit/explicit solvation modeled by the Hydration Site Analysis method with approach employed here and it shows that comprehensive explicit solvation. Initial predictions were affected by the physical models can be effective at predicting binding lack of treatment of ionic charge screening, which is very affinities of molecular complexes requiring accurate treat- significant for these highly charged hosts, and resulted in ment of conformational dynamics and hydration. poor relative ranking of negatively versus positively Keywords SAMPL5 (cid:2) Hydration Site Analysis (HSA) (cid:2) Debye–Hu¨ckel (cid:2) Salt effects (cid:2) AGBNP2 (cid:2) BEDAM Theoriginalversionofthisarticlewasrevised:Correctionsdonein theoriginalarticlehasbeenpublishedintheerratum. Introduction & EmilioGallicchio [email protected] Accurate modeling of hydration is important for under- 1 DepartmentofChemistry,BrooklynCollege,2900Bedford standing the thermodynamics of binding and molecular Avenue,Brooklyn,NewYork11210,USA recognition[1–9].Bindingfreeenergymodelswithexplicit 2 DepartmentofChemistry,LehmanCollege,TheCity solvation are considered as the ‘‘gold standard’’ for mod- UniversityofNewYork,250BedfordParkBlvd.West, eling receptor-ligand complexation because of the critical Bronx,NewYork,NY10468,USA role of desolvation, the hydrophobic effect, and bridging 3 BoroughofManhattanCommunityCollege,Departmentof waters. In recent years, there has been significant progress Science,TheCityUniversityofNewYork,199Chambers in the development of computational tools to quantify the Street,NewYork,NY10007,USA structure and thermodynamics of water molecules on host 4 CenterforBiophysicsandComputationalBiology,Institute and protein surfaces [10–15]. Many of these approaches ofComputationalMolecularScienceandDepartmentof Chemistry,TempleUniversity,Philadelphia,PA,USA have highlighted the importance of displacing water 5 Ph.D.PrograminBiochemistry,TheGraduateCenterofthe molecules from enclosed regions of protein active sites in CityUniversityofNewYork,NewYork,NY10016,USA boosting the binding affinity [16–20]. 6 Ph.D.PrograminChemistry,TheGraduateCenteroftheCity Inthiswork,weemployHydrationSiteAnalysis(HSA) UniversityofNewYork,NewYork,NY10016,USA [10, 18, 20], a methodology based on Inhomogeneous 7 DepartmentofPhysicsandAstronomy,RutgersUniversity, SolvationTheory(IST)[21]tomapoutsolvationstructural Piscataway,NJ08854,USA and thermodynamic features in protein binding sites. High 123 30 JComputAidedMolDes(2017)31:29–44 density regions of water, termed hydration sites, are clas- AGBNP hydration site spheres to mimic the thermody- sified based on a number of energetic and structural mea- namicsofenclosedwatermolecules[24,32].Theresulting sures such as the water-protein interaction energy, water hybrid explicit/implicit model was then employed to pre- entropy and the strength of local water-water and water- dict the binding affinities of the SAMPL5 datasets. The protein hydrogen bonding. These measures are used to SAMPL5 blinded experiment, based on the octa-acid, [33] identifyregionsfromwhichthedisplacementofwaterleads tetra-endo-methyl octa-acid [34] and CB-clip host–guest to either significant gains or significant penalties to the systems [35], offers a unique opportunity to validate the predicted binding affinity. Ligands that displace high free approach outlinedhere.The tetra-endo-methylocta-acid is energy water molecules are morelikelyto bind strongly. referred to as methyl octa-acid throughout the article. We Moleculesareinherentlyflexible.Hence,computational have studied the octa-acid system before as part of the models ofmolecular recognition can not ignore dynamical previous SAMPL challenge [36], where we trained the aspects of binding. In this work, we employ the Binding AGBNP water sites empirically to reproduce available Energy Distribution Analysis Method (BEDAM) [22–24], experimental data. Here instead, we intend to train the a well established alchemical protocol for the quantitative model systematically using data from the HSA method, prediction of absolute binding free energies. BEDAM has grounded in well established hydration theories. Further- been specifically designed to capture conformational reor- more, the SAMPL5 datasets give us the opportunity to ganization and entropic contributions to the binding free validate our approach in an unbiased manner on hosts for energies. To overcome the slow time-scales of conforma- which experimental data is not available. tional reorganization, the method employs parallel multi- dimensional replica exchange conformational sampling algorithms[25,26]coupledwithanimplicitrepresentation Methods of the solvent [27–29]. The implicit solvent representation enables the formu- Hydration Site Analysis of the host binding cavity lationofbindingintermsofthedirecttransferoftheligand from the implicit solvent continuum to the receptor site We investigated the hydration properties of the three host [22]. This single-decoupling feature circumvents some of molecules using Hydration Site Analysis (HSA) approach the convergence difficulties encountered with double-de- [10,18].HSAutilizesanexplicitsolventMDsimulationof coupling approaches with explicit solvation [30, 31]. Key the solute molecule to identify spherical sites where water to the method is the sampling of the effective binding is present at high density. The thermodynamic quantities energy function u(x), which represents the ligand-receptor forthesesites,suchasenthalpy,first-orderentropyandfree interaction energy of a specific conformation x of the energy are calculated using inhomogeneous solvation the- complex averaged over the conformations of the solvent. ory (IST) [21]. A detailed description of the simulation The benefits of the single-decoupling approach is particu- setup used for HSA is given in the Computational Details larly significant for large and charged ligands, as in the section. In summary, the explicit solvent simulation for present application. The same approach is simply not fea- each of the host system in a restrained configuration is sible withexplicitsolvation since itwouldentailalengthy processed to obtain hydration site locations where water potential of mean force calculation for each evaluation of molecules are present with at least twice the bulk density. the effective binding energy function [30]. The clustering procedure to obtain hydration site locations The advantages of implicit solvation modeling in terms is described in detail elsewhere [20]. Briefly, for each of conformational sampling and the single decoupling hydration site, the total energy of the water, E , is cal- total pathway are, however, counterbalanced by the lack of a culatedasthesumofitsmeanwater-waterE ,andsolute- ww detailed representation of hydration effects. A way around water interaction energies E , where E is one half the sw ww these conflicting requirements is to train the implicit sol- mean interaction energy of the water in the site with all ventenergyfunctionbasedonexplicitsolvationdataaswe other waters, and E is one half the mean interaction sw set out to do in this work. BEDAM employs the Analytic energy of the water in the site with the solute. The factors Generalized Born plus Non-Polar (AGBNP) model which of one half follow the convention that half of the interac- has been specifically tailored for protein-drug binding tion energy in a pairwise interaction is assigned to each applications[29].Oneimportantaspectofthemodelisthe molecule of the pair. The locations and average solvation treatment of short ranged water-solute interactions by energies of the hydration sites obtained for each host means of hydration site spheres [24]. molecule are shown in Fig. 1 and Table 1. The solvation In this work, structural and thermodynamic data energies were compared to TIP3P neat liquid simulations obtained from explicit solvent Hydration Site Analysis are todeterminewhichhydrationsitescontainedfavorableand used to inform the placement and energetic strength of unfavorable water molecules relative to bulk solvent. The 123 JComputAidedMolDes(2017)31:29–44 31 Fig.1 Locationofhydrationsiteswithinthebindingcavityofthehostsasidentifiedfromhydrationsiteanalysis.aOcta-acid,bmethylocta- acid,cCB-clip Table1 SummaryofHydrationSiteAnalysis(HSA)resultsforthreehostsystemsinvestigatedinthisstudy Siteid Location Occupancy(p) Esw Eww Etotal ½Etotal(cid:3)Ebulk(cid:4)p Nnbrs fenc A. Octa-acid 0 Bottom 0.62 -4.57 -2.42 (cid:3)6:99 1.58 1.27 0.76 1 Top 0.40 -0.64 -7.59 (cid:3)8:23 0.52 3.06 0.42 2 Top 0.41 -0.61 -7.41 (cid:3)8:02 0.62 2.89 0.45 3 Top 0.40 -0.58 -7.62 (cid:3)8:19 0.53 3.12 0.41 4 Top 0.40 -0.58 -7.55 (cid:3)8:13 0.56 2.96 0.44 5 Middle 0.29 -1.48 -7.50 (cid:3)8:98 0.16 2.89 0.45 B. Methylocta-acid 0 Bottom 0.60 -4.52 -2.14 (cid:3)6:66 1.72 1.32 0.75 1 Top 0.36 -0.48 (cid:3)7:00 (cid:3)7:48 0.74 2.27 0.57 2 Top 0.31 -0.73 -6.84 (cid:3)7:57 0.61 2.54 0.52 3 Rim 0.31 -0.02 (cid:3)8:49 (cid:3)8:51 0.31 3.19 0.39 4 Top 0.32 -0.85 -6.83 (cid:3)7:68 0.60 2.50 0.52 5 Top 0.30 -0.62 -6.68 (cid:3)7:30 0.66 2.24 0.57 6 Rim 0.27 0.60 -9.67 (cid:3)9:06 0.66 3.64 0.31 C. CB-clip 0 Center 0.92 -5.76 -2.27 (cid:3)8:03 1.38 1.75 0.67 3 Naphthalenerings 0.42 -1.55 (cid:3)6:97 (cid:3)8:52 0.42 3.43 0.35 Allenergeticquantitiesareexpressedinkcal/mol.ForCB-clip,atotalofninehydrationsitewereobtainedinthecavity,thetwositeusedfor AGBNP2parametrizationarereportedhere.E =-9.53kcal/mol bulk relative solvation energies and positions of the explicit below.TheHSAscores(listedinTable 1)aregivenbythe hydration sites were used to position and parameterize the expression ½E ðiÞ(cid:3)E (cid:4)pðiÞ where i is the index of the total bulk strength of the AGBNP2 hydration site adjustments which HSAhydrationsites,p(i)isthewateroccupancyofthesite, account for enclosed hydration effects. We chose to only E ðiÞ is the total energy of the site and E is the cor- total bulk useHSAenergiestoparameterizethestrengthofhydrogen responding reference value obtained from TIP3P neat bond correction parameters because HSA entropies were water.Twowatermoleculesareconsideredtobefirst-shell previouslyshowntobenon-predictiveinascoringfunction neighbors in a given MD frame if their oxygen atoms are usedtopredictrelativebindingaffinitiesofligandsbinding separatedbylessthan3.5A˚ Thisdistancecriterionisbased to Factor Xa [20]. For each system, we derive scores for onthe location ofthe minima ofthe oxygen-oxygen radial hydration sites and apply them as corrections to AGBNP2 distributionfunctionforbulkwater[37].Themeannumber sites for the corresponding system (Fig. 2a) as described of these first shell neighbors for water in each hydration 123 32 JComputAidedMolDes(2017)31:29–44 Fig. 2 a Position of the hydration spheres added to AGBNP2 between two naphthalene rings; the center of masses of carbons parameters for each hosts, b the carbon atoms used to position the markedingreenwereusedtopositionwater-sitesinthemiddlecavity hydration sphere in all three hosts; the carbon atoms marked in of the octa-acids; carbons marked in orange were used to model a magentaareusedforpositioningthewatersitesonthetopcavityof watersiteatthebottomcavityoftheocta-acids;inCB-clip,awater octa-acids;inCB-clip,thosecarbonsareusedtopositionwatersitein siteispositionedatthecenterofmassofthosefourcarbons site is termed Nnbrs, and this first-shell neighbor count is DG ¼DG þDG þDG ð2Þ h elec np hs used to define the fractional enclosure of a hydration site, Theelectrostaticcomponentofthehydrationfreeenergyis byreferencingthenumberofitsfirst-shellneighborstothe represented by a pair-wise descreening implementation of mean number of first-shell neighbors of a TIP3P water molecule in bulk, Nnbrs: the Generalized Born model [28, 40, 41]. The non-polar bulk component of the hydration free energy is modeled as the Nnbrs sum of cavity and solute-solvent dispersions interactions f ¼1(cid:3) ð1Þ enc Nnbrs [28]. AGBNP2 includes algorithms to place and score bulk hydration spheres on the surface of the solute to describe This quantity indicates the degree to which the water in a short-range solute-solvent interactions such as hydrogen hydration site is blocked from contact with other water bonding [29]. The functional form of the short-ranged molecules. The value of Nnbrs is 5.25 as calculated from hydrogen bonding component of the solvation free energy bulk TIP3P pure water. in Eq. (2) above is X DG ¼ h Sðw Þ hs s s ð3Þ The AGBNP2 implicit solvation model s wherew isthewateroccupancyfactorofthesitemeasured s Here we employed the OPLS-AA/AGBNP2 effective as the fraction of the volume of the hydration site sphere potential,whereOPLS-AA[38,39]representsthecovalent actually accessible to water defined as and non-bonded inter-atomic interactions and solvation is Vfree modeled implicitly by the Analytic Generalized Born plus w ¼ s ð4Þ s V non-polar (AGBNP2) model [29]. The hydration free s energyDGh iscomputedasthesumofelectrostaticDGelec, whereVs isthevolumeofthewatersphereandVsfree isthe non-polar DGnp, and short-range solute-water hydrogen free volume of the water site, that is the volume not bonding interaction DGhs: occludedbysoluteatoms,obtainedbysummingallthetwo 123 JComputAidedMolDes(2017)31:29–44 33 body,threebody,etc.overlapvolumesofthewatersphere sodium phosphate solution used for the measurements of with the solute atoms i, j, k, etc. as [29] CB-clip guest binding, was used throughout. X X X Vfree ¼V (cid:3) V þ V (cid:3) V s s si sij sijk ð5Þ i i\j i\j\k Binding free energy protocol In Eq. (3), S is a switching function, and the h are s adjustable parameters which measure the strength of the Binding free energies were computed using the Binding hydrogen bonding energy (more precisely the portion of it EnergyDistributionAnalysisMethod (BEDAM)[22].The not captured by the Generalized Born model) [29]. The methodusesasingle-decouplingalchemicalapproachwith specification of the solute atoms carrying such hydration implicit solvation. An alchemical progress parameter k is spheresisaccomplishedbymeansofSMARTSpatterns.A introducedrangingfrom0,correspondingtotheuncoupled parameterdatabaseisusedtomapSMARTSpatternstothe state of the complex, to 1, corresponding to the coupled energeticparametersandplacementgeometryofhydration state of the complex. The k-dependent effective soft-core spheres [29]. The h parameters are normally negative and hybrid potential energy function is: s adjusted to describe favorable solute-solvent hydrogen U ðrÞ¼U ðrÞþkuðrÞ ð7Þ k 0 bonding interactions. In this context, the DG term disfa- hs vors binding by increasing the ligand and receptor desol- where r ¼ðr ;r Þ are the atomic coordinates of the A B vationpenalties.Inaddition,inthiswork,weusethesame receptor-ligand complex and r and r denote the coordi- A B functionalformtomodelenclosedwatermolecules,which nates of the receptor and ligand, respectively. U ðrÞ¼ 0 contribute favorably to binding, when displaced by the Uðr ÞþUðr Þisthepotentialenergyofthecomplexwhen A B ligand.Thekeydistinctionbetweenhydrogenbondingsites the receptor and ligand are uncoupled, i.e. as if they are andenclosedhydrationsitesisthesignoftheh parameter, separatedatinfinitedistancefromeachother.Thequantity s whichispositiveforthelatterandnegativefortheformer. u(r) is the binding energy and is defined as the change in The specific values of h assigned to enclosed hydration the effective potential energy of the complex for bringing s sites are derived from explicit solvent HSA analysis and the receptor and ligand from infinite separation to the are given in the Results section below. conformation r of the complex: uðrÞ¼Uðr ;r Þ(cid:3)ðUðr ÞþUðr ÞÞ ð8Þ A B A B Incorporation of Debye–Hu¨ckel parameters In this work we adopted a soft-core form of the binding in the implicit solvation model energy mentioned above and as described [23, 43]. UWHAM analysis of the binding energy samples was In this work the Generalized Born components of the carried as described [43] to obtain binding free-energy AGBNP2 model included modifications based on the estimates and their corresponding statistical uncertainties. Debye–Hu¨ckel screening aimed at modeling the effects of Average interaction energies, DE were obtained by aver- b electrolytes present in the aqueous solution. Specifically, aging the binding energy values collected at k=1 (the we adopted the following functional form for the GB pair bound state). The uncertainty on interaction energies were potential [42] measured as the standard error of the mean. The entropic u ði;jÞ¼(cid:3)(cid:2) 1 (cid:3)e(cid:3)jrij(cid:3) qiqj contributiontothefreeenergy,theeffectofreorganization GB (cid:2) (cid:2) rffihffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiiffiffi energy and reorganization free energy to binding were out in ri2jþBiBje(cid:3)ðri2j=4BiBjÞ derived [26, 44] from binding energies and other thermo- dynamic parameters as described in the computational ð6Þ details section. The uncertainties of all these quantities where r is the distance between atoms i and j, (cid:2) ¼80 were computed by error propagation. ij out and (cid:2) ¼1 are the dielectric constants of the solvent and In this work, BEDAM conformational sampling is in the interior of the solute respectively; q and q are the acceleratedbymeansoftwo-dimensionalreplicaexchange i j partialchargesofatomsiandj.ThetermsB andB arethe along alchemical and temperature parameters [26, 45]. k i j Born radii of atoms i and j. The Debye–Hu¨ckel screening exchanges facilitate the mixing of intermolecular degrees p parameter is defined as j¼ ffi8ffiffipffiffiffikffiffiffiffiIffiffi where I is the ionic of freedom, while temperature exchanges activate B strength of the solution and k is the Bjerrum length intramolecular degrees of freedom thus allowing confor- B [40, 42]. For water at room temperature, k =7.0 A˚. An mational transitions to explore the conformational space B ionic strength of 0.12 M, corresponding to the 20 mM more efficiently [46]. 123 34 JComputAidedMolDes(2017)31:29–44 Computational details the water molecules in those locations. For each host, 190, 000 frames of the trajectory were processed with Explicit solvent calculations and Hydration Site Analysis HSA analysis. Hydration site locations were identified (HSA) using a subset of 10,000 frames which were evenly spaced in time. High density spherical regions (hydration For each host molecule, a molecular dynamics simulation sites) of 1A˚ radius were identified using a clustering was performed using the OPLS 2005 force field (Schro- procedure [10] on the water molecules that were found dinger Inc.) in a TIP3P [47] cubic water box, using the within 5 A˚ of the host molecule. The resulting hydration DESMOND package [48]. The initial startingstructure for sites were each populated by retrieving all water mole- eachsimulationwasarepresentativeholostructurewithout cules, which had oxygen atoms within 1.0 A˚ from the the ligand. These representative structures were obtained corresponding hydration site center. The hydration sites from BEDAM simulations of the octa-acid and methyl were then enumerated according to their occupancies, octa-acid hosts with guest 4 and the CB-Clip host with with the highest populated site given the index 0. The guest1.Eachwaterboxwasbuiltwithaminimumdistance average solvation energies were calculated based on of 10 A˚ between the solute and the edge of the box. The energies of the water molecules that were within 1.0 A˚ default Desmond equilibration protocol was used to mini- from the coordinates of each hydration site center in the mize and thermalize the systems for the production run. original 190,000 frames. The system was further relaxed for 25 ns under NPT conditions at 1 atm and 300 K. An NVT production run was started from a configuration in which the box volume Systempreparationforthebindingfreeenergycalculations was close to the average of the previous NPT simulation. The production MD simulations were run for 100 ns and Thestructures ofthe octa-acid and CB-clip hostsandtheir snapshotswerecollectedevery0.5ps.Thefirst1nsofthe guests were prepared using Maestro (Schrodinger Inc.) production run was discarded. In the octa-acid and methyl starting from structure files provided by the SAMPL5 octa-acidsimulations,positionalrestraintswereusedonall organizers. Bond orders and formal charges were adjusted oftheheavyatomswithaforceconstantof5.0kcal/mol/A˚2 asappropriate.Alltheeightcarboxylategroupsoftheocta- which allowed for free rotation of hydrogens. In the CB- acids were modeled as deprotonated resulting in a net clip simulations, position restraints were used on all of the negativechargeof(cid:3)8.Similarly,thefoursulfonategroups heavy atoms of the host with a force constant of 5.0 CB-cliphostweremodeledasallionized,resultinginanet kcal/mol/A˚2, excluding the sulfonate groups. The use of overall charge of (cid:3)4. Alternate protonation and tau- one rigid receptor should not have a significant effect on tomerization states of the CB-clip guests were generated determining the hydration sites and the following free using the LigPrep facility (Schrodinger Inc.) using Epik energycalculationsduetothenatureofthescalingfunction [49] with a pH range of 7±2. Ionization and tautomeriza- used to calculate enclosure energies and the rigidity of the tion free energy penalties were recorded and added to the SAMPL5 hosts. If the host structures are unrestrained BEDAM binding free energy estimates to compute the duringHSAanalysis,theoccupanciesofthehydrationsites predictedbindingfreeenergies.LigPreppredictedmultiple are expected to decrease due to the effect of the confor- protonation and tautomerization states for the CB-clip mationalfluctuationsofthehostonthesolvent.Incontrast, guests. Multiple protonation states of guests 1, 4, 5, 6, 8 the energies of these hydration sites may become more and 9 were tested individually. Guests 3 and 5 which favorable due to the ability of the solvent to optimize its contain alkylammonium groups were modeled as posi- interactionswitheithertheneighboringwatermoleculesor tively charged and the guests containing carboxyl groups, nearby solute atoms. These changes to the HSA energies guests 1, 2, and 4 were modeled as deprotonated each and occupancies will have an impact on the global having a single negative charge. Guest 6, which is parameter c and should not impact the enclosure energies nitrobenzoic acid, holds an overall single negative charge usedinthefreeenergycalculations.Ontheotherhand,this contributed by the carboxyl moiety. To match the experi- treatmentisinappropriateformoreflexiblehoststhatadopt mental pH, the acidic guests of the octa-acid set were multiple bound states because the hydration site locations modeledasdeprotonatedandionizationpenalties werenot will vary for each bound representative structure. SHAKE applied. wasusedtoconstrainanybondsinvolvinghydrogenatoms. All individual guests were manually docked to the Hydration Site Analysis involved two steps: (1) identi- bindingcavityofthehostsintwooftheocta-acidsandthe fying regions of high water density around each host sur- in-betweenspaceofthetwonaphthaleneringsonCB-clip. face and (2) calculating the average solvation energies of Joint Hamiltonian and temperature 2-D Asynchronous 123 JComputAidedMolDes(2017)31:29–44 35 Replica-exchange Molecular dynamics simulations [26] Results employed18intermediate kstepsasfollows: k=0,0.002, 0.004, 0.008, 0.01, 0.02, 0.04, 0.07, 0.1, 0.17, 0.25, 0.35, The locations of hydration sites obtained for each host 0.5, 0.6, 0.7, 0.8, 0.9, and 1.0 and 8 temperatures were moleculeareshowninFig. 1.Fortheocta-acidhost,atotal distributed between 300 K and 534 K (T = 300, 328, 358 of six hydration sites are identified, one in the bottom ,390,424,458,494,534).Atotalof144replicaswereused cavity,oneinthemiddlecavityandfourintheuppercavity corresponding to all possible pairing of temperature and k (Fig. 1,left).Forthemethylocta-acidhost,atotalofseven values. The replica-exchange (AsyncRE) simulations were hydrationsitesareidentified,oneinthebottomcavity,four started from energy-minimized and thermalized structures in the upper cavity and two at the level of the rim of the fromthemanuallydockedmodels.Aflat-bottomharmonic host (Fig. 1, middle). The introduction of the methyl sub- restraint with a tolerance of 5 A˚ between the centers of stituentsmainlycausespartialorderingofwatermolecules massofthehostandtheguestwasapplied.Eachcycleofa near the rim of the octa-ocid host, which are otherwise replica lastedfor100picosecondswith 1fstime-step. The bulk-like. In the particular conformation examined, two average sampling time for a replica was approximately 10 partially ordered sites are found at the center of the mouth ns. Calculations were performed on the campus computa- of the cavity(Fig. 1, middle). For the CB-clip host, a total tional grids at Brooklyn College and at Temple University of nine hydration sites were identified, one of which is and on the XSEDE SuperMIC cluster, the latter using 3 located at the center of the host and the remaining eight compute nodes utilizing all CPU’s and both MIC devices. were symmetrically distributed, spanning the region from The binding energies obtained from all replicas were ana- naphthalene rings to the outer edges of the cavity. In the lyzed using UWHAM [43] method and the R-statistical bindingfreeenergycalculations(seebelow)weconsidered package to compute the binding free energy DG(cid:5) and thecentralsiteplusthesitesandwichedinbetweenthetwo b average interaction energies DE of individual complexes. naphthalene rings (Fig. 1, right). b Energy values from the first 1 ns of each replica were In this work, as in the previous SAMPL4 host–guest excluded from the free energy calculation. Calculation at edition [36] , we used custom AGBNP2 hydration spheres multiple temperature allows estimation of the conforma- to model the effect of enclosed water molecules displaced tional entropy of binding DS(cid:5) from the temperature upon ligand binding (see Methods). The idea is that these conf derivative of the binding free energy (here we used a sites have large water occupancy ðws ’1Þ in the unbound simple finite difference estimator using adjacent binding state and small water occupancy ðws ’0Þ in the presence free energy values). Reorganization free energies DG(cid:5) oftheboundligand.Hencepositivevaluesofthewatersite reorg were measured as the difference between the binding free energy parameter ðhsÞ can mimic the favorable effect of energy and the average interaction energy at the bound water expulsion towards binding. The approach we fol- state as lowed here for SAMPL5 is to place and score enclosed hydration sites which are distinct from the empirical DG(cid:5)reorg ¼DG(cid:5)b(cid:3)DEb ð9Þ approach adopted for the octa-acid host in SAMPL4. In SAMPL4 hydration site parameterization was conducted Finally, the reorganization energy for each complex was essentially by trial and error in an attempt to reproduce measured by subtracting the entropic component from the known experimental data. In this work instead the place- reorganization free energy: ment and scoring of AGBNP2 hydration sites is based on HSA analysis of explicit solvent data. Furthermore, DE(cid:5) ¼DG(cid:5) þTDS(cid:5) ð10Þ reorg reorg conf because of limitations of the AGBNP2 hydration site Statistical uncertainties for the reorganization free energy, placement algorithm, in SAMPL4 some of the effects of conformational entropy and reorganization energy were water expulsion were captured by empirical modifications obtained from error propagation of the uncertainties of the of surface tension and solute-solvent van der Waals binding free energy and the average binding energy. The parameters.Forthiswork,wedevelopednovelgeometrical data submitted to SAMPL5 challenge were obtained from algorithms to place hydration sites based on geometrical the 2D replica exchange binding free energy calculations centersofarbitrarygroupsofatomsandmoresignificantly, without inclusion of salt effects. Uncertainties were thanks to HSA analysis, we were able to reduce substan- reported as twice the standard error. The binding free tially, the number of empirically-derived parameters as energy predictions were submitted to the SAMPL5 host– described next. guest challenge on January 29, 2016. For our submissions, The locations of enclosed hydration sites identified by the octa-acid, methyl octa-acid and CB-clip were assigned HSA analysis were reproduced as best as possible using prediction IDs # 18, 20 and 10 respectively. AGBNP2 water spheres and the anchoring methods 123 36 JComputAidedMolDes(2017)31:29–44 available in AGBNP2 [29] and those developed for this octa-acid set [36]. Deviations from the experiments sug- work. Solute anchoring points correspond to hydrogen gested that the HSA analysis over-emphasized the benefit bondingsitessuchasalongpolarhydrogenbondsandlone of displacing water from the bottom site. To correct this pairs in various geometries [29], in addition to anchoring deficiency we decided to assign h parameters by dis- s points based on geometrical centers of groups of atoms. tributing the total enclosure energy equally over the six Waterspherelocationsdefinedinthiswaynaturallyfollow sites. This led to results in reasonable agreement with solutemovements.Thefourtopsitesoftheocta-acidhost, experimental affinities with a scaling factor of c¼2:88 sites (1, 2, 3, and 4) in Table 1, were modeled with eight withoutionic screeningandc¼1:85withionicscreening. AGBNP2 hydration spheres placed perpendicularly to The predictions for the octa-acid and methyl octa-acid inner phenyl rings of the cavity (Fig. 2b, left). The middle hosts were produced using this scheme. For the CB-clip hydration site of the octa-acid host (site 5, Table 1A) was host, lacking experimental evidence supporting one strat- modeled by an AGBNP2 site placed at the geometrical egy over another, we employed the more physically rea- center of a set of aromatic carbons as shown in Fig. 2b, sonableapproachofderivingh parametersfromindividual s left.Thebottomwatersite(site0inTable 1A)wasplaced enclosure energies as described above. The same scaling similarly.Thesix hydration sitesidentifiedforthe methyl- factors derived for the octa-acid system were used for the octa-acid host (Table 1B), were modeled using three CB-clip system. AGBNP2hydrationspheresasindicatedinFig 2a,middle. The blinded binding free energy predictions DG(cid:5) calc The one at the rim ofthe cavity corresponds to sites 3 and submitted to the SAMPL5 challenge, obtained without the 6,thesecondinthetopregionofthecavitycorrespondsto inclusionofsalteffects,areshowninTables 2, 3and 4for sites 1, 2, 4 and 5, and finally the bottom site. All these the octa-acid, methyl octa-acid and CB-clip sets, respec- individual water sites were placed at the center of mass of tively and Fig. 3, together with the corresponding experi- specific set of carbons surroundingthe three regionsin the mentalvalues.Thesepredictionsdidnotreproducewellthe pocket (Fig. 2b, middle). For the CB-clip host, one water experimental measurements, with the most noticeable sitewasplacedatthecenterofmassof4aromaticcarbons shortcoming being the significant overestimation of the and4 alkyl carbonsinthe center ofthe cavity as shownin affinitiesofpositivelychargedguests(guests3and5ofthe (Fig. 2b, right). The second water site was placed in octa-acid guests, and guests 6 and 10 of the CB-clip host). betweenthetwonaphthaleneringsatthecenterofmassof Theresultswiththeinclusionofionicchargescreeningare 4 aromatic carbon atoms indicated (Fig. 2b, right). We inmuchbetteragreementwiththeexperiments.Rootmean have not attempted to model hydration sites located above squareerrors(RMSE)relativetoexperiments(Tables 2, 3; and below the CB-clip host cavity partly because of their Fig. 3a, b) are reduced by more than a factor of 2 for the polar character already captured by AGBNP2 hydrogen- octa-acidhostandbymorethanafactorof3forthemethyl bonding sites [29]. octa-acid host. The RMSE for the CB-clip set (Table 4; The solvation parameters h associated with each Fig. 3c) is reduced from 4.7 to 3.6 kcal/mol. Introduction s AGBNP2 water site of the octa-acid host were derived of salt effects also improved the correlation between from HSA analysis using the water enclosure energies as computational predictions and the experiments. Pearson correlationcoefficients (r)improvedfrom0.66to0.89and E ðiÞ¼c½E ðiÞ(cid:3)E (cid:4)pðiÞ ð11Þ enclosure total bulk from0.67to0.90forthemethylocta-acidandCB-clipsets, wherep(i)isthewateroccupancyofthesiteand½E ðiÞ(cid:3) respectively. total E (cid:4)istheaverageenergyofthesiterelativetotheenergy The correlation coefficient for the octa-acid set bulk of bulk water (Table 1). Enclosure energies were rescaled improvedsignificantlywiththeinclusionofsalteffectsbut by a global parameter, c, to obtain AGBNP2 hydration remained low. This can be considered an acceptable result parameters for enclosed water spheres. The global scaling given that, with the exception of guest 4, the range of parameterwasadjustedsoastoreproducetheexperimental experimental affinities of the octa-acid set is small (less binding free energies of the SAMPL4 octa-acid set as than1.8kcal/mol).Thestrongestbinderinthissetisguest described below. Scaled enclosure energies of hydration 4. The calculations do not reproduce this feature well. sites corresponding to a single AGBNP2 hydration sphere Instead,guest3,apositivelychargedligand,ispredictedto were aggregated into the h parameter of that sphere. bindthestrongestwith-8.8kcal/molwithionicscreening. s Conversely, the h parameters for AGBNP2 sites corre- The trend of overprediction of the affinity of positively s spondingtomorethanonehydrationsiteswereobtainedby charged ligands, while significantly reduced, remain even distributing enclosure energies equally. after inclusion of ionic screening. The origin of the rela- Using the scheme above, we were unable to reproduce tivelylargedeviationbetweenexperimentalandcalculated accurately the experimental affinities for the SAMPL4 affinities for guest 4 is not clearly evident. This complex 123 JComputAidedMolDes(2017)31:29–44 37 Table2 Calculatedandexperimentalbindingfreeenergieswithandwithoutsalteffectsforocta-acidsetandthermodynamicdecompositionof bindingfreeenergieswiththeinclusionofsalteffects Guests Structure DG(cid:5)a Withoutsalteffects Withsalteffects exp DG(cid:5)b DG(cid:5)c DEd DG(cid:5)e DEf (cid:3)TDS(cid:5)g calc calcDH b reorg reorg conf G1 O (cid:3)5:4 (cid:3)2:2(cid:6)0:06 (cid:3)3:8(cid:6)0:08 (cid:3)15:5(cid:6)0:01 11.7±0.09 1.0±1.4 10.7±1.4 O– G2 O (cid:3)4:7 (cid:3)6:3(cid:6)0:06 (cid:3)6:6(cid:6)0:09 (cid:3)15:9(cid:6)0:01 9.2±0.09 0.9±1.6 8.4±1.5 O– N G3 (cid:3)4:5 (cid:3)15:4(cid:6)0:06 (cid:3)8:8(cid:6)0:09 (cid:3)20:4(cid:6)0:01 11.6±0.09 -0.9±1.5 12.5±1.5 N+ G4 Br (cid:3)9:4 (cid:3)6:3(cid:6)0:08 (cid:3)6:8(cid:6)0:2 (cid:3)20:0(cid:6)0:02 13.2±0.1 1.3±2.6 11.9±2.6 O –O G5 (cid:3)3:7 (cid:3)11:3(cid:6)0:06 (cid:3)6:3(cid:6)0:1 (cid:3)16:6(cid:6)0:02 10.3±0.1 1.7±1.7 8.6±1.7 N+ G6 O O (cid:3)5:3 (cid:3)7:4(cid:6)0:06 (cid:3)7:0(cid:6)0:09 (cid:3)18:8(cid:6)0:02 11.8±0.1 0.2±1.6 11.6±1.6 N+ –O O– RMSE 5.8 2.6 Correlationcoefficient(r) (cid:3)0:39 (cid:3)0:04 Allvaluesinkcal/mol.aExperimentalITCbindingfreeenergy[35].bPredictedbindingfreeenergywithoutadditionofioniceffects.cBinding freeenergypredictionwiththeadditionofsalteffects.dAverageinteractionenergyoftheboundcomplexat=1.Theuncertaintyisestimatedas thestandard errorofthemean. eReorganization free energyascalculatedusingEq.9, fReorganization energyorintra-molecular strainas calculatedusingEq.10, gBindingentropycomputedbyfinitedifferencemethodandevaluatedat300K;theuncertaintyiscomputedbyerror propagationforallthedecompositions has the largest reorganization penalty in the set, which is negatively-charged counterpart, guest 7, is correctly pre- surprising, given the rigidity of the guest. We ascribe this dicted to be a relatively weak binder. Guest 2 is correctly effect to the side-ways orientation of the bromine atom predictedastheweakestbinderoftheset.Thebindingfree relative to the carboxylate substituent. This causes con- energies of the guests in the middle of the pack are formationalstraininthe hostandtiltingofthe carboxylate reproduced reasonably well and within the expected to accomodate the bromine atom. We speculate that our accuracy limits of the model. The inclusion of ionic model may be over-estimating the steric hindrance expe- screening has the general effect of weakening binding and rienced by the bromine substituent. shifting the predictions closer to the experimental values. Binding trends in the methyl octa-acid are better repro- For some guests the shift is very substantial (as much as 9 duced, partlythankstolargervariationsinaffinities inthis kcal/molforguest3),whiletheaffinitiesofotherguests(5, set. Focusing on the more accurate results with ionic 8and9forexample)arebarelyaffectedbyionicscreening. screening, guest 4, the strongest binding for the octa-acid These differences appear to be related to the number of host, is here the weakest binder both experimentally and charged groups and the degree of solvent screening affor- computationally.Thecalculationsreproducethelargelossof ded by alkyl substituents. affinityofguest4uponmethylationofthehostreasonably In Tables 2, 3 and 4, we also report the results of free well. Furthermore, in agreement with the experiments, the energy decomposition analyses [44] (values without ionic calculationspredictthatthestrongestbinderisguest3. screeningarenotshown).AveragebindingenergiesðDE Þ, b The two strongest binders of the CB-clip host, guests 6 obtainedfrom theaverage ofthe binding energyvalues,u, and 10, are correctly ranked the best by the computational in the bound state (k¼1), reflect the strength of ligand- model, although their relative rankings are reversed. The host interactions, including desolvation. Subtracting these complexwithguest6ispredictedtohaveanunreasonably from the binding free energies yields reorganization bind- favorable binding free energy (-16.8 kcal/mol). Its ing free energies (DG(cid:5) ). Using data from higher reorg 123 38 JComputAidedMolDes(2017)31:29–44 Table 3 Experimental and computed binding free energies with and without salt effects for methyl octa-acid hosts and thermodynamic decompositionsofbindingfreeenergiescalculatedwithinclusionofsalteffects Guests Structure DG(cid:5)a Withoutsalteffects Withsalteffects exp DG(cid:5)b DG(cid:5)c DEd DG(cid:5)e DEf (cid:3)TDS(cid:5)g calc calcDH b reorg reorg conf G1 O (cid:3)5:3 (cid:3)7:5(cid:6)0:06 -4.8±0.08 -17.2±0.01 12.4±0.08 0.8±1.4 11.5±1.4 O– G2 O (cid:3)5:1 (cid:3)10:6(cid:6)0:07 -7.7±0.09 -17.6±0.01 9.9±0.1 1.4±1.7 8.5±1.6 O– N G3 (cid:3)6 (cid:3)16:2(cid:6)0:07 -9.4±0.1 -21.4±0.01 12±0.1 0.7±1.7 11.3±1.7 N+ G4 Br (cid:3)2:4 (cid:3)2:4(cid:6)0:1 -0.7±0.2 -21.1±0.02 20.4±0.2 6.7±2.7 13.7±2.7 O –O G5 (cid:3)3:9 (cid:3)14:5(cid:6)0:07 -5.7±0.1 -16.5±0.02 10.8±0.1 1.2±1.8 9.6±1.8 N+ G6 O O (cid:3)4:5 (cid:3)8:9(cid:6)0:06 -5.8±0.09 -18.0±0.02 12.2±0.1 1.4±1.6 10.8±1.6 N+ –O O– RMSE 6.7 2.1 Correlationcoefficient(r) 0.66 0.89 Allvaluesinkcal/mol.aExperimentalITCbindingfreeenergy,exceptforG4andG5forwhichonlyNMRmeasurementsareavailable[35]. bPredictedbindingfreeenergywithoutadditionofioniceffects. cBindingfreeenergypredictionwiththeadditionofsalteffects. dAverage interactionenergyoftheboundcomplexat=1.Theuncertaintyisestimatedasthestandarderrorofthemean. eReorganizationfreeenergyas calculated using Eq.9, fReorganization energy or intra-molecular strain as calculated using Eq.10, gBinding entropy computed by finite differencemethodandevaluatedat300K;theuncertaintyiscomputedbyerrorpropagationforallthedecompositions simulated temperatures, the latter is further decomposed predictors of affinity. For example, guest 3, which is the intoconformationalentropy((cid:3)TDS(cid:5) )andreorganization second weakest binder in the set, is predicted to have the conf energy (DE ) components. The conformational entropy strongest interactions with the host (-47.2 kcal/mol). reorg component measures the entropy loss for the formation of Conversely, guest 5, which is one of the top binders, has the complex. The reorganization energy measures the relatively weak interactions with the host (-26.2 kcal/- intramolecular energy change of the guest and the host as mol). Clearly, as observed previously [32, 36], reorgani- theychangeconformationstobindtoeachother.Thelatter zation(entropiclossandintramolecularenergystrain)isan is commonly referred as intramolecular energy strain important determinant of binding. because it often (but not always) opposes binding. As further discussed below, decompositions of binding free energiesprovideusefulphysicalinterpretationsofobserved Discussion bindingaffinitytrends.Asoftenobserved[32],thebinding free energy is the result of a large compensation between Water molecules solvating hydrophobic enclosures exhibit theaveragebindingenergyandreorganizationfreeenergy. unique thermodynamic and structural signatures. Because Forexample,thepooraffinityofguest4tothemethylocta- they are in concave regions, they have fewer proximal acid host can be ascribed primarily to a large and unfa- water neighbors with which they can form favorable vorable reorganization energy (20.4 kcal/mol) rather than hydrogen bonding interactions. Simultaneously, due to the the strength of the host–guest interaction energy (-21.1 hydrophobicityofthesolute,theyareunabletocompensate kcal/mol), which is among the most favorable in the set. for these lost interactions by forming favorable hydrogen The average binding energies for the CB-clip complexes bond contacts with the solute surface. As a result, these have a particular wide range, from -20.2 kcal/mol to watermoleculesareenergeticallyunfavorablewithrespect -47.2 kcal/mol. Interestingly, these are imperfect to bulk and their displacement from the surface into the 123
Description: