Computational Molecular Biology

636 pages · 1999


PREFACE

The turn of the millennium undoubtedly marks an exciting period in which molecular biology is progressing with constant acceleration. Since events typically occur in parallel at the forefront of technology, this progress is mirrored by brisk development of computer hardware and software. Currently, modern supercomputers equipped with new, efficient algorithms are used to predict the structures and properties of species consisting of hundreds of thousands of atoms and are applied to obtain data with experimental accuracy for molecules with more than 100 atoms. A crusade to find the Holy Grail of molecular biology, an understanding of biological processes at a molecular level, has resulted in sophisticated computational techniques and a wide range of computer simulations involving such methods. Among the areas where progress has been profound in the last few years are the modeling of DNA structure and function, the molecular-level understanding of the role of solvents in biological phenomena, the calculation of the properties of molecular associations in aqueous solutions, computationally assisted drug design, the prediction of protein structure, and protein-DNA recognition, to mention just a few examples.

The goal of this book is to cover selected examples of the most notable applications of computational techniques to biological problems. These techniques are used by an ever-growing number of researchers with different scientific backgrounds: biologists, chemists, and physicists. A broad group of readers ranging from beginning graduate students to molecular biology professionals should be able to find useful contributions in this selection of reviews. This volume comprises a balanced blend of contributions. They reveal the details of computational approaches designed for biomolecules and provide extensive illustrations of current applications of modern techniques.
However, since the area of computational molecular biology is vast, a single volume cannot accommodate even a representative number of contributions from all important fields, and this book is by no means designed to cover the entire range of this explosively expanding territory.

I would like to thank all the authors for their excellent contributions and fruitful collaboration. The very efficient technical assistance of Mr. Yevgeniy Podolyan in putting together this volume is greatly appreciated.

Jerzy Leszczynski
January 1999

J. Leszczynski (Editor)
Computational Molecular Biology
Theoretical and Computational Chemistry, Vol. 8
© 1999 Elsevier Science B.V. All rights reserved

Chapter 1

HYBRID POTENTIALS FOR LARGE MOLECULAR SYSTEMS

Patricia Amara and Martin J. Field
Laboratoire de Dynamique Moléculaire, Institut de Biologie Structurale - Jean-Pierre Ebel, 41 Avenue des Martyrs, 38027 Grenoble Cedex 1, France

1. Introduction

Numerical simulation techniques are increasingly powerful tools in all areas of science. They are a 'third way' between the traditional theoretical and experimental approaches because they allow more sophisticated theories to be probed than is possible with analytical methods and because they can be employed to examine processes which are inaccessible to experiment. For the study of phenomena at the atomic level, the application of simulation methods has been particularly helpful owing to the complexity of the equations that determine molecular behaviour.

In principle, it is known that the theory of quantum mechanics gives a complete description of a system at an atomic level [1]. In practice, the equations that result are impossible to solve, either analytically or numerically, except in a very few cases. It is usual, therefore, to invoke a number of simplifications. The first is the Born-Oppenheimer approximation, which states that the dynamics of electrons and nuclei can be treated separately because of the large disparity in their masses.
This leads to a two-step procedure in which the electronic problem is solved first and the nuclear problem is dealt with afterwards [2]. The equation governing the electronic problem is the electronic version of the time-independent Schrödinger equation, which is:

$$\hat{H} \Psi = E \Psi \tag{1}$$

Here $\hat{H}$ is the quantum mechanical (QM) Hamiltonian. It can take many forms but the simplest for use with molecular systems (expressed in atomic units) is:

$$\hat{H} = -\frac{1}{2} \sum_i \nabla_i^2 - \sum_i \sum_a \frac{Z_a}{r_{ia}} + \sum_{i<j} \frac{1}{r_{ij}} + \sum_{a<b} \frac{Z_a Z_b}{r_{ab}} \tag{2}$$

where the subscripts $i$ and $j$ refer to electrons and $a$ and $b$ to nuclei, $Z_a$ is the nuclear charge for nucleus $a$ and $r_{st}$ is the distance between particles $s$ and $t$. The first term on the right-hand side is the kinetic energy operator for the electrons. The symbols $\Psi$ and $E$ refer to the wavefunction for the electrons in the system and to the system's potential energy, respectively. The wavefunction is important because its square gives the probability density distribution function of the electrons.

In equation 1 the coordinates of the electrons, $r_i$, are the variables and the nuclear coordinates, $r_a$, only enter parametrically. This means that each time the values of the nuclear coordinates change, equation 1 must be re-solved for the wavefunction and the energy at the new nuclear configuration. The fact that the energy, $E$, depends upon the nuclear coordinates in this way makes it a multidimensional function. It is this function which defines the potential energy surface for the system and which goes a long way to determining the system's behaviour.

Ideally, the best approach would be to solve equation 1 directly to obtain the potential energy surface for the system. The most accurate way of doing this is by using one of the classes of ab initio QM methods that have been developed to solve equation 1 with as few approximations as possible. Popular ab initio algorithms are Hartree-Fock (HF) molecular orbital (MO) [3] and density functional theory (DFT) methods [4].
The problem with all these techniques is that they are expensive to apply and are generally limited to handling relatively small systems (of a few tens of atoms at the most). As we shall see in section 3.3, recent algorithmic advances have improved this situation somewhat [5], but quicker methods are needed nevertheless.

One way to achieve this is to keep the basic framework of the ab initio methods but simplify or approximate the time-consuming parts of the calculation. This leads to the class of semiempirical QM methods [6]. These can be applied to much larger systems but it is still by no means routine to use them to study systems the size of even a small protein. The problem with these methods is that, owing to the approximations introduced, they must be parametrized against experimental data if they are to provide reliable results. Probably the most popular semiempirical methods are the MNDO, AM1 and PM3 methods developed by Dewar and co-workers and by Stewart [7, 8, 9].

A second way of calculating the potential energy of a system is to make no attempt at solving Schrödinger's equation but to use an empirical function, usually called a force field or a molecular mechanics (MM) energy function, that reproduces in a reasonable way the potential energy surface for the system in the regions of interest [10]. There is a huge variety of energy functions but those employed for simulations of biomacromolecular systems all have the same basic form. Thus, it is normal to express the MM energy as the sum of two types of term that describe covalent and non-covalent (or non-bonding) interactions, respectively. The covalent energy includes, at the very least, contributions from the energies of the bonds, the bond angles and the dihedral or torsion angles, whereas the non-bonding energy comprises electrostatic and Lennard-Jones interactions.
The MM energy, $E_{\mathrm{MM}}$, is the sum of all these terms:

$$E_{\mathrm{MM}} = E_{\mathrm{bond}} + E_{\mathrm{angle}} + E_{\mathrm{dihedral}} + E_{\mathrm{elec}} + E_{\mathrm{LJ}} \tag{3}$$

Typical forms for these terms are as follows. The bond energy is commonly written as a harmonic function of the length of the bond, $b$:

$$E_{\mathrm{bond}} = \sum_{\mathrm{bonds}} \frac{1}{2} k_b (b - b_0)^2 \tag{4}$$

where $b_0$ is an equilibrium length appropriate for the bond and $k_b$ is the bond's force constant. The bond angle energy is also a harmonic function of the bond angle, $\theta$, and is:

$$E_{\mathrm{angle}} = \sum_{\mathrm{angles}} \frac{1}{2} k_\theta (\theta - \theta_0)^2 \tag{5}$$

where $\theta_0$ is an equilibrium angle and $k_\theta$ is the bond angle's force constant. In contrast to the bond and bond angle energies, the dihedral angle energy is a periodic function of the dihedral angle, $\phi$:

$$E_{\mathrm{dihedral}} = \sum_{\mathrm{dihedrals}} \frac{1}{2} k_\phi \cos(n\phi + \delta) \tag{6}$$

where $k_\phi$ is the force constant, $n$ is the periodicity of the term and $\delta$ is a phase. Note that the sums in equations 4 to 6 are over all the bonds, bond angles and dihedral angles that are defined for the system and that the parameters in the equations ($b_0$, $k_b$, etc.) will depend upon the types of atom involved in each individual energy term.

The electrostatic energy is:

$$E_{\mathrm{elec}} = \sum_{\mathrm{pairs}\ mn} \frac{q_m q_n}{r_{mn}} \tag{7}$$

and the Lennard-Jones energy is:

$$E_{\mathrm{LJ}} = \sum_{\mathrm{pairs}\ mn} \left( \frac{A_{mn}}{r_{mn}^{12}} - \frac{B_{mn}}{r_{mn}^{6}} \right) \tag{8}$$

In these equations, $r_{mn}$ is the distance between atoms $m$ and $n$, $q_m$ is the partial charge on atom $m$ and $A_{mn}$ and $B_{mn}$ are constants for the Lennard-Jones interaction which depend upon the types of the atoms $m$ and $n$. The sums for both these interactions are over all possible pairs of atoms $m$ and $n$ in the system, although it is normal to exclude pairs that are directly bonded together or that are separated by only two bonds.

The types of MM energy functions described above have been employed extensively for the simulation of molecular and macromolecular systems and are efficient enough to treat systems comprising many thousands of atoms [11, 12].
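As an illustration, the terms of equations 3 to 8 can be evaluated with a few lines of Python. This is only a minimal sketch under stated assumptions: the function names, the tuple layouts used for the parameters and the `excluded` set of skipped pairs are hypothetical choices made here, not part of any particular force field or program, and all quantities are taken to be in consistent (for example, atomic) units with angles in radians.

```python
import math


def bond_energy(bonds):
    """Equation 4: harmonic bonds, one (k_b, b0, b) tuple per bond."""
    return sum(0.5 * k * (b - b0) ** 2 for k, b0, b in bonds)


def angle_energy(angles):
    """Equation 5: harmonic angles, one (k_theta, theta0, theta) tuple per angle."""
    return sum(0.5 * k * (theta - theta0) ** 2 for k, theta0, theta in angles)


def dihedral_energy(dihedrals):
    """Equation 6: periodic terms, one (k_phi, n, delta, phi) tuple per dihedral."""
    return sum(0.5 * k * math.cos(n * phi + delta) for k, n, delta, phi in dihedrals)


def nonbonded_energy(coords, charges, lj, excluded=frozenset()):
    """Equations 7 and 8, summed over atom pairs (m, n) with m < n.

    lj maps each pair (m, n) to its (A_mn, B_mn) constants (assumed layout).
    excluded holds pairs that are skipped, e.g. atoms separated by one or
    two bonds. Returns the tuple (E_elec, E_LJ).
    """
    e_elec = 0.0
    e_lj = 0.0
    for m in range(len(coords)):
        for n in range(m + 1, len(coords)):
            if (m, n) in excluded:
                continue
            r = math.dist(coords[m], coords[n])
            e_elec += charges[m] * charges[n] / r
            a_mn, b_mn = lj[(m, n)]
            e_lj += a_mn / r ** 12 - b_mn / r ** 6
    return e_elec, e_lj


def mm_energy(bonds, angles, dihedrals, coords, charges, lj, excluded=frozenset()):
    """Equation 3: the total MM energy as the sum of all five terms."""
    e_elec, e_lj = nonbonded_energy(coords, charges, lj, excluded)
    return (bond_energy(bonds) + angle_energy(angles)
            + dihedral_energy(dihedrals) + e_elec + e_lj)
```

The double loop over pairs scales quadratically with the number of atoms; it is written this way only for clarity and makes no attempt at the efficiency needed for systems of many thousands of atoms.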
These MM energy functions have disadvantages, though, including:

• They contain many parameters which must be refined by fitting the results of simulations performed with the energy function to those from experiment.
• Their analytic form precludes their use for the study of certain important processes, such as chemical reactions.

Once the method for calculating the potential energy for the system has been defined, it is possible to tackle the nuclear problem. At its most limited, this may involve exploring a small, local region of the potential energy surface to find the most stable structures and, perhaps, the reaction paths between them. This is the only sort of study that is normally done with the expensive ab initio QM potentials because relatively few calculations of the potential energy and its derivatives are required. At the other extreme are simulations which explore large regions of the potential energy surface with the aim, for example, of understanding the dynamics of the system or of calculating thermodynamic quantities which can be compared directly to values measured experimentally. It is normal with these latter methods to treat the nuclei classically because the corresponding quantum treatments are much more difficult [13, 14].

In this review, we shall restrict ourselves to considering the problem of calculating the potential energy for the system and shall not discuss explicitly algorithms, such as molecular dynamics or Monte Carlo, which employ the energy and its derivatives. We have discussed above two broad categories of method for determining the potential energy: QM and MM methods. Each has its advantages and disadvantages. Ab initio QM methods are, in principle, precise but they are expensive. MM methods are much cheaper, but they can lack flexibility when investigating certain processes, such as chemical reactions or photoexcitation phenomena.
One solution to the disadvantages of both methods, which is applicable in certain circumstances, is to develop hybrid potentials that use QM and MM potentials to treat different parts of the same system. It is these that we shall discuss at length below.

The outline of this review is as follows. Methodological aspects are discussed in section 2, which gives a brief presentation of the principles behind hybrid potentials and their implementation, and in section 3, which describes some of the research that is being conducted to make them more precise. Section 4 continues with a look at applications of hybrid potentials to molecules of biological interest and section 5 concludes.

2. Hybrid Potentials

In the most general terms a hybrid potential can be defined as any one that combines two or more potentials for the description of different parts of a molecular system. This definition is a very broad one and covers a wide range of possible combinations. In this review, therefore, we shall limit the discussion to a particular class of hybrid potentials that have found widespread use in the study of solute-solvent and protein-ligand systems. These types of potential were first introduced by Warshel and Levitt [15], with significant later enhancements by Singh and Kollman [16] and by Field et al. [17]. In passing, we shall mention alternative potentials that have been developed for other applications.

With the types of hybrid potential we shall be talking about, the system is partitioned spatially into distinct regions and the atoms or other particles within each region are treated with different potentials. A schematic illustration of the partitioning of a system into two different regions, plus a boundary, is shown in figure 1. Spatial partitioning is not the only way in which the division of a system can take place and some potentials have used other partitionings.
Thus, for example, in some early potentials that were developed for the study of conjugated molecules, the division was made so that the nuclear framework and the σ-bonding electrons were treated with a simplified empirical potential and the π-bonding electrons were treated with a semiempirical QM approximation [18, 19].

Figure 1: The partitioning of a system into different regions.

A convenient formulation of hybrid potentials for spatially partitioned systems is in terms of Hamiltonians [17]. Thus, if there are $N$ regions and if $\hat{H}_I$ denotes the Hamiltonian for region $I$ and $\hat{H}_{IJ}$ denotes the interaction Hamiltonian between regions $I$ and $J$, the total Hamiltonian for the system, $\hat{H}_{\mathrm{Total}}$, is:

$$\hat{H}_{\mathrm{Total}} = \sum_{I=1}^{N} \hat{H}_I + \sum_{I=1}^{N-1} \sum_{J=I+1}^{N} \hat{H}_{IJ} \tag{9}$$

In many applications of the hybrid potential method a single region will be of the greatest interest. In these cases it will be normal to treat this region with the most accurate potential and use potentials of decreasing sophistication for the regions that are further and further away. Let us consider as an example a simulation study of a chemical reaction, either in solution or in a protein, in which the system is partitioned into two. There will be a small core region which contains the atoms that are reacting and for which a QM potential will be needed, and a larger outer region that contains the remainder of the atoms and to which a simpler potential, say an MM potential, will be applied. In this case, equation 9 reduces to:

$$\hat{H}_{\mathrm{Total}} = \hat{H}_{\mathrm{QM}} + \hat{H}_{\mathrm{MM}} + \hat{H}_{\mathrm{QM/MM}} \tag{10}$$

The various terms on the right-hand side of equation 10 need further explanation. Taking each in turn they are:

• $\hat{H}_{\mathrm{QM}}$ is the Hamiltonian for the QM region. It will have the same form as the normal Hamiltonian of the QM potential that is being employed. Thus, an ab initio QM potential would use a Hamiltonian like the one in equation 2.
• $\hat{H}_{\mathrm{MM}}$ is the Hamiltonian for the MM region and, like the QM Hamiltonian, it will be the same as that of the MM potential that is being used. For the majority of MM potentials the Hamiltonian will be equal to the MM energy of equation 3 because the MM energy function does not contain any operator terms. This is not the case for all MM potentials, the most notable exceptions being those that include polarizability terms. We shall leave discussion of these until section 3.2.

• $\hat{H}_{\mathrm{QM/MM}}$ is the Hamiltonian for the interaction between the QM and MM regions. It is the definition of this Hamiltonian which is crucial to the success of the hybrid potential method. Several forms for this term are possible but one of the simplest, and probably the most widely used, consists of a sum of electrostatic and Lennard-Jones terms. As in the MM potential, the electrostatic terms model the interactions between the charge distributions of different atoms whereas the Lennard-Jones terms model the short range repulsive and the longer range dispersion interactions which are not accounted for by the electrostatic interactions. For the case of an ab initio QM method and an MM potential in which there are partial charges on the MM atoms, the interaction Hamiltonian will have the form:

$$\hat{H}_{\mathrm{QM/MM}} = -\sum_i \sum_m \frac{q_m}{r_{im}} + \sum_a \sum_m \frac{Z_a q_m}{r_{am}} + \sum_a \sum_m \left( \frac{A_{am}}{r_{am}^{12}} - \frac{B_{am}}{r_{am}^{6}} \right) \tag{11}$$

where the subscript $m$ refers to MM atoms, $q_m$ is the partial charge on atom $m$ and $A_{am}$ and $B_{am}$ are the coefficients for the Lennard-Jones interaction between the QM atom $a$ and the MM atom $m$. Note that if an MM atom has no partial charge, the only interaction it will have with the QM atoms will be its Lennard-Jones one.

An extra complication arises when a molecule is partitioned between QM and MM regions and so has covalent bonds between QM and MM atoms. In these cases approximations must be introduced to satisfy or to terminate the density of the broken bonds.
It is not possible to do nothing as otherwise the electronic structure of the QM fragment would be profoundly affected. The development of efficient methods for doing this is one of the principal challenges facing the hybrid potential field. A fuller discussion will be left until section 3.1.

Having defined the Hamiltonian for the system, we can use it in a time-independent Schrödinger equation (equation 1) to solve for the wavefunction of the electrons in the QM region and the total potential energy of the system, $E$. The latter is the expectation value of the Hamiltonian with respect to the wavefunction:

$$E = \langle \Psi | \hat{H}_{\mathrm{Total}} | \Psi \rangle \tag{12}$$

The exact method of solution of the Schrödinger equation will depend upon the QM potential that is being used. For variational QM methods, though, which comprise the majority, it will involve minimization of the energy expression, equation 12, with respect to a set of variable parameters. This minimization procedure gives rise to the well known self-consistent iterative methods of solution that are characteristic, for example, of ab initio and semiempirical Hartree-Fock QM methods.

Once the wavefunction and energy for the system are known, other quantities derived from them can be calculated. The most important of these are the forces on the nuclei of the QM region and the atoms of the MM region. These are obtained straightforwardly by differentiating the energy expression, equation 12, with respect to the positions of the QM and MM atoms:

$$f_a = -\frac{\partial E}{\partial r_a} \qquad f_m = -\frac{\partial E}{\partial r_m} \tag{13}$$

We have described in detail the Hamiltonian formulation of a hybrid potential. This may not be the most convenient for some potentials, notably those of DFT type, for which it is easier to work with the energy directly.
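Before turning to the DFT formulation, the force expression of equation 13 can be made concrete with a small Python sketch. It covers only the purely classical part of equation 11, that is the interaction of the QM nuclei with the MM partial charges plus the Lennard-Jones terms; the electron term needs a wavefunction and is deliberately left out. The function names and the data layout (lists of `(charge, position)` tuples and a nested `lj[a][m]` table) are assumptions made for this illustration only.

```python
import math


def classical_qmmm_energy(qm_atoms, mm_atoms, lj):
    """Nuclear-charge and Lennard-Jones terms of equation 11 (electron term omitted).

    qm_atoms: list of (Z_a, (x, y, z)); mm_atoms: list of (q_m, (x, y, z));
    lj[a][m] = (A_am, B_am). All of these layouts are illustrative assumptions.
    """
    energy = 0.0
    for a, (z_a, r_a) in enumerate(qm_atoms):
        for m, (q_m, r_m) in enumerate(mm_atoms):
            r = math.dist(r_a, r_m)
            a_am, b_am = lj[a][m]
            energy += z_a * q_m / r + a_am / r ** 12 - b_am / r ** 6
    return energy


def mm_forces(qm_atoms, mm_atoms, lj):
    """Analytic forces f_m = -dE/dr_m on the MM atoms (equation 13)."""
    forces = []
    for m, (q_m, r_m) in enumerate(mm_atoms):
        f = [0.0, 0.0, 0.0]
        for a, (z_a, r_a) in enumerate(qm_atoms):
            r = math.dist(r_a, r_m)
            a_am, b_am = lj[a][m]
            # Radial derivative dE/dr of the pair energy.
            de_dr = -z_a * q_m / r ** 2 - 12.0 * a_am / r ** 13 + 6.0 * b_am / r ** 7
            for k in range(3):
                # Chain rule: dr/dr_m = (r_m - r_a) / r, and the force is minus the gradient.
                f[k] -= de_dr * (r_m[k] - r_a[k]) / r
        forces.append(tuple(f))
    return forces


def numerical_mm_forces(qm_atoms, mm_atoms, lj, h=1.0e-5):
    """Central finite-difference estimate of the same forces, used as a check."""
    forces = []
    for m, (q_m, r_m) in enumerate(mm_atoms):
        f = []
        for k in range(3):
            plus = list(r_m)
            minus = list(r_m)
            plus[k] += h
            minus[k] -= h
            moved_plus = mm_atoms[:m] + [(q_m, tuple(plus))] + mm_atoms[m + 1:]
            moved_minus = mm_atoms[:m] + [(q_m, tuple(minus))] + mm_atoms[m + 1:]
            e_plus = classical_qmmm_energy(qm_atoms, moved_plus, lj)
            e_minus = classical_qmmm_energy(qm_atoms, moved_minus, lj)
            f.append(-(e_plus - e_minus) / (2.0 * h))
        forces.append(tuple(f))
    return forces
```

Comparing `mm_forces` with `numerical_mm_forces` for a few arbitrary geometries is a simple way to check that an analytic derivative has been coded correctly. In a full hybrid potential the electronic contribution to the forces would have to be added to these classical terms.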
The central quantity in DFT is the single particle electron density, $\rho$, which is related to the square of the wavefunction as:

$$\rho(r) = \int ds_1\, dx_2 \ldots dx_n\, |\Psi(r s_1, x_2, \ldots, x_n)|^2 \tag{14}$$

In this equation the integration is over the position vectors, $r$, and spin variables, $s$, of all but one of the $n$ electrons in the wavefunction. The shorthand $x$ is used to represent both the position and the spin variables for an electron, i.e. $x \equiv (r, s)$. The energy of a system described with a DFT QM method, and thus of the hybrid potential too, is a functional of the electron density:

$$E[\rho] = E_{\mathrm{QM}}[\rho] + E_{\mathrm{MM}} + E_{\mathrm{QM/MM}}[\rho] \tag{15}$$

where $E_{\mathrm{QM}}$, $E_{\mathrm{MM}}$ and $E_{\mathrm{QM/MM}}$ are the QM, MM and QM/MM interaction energies, respectively. The MM energy is independent of the electron density and will have the same form as that discussed above. In the most popular version of the DFT method for studying molecules, the Kohn-Sham method, the QM energy is written as:

$$E_{\mathrm{QM}}[\rho] = T_s[\rho] + J[\rho] + E_{xc}[\rho] + \int dr\, \rho(r)\, V_{\mathrm{QM}}(r) \tag{16}$$

where $T_s$, $J$ and $E_{xc}$ are the electron kinetic, Coulomb and exchange-correlation energies, respectively. The last term is the interaction energy between the electron density and the electrostatic potential, $V_{\mathrm{QM}}$, due to the charges on the nuclei of the QM atoms.

The QM/MM interaction energy is similar to that discussed previously and consists of electrostatic and Lennard-Jones parts. The Lennard-Jones energy is the same as that in equation 11 and the electrostatic energy is:

$$E_{\mathrm{QM/MM}} = \int dr\, \rho(r)\, V_{\mathrm{QM/MM}}(r) + \sum_a \sum_m \frac{Z_a q_m}{r_{am}} \tag{17}$$

where $V_{\mathrm{QM/MM}}$ is the potential due to the charges on the MM atoms and is:

$$V_{\mathrm{QM/MM}}(r) = -\sum_m \frac{q_m}{|r - r_m|} \tag{18}$$

The procedure for determining the electron density and the energy of the system within the DFT method is similar to the approach used in the Hartree-Fock technique. The wavefunction is expressed as an antisymmetric determinant of occupied spin orbitals which are themselves expanded as a set of basis functions.
The orbital expansion coefficients are the set of variable parameters with respect to which the DFT energy expression of equation 15 is optimized. The optimization procedure gives rise to the single particle Kohn-Sham equations which are similar, in many respects, to the Roothaan-Hall equations of Hartree-Fock theory.

Although the hybrid potential example we have discussed in detail contains two regions, it is easy to generalize the method to more. This might be advantageous in those circumstances in which a more gradual transition between a high level QM potential and an MM potential is desired.

Many of the systems studied with a hybrid potential will be in the condensed phase and so some method will have to be employed to mimic the interactions the simulation system feels from the infinite environment at its boundary. These methods fall into two classes: those, such as the technique of periodic boundary conditions, that try to model the environment at an atomic level and those which replace an atomic representation by a simpler one, such as a boundary potential or a dielectric continuum. Both classes of methods can be used with
