Expression, Purification and Crystallization of human Heat Shock Transcription Factors
Aditya Mojumdar
Degree project in applied biotechnology, Master of Science (2 years), 2010
Degree project in applied biotechnology, 45 hp (credits), for the master's degree, 2010
Biology Education Centre, Uppsala University, and Center for Structural Biochemistry, Department of Biosciences and Nutrition, NOVUM, Karolinska Institute, Huddinge
Supervisor: Dr. Wei Liu

Abstract

Heat shock factors (HSFs) are the key players in the heat shock response. They regulate the transcription of hsp genes to produce heat shock proteins, and they are known to function cooperatively. The exact role of HSFs in the heat shock response is still not completely clear. They also have functions in developmental regulation, including reproduction, embryonic development, lens development, cortical lamination and olfactory epithelium maintenance. Three-dimensional structures of the proteins will give a more detailed view of their functions.

In this project we tried to determine the three-dimensional structure of human HSFs by X-ray crystallography. To this end, we transformed E. coli with the plasmid pMAL-c2 containing the gene encoding the DNA binding domain and trimerization domain (residues 1-232) of HSF1, fused with maltose binding protein (MBP) at the N-terminus. We first tried to obtain HSF1-232 in soluble form by expressing it with MBP and then cleaving it inside the cell by co-expressing tobacco etch virus (TEV) protease, which can cleave the fusion protein at the TEV cleavage site between MBP and HSF1. This strategy failed, so we tried in vitro cleavage, which gave rise to insoluble protein.
The fusion protein was then over-expressed and purified using amylose affinity chromatography, HisTrap column chromatography and Superdex 200 gel filtration chromatography. After purification the protein was almost 95% pure, which was sufficient for crystallization. The protein solution was concentrated to almost 15 mg/ml and drops were set up for crystallization under 200 crystal screen conditions, using vapor diffusion as the crystallization method. After almost 20 days, 2 of the 200 conditions gave crystals in the form of needles, plates and rod clusters. These conditions were further optimized to get better crystals. Further optimization of the crystallization conditions, and new constructs with a shorter linker between MBP and the target protein, can help in getting better crystals that can be used for X-ray diffraction experiments and for determining three-dimensional structures of HSFs.

Contents

Abbreviations
1. Introduction
1.1 Heat shock response
1.2 Heat shock response regulation in eukaryotes
1.3 Structure and function correlation of eukaryotic heat shock factors
1.4 Structural features of HSFs in eukaryotes
1.5 Roles of HSFs other than in heat shock response
1.6 Aims and objectives
2. Materials and Methods
2.1 Materials
2.1.1 Laboratory equipment
2.1.2 Biochemical ingredients
2.1.3 Reagents and buffers
2.2 Methods
2.2.1 Microbiology
2.2.1.1 Transformation of host competent cells
2.2.1.2 Preparation of competent cells
2.2.1.3 Pilot level overexpression of protein for in vivo cleavage
2.2.1.4 Pilot scale overexpression of MBP-HSF fusion protein
2.2.2 Protein purification and analysis
2.2.2.1 Cell lysate preparation for purification
2.2.2.2 Amylose affinity chromatography
2.2.2.3 HisTrap column chromatography
2.2.2.4 Gel filtration chromatography
2.2.2.5 Sodium-dodecylsulfate polyacrylamide gel electrophoresis (SDS-PAGE)
2.2.2.6 Western blot analysis
2.2.3 Crystallization trials
2.2.3.1 Concentrating the protein samples
2.2.3.2 Crystallization drop set-up
2.2.4 Molecular biology
2.2.4.1 Mutating pMAL-c4E plasmid by polymerase chain reaction
2.2.4.2 Agarose gel electrophoresis
2.2.4.3 Isolation of mutated pMAL-c4E plasmid fragment from the gel
2.2.4.4 Ligation of the two ends of mutated pMAL-c4E plasmid
2.2.4.5 Plasmid DNA purification
3. Results
3.1 In vivo cleavage of MBP-HSF1-232
3.2 In vitro cleavage of MBP-HSF1-232
3.3 Purification of MBP-HSF1-232 fusion protein
3.4 Preliminary results of crystallization of the fusion protein
4. Discussion
Acknowledgements
References

Abbreviations

A    absorbance
AAC    amylose affinity chromatography
AD    activation domain
Amp    ampicillin
APS    ammonium persulfate
ATP    adenosine triphosphate
CaCl2    calcium chloride
Cam    chloramphenicol
CD    circular dichroism
cDNA    complementary DNA
cm    centimeter
DBD    DNA binding domain
ddH2O    double distilled water
DNA    deoxyribonucleic acid
dNTP    deoxyribonucleotide triphosphate
DTT    dithiothreitol
EDTA    ethylenediaminetetraacetic acid
Fig    figure
g    gram
HAC    heparin affinity chromatography
HCl    hydrochloric acid
HR-A/B    heptad hydrophobic repeat region A and B
HR-C    heptad hydrophobic repeat region C
HRP    horseradish peroxidase
HSE    heat shock elements
HSF    heat shock transcription factor
hsp    genes encoding heat shock proteins
HSP    heat shock proteins
HSR    heat shock response
IMAC    immobilized metal ion affinity chromatography
IPTG    isopropyl-β-D-thiogalactopyranoside
kb    kilobase
kDa    kilodaltons
L    litre
LA    Luria agar
LB    Luria-Bertani
M    molar
mA    milliamperes
MBP    maltose binding protein
mg    milligram
Mg    magnesium
MgCl2    magnesium chloride
min    minutes
ml    millilitre
mM    millimolar
mm    millimeter
NaCl    sodium chloride
NaPi    sodium phosphate
Ni    nickel
OD    optical density
PAGE    polyacrylamide gel electrophoresis
PCR    polymerase chain reaction
PEG    polyethylene glycol
pH    hydrogen ion concentration
rpm    revolutions per minute
SDS    sodium dodecyl sulphate
sec    seconds
TBE    Tris-borate-EDTA
TBS    Tris-buffered saline
TBS-T    Tris-buffered saline with Tween 20
TE    Tris-EDTA
TEMED    tetramethylethylenediamine
TEV    tobacco etch virus
Tris    tris(hydroxymethyl)aminomethane
UV    ultraviolet
V    volts
v/v    volume to volume
wHTH    winged helix-turn-helix
α    alpha
β    beta
%    percent
°C    degree Celsius
µl    microlitre
3D    three dimensional

Chapter 1 Introduction

1.1 Heat shock response

Acute exposure to severe pathological and environmental stress conditions disrupts many cellular processes and may cause cell death.
The heat shock response (HSR) is an evolutionarily conserved defense mechanism against such stress conditions, which may be due to elevated temperature, chemical toxicants, heavy metals, oxidative stress, infection, etc. The general features of the HSR are conserved from bacteria to mammals, but the details differ. The HSR involves rapid and increased expression of hsp genes, which produce heat shock proteins (HSPs). Some heat shock proteins act as molecular chaperones that help proteins to fold and assemble correctly and assist their intracellular translocation under both stress and non-stress conditions [Santoro, 2000; Pirkkala et al., 2001; Voellmy, 2004].

1.2 Heat shock response regulation in eukaryotes

In eukaryotes the heat shock response is controlled by a group of heat shock transcription factors (HSFs). Heat shock factors are proteins that interact with heat shock elements (HSEs), which are present in multiple copies upstream of the hsp promoter, and thereby regulate hsp expression. Yeast, nematodes and fruit flies have a single HSF, while plants and vertebrates have several. HSF1 and HSF2 are present in all vertebrates, together with HSF3 in avian species and HSF4 in mammals. HSF1 is reported to be the major transcriptional regulator of the HSR and is activated in response to increased temperature, exposure to heavy metals and oxidants, and viral and bacterial infections. HSF2 has been reported to play a role in differentiation and development. HSF4 has been suggested to be involved in forming and maintaining the olfactory epithelium but not in any stress-related functions [Schöffl et al., 1998; Santoro, 2000; Pirkkala et al., 2001; Voellmy, 2004].

In the HSR, heat shock factors are activated in a multi-step process that includes trimerization, localization to the nucleus and DNA binding.
Several post-translational modifications, such as phosphorylation and sumoylation, are also involved in regulating the activation of HSFs [Holmberg et al., 2002; Hietakangas et al., 2003; Hietakangas et al., 2006; Anckar et al., 2007]. Upon activation, HSF1 and HSF2 form trimers from monomers and dimers, respectively.

The DNA-binding domain (DBD) is the most conserved domain in HSFs. The DBD recognizes and binds the HSE in the major groove. HSEs are highly conserved and consist of several inverted repeats of the pentameric sequence nGAAn, ‘n’ being any nucleotide [Sorger et al., 1989; Wu, C., 1995; Sistonen et al., 1994; Anckar et al., 2007; Amin et al., 1988]. The inverted repeats of nGAAn result in two possible orientations: nGAAnnTTCn, called the “head-to-head” repeat, and nTTCnnGAAn, called the “tail-to-tail” repeat. HSFs bind to HSEs cooperatively [Xiao et al., 1991]. The recognition of an HSE by HSF, and the resulting transcriptional activation, depend on the number and conservation of the repeats. It has been reported that one DBD binds to one nGAAn repeat, so that a homotrimer of HSF binds to three such repeats. In vitro studies have shown that HSF trimers also bind to “head-to-head” and “tail-to-tail” repeats. Binding of HSF trimers to “tail-to-tail” repeats is more efficient than their binding to “head-to-head” repeats [Sakurai and Takemori, 2007].

Fig. 1.1 Regulation of the heat shock response. [Redrawn from Santoro 2000, Biochemical Pharmacology]

In unstressed cells the heat shock proteins Hsp70, Hsp90 and Hdj1 are usually bound to the inactive monomer or dimer forms of HSF1 or HSF2, respectively. As shown in fig. 1.1, when the cell is exposed to stress conditions the number of non-native proteins increases. These non-native proteins need molecular chaperones to prevent aggregation and misfolding. As a result, the Hsp70 and Hsp90 bound to the inactive form of HSFs are released and become available to the non-native proteins.
The inactive HSFs, in turn, translocate to the nucleus, trimerize and are phosphorylated at specific serine residues. The HSFs are thereby activated and bind to HSEs, resulting in transcription of hsp genes. Once the level of synthesized HSPs reaches a certain limit, the chaperones bind to the HSFs again, resulting in dissociation of the trimers and refolding of the HSFs into their inactive form [Santoro, 2000].

HSFs have hydrophobic repeat regions near their N- and C-termini that interact with each other, and this intra-molecular interaction maintains the inactive form of HSFs. Under stress conditions, inter-molecular interactions among the trimerization domains of adjacent HSF monomers replace these intra-molecular interactions and result in the formation of HSF homotrimers [Schöffl et al., 1998; Santoro, 2000; Pirkkala et al., 2001; Voellmy, 2004].

1.3 Structure and function correlation of eukaryotic heat shock factors

Eukaryotic HSFs bind specific DNA sequences in their activated, homotrimeric form. They are reported to have a molecular weight of approximately 60 kDa. All eukaryotic HSFs have similar structural and functional features. The structural features are shown in fig. 1.2: the DNA-binding domain at the N-terminus is followed by the heptad hydrophobic repeat region (HR-A/B), another heptad hydrophobic repeat region (HR-C) and the transcriptional activation domain at the C-terminus.

[Schematic, N-terminus to C-terminus: DBD, HR-A, HR-B, HR-C, AD]

Fig. 1.2 Structural features of HSFs in eukaryotes. The colored regions are the conserved structural domains of HSFs: the DNA-binding domain (DBD, maroon) and the two heptad hydrophobic repeat regions (HR-A/B, green) toward the N-terminus, and the heptad hydrophobic repeat region HR-C (blue) and the transcriptional activation domain (AD, orange) toward the C-terminus.

DNA-binding domain – The DNA-binding domain (DBD) is the most conserved domain of HSFs and is located at the N-terminus.
It has a winged helix-turn-helix (wHTH) structural motif that interacts with the HSE. The DBD consists of three alpha-helices and a four-stranded antiparallel beta-sheet, as shown in fig. 1.3(a). Helices 2 and 3 form a helix-turn-helix motif in which helix 3 recognizes and binds the major groove of DNA, as shown in fig. 1.3(b). A winged loop between beta-strands 3 and 4 may mediate protein-protein interactions [Harrison et al., 1994; Vuister et al., 1994; Littlefield and Nelson, 1999; Cicero et al., 2001; Ahn et al., 2001].

Fig. 1.3 Structure of the HSF DNA binding domain. (a) Three-dimensional structure showing the alpha-helices (H1, H2 and H3) in red, the beta-strands (B1, B2, B3 and B4) in yellow and the loop in green [Harrison et al., 1994; Vuister et al., 1994]. (b) Protein-DNA complex with the protein in blue and the DNA in yellow, showing H3 in the major groove [Littlefield and Nelson, 1999].

The crystal structure of the HSF DBD-DNA complex from Kluyveromyces lactis revealed that the winged loop does not interact with DNA as it does in other wHTH proteins. Instead, the wing is involved in protein-protein interactions between adjacent HSF monomers in the homotrimer, thus increasing the cooperativity between the HSF monomers [Littlefield and Nelson, 1999]. A linker region at the C-terminus of the DBD connects it to the heptad hydrophobic repeat regions (HR-A/B) [Harrison et al., 1994; Vuister et al., 1994].

Heptad hydrophobic repeats – Next to the DBD, and connected to it by a linker, is the heptad hydrophobic repeat region (HR-A/B). Region HR-A contains one hydrophobic repeat and region HR-B contains two overlapping hydrophobic repeats. These three arrays of hydrophobic repeats (HR-A/B) are involved in trimerization of HSFs [Pirkkala et al., 2001].
The structure of this heptad hydrophobic repeat region is not available; however, chemical crosslinking and circular dichroism spectroscopy studies show that the HR-A/B region has three-fold symmetry upon trimerization. This is due to the formation of a three-stranded alpha-helical coiled-coil [Peteranderl and Nelson, 1992]. At the C-terminus of HSFs there is another hydrophobic repeat (HR-C), which is thought to stabilize the inactive HSF monomer through intramolecular interaction with HR-A/B.

Regulatory domain – Near the central region of HSFs, between HR-A/B and the activation domain, lies another domain called the regulatory domain. Not much is known about this domain, but it is reported to play an important role in regulating the activation domain and in sensing stress [Newton et al., 1996].

Activation domain – The activation domain (AD) is commonly located at the C-terminus of HSFs, but in S. cerevisiae one is also present at the N-terminus [Santoro, 2000; Pirkkala et al., 2001; Voellmy, 2004]. Although its degree of sequence conservation is very low, it contains several negative and positive regulatory modules that regulate the activation of HSFs under stress conditions. The activation domain is in turn regulated by the regulatory domain. At normal temperature two serine residues in the regulatory domain are phosphorylated, which negatively regulates the AD. Although no structure of this domain is known, certain studies have shown that the C-terminal end is normally very flexible and unfolded and becomes more ordered under stress, thus influencing the transcriptional activity of HSFs [Bulman and Nelson, 2005; Pattaramanon et al., 2007].

1.4 Roles of HSFs other than in heat shock response
Various gene knock-out studies show that HSFs play an important role not only in the heat shock response but also in developmental regulation, including reproduction, embryonic development, lens development, cortical lamination and olfactory epithelium maintenance [Kallio et al., 2002; Wang et al., 2003; Chang et al., 2006; Fujimoto et al., 2004; Takaki et al., 2006; McMillan et al., 2002; Min et al., 2004].
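
The two HSE orientations described in section 1.2 (nGAAnnTTCn, head-to-head, and nTTCnnGAAn, tail-to-tail) can be made concrete with a short illustrative sketch. The Python code below is not part of the work reported in this thesis; it is a naive scan that assumes perfect matches to the consensus and looks only at two adjacent pentameric units, whereas real HSEs may contain more repeats and tolerate deviations. The function name `find_hse_pairs` and the example sequence are hypothetical.

```python
import re

# Two adjacent inverted nGAAn units, where n is any nucleotide (section 1.2).
# A lookahead is used so that overlapping sites are all reported.
HEAD_TO_HEAD = re.compile(r"(?=([ACGT]GAA[ACGT][ACGT]TTC[ACGT]))")  # nGAAnnTTCn
TAIL_TO_TAIL = re.compile(r"(?=([ACGT]TTC[ACGT][ACGT]GAA[ACGT]))")  # nTTCnnGAAn


def find_hse_pairs(sequence):
    """Return sorted (position, orientation, site) tuples for perfect consensus pairs.

    Positions are 0-based offsets into the sequence. Only exact matches to the
    consensus are found; this is a simplifying assumption for illustration.
    """
    seq = sequence.upper()
    hits = []
    for label, pattern in (("head-to-head", HEAD_TO_HEAD),
                           ("tail-to-tail", TAIL_TO_TAIL)):
        for match in pattern.finditer(seq):
            hits.append((match.start(), label, match.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical 15-nucleotide sequence containing one site of each orientation.
    for position, orientation, site in find_hse_pairs("AGAACGTTCTAGAAT"):
        print(position, orientation, site)
```

In the example sequence the head-to-head site starts at offset 0 and the tail-to-tail site at offset 5; the two sites share the central TTC unit, which reflects how consecutive inverted repeats form a longer HSE.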