Supplementary data: The core of the SAM-IV riboswitch aptamer mimics the ligand-binding site of SAM-I riboswitches

Zasha Weinberg¹, Elizabeth E. Regulski¹, Ming C. Hammond², Jeffrey E. Barrick²,³, Zizhen Yao⁴, Walter L. Ruzzo⁴,⁵ and Ronald R. Breaker¹,²,³

¹ Department of Molecular, Cellular and Developmental Biology, ² Department of Molecular Biophysics and Biochemistry, ³ Howard Hughes Medical Institute, Yale University, New Haven, Connecticut 06520.
⁴ Department of Computer Science and Engineering, ⁵ Department of Genome Sciences, University of Washington, PO Box 352350, Seattle, WA, 98195.

Correspondence should be addressed to R.R.B. ([email protected])

Note: Supplementary Figures S3 and S4 present data on RNA motifs that are of a similar nature to previous reports from our laboratory. Therefore, we used a similar or identical design of figures and wording of figure legend explanations to those in our previous reports [4, 5]. The underlying data on SAM-I riboswitches (Supplementary Figure S4) are derived from a previously established alignment [1]. The data on SAM-IV riboswitches are novel.

Contents

Additional analysis of SAM-IV
    Non-actinomycete SAM-IV instance
    Other environmental sequences searched
    The number of SAM-IV riboswitches per species
    Experiments designed to deplete cellular SAM concentrations
Supplementary Figure S1: consensus sequence/structure of SAM-IV motif, including evidence of covariation
Supplementary Figure S2: in-line probing gel showing more detail of SAM-IV's cleavage pattern
Supplementary Figure S3A: taxonomy of SAM-IV riboswitches
Supplementary Figure S3B: gene context of SAM-IV riboswitches
Supplementary Figure S3C: conserved domains present in genes downstream of SAM-IV riboswitches
Supplementary Figure S3D: SAM-IV multiple sequence alignment
Supplementary Figure S4A: taxonomy of SAM-I riboswitches
Supplementary Figure S4B: gene context of SAM-I riboswitches
Supplementary Figure S4C: conserved domains present in genes downstream of SAM-I riboswitches
Supplementary Figure S4D: SAM-I multiple sequence alignment

Additional analysis of SAM-IV

Non-actinomycete SAM-IV instance

Only one instance of the SAM-IV motif was identified outside of Actinomycetales, and it was in Magnetospirillum magnetotacticum, in the class α-proteobacteria. However, we detected no SAM-IV in M. magneticum, a fully sequenced relative of M. magnetotacticum. Similarly, the only predicted SAM-I riboswitch in α-proteobacteria is also in M. magnetotacticum. Previously, fourteen protein coding genes typical of Actinobacteria were found in M. magnetotacticum, and it was proposed that these might have been acquired by horizontal transfer [2], which could also apply to SAM-IV.

Other environmental sequences searched

We performed homology searches on the following additional environmental sequences downloaded from GenBank, but failed to uncover SAM-IV riboswitches in any of them: acid mine drainage (GenBank project AADL), soil and whale fall (AAFX-AAFZ, AAGA), the human gut (AAQK, AAQL), gutless sea worms (AASZ), mouse gut (AATA-AATF) and sludge communities (AATN, AATO). Sargasso Sea sequences (AACY) contain one SAM-IV riboswitch; see Supplementary Figure S3A.

The number of SAM-IV riboswitches per species

Within Actinomycetales, the number of SAM-IV riboswitches per species varies markedly. For example, Streptomyces coelicolor has one, versus four in S. avermitilis. Similarly, Corynebacterium glutamicum has two, while C. diphtheriae has none. See Supplementary Figure S3A.

Experiments designed to deplete cellular SAM concentrations

We measured reporter gene expression in a methionine auxotroph strain of S. coelicolor [3], grown in minimal media with and without methionine. Since methionine is directly used to make SAM, its depletion induces low SAM levels [6]. Unfortunately, XylE activity in all experiments was at background, i.e., similar to a strain lacking the xylE gene (data not shown). We presume that growth in minimal media lowered gene expression below detection limits with XylE, rendering the data inconclusive. By contrast, while the results in Fig. 3 of our manuscript are somewhat close to background, the difference is significant.

Supplementary Figure S1: consensus sequence/structure of SAM-IV motif, including evidence of covariation

Conserved nucleotide positions and evidence of covariation were calculated as in [5]. Stems are labeled P1-P5. P5 in SAM-IV is often missing, but its 5′ side involved in the pseudoknot is always present. R: A or G. Y: C or U.
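The figure legends distinguish base pairs with covariation, with compatible mutations, and with no observed mutations, and they use the consensus symbols R and Y. The actual calculation follows [5] and is not reproduced in this supplement, so the Python sketch below is only an illustration under assumed definitions: "covariation" is taken to mean that two sequences differ at both paired positions while each still forms a Watson-Crick or G-U pair, and "compatible" that they differ at one position only. The function names `classify_pair` and `consensus_symbol` and the default threshold are hypothetical.

```python
from itertools import combinations

# Watson-Crick and G-U wobble pairs, the pair types counted in the legends.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}


def classify_pair(left_col, right_col):
    """Classify one proposed base pair from two alignment columns.

    Assumed definitions (not taken from the paper): 'covariation' means two
    sequences differ at BOTH positions and both still pair; 'compatible
    mutations' means they differ at ONE position and both still pair.
    Non-canonical or gapped pairs are ignored.
    """
    pairs = {(l, r) for l, r in zip(left_col, right_col) if (l, r) in CANONICAL}
    label = "no mutations observed"
    for (l1, r1), (l2, r2) in combinations(sorted(pairs), 2):
        if l1 != l2 and r1 != r2:
            return "covariation"
        label = "compatible mutations"
    return label


def consensus_symbol(column, threshold=0.97):
    """Return A/C/G/U, 'R' (A or G), 'Y' (C or U), or 'N' for one column.

    Gaps ('.' or '-') are skipped; the threshold is an illustrative default.
    """
    bases = [b for b in column if b in "ACGU"]
    if not bases:
        return "-"
    candidates = (("A", "A"), ("C", "C"), ("G", "G"), ("U", "U"),
                  ("R", "AG"), ("Y", "CU"))
    for symbol, members in candidates:
        if sum(b in members for b in bases) / len(bases) >= threshold:
            return symbol
    return "N"
```

For instance, the column pair `"GGC"` / `"CCG"` contains both G-C and C-G and is classified as covariation, whereas `"GG"` / `"CU"` (G-C and G-U) is classified as compatible mutations.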
[Figure S1: consensus secondary-structure diagram of the SAM-IV motif, showing stems P1-P5, the pseudoknot and a variable hairpin. Legend: nucleotide identity conserved in 97%, 90% or 75% of sequences; nucleotide present in 97%, 90%, 75% or 50% of sequences; base pair annotations: covarying mutations, compatible mutations, no mutations observed.]

Supplementary Figure S2: in-line probing gel showing more detail of SAM-IV's cleavage pattern

[Figure S2: gel image. Lanes are labeled NC, T1, -OH and 1 mM SAM; bands are labeled by nucleotide position, from G9 to U132.]

Supplementary Figure S3A: taxonomy of SAM-IV riboswitches

The taxonomy of each organism containing a putative SAM-IV riboswitch is listed, with abbreviations (e.g., "Cef-1-1") used to denote that riboswitch in later figures. (This explanatory text is largely copied from supplementary data on a different RNA motif [4].)

abbrev. of hits     taxonomy of species
Cef-1-1             Actinobacteria; Actinobacteridae; Actinomycetales; Corynebacterineae; Corynebacteriaceae; Corynebacterium efficiens YS-314
Cgl-1-1 to Cgl-1-2  Actinobacteria; Actinobacteridae; Actinomycetales; Corynebacterineae; Corynebacteriaceae; Corynebacterium glutamicum ATCC13032
Mfl-1-1 to Mfl-1-2  Actinobacteria; Actinobacteridae; Actinomycetales; Corynebacterineae; Mycobacteriaceae; Mycobacterium flavescens PYR-GCK
Mle-1-1             Actinobacteria; Actinobacteridae; Actinomycetales; Corynebacterineae; Mycobacteriaceae; Mycobacterium leprae TN
Msm-1-1 to Msm-1-2  Actinobacteria; Actinobacteridae; Actinomycetales; Corynebacterineae; Mycobacteriaceae; Mycobacterium smegmatis str. MC2155
Msp-1-1 to Msp-1-2  Actinobacteria; Actinobacteridae; Actinomycetales; Corynebacterineae; Mycobacteriaceae; Mycobacterium sp. JLS
Msp-2-1 to Msp-2-2  Actinobacteria; Actinobacteridae; Actinomycetales; Corynebacterineae; Mycobacteriaceae; Mycobacterium sp. KMS
Msp-3-1 to Msp-3-2  Actinobacteria; Actinobacteridae; Actinomycetales; Corynebacterineae; Mycobacteriaceae; Mycobacterium sp. MCS
Mul-1-1             Actinobacteria; Actinobacteridae; Actinomycetales; Corynebacterineae; Mycobacteriaceae; Mycobacterium ulcerans Agy99
Mva-1-1 to Mva-1-2  Actinobacteria; Actinobacteridae; Actinomycetales; Corynebacterineae; Mycobacteriaceae; Mycobacterium vanbaalenii PYR-1
Mav-1-1             Actinobacteria; Actinobacteridae; Actinomycetales; Corynebacterineae; Mycobacteriaceae; Mycobacterium avium 104
Mav-2-1             Actinobacteria; Actinobacteridae; Actinomycetales; Corynebacterineae; Mycobacteriaceae; Mycobacterium avium subsp. paratuberculosis K-10
Mbo-1-1             Actinobacteria; Actinobacteridae; Actinomycetales; Corynebacterineae; Mycobacteriaceae; Mycobacterium bovis AF2122/97
Mtu-1-1             Actinobacteria; Actinobacteridae; Actinomycetales; Corynebacterineae; Mycobacteriaceae; Mycobacterium tuberculosis C
Mtu-2-1             Actinobacteria; Actinobacteridae; Actinomycetales; Corynebacterineae; Mycobacteriaceae; Mycobacterium tuberculosis CDC1551
Mtu-3-1             Actinobacteria; Actinobacteridae; Actinomycetales; Corynebacterineae; Mycobacteriaceae; Mycobacterium tuberculosis F11
Mtu-4-1             Actinobacteria; Actinobacteridae; Actinomycetales; Corynebacterineae; Mycobacteriaceae; Mycobacterium tuberculosis H37Rv
Mtu-5-1             Actinobacteria; Actinobacteridae; Actinomycetales; Corynebacterineae; Mycobacteriaceae; Mycobacterium tuberculosis str. Haarlem
Nfa-1-1 to Nfa-1-3  Actinobacteria; Actinobacteridae; Actinomycetales; Corynebacterineae; Nocardiaceae; Nocardia farcinica IFM10152
Rsp-1-1 to Rsp-1-2  Actinobacteria; Actinobacteridae; Actinomycetales; Corynebacterineae; Nocardiaceae; Rhodococcus sp. RHA1
Fal-1-1 to Fal-1-2  Actinobacteria; Actinobacteridae; Actinomycetales; Frankineae; Frankiaceae; Frankia alni ACN14a
Fsp-1-1             Actinobacteria; Actinobacteridae; Actinomycetales; Frankineae; Frankiaceae; Frankia sp. EAN1pec
Kra-1-1 to Kra-1-4  Actinobacteria; Actinobacteridae; Actinomycetales; Frankineae; Kineosporiaceae; Kineococcus radiotolerans SRS30216
Bli-1-1             Actinobacteria; Actinobacteridae; Actinomycetales; Micrococcineae; Brevibacteriaceae; Brevibacterium linens BL2
Jsp-1-1 to Jsp-1-3  Actinobacteria; Actinobacteridae; Actinomycetales; Micrococcineae; Intrasporangiaceae; Janibacter sp. HTCC2649
Aau-1-1 to Aau-1-5  Actinobacteria; Actinobacteridae; Actinomycetales; Micrococcineae; Micrococcaceae; Arthrobacter aurescens TC1
Asp-1-1 to Asp-1-4  Actinobacteria; Actinobacteridae; Actinomycetales; Micrococcineae; Micrococcaceae; Arthrobacter sp. FB24
Sar-1-1             Actinobacteria; Actinobacteridae; Actinomycetales; Micromonosporineae; Micromonosporaceae; Salinispora arenicola CNS205
Str-1-1             Actinobacteria; Actinobacteridae; Actinomycetales; Micromonosporineae; Micromonosporaceae; Salinispora tropica CNB-440
Nsp-1-1             Actinobacteria; Actinobacteridae; Actinomycetales; Propionibacterineae; Nocardioidaceae; Nocardioides sp. JS614
Sav-1-1 to Sav-1-4  Actinobacteria; Actinobacteridae; Actinomycetales; Streptomycineae; Streptomycetaceae; Streptomyces avermitilis MA-4680
Sco-1-1             Actinobacteria; Actinobacteridae; Actinomycetales; Streptomycineae; Streptomycetaceae; Streptomyces coelicolor A3(2)
Mma-1-1             α-proteobacteria; Rhodospirillales; Rhodospirillaceae; Magnetospirillum magnetotacticum MS-1
env-1               environmental sample

Supplementary Figure S3B: gene context of SAM-IV riboswitches

All riboswitches (indicated by "RNA→") are listed with their downstream genes, according to the RefSeq annotation. Environmental sequences and some RefSeq entries lack gene annotations, and no genes are listed for such sequences. Lines beginning with a superscript "1" indicate riboswitches that partially overlap the reverse complement of a hypothetical protein gene. However, BLAST cannot find any homologs of the overlapping region. The superscript "2" riboswitch overlaps a conserved hypothetical protein gene (COG5515). However, the overlapping part of the hypothetical protein gene has no homologs (by BLAST), even though the non-overlapping part does. COG5515 genes are regulated by other instances of the riboswitch according to genome annotations; we presume that the indicated riboswitch also regulates the COG5515 gene. The superscript "3" riboswitch overlaps a hypothetical protein, but BLAST can find no homolog of this protein.

The direction of each downstream gene is indicated with an arrow (→), and each conserved domain in the gene is colored. Conserved domains associated with more than one SAM-IV riboswitch are assigned a color; other domains are gray. Conserved domains are explained in Supplementary Figure S3C. Nucleotide coordinates are given for the 5′ and 3′ boundaries of the riboswitch. Note that these coordinates are for the full sequence listed in Supplementary Figure S3D, including extra downstream nucleotides used to annotate transcription terminators and start codons. Therefore the listed 3′ coordinate will extend past the actual riboswitch aptamer. (This explanatory text is largely copied from supplementary data on a different RNA motif [4].)

abbrev    RefSeq accession   strand  5′ at    3′ at    genes
⁴Cef-1-1  NC_004369.1        +       1474341  1474771  RNA→hypo→
Cgl-1-1   NC_003450.3        +       1372526  1372970  RNA→hypo→
Cgl-1-2   NC_006958.1        +       1373993  1374437  RNA→hypo→
Aau-1-1   NC_008711.1        -       3791869  3791425  RNA→Nitrilotriacetate_monoxgenase(cd01095)→Nitrilotriacetate_monoxgenase(cd01095)→DdpA(COG0747)→DppB(COG0601)→DppC(COG1173)→COG1123(COG1123)→
Aau-1-2   NC_008711.1        -       1118288  1117855  RNA→hypo→
¹Mva-1-1  NC_008726.1        -       1175546  1175106  RNA→DSPc(pfam00782) ADP_ribosyl_GH(pfam03747)→hypo→
Msm-1-1   NC_008596.1        -       1362007  1361570  RNA→DSPc(pfam00782) ADP_ribosyl_GH(pfam03747)→hypo→
Sav-1-1   NC_003155.3        +       3688850  3689305  RNA→hypo→hypo→
²Asp-1-1  NC_008541.1        +       4342124  4342550  RNA→COG5515(COG5515)→RHOD_1(cd01522)→MetC(COG0626)→
Aau-1-3   NC_008711.1        +       4015346  4015771  RNA→RHOD_1(cd01522)→metZ(pfam01053)→putative integral membrane protein→
Fal-1-1   NC_008278.1        +       4831332  4831773  RNA→DdpA(COG0747)→DppB(COG0601)→
Sav-1-2   NC_003155.3        -       3688791  3688366  RNA→hypo→
Sav-1-3   NC_003155.3        +       8585278  8585711  RNA→HisM(COG0765)→GlnQ(COG1126)→SBP_bac_3(pfam00497)→COG3393(COG3393)→
Fal-1-2   NC_008278.1        +       4609186  4609626  RNA→CsdB(COG0520)→
env-1     AACY01218155.1     +       112      544      RNA→unknown→
Sco-1-1   NC_003888.3        -       2308784  2308334  RNA→CsdB(COG0520)→
Sav-1-4   NC_003155.3        +       7291219  7291670  RNA→CsdB(COG0520)→
⁴Jsp-1-1  NZ_AAMN01000002.1  +       104068   104500   RNA→metA(COG2021)→
Nsp-1-1   NC_008699.1        +       3644112  3644553  RNA→CsdB(COG0520)→
Rsp-1-1   NC_008268.1        -       6724436  6724002  RNA→metC(COG2873)→metA(COG2021)→
Aau-1-4   NC_008711.1        +       1618323  1618754  RNA→metA(COG2021)→
Rsp-1-2   NC_008268.1        +       4627879  4628337  RNA→CsdB(COG0520)→
Jsp-1-2   NZ_AAMN01000002.1  +       4561     5010     RNA→CsdB(COG0520)→DegV(COG1307)→DUF205(pfam02660)→
Nfa-1-1   NC_006361.1        -       1024561  1024137  RNA→metA(COG2021)→
⁴Asp-1-2  NC_008541.1        +       1479272  1479711  RNA→metA(COG2021)→Sugar_tr(pfam00083)→AdhC(COG1062)→
Kra-1-1   NZ_AAEF02000064.1  +       13374    13810    RNA→DszC(cd01163)→
Asp-1-3   NC_008541.1        +       2980393  2980837  RNA→CsdB(COG0520)→SmtA(COG0500)→TroR(COG1321) FeoA(pfam04023)→CrcB(COG0239)→CrcB(COG0239)→hypo→
Aau-1-5   NC_008711.1        +       2904748  2905189  RNA→CsdB(COG0520)→
Jsp-1-3   NZ_AAMN01000002.1  +       91101    91530    RNA→metC(COG2873)→
Mle-1-1   NC_002677.1        -       819835   819388   RNA→metA(COG2021)→
Msm-1-2   NC_008596.1        -       1748407  1747949  RNA→metC(COG2873)→metA(COG2021)→UbiE(COG2226)→
Mfl-1-1   NZ_AAPA01000001.1  +       833500   833951   RNA→metC(COG2873)→metA(COG2021)→UbiE(COG2226)→
Mva-1-2   NC_008726.1        -       1643759  1643310  RNA→metC(COG2873)→metA(COG2021)→UbiE(COG2226)→
Msp-3-1   NC_008146.1        -       1307453  1307006  RNA→metC(COG2873)→metA(COG2021)→UbiE(COG2226)→
Msp-2-1   NC_008705.1        -       1311762  1311315  RNA→metC(COG2873)→metA(COG2021)→UbiE(COG2226)→
Msp-1-1   NZ_AAQC01000009.1  +       7131     7578     RNA→metC(COG2873)→metA(COG2021)→UbiE(COG2226)→
Mul-1-1   NC_008611.1        -       1553444  1552996  RNA→metC(COG2873)→metA(COG2021)→UbiE(COG2226)→
Mtu-2-1   NC_002755.2        +       3723565  3724013  RNA→metC(COG2873)→metA(COG2021)→UbiE(COG2226)→
Mtu-4-1   NC_000962.2        +       3725957  3726405  RNA→metC(COG2873)→metA(COG2021)→UbiE(COG2226)→
Mbo-1-1   NC_002945.3        +       3683207  3683655  RNA→metC(COG2873)→metA(COG2021)→UbiE(COG2226)→
Mtu-3-1   NZ_AAIX01000036.1  +       18996    19444    RNA→metC(COG2873)→metA(COG2021)→UbiE(COG2226)→
Mtu-1-1   NZ_AAKR01000147.1  +       6245     6693     RNA→metC(COG2873)→metA(COG2021)→UbiE(COG2226)→
Mtu-5-1   NZ_AASN01000046.1  +       389912   390360   RNA→
Mav-2-1   NC_002944.2        +       3838558  3839009  RNA→metC(COG2873)→metA(COG2021)→UbiE(COG2226)→
Mav-1-1   NC_008595.1        +       4445969  4446420  RNA→metC(COG2873)→metA(COG2021)→UbiE(COG2226)→
Asp-1-4   NC_008541.1        +       1477689  1478157  RNA→metC(COG2873)→
Fsp-1-1   NZ_AAII01000122.1  +       12974    13413    RNA→CsdB(COG0520)→
³Kra-1-2  NZ_AAEF02000003.1  +       44250    44673    RNA→DdpA(COG0747)→DppB(COG0601)→DppC(COG1173)→COG1123(COG1123)→Nitrilotriacetate_monoxgenase(cd01095)→
Kra-1-3   NZ_AAEF02000019.1  -       36781    36357    RNA→NlpA(COG1464)→AbcC(COG1135)→AbcD(COG2011)→
⁴Nfa-1-2  NC_006361.1        -       5665654  5665216  RNA→RHOD_1(cd01522)→metZ(pfam01053)→
Str-1-1   NZ_AATJ01000006.1  -       318699   318245   RNA→CsdB(COG0520)→
Sar-1-1   NZ_AAWA01000001.1  -       73920    73461    RNA→CsdB(COG0520)→
Nfa-1-3   NC_006361.1        +       396879   397316   RNA→CsdB(COG0520)→
Kra-1-4   NZ_AAEF02000050.1  -       6285     5844     RNA→CsdB(COG0520)→
Mma-1-1   NZ_AAAP01003574.1  +       1800     2239     RNA→COG0520: Selenocysteine lyase→CsdB(COG0520)→
Msp-1-2   NZ_AAQC01000005.1  +       250651   251094   RNA→←ABC_Class3(cd03229)
Mfl-1-2   NZ_AAPA01000002.1  +       124393   124836   RNA→ERCC4(pfam02732)→
⁴Msp-3-2  NC_008146.1        -       920195   919752   RNA→hypo→
Msp-2-2   NC_008705.1        -       928924   928481   RNA→hypo→hypo→COG0714(COG0714) Mrr(COG1715)→McrC(COG4268)→
Bli-1-1   NZ_AAGP01000018.1  -       74585    74159    RNA→hypo→

Supplementary Figure S3C: conserved domains present in genes downstream of SAM-IV riboswitches

Conserved domains found in downstream genes (Supplementary Figure S3B) are listed, with the first sentence in their description from the Conserved Domain Database. Conserved domains downstream of more than one SAM-IV riboswitch are assigned a color, while others are shown in gray. (This explanatory text is largely copied from supplementary data on a different RNA motif [4].)

cd01095    nitrilotriacetate monoxygenase oxidizes nitrilotriacetate utilizing reduced flavin mononucleotide (FMNH2) and oxygen.
cd01163    Dibenzothiophene (DBT) desulfurization enzyme C (DszC).
cd01522    Member of the Rhodanese Homology Domain superfamily, subgroup 1.
cd03229    This class is comprised of all BPD (Binding Protein Dependent) systems that are largely represented in archaea and eubacteria and are primarily involved in scavenging solutes from the environment.
COG0239    Integral membrane protein possibly involved in chromosome condensation [Cell division and chromosome partitioning]
COG0500    SAM-dependent methyltransferases [Secondary metabolites biosynthesis, transport, and catabolism / General function prediction only]
COG0520    Selenocysteine lyase [Amino acid transport and metabolism]
COG0601    ABC-type dipeptide/oligopeptide/nickel transport systems, permease components [Amino acid transport and metabolism / Inorganic ion transport and metabolism]
COG0626    Cystathionine beta-lyases/cystathionine gamma-synthases [Amino acid transport and metabolism]
COG0714    MoxR-like ATPases [General function prediction only]
COG0747    ABC-type dipeptide transport system, periplasmic component [Amino acid transport and metabolism]
COG0765    ABC-type amino acid transport system, permease component [Amino acid transport and metabolism]
COG1062    Zn-dependent alcohol dehydrogenases, class III [Energy production and conversion]
COG1123    ATPase components of various ABC-type transport systems, contain duplicated ATPase [General function prediction only]
COG1126    ABC-type polar amino acid transport system, ATPase component [Amino acid transport and metabolism]
COG1135    ABC-type metal ion transport system, ATPase component [Inorganic ion transport and metabolism]
COG1173    ABC-type dipeptide/oligopeptide/nickel transport systems, permease components [Amino acid transport and metabolism / Inorganic ion transport and metabolism]
COG1307    Uncharacterized protein conserved in bacteria [Function unknown]
COG1321    Mn-dependent transcriptional regulator [Transcription]
COG1464    ABC-type metal ion transport system, periplasmic component/surface antigen [Inorganic ion transport and metabolism]
COG1715    Restriction endonuclease [Defense mechanisms]
COG2011    ABC-type metal ion transport system, permease component [Inorganic ion transport and metabolism]
COG2021    Homoserine acetyltransferase [Amino acid transport and metabolism]
COG2226    Methylase involved in ubiquinone/menaquinone biosynthesis [Coenzyme metabolism]
COG2873    O-acetylhomoserine sulfhydrylase [Amino acid transport and metabolism]
COG3393    Predicted acetyltransferase [General function prediction only]
COG4268    McrBC 5-methylcytosine restriction system component [Defense mechanisms]
COG5515    Uncharacterized conserved small protein [Function unknown]
pfam00083  Sugar (and other) transporter.
pfam00497  Bacterial extracellular solute-binding proteins, family 3.
pfam00782  Dual specificity phosphatase, catalytic domain.
pfam01053  Cys/Met metabolism PLP-dependent enzyme.
pfam02660  Domain of unknown function DUF.
pfam02732  ERCC4 domain.
pfam03747  ADP-ribosylglycohydrolase.
pfam04023  FeoA domain.

Supplementary Figure S3D: SAM-IV multiple sequence alignment

The multiple sequence alignment of SAM-IV riboswitches follows. The alignment includes sequences containing the putative SAM-IV riboswitches, as well as downstream sequence, in which Shine-Dalgarno sequences, rho-independent transcription terminators and start codons are annotated. Superscript "1", "2", "3" and "4" are as explained in Supplementary Figure S3B. Nucleotides proposed to base pair are colored when they comprise Watson-Crick or G-U pairs. Otherwise they are gray. Colors distinguish the following elements: P1, P2, P2b, P3, P4, P5, pseudoknots and stems of predicted rho-independent transcription terminators. Putative Shine-Dalgarno sequences and start codons are colored green. They can be distinguished from pseudoknots because they appear 3′ to the conserved aptamer motif. Start codons are derived from RefSeq annotations, while Shine-Dalgarno sequences were estimated manually based on annotated start codons.

Stems (except for terminators) are also indicated at the bottom of the alignment by angle brackets, where matching < and > denote base-paired columns. Below these angle brackets, the symbol "2" denotes base pairs exhibiting covariation, "1" denotes base pairs exhibiting compatible mutations, "0" denotes base pairs that are not observed to mutate and "?" denotes base pairs that have a significant frequency of non-canonical nucleotides for Watson-Crick or G-U pairs. Below these base pair annotations is the consensus sequence: "R" = "A" or "G", "Y" = "C" or "U", red nucleotides: nucleotide identity conserved more than 97% of the time, black nucleotides: 90%, gray nucleotides: 75%, red circle: nucleotide is present 97% of the time, black circle: 90%, gray circle: 75%, white circle: 50%.

The following SAM-IV riboswitches are not shown because they have an identical nucleotide sequence to other hits that are shown: Cgl-1-2, Mav-1-1, Mbo-1-1, Msp-2-1, Msp-2-2, Mtu-1-1, Mtu-3-1, Mtu-4-1, Mtu-5-1.

(This explanatory text is largely copied from supplementary data on a different RNA motif [4].)

alignment positions 1-149
⁴Cef-1-1  AACCUCAUCUUCGCGAUCAAGAAGUUCAGCCU..CUAAGCCCUUCGG.CA.GGCUGAC.UGGCAACCGCGC.....AA.C.GC.....ACA..CGGUGCCC.CCGAAGGAAGAUCCGCUCUGUACUC......AAAA............
Cgl-1-1   AAUGUCGAUUUUACGAUCAAGAAGAUCAGCC...GCAAGCCCUUCGG.CA.GGCUGAC.UGGCAACCGCGC......AUC.GC.....GCA..CGGUGCCC.CCGAAGGAAGAUCGGCUCUGUACUC......AAAA............
Aau-1-1   UAGAUUCAAGUCCAGGUCAAGAGUCAGCACG...CCAAGG..ACCGG.CU.CGUGCUG.CGGCAACCCUCA....GGCAU.U.AAG..UGGCGGGGUGCU..CCGGAAACAGACCAGGCCGCA..........CA..............
Aau-1-2   UAGGGUGGUCCCAGCUUCAAGAGAUGUGGUG...CCAAGC..UCCGA.CU.GGCCACA.CACCAACCCCAU.......GU.UC.....AC...GGGUGGU..UCGGAUGAGGAAGCGGCCCGUCCCGG.....ACCAGAGACAA.....
¹Mva-1-1  UACAGUGCUCAUUGCUUCACGAGAUGCCGUCG..CCAAGC..UCCGA.CUGGGCGGCA.CGCCAACCCCAC.......GC.UC.....GU...GGGUGGC..UCGGAUGACGAAGCGGCCA.GCACC......GACC............
Msm-1-1   UUAUCCUGAACCUGCUUCACGAGACGCCGUCG..ACAAGC..UCCGA.CUGGAUGGCG.CGCCAACCCCAC.......GC.UC.....GU...GGGUGGC..UCGGAUGACGAAGCGGCCUGCACC.......ACCC............
Sav-1-1   AAAGCCUCCCGCUUGGUCACGAAGCCGAUCG......CGCC.CCUGGACGAUCAUCGGCUGGCAACCUCGC....ACACC.GC.....GCA..AGGUGCCUUCCAGG.GAAGACCGGGCCCCCACU.......GUCCA...........
²Asp-1-1  UUACGUUUUUGGGGAGUAAUGAGAACCGGCG...CCAAGC..CCUGA.CU.GGCCGGU.CGGCAACCCUCC....UUC.C.AC.....GGCG.GGGUGCC..UCAGGUGAAUACUCGGCAUAUCGA.......UAU.............
Aau-1-3   UUACGUUUUUCUUGAGUAAUGAGCACCAGCG...CAAAGC..CCUGA.CU.UGCUGGU.CGGCAACCCUCU......CUC.C......GGCG.GGGUGCC..UCAGGUGACUACUCGGCG.UAUCGA......CAAC............
Fal-1-1   CUGAAGAUCACAACUGUCACGAGUGCCAACG...UCCAGC..CCCGG.CU.UGUUGGC.CGGCAACCCUCC......ACC.GCGA...AG...GGGUGCC..CCGGGUGAGGACACG.GCCGCAUCC......AGU.............
Sav-1-2   UUAAUUUCACGCCCGGUCACGAGUUCCAGCG...UCAAGC..CCCGG.CU.UGCUGGA.CGGCAUCCC.GC.......CC.GCU....AC...GGGUGCC..CCGGGUGACGACCGGCCCCGUGCG.......CGG.............
Sav-1-3   UUACGCCCCAGAACGGUCAAGAGCGUCAGCG...ACAAGC..CCUGG.CU.UGCUGAC.CGGCAACCCUCG.....CGAC.GC.....GGUG.GGGUGCC..CCAGGUGACGACCGGACCGGA..........UGACC...........
Fal-1-2   UAAGGUUUCUCCCACGUCUUGAGUGCCAGCG...UUCAGC..CCCGG.CU.UGCUGGC.CGGCAACCCUCC.....UUAU.GCA.....GCG.GGGUGCC..CCGGGUGGUGACGAGGCCG.CGGC.......AACCCC..........
env-1     UAUAGUCGGAGACAGGUCAAGAGAUUCAGCU...UUAAGC..CCCGG.CU.AGCUGGA.CGGCAACCCUCC.....AACC.GC.....GGUG.GGGUGCU..CC.GGUGGAGACCUGGCCC............UCAC............
Sco-1-1   UAGGUUUUUCGACAGGUCAUGAGUGACAGUCA..UGAGGC..CCCGG.CC.GACUGUC.CGGCAACCCUCC.....GUCC.GU.....GGCG.GGGUGCC..CCGGGUGAAGACCAGGUCGUGGAC.......AGCAAG..........
Sav-1-4   AAGAUGCUGUAUCAGGUCAUGAGCGACAGUCA..UGAGGC..CCCGG.CC.GACUGUC.CGGCAACCCUCC.....GUCC.GU.....GGCG.GGGUGCC..CCGGGUGAAGACCAGGUCGUAGGC.......AGCGAG..........
⁴Jsp-1-1  UAGACUCGUGCCCAGGUCAUGAGUCCCAGCG...ACAAGC..CCCGG.CU.UGCUGGG.CGGCAACCCUCC.......UC.GC.....GGUG.GGGUGCC..CCGGGUGAAGACACGGCCCUUCCGG......UA..............
Nsp-1-1   GCUAGCGUCCCGUCCGUCACGAGUACUGGCG...CGAAGC..CCCGG.CU.GGCCAGU.CGGCAACCCUCC......UCC.GC.....GGCG.GGGUGCC..CCGGGUGAGGACACGGCCGACUGCG......AC..............
Rsp-1-1   UAAGAUUCACUCCAGGUCAUGAGUGCCAGCA...ACGAGC..CCCGG.CU.AGCUGGC.CGGCAACCCUCC......ACC.GC.....GGUG.GGGUGCC..CCGGGUGAUGACCUGGUUUCCUGAUU.....CACACAAC........
Aau-1-4   GACUCUUCGUAUUAGGUCAUGAGUGCCAGCA...CACAGC..CCCGG.CU.UGCUGGC.CGGCAACCCUCC.....UC.C.GC.....GGUG.GGGUGCC..CCGGGUGAAGACCUGGCCUGUUCGC......ACGCAAG.........
Rsp-1-2   CAAUCUUCGAAACAGGUCACGAGUACCAGCG...UCAAGC..CCCGG.CU.AGCUGGU.CGGCAACCCUCC......ACC.GC.....GGCG.GGGUGCU..CCGGGUGACGACCAGGCUGAGGUC.......CAUACC..........
Jsp-1-2   GACUCCACAACCAAGGUCACGAGUACCAGUG...UUCAGC..CCCGG.CU.UGCUGGU.CGGCAACCCUCC......UCC.GC.....GGUG.GGGUGCU..CCGGGUGACGACCUGGUCGGCUGC.......AGCAA...........
Nfa-1-1   UACCAUGGCGGUCAGGUCAUGAGCGCCAGCG...UCAAGC..CCCGG.CU.CGCUGGC.CGGCAACCCUCC.....AGCU.GC.....GGUG.GGGUGCU..CCGGGUGAUGACCGGGCUCCCGA........AGG.............
⁴Asp-1-2  GUAGACUCUUUAAAGGUCAUGAGUGCCAGCG...AC.AGC..CCCGG.CU.UGCUGGC.CGGCAACCCUCC.....UUUC.GC.....GGCG.GGGUGCC..CCGGGUGAAGACCUGGCCUGCCGGCCA....GU..............
Kra-1-1   GUUCGCUGGCGCCAGGUCAUGAGCGCCAGCG...ACAAGC..CCCGG.CU.CGCUGGC.CGGCAACCCUCC.......UC.GU.....CGUG.GGGUGCC..CCGGGUGAGGACCUGACCUGGUGCC......CC..............
Asp-1-3   CUAAUCUCGAAACAGGUCACGAGUGCCAGCG...CUAAAC..CCCGG.UU.UGCUGGC.CGGCAACCCUCC.....AUUC.GC.....GGUG.GGGUGCC..CCGGGUGACGACCAGGCCGG.UCCG......GAA.............
Aau-1-5   CUAGCCUUUUUACAGGUCACGAGUGCCAGCG...CUAAAC..CCCGG.UU.UGCUGGC.CGGCAACCCUCC.....AUUC.GC.....GGUG.GGGUGCC..CCGGGUGACGACCCGGCCG.GUCCG......GAA.............
Jsp-1-3   ACGUUUCAACCACAGGUCAUGAGUGCCAGCG...ACAAGC..CCCGG.CU.CGCUGGC.CGGCAACCCUCC.....AACC.GC.....GGUG.GGGUGCC..CCGGGUGAAGACCAGGCGGAGCGUC......GACC............
Mle-1-1   AUAGGCUGCAACGCGGUCAUGAGCGCCAGCG...UCAAGC..CCCGA.CU.UGCUGGC.CGGCAACCCUCC....AAC.C.GC.....GUUG.GAGUGCC..CC.GGUGAUGACCAGGUUGAGUAGCC.....AGAACC..........
          ..............<<<<....<.<<<<<<.......<<.........>>..>>>>>>.><<<..<<<.<<.................>>...>>>.>>>...........>>>>..<<<<<<<<<<<<<<<<............>>>>
          ..............2202....2.222222.......22.........22..222222.2220..002.2?.................?2...200.022...........2022..222??2??22220000............0000
          .......................................<..<<<<<.......................................................>>>>>.>........................................
          .......................................2..22202.......................................................20222.2........................................
          ............................................................................<<<<.<<..................................................................
          [nucleotide-presence circles]
          R YY GGUCAYGAGYRYCAGCR--- YAAGC--CCCGG-CU- GCUGRY-CGGCAACCCUCC----- YC-GC-----GGY -GGGUGCC--CCGGGUGA GACC GGYY ------ ------------
Msm-1-2   AUAGUCUCUUUGUCGGUCAUGAGCGCCAGCG...ACAAGC..CCCGG.CU.UGCUGGC.CGGCAACCCUCC.....AACC.GC.....GGUG.GGGUGCC..CCGGGUGAUGACCAGGUUGAGUGGUUGACGGAUUCCU......CCGU
Mfl-1-1   AUACAGUUCGUCUCGGUCAUGAGUGCCAGCG...ACAAGC..CCCGG.CU.UGCUGGC.CGGCAACCCUCC.....AACC.GC.....GGUG.GGGUGCC..CCGGGAGACGACCAGGUUGAGUAGCCG....AACA............
Mva-1-2   AUAGUCUUUGACUCGGUCAUGAGUGCCAGCG...AUAAGC..CCCGG.CU.UGCUGGC.CGGCAACCCUCC.....AACC.GC.....GGUG.GGGUGCC..CCGGGAGACGACCAGGUUGAGUAGCCG....GCCA............
Msp-3-1   AUAAGCUUCCUGGCGGUCAUGAGUGCCAGCG...UCAAGC..CCCGG.CU.UGCUGGC.CGGCAACCCUCC.....AACC.GC.....GGUG.GGGUGCC..CCGGGUGAUGACCAGGUUGAGUAGCCG....UGA.............
Msp-1-1   AUAAGCUUCCUGGCGGUCAUGAGUGCCAGCG...UCAAGC..CCCGG.CU.UGCUGGC.CGGCAACCCUCC.....AACC.GC.....GGUG.GGGUGCC..CCGGGUGAUGACCAGGUUGAGUAGCCG....UGA.............
Mul-1-1   GUAGGCUUCAGAUCGGUCAUGAGCGCCAGCG...UCAAGC..CCCGG.CU.UGCUGGU.CGGCAACCCUCC.....AACC.GC.....GGUG.GGGUGCC..CCGGGUGAUGACCAGGUUGAGUAGCC.....ACAGUC..........
Mtu-2-1   CUAGGCUUCGAGUCGGUCAUGAGCGCCAGCG...UCAAGC..CCCGG.CU.UGCUGGC.CGGCAACCCUCC.....AACC.GC.....GGUG.GGGUGCC..CCGGGUGAUGACCAGGUUGAGUAGCC.....AUCGCC..........
Mav-2-1   AUAGUCUUUGUAUCGGUCAUGAGCGCCAGCA...UCAAGC..CCCGG.CU.UGCUGGC.CGGCAACCCUCC.....AACC.GC.....GGUG.GGGUGCC..CCGGGUGAUGACCAGGUUGAGUAGCC.....ACCACAACC.......
Asp-1-4   CUACACUGGGUCAAGGUCAUGAGCGCCAGCA...UUGAGC..CCCGG.CU.UGCUGGC.CGGCAACCCUCG.....UUCC.GC.....GGUG.GGGUGCC..CCGGGUGAGGACCUGGCCU.CCGGC......AACC............
Fsp-1-1   UACAGUUUUCGCCAGGUCCUGAGCGCCAGCG...UCAAGC..CCCGG.CU.UGCUGGC.CGGCAACCCUCC......UCC.GC.....GGUG.GGGUGCC..CCGGGUG.UGACCAGGCCC.GCGGCG.....AACC............
³Kra-1-2  CUAGCUUCUUCCUCGGUCAUGAGUGACAGCU...GUAGGC..CCUGG.CU.GGCUGUC.CGGCAACCCUCC.....UUCC.GU.....GGCG.GGGUGCC..CCAGGUGACGACCCGGCCGGGCG........ACG.............
Kra-1-3   CUACGGUCCUCCCAGGUCAUGAGUGACAGCG...ACAAGC..CCUGG.CU.CGCUGUC.CGGCAACCCUCC......UCC.GC.....GGCG.GGGUGCC..CCAGGUGAAGACCCGGCCGGACG........AAAA............
⁴Nfa-1-2  UCUCAGUGACCGAAGGUCAUGAGCACCAGCG...CCAAGC..CCCGG.CU.CGCUGGU.CGGCAACCCUCC......UCC.GC.....GGCG.GGGUGCU..CCGGGUGACGACCUGGCCGUACCG.......AAGA............
Str-1-1   CUACACUGCAAGCAGGUCACGAGCGCCAGCG...ACAAGC..CCCGG.CU.UGCUGGC.CGGCAACCCUCGUCGAGGUUC.GC.....GGUG.GGGUGCC..CCGGGUGAUGACCGGGCCUGGCGCG......GCG.............
Sar-1-1   CUACACUGCGAACAGGUCACGAGCGCCAGCG...ACAAGC..CCCGG.CU.UGCUGGC.CGGCAACCCUCGUCGAGGUUC.GC.....GGUG.GGGUGCC..CCGGGUGAUGACCGGGCCUGGCGCG......GCG.............
Nfa-1-3   CCAAUCUCGGUCCAGGUCAUGAGUGCCAGCG...CAAAGC..CCCGG.CU.CGCUGGU.CGGCAACCCUCC......UCC.GC.....GGUG.GGGUGCU..CCGGGUGACGACCUGGCCGCCGUCCG.....AC..............
Kra-1-4   GCAACCUCGCCGCAGGUCAUGAGCGCCAGCG...ACAAGC..CCCGG.CU.CGCUGGC.CGGCAACCCUCC......UCC.GC.....GGCG.GGGUGCC..CCGGGUGACGACCUGGCCGGUGCCG......ACG.............
Mma-1-1   CAUCCUGGUCCGCAGGUCAUGAGUGCCAGCG...CGAAAC..CCCGG.UU.UGCUGGC.CGGCAACCCUCC.....UCUC.GC.....GGCG.GGGUGCC..CCGGGUGACGACCAGGCCGCACGCCG.....AC..............
Msp-1-2   UUAUUCUGAGACCAGCUCACGAGCGGACGGCAGUUUGGGUA.CCUGG.CCCGUCGUCC.CGGCAACCGCGC.....GACCACC.....GCA..CGGUGCC..CCAGGGGAAGAGCGGGCGCG...........GCUUA...........
Mfl-1-2   UUACUCUGGCUUCAGCUCACGAGCGGACGACUGUUCGGGUA.CCUGG.CCCGUCGUCC.CGGCAACCGCGC.....GACCACC.....GCA..CGGUGCC..CCAGGGGAAGAGCGGGCACGC..........GGU.............
⁴Msp-3-2  UUAACCUGGGAGCAGCUCACGAGCGGACGACUGUUCGGGUA.CCUGG.CCCGUCGUCC.CGGCAACCGCGC.....GACCGCC.....GCA..CGGUGCC..CCAGGGGAAGAGCGGGCACGA..........GCA.............
Bli-1-1   AAGAUCAUUCACACGUUCACGAGUCGUGGCU...AUCGGUA.CCUGG.CC.GGCCACG.CGGCAACCC.GU.....CAUC.G......ACC..GGGUGCC..CCAGGGAAAGAACGGGCCCGCACCA......CG..............
          ..............<<<<....<.<<<<<<.......<<.........>>..>>>>>>.><<<..<<<.<<.................>>...>>>.>>>...........>>>>..<<<<<<<<<<<<<<<<............>>>>
          ..............2202....2.222222.......22.........22..222222.2220..002.2?.................?2...200.022...........2022..222??2??22220000............0000
          .......................................<..<<<<<.......................................................>>>>>.>........................................
          .......................................2..22202.......................................................20222.2........................................
          ............................................................................<<<<.<<..................................................................
          [nucleotide-presence circles]
          R YY GGUCAYGAGYRYCAGCR--- YAAGC--CCCGG-CU- GCUGRY-CGGCAACCCUCC----- YC-GC-----GGY -GGGUGCC--CCGGGUGA GACC GGYY ------ ------------

alignment positions 150-309
⁴Cef-1-1  ..GACUGCAGAGAAGA.GC.GAUU...................................................AUGAAGGAGAUUCCCAUGCGGAAUACCUAUGAAAACACCCCGCUAGUAAUCAGAUUGCACAACAAGUGCGGGCACACUCAAUCCG
Cgl-1-1   ..GAAUGCAGAGAAGA.GC.GAUUUUU.......UGAAG............................GAGAUU.CCCAUGGGGAAUAUUUCUGCUAAGCCUCCAUUGCCCAAUAGCUGCGGACAUCAUUUACCAGGACACGAACCCUACUACCUACAAGU
Aau-1-1   ......UGCGGCAAGU.GA.AUGUCGU.......CCCGGCCGCCGC.....................GCGGCC.UCUGAAAGAGAGAAUCAUGCCUUCCACUCCUGUACCUGAAUUCCGCUCCCGCGUCAUCGCAUUGGAGCUCGACGGCGAUGGCGCCC
Aau-1-2   .CCGA.GCGGGCAAGA.GC.ACUCU..................................................CCCAAGGAGCAGUCCCUUGCCGUACCAAGGCGAUACCACCCCGGCAGCAAGCACCCCAAAAACACCCAGCAUCCUGGAAGCACCA
¹Mva-1-1  ..GGUGCCUGGCAAGA.GC.GCCCGGC.......AACGG............................GCCCGG.UUGAUCCGGAGCAUCAAUGAACACCUCACGAACGACGAUUCAGCGUGACCGCGCCGCCGGUGCACUGGUCGGUCUGGCCUCGGGGG
Msm-1-1   ...GGUGCGGGUAAGA.GC.GCCUCC........GCGUG.............................GGAGG.CCCGGAUCGACCCGGGAGAAGCCAUGGGUUCUUCACGUAUCGAGUUGACGGCUGCGCAGCGAGAUCGGGCUGUCGGCGUGCUGUUG
Sav-1-1   ...GUUGGGGGUAAGC.GCGGAUCUCC.......CUGCACGACGCGUAA..................GGGGGU.CCCCGAUUCUCAGGAGACCCCAUGGACAACACCCGCGCAUCAGCCACGCCCCGGAACUGGGCGGGCUACCAGCAGGCAGUCGUGGA
²Asp-1-1  ...UCGAAUUGCAAGC.GU.GAGAG..................................................AAGGAGAUCCGUCGUGUCCGACGCACCCGCACAGCUGGCCCCGGCAGCCGAACCAGACGCCGCCGGCCAGGAACAACCGGCCCUG
Aau-1-3   ..UCG.UACUGCAAGC.GU.GACUG..................................................AGGAGCACCAGCAAUGCCUGAACCCACUGAAGAAAAACUCUCCUACCGCCUGAUUACAGGCCCGGAUACCCGUGAUUUCUGCGAG
Fal-1-1   ..GGAGGCGGCAAAGC.GC.GGGCCC.GG.....UCU............................CCUGGGCC.UCGGCCGAGGGAGCAAAGACAUGCUGGGAGCGACUCGCCACGCGGGUGGCCGCUCCACGACCGCCGGUCGCGGGCCGGGCACGAUC
Sav-1-2   ...UGCACGGGGAACC.GC.GG............AUCA..................................C.CGGGGUGUGGCACUACCGGGGUCCACCGACUCCGGAAGGCCGUCCGCCCAUGCCCGCGUCCGACGCGAUCCCCGCCCUGCCACAGC
Sav-1-3   ......UUCGGUAAGC.GC.GAGGCCC.......CCCAA............................GGG....CCGGAACAGUGACGGUGACGGCGAUGGACGCAGCUACGAGGACGACCGCACAGCGGCGCCCCCGCCGCGUCCUGCUGUGCGCGAAC
Fal-1-2   ...GCCGACGGCAAGU.GC.CGACCCU.......GUG..............................AGGGUC.CUACCAGGAGCGAUGAUGACGUCGAUCAUCUCUGCCGUGUCCCCGUCCCCUGCCCGGCAGGCCUGCCCCGACCAGCCGACCGGGCA
env-1     ........GGGCAAGU.GC.GGUCUUG.......CGCAGGG..........................CAAGGC.CCCGGAUGAAUGCGAGGUCUCAAAUGCGCUGUACCUGUCUCAUCUAGUGACCUGGCGCCACGCCCUGCGACUGUGCGCGCCAUCAG
Sco-1-1   ...GUCCACGGCAAGC.GC.GGACCCCU......CGCGGAACC.......................AGGGGUC.CUGGGUCGUCCGAGGGAGUCUCCCGUGAACCAGCGAAUGCGAUAGGGCCGCAGAGCCCCGCCUCCGCACACCCCUUUUCUUCUCUU
Sav-1-4   ...GUCUACGGCAAGC.GC.GGACCCCU......CGUAAUCACC......................AGGGGUC.CUGGUCGUUCGAGGGAGCCUUCCGUGAGCAAGCGAAUGCGAUAGGGCGCCGAGCCCCGCCUCCGCAAUCCCCUUUUCACCUUUCCU
⁴Jsp-1-1  ..UCGGUGGGGCAAGC.GC.GAUCC.........GA.................................GGAG.CACUCCGAUGACCAUCACCGACCUCCCGACAGCGUCGUUCUGGCGCCCGGGCGAUCACCCGGGCGCGCGCCAGUUCGUCCAGGUCG
Nsp-1-1   ..CGCAGCCGGCAAGC.GC.GGACUCG.......GCGACC...........................CGGGUC.CCUGACCCGACGGAUGUGCCCACAUGCCCACGAACACUCGCACCUCGGCCCAGGCCCGGGCCGCUCGGACCCGCCCACUGGCCCGC
Rsp-1-1   .AGUCGGGUGACAAGC.GC.GGGUC..................................................CGACAUCGGAGGUUCAUGAUGUGUUGUUGUCGUAGAUAGACCCCAGACAAGCCCUGCGCGCGAACGAACCAGCAACUCUGUCGCG
Aau-1-4   ..GCG.ACGGGCAAGC.GC.GAGGUU.................................................AAGGUGUCAAAGGAAUGACCAUUACUGCUACAGCCCUUCCAAAAUCCGGAGAAGAAGACGGAAUCGUCAAGUACGCCGGCAUAGG
Rsp-1-2   ...GGACCCGGCAAGC.GC.GGACUC........CGAGAGGGCACCGACCCUGCGAC...........GAGUC.CCUGACUACGAGGGAUGUCGUGAUGACUGCUGUACUGACCGCUGACGUGUGUGUUGCCCCCCUAGCCCAGGUUUCGGGCUCCGACC
Jsp-1-2   ..CGCAGUUGGCAAGC.GC.GGACUCCCGAG...CACC.........................UUCGAGGGUC.CAACGACUCUCCAGGGAGCACCGGAUGUCCAUCGCCAGCUCGAUUCCUCACACCUUCACCCGUGAACGUCGCCAGCCGUGGCUGCG
Nfa-1-1   ....UCGGGGGCAAGC.GC.GCUCG..................................................CAGGAGGUUUUCCCGACAUGAUCGAGUGGUCCGCGUGACCGUGAGCACCGAUCAGAGCCCCUGCCCCUCGGCCACCGGGGCGGAA
⁴Asp-1-2  UGG.CGGCAGGCAAGC.GC.GAAGA.........AGAGG..............................UCCU.CAGAUGACGAUUGCCGUCACCCGCAGCGGUGUACCCGAAACAUCCAGCCACAGCCUGUCAGCCCGUGACGUGAAAACCACCGCGGG
Kra-1-1   ..GGCAC.GGGUAAGG.AC.GAGCCG........CCGCA.............................CGCCU.CCCGGUCCGGUCGCCGACGCGUCCACGAUCGAGGAGAACCCCGGUGACCACCACCACCCCGCCCGUCCCCGCCACCACCGCCCGGC
          >>>>>>>>>>>>........<<<<<<<<<<<................................>>>>>>>>>>.>.....................................................................................
          2222??2??222........?????2220??................................??0222????.?.....................................................................................
          ................................................................................................................................................................
          ................................................................................................................................................................
          .................>>.>>>>........................................................................................................................................
          [nucleotide-presence circles]
          -- GGCAAGY-GC-GR YY ------- ---------------------------- -Y
Description: