BMC Microbiology (BioMed Central)
Research article, Open Access

Sequence analysis of percent G+C fraction libraries of human faecal bacterial DNA reveals a high number of Actinobacteria

Lotta Krogius-Kurikka1, Anna Kassinen1, Lars Paulin2, Jukka Corander3, Harri Mäkivuokko4,6, Jarno Tuimala5 and Airi Palva*1

Address: 1Department of Basic Veterinary Sciences, Faculty of Veterinary Medicine, PO Box 66, FI-00014 University of Helsinki, Finland, 2DNA Sequencing Laboratory, Institute of Biotechnology, University of Helsinki, Finland, 3Department of Mathematics, Åbo Akademi University, Finland, 4Danisco Innovation, Kantvik, Finland, 5CSC – Scientific Computing Ltd, Espoo, Finland and 6The Finnish Red Cross, Blood Service, Helsinki, Finland

* Corresponding author

Published: 8 April 2009
Received: 16 December 2008
Accepted: 8 April 2009

BMC Microbiology 2009, 9:68 doi:10.1186/1471-2180-9-68
This article is available from: http://www.biomedcentral.com/1471-2180/9/68

© 2009 Krogius-Kurikka et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background: The human gastrointestinal (GI) tract microbiota is characterised by an abundance of uncultured bacteria most often assigned to the phyla Firmicutes and Bacteroidetes. The diversity of this microbiota, although approached with culture-independent techniques in several studies, still requires more elucidation.
The main purpose of this work was to study whether genomic percent guanine and cytosine (%G+C)-based profiling and fractioning prior to 16S rRNA gene sequence analysis reveals higher microbiota diversity, especially with high G+C bacteria suggested to be underrepresented in previous studies.

Results: A phylogenetic analysis of the composition of the human GI microbiota of 23 healthy adult subjects was performed from a pooled faecal bacterial DNA sample by combining genomic %G+C-based profiling and fractioning with 16S rRNA gene cloning and sequencing. A total of 3199 partial 16S rRNA genes were sequenced. For comparison, 459 clones were sequenced from a comparable unfractioned sample. The most important finding was that the proportional amount of sequences affiliating with the phylum Actinobacteria was 26.6% in the %G+C fractioned sample but only 3.5% in the unfractioned sample. The orders Coriobacteriales, Bifidobacteriales and Actinomycetales constituted the 65 actinobacterial phylotypes in the fractioned sample, accounting for 50%, 47% and 3% of sequences within the phylum, respectively.

Conclusion: This study shows that the %G+C profiling and fractioning prior to cloning and sequencing can reveal a significantly larger proportion of high G+C content bacteria within the clones recovered, compared with the unfractioned sample in the human GI tract. Especially the order Coriobacteriales within the phylum Actinobacteria was found to be more abundant than previously estimated with conventional sequencing studies.

Background

The gastrointestinal (GI) microbiota is considered to play an important role in human health and disease via essential metabolic, trophic and protective functions in the host [1]. Since the majority of the GI bacteria are uncultivable, molecular biology methods are needed to reveal the detailed composition, diversity and specific role of this complex microbial community [2]. The bacterial groups most often detected in molecular studies of the healthy human GI tract are the phyla Firmicutes (especially Clostridium clusters XIVa and IV), Bacteroidetes, Proteobacteria, Actinobacteria, Fusobacteria and Verrucomicrobia [3]. The predominant microbiota in adults is considered rather stable and host-specific [4,5], but gender, geographic origin, age [6,7], and host genotype [8] may influence its composition. Furthermore, alterations within an individual's environmental factors, such as diet [9] and dietary supplements [10], intestinal health status [11] and antibiotics [12], may also have a substantial effect on the intestinal microbiota. Therefore, as a reference to altered conditions, knowledge of the characteristics of a healthy intestinal microbiota is essential.

The proportional amounts of bacterial phyla detected in studies on the GI tract microbiota depend on both the sample handling and DNA extraction methods applied [13] and the analysis [14]. Recent metagenomic and pyrosequencing studies on the human intestinal microbiota highlight the potential amount of the yet undiscovered diversity of phylotypes and reshape the proportional abundances of the detected phyla, revealing e.g. a higher abundance of Actinobacteria than previously estimated [14-16]. However, the conventional 16S rRNA gene cloning and sequencing is still a valuable method, since it gives a relatively high taxonomic resolution due to longer read length [12] and can be targeted to a phylogenetically relevant gene (16S rRNA gene) in comparison with the metagenomic approach. Furthermore, the clone library obtained serves as a valuable reference for possible future use. To enhance the recovery of phylotypes in bacterial community samples, the genomic %G+C content-based profiling and fractioning of DNA can be used [17-20].

In a previous study comparing patients suffering from irritable bowel syndrome (IBS) with healthy volunteers, the faecal DNA of 23 healthy donors was pooled and %G+C profiled, and three selected fractions, covering 34% of the fractioned DNA, were cloned and sequenced [21]. With the aim to comprehensively elucidate the bacterial phylotype diversity of the GI microbiota of healthy subjects, the remaining seven %G+C fractions were cloned and sequenced in this study, to represent the scale of bacterial genomic %G+C content ranging from 25% to 75% [22]. For methodological comparison, a clone library from unfractioned pooled faecal DNA samples of the same study subjects was constructed. The results provide more detailed insight into the human GI microbiota especially in the context of the diversity of high %G+C bacteria, i.e. Actinobacteria.

Results

Percent guanine plus cytosine profiling, cloning and sequencing

To analyse the diversity of the healthy human intestinal microbiota, a %G+C profiled and fractionated (Figure 1) pooled faecal bacterial DNA sample of 23 individuals was cloned, and the partial 16S rRNA genes were sequenced. The previously published 976 sequences from three %G+C fractions (%G+C 25–30, 40–45 and 55–60) [21] were combined with the 2223 new sequences cloned in this study (%G+C fractions 30–35, 35–40, 45–50, 50–55, 60–65, 65–70 and 70–75) for phylogenetic and statistical analyses of the complete %G+C profile ranging from 25% G+C to 75% G+C (Figure 1, Table 1). Altogether, 3199 sequences encompassing approximately 450 bp from the 5'-end of the 16S rRNA gene, covering two variable areas V1 and V2, were sequenced from all clones from the fractioned sample. For comparison, 459 clones were sequenced from an unfractioned pooled faecal bacterial DNA sample originating from the same individuals.

Figure 1: Percent guanine plus cytosine profile of intestinal microbial genomic DNA pooled from 23 healthy subjects. The amount of DNA is indicated as relative absorbance (%) and the area under the curve is used for calculating the proportional amount of DNA in the separate fractions (modified from Kassinen et al. [21]). Axes: % G+C (20 to 80) against relative absorbance (%). Amount of DNA (%) in the consecutive 5% fractions from 20–25 to 75–80 % G+C: 1, 2, 7, 14, 18, 15, 14, 14, 9, 4, 1, 0.

Determination of operative taxonomic units and library coverage

The quality-checked 3199 sequences from the combined fractioned sample libraries represented 455 operative taxonomic units (OTUs), and the 459 sequences from the unfractioned sample represented 131 OTUs with a 98% similarity criterion (Table 1). All novel OTUs with less than 95% sequence similarity to public sequence database entries were further sequenced to near full-length (Additional file 1). The coverages of the individual clone libraries of the fractioned sample ranged from 77% to 93%, while the coverage for the unfractioned sample was 86% [23] (Table 1). Compared with other fractions, the fractions %G+C 50–55, 55–60 and 60–65 had low OTU numbers and few singletons, resulting in high Good's coverage values. The combined sequences from the fractioned and unfractioned samples clustered into 481 OTUs (Figure 2).

Table 1: Characteristics of the sequence libraries.

Library(s)         Sequences (no.)   OTUs (no.)a   %G+Cb   Singletons (no.)   Coverage (%)c
Fr G+C 25–30%      319               91            51.5    43                 87
Fr G+C 30–35%      350               94            52.6    48                 86
Fr G+C 35–40%      313               93            53.4    50                 84
Fr G+C 40–45%      346               119           53.9    67                 81
Fr G+C 45–50%      316               112           56.0    62                 80
Fr G+C 50–55%      292               62            58.1    22                 93
Fr G+C 55–60%      311               45            62.1    22                 93
Fr G+C 60–65%      303               64            61.7    26                 91
Fr G+C 65–70%      362               130           57.6    65                 82
Fr G+C 70–75%      287               116           55.5    67                 77
Fr G+C 25–75%d     3199              455           56.2    180                94
Unfractioned       459               131           53.6    66                 86

a. The number of OTUs determined with DOTUR using 98% similarity criterion [53]
b. Average %G+C content of the partial 16S rRNA gene sequences
c. Coverage according to Good [23]
d. The combined G+C fractions

Figure 2: Cladogram and abundance plot of the phylogenetic affiliation of the 481 OTUs comprising 3658 sequences. The grey scale indicates the OTU abundance in the %G+C fraction libraries and in the unfractioned library. Actinobacteria are abundant in the high %G+C fractions (in square brackets). Acidobacteria and Verrucomicrobia phylotypes are denoted with a cross. A phylotype having 79% affiliation with Proteobacteria is indicated with an open circle. Phylotypes having 100% affiliation with Cyanobacteria, and 94% affiliation with TM7 with RDPII Classifier [55] are indicated with a black sphere.

Phylogenetic analysis and sequence affiliation

When the sequence data from the fractioned clone libraries were combined, the majority of the sequences were assigned to the phyla Firmicutes (68.5%), Actinobacteria (26.6%), Bacteroidetes (3.1%) and Proteobacteria (1.3%) (Figure 2, Table 2, Additional file 1). Clostridium clusters IV and XIV were the most abundant Firmicutes, represented by 23.5% and 33.0% of the sequences, respectively. The 65 actinobacterial phylotypes consisted of the orders Bifidobacteriales, Coriobacteriales and Actinomycetales, accounting for 12.4%, 13.4% and 0.8% of the sequences, respectively (Figure 3, Table 2).

The distribution of phyla within the individual clone libraries of the fractioned sample revealed that Firmicutes settled mostly in the lower %G+C content portion of the profile, whereas Actinobacteria were found in the fractions with a %G+C content ranging from 50% to 70% (Figure 2, Additional file 1). Prominent phylotypes had a seemingly broader distribution across %G+C fractions. In the fractions having %G+C content above 65%, a bias was observed, i.e. a decrease in high G+C Actinobacteria and an increase in low G+C Firmicutes. The three OTUs with the highest number of sequences fell into the Clostridium clusters XIVa and IV, representing the species Eubacterium rectale (cluster XIVa), Faecalibacterium prausnitzii (cluster IV) and Ruminococcus bromii (cluster IV) with over 98.7% sequence similarity.

Within the phylum Actinobacteria, the most abundant Coriobacteriales phylotypes (6 OTUs) according to the number of representative clones (228 clones) affiliated with Collinsella sp. (C. aerofaciens). The remainder represented Atopobium sp., Denitrobacterium sp., Eggerthella sp., Olsenella sp. and Slackia sp. The order Bifidobacteriales consisted of 398 sequences and 15 phylotypes, of which Bifidobacterium adolescentis was the most abundant. The rest of the bifidobacterial OTUs affiliated with B. catenulatum, B. pseudocatenulatum, B. bifidum, B. dentium and B. longum. The order Actinomycetales comprised 11 OTUs affiliating with Actinomyces sp., Microbacterium sp., Propionibacterium sp., Rhodococcus sp. and Rothia sp. (Figure 3).

Figure 3: Phylogenetic tree of actinobacterial OTUs in the fraction libraries and in the unfractioned library. The number of sequences in the representative OTUs is denoted after the letter F (fractioned sequence libraries) and U (unfractioned library). Bootstrap values are percentages of 100 resamplings and the scale bar represents 0.05 substitutions per nucleotide position.

The unfractioned sample essentially resembled the %G+C fractions 40–45 and 45–50 (Figure 2). In comparison to the combined fractioned clone libraries, the amount of Firmicutes (93.2%), especially the percentage of the Clostridium cluster XIV (51.0%), increased while the number of Actinobacteria (3.5%) decreased. The proportions of Bacteroidetes (2.8%) and Proteobacteria (0.2%) were the least affected when fractioned and unfractioned libraries were compared (Figure 2, Table 2, Additional file 1). All 16 actinobacterial sequences of the unfractioned library were included in OTUs of the fractioned libraries, and Actinomycetales phylotypes were absent in this library (Figure 3). The phylum Actinobacteria differed significantly (p = 0.000) between the fractioned and unfractioned libraries in the UniFrac Lineage-specific analysis, though the libraries overall were similar according to the UniFrac Significance test (p = 1.000). Clones from the phylum Firmicutes present in the fractioned library but absent in the unfractioned library affiliated with Enterococcaceae, Lactobacillaceae and Staphylococcaceae. Furthermore, only one Gammaproteobacteria was found in the unfractioned library, whereas the fractioned samples also contained members of Alphaproteobacteria, Betaproteobacteria and Deltaproteobacteria (Table 2).

Table 2: Phylogenetic affiliation of OTUs and sequences of the %G+C fractioned libraries and the unfractioned library.

                              Fractioned G+C 25–75%             Unfractioned
Group                         OTUs n (%)    Sequences n (%)     OTUs n (%)    Sequences n (%)
Phylum Firmicutes             323 (71.0)    2190 (68.5)         113 (86.3)    428 (93.2)
  Clostridium cluster IV      107 (23.5)    753 (23.5)          36 (27.5)     131 (28.5)
  Clostridium cluster XIV     131 (28.8)    1057 (33.0)         52 (39.7)     233 (51.0)
  Enterococcaceae             2 (0.4)       5 (0.2)             0 (0)         0 (0)
  Lactobacillaceae            4 (0.9)       34 (1.1)            0 (0)         0 (0)
  Staphylococcaceae           2 (0.4)       2 (0.1)             0 (0)         0 (0)
  Streptococcaceae            6 (1.3)       20 (0.6)            2 (1.5)       5 (1.1)
  Other Firmicutes            71 (15.6)     311 (9.7)           22 (16.8)     58 (12.6)
Phylum Actinobacteria         65 (14.3)     851 (26.6)          8 (6.1)       16 (3.5)
  Actinomycetales             10 (2.2)      24 (0.8)            0 (0)         0 (0)
  Bifidobacteriales           17 (3.7)      398 (12.4)          5 (3.8)       11 (2.4)
  Coriobacteriales            38 (8.4)      429 (13.4)          3 (2.3)       5 (1.1)
Phylum Bacteroidetes          37 (8.1)      99 (3.1)            8 (6.1)       13 (2.8)
Phylum Proteobacteria         24 (5.3)      42 (1.3)            1 (0.8)       1 (0.2)
  Alphaproteobacteria         3 (0.7)       6 (0.2)             0 (0)         0 (0)
  Betaproteobacteria          9 (2.0)       16 (0.5)            0 (0)         0 (0)
  Deltaproteobacteria         5 (1.1)       11 (0.3)            0 (0)         0 (0)
  Gammaproteobacteria         7 (1.5)       9 (0.3)             1 (0.8)       1 (0.2)
Other phylaa                  6 (1.3)       17 (0.5)            1 (0.8)       1 (0.2)
Sum                           455           3199                131           459

a. Affiliation with Acidobacteria, Cyanobacteria, TM7 and Verrucomicrobia

Comparison of individual libraries

The Shared OTUs and Similarity (SONS) program [24] was used to compare the unfractioned sample with each of the %G+C fractions and with the combined sequence data from the fractions (Table 3). Using a 98% similarity criterion for the phylotypes, at least 80% of sequences from %G+C fractions 30–35 and 35–40 were shared with the unfractioned sample (V_obs values). However, for two of the high %G+C content fractions with %G+C content from 55 to 65, the V_obs values were considerably lower (32–33%). When comparing the combined sequence data from the fractioned sample with the unfractioned sample, a higher percentage of sequences and OTUs in the unfractioned sample were shared.

Table 3: Results from library comparisons with SONS [24]. Library A is the unfractioned library in every comparison.

Library B          U_obs a   V_obs b   A_otu_shared c   B_otu_shared d
Fr G+C 25–30%      0.41      0.40      0.22             0.34
Fr G+C 30–35%      0.59      0.83      0.40             0.56
Fr G+C 35–40%      0.67      0.82      0.44             0.64
Fr G+C 40–45%      0.72      0.75      0.45             0.51
Fr G+C 45–50%      0.62      0.63      0.33             0.40
Fr G+C 50–55%      0.34      0.64      0.20             0.40
Fr G+C 55–60%      0.18      0.33      0.13             0.34
Fr G+C 60–65%      0.44      0.32      0.17             0.36
Fr G+C 65–70%      0.68      0.53      0.39             0.39
Fr G+C 70–75%      0.69      0.67      0.42             0.47
Fr G+C 25–75%e     0.92      0.60      0.81             0.26

a. Fraction of sequences observed in shared OTUs in library A
b. Fraction of sequences observed in shared OTUs in library B
c. Fraction of shared OTUs in library A
d. Fraction of shared OTUs in library B
e. The combined G+C fractions

Shannon entropies of clone libraries of the %G+C profiled sample

The %G+C fractions 50–55 and 55–60 had comparatively low Shannon entropies (Additional file 2), indicating lower diversity, and were abundant with bifidobacteria (Figure 2, Additional file 1). The peripheral %G+C fractions and the %G+C fraction 45–50, with sequences affiliating mainly with Clostridium clusters IV and XIV, had comparatively higher diversity according to Shannon entropies. The peripheral fraction from the low %G+C end (25–30% G+C content) contained a substantial proportion of Firmicutes that do not belong to the Clostridium clusters IV and XIV. It had the highest Shannon entropy (Additional file 2), indicating rich diversity, and did not reach a plateau in the rarefaction curves (data not shown), which means that more OTUs would have been likely to appear after further sequencing.

Discussion

For a comprehensive evaluation of the human intestinal microbiota, 16S rRNA gene clone libraries were constructed from a %G+C fractioned pooled faecal DNA sample of 23 healthy subjects, followed by a sequence analysis of 3199 clones. Previously, only selected fractions of such profiles have been sequenced and analysed. For methodological comparison, a 16S rRNA gene library of unfractioned DNA from 22 individuals representing the same subject group was also constructed. The %G+C fractioning prior to cloning and sequencing enhanced the recovery of sequences affiliating with high G+C Gram-positive bacteria, namely the phylum Actinobacteria, proportionally over sevenfold compared with cloning and sequencing of an unfractioned sample.

A high amount of actinobacterial sequences recovered

If the proportional amount of DNA in each fraction is taken into account in estimating the abundance of phyla, 28.5% of the sequences would affiliate with Actinobacteria. Since the %G+C profile fractions represent individual cloning and sequencing experiments, in which an equal number of clones was sequenced despite the different proportional amounts of DNA within the fractions, quantitative conclusions should be drawn carefully. However, %G+C fractions 50–70 were dominated by Actinobacteria, comprising 41% of the total DNA in the original sample fractioned (Figures 1 and 2, Additional file 1). The %G+C fractions 30–50 yield a phylotype distribution similar to that of the unfractioned library (Figure 2). These fractions, accounting for 54% of the profiled DNA, are dominated by the Firmicutes (Clostridium clusters XIV and IV) (Figures 1 and 2).

The relatively high proportion of actinobacterial sequences (26.6%) and phylotypes (65) identified in the combined sequence data of the %G+C fractioned sample exceeds all previous estimations. In a metagenomic study by Gill and colleagues [14], 20.5% of 132 16S rRNA sequences from random shotgun assemblies affiliated with 10 phylotypes of Actinobacteria, whereas no Bacteroidetes was detected. In accordance with our results, in a pyrosequencing study by Andersson and colleagues [16], the Actinobacteria (14.6%), dominated by a few phylotypes, outnumbered Bacteroidetes (2.5%). By contrast, in most of the earlier published studies on human faecal samples applying 16S rRNA gene amplification, cloning and sequencing, the relative amount of Actinobacteria has been 0–6% of the detected intestinal microbiota [12,25-33]. Thus, the proportion of sequences affiliating with Actinobacteria (3.5%) in the unfractioned sample analysed in this study is comparable with previous estimations applying conventional 16S rRNA cloning and sequencing without %G+C fractioning.

Order Coriobacteriales abundant within Actinobacteria

We observed that several clones in the high %G+C fractions (60–70% G+C content) were difficult to sequence due to extremely G+C rich regions. These clones turned out to be members of the order Coriobacteriales, which have been rare or absent in earlier 16S rRNA gene-based clone libraries of the intestinal microbiota. Over half of the actinobacterial OTUs in our study belonged to the order Coriobacteriales. Harmsen et al. [34] earlier suggested that applications based on 16S rRNA gene cloning as well as other methods of molecular biology may overlook the presence of the family Coriobacteriaceae in the human GI tract, and they designed a group-specific probe for Atopobium (Ato291), covering most of the Coriobacteriaceae, the Coriobacterium group. Using Ato291, the abundance of detected intestinal cells in fluorescence in situ hybridization (FISH) is up to 6.3% [6,7,35,36]. Recently, Khachatryan and colleagues [8] did not detect any Actinobacteria from the 16S rRNA gene clone libraries of healthy subjects, but the abundance with FISH using Ato291 was 7%. The authors suggested that constant underestimation of the high G+C Gram-positive bacteria might lead to misunderstanding their role in the healthy and diseased gut.

There are some data suggesting that the members of Coriobacteriaceae may be indicators of a healthy GI microbiota. Subjects with a low risk of colon cancer have been observed to have a higher incidence of Collinsella aerofaciens than subjects with a high risk of colon cancer [37]. Furthermore, when faecal 16S rRNA gene sequences from metagenomic libraries of Crohn's diseased and healthy subjects were compared, the Atopobium group was more prevalent and the groups designated "other Actinobacteria" were exclusively detected in healthy subjects' samples [11]. A lower abundance of a C. aerofaciens-like phylotype within the Atopobium group has been associated with IBS subjects' samples [21]. A diminished amount of Atopobium group bacteria is also associated with patients with Mediterranean fever [8]. On the other hand, an increased amount of Actinobacteria has recently been associated with the faecal microbiota of obese subjects [32]. This indicates that more detailed data are required to judge the role of Actinobacteria in health and disease.

Methodological observations

When the %G+C gradient is disassembled, the fractions with the highest G+C content are collected last, making them most susceptible to turbulence. This phenomenon, together with possible remnants of DNA from previously collected fractions, could have caused the bias of a decrease in high G+C Actinobacteria and an increase in low G+C Firmicutes observed in fractions %G+C 65–75. These fractions, however, comprise only 5.5% of the total DNA, making the observed bias less important. Regarding faecal DNA extraction, the method used here was rather rigorous, allowing efficient DNA isolation also from more enduring Gram-positive bacteria. This might lower the relative amount of DNA from more easily lysed Gram-negative bacteria and thus explain the comparatively low amount of Bacteroides in both of the samples. Moreover, the relative share of the phylum Bacteroidetes may be affected by the delay and temperature of freezing. In a real-time PCR study, a decrease of 50% in the Bacteroides group was observed in faecal sample aliquots frozen at -70°C within 4 h compared to samples that were immediately snap-frozen in liquid nitrogen (Salonen et al., personal communication). In our study, the samples were transported within 4 h of the defecation and stored at -70°C.

Abundance of Actinobacteria in the faeces of Scandinavian (Finnish and Swedish) subjects has been discovered independent of the methodology; the techniques used include %G+C profiling and 16S rRNA gene cloning (this study), FISH coupled with flow cytometry [7] and pyrosequencing [16]. These findings may suggest the existence of demographic similarities among Scandinavians, which could be caused by environmental or genetic factors and which are not obscured by methodological bias of the DNA extraction, primers and PCR conditions used.

Conclusion

The results further confirm that %G+C fractioning is an efficient method prior to PCR amplification, cloning and sequencing to obtain a more detailed understanding of the diversity of complex microbial communities, especially within the high genomic %G+C content region. This is demonstrated by the proportionally greater amount of OTUs and sequences affiliating with the high G+C Gram-positive phylum Actinobacteria in the 16S rRNA gene clone libraries originating from a %G+C-profiled and -fractioned faecal microbial genomic DNA sample compared with a sample cloned and sequenced without prior %G+C profiling. The clone content obtained from the unfractioned library is in accordance with many previous clone library analyses and thus suggests that the potential underestimation of high G+C Gram-positive bacteria has hidden the importance of these bacteria in a healthy gut. The phylum Actinobacteria was the second most abundant phylum detected in the %G+C fractioned sample, consisting mainly of sequences affiliating with Coriobacteriaceae.

Methods

Study subjects

The faecal samples were collected from 23 healthy donors (females n = 16, males n = 7), with an average age of 45 (range 26–64) years, who served as controls for IBS studies [21,38-40]. Exclusion criteria for study subjects were pregnancy, lactation, organic GI disease, severe systemic disease, major or complicated abdominal surgery, severe endometriosis, dementia, regular GI symptoms, antimicrobial therapy during the last two months, lactose intolerance and celiac disease. All participants gave their written informed consent and were permitted to withdraw from the study at any time.

Faecal DNA samples

Faecal samples were immediately stored in anaerobic conditions after defecation, aliquoted after homogenization and stored within 4 h of delivery at -70°C. The bacterial genomic DNA from 1 g of faecal material was isolated according to the protocol of Apajalahti and colleagues [41]. Briefly, undigested particles were removed from the faecal material by three rounds of low-speed centrifugation and bacterial cells were collected with high-speed centrifugation. The samples were then subjected to five freeze-thaw cycles, and the bacterial cells were lysed by enzymatic (lysozyme and proteinase K) and mechanical (vortexing with glass beads) means. Following cell lysis, the DNA was extracted and precipitated.

Percent guanine plus cytosine fractioning and purification of fractions

The faecal microbial DNA of 23 healthy individuals was pooled, and genomic DNA fractions were separated with 5% intervals on the basis of %G+C content using caesium chloride-bisbenzimidazole gradient analysis described in previous studies [21,41]. The gradient was disassembled into %G+C fractions with 5 G+C% intervals using perfluorocarbon (fluorinert) as a piston. In the procedure, the highest %G+C fraction is collected last, exposing it to the most turbulence. The DNA quantification during the dismantlement was based on A280, as described by Apajalahti and colleagues [41], to avoid background. The DNA fractions were desalted with PD-10 columns according to the manufacturer's instructions (Amersham Biosciences, Uppsala, Sweden). For the unfractioned DNA sample, faecal microbial DNA of the same healthy individuals was pooled (n = 22; there was an insufficient amount of faecal DNA left for one of the individuals).

Amplification of the 16S rRNA genes, cloning and sequencing

The 16S rRNA gene from each of the seven DNA fractions was amplified, cloned and sequenced, as in the study by Kassinen and colleagues [21]. To maximize the recovery of different phylotypes, two universal primer pairs were used independently for all samples. The first primer pair corresponded to Escherichia coli 16S rRNA gene positions 8–27 and 1492–1512, with sequences 5'-AGAGTTTGATCCTGGCTCAG-3' [42] and 5'-ACGGCTACCTTGTTACGACTT-3' [43], respectively. The second primer pair corresponded to E. coli 16S rRNA gene positions 7–27 and 1522–1541, with sequences 5'-GAGAGTTTGATYCTGGCTCAG-3' and 5'-AAGGAGGTGATCCARCCGCA-3' [44], respectively. The 50-μl PCR reactions contained 1 × DyNAzyme™ Buffer (Finnzymes, Espoo, Finland), 0.2 mM of each dNTP, 50 pmol of primers, 1 U of DyNAzyme™ II DNA Polymerase (Finnzymes, Espoo, Finland), 0.125 U of Pfu DNA polymerase (Fermentas, Vilnius, Lithuania) and 10 μl of desalted fractioned DNA template (containing less than 2 ng/μl of DNA) or pooled extracted DNA from the faecal samples. The thermocycling conditions consisted of 3 min at 95°C, followed by a variable number of cycles of 30 s at 95°C, 30 s at 50°C, 2 min at 72°C and a final extension of 10 min at 72°C. The number of PCR cycles used for each fraction was optimized to the minimum number of cycles which resulted in a visually detectable band of the PCR product on ethidium bromide stained agarose gel. A protocol of 27, 20, 25 and 30 cycles was applied to %G+C fractions 25–30, 30–60, 60–65 and 65–75, respectively. The 16S rRNA gene from the unfractioned pooled faecal DNA sample was amplified using 20 PCR cycles. The amplifications were performed using 15 reactions, and the products were pooled, concentrated using ethanol precipitation, and eluted with 50 μl of deionized MilliQ water (Millipore, Billerica, MA, USA).

The precipitated PCR products were purified with the QIAquick PCR Purification Kit (Qiagen, Hilden, Germany), or using the QIAquick Gel Extraction Kit (Qiagen, Hilden, Germany) after excising from 1.25% SeaPlaque agarose (Cambrex, East Rutherford, NJ, USA), and eluted in 35 μl of elution buffer. The concentration of the purified amplicons was estimated with serially diluted samples on 0.8% agarose gels with ethidium bromide staining. To enhance the cloning efficiency, adenine overhangs were added to the amplicons as follows: the two purified inserts were mixed in a 1:1 molecular ratio (the reaction mixture thus contained 10–30 ng/μl DNA) and incubated in a volume of 20 μl with 1 × DyNAzyme™ Buffer (Finnzymes, Espoo, Finland), 0.2 mM dNTPs and 0.4 U of DyNAzyme™ II DNA Polymerase (Finnzymes, Espoo, Finland) for 40 min at 72°C. The cloning was performed with the QIAGEN® PCR Cloning plus Kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions. For the ligation reaction, 2 μl of the reaction mixture used for adding adenine overhangs to the amplicons was used as an insert. The ligation reaction was incubated overnight at 4°C. The plasmids were isolated and purified from the E. coli culture using MultiScreenHTS (Millipore, Billerica, MA, USA), and aliquots were stored at -80°C.

The cloned inserts were amplified from the pDrive plasmids using M13 forward 5'-GTAAAACGACGGCCAGT-3' and M13 reverse primers 5'-AACAGCTATGACCATG-3', visualized on a 1% agarose gel, stained with ethidium bromide and purified using a MultiScreen PCR384 Filter Plate (Millipore, Billerica, MA, USA). Sequencing of the 5'-end of 16S rDNA clones was performed with primer pD' 5'-GTATTACCGCGGCTGCTG-3' corresponding to the E. coli 16S rRNA gene position 536-518 [45]. Near full-length sequencing was performed on one representative of each OTU showing less than 95% similarity to any EMBL nucleotide sequence database entry. For this purpose, primers pF' 5'-ACGAGCTGACGACAGCCATG-3' [45] and pE 5'-AAACTCAAAGGAATTGACGG-3' [46], corresponding to E. coli 16S rRNA gene positions 1073-1053 and 908–928, respectively, were used. Sequencing of the products was performed with the BigDye terminator cycle sequencing kit (Applied Biosystems, Foster City, CA, USA). For templates that failed to be sequenced due to high G+C content, 1% (v/v) of dimethyl sulfoxide was added to the reaction mixture. The sequencing products were cleaned with Montage SEQ96 plates (Millipore, Billerica, MA, USA) and run with an ABI 3700 Capillary DNA Sequencer (Applied Biosystems, Foster City, CA, USA).

Sequence analysis and alignment

Sequences were checked manually utilizing the Staden Package pregap4 version 1.5 and gap v4.10 assembly programs [47], and primer sequences were removed. Sequences that occurred in more than one clone library were considered non-chimeric. Potential chimeras were also searched for by manually browsing the ClustalW 1.83 sequence alignment [48] with BioEdit version 7.0.5.3 [49] and, for the near full-length sequences, using Ribosomal Database Project II Chimera Check [50]. Sequences from %G+C fractions 25–30, 40–45 and 55–60 with accession numbers AM275396-AM276371 [21] were added prior to further analyses. Sequences of all fractions and the unfractioned sample were aligned separately with ClustalW 1.83 [48] using the FAST DNA pair-wise alignment algorithm option (Gap penalty 3, Word size 4, Number of top diagonals 1 and Window size 1) and cut from E. coli position 430 (totally conserved GTAAA) with BioEdit version 7.0.5.3 [49]. The lengths of the alignments of the fractioned sample and the unfractioned sample were 478 and 457 base pairs, respectively. The 16S rRNA variable regions V1 and V2 were included in the alignments. The variable regions V1 and V2 have been demonstrated to be sufficient to reflect the diversity of a human GI clone library [51]. The alignments were visually inspected, but they were not edited manually to avoid subjectivity and to maintain reproducibility of the alignments. From the cut alignments, distance matrices were created with Phylip 3.66 Dnadist [52] using Jukes-Cantor correction.

ales and Actinomycetales, sequences with nearest FASTA EMBL Prokaryote search (all >98% similarity), and for Coriobacteriales sequences with nearest FASTA EMBL prokaryote and environmental database searches (>85% and >91%, respectively), were selected and aligned together with OTU representative sequences.
Sequences from the European ribosomal RNA database representing Determination of OTUs and library coverage Actinobacteria and Clostridium leptum (AF262239) were The sequences were assigned into OTUs according to the used as a reference in the profile alignment (Additional distance matrices using DOTUR [53], applying the fur- file 4). The alignment, distance matrix, and visualizing thest neighbour rule option in which all sequences within was done as described above. A bootstrap analysis of hun- an OTU fulfil the similarity criterion with all the other dred replicates was performed using seqboot and con- sequences within the OTU. The 98% cut-off for sequence sense programs of Phylip 3.66 [52]. similarity was used to delimit an OTU. The coverage of the clone libraries was calculated with the formula of Good To describe whether the phylogenies of the combined [23] to evaluate the adequacy of amount of sequencing. sequence data from the fractioned libraries and the The Fasta EMBL Environmental and EMBL Prokaryote unfractioned library were significantly different, the Uni- database searches [54] and Ribosomal Database Project II Frac Significance analysis was applied for each pair of (RDP II) Classifier Tool [55] were used to affiliate phylo- environments using abundance weights [58]. The UniFrac types. Lineage-specific analysis was used to break the tree up into the lineages at a specified distance from the root, and to Phylogenetic analysis test whether any particular group differed between the For the phylogenetic analysis, all sequences from the sample libraries [58]. The phylogenetic tree for the analy- %G+C fractioned sample and the unfractioned sample ses was constructed from OTU representative sequences were aligned and designated into OTUs with a 98% cut- determined separately for the combined fractioned librar- off as described above. 
A representative sequence of each ies and for the unfractioned library as described above, OTU and unaligned reference sequences representing dif- with the exception that in the profile alignment a root ferent clostridial groups (Additional file 3) were aligned sequence (Methanobrevibacter smithii AF054208) was with ClustalW 1.83 using the SLOW DNA alignment algo- added and left to the alignment. rithm option (Gap penalty 3, Word size 1, Number of top diagonals 5 and Window size 5) and cut from the E. coli Comparison of individual libraries using SONS position 430 (totally conserved GTAAA) with BioEdit ver- The microbial community composition differences sion 7.0.5.3[49]. For a profile alignment, 16S rRNA refer- between libraries of individual %G+C profile fractions ence sequences, aligned according to their secondary and the unfractioned sample were analysed using SONS structure, were selected from the European ribosomal [24], which calculates the fraction of sequences observed RNA database [56] (Additional file 4) so that they would in shared OTUs in each library (U and V ) and the obs obs represent the overall diversity of the faecal microbiota, observed fraction of shared OTUs in each library including the most common clostridial 16S rRNA groups (A and B ). For the SONS analyses, an otu_shared otu_shared expected, and sequences closely related to the OTUs com- alignment with all of the sequences from the clone librar- posed of over 20 sequences. The sequences in this study ies of the fractioned sample and the unfractioned sample were profile-aligned against the European ribosomal RNA was created, and a distance matrix was calculated as database secondary structure-aligned sequences using described above in the Sequence analysis and alignment sec- ClustalW 1.83 profile alignment mode and the SLOW tion. DNA alignment algorithm option (Gap penalty 3, Word size 1, Number of top diagonals 5 and Window size 5). 
Shannon entropies of clone libraries of the %G+C profiled The reference sequences were then deleted from the align- sample ment with BioEdit version 7.0.5.3 [49], and the alignment To compare the diversity of the clone libraries derived was cut at the E. coli position 430 (totally conserved from the fractioned sample, OTUs were also determined GTAAA). A phylogenetic tree with a representative using a Bayesian clustering method [59], followed by the sequence from each OTU was generated with a neighbour- estimation of Shannon entropies with a standard Baye- joining algorithm from a Jukes-Cantor-corrected distance sian multinomial-Dirichlet model. In the estimation, 100 matrix using Phylip 3.66 dnadist and neighbour [52]. The 000 Monte Carlo samples were used for each library under tree was visualized with MEGA4 [57]. a uniform Dirichlet prior [60]. The Shannon entropy value correlates with the amount and evenness of clusters A phylogenetic tree was constructed for the OTU repre- or phylotypes in a community sample, but disregards the sentatives of the phylum Actinobacteria. For Bifidobacteri- disparity between them [61]. The Bayesian clustering Page 10 of 13 (page number not for citation purposes)
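As an illustration of the entropy estimation described in this section, the following minimal Python sketch draws OTU proportions from the Dirichlet posterior implied by a uniform Dirichlet prior and multinomial counts, and averages the Shannon entropy over the draws. The function names, the example counts, the use of natural logarithms and the restriction to observed OTUs are assumptions made for illustration. It is not the authors' implementation, and it defaults to far fewer Monte Carlo samples than the 100 000 used in the study.

```python
import math
import random


def shannon_entropy(probs):
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def sample_dirichlet(alphas, rng):
    """Draw one Dirichlet sample by normalising independent gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def posterior_mean_entropy(otu_counts, n_samples=1000, seed=0):
    """Monte Carlo estimate of the posterior mean Shannon entropy.

    Assumption: a uniform Dirichlet prior (all parameters equal to 1), so the
    posterior over OTU proportions is Dirichlet(count + 1) for each OTU.
    """
    rng = random.Random(seed)
    alphas = [count + 1.0 for count in otu_counts]
    total = 0.0
    for _ in range(n_samples):
        total += shannon_entropy(sample_dirichlet(alphas, rng))
    return total / n_samples


if __name__ == "__main__":
    # Hypothetical OTU sizes for two toy libraries (not data from the study).
    even_library = [25, 25, 25, 25]
    skewed_library = [97, 1, 1, 1]
    print(posterior_mean_entropy(even_library))
    print(posterior_mean_entropy(skewed_library))
```

In this sketch an even library yields a higher estimated entropy than a library dominated by one OTU, which matches the statement that the entropy reflects both the number and the evenness of phylotypes.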
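Two quantities used in the Methods above can be written out explicitly. The Jukes-Cantor correction converts an observed proportion of differing sites p into a distance d = -(3/4) ln(1 - 4p/3), and Good's coverage is commonly computed as C = 1 - n1/N, where n1 is the number of OTUs containing a single sequence and N is the total number of sequences. The Python sketch below is a hedged illustration only: the function names and toy inputs are hypothetical, and skipping gap columns is an assumption, not necessarily how Phylip Dnadist treats them.

```python
import math


def jukes_cantor_distance(seq_a, seq_b):
    """Jukes-Cantor distance between two aligned sequences of equal length.

    Assumption: columns containing a gap ('-') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no comparable positions")
    p = sum(1 for a, b in pairs if a != b) / len(pairs)
    if p >= 0.75:
        return math.inf  # the correction is undefined at saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def goods_coverage(otu_sizes):
    """Good's coverage: 1 - (number of singleton OTUs / total sequences)."""
    total = sum(otu_sizes)
    if total == 0:
        raise ValueError("library is empty")
    singletons = sum(1 for size in otu_sizes if size == 1)
    return 1.0 - singletons / total


if __name__ == "__main__":
    # Toy inputs, not data from the study.
    print(jukes_cantor_distance("ACGTACGT", "ACGTACGA"))
    print(goods_coverage([5, 3, 1, 1]))
```

Under a 98% similarity cut-off, two sequences would fall within the same OTU criterion when their corrected distance is at most 0.02.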