Time‐series metatranscriptomes reveal conserved patterns between phototrophic and heterotrophic microbes in diverse freshwater systems

Microbial communities form the base of food webs in freshwater ecosystems, yet the interactions within these diverse assemblages are poorly understood. Based on evidence showing that primary production and respiration follow diurnal trends in lakes, we hypothesized that gene expression in freshwater microbes would have similar diel cycles. We used three 2‐d time series of metatranscriptomes to test this hypothesis in a eutrophic lake, an oligotrophic lake, and a humic lake. We identified prominent diel cycles in all three lakes, particularly in genes related to photosynthesis, sugar transport, and carbon fixation. The maximal time of expression for sugar transport genes tended to trail that of photosynthesis genes by several hours, indicating possible metabolic exchange between co‐occurring microbial lineages. These results provide an initial step in understanding sophisticated multispecies transcriptional organization within freshwater microbial communities.

Many of the core ecosystem functions in freshwater lakes are driven by microbial communities. While the impact of each cell is miniscule, their collective actions form a dynamic, interconnected community whose emergent functions are visible on the ecosystem-level (Sunagawa et al. 2015;Goldford et al. 2018). Previous research in a wide range of freshwater ecosystems indicates that diurnal cycles drive photosynthesis, respiration, and dissolved organic matter (DOM) concentrations (Kaplan and Bott 1989;Bertilsson and Jones 2003;Solomon et al. 2013). This implies metabolic interactions between the phototrophic (photosynthetic, also known as phytoplankton) and heterotrophic (nonphotosynthetic, also known as bacterioplankton) microbial communities. We hypothesized that these diel trends would be reflected in gene expression. Therefore, we investigated the mechanisms of specific community interactions by studying the timing of gene expression across the community. To this end, we produced three 2-d time series of metatranscriptomes from three lakes with contrasting biogeochemistry. We hypothesized that diel trends in gene expression occur in heterotrophs as well as phototrophs due to the direct impacts of sunlight (such as photosynthesis and reactive oxygen species [ROS]) and its indirect effects (such as metabolite exchange), regardless of the features that distinguish individual lakes.
Previous metatranscriptomic work in marine and freshwater systems has highlighted multispecies diel trends that may underpin metabolic links between phototrophic and heterotrophic microbes. One metatranscriptomic study in a phosphorus-limited mountain lake found differential gene expression between day and night in both phototrophs and heterotrophs, particularly in energy acquisition pathways and pyrophosphatase (Vila-Costa et al. 2013). Another study in marine systems also observed enhanced expression of energy acquisition pathways during the day and higher expression of biosynthesis and housekeeping pathways at night (Poretsky et al. 2009). Strong diel patterns in phototrophic gene expression followed by a cascade of heterotrophic gene expression have also been observed in marine systems (Ottesen et al. 2014). Furthermore, patterns in transcriptional networks of gene expression were consistent in two different regions of the Pacific Ocean, potentially indicating that linkages between phototrophs and heterotrophs are a generalizable feature of marine microbial communities (Aylward et al. 2015). These studies suggest that diel transcriptional trends may be a universal characteristic of both phototrophs and heterotrophs in aquatic microbial communities.
Other methods also suggest strong connections between phototrophs and heterotrophs in aquatic ecosystems. Cocultures of phototrophic algae and heterotrophic bacteria are often stable over time, indicating mutualistic interactions, although competition or predation has also been described in the laboratory (Cole 1982;Posch et al. 1999;Pernthaler et al. 2001). In both marine and freshwater systems, the composition of phototrophic and heterotrophic communities is inextricably linked (Verity et al. 1999;Teeling et al. 2012;Paver et al. 2013Paver et al. , 2015. Perturbations in one portion of these communities have been shown to quickly ripple through the rest (Šimek et al. 2002;Kent et al. 2006;Sjöstedt et al. 2012). One potential mechanism that can explain these trends is DOM exuded by phototrophs (Moran et al. 2016). This DOM is to a large extent composed of low-molecular weight compounds, such as sugars, amino acids, organosulfur compounds, carboxylic acids, and alditols (Hellebust 1965;Maršálek and Rojí cková 1996;Fiore et al. 2015). Approximately 20% of photosynthetic carbon is released extracellularly in marine and freshwater systems (Bertilsson and Jones 2003). Although factors causing DOM release are not fully understood, this DOM likely supports a substantial portion of the heterotrophic community. In fact, some ubiquitous freshwater bacteria, such as Limnohabitans, appear to specialize in algal-derived DOM (Simek et al. 2011).
Exposure to solar radiation may be a major factor driving gene expression in aquatic ecosystems. Beyond the canonical oxygenic photosynthesis, the presence of opsins, extensively documented in both freshwater and marine heterotrophs, may also lead to cycles of diel gene expression in microbes (Atamna-Ismaeel et al. 2008;Pinhassi et al. 2016). Even without opsins, some freshwater microbes such as Actinobacteria may sense light in order to optimally time uptake and catabolism of organic substrates (Maresca et al. 2019). Photodegradation of complex DOM into more labile forms is another potential mechanism that could drive diel trends in heterotrophs (Jorgenson et al. 1998;Bertilsson and Tranvik 2000). Sunlight also causes oxidative stress, and heterotrophs may time their metabolisms to minimize the impact of such harmful conditions (Sommaruga et al. 1997). Additionally, photoheterotrophic microbes can also exist outside the categories of phototroph and heterotroph, such as those that perform aerobic anoxygenic photosynthesis (AAP). These microbes use sunlight for energy, but do not fix carbon (Eiler 2006;Martinez-Garcia et al. 2012).
To identify generalizable diel interactions in freshwater microbial communities, we sequenced metatranscriptomes from the epilimnia of three freshwater lakes representing oligotrophic, eutrophic, and dystrophic (humic) lake types. These metatranscriptomes form a 2-d time series for each lake, with samples collected every 4 h. We additionally sequenced metagenomes and single-cell amplified genomes (SAGs) from each lake to generate highly specific references, allowing us to obtain higher quality annotations and classifications than possible through read-based annotations. We observed diel trends in both phototrophs and heterotrophs and were able to propose biotic and abiotic mechanisms for these trends based on annotations of expressed genes. Although different taxa and genes were expressed in the three lakes studied, we identified diel trends in all sites, particularly in genes related to photosynthesis and sugar transport, suggesting that common transcriptional patterns exist in these disparate systems.

Study design and in situ measurements
Three lakes in Wisconsin, U.S.A. were chosen for this study based on their different trophic status: oligotrophic (Sparkling Lake), eutrophic (Lake Mendota), and humic (Trout Bog Lake) ( Table 1). Lake Mendota is located in Madison, WI, U.S.A., while Trout Bog and Sparkling Lake are located in Boulder Junction, WI, U.S.A., approximately 350 km north of Madison. These lakes were chosen because they are core sites of the North Temperate Lakes-Long Term Ecological Research (NTL-LTER) program. Therefore, they have a rich context of historical environmental data, and automated sensor platforms were deployed at all three sites at the time of sampling. Previous microbial studies have been performed in all three lakes, providing reference genomes specific to each site (Ghylin et al. 2014;Bendall et al. 2016;Linz et al. 2018).
Hereafter, we provide brief summaries of our methods; full protocols are available in Supporting Information Document S1. The epilimnion (top thermal layer) of each lake was sampled 12 times at 4-h intervals in July 2016. We used an instrumented sonde (Hydrolab DS5X, OTT Hydromet, Kempten, Germany) equipped with sensors for temperature, dissolved oxygen concentrations, pH, conductivity, and turbidity to collect measurements of the epilimnion. Photosynthetically active radiation (PAR) was also measured at this time using a PAR meter (Li-Cor, Lincoln, NE, U.S.A.). Secchi depth was measured once per lake during the time series.
At each timepoint, we collected an integrated water sample of the epilimnion. The sampling depth was chosen based on the location of the thermocline on the day prior to initiation of the 2-d time series in each lake. To collect RNA, water from the integrated epilimnion sample was pumped through four 0.22-μm polyethylene filters (Pall, Port Washington, NY, U.S.A.). Filters were flash-frozen in liquid nitrogen in the field and stored at −80 C until extraction and sequencing. Additional samples were collected for metagenomic sequencing, single cell sequencing, total and dissolved nitrogen and phosphorus concentrations, chlorophyll concentrations, and bacterial production assays using 14 C-leucine (Chin-Leo and Kirchman 1988).

RNA extraction
Samples were lysed with ethylenediaminetetraacetic acid (EDTA) and sodium dodecyl sulfate (SDS) and incubated at 65 C, then subjected to bead-beating (FastDNA Spin Kit for Soil, MP Biomedicals, Santa Ana, CA, U.S.A.) with TRIzol (Thermo-Fisher, Waltham, MA, U.S.A.). An internal standard-an in vitro transcription of the cloning plasmid pFN18A-was added to samples after beadbeating (Satinsky et al. 2013). Phenol:chloroform was used to isolate RNA from the lysate. Purified RNA was precipitated in ethanol, pelleted, and resuspended in nuclease-free water. The RNA was further purified using an RNeasy kit (Qiagen, Hilden, Germany) with an on-column DNAse digestion.

Additional lab-based measurements
Chlorophyll was extracted with methanol from frozen filters and subsequently acidified to measure pheophytin. Total and dissolved nitrogen and phosphorus were measured with a colorimetric autoanalyzer. DNA was extracted using a phenol: chloroform protocol. Four additional DNA samples collected from Sparkling Lake in a similar manner in 2009 used as additional references for this lake.

Reference genomes
Single amplified genomes (SAGs) were generated following the Department of Energy Joint Genome Institute's (JGI) standard protocol (Rinke et al. 2014). Briefly, individual cells were sorted using an Influx flow cytometer (BD Biosciences) and treated with Ready-Lyse lysozyme (Epicenter; 5 U μL −1 final concentration) for 15 min at room temperature. Next, cell lysis and whole-genome amplification were performed with the REPLI-g Single Cell Kit (Qiagen) in 2 μL reactions. Lysis and stop reagents from the REPLI-g kit received UV treatment to remove potential DNA contamination (Woyke et al. 2011). Cells for SAG sequencing were chosen with a preference for Sparkling Lake, the least well-represented lake in our preexisting reference genome collection. An Illumina shotgun library was constructed from each single cell and sequenced on the Illumina NextSeq platform (Illumina, San Diego, CA, U.S.A.). Sequencing reads were filtered using BBTools (Bushnell et al. 2014) and assembled into SAGs using SPAdes (Bankevich et al. 2012).
Metagenomes were prepared for sequencing using the KAPA-Illumina library creation kit (KAPA Biosystems). Table 1. Comparison of Sparkling Lake, Lake Mendota, and Trout Bog. These three lakes were chosen for comparative metatranscriptomics because of their varying trophic statuses, extensive historical data, and previous microbial sampling. Data on surface area, maximum depth, dissolved organic carbon, and development on shoreline courtesy of NTL-LTER (https://lter.limnology.wisc.edu). Temperature, dissolved oxygen, pH, and conductivity were measured using a HydroLab DS5x Sonde and are averaged over all sampling depths and timepoints for each lake. Chlorophyll and pheophytin concentrations were measured from the integrated epilimnion samples using a methanol extraction protocol and averaged over all timepoints. Secchi depth was measured at the first timepoint for each lake. Bacterial production was quantified via 14 C-leucine incorporation and averaged over all timepoints. Total and dissolved nitrogen and phosphorus concentrations were measured via colorimetric HPLC; concentrations are within the typical ranges of these lakes. Due to thunderstorms during the night of July 8 th , the final 01:00 h timepoint in Sparkling Lake was collected on July 9 th instead.

Lake Mendota
Trout Bog Sparkling Lake Metagenomes were sequenced on the Illumina HiSeq platform utilizing a TruSeq paired-end cluster kit (Illumina, San Diego, CA, U.S.A.), producing paired ends of 150 bp (2 × 150). Quality filtering was performed on the resulting reads before assembly. BBDuk adapter trimming was used to remove known Illumina adapters (Bushnell et al. 2014). Read ends were trimmed where quality values were < 12. Read pairs containing more than three "N," or with quality scores (before trimming) averaging < 3 over the read, or length under 51 bp after trimming were discarded. Filtered reads were assembled using MegaHit (Li et al. 2016) with a range of kmers (-k-list 23, 43, 63, 83, 103, 123) and otherwise default settings. Assembled metagenomic contigs, newly sequenced SAGs, genomes from previous McMahon Lab time series sequencing on these lakes (Martinez- Garcia et al. 2012;Ghylin et al. 2014;Garcia et al. 2018;Linz et al. 2018), and freshwater algal genomes from NCBI RefSeq (Pruitt and Maglott 2001) were used to build a nonredundant, highly specific database for subsequent mapping of metatranscriptomic reads (Supporting Information Table S1). After formatting each type of genome or contig's fastq and gff files, coding regions were extracted and clustered at 97% identity using CD-HIT (Huang et al. 2010). The longest gene in each cluster (used as the seed sequence to generate each cluster) was chosen as the representative sequence and used as the mapping reference.
Individual metagenome assemblies were binned using MaxBin version 2.2.4 (Wu et al. 2014) and checked for completeness and contamination using CheckM version 1.0.10 (Parks et al. 2015). Many bins were incomplete, and many contigs were too short to classify using the phylogeny of single-copy genes; therefore, we aggregated the taxonomic assignments of coding regions in each contig/bin to classify these sequences. Coding regions were first classified based on their best hit in the Integrated Microbial Genomes database (IMG, accessed January 2017) (Markowitz et al. 2012). Bins and unbinned contigs from the metagenome assemblies were classified by taking the consensus taxonomy of the best hit for each coding region on a contig/bin using inhouse McMahon Lab scripts. If contigs or bins were too short to classify or had conflicting coding region classifications, we assigned no classification. In these cases, each coding region retained the IMG-derived classification of its best hit.

Metatranscriptomics
Three samples from each timepoint were submitted for metatranscriptomic sequencing at the JGI. Some samples failed to pass quality control standards and were replaced by the fourth filter from that timepoint. When we were unable to sequence three filters for a timepoint, we instead sequenced a fourth replicate for another timepoint, resulting in between one and four replicate metatranscriptomes for each timepoint (Supporting Information  Table S2). Ribosomal RNA was depleted using the Illumina Ribo-Zero rRNA Removal Kit, and samples were prepared for sequencing with the Illumina TruSeq Stranded Total RNA HT kit. Samples were sequenced using the Illumina HiSeq platform and a TruSeq paired-end clustering kit for paired-end, 150 bp sequencing (2 × 150). BBDuk adapter trimming was used to remove known Illumina adapters. Read ends were trimmed where quality values were < 12. Read pairs containing more than three "N," or with quality scores (before trimming) averaging < 3 over the read, or length under 51 bp after trimming were discarded. Ribosomal RNA reads were removed using SortMeRNA (Kopylova et al. 2012). Metatranscriptomic reads were competitively mapped to our nonredundant database with a 90% ID cutoff using BBMap and requiring at least 75% overlap with a gene feature. Mapped reads were tabulated using FeatureCounts (Liao et al. 2014).
Addition of an internal RNA standard allowed for both normalization of expressed reads to transcripts per liter and assessment of extraction success (Satinsky et al. 2013). Samples with either too few counts of the internal standard (< 50) or orders of magnitude higher expression of all genes after normalization when compared to replicates were discarded. After these quality control measures, 32 samples remained from Sparkling Lake, 30 from Lake Mendota, and 21 from Trout Bog. Nine samples from day 2 in the Trout Bog time series were not analyzed due to insufficient yield, resulting in two lost timepoints.

Statistics
The statistical software R was used for expression analysis (R Core Team 2018). Using the internal standard to determine normalization size factors, we converted read counts to units of transcripts per liter: all following read-based metrics are in transcripts per liter. To reduce noise in the dataset, the top 20,000 expressed genes in each lake were retained for differential expression analysis. From this subset, marker genes for metabolic processes were selected and aggregated by pathway. RAIN software was used to detect cyclic trends in gene expression (Thaben and Westermark 2014). Genes with p-values < 0.05 were considered cyclic. In addition to the abundance threshold imposed by taking the top 20,000 genes in each lake, we also required that genes included in the cyclic trend analysis have a coefficient of variance of at least 0.2. The maximal time of expression for each gene was determined by averaging metatranscriptomes taken at the same time of day over the 2-d time series. Results were plotted using the R packages ggplot2 (Wickham 2009) and cowplot (Wilke 2019).

Results
What genes were expressed?
We first asked which genes were most expressed in each lake across all timepoints (Supporting Information Fig. S1). Photosynthesis-related genes, particularly those relating to photosystem II P680, were highly expressed in all three lakes. Genes encoding ribulose-1,5-bisphosphate carboxylase (RuBisCO), the key enzyme in carbon fixation via the Calvin-Benson-Bassham pathway, were among the most highly expressed genes in Lake Mendota and Trout Bog. We also ran this analysis excluding genes associated with photoautotrophy and unannotated genes (Supporting Information Fig. S1). Many of the most highly expressed nonphotosynthetic genes in Lake Mendota belonged to Actinobacteria (acI-A and acI-B clades, both members of Actinomycetales), including translation elongation factors, a sodium:solute transporter, and a sugar transporter.
Which taxa were expressing genes?
We next aggregated expressed genes by taxonomic classifications to compare the most expressed taxa to the most abundant taxa based on metagenomic data (Fig. 1). We used the same reference database to map metatranscriptomes and metagenomes, making such comparisons possible. No positive trend between gene expression and taxonomic abundance was observed. At the phylum level, genes from Cyanobacteria, Bacteroidetes, and Actinobacteria were relatively highly expressed in all three lakes. Genes from Betaproteobacteria were highly expressed in Trout Bog and Sparkling Lake, while Verrucomicrobia genes were especially highly expressed in Trout Bog. Widely recognized abundant and/or cosmopolitan taxa were also present, and a significant portion of transcripts could be associated with groups recognized as being freshwater-specific and largely cosmopolitan "clades" (Newton et al. 2011). Where taxonomy was resolved to the clade level, we noted which clades contributed to observed transcripts. Genes from members of Actinobacteria acI-B (Actinomycetales) were expressed and abundant in all lakes, but especially Trout Bog. This is consistent with previous research identifying acI-B2 as an acidic lake specialist (Newton et al. 2007). The betIV-A clade (LD28, Methylophilales) contributed a surprising number of expressed genes in Lake Mendota, as did the alfV-A clade (Candidatus Fonsibacter, also known as LD12; Henson et al. 2018) and the acI-A clade (Actinomycetales). Genes from the bacIII-B clade (Sphingobacteriales) were highly expressed in Sparkling Lake, but had low abundance. Genes from members of betIV-A were abundant in Trout Bog, but not proportionally highly expressed. The Pnec clade (Polynucleobacter) genes, known to be endemic to bog lakes and particularly Trout Bog, were not as expressed or abundant as we had expected them to be (Jezberová et al. 2010;Linz et al. 2017). However, we noted many additional transcripts classified as Burkholderiales that may be Pnec, but could not be classified to the clade level,   Fig. 1. Abundance vs. expression by lake. To determine which phyla were most abundant or most expressed during our time series, we analyzed metagenomic and metatranscriptomic read counts. All read counts are reported in transcripts per liter. The expression of clustered, nonredundant genes was aggregated by phylum and compared to the coverage of those phyla in metagenomes and colored by kingdom (a-c). Axes are reported in proportion of reads assigned to each phylum across the time series. Genes that could not be classified into a phylum were not included in this analysis. Proteobacteria were split into classes due to the high diversity of this phylum. No positive relationship was observed between expression and abundance. We repeated this analysis at a finer resolution by investigating freshwater clades (d-f). Clades are color-coded by phylum to provide taxonomic context. likely due to the high diversity of this group (Jezberová et al. 2010;Hahn et al. 2016).

Assessing variability in freshwater metatranscriptomes
Because this study is among the largest metatranscriptomic sequencing efforts to date, we discuss the biological vs. technical variability observed in this data set to add to our knowledge of variability in environmental metatranscriptomics and to inform future study designs (Tsementzi et al. 2014). We used the coefficient of variation (CoV), i.e., the ratio of standard deviation to average expression (%), to compare the amount of variability within replicate samples to the variation observed across different timepoints (Supporting Information Fig. S2). Higher CoVs were observed across samples than within the replicates. Still, the upper limit for CoV within replicates approached 200%. This result highlights the importance of replication in metatranscriptomic studies.

Trends in environmental variables
We examined a suite of potentially relevant environmental variables to compare trends in these to the dynamic shifts observed in gene expression, expecting that several of these trends would be diel. PAR data were used to classify timepoints as night or day (Supporting Information Fig. S3). Parameters that reflect the boundaries between layers within the water column, such as dissolved oxygen, temperature, pH, and conductivity, were strongly diel in Lake Mendota, but less so in Sparkling Lake and Trout Bog (Supporting Information Fig. S4). Chlorophyll concentrations, often used as an indicator of primary production, were diel in Trout Bog, but not in the other two sites. Bacterial production, measured via 14 C-leucine incorporation, showed dynamics over the 2-d time series in all three lakes, although the trends were not diel (Supporting Information Fig. S5). No diel trends were observed in total and dissolved nitrogen or phosphorus concentrations.

Diel trends in gene expression
We used the RAIN software (Thaben and Westermark 2014) to reveal any cyclic trends with 24-h periods among our top 20,000 most abundant genes in each lake, with an additional requirement that genes must be sufficiently variable (CoV > 0.2) (Fig. 2). Many genes related to photosynthesis were cyclic in all lakes (Fig. 3). However, the time of maximum expression differed in each lake. In Lake Mendota, most photosynthesis genes were maximally expressed at 13:00 or 17:00, while in Sparkling Lake, the majority were most highly expressed at 13:00. In Trout Bog, 09:00 was the most common time of maximum expression for photosynthesis genes. Similarly, many genes related to sugar transport were cyclic across our study sites, but had different times of maximum expression. These genes were most expressed at 17:00 in Lake Mendota, 01:00 in Sparkling Lake, and 09:00 in Trout Bog. We also investigated cyclic trends in genes annotated as RNA polymerase subunits as a measure of general community activity. While less predominantly cyclic than the other categories discussed here, the time of maximum expression for RNA polymerase genes was typically midmorning (09:00 in Lake Mendota, 09:00 and 13:00 in Sparkling Lake, and 09:00 in Trout Bog). Many genes encoding subunits of the key carbon fixation enzyme ribulose-1, 6-bisphosphate carboxylase (RuBisCo) were cyclic in all three lakes and typically most expressed in the morning (09:00 in Lake Mendota, 05:00 in Sparkling Lake, and 09:00 in Trout Bog). Genes related to opsin synthesis were primarily expressed in Lake Mendota and Sparkling Lake, contained both cyclic and noncyclic trends, and peaked at 13:00 in Lake Mendota and 01:00 in Sparkling Lake. We used genes encoding photosynthetic reaction center M as a marker for AAP and found that these genes were predominantly cyclic and peaked at 01:00 in Lake Mendota and Sparkling Lake and 05:00 in Trout Bog. Notably, genes in many categories in Sparkling Lake showed peak expression at 01:00, while most maximal expression in Trout Bog occurred in the hours of 05:00-13:00.
We next investigated the taxonomic assignments of genes annotated as related to RNA polymerase, general sugar transport, and photosynthesis over time in each lake (Fig. 4). As expected, a diversity of freshwater microbes were expressing genes encoding RNA polymerase. Fewer unique taxa expressing RNA polymerase were observed in Sparkling Lake, likely due to the reduced number of available reference genomes for this lake. Cyanobacteria dominated expression of genes related to photosynthesis in Lake Mendota, while Trout Bog photosynthesis expression was derived from a mix of Cyanobacteria and Eukaryota (most likely algae). Genes related to photosynthesis expression in Sparkling Lake were largely unclassified, although some were assigned to Cyanobacteria. Expression of genes related to general sugar transport varied between lakes, both in timing and taxonomy. In Lake Mendota, this category was derived mainly from Cyanobacteria and Actinobacteria. Actinobacteria were also well represented in Trout Bog general sugar transport expression, as were Betaproteobacteria. In Sparkling Lake, genes related to general sugar transport were classified as Actinobacteria, Armatimonadetes, and Betaproteobacteria.

Discussion
In this study, we sought to identify diel trends in freshwater microbial community gene expression to hypothesize specific metabolic interactions between community members. Using metatranscriptomic time series, we were able to detect genes with cyclic trends. We found diel trends in all lakes studied, regardless of trophic status, although the timing of these cycles varied by lake.
The balance of primary production and respiration is of interest to those seeking to create carbon budgets for freshwater lakes. Previous research has linked photosynthesis and respiration to diel cycles (Solomon et al. 2013), leading us to hypothesize that genes related to these processes would also show diel trends. In all three lakes, genes related to photosynthesis were highly expressed and frequently cyclic. Photosynthesis and carbon fixation are often considered to be coupled in the process of these processes; however, we observed that expression of genes related to primary production occurred at different times of day in Lake Mendota and Sparkling Lake. We also investigated expression of other types of potential phototrophy, including AAP and opsins. Genes encoding proteins related to these processes were often cyclic, especially those involved in AAP. AAP genes typically had different times of maximal expression than the canonical oxygenic photosynthesis, possibly indicating adaptation to different times of day to avoid competition for light and nutrients.
Respiration is a broad functional category that encompasses the degradation of many carbon substrates. To identify the compounds being respired, we focused on genes related to carbon transport, as transporter expression has previously been used in marine systems to predict substrate use (Ottesen et al. 2013;Vorobev et al. 2018). In all three lakes studied, we identified cyclic trends in many genes related to sugar transport. Phototrophs are known to exude sugars (Maršálek and Rojí cková 1996), suggesting that sugars may be exchanged between phototrophs and heterotrophs. Therefore, it is of particular interest that expressed genes encoding sugar transporters in Lake Mendota are primarily classified as Cyanobacteria and Actinobacteria.
There are multiple nonexclusive hypotheses as to why we observed diel trends in some genes encoding sugar transport. One is biotic in origin-if these sugars are indeed algal exudates, they may be produced during the day and released at night. Although such diel release of sugars has not been documented here, day/night partitioning of photosynthesis and sugar metabolism are known to occur in phototrophs (Masuda et al. 2018;Welkie et al. 2018). Such diel trends in individual lake. Here, we present an example of these cyclic trends in genes related to photosynthesis. Read counts in transcripts per liter were z-score normalized for the purpose of visualization. Each gene trend is color-coded by its time of maximal expression in the first 24-h period. Because of missing samples in day 2 of Trout Bog, only day 1 is displayed.
populations may extend to community-level interactions. Another possible hypothesis is that oxidative stress prevents heterotrophs from consuming sugar during the day, even if such substrates are available. We also observed cyclic trends in expression in genes related to ROS defense and carboxylic acid transport, a common photodegradation product, in all three lakes.
There is ample evidence in marine microbial communities suggesting that carbon released by phototrophs influences heterotrophic community composition to improve phototroph fitness. In coral reefs, algal exudates can dramatically shift bacterial community composition, potentially providing algae with a competitive advantage over coral by selecting for coral pathogens in the heterotrophic community (Nelson et al. 2013). In marine microbial communities, heterotrophic bacterioplankton are highly dependent upon Prochlorococcus exudates and likely perform a critical community function in

Lake Mendota
Sparkling Lake    Fig. 4. Taxonomic composition of functional categories by time and lake. We next investigated the taxonomy of functional categories and how phylogenetic groups changed expression over time. The x axis indicates the number of genes from each category assigned to each phylum, summed across both days of the time series. Proteobacteria were split by class due to the high diversity of this phylum. RNA polymerase, used as an indicator of growth, was phylogenetically diverse in all lakes, although less well classified in Sparkling Lake, likely due to the lack of reference genomes from this site. Cyanobacteria contributed to photosynthesis in all lakes, particularly in Lake Mendota. General sugar transport was encoded primarily by Cyanobacteria and Actinobacteria in Lake Mendota, Actinobacteria and Betaproteobacteria in Trout Bog, and Actinobacteria and Armatimonadetes in Sparkling Lake.
return, such as the detoxification of hydrogen peroxide or free radicals (Morris et al. 2011). Prochlorococcus has lost its genes for ROS defense and depends on the associated heterotrophic bacteria to supply this function (Morris et al. 2016;Ma et al. 2018). Prochlorococcus may exude carbon to maintain redox balance, as it generates more reducing power via photosynthesis than it can allocate to anabolic processes (Bertilsson et al. 2005). However, a frequently observed adaptation to excess reducing power is to downregulate photosynthesis electron flux; this is not observed in Prochlorococcus and suggests alternative reasons for its release of carbon (Braakman et al. 2017). It is therefore reasonable to hypothesize that freshwater phototrophs may be releasing carbohydrates to shape the heterotrophic community, which in turn may benefit phototrophs. Most likely, heterotrophs perform functions that benefit the community, such as ROS defense and vitamin production. The origin of metabolic exchanges that lead to codependencies has been postulated to be an important driver of evolution in aquatic communities, as in the "Black Queen Hypothesis," where interdependencies among microbial community members lead to genome reduction (Morris et al. 2012). It is intriguing to note that the dependency between phototrophs and heterotrophs and the diel partitioning of carbon fixation and respiration would be analogous to the organization and functioning of chloroplasts and mitochondria in plant cells (Braakman et al. 2017). Further concerted efforts to recover nearly complete reference genomes from freshwater systems should enable more rigorous testing of this hypothesis.
Here, we present a comparative metatranscriptomic analysis, which demonstrates similar diel trends in photosynthesis and sugar transport in three different types of lakes. We hypothesize both biotic (algal exudates) and abiotic (oxidative stress) factors as drivers of community-level diel trends in freshwater microbiomes. Whether all of these microbes are responding to the same day-night stimulus or whether community interactions confer these diel trends remains to be determined. Given the consistent patterns across metatranscriptomes from biogeochemically disparate lakes, our results underscore the prevalence of conserved microbial interactions that underpin a broad diversity of freshwater environments.

Data availability statement
Datasets used in this study are available on the Open Science Framework (DOI: 10.17605/OSF.IO/9GR62). All code is publicly available at https://github.com/McMahonLab/geodes. Raw sequence files are available through the JGI Genome Portal.