Ontogenic succession of thermokarst thaw ponds is linked to dissolved organic matter quality and microbial degradation potential

Warming climate is thawing the permafrost in arctic and subarctic regions, leading to formation of thermokarst ponds. During the formation and geomorphological succession of these ponds, carbon that has been trapped in frozen soils for thousands of years is hydrologically mobilized and returned to the active carbon cycle. We sampled 12 thermokarst ponds representing three different stages of pond succession to study the potential of microbial communities to metabolize the organic carbon in the water. We investigated the quality of the dissolved organic carbon (DOC) in the water column based on the spectrophotometric and fluorometric properties of the chromophoric dissolved organic matter combined with parallel factor analysis and the potential of the microbial community for degrading these carbon compounds based on genetic markers related to carbon degradation. Our analysis showed a clear difference in the DOC quality across the different developmental stages. In the younger ponds, organic matter quality suggested that it was originating from the degrading permafrost and in the metagenomes collected from these ponds, the normalized abundance of genes related to degradation of carbon compounds was higher. There was also a shift in the degradation potential in the water column of the ponds, with higher potential for organic matter degradation in deeper, anoxic layers. In conclusion, our results show that the DOC quality and the genetic potential of the microbial community for carbon cycling change across the pond ontogeny, suggesting a capacity of the microbial communities to adapt to changing environmental conditions.

2006b; Schuur et al. 2008). Upon thawing, the bioavailability of this ancient carbon is very high (Waldrop et al. 2010;Vonk et al. 2013) and increasing temperature is expected to further increase the carbon release from the thawing permafrost (Dorrepaal et al. 2009). Alaska alone is predicted to lose 50-100 Gt of carbon from the permafrost to the atmosphere by the end of the century (Stieglitz et al. 2003;Zhuang et al. 2006). In many regions, the thawing of the permafrost ice contributes to the creation of thermokarst ponds and lakes (Osterkamp et al. 2000;Luoto and Seppälä 2003) where the ancient carbon can be reintroduced in the active carbon cycle (Roehm et al. 2009). These waterbodies represent one of the most common and widespread aquatic ecosystem types in the arctic and subarctic landscape with high contribution to greenhouse gas emissions (Pienitz et al. 2008;Koch et al. 2014;Wik et al. 2016).
By successive erosion and thawing, these thaw ponds can expand vertically and horizontally and eventually merge, becoming larger and more numerous in some parts of the Arctic landscape, while in other northern regions they are disappearing by drainage, evaporation, or infilling . During maturation, the ponds also deepen sufficiently to enable thermal stratification that may create distinct aquatic habitats for metabolic processes (Roiha et al. 2015). The persistence of these waterbodies is usually some hundreds of years (e.g., Fillion et al. 2014) but can differ, for example, in response to wind-exposure and resultant variability in the insulating snow cover (An and Allard 1995). The systems are also sensitive to stochastic effects such as fires that may be regarded as a key trigger for thaw/regeneration cycles in the permafrost (Zoltai 1993). The lifecycle of these ecosystems is expected to be shorter under warmer climate conditions (Vincent et al. 2013) and this will in turn expose a higher number of waterbodies to strong lateral influence from the adjacent carbon-rich permafrost erosion that will enrich the lakes and ponds and accordingly stimulate bacterial activity (Vonk et al. 2013). In the areas where thawing is ongoing, a shift toward increasing terrestrial dominance of the carbon in freshwaters has already been reported (Wauthy et al. 2018).
The massive amount of previously stabilized permafrost carbon that is now being routed back into the active carbon cycle can have global consequences for the biogeochemical cycling of carbon (Tranvik et al. 2009). A large share of the carbon in the permafrost is labile and readily available for microbial use (Waldrop et al. 2010), making the emerging thaw ponds hotspots for microbial production that may act as an important positive feedback to climate warming. Compared to freshwaters that are less influenced by disappearing permafrost, the molecular composition of the carbon in thaw ponds is characterized by a larger proportion of terrestrially derived dissolved organic matter (DOM) of higher molecular weights combined with a minor influence from compounds derived from autochthonous primary production (Wauthy et al. 2018). There is however no information on how DOM quality and the associated microbial composition change throughout the long-term succession of the permafrost thaw ponds. That said, carbon quality has been shown to change during the ontogenetic aging of boreal beaver ponds (Catalan et al. 2016) and such changes are likely also happening in thaw ponds.
Despite the documented importance of the microbial loop in the permafrost thaw pond carbon cycling, studies looking into microbiology of thermokarst waters are still scarce (Vonk et al. 2015) and majority of the microbiological studies have focused on the resident methanotrophs in these ponds (e.g., Rossi et al. 2013 ;Crevecoeur et al. 2015 ;Crevecoeur et al. 2017). This focus is likely because these waters represent a source of around 24 Tg of CH 4 emission per year (Walter et al. 2007), with a predicted increase of 54% by the end of the century from high northern latitude lakes alone (Wik et al. 2016). However, prior to becoming CH 4 , the carbon compounds coming from permafrost have to be extensively processed by various heterotrophic organisms (Tveit et al. 2013). Our current understanding of these processes is very limited, but using next generation sequencing methods, we can now study what kind of processes and substrate transformations the microbial communities have the potential to perform. Indeed, for permafrost soils, some information on microbial metabolic potential and carbon degradation already exists (Yergeau et al. 2010;Mackelprang et al. 2011;Tveit et al. 2013;Woodcroft et al. 2018). These preceding studies have shown the presence of genes related to plant polymer hydrolysis (Tveit et al. 2013) and organic matter degradation (Yergeau et al. 2010;Woodcroft et al. 2018) and even CH 4 cycling (Mackelprang et al. 2011) in the soils affected by permafrost and permafrost thaw. However, there are no analogues studies for the aquatic environments affected by permafrost thaw. We addressed this knowledge gap by studying the microbial pathways for degrading carbon compounds and the taxonomic composition of organisms involved in the degradation.
For this purpose, we used shotgun sequencing of the metagenomic DNA from 11 permafrost thaw ponds ranging in age from approximately 3-yr-old emerging ponds to 60-yr-old mature ponds (Wurzbacher et al. 2017). Due to the strong hydrological linkages with the surrounding soils and permafrost and the limited time for diagenetic "aging" of the soil organic matter before reaching the surface water, the carbon in young ponds was expected to be characterized by readily degradable terrestrial carbon fractions (Zimov et al. 2006a;Gao et al. 2018). Analogously, we expected the carbon compounds in the mature ponds to be more strongly influenced by autochthonous production and diagenetically aged terrestrial carbon compounds (Wauthy et al. 2018). Associated with this, we also hypothesized that the microbial genetic potential for carbon degradation would be tightly coupled with the carbon quality and that it would shift across the pond succession gradient and over depth in the stratified mature ponds. We further hypothesized that the taxonomic identities of the organisms contributing to carbon degradation across the lateral (among ponds) and vertical (within the depths of deep ponds) gradients would vary.

Sample site and sampling
Twelve permafrost thaw ponds located on a palsa bog in a river valley in the sporadic permafrost zone (55 13 0 N, 77 42 0 W) close to the village of Whapmagoostui-Kuujjuarapik (Nunavik, QC, Canada) were sampled once in the beginning of August 2014. Palsas are frost heaves containing permanently frozen ice lenses, with an ice core inside of the overlying soil (Seppälä 2006). The height of palsas at the study site is up to 3 m and the age of the organic material in the top 1.5 m varies from 700 yr in the surface to 5000 yr at a depth of 1.6 m (Fillion et al. 2014). The vegetation is dominated by Carex and Sphagnum (Bhiry et al. 2011). The thermokarst ponds at the study site are formed after thawing of palsas. We chose thaw ponds representing three different phases of pond succession; (1) young ponds which were emerging at the edge of palsas and that had not been present a few years earlier (depth 0.10-0.30 m), (2) developing and highly dynamic ponds located around an existing but thawing palsas (depth 0.30-0.70 m), and (3) mature ponds where there was no palsa left (depth 2-4 m) (see Supporting Information Figs. S1, S2 for the size of the ponds). The first two pond types are still growing as the permafrost is thawing and the palsas are sinking to the pond. The mature ponds were approximately 50-60 yr old (dating based on satellite images) and thermally stratified. The mature phase of the ponds at this site has been shown to last for 300-400 yr (Fillion et al. 2014). The sampling was conducted as previously described (Wurzbacher et al. 2017). In short, surface waters from four ponds of each age category were sampled. When the pond depth exceeded 50 cm, separate samples were retrieved from the epilimnion, metalimnion, and hypolimnion. All ponds were located within a 100 m radius. In order to analyze dissolved organic carbon (DOC) concentrations and perform optical analyses of the chromophoric dissolved organic matter (CDOM), water samples were collected from the surface and filtered through prerinsed cellulose acetate filters (0.2 μm). This filter pore size removes small-size inorganic soil particles common in thaw ponds (Watanabe et al. 2011) and was chosen to be consistent with earlier studies carried out on circumpolar ponds (Roiha et al. 2015;Wauthy et al. 2018). The samples were stored in acid-washed and combusted glass vials at 4 C in the dark pending carbon characterization. Samples for chemical analyses (Fe 2+ , Fe 3+ , NH 4 , NO 2 , NO 3 , total N, PO 4 , SO 4 , pH) and CH 4 and CO 2 concentrations were collected from all targeted ponds and depth layers.

Chemical analyses
CH 4 and CO 2 concentrations were measured as in Kankaala et al. (2007) with the exception that instead of N 2 , ambient air was used to extract gas from the water. Shortly, 30 mL of water was collected to 60-mL syringes from each sampled layer and transported to the laboratory on ice. In the laboratory, the syringes were incubated in a 20 C water bath for 5 min and 30 mL of room air was added to each syringe. Following the addition of air, the syringes were shaken for 2 min to equilibrate the gas concentration and air was collected to glass vials. The CH 4 concentration of the ambient air was quantified in parallel and subtracted from the measured values to determine final concentrations. Nutrients were measured at Erken laboratory (Uppsala University, Sweden) using standard methods (https://www.ieg.uu.se/digitalAssets/437/c_437648-l_ 3-k_metodfo-rteckning-28.pdf) and pH was measured using Micro-pH 2001 (Crison Instruments, Barcelona, Spain).

Carbon characterization
DOC was measured from filtered water (GF/F glass fiber filter, Whatman, Little Chalfont, UK) using Shimadzu TOC-V analyzer (Shimadzu Corporation, Kyoto, Japan) and total organic carbon (TOC) from unfiltered water using TOC-L analyzer (Shimadzu Corporation, Kyoto, Japan). A set of carbon quality indicators was measured from the filtered water using spectrophotometric and spectrofluorometric methods. The absorbance spectra were measured in 1 cm quartz cuvettes using a UV-visible Cary 100 spectrophotometer (Agilent, Santa Clara, CA, U.S.A.), recording absorbances between 250 and 800 nm at 1 nm intervals. Milli-Q water was used as blank and spectra for samples were null-point adjusted by subtracting the mean blank value from 750 to 800 nm to correct for offsets due to instrument baseline effects (Helms et al. 2008). The absorbance measured at 254 nm (A 254 ) was normalized for DOC to calculate the specific ultraviolet absorbance at 254 nm (SUVA 254 ), which is used as a proxy for aromaticity of CDOM (Weishaar et al. 2003). Iron can bind to humic substances and form complexes that may increase both absorbance (Weishaar et al. 2003) and fluorescence (McKnight et al. 2001). To account for this, a correction factor developed for SUVA 254 using concentrations of dissolved iron Fe 3+ was applied as in Poulin et al. (2014). The absorption coefficient at 320 nm (a 320 ) was used as an index of CDOM concentration and was determined following the equation: where a 320 is the absorption coefficient (m −1 ) at 320 nm, A 320 the absorbance corrected at 320 nm, and L the path length of the cuvette (m) (Williamson et al. 1999). CDOM absorption spectra were further analyzed to calculate the spectral slopes S λ as described in Loiselle et al. (2009) for the intervals 279-299 nm (S 289 , named by center wavelength), 275-295 (S 285 ), and 350-400 nm (S 375 ). All linear regressions were performed using SciLab v 5.5.2. (Scilab Enterprises, Orsay, France). S 289 was used as a proxy to estimate the importance of fulvic and humic acids related to algal production (Loiselle et al. 2009), while the slope ratio (S R ) S 285 /S 375 is an index of CDOM molecular weight (Helms et al. 2008). Fluorescence spectra were recorded on a Cary Eclipse spectrofluorometer (Agilent, Santa Clara, CA, U.S.A.) across the excitation wavelength range 250-450 nm (10 nm increments) with emission between 300 and 560 nm (2 nm increments) in order to construct excitation-emission matrices. The fluorescence spectra were corrected for Raman and Rayleigh scattering and inner filter effects, and standardized to Raman units using the FDOMcorr 1.4 toolbox (Murphy et al. 2010). In order to investigate the origin of fulvic acids, the fluorescence index (FI) was calculated as the ratio of fluorescence emission intensities at 450 nm and 500 nm with 370 nm excitation (McKnight et al. 2001). To identify and quantify the main DOM components, a parallel factor (PARAFAC) model was run on the 12 excitation-emission matrices developed in this research combined with 117 extra-matrices collected from previous studies in 85 arctic and subarctic ponds. The PARAFAC modeling was performed on MATLAB v R2013a (MathWorks, Natick, MA, U.S.A.) as previously described (Murphy et al. 2013). The model was validated by split-half analysis (Supporting Information Fig. S3) using the drEEM toolbox (Murphy et al. 2013). The maximum fluorescence values [Cx] of each component x for each particular sample were summed to determine the total fluorescence (F T ) and the relative abundance of any component x was calculated according to the following equation: To identify the components of the model, their respective excitation and emission spectra were compared to published components from 70 models available in the OpenFluor database, as previously described (Murphy et al. 2014).

DNA extraction and sequencing
Microbial cells for DNA extraction and analyses were collected on Sterivex filters (Millipore, Massachusetts, U.S.A.) from each of the ponds and sampling depths by filtering water through the filter until it clogged (150-450 mL per sample). The filters were stored at −80 C until extraction, which was done using the Mobio PowerSoil DNA extraction kit (Mobio, Carlsbad, CA, U.S.A.). DNA concentrations were measured using PicoGreen (Invitrogen, Paisley, UK) and the DNA was fragmented using a Covaris E220 ultrasonicator (Covaris, Brighton, UK). Libraries were subsequently prepared using Thruplex FD Prep kit (Rubicon Genomics, Ann Arbor, MI, U.S.A.) according to the manufacturer's instructions and then sequenced at Science for Life Laboratory (Uppsala, Sweden) using an Illumina HiSeq 2500 system operating in high output mode. The raw data have been deposited to NCBI Short Read Archive under accession number PRJNA517163.

Processing of the metagenome data
The sequencing produced 163 Gb of total raw sequence data (range 4.3-11.4 GB per sample, Supporting Information Table S1). The raw reads were quality-trimmed using Sickle (v1.210) (Joshi and Fass 2011) with default parameters, then coassembled into contigs using Megahit (v1.1.2) (Li et al. 2015). This resulted in a combined assembly of 2.2 Gb of contigs longer than 1 kb. The raw reads were mapped against the assembly using BBMap (Bushnell 2016) to define coverage of individual samples in the assembly. Potential for carbon degradation was studied based on hidden Markov models (HMMs).
HMMs are heuristic models of protein domains from groups of functionally related proteins (i.e., protein families). Each HMM is based on an alignment of the sequences from multiple proteins. These models have a robust way of handling sequence variation, such as insertions and deletions, providing higher precision for protein annotations. Thus, the amount of false positives is lower than with most other methods, such as blast searches (Sonnhammer et al. 1998). For this study, we chose a set of HMMs from the PFAM-database (database for Protein FAMilies; Finn et al. 2008) that have previously been used to study carbon degradation in permafrost soil (Tveit et al. 2013). These included HMMs associated with carbohydrate binding molecules, carbohydrate esterase, glycoside hydrolases, phenolic compound degrading domains, polysaccharide lyase, and a group of other carbohydrate active enzymes (Tveit et al. 2013). Presence of these marker genes in the dataset indicates that the microbial community has the potential for conducting the reaction catalyzed by the protein that the gene produces. To identify the marker genes in the data, protein-coding genes (CoDing Sequence, CDS) were predicted on the assembled contigs with Prodigal (v2.6.3) (Hyatt et al. 2010) in metagenomic mode. CDS were then annotated against the PFAM database (release 32) (El-Gebali et al. 2019) using hmmsearch (from the hmmer suite, version v3.1b2) (Durbin et al. 2002). For each sample, the abundance of CDS related to carbon degradation was calculated with htseq-count (v0.9.1) (Anders et al. 2015) in "-nonunique none" mode, as suggested by the authors to perform differential abundance (or gene expression) analysis. A separate calculation of the abundance of CDS related to carbon degradation in "-nonunique all" mode was performed for descriptive purposes and full results are reported as transcripts per million reads (TPM) in Supporting Information Table S2.
The taxonomic affiliation of the organisms engaged in carbon degradation was identified from the contigs using Kaiju (Menzel et al. 2016), with NCBI nr as reference database.

Statistical testing
All statistical analyses were done using R (R Core Team 2015). Differences in the environmental factors, carbon quality, and total degradation potential (total normalized abundance of degradation PFAMs) across different age classes were tested using one-way ANOVA with Tukey's post hoc test with p value adjustment. For these analyses, three epilimnion samples were included for emerging and mature ponds and four epilimnion samples for developing ponds. SUVA 254 and S R were transformed using command "transformTukey" included in the R package rcompanion (Mangiafico 2015) in order for them to meet ANOVA assumptions. For the variables that did not fulfill the ANOVA assumptions even after transformation (O 2 , NH 4 , NO 2 , NO 3 , PO 4 , SO 4 , pH), differences were tested using Kruskal-Wallis tests and Wilcoxon post hoc tests with p value adjustment (Benjamini and Hochberg 1995;Benjamini and Yekutieli 2001).
To study the variation in abundances of PFAMs related to carbon degradation across different age classes and depth layers, PFAM abundances were rarefied to the sample with the lowest abundance, Bray-Curtis distances of the samples were calculated using the R package vegan (Oksanen et al. 2017) and these distances were visualized with nonmetric multidimensional scaling (NMDS). Differential abundance analysis of PFAMs in the samples originating from ponds representing different ages (emerging, developing, mature), or different depth layers within the mature ponds (oxic vs. anoxic) was performed using the R package DESeq2 (Love et al. 2014).
The difference in the diversity of PFAMs related to carbon degradation across different age classes and depth layers was tested using permutational multivariate analysis of variance (PERMANOVA) of Bray-Curtis distances of the samples. This was used as a proxy for qualitative total potential for degrading carbon compounds. The difference in total normalized abundance of PFAMs related to carbon degradation between oxic and anoxic compartment of the mature ponds was tested from the sum of all normalized abundances of carbon degradation related PFAMs using Welch two-sample ttest, while the difference in total normalized abundance of PFAMs related to carbon degradation in the oxic layer across ponds representing different ages was tested using one-way ANOVA. This was used as a proxy for quantitative total potential for degrading carbon compounds. Linkages between the carbon quality parameters and specific PFAM abundances were tested using Spearman correlations with Holm's correction for multiple inference (Holm 1979) (using the rcorr. adjust function from the R package RcmdrMisc) (Fox et al. 2018) and relationships with p value < 0.05 were visualized in a heat map.

Chemical parameters
Environmental conditions at the surface layer of the ponds were similar across the full set of ponds that represented different stages of thaw pond formation. There was no difference in concentrations of O 2 , CH 4 , CO 2 , iron, NH 4 , NO 2 , NO 3 , PO 4 , and SO 4 . However, pH was lowest in the emerging and highest in the mature ponds, while TOC and total N were higher in emerging and developing ponds than in the mature and temperature was higher in emerging than in developing ponds (Supporting Information Fig. S4, Table S3; Wurzbacher et al. 2017). In the mature thaw ponds, the environmental factors were also tested for different water depth layers. As expected, most chemical variables were significantly different between the water depth layers (Supporting Information Fig. S5). Oxygen and temperature decreased from surface to bottom, while for CH 4 , CO 2 , TOC, total N, and Fe 2+ the trend was opposite and pH was higher in the epilimnion than in the metalimnion. For Fe 3+ , NH 4 , NO 2 , NO 3 , PO 4 , and SO 4 , there was no difference between the depth layers.

Carbon characterization
The characterization of carbon quality revealed a strong effect of the thaw pond developmental stage on DOC quality and quantity (Fig. 1), with significant differences for DOC, a 320 , and FI (Supporting Information Table S3). The DOC concentrations varied from 12.3 to 39.0 mg L −1 (mean = 25.7 mg L −1 , SD = 10.1), with the highest values found in the emerging (mean = 34.3 mg L −1 , SD = 5.3) and developing ponds (mean = 29.0 mg L −1 , SD = 6.6). CDOM concentration (a 320 ) followed the same trend. In contrast, S R and FI were lowest in emerging thaw ponds, but the difference was only significant for FI. Moreover, the highest values of FI were found in mature thaw ponds, suggesting that these  systems had a higher content of microbial-derived fulvic acids (McKnight et al. 2001). No significant differences or patterns were observed for SUVA 254 and S 289 , suggesting that the developmental stage had no strong effect on the aromaticity and the humic and fulvic content originating from algal production. The PARAFAC model identified five fluorescence components (Supporting Information Fig. S6), of which four shared fluorescence properties with humic materials from terrestrial (C1-C3) and microbial origin (C4). The fifth component (C5) presented spectra similar to amino acids or proteins and is known to be associated with algal, microbial or terrestrial production (more details in Supporting Information Table S4). The combined percentage of terrestrial humic components (C1-C3) showed the highest values in emerging and developing thaw ponds (85.8% AE 2.3% and 86.4% AE 0.2%, respectively) (Fig. 2). However, these differences were significant only for C2. Furthermore, along the developmental stages gradient, C1 showed an inverse trend compared to C3. Moreover, mature thaw ponds had the highest mean value for C4 (11.7% AE 0.5%), indicating increasing representation of humic-dominated microbial material in these ponds. Finally, the proportion of C5 varied between 2.7% and 8.8% but was not significantly different among pond types.

Potential for carbon degradation in the surface layer
The microbial carbon degradation potential, as determined from diversity of PFAMs indicative of carbon degradation (Tveit et al. 2013), differed between ponds representing different developmental stages. Significant differences in diversity were also seen for the different sampling depths (Fig. 3, PERMANOVA, p < 0.01 and p < 0.05 for age and depth, respectively). There was no significant difference in the total degradation potential among pond types (the sum of normalized abundances of all carbon degradation PFAMs per sample; one-way ANOVA p = 0.0654, F = 3.29). This suggests that while the total normalized abundance of genes related to carbon degradation were similar across the ponds, the distribution of different PFAMs varied between the age classes. In addition, several individual PFAMs indicative of degradation potential had a significantly different abundance across age classes (Fig. 4, Supporting information Table S5). In general, carbon degradation genes with significantly different abundance were typically more abundant in the two younger pond classes. These differences were particularly pronounced for genes related to degradation of polymeric carbohydrates (glycosyl hydrolases, polysaccharide lyases) with overall higher abundances in the younger ponds.
In general, the organisms carrying degradation genes represented a diverse set of taxa, including mostly bacterial organisms, but also eukaryotes such as fungi, and viruses (Fig. 5, Supporting Information Fig. S7). Many of the individual PFAMs with differential abundance across ponds, such as PF00150, a cellulase belonging to glycosyl hydrolase family 5, were present in a wide variety of different taxa, whereas some, such as PF03663, a member of glycosyl hydrolase family 76, had a very limited taxonomic range. A number of fungal taxa contributed to the degradation potential, including organisms related to Ascomycota, Basidiomycota, and Mucoromycota. For some PFAMs, such as PF00150 (glycoside hydrolase family 5 [GH5]; Supporting Information Fig. S7), taxonomic classification suggested significant contribution from viruses to the degradation potential. It also appeared that there was a shift in the type of carbohydrate binding domains across the age classes, with emerging and developing ponds having the same dominant domains, while in the mature ponds, the dominant domains were different (Fig. 4A).

Shifting potential for carbon degradation across the depth gradient
For the mature ponds, the total potential for carbon degradation (the sum of the normalized abundances of all carbon degradation PFAMs per sample) was higher in the anoxic layer compared to the oxic (p < 0.001, t = −12.82). The normalized abundance of several PFAMs was significantly different between the oxic and anoxic layers (Fig. 6). For example, carbohydrate-binding modules were much more abundant in the surface layer (Fig. 6A) and this was also seen for pectate lyase (PF12708) suggested to participate in the degradation of plant tissue (Hugouvieux-Cotte-Pattat et al. 2014) and genes that are involved in chitin degradation (PF01607 and PF00182). Many of the glycosyl hydrolases were much more abundant in the oxic layer, suggesting higher potential for degrading complex carbohydrates under such conditions. Taxonomically the potential for degradation was also in this case related to a wide variety of bacterial taxa as well as fungi and viruses (Fig. 5, Supporting Information Fig. S8). Also some taxonomic differences in the organisms carrying degradation potential were apparent between the different layers. In the oxic layer, many PFAMs involved in degradation were affiliated with Proteobacteria, and these genes were also often significantly different in abundance between the layers.
Example of this is PF00182 (chitin degradation) that was abundant in Alpha-and Gammaproteobacteria. In the anoxic layer, especially Chlorobi and candidate phylum Omnitrophica contributed to the degradation potential via multiple PFAMs, such as PF01510 (N-acetylmuramoyl-Lalanine amidase) participating in cell wall degradation and PF2578 (laccase) which acts on compounds such as phenols and lignin. Also three PFAMs related to glycosyl hydrolase family 65 (PF03632, PF03633, and PF03636) were affiliated with Chlorobi in the anoxic compartment. Enzymes in this family act on substrates containing α-glucosidic linkages. Phylum Bacteroidetes contributed to most PFAMs and was found in both oxic and anoxic layers. We also explored the possible connections between the observed carbon degradation potential and carbon quality using a correlation analysis (Supporting Information Fig. S9). In the analysis, several PFAMs had significant correlations with the various carbon quality measures. For example, three PFAMs (PF03512, PF01229, and PF02156) related to xylan degradation were positively correlated with fraction C2, an indicator for terrestrial carbon. Xylan is a hemicellulose originating from plants, suggesting a link between carbon quality and genetic potential of the microbial community. However, several other PFAMs also related to xylan degradation (e.g., PF00722 and PF00457) were positively correlated with carbon fractions not as directly associated with plant production (C4), suggesting that the quality measures used here are too broad for linking organic matter substrates to individual PFAMs. In future studies, application of high resolution mass spectrometry and other highresolution methods for organic matter characterization may shed more light on whether or not there are such direct linkages.

Discussion
High terrestrial inputs to emerging ponds In this study, we assessed how DOC quality and potential for microbial organic carbon degradation change along the ontogenetic development of permafrost thaw ponds in a subarctic site exposed to on-going permafrost thaw. This natural gradient in thawing conditions provides a more realistic strategy for studying the impact of climate warming on permafrost compared to widely used short-term laboratory incubations (Monteux et al. 2018). The studied ponds were in close proximity to each other, eliminating the possible impact of many other environmental factors such as vegetation, weather, and geology, thus, ensuring that the observed differences are most likely due to the different stages of permafrost thaw and pond development. The ontogenetic stage of the pond had a strong influence on the carbon quantity and quality, as well as on the carbon degradation potential of the microbial communities. Our proxies suggested the highest average molecular weight as well as absolute DOC and CDOM concentrations in the emerging and developing thaw ponds, likely reflecting a higher supply of fresh allochthonous carbon at the onset of permafrost thaw, but possibly also the closer physical proximity to the sediment. However, the TOC concentrations in the bottom samples from the mature ponds (Supporting Information Fig. S5), taken equally close to the sediment, were significantly lower than the TOC in the emerging ponds (Supporting Information Fig. S4), suggesting that the carbon flow from the thawing palsa had a stronger influence on the carbon concentration than the proximity to the sediment. Despite the fact that the TOC concentration suggested higher carbon flow from the thawing palsas to the younger ponds, the quality of this supposedly allochthonous organic carbon in the younger ponds did not seem to adhere to the regular characteristics of highly chromophoric humic substances, for example, no significant differences in SUVA 254 could be observed compared to the old ponds. Also, S 289 did not vary across the age gradient, which could suggest that the differences according to developmental stage were moderate. Previously both proxies have been shown to differentiate, for example, the carbon source in thaw ponds compared to bedrock and tundra ponds (Wauthy et al. 2018), but it is possible, that these proxies are not sensitive enough to capture the variation in the carbon quality within this group of thaw ponds. Nevertheless, the higher sum of terrestrial humic components C1-C3 in the younger thaw ponds, although not significant, supported the assumption that there is stronger terrestrial input of organic compounds in the early stages of permafrost thaw.
Among the PARAFAC components, C1 and C3 have both been previously associated with humic materials of terrestrial origin, but in our data, they responded in contrasting ways to the aging of the ponds (Fig. 2). This has previously been explained by their interrelationship where one of these two components could be a product from the biological transformation of the other (Jørgensen et al. 2011). Furthermore, the higher content of fulvic (FI) and humic (C4) acids of microbial origin in mature thaw ponds suggests that this material has experienced longer exposure to microbial degradation and accordingly feature diagenetic changes in DOM quality. Because of the microbial origin of C4, we cannot determine if this material is from algal or terrestrial sources (Wauthy et al. 2018). It should however be noted that coupled to microbial degradation, photochemical transformation of terrigenous and autochthonous DOM is expected to be very important in thermokarst systems (Cory et al. 2013). Thus, this may also affect the DOM pool quality particularly in the more exposed mature thaw ponds. Consistent with a previous study, the amino acid-or protein-like component C5 featured very low values in all thaw ponds, highlighting the overwhelming inputs of terrestrial humic and fulvic acids from the degrading permafrost (Wauthy et al. 2018).
Bacterial, viral, and fungal contributions to carbon degradation potential The degradation potential was linked to various bacterial taxa. In general, the taxonomic composition of the microbial communities in the thaw ponds was in agreement with previous studies in peatland environments, with high contribution of Proteobacteria, Bacteroidetes, Verrucomicrobia, and Actinobacteria to organic matter degradation potential (Tveit et al. 2013;Ivanova et al. 2016;Wurzbacher et al. 2017;Woodcroft et al. 2018). Bacteroidetes was a major contributor to degradation potential in all age categories and different layers of the ponds. This agrees with a previous study on carbon degradation in Arctic peat soils (Tveit et al. 2013). Bacteroidetes are also well documented to participate in carbon degradation in other environments, such as anoxic marine sediments (Julies et al. 2010) and rumen (Solden et al. 2017). In the anoxic layer of the mature ponds, Chlorobi appeared to contribute to degradation potential. Typically, members of Chlorobi are associated with photoautotrophic energy metabolism and this is most likely also the case here as the Chlorobi represented the typical phototrophic taxa. Recently, Chlorobi was also found to have a major contribution to nitrogen fixation potential in the same thaw ponds studied here (Fernandez et al. 2019), and members of this phylum are also known to carry potential for degradation of aromatic compounds (Fuchs et al. 2011). While Chlorobi was specific for the anoxic layer, also other bacterial taxa contributed to the higher degradation potential in the anoxic compartment.
We could not classify any of the found degradation marker to a specific viral taxon, but for several PFAMs, there was a clear viral signal. A previous study, specifically addressing the viral community, showed that in one of the mature ponds (SAS2A), chloroviruses, typically infecting algae, and myoviruses, which are bacteriophages, were dominating the viral community (Lévesque et al. 2018). The authors also suggested that the DOC concentration was the primary driver of viral community composition and diversity, further linking viruses to carbon cycle. In fact, viruses have been suggested to impact the carbon cycle in aquatic ecosystems (Worden et al. 2015). A recent study from a similar permafrost thaw site in northern Sweden showed that in peat, viruses contributed directly to carbon cycling by carrying genes for enzymes that degrade plantderived polymers, as well as indirectly by infecting major carbon degrading bacteria (Emerson et al. 2018;Trubl et al. 2018). Furthermore, an enzyme belonging to GH5 (PF00150), and found from a virus, was tested and found to be fully functional (Emerson et al. 2018). These results suggest that viruses influence carbon degradation in more complex ways than just via host lysis (Emerson et al. 2018). The same PFAM PF00150 was found also in our study to have viral contribution suggesting that viruses could have a direct as well as indirect role in carbon degradation also in the ponds investigated by us.
Our data also included several PFAMs that appeared to be of fungal origin. Studies from northern peatlands have shown a significant influence of fungi in carbon cycling (Thormann 2006) and in the tundra, rising temperature has been shown to increase the abundance of fungi (Xue et al. 2016). Here, one of the fungal taxa found was Mucoromycota, which are known to participate in cellulose degradation in, for example, agricultural soil (Koechli et al. 2019). However, for both, viruses and fungi the databases lack behind, limiting the taxonomic information available and further studies are needed to better map the taxonomic identities of viral and fungal organisms in permafrost environments.
The impact of pond age and water layer to carbon degradation potential The potential of the microbial community for degradation of complex organic compounds was higher in the younger ponds (Fig. 4), which was in concordance with what was observed for the carbon quality. Individual PFAMs with significantly differential abundances across the developmental stages were also typically more abundant in emerging and/or developing ponds. Additionally, PFAMs with significantly higher abundances in the younger developmental stages included enzymes related to degradation of carbohydrate compounds likely originating from terrestrial production, such as plant polymers (e.g., polysaccharides). For example, PF00150, referring to GH5, was abundant in the surface layer of all the ponds, while still significantly more abundant in the younger stages than in the mature ponds (Fig. 4B). This family of enzymes includes a variety of substrate specificities, such as endoglucanases (cellulase), exoglucanases, and high specificity xyloglucanases. These enzymes target polymers typically found in plant tissue, which is likely to represent a major component of the bulk permafrost organic carbon (Schirrmeister et al. 2011). Accordingly, this cellulase family has been suggested to be the most abundant type of cellulases in arctic soil (Tveit et al. 2013). Enzymes belonging to GH5 are found widely distributed in archaea, bacteria, fungi, and plants (Davies and Attia 2018). This wide taxonomic range was also seen in our results, as organisms hosting PF00150 included alphaproteobacterial orders Rhodospirillales and Rhizobiales, but also fungal taxon Mucoromycota had a high contribution to GH5 abundance, as well as viruses.
For the mature ponds, we also compared the carbon degradation potential between the surface waters and the deeper anoxic layer. Overall, the potential for carbon degradation was higher in the anoxic water masses. One option for the higher potential could be input of dead cell material and their DNA from the upper layers of the ponds. However, the taxonomic analysis of the microbial community in the bottom water of the ponds showed taxa typical for anoxic environments (Wurzbacher et al. 2017) suggesting that the impact of possible sinking organisms from upper layers is minor and likely not affect the functional analysis. Other options for the higher degradation potential in the bottom water include the impact of higher total carbon concentration in the bottom layer (Supporting Information Fig. S5), or the fact that the prevailing conditions at the bottom with low oxygen concentration and low irradiance are limiting the autotrophic processes and together with the abundance of carbon substrates, the environment is more favorable for heterotrophic metabolism. We do not have data on differences in organic carbon quality between the upper and lower water masses, but in the literature anoxia and increasing depth have been linked with increased levels of more recalcitrant carbon compounds (Sachse et al. 2001;Jiao et al. 2010). SUVA 254 and S 289 values from thaw ponds located 40 km east of the sites we studied have also revealed that carbon in the hypolimnion is highly chromophoric and depleted of fulvic and humic acids from autochthonous production (Roiha et al. 2015).
For PFAMs, there was no clear trend toward higher potential for degradation of more recalcitrant compounds in the anoxic layer, but rather a more general difference between the compartments, where the same functional capacity was present in both layers but mediated by different enzyme groups. For example, while both compartments had several glycoside hydrolase families, those were clearly different in their taxonomic affiliations between oxic and anoxic environments. Nevertheless, certain genes related to degradation of more recalcitrant compounds were more prevalent in the bottom layer. An example of this is laccase which was highly present in the anoxic layer and related to, for example, Candidate Phylum Omnitrophica. This is a fairly recently established bacterial phylum for which functional information is still scarce. Interestingly, members of this phylum have been suggested to carry potential for methane oxidation (Momper et al. 2017), nitrogen fixation, and magnetotactic ability (Kolinko et al. 2015). Here, we add to this list a capacity for organic carbon degradation.
Another enzyme worth mentioning is trehalase that was also more abundant in the younger ponds. Trehalose is a compound that is used by organisms to survive in subzero conditions, but it may also serve as a source of energy and carbon (Strom and Kaasen 1993;Elbein et al. 2003). In arctic soil, trehalase has been shown to be more abundant in the active layer than in the permafrost (Yergeau et al. 2010), suggesting use of trehalose as a substrate as the environments warms up. In our study, the highest potential for trehalose degradation was found from the early stages of the permafrost thaw suggesting that organisms in permafrost may have depended on this cryopreservant to survive the extended frozen conditions.

Conclusions
Our study shows that both the DOC quality and the microbial potential for carbon cycling shift over time as the permafrost thaws and the ponds develop from small emerging puddles to deep and stratified mature ponds. We show that the carbon in the early stages of pond formation has a strong terrestrial signal and that while the majority of the degradation potential is in the bacterial community, also eukaryotes and viruses have a role in gauging the permafrost carbon to the active food web. Our results highlight the dynamic nature of the thaw ponds, which is reflected in both abiotic and biotic characteristics of these ponds. More studies are needed in order to show how many of the observed processes are in fact active in these ponds, as well as more detailed characterization of the carbon compounds found in these waterbodies. However, our results are an important first step in disentangling the relationship between carbon quality and microbial processes.