Drivers and projections of ice phenology in mountain lakes in the western United States

Climate change is causing rapid warming and altered precipitation patterns in mountain watersheds, both of which influence the timing of ice breakup in mountain lakes. To enable predictions of ice breakup in the future, we analyzed a dataset of mountain lake ice breakup dates derived from remote sensing and historical downscaled climate data. We evaluated drivers of ice breakup, constructed a predictive statistical model, and developed projections of mountain lake ice breakup date with global climate models. Using Random Forest analysis, we determined that winter and spring cumulative snow fraction (portion of precipitation falling as snow) and air temperature are the strongest predictors of ice breakup on mountain lakes. Interactions between precipitation, cumulative winter air temperature and lake surface area indicate that shifts in air temperature and precipitation affect smaller lakes (< 2 km2) more than larger lakes (> 2–10 km2). A linear mixed effects model (RMSE of 18 d), applied with an ensemble of 15 global climate models, projected that end‐of‐century ice breakup in mountain lakes will be earlier by 25 ± 4 and 61 ± 5 (mean ± SE) days for representative concentration pathways 4.5 and 8.5, respectively.

Anthropogenic changes to the climate are resulting in drought and reduced snowpack in mountainous regions of the world (Dai 2013;IPCC 2014;Mann and Gleick 2015;Pepin et al. 2015). In the mountains of the western United States, warm, high-magnitude precipitation events during winter are expected to become more common (Dettinger 2011;Dettinger et al. 2015). Climate projections show an increase to the frequency of "snow drought," when total precipitation may increase or remain similar but will fall as liquid rather than snow (Harpold et al. 2017a), coupled with earlier and shorter snowmelt runoff (Clow 2009;Hidalgo et al. 2009;Harpold et al. 2012). These changes have important implications for the ecology of mountain lakes because climate affects the phenology of ice-cover on lakes (Caldwell et al. 2020). Ice breakup date modulates lake heat budgets (Smits et al. 2020), nutrient concentrations (Preston et al. 2016;Sadro et al. 2018), primary production, zooplankton biomass (Park et al. 2004;Parker et al. 2008), and fish reproduction and energetic efficiencies (Farmer et al. 2015;Caldwell et al. 2020). Developing predictions of ice breakup dates for mountain lakes will help support evaluation of ecological responses to climate change in these systems (Sánchez-López et al. 2015;Preston et al. 2016). However, in order to develop these predictions for lakes which are heterogeneous in size and landscape position, a quantitative understanding of the factors regulating ice breakup in mountain lakes is needed.
Predicting ice breakup for mountain lakes requires quantifying environmental drivers that predict the efficacy of ice and snow melt. In theory, these are identified from thermodynamic principles (Brown and Duguay 2010;Bernhardt et al. 2012). Cold content is a measure of the snow and icepacks energy deficiency (Jennings et al. 2018;Jennings and Molotch 2020). Cold content (CC) in is mathematically defined as: This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
Additional Supporting Information may be found in the online version of this article.
where CC is cold content (MJ m −2 ), c i is the specific heat of ice (2.1 × 10 −3 MJ kg −1 C −1 ), ρ s is the density of snow (kg m −3 ), d s is snow depth (m), T s is depth weighted snowpack temperature ( C), and T m is temperature of melt (0 C). For melt to occur, positive energy is first required to reduce cold content and bring the ice and snow on the lake to a uniform 0 C. Radiative and turbulent energy are positive sources that further warm and melt the snow and ice after it becomes isothermal. The amount of snow water equivalent (SWE) exerts considerable control on cold content. Smaller snowpacks have less thermal mass and lower cold content than larger snowpacks, while colder air temperatures and sublimation can also increase cold content (Jennings et al. 2018). Shortwave radiation warms and melts snowpack and lake ice (Marks and Dozier 1992) and is primarily responsible for lake ice breakup . In general, incoming shortwave radiation is affected by time of year and latitude (via solar zenith and day length), aerosols, cloud cover, and shading by adjacent topography and vegetation (Rinehart et al. 2008). Reflected shortwave radiation depends on snowpack albedo, snow metamorphism, and impurities in the snowpack (Conway et al. 1996). The presence of liquid water further moderates heat and energy loss to and from lake ice (Vavrus et al. 1996;Kirillin et al. 2012;Arp et al. 2013). A challenge for predicting ice-off dates is to determine whether the critical variables to use in modeling are those that lead to increased cold content, those that contribute to the melt, or whether a combination of variables is required. Given rapid climate change and spatial heterogeneity in mountain landscapes, the need to predict ice breakup at large spatial scales in mountain ecosystems is acute. Prediction will be facilitated if large-scale climate variables such as snowpack and temperature can serve as proxies for specific mechanisms for ice breakup processes. Air temperature coupled with geographical variables (e.g., latitude, elevation, and lake size) have been used successfully to predict ice breakup and formation for lowland and arctic lakes (Gao and Stefan 1999;Shuter et al. 2013). Shuter et al. (2013) used empirical regression models to forecast ice phenology in lakes across Canada. Their models, which relied on air temperature and lake mean depth, were less accurate in mountain regions than in lowland lakes. In individual lakes or small geographic regions, ice breakup correlated with snowpack (Sánchez-López et al. 2015;Preston et al. 2016). Runoff (Brown and Duguay 2010;Arp et al. 2013), snowpack and extreme climate events have been less useful as predictors (Vavrus et al. 1996;Duguay et al. 2003;Benson et al. 2012). A predictive model that functions across latitude, elevation, and lake size gradients is required to predict ice breakup across mountainous regions.
In this study, we identify drivers of mountain lake ice breakup suitable for regional-scale studies and use them to project future ice conditions. Our study region extends from the southern Sierra Nevada in California to the northern Cascades in Washington and northern Rockies in Idaho. Specifically, this study (1) informs modeling of ice breakup dates by quantifying mechanistic relationships between ice breakup and climate variables and (2) develops a model to project ice breakup in mountain lakes to predict climate change impacts on mountain lake systems. Across lakes, we anticipate that (1) increases in the frequency of days when air temperature is above freezing and smaller snow packs that reduce cold content, will correlate with earlier ice breakup dates; (2) ice breakup date will be more sensitive to air temperature and snowpack in smaller lakes because they are higher in elevation where snowpack could be more influential than other variables; and (3) climate change will drive earlier ice breakup at the southern end of our study region (California's Sierra Nevada) where precipitation is more variable relative to other regions in our study area Dettinger 2016;Lamjiri et al. 2018). To test these predictions, we used remote sensing Reed et al. 2009;Arp et al. 2013) to determine ice breakup dates in 41 mountain lakes (between 1 and 10 km 2 ) spanning a latitudinal gradient in the western United States. We acquired downscaled climate data to determine meteorological drivers, built a predictive model, and then projected ice breakup dates in response to climate projections through the 21st century from 15 global climate models and two emissions scenarios.

Study area and lakes
The study included mountainous areas of California, Oregon, Washington, and Idaho. The area includes the Sierra Nevada, Cascade Mountains, and northern Rocky Mountains. We selected this area because interannual variation in precipitation is higher in the south than in the north (Cayan et al. 1998) with lakes that vary in size and elevation. The Sierra Nevada extends 640 km from southern to northern California, with a peak elevation of 4421 m. The southern end of the Cascades begins at the northern end of the Sierra Nevada. The Cascades extend north and include several stratovolcanoes with a maximum elevation of 4392 m. The northern Rocky Mountains (within Idaho) begin at the eastern edge of the Cascade Mountains in Washington. Their elevation is similar to that of the Cascade Mountains but there are fewer high-elevation peaks.
Forty-one lakes across the study area were identified as usable based on size requirements of the satellite image pixel resolution, classification as a mountain lake, and annual icecover occurrence ( Fig. 1; Table S1). The National Hydrography Dataset (NHD) Hydrological Unit-8 (sub-basin scale) was filtered to determine usable lakes for our analysis using ArcGIS (ESRI 2011). First, all lakes less than 1 km 2 were removed from a shapefile because they were below the detection resolution of the satellite imagery. This size requirement removed a large proportion of mountain lakes. Elevation was extracted from each lake by merging the lake polygons with the US Geological Survey 30 m digital elevation model (DEM), while lake surface area, latitude, and watershed size were extracted from each lake from the NHD dataset. Mountain lakes were classified by elevations greater than 1500 m for the Sierra Nevada and 500 m in the Cascade Mountains and northern Rockies. A data exploration procedure that examined numerous lakes in each region determined the elevation-based filter. We found that lakes at certain elevations did not freeze relative to the general elevation in each range. For example, in the Sierra Nevada, ice formation on lakes occurs at higher elevations than lakes in the Cascade Mountains due to the differences in latitude. Lake elevations, latitude, surface area, and watershed size ranged, respectively, between 580 and 2583 m, 37.24 and 48.78 N, 1.02 and 7.97 km 2 , and 6.74 and 449.04 km 2 (Table S1).

Ice breakup dates
Ice breakup dates were quantified using MOD10A1 (Hall and Riggs 2016), a daily snow cover product from the Moderate Resolution Imaging Spectroradiometer (MODIS) on board NASA's Terra satellite. The Aqua-based, MYD10A product was not in production during our data collection period. The daily snow product returns a value (0-100) of Normalized Difference Snow Index (NDSI), cloud cover masks, and a quality assurance value from each pixel on a daily time step. This product has approximately 93% accuracy in determining NDSI in its history of use (Hall and Riggs 2007) but loses accuracy under melt conditions and heavily vegetated areas (Rittger et al. 2013). MODIS sensors require off-nadir observations with viewing zenith angles up to 65 to obtain daily 500 m images. Off-nadir viewing angles increase the ground instantaneous field of view up to 10 times. An increased ground instantaneous field of view may increase the contribution of radiance from adjacent land areas in our lake-centered focal pixels. We accounted for possible contribution of radiance from land areas onto lake pixels in two ways. First, we obtained MOD10A1 observations by selecting the best available observation for each grid cell from the MODIS/Terra Snow Cover 5-Min L2 swath data set (MOD10L2) in part to minimize large off-nadir views (Hall and Riggs 2016). Second, we inspected each lake site by overlaying the MOD10A1 image pixel grid with each lake's surface area and data extraction point (geometric center of lake). We then removed any lakes where the pixel associated with the extraction point had any portion on land (Fig. S1).
We assumed that a NDSI snow cover value of > 0.50 indicated that the lake was completely ice covered, < 0.50 indicated breakup or partially ice-covered and 0 indicated ice-free. MOD10A1 has a 0.25-km 2 pixel size; thus, we only used lakes that had a 1-km 2 surface area to ensure that multiple pixels fell within the lake (Reed et al. 2009;Arp et al. 2013). We downloaded daily image values from the geometric center pixel in each lake from winter 2002 (mission start of MODIS satellite) to winter 2017 using Google Earth Engine (GEE; Gorelick et al. 2017). In addition to dates collected by remote sensing, we visually determined ice breakup date for Castle Lake, California through a record of daily images captured each day by a pre-programed digital camera (Fig. 1). Because the area of Castle Lake was 0.20 km 2 (below minimum surface area for MODIS), we did not use it in validation of remotely sensed lake ice breakup; however, we did use it to validate our predictive model. Ice breakup date was determined as the day that lake surface was 100% liquid and exposed to air (as observed by satellite imagery). When cloud cover obscured a lake's ice breakup date by more than 10 d, we did not use data for that lake in that year. The middle date between the last observed ice on and most recent observed ice breakup was taken from observations that were obscured by fewer than 10 d of cloud cover. On average, clouds obscured ice breakup by 6 AE 1 d (mean and SE).

Historical downscaled climate data
Downscaled climate data from each lake and all years (2001-2017) was acquired from the GRIDMET climate data set, using GEE to extract data (Abatzoglou 2013). GRIDMET is a gridded surface meteorological dataset that covers the continental United States on a 4 km × 4 km grid size. The data set includes minimum and maximum temperatures, precipitation accumulation, downward surface shortwave radiation, wind velocity, and relative and specific humidity on a daily time step. GRIDMET has been previously used to determine snowpack for mountains in the western United States (Harpold et al. 2017b). Cloud cover is not a specific variable included in the GRIDMET data set but is gathered from the National Land Data Assimilation System (NLDAS; Mitchell et al. 2004) to develop downward surface shortwave radiation within the (> 1 km −2 ), elevation (> 1500 m for Sierra Nevada and > 500 m for cascades), if they annually freeze, and if MODIS pixels accurately fit the shape of the lake (see Fig. S1). Black circles are lakes where ice breakup was determined from MOD10A1, the larger red triangle (Castle Lake, California) was visually identified.
data. The use of shortwave radiation in the mountains is limited because (1) resolution of the data is coarser than the topographic variation in complex mountain terrains and (2) cloud cover that can reduce shortwave radiation is not always accurately identified and accounted for in the data (Hinkelman et al. 2015;Bair et al. 2016).

Projected downscaled climate data
Projections of climate conditions from Global Climate Models (GCM) were used to forecast ice breakup dates through year 2099. We acquired historical projections  and future projected  climate data from 15 of 31 GCMs presented in the Fifth Assessment Report of the Intergovernmental Panel on Climate Change (IPCC 2014). California's Climate Change Technical Advisory Committee identified the subset of models that function best for the western United States (Lynn et al. 2015). We elected to use the ensemble approach because of the variability among projections for the individual GCMs.
Given these model-to-model differences, the ensemble approach gives general and robust results. The projections used here are daily maximum and minimum air temperatures and precipitation, downscaled onto 4 km × 4 km grid cells. Future projections were acquired for representative concentration pathways (RCP) 4.5 and 8.5 simulations. RCPs are atmospheric greenhouse-gas (GHG) emissions scenarios reflecting two different assumptions about future land uses, economic activities, lifestyles, energy uses, population growth rates, technology advancements, and climate policies (IPCC 2014;Lynn et al. 2015). RCP 4.5 is an optimistic emissions scenario with GHG concentrations in the atmosphere leveling off by midcentury, whereas RCP 8.5 represents a pessimistic scenario with high GHG emissions and rising GHG concentrations throughout the century (IPCC 2014). Historical values are simulations of climate under historical emissions and concentrations. The set of models used here are historical and RCP 4.5 and 8.5 simulations from the ACCESS-01, BCC-CSM1-1, CanESM2, CCSM4, CESM1-BGC, CESM1-CAM5,

Predictors of ice breakup
Mean daily temperature, snow and rain fraction (portion of precipitation that falls as snow), downward shortwave radiation, and wind speed were summed over winter and spring (1 October-31 May) and spring only (1 March-31 May) periods. We used these time periods because the former represents the period which both cold content and melt processes occur over the study area, while the latter represents the time period which melt processes typically occur over the range the study area. We used the cumulative sum of variables over other metrics to account for the total variation over the course of winter and spring (Preston et al. 2016). Energy required to melt ice and snow depends on absolute values and does not change relative to site specific means, therefore we did not standardize variables to variation from the mean. Mean daily temperature is the mean of the minimum and maximum temperature. Snow fraction was calculated from a temperature-based regression model (Dingman 2002), where precipitation falls as snow when temperature is < 0 C and rain > 6 C. Between 0 C and 6 C, precipitation is the product of the melt factor (0.167) and mean daily temperature. While wind speed is uncertain in downscaled climate models (Abatzoglou 2013), we included it because it has been used as a predictor of ice breakup in other lakes.
The Random Forest (RF) regression algorithm (Breiman 2001;Liaw and Wiener 2002) identified climatic and physical characteristics that correlate with ice breakup in mountain lakes. Random Forest is a tree-based regression tool where a subset of predictors chosen at random creates a split Fig. 3. Random Forest partial dependence plots for bi-variate interactions of surface area with snow fraction (a) and cumulative mean daily winter/ spring air temperature (CMDWT; b). The Z-axis represents relative change in the ice breakup day in response to the interactions when all other variables are held at their median value. The interactions were significant (p < 0.01) within the LME model. Smaller lakes are more effected by higher snow fraction and cooler temperatures. Table 1. Performance of linear mixed modeling sorted organized by AIC (Akaike information criteria) score. Fixed effects are listed in the Model column and include snow fraction, cumulative mean daily winter temperature (CMDWT), surface area, and downward surface shortwave radiation. All models included a random intercept term for lake identity. Four hundred and eighty observations were used for each model (only those data used in training data set). The bolded model was selected for projection. See text for rationale on model selection.

Model
Parameters ( Variable importance was ranked based on the percent increase in mean square error (MSE) when the variable was removed from the model and visually assessed using partial dependency plots. Partial dependency plots perform model iterations on a single variable and hold all others variables at their median value (Milborrow 2018). The dependent variable was the day of year (number of days from 1 January) of ice breakup. Independent variables fell into two categories: (1) climate variables, which included cumulative sums of rain fraction, snow fraction, mean daily temperature, downward shortwave radiation, and (2) static lake variables, which included surface area, elevation, latitude, delivery ratio (surface area to watershed size) and watershed area. The model was run for winter and spring periods separately and then results were compared using the percent variance explained. Final models were run with 10,000 trees, models were constructed with the R statistical software (R Core Team 2015) with package randomForest (Liaw and Wiener 2002) and partial dependence analysis was done with the package plotmo (Milborrow 2018) .

Model development and forecasting
We used linear mixed effects models (LMEM) to build a predictive model that estimates the day of the year that ice breaks up, using the lme4 package (Bates et al. 2015). The fixed effect size of each univariate predictor (β) was reported along with the standard error (SE) and p-value of each of the fixed effects. The data were first divided into a training and test set. The test data set included nine lakes. Eight of these lakes were randomly selected from the complete remote sensing data set. We included Castle Lake in the test data set because daily game camera images capture precise ice breakup. Furthermore, Castle Lake breakup dates are independent of remotely sensed data. The remaining lakes in the data set trained the model. We started our model building process with the top three predictors identified in the Random Forest algorithm. We then added lower ranked variables individually in a stepwise process to determine if they improved model performance. To create the most parsimonious model, we used only the top three variables from the Random Forest model. Fixed effects included climate and lake characteristic variables and the random intercept term was used for lake identity. Model fit was evaluated by comparing Akaike information criterion (AIC) values, log likelihoods and R 2 . Model performance was evaluated by calculating the root mean square error (RMSE) and absolute mean error (AME) between predicted and observed data using the test dataset.
The LMEM for ice breakup prediction was used with future and historical climate projections to project ice breakup dates over all lakes from 1950 to 2099 using historical conditions, RCP 4.5 and RCP 8.5. For use in our projections, we summarized temperature and precipitation data projected by GCMs in the same way as GRIDMET climate data. The "predict" function in R (R Core Team 2015) was used to project the date of ice breakup for each GCM, year and lake.
We compared projected future (to 2099) ice breakup dates from RCP 4.5 and 8.5 to projected historical  and observed current (2001-2017) ice breakup dates. Current mean ice breakup date was calculated for each lake by averaging observed (from remote sensing) ice breakup date across years. Historical mean projected ice breakup date was calculated for each lake by averaging ice breakup date for each lake and year from the 15 historical GCM projections. Annual average ice breakup date for each lake (from the 15 GCMs) was then averaged across all years  to give a lake specific historical ice breakup date. Future mean projected ice breakup date was calculated for each year and lake from the 15 GCM models. The number of days different from historical mean lake ice breakup date and current mean lake ice breakup date was calculated for each year and lake for the projected future date. Finally, the mean number of days different from historical average ice breakup date was calculated across all lakes. Mean difference among lakes for each GCM was plotted individually and as the mean of all models.
For each lake, we conducted a regression of predicted mean ice breakup day (across the 15 GCMs) relative to year. The slope of the regression line was used as the trend (number of Fig. 6. Lake specific ice breakup trend predicted by the base LMEM model for RCP 4.5 (left column) and RCP 8.5 (right column) related to latitude (a, b), elevation (b, c), and surface area (c, d). The relationship of surface area to trend of ice breakup was significant (p < 0.05 for both RCP 4.5 and RCP 8.5) but was not for latitude or elevation. Dashed lines are linear regressions for each relationship.
days per year) of ice breakup for each lake. The trend for each lake was then regressed against elevation, latitude, and surface area to explore geographic variation in mountain lake sensitivity to climate change.

Predictors of ice breakup
Random Forest was used to select variables that were important for ice breakup in mountain lakes. First, we evaluated which summation method of variables explained more variance, the winter or spring period. The Random Forest model that used the winter and spring period (1 October-31 May) explained more variance (77.2%) than the model that used the spring only period (69.4% of variance explained). The remainder of the paper focuses on the model that used the winter and spring period to evaluate ice breakup date because of better model performance, and that snow fraction (which mostly occurs during winter) was the strongest predictor of ice breakup. Random Forest identified cumulative winter/spring snow fraction, cumulative mean daily winter/spring air temperature (CMDWT) and lake surface area as the top three predictors of ice breakup (Fig. 2). Random Forest partial dependence plots showed that increasing cumulative winter snow fraction resulted in later ice breakup dates (Fig. 2b, while increasing CMDWT and surface area resulted in earlier breakup dates (Fig. 2c,d). Other variables had much smaller effects (Fig. S2).
Bivariate interactions between variables were also important in describing drivers of ice breakup. Topographic maps describing bivariate interactions generated with the Random Forest results indicate a threshold value of~500 mm of cumulative winter/spring snow fraction below which ice breakup days decrease rapidly (Fig. 3a). Lake size interacted with cumulative winter snow fraction, with smaller lakes (< 4 km 2 ) having longer periods of ice cover when cumulative winter snow fraction is > 500 mm. CMDWT also interacted with lake size to influence ice breakup date. Once CMDWT reaches approximately 500-1000 ( C), lake ice breakup date was rapidly hastened, and the effect was stronger for the smallest lakes (< 2 km 2 ).
During the stepwise linear mixed effect model (LMEM) development we further evaluated drivers of ice breakup. The top model included downward surface shortwave radiation, CMDWT, cumulative winter snow fraction, and lake surface area explained 88% of the variance. Use of only the three latter variables explained 79% of variance (Table 1). For predictive modeling, we elected to exclude downward surface shortwave radiation, because (1) obtaining high quality downward surface shortwave radiation projections from GCM models is not feasible at this time; (2) there was a relatively small improvement by inclusion of downward surface shortwave radiation in the LMEM and; (3) Random Forest showed downward surface shortwave radiation had a relatively low importance value.

Model development and forecasting
We validated our ice breakup using linear mixed effect models using the training data set. For this analysis, we used the top 3 predictors identified in the Random Forest model. These were cumulative winter snow fraction, CMDWT, and lake surface area. We used an interactive model despite snow fraction, air temperature, and lake surface area having significant effects as univariate predictors (snow fraction: β = 0.03, SE = 0.01, p < 0.01; temperature β = −0.07, SE = 0.01, p < 0.01; surface area: β = −8.73, SE = 2.71, p < 0.01), as well as bivariate and tri-variate interactions (all values p < 0.001). We explored additional co-variates by introducing them stepwise to the base model which resulted in lower Akaike information criteria (AIC) values than the base model (Table 1). Although latitude and elevation improved model fit, their contributions were not straightforward; thus, we elected to proceed with the base model using the three key variables. RMSE from model predictions using the test data set was 18 d, absolute mean error (AME) was 14 AE 2 d (mean AE standard error) and median absolute error was 10 d (Fig. 4). If Castle Lake (0.20 km 2 ) was omitted from the test data set, model accuracy increased to 15 d (RMSE) and 13 AE 1 d for AME.
Using the predictive LMEM with historical and climate projection data indicated the sensitivity of future mountain lake ice breakup dates to changes in CMDWT and cumulative winter snow fraction predicted GCMs. In 2099, the mean (AE standard error) of all GCM projections under RCP 4.5 will be 25 AE 4 d earlier and 61 AE 5 d earlier under RCP 8.5 than historically modeled  ice breakup dates (Fig. 5). In 2099, the mean (AE standard error) of all GCM projections under RCP 4.5 will be 18 AE 3 d earlier and 54 AE 2 d earlier under RCP 8.5 than current observed mean (2002-2017) ice breakup date.

Discussion
Understanding the mechanisms governing ice breakup dates in mountain lakes and developing predictive models for ice breakup are critical research goals in the face of ongoing climate change. We developed a predictive model to estimate the impacts of climate change on mountain lake ice breakup date. Projections from our models suggest that lake ice breakup dates will be earlier under both RCP 4.5 and RCP 8.5 climate scenarios and are independent of lake latitude or elevation. Our model also indicated that smaller lakes will be more affected than larger lakes. The most parsimonious model included the climate variables CMDWT and cumulative winter and spring snow fraction. The results indicate that, for mountain lakes in western North America, these climate variables, which affect the accumulation of snowpack, drive the timing of ice breakup date. The use of cumulative predictors that integrate duration, magnitude and variation of winter climate over the entire winter season suggests models should incorporate factors that capture all the variation of the climate signal. Our analysis indicates that under future climate scenarios the breakup of lake ice might occur up to 61 d earlier compared to historical means in the mountain areas in western North America by the year 2099. In the following paragraphs, we discuss and develop our findings on the predictors of ice breakup and results of our predictive model.

Predictors of ice breakup
In our Random Forest analysis, cumulative mean daily winter/spring air temperature were stronger predictors of ice breakup date than downward surface shortwave radiation, which included the effects of cloud cover (Mitchell et al. 2004;Abatzoglou 2013). Cloud cover and shortwave radiation are well documented primary controls on snow and ice melt (Marks and Dozier 1992;Marks et al. 1999;Sumargo and Cayan 2018) and moderately improved the predictions of our linear mixed effects model (LMEM). However, we elected not to include downward shortwave radiation in our predictive model because GCM predictions of cloud cover are less certain than the more certain outcome of reduced snow fraction. Furthermore, cloud cover is difficult to parameterize, particularly in complex terrain (Teixeira and Hogan 2002;Bair et al. 2016), and is not projected in the selected GCMs for our study region (Lynn et al. 2015). Our focus on snow fraction and air temperature bolsters the results from other models developed for mountain lakes (Sánchez-López et al. 2015;Preston et al. 2016).
Cumulative winter and spring snow fraction and cumulative mean daily winter and spring air temperature were stronger predictors of ice breakup in mountain lakes than latitude, elevation or incoming shortwave radiation. Cumulative winter and spring air temperature plays a direct role or is a proxy for several processes that impact mountain lake ice breakup. Snow fraction is a primary control on lake ice breakup date because it can thicken ice through the creation of "gray ice," that forms from the successive freezing and melting of snowpack (Vavrus et al. 1996;Walsh et al. 1998). Both ice and snowpack have considerable cold content and require energy to melt. Air temperature affects snow fraction and cold content by regulating the form of winter precipitation which then moderates the accumulation of cold content in snow/ice pack (Jennings et al. 2018). Snow fraction and pack which accumulates at colder air temperatures also increases cold content in the snow pack more rapidly than snow fraction that occurs closer to 0 C (Jennings and Molotch 2020). Cumulative winter air temperature also moderates snow ablation due to net radiation and turbulent heat fluxes. Mechanistic modeling of snow and ice on individual lakes would require a surface energy budget approach. This approach would require the calculation of cold content accumulation (Jennings et al. 2018;Jennings and Molotch 2020), latent and sensible heat fluxes (Cline 1997;Burns et al. 2014) and inclusion of net short and net longwave radiation (Marks and Dozier 1992;Kirillin et al. 2012;Sumargo and Cayan 2018). However, Random Forest and LMEM showed that ice breakup dates were driven by the correlative parameters that result in cold content accumulation and snow/ice melt during winter and spring.
Lake surface area improved model performance and was a strong predictor of ice breakup. We predicted that lakes with higher watershed area to lake surface area ratios (delivery ratio) would experience earlier ice breakup driven by increased runoff (Brown and Duguay 2011), but we did not detect a strong relationship. We posit that snowpack either overrides watershed influences or discharge was inadequate to cause ice breakup in the size of watersheds that were included in our data set.
Climate variables, that is, CMDWT and cumulative snow fraction, were stronger predictors than geographic descriptors. Our results contrast with results from the Pyrenees, Tetras, and Rocky Mountains where elevation and latitude were used as primary predictors of lake ice breakup (Šporka et al. 2006;Sánchez-López et al. 2015;Preston et al. 2016). In those studies, the authors suggest that latitude and elevation are representative of climatic conditions. For example, Sánchez-López et al. (2015) found that the North Atlantic Oscillation had strong effects at different latitudes and elevations. Alternatively, observations at Emerald Lake detected, using Landsat images, that 81% of the interannual variation in ice breakup date was explained by snow water equivalent (Sadro et al. 2019;Melack et al. 2021). Climate variables may be better predictors in our study because of the large spatial extent examined, where lower elevations at higher latitudes could have similar conditions as higher elevations at lower latitudes.
The smaller lakes in our dataset were more sensitive to climate change scenarios. These experienced colder air temperatures and had a higher snow fraction. Thus, ice breakup dates were later than for the larger lakes. As cumulative air temperature increases and cumulative snow fraction decreases with climate, the ice breakup date of smaller lakes will hasten more rapidly than larger lakes.
The cumulative sum of snow fraction and mean daily air temperature from the winter and spring period provided clearer identification of drivers and better predictions of ice breakup in our geographic region than the spring period used by others ( The winter and spring period snowpack and cumulative mean daily air temperature were better predictors in our region for several reasons. First, early season snow at the snow-ice interface can convert to "gray ice" that thickens and increases the cold content of snow and ice on top of a lake (Jennings et al. 2018). Second, late season snow creates both a thicker snowpack with more cold content and a higher albedo surface that reflects more shortwave radiation (Vavrus et al. 1996). Lastly, cold air temperature throughout the winter drives sublimation which results in longwave radiation emission from the snow and ice pack, which is responsible for an estimated 27% of cold content (Jennings et al. 2018). Our data show that the entire winter season's weather can influence ice breakup in mountain lakes.

Model development and forecasting
Predictions from our LMEM model were within 14 d (absolute mean error) of observed dates of ice off. In years when ice breakup was significantly earlier than the lake specific mean breakup date, the model typically predicted later ice-off than what was observed. The training data had fewer observations in this area of parameter space, which resulted in less certainty about predictions with very low snowfall. Others have experienced similar uncertainty for lowland lakes, where climate varies less and hydrological processes are simpler (Brown and Duguay 2011;Shuter et al. 2013).
We observed no clear relationship between projected trends (number of days earlier per year) of ice breakup dates with latitude or elevation. Cumulative winter snow fractions are expected to decline as more precipitation falls as liquid rather than snow due to warming. Because the frequency of storms above freezing is larger at southern latitudes when compared to northern latitudes already and should continue to be even more frequent with warming (Guan et al. 2013;Dettinger et al. 2015;Cayan et al., 2016), we anticipated that projected ice breakup dates would depend on latitude. However, among the lakes studied here, there is a strong inverse relation between altitude and latitude (r = −0.91) such that the southern lakes are higher and cooler, while the northern lakes are lower and warmer overall. Thus, altitude and latitude play conflicting roles in the projected breakup dates.
Over time, additional variables associated with snow melt processes may contribute to mountain lake ice breakup date. Eolian dust in the Rocky Mountains Clow et al. 2016) has increased (Deems et al. 2013) and continued deposition on mountain snow has increased the melt rate by reducing albedo. For instance, we expect that our model may be improved by explicitly accounting for effects of snow albedo (Conway et al. 1996;Hansen and Nazarenko 2004;Skiles et al. 2015). When climate models have the capacity to incorporate particulate loading (Oaida et al. 2015), the effects of changing albedo on ice-off dates can be included.
Accurate modeling over large geographic extents will help support management and climate-adaptation efforts for mountain lake ecosystems. The model we present projects up to a 61 d decrease in ice cover duration by the end of the century, and this will adversely affect physical (Smits et al. 2020), biogeochemical (Preston et al. 2016;Sadro et al. 2018), and ecological processes (Park et al. 2004;Parker et al. 2008;Caldwell et al. 2020). Ice breakup projections can be improved by incorporating more years of data into model training (drawing upon either satellite data or other sources) and improved projections of cloud cover and downward radiation. The use of lake specific and empirically collected albedo and cold content of the snow and ice pack data, and shortwave radiation estimates that account for local topography, will help identify mechanisms driving mountain lake ice breakup not identified here. Despite these caveats, our model results illustrate that ice breakup in mountain lakes is likely to hasten with changes in CMDWT and cumulative winter snow fraction. The predictive relations we quantified and the projections we provided, as well as the identified stronger sensitivity of small mountain lakes relative to moderately sized lakes, already provide an improved basis for planning and decision making about future climate impacts and adaptations.