Biogeographic drivers of diazotrophs in the western Pacific Ocean

The global budget of marine nitrogen (N) is not balanced, with N removal largely exceeding N fixation. One of the major causes of this imbalance is our inadequate understanding of the diversity and distribution of marine N2 fixers (diazotrophs) as well as their contribution to N2 fixation. Here, we performed a large‐scale cross‐system study spanning the South China Sea, Luzon Strait, Philippine Sea, and western tropical Pacific Ocean to compare the biogeography of seven major diazotrophic groups and N2 fixation rates in these ecosystems. Distinct spatial niche differentiation was observed. Trichodesmium was dominant in the South China Sea and western equatorial Pacific, whereas the unicellular cyanobacterium UCYN‐B dominated in the Philippine Sea. Furthermore, contrasting diel patterns of Trichodesmium nifH genes and UCYN‐B nifH gene transcript activity were observed. The heterotrophic diazotroph Gamma A phylotype was widespread throughout the western Pacific Ocean and occupied an ecological niche that overlapped with that of UCYN‐B. Moreover, Gamma A (or other possible unknown/undetected diazotrophs) rather than Trichodesmium and UCYN‐B may have been responsible for the high N2 fixation rates in some samples. Regional biogeochemistry analyses revealed cross‐system variations in N2‐fixing community composition and activity constrained by sea surface temperature, aerosol optical thickness, current velocity, mixed‐layer depth, and chlorophyll a concentration. These factors except for temperature essentially control/reflected iron supply/bioavailability and thus drive diazotroph biogeography. This study highlights biogeographical controls on marine N2 fixers and increases our understanding of global diazotroph biogeography.

The availability of nitrogen (N) controls primary productivity in large parts of the ocean (Gruber and Galloway 2008;Canfield et al. 2010). However, some global estimations of N input and loss have indicated that the oceanic N budget is unbalanced (Gruber 2008;Zehr and Kudela 2010) and that N removal by denitrification and anammox largely exceeds N 2 fixation (Codispoti et al. 2001;Galloway et al. 2004;Mahaffey et al. 2005;Codispoti 2007). Recent investigations have proposed several reasons for the possible underestimation of N 2 fixation: (1) long-term observations using the 15 N 2 tracer method underestimated N 2 fixation rates (NFRs), owing to slow and incomplete equilibration between the water sample and the 15 N 2 tracer added as a gas bubble (Mohr et al. 2010b;Großkopf et al. 2012); (2) previous studies of marine N 2 fixation mostly focused on surface tropical/subtropical oceans and ignored some atypical areas, such as coastal upwelling locations, polar or cold water zones, oxygen minimum zones, and the deep sea, where diverse diazotrophs or active N 2 fixation has been found (Cheung et al. 2016;Fernández-Méndez et al. 2016;Benavides et al. 2016b;Gradoville et al. 2017;Moreira-Coello et al. 2017;Shiozaki et al. 2017);and (3) little is understood about the diversity of N 2 fixers in the ocean, and some diazotrophic groups, such as unicellular cyanobacteria or heterotrophic prokaryotic lineages, that were previously ignored may actually play an important role in marine N 2 fixation (Montoya et al. 2004;Riemann et al. 2010;Farnelid et al. 2011;Thompson et al. 2012;Moisander et al. 2014). Diazotroph diversity and N 2 -fixing habitats need to be reassessed in terms of offsetting the gap between the oceanic N input and loss.
The nifH gene, which encodes the structural component of nitrogenase iron proteins, has been widely used to characterize diazotroph diversity and abundance (Zehr et al. 1998(Zehr et al. , 2003*Correspondence: yaozhang@xmu.edu.cn This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. Additional Supporting Information may be found in the online version of this article. Farnelid et al. 2011). Diazotrophic cyanobacteria have always been considered the major diazotrophs (Zehr and Ward 2002). The filamentous non-heterocyst-forming cyanobacterial genus Trichodesmium dominates the diazotroph communities in the tropical northwestern Atlantic Ocean Mahaffey et al. 2005;Langlois et al. 2008;Goebel et al. 2010), the western tropical South Pacific Ocean (Shiozaki et al. 2014a;Messer et al. 2016), and the Arabian Sea (Capone et al. 1998;Krishnan et al. 2007;Gandhi et al. 2011); members of this genus play a decisive role in the input of "new N" to the global ocean (Capone et al. 1997;Karl et al. 1997;Capone et al. 2005;LaRoche and Breitbarth 2005). Some heterocystforming cyanobacteria, which have symbiotic relationships with several oceanic diatom genera (Carpenter and Foster 2002;Foster et al. 2009), are usually found in estuaries and near-shore areas (Foster et al. 2007;Zehr 2011). The role of unicellular diazotrophic cyanobacteria in N 2 fixation was first characterized by  from samples recovered from the North Pacific Ocean (Station ALOHA) and was regarded to be equally important as or more important than that of Trichodesmium in the North Pacific (Montoya et al. 2004;Bonnet et al. 2009;Kitajima et al. 2009). Subsequently, several studies have demonstrated the diversity of unicellular diazotrophic cyanobacteria, which includes UCYN-A (symbiotically associated with photosynthetic eukaryotes; Thompson et al. 2012), UCYN-B (Crocosphaera; Montoya et al. 2004) and UCYN-C (the benthic cyanobacterial genus Cyanothece; Falcón et al. 2002;Langlois et al. 2005;Foster et al. 2007;Langlois et al. 2008). Multiple studies have indicated that UCYN-A is widespread in the North Pacific Subtropical Gyre (Church et al. 2005a(Church et al. , 2008Gradoville et al. 2017), the Pacific northwestern coast (Shiozaki et al. 2015a), and the oligotrophic western South Pacific Ocean (Moisander et al. 2010) and dominates in the tropical northeastern Atlantic (Benavides and Voss 2015;Benavides et al. 2016a). Notably, heterotrophic prokaryotes seem to represent a substantial part of the nifH gene libraries in most regions (Farnelid et al. 2011), but their role in local N 2 fixation is arguable when diazotrophic cyanobacteria are also present ). In the South Indian Ocean and South Pacific Gyre, heterotrophic organisms are almost the exclusively retrieved nifH phylotypes, and in situ NFRs are incredibly low (Halm et al. 2012;Turk-Kubo et al. 2014;Shiozaki et al. 2014b). There are exceptions, such as the Chilean upwelling system and the nearby South Pacific Gyre, where the diazotrophic community is dominated by heterotrophic alpha and beta-proteobacterial nifH groups, and the NFRs are high (Gradoville et al. 2017). While diazotrophs are omnipresent in marine waters, strong differences in community structure and activity between regions imply distinct biogeographies for these organisms.
The western Pacific continental margin comprises more than 75% of marginal basins in the global ocean (Tamaki 1991). Studies on diazotrophs have been carried out in the temperate waters of the Kuroshio Current offshore Japan and the Kuroshio bifurcation region of the East China Sea, where high NFRs were detected and high abundances of Trichodesmium were observed using microscopy (Shiozaki et al. 2010(Shiozaki et al. , 2015b. In contrast, the (sub)tropical South China Sea (SCS) is a semienclosed basin. Although previous studies indicated that Trichodesmium and proteobacterial groups were the two dominant nifH phylogenetic groups in the SCS, NFRs and activity of major diazotrophic groups are still unknown (Moisander et al. 2008;Kong et al. 2011;Zhang et al. 2011;Xiao et al. 2015). The Philippine Sea is almost entirely surrounded by island arcs (Uyeda and Ben-Avraham 1972). These island chains and continental land together divide the tropical and subtropical marginal seas into many relatively independent areas with distinct hydrological conditions (Wang 1999). These diverse environmental conditions make the tropical and subtropical western Pacific marginal seas an ideal system for studying the biogeography of N 2 -fixing groups and the major factors controlling it.
In this study, we performed a large-scale cross-system analysis spanning the SCS, Luzon Strait, Philippine Sea, and western tropical Pacific. Diazotroph diversity was analyzed in these ecosystems based on nifH gene clone libraries. The seven major diazotrophic groups (Trichodesmium, UCYN-A1, UCYN-B, UCYN-C, Richelia associated with Rhizosolenia [Het-1], gamma-proteobacteria, and alpha-proteobacteria) were quantified using quantitative polymerase chain reaction (qPCR) targeting nifH genes and transcripts. In all areas, NFRs were determined using a modified 15 N 2 tracing method. In this way, we addressed the spatial and temporal dynamics of diazotrophs, NFRs, and potential activity of major diazotrophic groups. Moreover, we investigated biogeographical controls on marine N 2 fixers. This study improves our understanding of global diazotroph biogeography.

Study sites and sampling
Nine stations (SCS1-SCS9) in the SCS were sampled aboard R/V Shi Yan 3 in June to July 2015 along a vertical water depth profile with one to six layers within the upper 150 m (Fig. 1). Seawater was collected for DNA-based gene analysis and NFR measurements. Twenty-nine stations in the western Pacific were sampled aboard R/V Dong Fang Hong 2 from December 2015 to January 2016. Surface water was collected from 12 stations (PS1-PS12) in the Philippine Sea and four stations (EQ1-EQ4) in the western equatorial Pacific. Water was also collected along a vertical profile with three to five depth layers within the upper 100 m from nine stations (EQ5-EQ13) in the western equatorial Pacific and four stations (WP1-WP4) along 143 E and between 1 N and 11 N (Fig. 1). Samples from the western Pacific were subjected to DNA-and RNA-based gene analysis and NFR measurements. Detailed sampling information is shown in Supporting Information Table S1. Water samples were collected using a conductivity, temperature, and depth rosette sampling system fitted with Go-Flo bottles (SBE 911 Plus, Chen et al. Pacific Ocean diazotroph biogeography SeaBird Inc.); temperature, salinity, and depth data were obtained from this system.

DNA and RNA extraction
Four liters of seawater was filtered through 0.22-μm-poresize polycarbonate membranes Pall Gelman) for DNA or RNA extraction. Samples for RNA analyses were filtered within 30 min and stored immediately in 2 mL RNase-free tubes containing 1 mL of RNAlater RNA stabilizer (Invitrogen, Life Technologies). All membranes were flashfrozen in liquid N and then transferred to −80 C until further analysis.
DNA was extracted using the phenol-chloroform-isoamyl alcohol method as described by Massana et al. (1997), with minor modifications. The concentration and purity of the genomic DNA were detected using a NanoDrop spectrophotometer (Thermo Scientific 2000/2000c). RNA was extracted using TRIzol reagent (Invitrogen, Life Technologies), following the manufacturer's protocols but with minor modifications. DNA was digested using a Turbo DNA Free Kit (Ambion), and contamination was checked by amplifying the bacterial 16S rRNA genes with the universal primers 27F and 1492R. The nifH mRNA was reverse transcribed to synthesize complementary DNA (cDNA) using a SuperScript III First-Strand cDNA synthesis kit (Invitrogen, Life Technologies) with the specific reverse transcription (RT)-primers nifH2 and nifH3 , following the manufacturer's specifications. No-RT reactions were also carried out to check residual nifH DNA contamination as described by Moisander et al. (2014).

nifH gene clone libraries and phylogenetic analysis
Four nifH gene clone libraries were constructed based on the pooled DNA samples from the SCS, the Philippine Sea, the western equatorial Pacific, and Sta. WP1-WP4 (Supporting Information Table S1, see Supporting Information for sampling sites and depths of the clone libraries). A two-step nested PCR strategy was applied to amplify a 359-bp region of nifH genes , with minor modifications. The first round of PCR contained 25 μL of Premix Ex Taq (Takara Bio Inc.), 1 μL of the nifH primers nifH3 and nifH4 (100 μmol L -1 ), 22 μL of double-distilled H 2 O, and 1 μL of DNA template. The reaction program consisted of an initial 5 min of denaturation at 95 C, then 31 cycles of 1 min at 95 C, 1 min at 57 C, and 1 min at 72 C, and a final elongation step of 10 min at 72 C. The second round of PCR was run with the primers nifH1 and nifH2 and 1 μL of first-round PCR products as template. The reaction program was the same as that for the first-round PCR with the exception of an annealing temperature of 54 C. Triplicate PCR products were visualized by gel electrophoresis, pooled, and purified using an agarose gel DNA purification kit (Takara Bio Inc.). Amplicons were cloned into pMD18-T vector (Takara Bio Inc.) and then transformed into competent cells of Escherichia coli DH5α. Plasmid DNA was isolated from individual clones, purified, and sequenced.
A total of 443 clone sequences were recovered and edited using BioEdit software (Hall 1999). All sequences were manually checked for sequencing errors and chimeras. Nucleotide sequences were clustered into operational taxonomic units (OTUs) with a cut-off value of 0.03 using Mothur software (Schloss et al. 2009). Representative nucleotide sequences were analyzed with the BLASTN tool to obtain the most similar reference sequences. Maximum-likelihood phylogenetic trees were constructed in MEGA4 using Jones-Taylor-Thornton model with translated amino acid sequences. Sequences from this study are available from NCBI (GenBank accession numbers MH144398-MH144546 and MH938844-MH939137).

Quantification of nifH genes and transcripts
Seven major diazotrophic groups were quantified via TaqMan qPCR assays. Specific primer and probe sets reported by Church et al. (2005a,b), Foster et al. (2007), and Moisander et al. (2008) were used for Trichodesmium, Richelia associated with Rhizosolenia (Het-1), UCYN-A1, UCYN-B, UCYN-C, and Gamma A. A new qPCR primer and probe set targeting the sequence Alpha-MH144511 was designed (forward primer: 5 0 -ACGGCGCCTAC GAGGATATCGATT-3 0 , reverse primer: 5 0 -CTGCGCCTTGTTCT CGCGGAT-3 0 , and probe: 5 0 -ACGTGCTGGGCGACGTTGTCTGC-3 0 ) using Oligo 7 to quantify the nifH gene of the dominant diazotrophic alpha-proteobacterial OTU in this study. The cross reactivity of the new primer-probe set was checked using a dilution series of plasmids containing inserts matching all of the other primer and probe sets. The thermocycling conditions and reaction mixtures for qPCR followed Zhang et al. (2011), with slight modifications. Each 20-μL reaction mixture contained 10 μL of Premix Ex Taq (probe qPCR; Takara Bio Inc.), 400 nmol L -1 of fluorogenic probe, 400 nmol L -1 each of the forward and reverse primers, and 1 μL of environmental DNA or cDNA. Triplicate qPCRs were run for each environmental DNA/cDNA sample and for each standard on a CFX96 Real-Time System (Bio-Rad Laboratories). The thermocycling conditions were 50 C for 2 min, 95 C for 2 min, and 45 cycles of 95 C for 15 s, followed by 60 C for 1 min. Negative controls without template were also included to test for contamination. A standard curve was made using serial dilutions of quantified, linearized plasmid standards for each qPCR run. The amplification efficiencies of PCR were always between 90% and 100%, with R 2 values >0.99. The quantification limit of the qPCR reactions was two nifH gene copies per reaction according to the estimates from maximum Ct-values generated by quantifiable samples. In samples where triplicate measurements produced amplifications, signals were noted as quantifiable (Stenegren et al. 2018). All qPCR products were sequenced to confirm that the expected target was amplified. Inhibition tests were performed by five-fold serial dilutions of all samples and by mixing environmental samples and positive controls. Based on these tests, we concluded that our samples were not inhibited.

Nitrogen fixation rates
The NFRs were measured using the dissolution method (Mohr et al. 2010b;Lu et al. 2018) to avoid the problems of bubble dissolution. 15 N 2 gas (98.9%) from Cambridge Isotope Laboratories was used, and a blank was checked for contamination of bioavailable non-N 2 15 N, as described in Dabundo et al. (2014). Briefly, triplicates of 10 mL of natural seawater and 2 mL of 15 N 2 gas were injected into 20-mL headspace vials, which were then sealed with a septum stopper and shaken overnight. The δ 15 N of total dissolved N was measured and compared with the δ 15 N of natural seawater. The δ 15 N values (4.7‰ for total dissolved N and 5.0‰ for natural seawater) indicated no contamination. The 15 N 2 -enriched seawater was prepared on board according to Shiozaki et al. (2015a). First, 0.2 μm-filtered seawater was degassed using a Sterapore membrane unit (20M1500A: Mitsubishi Rayon Co., Ltd.) in which the seawater flowed on the inside of the membrane, and a vacuum (−960 mbar, water jet pump) was applied to the outer surface of the membrane. The seawater flow rate was approximately 500 mL min −1 (recirculation period of 10 min). Then, 500 mL of degassed seawater was stored in an acidcleaned, gas-tight plastic bag and 15 N 2 gas was added at a ratio of 10 mL of 15 N 2 per 1 L of seawater. The bags were gently tapped until the gas dissolved completely. After complete dissolution, 10 mL of 15 N 2 -enriched seawater was injected into a 20-mL headspace vial, which was pre-urged with ultrapure helium for ≥ 5 min and was sealed with a septum stopper, and then was stored at 4 C until determining the percentage of 15 N 2 in the laboratory using a gas chromatography isotope ratio mass spectrometry (IRMS; Thermo Fisher Delta V Plus). During seawater sampling, triplicate polycarbonate bottles (1.2 or 4.5 liters) were filled to capacity to avoid contamination from air and capped with septa. Two hundred milliliters of 15 N 2 -enriched seawater was added into the polycarbonate sample bottles, which were thoroughly acid washed, rinsed with Milli-Q water, and then rinsed three times with water from the target collection depth to minimize potential trace metal contamination. The incubation bottles were covered with a neutral-density plastic screen to simulate in situ irradiance levels and incubated for 24 h in on-deck incubators that were flushed with surface seawater connected to a cooling system to maintain approximate in situ temperatures.
After incubation, water samples were gently filtered (< 200 mm Hg) onto precombusted (450 C, 4 h) 25-mm-diameter 0.7-μmpore-size glass fiber filters (Whatman). The filters were frozen at −80 C until they could be dried in an oven at 50 C overnight. The use of a pore size of 0.7 μm could have allowed a few bacteria through the filters, although previous studies have shown that Gamma A is present in the > 3 μm size fraction (Benavides et al. 2016a), and diazotrophic alpha-proteobacteria can form cell aggregates to facilitate N 2 fixation in the presence of O 2 (Martínez-Pérez et al. 2018). The particulate organic N concentrations and isotopic values were analyzed on a Flash elemental analyzer (Thermo Fisher Flash HT 2000) coupled to an IRMS (Thermo Fisher Delta V Plus). International reference material (USGS40) with a different amount of N and a certified δ 15 N value of −4.5‰ was analyzed every eight samples to check the instrument drift and to ensure the accuracy of the measurements. The reproducibility of δ 15 N measurements was better than 0.3‰. The equations proposed by Montoya et al. (1996) were used to calculate the NFRs. The detection limit for NFRs was approximately 0.03-0.3 nmol N L −1 d −1 , which was estimated by taking 4‰ as the minimum acceptable change in the δ 15 N of particulate N (Montoya et al. 1996).

Regional biogeochemistry
The annual aerosol optical thickness at 862 nm and monthly satellite-derived chlorophyll a (Chl a) concentrations at a 4-km resolution were obtained from NASA's OceanColor Web (https:// oceancolor.gsfc.nasa.gov/) for 2015. The monthly mixed-layer depth at a 0.5 lateral resolution was obtained from the U.S. National Oceanic and Atmospheric Administration online database (https://www.pmel.noaa.gov/mimoc/). The daily gridded (1/4 × 1/4 on a Cartesian grid) absolute dynamic topography and absolute geostrophic velocity were obtained from the Copernicus Marine and Environment Monitoring Service website (http://marine.copernicus.eu/) for 2015. The monthly sea surface temperature data on a 1 × 1 latitude-longitude grid were obtained from the National Oceanic and Atmospheric Administration's Earth System Research Laboratory website (https://www. esrl.noaa.gov/psd/data/gridded/data.noaa.oisst.v2.html) for 2015 and 2016. Figures were produced using Ocean Data View v. 4.7.10 and Python 2.7 (Supporting Information Fig. S1). From these public datasets, the data points with coordinates closest to our oceanographic stations were selected for further analysis.

Statistical analyses
The Pearson correlation coefficients were calculated from the relative contributions of seven diazotroph groups to the total number of nifH gene copies summed in a given sample using the program R. Differences in the nifH gene transcript level between day and night for each diazotrophic group were tested using the nonparametric Wilcoxon tests, as the normal distribution assumption was not always met for the individual datasets. Redundancy analysis based on the qPCR data was used to analyze the variations in the targeted diazotroph communities under biogeochemical constraints with the vegan package in R. The qPCR-based relative abundances and the environmental factors were normalized via Z transformation (Magalhães et al. 2008). The null hypothesis, specifically, that the community was independent of environmental parameters, was tested using constrained ordination with a Monte Carlo permutation test (999 permutations).

Phylogenetic diversity of diazotrophs
Rarefaction analysis of clone libraries showed that the observed diversity of the nifH gene was not exhaustive in all Chen et al.
Pacific Ocean diazotroph biogeography of the studied regions, and the richness of the nifH gene was higher in the SCS than in the western equatorial Pacific (Supporting Information Fig. S2). Phylogenetic analysis indicated that all nifH gene sequences grouped into four clusters (I, II, III, and IV), which were defined by Zehr et al. (2003) (Fig. 1). The SCS library was dominated by diazotrophic proteobacterial nifH phylotypes, accounting for more than 88% of the total clone sequences. Approximately 30% of all sequences grouped with sequences EU052413 (Moisander et al. 2008), HQ586455 (Zhang et al. 2011), and LC013636 (Shiozaki et al. 2015a), which belong to the same amino acid sequence-based OTU, namely, Gamma A (Langlois et al. 2015). The two most dominant nucleic acid sequence-based OTUs in the diazotrophic alpha-and beta-proteobacterial groups were sequences MH144511 (~18%) and MH144544 (~17%). The diazotrophic cyanobacteria Trichodesmium, UCYN-A1, and UCYN-C were also retrieved from the SCS library and accounted for only~4% of the total clone sequences.
In contrast, the nifH gene libraries from the Philippine Sea (~40%) and the western equatorial Pacific (24%) were dominated by Trichodesmium. The sequence MH938851, belonging to the alpha-and beta-proteobacterial groups, accounted for~15% (Philippine Sea) and~11% (western equatorial Pacific) of the total clone sequences in the two clone libraries; the sequences belonging to Gamma A accounted for~9% (Philippine Sea) and~15% (western equatorial Pacific). Gamma A sequences, accounting for~34%, were also abundant in the nifH gene library from four stations (WP1-WP4) along 143 E and between 1 N and 11 N. However, no diazotrophic cyanobacterial sequences were retrieved from the nifH gene pool of Sta. WP1-WP4. In addition, a dominant gamma-proteobacterial sequence, MH938954, was retrieved from both libraries of the western equatorial Pacific and Sta. WP1-WP4, accounting for~33% and 39% of the total sequences, respectively (Fig. 1). Taken together, these results indicated that Trichodesmium and Gamma A were the two major diazotrophic groups in the nifH gene clone libraries across the studied regions.

Distribution of major diazotrophic groups
As the major diazotrophic groups based on the library analysis, Trichodesmium and Gamma A were further quantified via qPCR. In addition, to clarify the distribution of major diazotrophic groups, the unicellular cyanobacteria UCYN-A1, UCYN-B, and UCYN-C and the heterocystous cyanobacteria (Het-1) were also quantified as they have been considered important in previous studies (Church et al. 2005a,b;Taniuchi et al. 2012;Turk-Kubo et al. 2015;Bonnet et al. 2016Bonnet et al. , 2018Stenegren et al. 2018). Between the two dominant diazotrophic alpha-and beta-proteobacterial phylotypes, Alpha-MH144511 was chosen for quantification using the primer and probe set designed in this study. We also quantified UCYN-A2, Richelia associated with Hemiaulus hauckii (Het-2), and Calothrix symbionts of Chaetoceros (Het-3), but their abundances were very low (see Supporting Information Table S2); thus, they were excluded from subsequent analyses.
In contrast, UCYN-B was the most abundant nifH group in the Philippine Sea (Sta. PS1-PS12) surface water (9.6 × 10 3 to 5.9 × 10 5 nifH copies L −1 ), with the relative nifH gene abundances ranging from 36.9% to 93.8% of the total number of copies. Along 143 E and between 1 N and 11 N, Trichodesmium, gamma-proteobacteria, and Het-1 codominated at the low-latitude Sta. WP1 and WP2, and UCYN-B generally dominated at the relatively high-latitude Sta. WP3 and WP4. However, overall, the total nifH gene abundances were distinctly lower at these four stations as well as at the EQ9-EQ13 stations (Fig. 2). Notably, Gamma A was widespread throughout the SCS and the western Pacific Ocean, with higher nifH abundances (3.9 × 10 3 to 4.6 × 10 4 nifH copies L −1 ) in Philippine Sea surface water, although they were not dominant in most samples. Het-1 was also widely observed, but its nifH gene abundances were one order of magnitude lower than those of Gamma A in the SCS and Philippine Sea and comparable to those of Gamma A in the western equatorial Pacific (Fig. 2). UCYN-C nifH genes were detected in only one-third of the samples from the Philippine Sea and western equatorial Pacific and generally showed abundances one order of magnitude lower than those of Het-1, with the exception of Sta. PS1, PS2, and EQ7 (up to 1.0 × 10 5 nifH copies L −1 ) where their nifH gene abundances were comparable. In the SCS, the UCYN-C nifH gene abundances were also comparable to the Het-1 nifH gene abundances (Fig. 2).

Distribution of diazotroph transcripts
The nifH gene transcript abundances of the seven targeted major diazotrophic groups were quantified via RT-qPCR. The results showed regional differences that were similar overall to those for the gene distribution but were more variable (Fig. 3a). The nifH transcripts of Trichodesmium were retrieved from a 0-to 30-m water depth (with the exception of 75 and 100 m at Sta. WP4) at 18 of the 29 stations sampled, and the abundances ranged from 59 to 1.1 × 10 4 nifH transcripts L −1 .

Chen et al. Pacific Ocean diazotroph biogeography
The UCYN-B nifH transcripts were retrieved mainly from the surface water (17 of 29 stations), and the abundances ranged from 64 to 4.7 × 10 4 nifH transcripts L −1 . In general, Trichodesmium had higher relative abundances of nifH transcripts in the total number of transcripts summed across the seven major groups in the western equatorial Pacific region (Sta. EQ1-E13) than did other diazotrophs, whereas the relative abundances of UCYN-B nifH transcripts were higher in the Philippine Sea (Sta. PS1-PS12) and the western Pacific Sta. WP3 and WP4 (Fig. 3a).
Notably, the nifH transcripts of Gamma A were the most widespread, retrieved from a 0-to 30-m water depth at 22 of 29 stations sampled; the abundances ranged from 60 to 3.2 × 10 3 nifH transcripts L −1 . In general, higher relative abundances of Gamma A nifH transcripts in the total number of transcripts were found in the western equatorial Pacific region (Sta. EQ5-EQ13 and WP1) than in the other regions (Fig. 3a). The nifH transcripts from Het-1 were found at 15 stations, with abundances ranging from 68 to 1.3 × 10 3 nifH transcripts L −1 ; Het-1 dominated only four pools of nifH transcripts, specifically, in the surface water of Sta. PS5 and EQ4 and at a 75-to 100-m water depth at EQ5 and EQ7 (Fig. 3a). The nifH transcripts of UCYN-A1 (100 transcripts L −1 at the surface) were detected only at Sta. PS1 (in the Luzon Strait). The nifH transcripts of UCYN-C were detected only at Sta. EQ8 (142 transcripts L −1 at 75 m) and EQ10 (130 transcripts L −1 at the surface). The nifH transcripts from Alpha-MH144511 were not retrieved from any of the western Pacific samples. Overall, Trichodesmium, UCYN-B, and Gamma A were the most abundant diazotrophic groups in the nifH transcript pools in the western Pacific.
The nifH cDNA concentrations were normalized to the nifH gene copy concentrations to compare the number of nifH transcripts per gene copy, a proxy for relative activity (Church et al. 2005b;Zehr et al. 2007), among various diazotrophic phylotypes (Fig. 3b). Notably, the relative activity of Trichodesmium was distinctly lower than that of UCYN-B, Gamma A, and Het-1 in the western equatorial Pacific region (Sta. EQ3-EQ12), although the nifH gene and transcript abundances of this genus were higher. In contrast, Het-1 had a higher relative activity than that expected from its nifH gene or transcript abundances across all regions; its relative activity was even the highest among the targeted diazotrophic groups at some stations. In addition, the highest relative activities of Trichodesmium, UCYN-B, and Het-1 all occurred at the western Pacific Sta. WP3 and WP4.
(a) Fig. 3. nifH (a) transcript abundance distribution and (b) transcript abundances normalized to gene abundances of the seven targeted major diazotrophic groups in the western Pacific Ocean obtained by quantitative RT PCR. The depth-integrated mean nifH transcript abundance in a given depth zone is plotted in (a). The depth-integrated mean nifH transcript abundance normalized to the depth-integrated mean nifH gene abundance in a given depth zone is plotted in (b). Crosses indicate sampling stations.

Diazotroph community structure
There were distinct differences in diazotroph community structure between the clone libraries and qPCR assays. Most notably, the UCYN-B and Het-1 nifH gene sequences were not retrieved by the clone libraries, while they were abundant diazotrophic phylotypes in the qPCR assays from the studied regions. In particular, the UCYN-B qPCR-based abundance was highest among the targeted nifH groups in the Philippine Sea; however, Trichodesmium was the most dominant nifH phylotype in the clone library. In addition, Trichodesmium was a dominant nifH group in the SCS based on qPCR assays, but in the nifH gene clone libraries, the dominant sequences belonged to heterotrophic proteobacteria (Gamma A and Alpha-MH144511). Similarly, almost all nifH sequences obtained from the clone library belonged to proteobacteria at Sta. WP1-WP4. We speculate that there was an amplification bias generated when performing "nested" PCR. A previous study indicated that Gamma A can be preferentially amplified in PCR libraries (Turk et al. 2011). Amplification bias may have caused the role of UCYN-B in the western Pacific to be ignored in previous studies that used clone libraries or high-throughput sequencing (Moisander et al. 2008;Zhang et al. 2011;Xiao et al. 2015;Shiozaki et al. 2015a).

Spatial niche differentiation of major diazotrophic groups
There was distinct large-scale spatial structure among the targeted major diazotrophic groups, as shown in Fig. 2. We compared the relative contributions of these seven groups to the total number of nifH gene copies summed in a given sample across the study regions. Most notably, the relative abundance distribution of the Trichodesmium nifH genes displayed a strong negative correlation with that of UCYN-B (Fig. 5, r = −0.69, p < 0.01), indicating spatial niche differentiation. As Trichodesmium and UCYN-B were the two most dominant N 2 -fixing groups in the western Pacific region, they might perform spacespecific niche partitioning to avoid strong competition for the Trichodesmium is considered an opportunist (r-strategist) compared with other phytoplankton (Koening et al. 2009); thus, it may have a competitive advantage over the six other diazotrophic groups (Tyrrell et al. 2003) when nutrients are supplied. However, there was a significant positive correlation between the relative abundance distribution of the UCYN-B and Gamma A nifH genes (Fig. 5, r = 0.26, p < 0.05), which was consistent with observations by Moisander et al. (2014) from the South Pacific Ocean. Previous studies have shown that Gamma A is widespread in oligotrophic surface waters (Moisander et al. 2008Langlois et al. 2015) and occupies an ecological niche that overlaps with that of cyanobacterial diazotrophs (Bombar et al. 2016). Although some heterotrophic bacteria have been reported to be tightly associated with diazotrophic cyanobacteria (Zehr and Paerl 2008), we know little about the lifestyle and genome of Gamma A. Further studies on potential metabolic interactions between Gamma A and diazotrophic cyanobacteria are needed to explain their ecological niche overlap.

Diel temporal patterns of major diazotrophic groups
The nifH gene transcript abundances of the seven targeted major diazotrophic groups showed more variable regional differences than did the gene distribution (Fig. 3), suggesting that some environmental factors might influence their transient transcript levels on a short timescale. First, irradiance levels should be considered. We divided the Philippine Sea and Sta. EQ1-EQ4 samples, which were sampled at regular time intervals, into day (06:00-18:00) and night (18:00-06:00; Supporting Information Table S1). Strikingly different diel rhythm patterns of the nifH transcripts were observed between

Chen et al. Pacific Ocean diazotroph biogeography
Trichodesmium and UCYN-B. The transcript abundances (Fig. 6a) and the cDNA to DNA copy-number ratios (Fig. 6b) of Trichodesmium nifH genes were significantly higher in the daytime than at night (Wilcoxon test, p < 0.05-0.01), whereas the opposite pattern (Fig. 6c,d, Wilcoxon test, p < 0.05-0.01) was observed for UCYN-B. This contrasting diel pattern of higher expression of Trichodesmium nifH genes, as indicated by the cDNA to DNA copy-number ratios (Church et al. 2005b;Zehr et al. 2007), vs. lower expression of UCYN-B nifH genes in the daytime was also observed in the Heron Reef lagoon and the North Pacific Subtropical Gyre (Hewson et al. 2007;Zehr et al. 2007). These results indicated distinct diel temporal patterns of in situ nifH expression between Trichodesmium and UCYN-B. UCYN-B has been proved to preserve the integrity of the O 2sensitive nitrogenase enzyme complex by temporal separation of photosynthesis and N 2 fixation (Mohr et al. 2010a). Trichodesmium is suspected of having differentiated cells (i.e., diazocytes) that localize N 2 fixation activity in a manner resembling localization to heterocysts (Sandh et al. 2012), which allows Trichodesmium to express nitrogenase and fix N 2 aerobically during the day (Zehr 2011). The distinct temporal rhythms of nifH transcription also possibly avoid strong resource competition. For example, Trichodesmium has a high Fe demand in the daytime because Fe is a cofactor in both photosystem I and nitrogenase (Küpper et al. 2008). However, UCYN-B synthesizes Fe-containing proteins for N 2 fixation during the night and carbon fixation during the day (Saito et al. 2011). This strategy of repurposing Fe throughout the diel cycle helps UCYN-B inhabit regions with low Fe concentrations. Het-1 and Gamma A also seem to have higher transcript abundances and the cDNA to DNA copy-number ratios in the daytime than at night, but this difference was not test statistically (Fig. 6e-h). Richelia associated with Rhizosolenia (Het-1), as a heterocyst-forming cyanobacterium, carries out N 2 fixation in cells differentiated into heterocysts and photosynthesis in the other cells (Zeev et al. 2008;Lyimo 2011), which protects its nitrogenase from O 2 damage; thus, it is able to fix N 2 during the daytime.
Considering that UCYN-B dominated the nifH gene pool in the Philippine Sea and Trichodesmium was dominant in the northern SCS and western equatorial Pacific, we postulate a spatiotemporal pattern of N 2 fixation. The N 2 fixation in the Philippine Sea is driven primarily by UCYN-B at night, whereas that in the northern SCS and the western equatorial Pacific is driven primarily by Trichodesmium during the daytime.

Contribution of major diazotrophic groups to N 2 fixation
Positive correlations between the nifH gene abundances and NFRs were observed for Trichodesmium, UCYN-B, UCYN-C, Gamma A, and Het-1 (Supporting Information Fig. S3; p < 0.01 for each), suggesting that they all might significantly contribute to local N 2 fixation. However, it is difficult to accurately quantify and assess the contribution of individual diazotrophic taxa to bulk N 2 fixation, although some previous studies separated Trichodesmium from other groups through filtering Moisander et al. 2014;Benavides et al. 2016a). In this study, we attempted to provide information on the contribution of the three most dominant diazotrophic groups (among the targeted seven groups), namely, Trichodesmium, UCYN-B, and particularly the Gamma A phylotype, to the NFRs. We plotted the NFRs normalized to nifH gene copies (based on the total nifH gene-copy numbers from the seven diazotrophic groups) along three dimensions that represent the relative contributions of Trichodesmium, UCYN-B, and Gamma A to the total nifH gene pools (if any of the three nifH groups was dominant, then the data were plotted; Fig. 7). When Trichodesmium dominated the nifH gene-based DNA pool, the average nifH gene copy-specific NFRs were distinctly higher. For UCYN-B, most of the average nifH gene copy-specific NFRs were very low when it dominated the nifH gene-based DNA pool. Therefore, the nifH gene copy-specific NFR of UCYN-B was lower than that of Trichodesmium. Notably, however, we found that the average nifH gene copy-specific NFRs were also relatively high when Gamma A dominated the DNA pool (Fig. 7). Therefore, the Gamma A phylotype may be responsible for the high NFRs observed in some samples; of course, other unknown/undetected diazotrophs may have contributed to the high NFRs. N 2 fixation consumes a large amount of energy and thus is not an efficient pathway for heterotrophic bacterial growth in the oligotrophic open ocean, where organic matter is fairly limited. However, the Gamma A nifH phylotype is geographically widespread but limited to the upper ocean (Bird et al. 2005;Langlois et al. 2008;Moisander et al. 2008Moisander et al. , 2014Zhang et al. 2011;Langlois et al. 2015). We speculate that the Gamma A phylotype benefits from light as a supplementary energy source or potentially interacts with phototrophs . The latter speculation is consistent with a previous study that detected the Gamma A nifH genes in only the > 3-μm-size fraction (Benavides et al. 2016a). Further studies are needed to better understand how these processes help members of this phylotype fix N 2 .

Drivers of diazotroph biogeography
To explore the drivers of N 2 -fixing community composition and activity across multiple systems, regional biogeochemical factors were analyzed, including sea surface temperature, aerosol optical thickness, absolute geostrophic velocity (current velocity), mixed-layer depth, and Chl a concentration, and combined with the gene datasets. A hierarchical clustering dendrogram clustered the surface N 2 -fixing communities (based on the nifH gene abundances of the seven diazotroph groups) into four major clades, corresponding to four marine regions ( Fig. 8a): (1) Trichodesmium and Alpha-MH144511-codominant communities from the SCS stations, (2) UCYN-B and Gamma A-codominant communities from the Philippine Sea and adjacent western Pacific Sta. WP3 and WP4, (3) predominantly Trichodesmium communities from the near-island Sta. EQ1-EQ8 of the western equatorial Pacific, and (4) Trichodesmium and Gamma A-codominate communities from the far-island Sta. EQ9-EQ13 of the western equatorial Pacific and the 143 E Sta. WP1 and WP2. Notably, distinct differences in biogeochemical factors were observed among these marine regions (Fig. 8c). The greatest aerosol optical thicknesses and the shallowest mixed-layer depths were found in the marginal SCS, where the two highest Chl a concentrations were observed. In contrast, the deepest mixedlayer depths as well as the lowest sea surface temperatures were found in the Philippine Sea and adjacent western Pacific Sta. WP3 and WP4; accordingly, the Chl a concentrations were the lowest. The western equatorial Pacific region had the strongest current velocities, much shallower mixed-layer depths, and higher average Chl a concentrations than the Philippine Sea; the far-island stations had the highest sea surface temperatures and the thinnest aerosol optical thicknesses. These distinct regional differences in biogeochemical characteristics suggest different nutrient inputs, particularly Fe, to the upper water column.
The SCS is a semienclosed marginal sea, and dust input from the Gobi Desert supplies it with substantial amounts of particulate Fe , which might fuel diazotroph (e.g., Trichodesmium) blooms. While the nifH gene abundances of Trichodesmium were high (up to 6.9 × 10 6 copies L −1 ), relatively low NFRs were measured in the SCS, with the exception of Sta. SCS8 (Fig. 8a,b). Fe solubility and bioavailability are hypothesized to be limited in the SCS due to a lack of Febinding organic ligands despite the relatively high Fe input (Wu et al. 2003). This hypothesis is supported by the dissolved Fe concentrations (0.18-0.44 nmol L -1 ) in the upper 100 m of the water column in the SCS (Wen et al. 2006;Fig. 8d). In addition, input from the continental shelf and the Pearl River also supply organic matter, nutrients, and coastal bacterial species, which may help produce the abundant diazotrophic alpha-proteobacteria in the SCS (Kong et al. 2011).
Fe may also fuel Trichodesmium blooms (up to 2.0 × 10 7 nifH copies L −1 ) in the western equatorial Pacific region, but the Fe input pathway in this region is different from that in the SCS. Atmospheric deposition is very low (Duce and Tindale 1991;Gao et al. 2001), but intense hydrological dynamics, such as upwelling, have been demonstrated by the existence of multiple undercurrents in the western equatorial Pacific (Mackey et al. 2002;Fig. 7c). Additionally, numerous hydrothermal vents in the Bismarck Sea (Slemons et al. 2010) provide high Fe content to deep water. Therefore, the upwelling of Fe-rich deep water stimulates surface Trichodesmium blooms and NFRs (Fig. 8a,b), which is consistent with the observation of dissolved Fe concentrations in the surface waters of up to~1 nmol L -1 in this region (Slemons et al. 2010;Labatut et al. 2014;Fig. 8d). In addition, riverine inputs from New Guinea may also be an important source of Fe (Milliman et al. 1999;Sholkovitz et al. 1999); for example, Sta. EQ7 is located near the mouth of the Sepik River, and the maximum NFR was measured there. Stations EQ9-EQ13 and WP1 and WP2 are relatively far from the islands; therefore, the nutrient supply might be lower at these stations than at EQ1-EQ8, although the hydrological characteristics were similar and the sea surface temperatures were the highest (Fig. 8c). The diazotrophs showed low abundances and relatively low NFRs at these stations (Fig. 8a,b).
The Philippine Sea may be the most oligotrophic of these marine regions due to low dust input (Uematsu et al. 2003), a deep mixed-layer depth, and a lack of coastal upwelling (Fig. 8c); the dissolved Fe concentrations were extremely low (0.08-0.25 nmol L -1 in the upper 80 m of the water column; Fig. 7d; Kondo et al. 2007). Trichodesmium has a high Fe demand (for photosystem I and nitrogenase) and thus is more sensitive to Fe    (Wen et al. 2006;Kondo et al. 2007;Slemons et al. 2010;Labatut et al. 2014). The data plotted in (c) are from public datasets (see Methods section). The annual average aerosol optical thickness at 862 nm is from 2015; the monthly average sea surface temperature, monthly average mixed-layer depth, and monthly average Chl a concentration are from June to July 2015 (South China Sea) and December 2015 to January 2016 (the other regions; see Supporting Information Fig. S1). The data points with coordinates closest to our oceanographic stations were selected for plotting. The lines indicate the means of these values.
limitation (Küpper et al. 2008;Sohm et al. 2011). UCYN-B implements the strategy of repurposing Fe throughout its diel cycle, which has allowed it to become the most dominant diazotroph in the oligotrophic Philippine Sea. In addition, the lowest sea surface temperatures in the Philippine Sea may also promote the dominance of UCYN-B, as this unicellular cyanobacterial diazotroph has been reported to adapt to waters with lower temperatures than necessary for Trichodesmium blooms (Moisander et al. 2010). The redundancy analysis further demonstrated that the crosssystem variations in the N 2 -fixing communities were constrained by regional biogeochemistry (Fig. 9). These constraints explained 41% of the variation in the diazotroph distribution in the western Pacific Ocean regions. The sea surface temperature, mixed-layer depth, current velocity, and Chl a concentration were significantly correlated with diazotroph distribution (Monte Carlo permutation test, p < 0.01-0.05). Diazotrophic communities dominated by UCYN-B in the Philippine Sea were constrained by low temperature, deep mixed-layer depth, low current velocity, and low Chl a conditions, whereas the opposite conditions constrained the diazotrophic communities dominated by Trichodesmium in the western equatorial Pacific and the SCS (Fig. 9). To some extent, aerosol optical thickness positively influenced Trichodesmium in the western equatorial Pacific and SCS and UCYN-A1 and UCYN-B in the Philippine Sea (Fig. 9). Moreover, our redundancy model results suggest that Gamma A could become the dominant diazotrophic group in some ultraoligotrophic environments (Fig. 9). Extended investigations across more spatially and temporally diverse potential N 2 -fixing habitats are needed to completely understand the gap in the global N 2 fixation budget.
Taken together, these results suggest that sea surface temperature is a key factor shaping the N 2 -fixing community composition across multiple systems. In addition, the supply of key nutrient elements (such as Fe) and their bioavailability, which is controlled/reflected by these biogeochemical conditions, may be a more important driver of diazotroph biogeography. However, field-collected Fe concentration data are scarce, and more exploration is needed. This study highlights the biogeographical controls on marine N 2 fixers, and with this improved understanding, we are able to gain a global perspective on diazotroph biogeography and the input of "new N" to the global ocean.