A low‐cost underwater particle tracking velocimetry system for measuring in situ particle flux and sedimentation rate in low‐turbulence environments

We describe a low-cost three-dimensional underwater particle tracking velocimetry system to directly measure particle settling rate and flux in low-turbulence aquatic environments. The system consists of two waterproof cameras that acquire stereoscopic videos of sinking particles at 48 frames s⁻¹ over a tunable sampling volume of about 45 × 25 × 24 cm. A dedicated software package has been developed to evaluate particle velocities, concentration, and flux, as well as morphometric parameters such as particle area, sinking angle, shape irregularity, and density. Our method offers several advantages over traditional approaches such as sediment traps or expensive in situ camera systems: (1) it does not require prior particle collection and handling; (2) it is not subject to the sediment trap biases caused by turbulence, horizontal advection, or the presence of swimmers, which may alter particulate load and flux; (3) the camera system enables faster data processing and flux computation at higher spatial resolution; and (4) in addition to particle settling rates, it determines the particle size distribution and morphology. We tested the camera system in Lake Stechlin (Germany) under low turbulence and mean flow, and analyzed the morphological properties and settling rates of particles to determine their sinking behavior. The particle flux assessed from conventional sediment trap measurements agreed well with that determined by our system. The low-cost approach thus demonstrated its reliability in low-turbulence environments and a strong potential to provide new insights into particulate carbon transport in aquatic systems. Extension of the method to more turbulent and advective conditions is also discussed.

Settling of particulate organic matter (POM) plays a central role in carbon transport and productivity in lakes and oceans (Grossart and Simon 1998; McDonnell and Buesseler 2012; Li and Minor 2015). During sinking through the water column, POM undergoes continuous biogeochemical transformations and remineralization, such as consumption and degradation by heterotrophic organisms, but also exchanges with dissolved organic matter (Meyers and Eadie 1993). The remaining POM reaching the lake bottom becomes a food source for the benthic community (Ostrovsky and Yacobi 2010), can be resuspended by mixing events, or is stored (sequestered) in the sediments.
POM sinking in the field is assessed via the in situ downward particle flux F and the settling rate SV. F (No. m⁻² d⁻¹) is the product of SV (m d⁻¹) and the particle concentration C (No. m⁻³) and quantifies the particle abundance or carbon content sinking through the water column per unit area and time. Measurements of F and SV can span several orders of magnitude depending on sampling environment, methodology, depth, turbulence, and biological origin of the particles (Turner 2002; Stemmann et al. 2004; Armstrong et al. 2009; McDonnell and Buesseler 2010).
Throughout the last decades, sediment traps have been widely used to directly measure F and infer SV by counting particles collected within these devices (Lee 2002; McDonnell and Buesseler 2012). Even though specific criteria have been developed to minimize numerous trap biases (Bloesch and Burns 1980; Larsson et al. 1986), this methodology provides varying levels of accuracy depending on (1) trap geometry and deployment configuration, (2) hydrodynamic disturbances, such as turbulence or horizontal advection, (3) grazing by swimmers, and (4) sediment resuspension (Kozerski 1994; Peterson et al. 2005; Buesseler et al. 2007). Data processing may also be limited by the exposure time of the traps, leading to severe uncertainties in SV due to its large variability in space and time (McDonnell and Buesseler 2012). Traditional sediment traps provide only bulk flux measurements, integrated over the trap exposure time and over the water column above the trap. Flux differentiation between particle types and sizes is possible, but only during sample processing (Tang et al. 2014; Dubovskaya et al. 2015). Recent studies have tried to improve estimates of F and SV by combining polyacrylamide gel traps, for precise measurements of particle size and shape, with video recorders, for high-resolution particle abundance (McDonnell and Buesseler 2010, 2012). This recent but relatively difficult-to-handle methodology also enables semiautomatic measurements and parametrization of F into particle size classes. To improve SV accuracy, other studies have instead exploited waterproof cameras installed within sediment traps (Van Leussen and Cornelisse 1993; Sternberg et al. 1996; Diercks and Asper 1997; Asper and Smith 2003; Mikkelsen et al. 2004; Smith and Friedrichs 2011, 2015) or directly observed sinking particles by SCUBA diving (Alldredge and Gotschalk 1988).
Based on recent advancements in camera technology and video-processing algorithms, the particle tracking velocimetry (PTV) technique has been successfully applied in the field to estimate particle 3D velocities from videos taken by high-precision cameras (Smith 2008; Tauro et al. 2017). The study of Smith and Friedrichs (2015) is the only one that simultaneously measured density, SV, and size of particles with PTV; this method may, however, be affected by the same hydrodynamic biases as sediment traps, since data were measured only after trapping particles in a settling column. In the present study, we aimed to further exploit the advantages of PTV systems to improve particulate flux measurements. We propose a simple solution to directly determine the abundance, settling rate, size, and shape of individual particles, and to estimate particle fluxes by in situ stereoscopic imaging. Compared to existing PTV solutions, the proposed three-dimensional underwater particle tracking velocimetry (3D-PTV) system is low-cost, compact, and easy to build, and it circumvents biases of traditional sediment traps that may affect flux computation. The method is successfully tested in low-turbulence lake environments. Its extension to a wider range of turbulence, waves, and mean horizontal flow conditions is as straightforward as for other in situ imaging systems. Here, we provide a precise description of the PTV system and the developed software package for particle analysis and data processing for rapid and reliable velocity and flux estimation. The algorithm code is available as a configurable MATLAB® package at the URL reported in the Supporting Information.

Materials and procedures
This section is divided into six parts. In the first part, we describe the stereo camera rig setup. The second part deals with the algorithms required to detect particles in situ and to extract and analyze their morphometric features. In the third part, we explain how the spatial position of particles is tracked from consecutive frames over time. The last three parts cover data postprocessing and accuracy as well as computation of particle sinking rate, abundance, and fluxes.

Camera rig setup
We designed and built a compact, easily reproducible, fast-to-deploy, and low-budget stereoscopic video system (Fig. 1) to record underwater in situ videos of moving particles. The rig consists of two identical "GoPro Hero 4" cameras manufactured by GoPro (http://gopro.com) arranged in a frontal parallel configuration (Fig. 1b). The cameras were installed in a small plastic housing (Dual HERO System Housing by GoPro) that guarantees waterproofness and a fixed distance (baseline) b between the camera lenses (see Fig. 1b). A hollow black box with a transparent acrylic window (noise-reduction box in Fig. 1a) was also attached to the housing front and filled with particle-free tap water. This box had a twofold purpose: (1) it enhanced the underwater lighting conditions thanks to the eight LEDs incorporated on the box edges (Fig. 1b); the LED strip was connected to an external battery pack in a water-tight case.

Fig. 1. Schematic of the stereoscopic camera system (a) with details of the camera arrangement and the noise-reduction box (b).
(2) The box also ensured that particles were recorded only beyond a minimum distance Z_MIN from the camera lens (Fig. 1a). This avoided sampling objects too close to the lenses, which would have appeared too large and might have partially obstructed the far field of view (FOV) during deployment.
The camera system (housing and noise-reduction box) was mounted on a stainless-steel frame in front of a flat, black 37 × 60 cm laminate screen. The screen acted as a uniform, fixed background to enhance particle contrast and improve detection even in low-light conditions. The screen distance can be adjusted to the maximum distance Z_MAX at which particles can be detected; Z_MAX was set to avoid introducing noise that would have been generated by very small, overly distant particles. The parameters Z_MIN and Z_MAX defined the water volume sampled by the stereoscopic system. In our application, we used Z_MIN = 8 cm and Z_MAX = 32 cm, equivalent to a sampling volume of 2.7 × 10⁴ cm³. On the side opposite to the screen, the cameras were counterbalanced with a weight to keep the entire rig horizontal during deployment (see Fig. 1a). Two vertical bars were also welded to the middle of the frame to connect the rig to a surface buoy and to an additional weight keeping the entire system vertical. A picture of the frame is given in Supporting Information Fig. S1.
The cameras were set up to acquire videos at a frequency f_V of 48 frames s⁻¹ at 2.7K resolution (2704 × 1520 px) and with an ultra-wide FOV to capture as many particles as possible. The choice of image quality was dictated by camera storage and battery limitations. Due to the camera arrangement, the position (X_P; Y_P; Z_P) of a particle P, exemplified as a sinking zooplankton carcass (Fig. 2), can be found when its pixel coordinates (x_1, y_1, and x_2; red and blue points and segments in Fig. 2) are known at the same time instant. When the camera focal length f (green line between camera and image plane 1), the baseline b (green line between the two cameras), and the principal point coordinates (x_O; y_O) are also available, the particle can be triangulated using

Z_P = f·b/d,  X_P = (x_1 − x_O)·Z_P/f,  Y_P = (y_1 − y_O)·Z_P/f,   (1)

where d = x_2 − x_1 is the particle disparity. Equation 1 allows converting pixel coordinates to real-world coordinates.
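The authors' package is written in MATLAB®, but the triangulation of Eq. 1 can be sketched in a few lines of any language. In this Python illustration, all numerical values (focal length, baseline, principal point, pixel coordinates) are made-up placeholders, not the calibrated parameters of Table 1.

```python
# Minimal sketch of stereo triangulation (Eq. 1) for a rectified,
# frontal-parallel rig. f is in pixels, b in millimeters; the returned
# coordinates are therefore in millimeters.

def triangulate(x1, y1, x2, f, b, x_o, y_o):
    """Return real-world (X, Y, Z) of a particle from its pixel
    coordinates (x1, y1) in camera 1 and x2 in camera 2."""
    d = abs(x2 - x1)              # disparity magnitude in px
    Z = f * b / d                 # depth along the optical axis
    X = (x1 - x_o) * Z / f
    Y = (y1 - y_o) * Z / f
    return X, Y, Z

# Example with assumed parameters: f = 1600 px, b = 35 mm
X, Y, Z = triangulate(x1=1500.0, y1=800.0, x2=1300.0,
                      f=1600.0, b=35.0, x_o=1352.0, y_o=744.0)
# A 200 px disparity then corresponds to Z = 1600*35/200 = 280 mm
```

Note how a larger disparity maps to a smaller depth Z, which is why Z_MIN and Z_MAX bound the admissible disparity range (see the stereo-correspondence constraints below).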
The stereo cameras were calibrated to estimate camera parameters such as the focal length and principal points, as well as the inter-camera geometry constraints needed for particle tracking and triangulation. The calibration was performed by acquiring pictures of a checkerboard in a 0.7 m × 0.5 m × 0.5 m tank filled with clean, particle-free water to simulate in situ conditions and correctly estimate the lens distortion coefficients (Lavest et al. 2003; Li et al. 2016b). Additional photos of the calibration tank are reported in Supporting Information Fig. S2. Sixty-seven images of the calibration pattern were taken with both cameras and analyzed with the MATLAB® Computer Vision Toolbox™. Table 1 contains the main estimated parameters (see f, b, x_O, and y_O in Fig. 2 and Eq. 1) needed for particle triangulation. A complete list of all calibration outputs is given in Supporting Information Table S1.
The calibration accuracy was then assessed in terms of the mean reprojection error (e_R) and the mean epipolar error (e_E). The former provides a measure of the calibration precision of each single camera; the latter is a proxy for the precision of the entire stereo rig. For our calibration, we obtained e_R = 0.5 px and e_E = 0.05 px, which are low and within the acceptable range reported for other stereo systems (Smith 2008; Balletti et al. 2014).

Fig. 2. Triangulation of the spatial position (X_P; Y_P; Z_P) of particle P, illustrated as a zooplankton carcass. P is projected as P_1 (red dot and gray dashed line) and P_2 (blue dot and gray dashed line) onto the image planes (gray rectangles) of the two cameras (black boxes), whose distance is b. The particle coordinates in pixels on the image planes are (x_1; y_1) (red segments) and (x_2; y_1) (blue segment), measured from the upper-left corner of each frame. The distance between the image plane and the camera sensor is the focal length f, and (x_O; y_O) is the principal point of the first camera.
Hardware frame synchronization between two cameras of the same type is not supported by default, nor was it implemented in the GoPro 4 cameras we used. As a result, the videos from the two cameras can be delayed by a drifting number of frames, which depends on the relative drift of the cameras' internal clocks. To fix the synchronization issue, we implemented an original algorithm based on parallel audio track recording, which is easily applicable to any pair of cameras with microphones (see Section S1 in the Supporting Information).
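The audio-based algorithm itself is detailed in Supporting Information S1; the Python sketch below only illustrates the underlying idea: the lag between the two cameras can be estimated as the offset that maximizes the cross-correlation of their audio tracks. The signals here are synthetic, and the conversion of a sample lag to a frame delay τ (via the audio and video sampling rates) is only indicated in a comment.

```python
# Estimate the offset between two recordings by brute-force
# cross-correlation over a window of candidate lags (pure Python).

def best_lag(a, b, max_lag):
    """Return the lag (in samples) maximizing the cross-correlation of
    a and b; positive lag means b is delayed with respect to a."""
    def xcorr(lag):
        if lag >= 0:
            pairs = zip(a, b[lag:])      # compare a[i] with b[i+lag]
        else:
            pairs = zip(a[-lag:], b)     # compare a[i-lag] with b[i]
        return sum(x * y for x, y in pairs)
    return max(range(-max_lag, max_lag + 1), key=xcorr)

# Synthetic example: a sharp "clap" appears 7 samples later in track 2
track1 = [0.0] * 50
track1[10] = 1.0
track2 = [0.0] * 50
track2[17] = 1.0
lag = best_lag(track1, track2, max_lag=20)
# lag * f_V / f_audio would then convert the sample lag to a frame delay
```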

Particle analysis
Video frames from the two cameras were processed in pairs using a combination of algorithms, explained in the following, to detect, analyze, and stereo-match particles. The algorithms were applied to the frames after rectification, which removes distortions and projects the images onto a common image plane. The rectification partially crops the frames to 2691 × 1488 px to obtain the common area sampled by the cameras. To each frame, we also applied the MATLAB® unsharp masking technique with a standard deviation of 10 for the Gaussian low-pass filter. This was done to compensate for possible blurring of particles close to the lens, as the optimal focus of GoPro cameras is about 17 cm (www.gopro.com; last accessed 23 July 2019). In the following, the index of a processed frame from camera j (with j = 1, 2) is indicated with the letter i. The k-th particle detected in frame i by camera j is referred to as P_ijk.
Particles P ijk were detected using a combination of the foreground detection (FD) and blob detector (BD) algorithms available in the MATLAB ® Computer Vision Toolbox™. With the FD algorithm, we first built a background model to identify which pixels of the frames belonged to the background. Once the background was identified, any objects in the foreground could be extracted for further processing by the BD algorithm.
The FD model was trained independently for both cameras using the first 100 consecutive frames, based on changes in pixel intensity caused by the moving particles. The number of training frames was chosen to ensure the stability of the algorithm, so that the background image could be reconstructed without any foreground object. Particles were detected from the videos when i > 100. The FD model provides a binarized image, or mask, of the frame in which pixels marked 0 represent the background and pixels marked 1 a particle. From the mask, particles were finally identified by analyzing connected pixel regions with the BD algorithm.
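The paper uses the MATLAB® Computer Vision Toolbox™ implementations of FD and BD; as a rough, self-contained stand-in, the same pipeline (background model from training frames, binary foreground mask, connected-component labelling) can be sketched in pure Python. The threshold and the tiny synthetic frames below are illustrative only.

```python
# Schematic foreground-detection (FD) and blob-detection (BD) pipeline.
# Frames are grayscale images stored as nested lists of intensities.

def background_model(frames):
    """Per-pixel mean over the training frames (the FD training step)."""
    h, w = len(frames[0]), len(frames[0][0])
    n = len(frames)
    return [[sum(f[r][c] for f in frames) / n for c in range(w)]
            for r in range(h)]

def foreground_mask(frame, bg, thresh=30):
    """Mark pixels deviating from the background model as foreground (1)."""
    return [[1 if abs(p - b) > thresh else 0 for p, b in zip(fr, br)]
            for fr, br in zip(frame, bg)]

def blobs(mask):
    """4-connected components of the mask: one pixel list per particle."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    out = []
    for r in range(h):
        for c in range(w):
            if mask[r][c] and not seen[r][c]:
                stack, comp = [(r, c)], []
                seen[r][c] = True
                while stack:
                    y, x = stack.pop()
                    comp.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and \
                           mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                out.append(comp)
    return out

# Tiny example: uniform background, one bright 2 x 2 "particle"
train = [[[10] * 6 for _ in range(4)] for _ in range(5)]
bg = background_model(train)
frame = [row[:] for row in train[0]]
frame[1][2] = frame[1][3] = frame[2][2] = frame[2][3] = 200
particles = blobs(foreground_mask(frame, bg))   # one blob of 4 pixels
```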
After detection of the particles, their morphometric features were extracted from the mask (see Fig. 3a). Each particle was enclosed by a bounding box (blue rectangle) identified by the points Q_ijk, R_ijk, S_ijk, and T_ijk (blue dots). The particle centroid C_ijk (black dot), with coordinates (x_C,ijk; y_C,ijk), was calculated by extracting the pixels marked 1 and weight-averaging their spatial coordinates by the particle mass, which was assumed to be equally distributed over the pixels. The particle, with area A_ijk, was approximated as an ellipse with major semi-axis C_ijk E_ijk and minor semi-axis C_ijk F_ijk (dotted red lines). Both lengths were computed assuming that the ellipse had the same normalized second central moments as the region of pixels marked 1. All parameters were in pixels, and coordinates were measured from the upper-left corner of the image. The angle −π/2 < α_ijk < π/2 (green sector in Fig. 3a) defined the particle orientation with respect to the horizontal axis C_ijk R_ijk.

Table 1. Main parameters of the stereo system estimated from the camera calibration: f is the focal length, b the system baseline, and (x_O; y_O) the principal point of the first camera.

Fig. 3. Panel (a) shows a particle k from frame i and camera j, shaped as an ellipse with area A_ijk. The blue rectangle is the bounding box containing the particle, identified by the blue dots Q_ijk, R_ijk, S_ijk, and T_ijk, whose coordinates are reported in round brackets. The black dot C_ijk is the blob centroid. C_ijk E_ijk and C_ijk F_ijk are the major and minor semi-axes of the ellipse circumscribing the particle, and α_ijk is its orientation with respect to the horizontal axis C_ijk R_ijk. The arrows in the upper-left corner indicate the positive directions of the x and y axes. Panel (b) shows an example particle detected in the lake deployment in frame i = 200 of camera j = 1, with a disparity d_200,516 of 250 px, shown at ×10 magnification. The lower and upper parts of the image contain the measured morphometric information in pixels and millimeters.
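The moment-based ellipse fit described above is what MATLAB's regionprops computes; the following is only a pure-Python sketch of the same idea, assuming uniform pixel mass and omitting regionprops' small per-pixel area correction.

```python
# Centroid, area, equivalent-ellipse semi-axes, and orientation of one
# blob, from the eigenvalues of its second central moments.

import math

def ellipse_fit(pixels):
    """pixels: list of (x, y) coordinates of one particle's pixels."""
    n = len(pixels)
    xc = sum(x for x, _ in pixels) / n
    yc = sum(y for _, y in pixels) / n
    # second central moments of the pixel cloud
    mxx = sum((x - xc) ** 2 for x, _ in pixels) / n
    myy = sum((y - yc) ** 2 for _, y in pixels) / n
    mxy = sum((x - xc) * (y - yc) for x, y in pixels) / n
    common = math.sqrt((mxx - myy) ** 2 + 4 * mxy ** 2)
    major = 2 * math.sqrt((mxx + myy + common) / 2)   # semi-axis (C E)
    minor = 2 * math.sqrt((mxx + myy - common) / 2)   # semi-axis (C F)
    alpha = 0.5 * math.atan2(2 * mxy, mxx - myy)      # orientation
    return (xc, yc), n, major, minor, alpha

# Horizontal 7 x 3 bar of pixels: major axis along x, orientation ~ 0
bar = [(x, y) for x in range(7) for y in range(3)]
centroid, area, major, minor, alpha = ellipse_fit(bar)
```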
Given the pixel coordinates, with sign(α_ijk) being the sign function of the orientation, the spatial positions of the points C_ijk, E_ijk, and F_ijk were triangulated with Eq. 1 and the parameters in Table 1 to obtain the particle real-world coordinates. The particle lengths C_ijk E_ijk and C_ijk F_ijk in millimeters were then calculated as the Euclidean distances between the triangulated points, i.e.,

C_ijk E_ijk = √[(X_E − X_C)² + (Y_E − Y_C)² + (Z_E − Z_C)²],   (2)

and analogously for C_ijk F_ijk from the coordinates of F_ijk (3).

Once particles are detected in both frames, particle P_i1k located at (x_C,i1k; y_C,i1k) (Fig. 4a, red dot) needs to be matched with the same physical particle P_i2k at (x_C,i2k; y_C,i2k) (Fig. 4b, red dot). This problem is referred to as stereo correspondence (SC). Once the particles are matched, the disparity d_ik = x_C,i2k − x_C,i1k can be used in Eq. 1 to triangulate the particle location in real-world coordinates. Stereo matching is usually accomplished with a correspondence algorithm that identifies similar regions in the two frames to assign a disparity value to each image pixel (Bradski and Kaehler 2008). Traditional algorithms, however, were not able to determine disparities of millimeter-sized particles such as those in our application. This was due to discontinuities between the background and the particle edges, the small particle size, and the presence of dark, noisy, or textureless regions that would have produced no valid disparities (Brown et al. 2003; Sabater et al. 2011). More accurate techniques are also sensitive to the choice of input parameters, whose tuning may depend on field conditions such as illumination or particle appearance (Scharstein et al. 2003). In turn, the particle morphological features were already known and could be used to improve correspondence and avoid false matches. In the following, we therefore propose a new local, feature-based approach to stereo-match particle centroids using the following constraints of the stereo system: (1) y_C,i2k = y_C,i1k (horizontal dashed line in Fig. 4a,b) after rectification (see above); (2) x_C,i2k < x_C,i1k because of the camera arrangement (Fig. 4b); and (3) in our system, d_ik was restricted by Z_MIN and Z_MAX, which, according to Eq. 1, set the maximum and minimum system disparities d_MAX and d_MIN, respectively (blue rectangle in Fig. 4b).
For each detected particle P_i1k, the SC algorithm selected potential matches P̂_i2m (see Fig. 4, dark gray dots) within a searching area (blue rectangle) of width d_MAX − d_MIN and height 15 px. The height was set to take into account deviations of y_C,i2k from y_C,i1k due to the calibration parameters, additional small distortions introduced by flowing water (Li et al. 2016a,b), and the fact that the videos were out of sync by τ frames. The matrix representation M_i1k of particle P_i1k and the matrices M_i2m of the candidates P̂_i2m were then extracted from the frames based on the coordinates of their bounding boxes (blue dots in Fig. 3a,b). These matrices had width x_R,ijk − x_T,ijk and height y_S,ijk − y_Q,ijk (see Fig. 3a), with a third dimension containing the pixel intensities. M_i1k was compared against each M_i2m with the template matching (TM) algorithm (Bradski and Kaehler 2008), which provided a score S between 0 and 1; a high score indicated that P̂_i2m was visually similar to P_i1k. The particle areas A_i1k of P_i1k and Â_i2m of P̂_i2m were also considered as additional matching parameters, so that the correct match was finally selected when

S > S_MIN and A_MIN < A_i1k/Â_i2m < A_MAX,   (4)

where S_MIN is a threshold on the score and A_MIN and A_MAX are thresholds on the area ratio. The lower limit on S was introduced because the pixel intensities of M_i1k and of the correct match M_i2k are never identical, owing to noise and other degrading factors (Birchfield and Tomasi 1998). The constraints on the area were instead set to account for possible small pixel differences in the particle boundaries between the two frames. When the conditions in Eq. 4 were satisfied by more than one particle, the match was ignored. Equation 4 was also conceived so that no match was produced when P_i2k was situated outside the FOV.
To correctly choose S_MIN, A_MIN, and A_MAX, we selected about 20 particles from five different frames; the frames were chosen from the field deployment videos to be about 40 s apart, so that individual particles appeared in varying locations. Particles were then manually matched, and their scores S and area ratios A_i1k/Â_i2m were determined with the TM and BD algorithms. According to the results in Fig. 5, we chose S_MIN = 0.8, A_MIN = 0.7, and A_MAX = 1.3 to ensure correct matches.
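The selection rule above reduces to a simple filter over the candidates in the searching area. The Python sketch below shows only this decision logic; the scores are assumed to come from a template-matching step, and all candidate values are invented for illustration.

```python
# Match selection following the Eq. 4 rule: accept a candidate only if
# its template-matching score exceeds S_MIN and the area ratio lies in
# (A_MIN, A_MAX); ambiguous cases (several survivors) are discarded.

S_MIN, A_MIN, A_MAX = 0.8, 0.7, 1.3   # thresholds chosen from Fig. 5

def select_match(area_1, candidates):
    """candidates: list of (index, score, area) tuples for camera 2.
    Returns the matching candidate index, or None if no unique match."""
    ok = [i for i, score, area in candidates
          if score > S_MIN and A_MIN < area_1 / area < A_MAX]
    return ok[0] if len(ok) == 1 else None

# One good candidate, one with a poor score, one with a wrong area
cands = [(0, 0.92, 100.0), (1, 0.55, 95.0), (2, 0.90, 40.0)]
match = select_match(area_1=98.0, candidates=cands)            # index 0
ambiguous = select_match(98.0, [(0, 0.9, 100.0), (1, 0.9, 97.0)])  # None
```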

Particle temporal tracking
Particle tracking consists of finding particle trajectories across consecutive 2D frames. In our implementation, we exploited the Kalman filter (KF), the most common method for object tracking and navigation systems (Blackman 1986; Li et al. 2016a). The KF is an iterative algorithm that employs a discrete-time motion model, based on the physics of a moving object, to predict its future kinematic state (i.e., centroid, velocity, and acceleration), relying only on its previously observed states. The KF also takes into account possible deviations of the state model from the actual particle motion and potential noise in the input states, such as uncertainties in the centroid detection. In our implementation, the estimated state Ŝ_i1k of particle P_i1k was

Ŝ_i1k = (x̂_C,i1k, ŷ_C,i1k, ẋ_C,i1k, ẏ_C,i1k)ᵀ,   (5)

where x̂_C,i1k and ŷ_C,i1k are the centroid coordinates in pixels estimated by the KF and ẋ_C,i1k and ẏ_C,i1k the velocities in px frame⁻¹. Equation 5 assumes a constant-velocity motion model, chosen as a compromise between the number of tunable model parameters and the physics of the detected particles; we therefore assumed a smooth particle motion with small changes in velocity between frames.
Once the KF parameters were initialized (see Section S2 of the Supporting Information), for each detection P_i1k = (x_C,i1k; y_C,i1k) in frame i (Fig. 6a, black dot), the algorithm predicted the particle position in the next frame i+1 at coordinates (x̂_C,i+1,1k; ŷ_C,i+1,1k) (red dot). The correct particle position (x_C,i+1,1k; y_C,i+1,1k) (green dot) was then detected with the FD and BD algorithms. The prediction (red dot), linked to the particle in the previous frame (black dot), was assigned to the correct detection (green dot) using the Munkres' assignment (MA) algorithm (Munkres 1957). The algorithm performs the assignment when the distance between the prediction (Fig. 6b, red dot) and the detection (Fig. 6b, green dot) is less than the assignment cost C_MAX (black arrow). After assignment, the particle path (black line in Fig. 6a,c) was fixed using the correct centroid (x_C,i+1,1k; y_C,i+1,1k), linking the particle across consecutive frames. The algorithm performed the assignment simultaneously for all particles tracked at the same time; it also resolved which tracks went missing, when particles left the sampling volume, and which detections began new tracks, when new particles entered the FOV. The KF and MA algorithms were calibrated to determine their parameters as detailed in Section S3 of the Supporting Information. Figure 7 depicts the scheme used to process the stereo videos with the algorithms outlined above. Once frame i was extracted from camera j = 1 (Fig. 7a), it was analyzed with the FD and BD algorithms (Fig. 7b) to find the particle centroids (x_C,i1k; y_C,i1k) (Fig. 7c). The information about the tracks at the previous step i−1 (Fig. 7d) was then used to predict the particle centroids (x̂_C,i1k; ŷ_C,i1k) at step i with the KF (Fig. 7e,f).
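The paper uses the MATLAB® tracking implementation with a 4-state model (Eq. 5) and Munkres' assignment. As a minimal illustration of the prediction/correction cycle only, a constant-velocity KF can be written per image coordinate; the process and measurement noise values below are illustrative placeholders, not the calibrated values of Supporting Information S2/S3, and the assignment step is omitted.

```python
# A 1-D constant-velocity Kalman filter (state: position and velocity),
# sketching how the next-frame centroid is predicted from past detections.

class ConstVelKF:
    def __init__(self, p0, q=0.01, r=1.0):
        self.p, self.v = p0, 0.0            # state: position, velocity
        self.P = [[1.0, 0.0], [0.0, 1.0]]   # state covariance
        self.q, self.r = q, r               # process / measurement noise

    def predict(self):
        """Propagate the state one frame ahead (x_{i+1} = x_i + v_i)."""
        self.p += self.v
        P = self.P                           # P <- F P F^T + Q
        self.P = [[P[0][0] + P[0][1] + P[1][0] + P[1][1] + self.q,
                   P[0][1] + P[1][1]],
                  [P[1][0] + P[1][1],
                   P[1][1] + self.q]]
        return self.p

    def update(self, z):
        """Correct the prediction with the detected centroid z."""
        res = z - self.p                     # innovation
        s = self.P[0][0] + self.r            # innovation variance
        k0, k1 = self.P[0][0] / s, self.P[1][0] / s   # Kalman gains
        self.p += k0 * res
        self.v += k1 * res
        P = self.P                           # P <- (I - K H) P
        self.P = [[(1 - k0) * P[0][0], (1 - k0) * P[0][1]],
                  [P[1][0] - k1 * P[0][0], P[1][1] - k1 * P[0][1]]]

# Track a centroid moving 2 px per frame down the image
kf = ConstVelKF(p0=100.0)
for i in range(1, 30):
    kf.predict()
    kf.update(100.0 + 2.0 * i)
prediction = kf.predict()   # close to 100 + 2*30 = 160 px
```

In the full tracker, one such prediction per particle is compared against all new detections, and Munkres' algorithm resolves the globally optimal assignment under the cost gate C_MAX.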

Video analysis
The detections (x_C,i1k; y_C,i1k) and predictions (x̂_C,i1k; ŷ_C,i1k) were then provided to the Munkres' algorithm (Fig. 7g) to link particles among consecutive frames. The tracks were consequently updated with the corrected and detected centroids (Fig. 7h,i). The same frame i from camera j = 2 (Fig. 7j) was processed to detect the centroids (x_C,i2k; y_C,i2k) (Fig. 7k). Each pair (x_C,i1k; y_C,i1k) was then matched with the corresponding coordinates (x_C,i2k; y_C,i2k) by the SC algorithm (Fig. 7l). This process was repeated until the end of the video sequence was reached. Because the video from camera j = 2 was advanced or delayed with respect to j = 1 by τ frames, the coordinates x_C,i2k were out of sync with respect to x_C,i1k. Assuming that the particle trajectory varied linearly between two consecutive frames from j = 2, x_C,i2k was synced using

x̃_C,i2k = x_C,i2k + m_k·τ_i,   (6)

where m_k = x_C,i+1,2k − x_C,i,2k is the slope of the line between the coordinates of two consecutive frames and τ_i is the frame-dependent delay from Supporting Information Eq. S1a; τ_i is positive when camera j = 2 was delayed and negative when it was advanced.

Fig. 6. (a) The black dot shows the spatial position of the first detection at frame i, the red dot the position of its prediction by the KF in the next frame i+1, and the green dot the position of the real detection in i+1. The black line shows the current particle path based on the centroid prediction. (b) Prediction and detection are linked using a distance criterion based on the assignment cost C_MAX of the Munkres' algorithm. (c) The corrected path followed by the particle after the assignment.
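The resynchronization step amounts to a linear interpolation of the camera-2 coordinate over one frame interval; a minimal Python sketch, with illustrative values:

```python
# Correct a camera-2 x-coordinate for a fractional frame delay tau,
# assuming the trajectory varies linearly between consecutive frames.

def resync(x_i, x_next, tau):
    """Interpolate the camera-2 coordinate at the camera-1 frame time.
    tau > 0: camera 2 delayed; tau < 0: camera 2 advanced."""
    m = x_next - x_i              # per-frame slope of the trajectory
    return x_i + m * tau

# A particle drifting 3 px per frame, camera 2 lagging by 0.4 frames
x_synced = resync(x_i=500.0, x_next=503.0, tau=0.4)   # 501.2 px
```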
Once this correction was calculated for all the processed frames, the particle disparities were computed as d_ik = x_C,i1k − x_C,i2k, and the points were triangulated to find the time series of the real-world coordinates (X_C,ik; Y_C,ik; Z_C,ik), particle lengths (C_ijk E_ijk, C_ijk F_ijk), and sinking angle α_ijk. The time series were finally filtered with a 3rd-order median filter to remove noise due to the limited resolution of the cameras, possible irregularities in particle shapes (Smith 2008), and differences in particle detection between the two cameras caused by the deployment environment.
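A window-3 (3rd-order) median filter is straightforward to sketch in pure Python; the spike in the example stands in for a single mis-detection, and the endpoints are left unfiltered for brevity.

```python
# 3rd-order (window-3) median filter for a particle time series.

def medfilt3(series):
    out = list(series)
    for i in range(1, len(series) - 1):
        out[i] = sorted(series[i - 1:i + 2])[1]   # median of the window
    return out

# A depth trace with one spurious spike from a mis-detection
y = [0.0, 1.0, 2.0, 9.0, 4.0, 5.0, 6.0]
y_filt = medfilt3(y)   # the spike at index 3 is suppressed
```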

Uncertainties in the measurements and detection limits
Although the camera coordinates were corrected for radial and tangential distortions via calibration, residual errors in the distortion coefficients and in the other calibration parameters (see standard deviations in Supporting Information Table S1) may affect the accuracy of the estimated particle centroids, lengths, and velocities. The uncertainties in these measurements were computed statistically via simulations, as detailed in Section S3 of the Supporting Information. The mean uncertainties for the particle centroid were about 0.3 mm in the X direction, 0.15 mm in the Y direction, and 0.3 mm in the Z direction. The uncertainties in the particle lengths were 10⁻² mm, 0.5 × 10⁻² mm, and 2.1 × 10⁻² mm for the X, Y, and Z directions, respectively. Finally, the velocity uncertainties were up to 7.6 × 10⁻² mm s⁻¹ in the X direction, 3.2 × 10⁻² mm s⁻¹ in the Y direction, and 1.5 × 10⁻¹ mm s⁻¹ in the Z direction.
The SC and TM algorithms also set a limit on the minimum detectable particle length. The minimum particle area cannot be smaller than A′_MIN = 10 px², as the matrices M_ijk representing smaller particles in the SC algorithm would not contain enough information to be reliably processed by the TM algorithm. According to Eqs. 1 and 3, a spherical particle with an area equal to A′_MIN has a length of 0.17 mm at Z = Z_MIN = 8 cm and 0.68 mm at Z = Z_MAX = 32 cm. This implies that particles below ~0.2 mm are still detectable, but only when they are very close to the window of the noise-reduction box. This differs from sediment traps, which can in principle collect particles of all sizes. Finally, according to Eq. 1, the minimum detectable length of a particle varies with the disparity plane on which the particle is situated; some particles may therefore be detectable only within a specific range of distances from the camera lenses.

Fig. 8. Panel (a) shows an example of the computed particle position Y_C,ik (black line) and the sinking velocity U_Y,k (red line), calculated as the average of the first derivative of Y_C,ik. Panels (b-d) report the particle lengths and sinking angle (black lines); the red lines show the temporal average of each morphological parameter.
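Under the pinhole model of Eq. 1, the physical length corresponding to a fixed pixel area grows linearly with the distance Z from the lens, which is where the 0.17 mm and 0.68 mm figures come from. The short Python check below reproduces this scaling; the focal length is an assumed placeholder, not the calibrated value of Table 1, so only the distance ratio (a factor of 4 between Z_MAX and Z_MIN) is independent of that assumption.

```python
# Back-of-the-envelope detection limit: diameter of a circular blob of
# fixed pixel area, converted to physical length at a distance z_mm.

import math

def min_length_mm(area_px, z_mm, f_px):
    """Physical diameter of a circular blob of area_px at distance z_mm."""
    d_px = 2.0 * math.sqrt(area_px / math.pi)   # blob diameter in pixels
    return d_px * z_mm / f_px

F_PX = 1600.0                                # assumed focal length (px)
l_near = min_length_mm(10.0, 80.0, F_PX)     # at Z_MIN = 8 cm, ~0.18 mm
l_far = min_length_mm(10.0, 320.0, F_PX)     # at Z_MAX = 32 cm
# l_far / l_near = 4 regardless of the assumed focal length
```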

Flux and sinking rate computation
The sinking speed U_Y,k of each particle was determined as the average of the first derivative of the particle trajectory along Y_C,ik; an example is shown in Fig. 8a (red line). To obtain one set of morphological parameters per track, the mean particle axes μ(C_i1k E_i1k) and μ(C_i1k F_i1k), as well as the mean sinking angle μ(α_i1k), were computed from the averages of the respective time series (Fig. 8b-d, red lines). Finally, the particle equivalent length l_EQ,k was defined from the mean axis lengths as their geometric mean, l_EQ,k = √[μ(2·C_i1k E_i1k)·μ(2·C_i1k F_i1k)]. After binning particles into size classes c based on l_EQ,k, the particle downward flux (F_c, No. m⁻² d⁻¹) for a class c was calculated as

F_c = U_Y,c · N′_c,

where U_Y,c (m d⁻¹) is the mean settling rate and N′_c = N_c·24/t_D (No. m⁻³) the daily particle concentration for class c (McDonnell and Buesseler 2010); N_c is the concentration observed by the camera, and t_D is the deployment time of the 3D-PTV system in hours.
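The flux computation per size class reduces to a few arithmetic steps. A minimal Python sketch, with all input values invented for illustration (the 2.7 × 10⁻² m³ volume matches the sampling volume quoted above):

```python
# Downward particle flux F_c = U_{Y,c} * N'_c for one size class, with
# N'_c = N_c * 24 / t_D (daily-normalized concentration).

def particle_flux(mean_settling_m_per_d, n_observed, volume_m3, t_d_hours):
    """F_c in No. m^-2 d^-1 for one size class."""
    n_c = n_observed / volume_m3          # concentration N_c, No. m^-3
    n_daily = n_c * 24.0 / t_d_hours      # daily concentration N'_c
    return mean_settling_m_per_d * n_daily

# 50 particles of one class seen in the 2.7e-2 m^3 sampling volume over
# a 40-minute deployment, sinking at 10 m/d on average
F_c = particle_flux(10.0, 50, 2.7e-2, 40.0 / 60.0)
```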

Assessment
In summer 2018, we tested the 3D-PTV system in two different environments. We first deployed it, along with sediment traps, in an enclosure facility with low horizontal advection and very low turbulence; this allowed us to compare particle fluxes estimated from the cameras and from the traps while minimizing the biases of both systems. We also tested the system in the littoral zone of Lake Stechlin, where advection was low but not negligible. In this case, a direct comparison with sediment traps was not possible, as their biases would have added further uncertainty to the measurements. The acquired videos from the deployments were analyzed with the scheme explained in Fig. 7 and with the algorithm parameters provided in Tables 1 and 2.

Deployment in an enclosure facility
On 24 July 2018, we deployed the 3D-PTV system in the LakeLab enclosure facility (www.lake-lab.de, accessed 01 May 2019) of Lake Stechlin (Fig. 9a), a dimictic oligo-mesotrophic lake in northeast Germany (53°10.2′N, 13°00′E). The enclosure (gray cylinder in Fig. 9b) has a volume of about 1270 m³; the water inside was isolated from the surrounding lake water by its impermeable walls. The whole water column of the enclosure was routinely profiled with a multiparameter probe YSI 6920 V2-2 (Yellow Springs Instruments, U.S.A.) for the major environmental parameters (water temperature, conductivity, dissolved oxygen, turbidity, and photosynthetically active solar radiation). The camera system was deployed twice for about 40 min at Z_C = 9.3 m below the water surface, at the base of the metalimnion (see Supporting Information Fig. S3). The two deployments (deployments 1 and 2 in the following) were 1 h apart. Turbulence in this part of the enclosure was expected to be very low due to stratification and the lack of external turbulence-generating mechanisms, such as lake circulation or internal waves. The camera frame was slowly lowered and anchored to avoid generating turbulence that would have affected the particle sinking behavior. At the same depth, three sediment traps were also installed to collect zooplankton carcasses (length < 0.8 mm) and estimate the daily flux F_TRAPS.

Fig. 9. Panel (a) shows the deployment positions of the 3D-PTV system in Lake Stechlin. Panel (b) depicts the cylindric enclosure (gray), whose walls were partially buried into the lake bed (brown), isolating it completely from the surrounding lake water (blue background); the 3D-PTV camera system was deployed at Z_C = 9.3 m with ropes (vertical green line). Panel (c) shows the mooring configuration when the system was deployed at location A at 9 m depth with the help of two submerged buoys (orange circles).
The trap design and methodology for flux computation are reported in Dubovskaya et al. (2015). The traps were deployed for 3 d and on a different date than the PTV system, due to space limitations in the enclosure and to avoid interference with the camera sampling. The flux estimated from the sediment traps could nevertheless be compared with that of the PTV system, because the traps were not affected by hydrodynamic biases: horizontal advection was almost completely absent and turbulence very low in the enclosure.

Fig. 10. Panels (a, b) show the relative probability distributions (bars) of the mean major and minor particle lengths μ(2·C_i1k E_i1k) and μ(2·C_i1k F_i1k) (see Fig. 3). Panel (c) reports the probability distribution (bars) of the particle aspect ratio, calculated as μ(C_i1k E_i1k)/μ(C_i1k F_i1k). Panel (d) shows the sinking angle μ(α_i1k) of the particles (bars). Gray bars refer to deployment 1, blue bars to deployment 2; the red vertical lines indicate the distribution mean and the black horizontal lines the 95% confidence interval for all particles.
For our assessment, we filtered the particle tracks before processing. First, we removed tracks that started within ~25 min after the deployment, to avoid sampling velocities generated by the deployment of the camera system; this interval was chosen by visually inspecting the videos for the time when the small deployment-generated eddies had stopped and turbulence had visibly subsided. Then, from the remaining tracks, we removed all particles moving upward: these were swimming zooplankton that could easily be recognized in the videos. The total number of remaining tracks was 247 for deployment 1 and 136 for deployment 2. The average track duration was 34 s, with a maximum of 180 s. The code, written in MATLAB®, ran on a 3.00 GHz Intel® Core™ i5-8500 with 8 GB of RAM for about 48 h per deployment.
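The two filtering steps above can be sketched as follows. This is an illustrative Python sketch, not the authors' MATLAB code; the track fields (`t0`, `y`) and the simple net-displacement test for upward motion are assumptions.

```python
# Sketch of the track-filtering steps: drop tracks that start within the
# post-deployment settling window, then drop upward-moving tracks (swimmers).

def filter_tracks(tracks, settle_time_s=25 * 60):
    """Keep tracks that start after the settling window and move downward.

    Each track is a dict with 't0' (start time in s since deployment) and
    'y' (vertical positions, increasing downward).
    """
    kept = []
    for tr in tracks:
        if tr['t0'] < settle_time_s:
            continue                      # deployment-induced motion
        if tr['y'][-1] <= tr['y'][0]:
            continue                      # net upward motion: likely a swimmer
        kept.append(tr)
    return kept

tracks = [
    {'t0': 100,  'y': [0.0, 0.5, 1.0]},   # too early after deployment
    {'t0': 1600, 'y': [1.0, 0.6, 0.2]},   # moving upward (swimmer)
    {'t0': 1600, 'y': [0.0, 0.4, 0.9]},   # valid sinking particle
]
print(len(filter_tracks(tracks)))  # 1
```

In practice, the upward-motion test could also be applied per frame pair rather than to the net displacement, depending on how noisy the tracks are.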
Results for all tracks show that the distributions of the particle mean lengths μ(2·C_i1k E_i1k) and μ(2·C_i1k F_i1k) (Fig. 10a,b, gray and blue bars for the two deployments) were right-skewed and that the majority of particles (about 90%) had a length below 2 mm but never below 0.5 mm. No particles larger than 2.7 mm were observed in the enclosure. The mean of μ(2·C_i1k E_i1k) was 1.3 mm (red line) with a 95% confidence interval of CI_95 = [1.27, 1.34] mm (black line), while the mean μ(2·C_i1k F_i1k) was 1 mm with CI_95 = [0.98, 1.03] mm. The probability distribution of the particle aspect ratio (Fig. 10c, gray and blue bars) suggests in turn that particles were almost spherical, with a mean ratio of 1.29 and CI_95 = [1.28, 1.31]. The particles sank at an angle α_i1k varying between −70° and 60° (Fig. 10d, gray and blue bars), with the majority of particles sinking with the major axis horizontal or slightly tilted (mean α_i1k = 3.3° and CI_95 = [0.5°, 5.5°]). Both deployments revealed the same distribution for all parameters, but with slightly different relative probabilities.

Fig. 11. Panel (a) shows the distribution of the sinking rate U_Y,k; panel (b) plots U_Y,k against the squared equivalent length l²_EQ,k, where the dotted line is the linear least-squares fit, R² its coefficient of determination, and p its p value. Panel (c) reports the estimated particle density ρ_P from Eq. 8. Gray bars refer to deployment 1, blue bars to deployment 2; the red vertical lines in panels (a) and (c) indicate the distribution mean and the black horizontal lines the 95% confidence intervals for all particles.

The sinking rate U_Y,k in Fig. 11a shows a variation in the particle settling behavior, with a mean settling speed of 97 m d⁻¹ and CI_95 = [89, 106] m d⁻¹. Both deployments had the same distribution, but deployment 1 had slightly faster-sinking particles than deployment 2. The bulk of the particles (~95%), however, sank with a speed no larger than 200 m d⁻¹. From our results, the velocities of 1-mm particles were between 2 and 280 m d⁻¹, with an average of 86 m d⁻¹. For particles of the same size, direct estimates from SCUBA divers report a more conservative average of 75 m d⁻¹ (Shanks and Trent 1980; Alldredge and Gotschalk 1988), although these may be partially affected by the operator. Indirect estimates from sediment traps instead provide varying results, with speeds as low as 1.7 m d⁻¹ for freshwater zooplankton carcasses (Dubovskaya et al. 2015) and up to 500 m d⁻¹ for marine snow (Asper 1987; Peterson et al. 2005; Trull et al. 2008; Armstrong et al. 2009; Xue and Armstrong 2009). However, numerical simulations by Kirillin et al. (2012) for zooplankton carcasses 1 mm in size revealed sinking rates of 38-100 m d⁻¹, in the same range as those measured with our 3D-PTV system.
According to Stokes' equation and other parameterizations of the sinking rates of ellipsoidal objects (Happel and Brenner 1965), U_Y,k depends linearly on the squared equivalent length l²_EQ,k and on the particle density ρ_P, and may be altered by nonlinear effects due to χ_k and μ(α_i1k). Since the particles were almost spherical (χ_k ≈ 1 in Fig. 10c) and we did not observe any correlation of U_Y,k with χ_k or α_i1k in our data (not shown), we assumed that U_Y,k could be described with Stokes' equation:

U_Y,k = g (ρ_P − ρ_W) l²_EQ,k / (18 μ_W),    (Eq. 8)

where g is the gravitational acceleration and T, μ_W, and ρ_W are the water temperature, dynamic viscosity, and density, respectively. U_Y,k plotted vs. l²_EQ,k in Fig. 11b shows a weak linear correlation (coefficient of determination R² = 0.2, p value < 0.0001) with respect to the linear least-squares fit (dotted line). The sinking rate increased for larger particles, but the points were moderately scattered at smaller lengths due to differences in ρ_P.
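As a minimal numerical sketch of Eq. 8, the following computes the Stokes settling rate for a particle of given equivalent length and density, using the water properties reported for the deployment (T = 9.45°C, Chen and Millero parameterizations); the example particle values are illustrative.

```python
# Stokes settling rate (Eq. 8) for a particle treated as a sphere of
# equivalent length l_EQ. Water properties from the deployment conditions.

G = 9.81            # gravitational acceleration (m s^-2)
MU_W = 1.33e-3      # dynamic viscosity of water (kg s^-1 m^-1)
RHO_W = 999.75      # water density (kg m^-3)

def stokes_velocity(l_eq, rho_p):
    """Settling rate U_Y (m s^-1) of a particle with equivalent length
    l_eq (m) and density rho_p (kg m^-3) in still water."""
    return G * (rho_p - RHO_W) * l_eq**2 / (18.0 * MU_W)

# A 1-mm particle at the mean observed density of ~1002 kg m^-3:
u = stokes_velocity(1e-3, 1002.0)
print(u * 86400)  # settling rate in m d^-1
```

For these values the predicted rate is on the order of tens of m d⁻¹, in the range of the 1-mm particle velocities observed with the 3D-PTV system.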
The particle density was estimated by inverting Eq. 8 and using the observed value T = 9.45°C, with μ_W = 1.33 × 10⁻³ kg s⁻¹ m⁻¹ and ρ_W = 999.75 kg m⁻³ from the parameterizations by Chen and Millero (1986). The result in Fig. 11c shows that ρ_P varied between 999 and 1013 kg m⁻³, with an average of 1002 kg m⁻³. These values were smaller than in vitro estimates for zooplankton carcasses by Kirillin et al. (2012) and Simoncelli et al. (2018), but the excess density ρ_P − ρ_W was in agreement with in situ measurements of marine snow (Alldredge and Gotschalk 1988).

Fig. 12. Panel (a) reports the downward particle flux F_c from the cameras, along with the 95% confidence interval as vertical bars, for each particle size class c reported on the x axis. The bars in panel (b) show the daily mean flux F_TRAPS estimated from the sediment traps, with the 95% confidence interval provided for each day; the horizontal patches indicate the flux F_PTV from the 3D-PTV system for deployments 1 (light gray patch) and 2 (blue patch), whose lower and upper limits are the 95% confidence intervals.

We finally binned the particles into eight classes c (0-0.6, 0.6-0.8, 0.8-1.0, 1.0-1.2, 1.2-1.4, 1.4-1.6, 1.6-1.8, and 1.8-2.0 mm) and computed the fluxes F_c (Fig. 12a, gray bars) as the product of N′_c and the average velocity U_Y,c of all the particles in each class c (see Eq. 7). The 95% confidence interval CI_95 for F_c was calculated by multiplying N′_c by the CI_95 of the sample U_Y,k from c. The flux F_TRAPS from the sediment traps was instead averaged over the three deployed traps (Fig. 12b, gray bars). F_TRAPS was almost constant during the deployment, with an average of 3.1 × 10⁴ No. m⁻² d⁻¹ and CI_95 = [2.9, 3.4] × 10⁴ No. m⁻² d⁻¹, corresponding to an average of 340 carcasses collected daily. Since F_TRAPS accounts only for zooplankton carcasses smaller than 0.8 mm, F_PTV was calculated using Eq. 7 and combining particles from the first two classes. For deployment 1 (Fig. 12b, light gray patch), F_PTV was 9.5 × 10⁴ No. m⁻² d⁻¹ (N_C=1 + N_C=2 = 28, CI_95 = [0.7, 1.2] × 10⁵ No. m⁻² d⁻¹), while for deployment 2 (Fig. 12b, blue patch) it was 8.0 × 10⁴ No. m⁻² d⁻¹ (N_C=1 + N_C=2 = 26, CI_95 = [0.5, 1.11] × 10⁵ No. m⁻² d⁻¹); the difference between the fluxes of the two deployments was about 16%. F_TRAPS and F_PTV were very similar (Fig. 12b), but F_TRAPS slightly underestimated the flux because it accounted for zooplankton carcasses only and possibly because of material removal by swimmers. These results demonstrate the reliability of the 3D-PTV system in estimating particle fluxes in environments with little advection and low turbulence.
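The per-class flux computation can be sketched in a few lines. This is an illustrative Python version of the product in Eq. 7; the function name and the sample numbers are hypothetical, not the measured data.

```python
# Flux per size class c (Eq. 7): the number concentration N'_c of particles
# in the class times the class-mean settling speed of those particles.

def class_flux(n_conc, speeds):
    """Flux F_c (No. m^-2 d^-1) from the concentration n_conc (No. m^-3)
    and the settling speeds (m d^-1) of the particles in class c."""
    return n_conc * sum(speeds) / len(speeds)

# Example: 40 particles m^-3 sinking at 80-120 m d^-1
print(class_flux(40.0, [80.0, 100.0, 120.0]))  # 4000.0
```

The confidence interval follows the same pattern: multiplying the concentration by the lower and upper CI_95 bounds of the sampled speeds gives the bounds on F_c.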

Deployment in a lake environment
On 26 June 2018, we deployed the camera system in the littoral zone of the same lake (Lake Stechlin) at location A (Fig. 9a), using the mooring design reported in Fig. 9c. To guarantee the stability of the frame, we employed two buoys, one of which was located 1.5 m below the surface to prevent the system from oscillating due to surface wave motions. The camera was positioned near the base of the metalimnion at 9 m for about 40 min (see Supporting Information Fig. S4). For the particle analysis, we removed the first 5 min of the video after deployment. To correct the apparent particle motion induced by residual oscillations of the frame, we smoothed the pixel signals x_C_i1k and y_C_i1k (blue lines in Supporting Information Fig. S5a,b) with a moving-average filter. The corrected real-world coordinates (blue lines in Supporting Information Fig. S5c,d) were then computed from the corrected pixel paths using Eq. 1 and the parameters in Table 1. Other low-pass filters may also be used, with a cutoff frequency that would depend on the nature of the oscillations.
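The smoothing step can be sketched with a plain centered moving average, as below; this is a Python illustration rather than the authors' implementation, and the window length is an assumption that should be matched to the oscillation period in the video.

```python
# Centered moving-average low-pass filter for the pixel trajectories
# x_C, y_C, suppressing apparent motion from residual frame oscillations.

def moving_average(signal, window=5):
    """Centered moving average; at the edges only the available samples
    are averaged, so the output has the same length as the input."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

x = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0]   # oscillating pixel coordinate
print(moving_average(x, window=3))
```

A Butterworth or other low-pass filter would serve the same purpose when a sharper cutoff is needed.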
A potential bias introduced by the SC algorithm should be mentioned. To ensure that a particle from camera j = 1 is correctly matched with the same one in camera j = 2, we disregarded particles with multiple matches according to the criteria in Eq. 5. Multiple matches correspond to similar-looking particles close to each other (i.e., within the searching area in Fig. 4 at the same time). This may lead to the temporary loss of parts of the tracks for a few particles. Other biases on the flux estimation may come from advection and from the camera exposure time. Advection may generate turbulent eddies when the water interacts with the camera frame, potentially breaking particle flocs or altering their velocity. The camera exposure time may affect the flux of small, slow-sinking particles, which would appear less frequently than the faster-sinking particulate.
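The ambiguity rule can be sketched as follows. Here a simple circular search radius stands in for the full criteria of Eq. 5, so the function and its parameters are assumptions for illustration only.

```python
# Sketch of the ambiguity rejection in the stereo-correspondence (SC) step:
# a particle from camera 1 is kept only when exactly one particle from
# camera 2 lies inside its search area.

def match_unambiguous(p1, cam2_particles, radius=10.0):
    """Return the unique camera-2 match of p1, or None when zero or
    multiple candidates lie within the search radius (pixels)."""
    hits = [p2 for p2 in cam2_particles
            if (p1[0] - p2[0])**2 + (p1[1] - p2[1])**2 <= radius**2]
    return hits[0] if len(hits) == 1 else None

cam2 = [(100.0, 50.0), (104.0, 52.0), (300.0, 80.0)]
print(match_unambiguous((101.0, 51.0), cam2))   # two nearby candidates
print(match_unambiguous((299.0, 80.0), cam2))   # unique match
```

Rejecting ambiguous frames rather than guessing avoids mixing the tracks of two neighboring particles, at the cost of the temporary track losses noted above.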
Possible extensions of the method to environments with strong currents, such as estuarine systems, bottom boundary layers of lakes, or coastal regions, include mounting the cameras on a rigid bottom-anchored platform to prevent any vibrations, or attaching the system to a submerged neutrally buoyant float moving with the mean current. The basic video-processing algorithm would remain essentially the same as presented here.

Comments and recommendations
In our deployments, the recording time was limited by the internal camera battery and may be prolonged by using an external one. Furthermore, the biological origin of the particles cannot be determined from the videos. Particles usually comprise dead algal cells, organic and inorganic debris, molts, and zooplankton carcasses (Grossart and Simon 1993; Kirillin et al. 2012; Dubovskaya et al. 2015). This issue could be addressed in future studies by increasing the camera resolution to 4K, which would sample twice as many pixels as our current setup and allow detection of tinier aggregates, and by further improving the illumination (i.e., adding another light source). The focal distance of the cameras may also be manually reduced to better focus on objects close to the lens, or cameras with adjustable focal length may be employed. In this way, the size resolution of particles could be improved beyond the present ~0.17 mm in length; doing so would also require reducing or completely removing the noise-reduction cone and (or) decreasing the distance between the screen and the camera. For example, if the size of the cone is set to 1 cm and the screen distance to 10 cm, the minimum detectable length would be between 0.02 and 0.2 mm. Further development of the particle track processing algorithm, for example by using spatial spectral characteristics, could allow differentiating between live and dead particles. This issue is particularly important in highly turbulent environments, where particles can undergo both upward and downward vertical movements over relatively short times, strongly reducing the overall sinking rate. Water samples may also be collected near the PTV system to create empirical time-dependent relationships between the particle and carbon flux as a function of particle size (McDonnell and Buesseler 2012).
Finally, our tracking method is based on algorithms that contain a total of eight calibration parameters. The four parameters of the KF, however, are already calibrated for this type of application. The only parameters that may depend on the deployment environment are those of the SC algorithm, which can be chosen following the methodology described above.