WO2023196654A1 - Monocular underwater camera biomass estimation
- Publication number
- WO2023196654A1 (PCT/US2023/017970)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- fish
- biomass
- data
- monocular
- image
- Prior art date
- Legal status: Ceased
Classifications
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
- A01K61/95—Sorting, grading, counting or marking live aquatic animals, e.g. sex determination, specially adapted for fish
- G06T7/50—Image analysis; depth or shape recovery
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
- G06V20/05—Underwater scenes
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands
- G06T2207/20084—Artificial neural networks [ANN]
- G06T2207/30242—Counting objects in image
- G06V2201/12—Acquisition of 3D measurements of objects
Description
- This specification generally relates to cameras that are used for biomass estimation, and particularly to underwater cameras that are used for aquatic livestock.
- A population of farmed fish may include fish of varying sizes, shapes, and health conditions.
- A worker may remove some fish from the fish pen and weigh them.
- The manual process of removing the fish from the fish pen and weighing them is both time intensive and potentially harmful to the fish.
- Because only a small portion of a fish population may be effectively measured in this way, the true characteristics of the population remain unknown.
- Biomass estimations, e.g., for individual fish or groups of fish, are generated by a model (e.g., a neural network, Random Forest Regressor, Support Vector Regressor, or Gaussian Process Regressor, among others) that is trained to generate predicted biomass based, e.g., on truss lengths.
- the biomass of fish populations may be used to control the amount of feed given to a fish population, e.g., by controlling a feed distribution system, as well as to identify and isolate runt, diseased, or other sub-populations.
- One innovative aspect of the subject matter described in this specification is embodied in a method that includes obtaining an image of a fish captured by a monocular underwater camera; providing the image of the fish to a depth perception model; obtaining output of the depth perception model indicating a depth-enhanced image of the fish; determining a biomass estimate of the fish based on the output; and determining an action based on one or more biomass estimates including the biomass estimate of the fish.
- implementations of this and other aspects include corresponding systems, apparatus, and computer programs, configured to perform the actions of the methods, encoded on computer storage devices.
- a system of one or more computers can be so configured by virtue of software, firmware, hardware, or a combination of them installed on the system that in operation cause the system to perform the actions.
- One or more computer programs can be so configured by virtue of having instructions that, when executed by data processing apparatus, cause the apparatus to perform the actions.
- actions include determining, based on the depth-enhanced image of the fish, a data set including a value that indicates a length between a first point on the fish and a second point on the fish.
- determining the biomass value of the fish includes providing the data set, including the value that indicates the length between the first point on the fish and the second point on the fish, to a model trained to predict biomass; and obtaining output of the model trained to predict biomass as the biomass value of the fish.
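- As a minimal, hypothetical sketch of this step (not the patent's implementation), the data set of truss lengths can be provided to a regression model such as the Random Forest Regressor named above; the feature names, measurements, and weights below are illustrative only.

```python
# Sketch: predicting fish biomass from truss lengths with a regression model.
# All feature names, lengths, and weights here are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Each row is one fish; columns are truss lengths in centimeters, e.g.
# [snout-to-eye, eye-to-dorsal-fin, dorsal-fin-to-tail, body-depth].
truss_lengths = np.array([
    [4.1, 18.2, 22.5, 12.3],
    [3.8, 16.9, 21.0, 11.1],
    [5.0, 21.4, 26.8, 14.7],
])
known_biomass_kg = np.array([3.1, 2.7, 4.6])  # ground-truth weights

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(truss_lengths, known_biomass_kg)

# At inference time, truss lengths measured from a depth-enhanced image
# are mapped to a biomass estimate for that fish.
new_fish = np.array([[4.3, 19.0, 23.1, 12.9]])
print(model.predict(new_fish))  # a biomass estimate near the training examples
```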
- actions include detecting the first point and second point on the fish.
- detecting the points includes providing the depth-enhanced image of the fish to a model trained to detect feature points on a fish body.
- actions include detecting the fish within the image using a model trained to detect fish.
- the action includes adjusting a feeding system providing feed to the fish.
- the action includes sending data including the biomass estimate to a user device, where the data is configured to, when displayed on the user device, present a user of the user device with a visual representation of the biomass estimate.
- actions include obtaining data from the monocular underwater camera indicating a current operation status of the monocular underwater camera; and in response to obtaining the image of the fish and the data from the monocular underwater camera, providing the image of the fish to the depth perception model.
- An advantage of the methods, systems, and apparatuses described herein includes reducing reliance on hardware for obtaining depth data from an environment, such as hardware for stereo cameras or light-based depth detection, among others.
- In a stereo camera setup, depth information can be determined from the visual differences of objects captured by the two or more cameras of the setup.
- stereo camera setups generally are more expensive to produce, calibrate, and maintain than single camera equivalents.
- Stereo camera setups can also be less efficient by requiring greater amounts of image data to be transferred to processing elements compared to single camera equivalents.
- Solutions described herein specify the use of a depth perception model to determine depth based solely on a 2-dimensional image of a single camera without depth data.
- Other possible solutions such as a time of flight (ToF) sensor present further issues, including environmental effects, that can affect accuracy.
- ToF sensors or other depth sensors can detect debris within an environment, such as water, as objects for distance measurement.
- Variable debris in the environment can similarly make it difficult to correlate the time of flight of a reflection with actual distance, because that time may depend on the amount of debris in the environment.
- Another innovative aspect of the subject matter described in this specification is embodied in a method that includes obtaining a plurality of images of fish captured by a monocular underwater camera; providing the plurality of images that were captured by the monocular underwater camera to a first model trained to detect one or more fish within the plurality of images; generating one or more values for each detected fish as a set of values; generating a biomass distribution of the fish based on the set of values; and determining an action based on the biomass distribution.
- implementations of this and other aspects include corresponding systems, apparatus, and computer programs, configured to perform the actions of the methods, encoded on computer storage devices.
- a system of one or more computers can be so configured by virtue of software, firmware, hardware, or a combination of them installed on the system that in operation cause the system to perform the actions.
- One or more computer programs can be so configured by virtue of having instructions that, when executed by data processing apparatus, cause the apparatus to perform the actions.
- generating the biomass distribution of the fish based on the set of values includes providing the set of values to a second model trained to estimate biomass distributions; and obtaining output of the second model as the biomass distribution of the fish.
- the one or more values for each detected fish include a value that indicates a length between a first point on a particular fish of the fish and a second point on the particular fish.
- actions include detecting the first point and second point on the particular fish.
- the action includes adjusting a feeding system providing feed to the fish.
- the action includes sending data including the biomass estimate to a user device, where the data is configured to, when displayed on the user device, present a user of the user device with a visual representation of the biomass estimate.
- actions include obtaining data from the monocular underwater camera indicating a current operation status of the monocular underwater camera; and in response to obtaining the plurality of images from the monocular underwater camera, providing the plurality of images to the first model.
- An advantage of the methods, systems, and apparatuses described herein includes reducing reliance on hardware for obtaining depth data from an environment, such as hardware for stereo cameras or light-based depth detection, among others, e.g., by using processes to generate accurate biomass estimations from images captured by monocular or single-lens cameras.
- In a stereo camera setup, depth information can be determined from the visual differences of objects captured by the two or more cameras of the setup.
- stereo camera setups generally are more expensive to produce, calibrate, and maintain than single camera equivalents.
- Stereo camera setups can also be less efficient by requiring greater amounts of image data to be transferred to processing elements compared to single camera equivalents.
- Solutions described herein specify the use of two-dimensional (2d) truss networks to determine biomass distributions for a population.
- Conventionally, truss networks are computed in three dimensions to account for changes in apparent distances between key points based on a position of a fish or other object in three dimensions.
- However, this approach relies on hardware for obtaining depth data from an environment, such as hardware for stereo cameras or light-based depth detection, among others.
- Solutions described herein specify the use of a model trained with 2d truss networks to generate an accurate biomass distribution prediction for a population of fish.
- ToF sensors can detect debris within an environment, such as water, as objects for distance measurement.
- Variable debris in the environment can similarly make it difficult to correlate the time of flight of a reflection with actual distance, because that time may depend on the amount of debris in the environment.
- FIG. 1 is a diagram showing an example of a system that is used for monocular underwater camera biomass estimation.
- FIG. 2 is a flow diagram showing an example of a process for monocular underwater camera biomass estimation.
- FIG. 3 is a diagram showing an example of another system that is used for monocular underwater camera biomass estimation.
- FIG. 4 is a flow diagram showing an example of another process for monocular underwater camera biomass estimation.
- FIG. 5 is a diagram showing an example of a truss network.
- FIG. 6 is a diagram illustrating an example of a computing system used for monocular underwater camera biomass estimation.
- FIG. 1 is a diagram showing an example of a system 100 that is used for underwater camera biomass estimation.
- the system 100 includes a control unit 116 and an underwater camera device 102.
- the control unit 116 obtains images captured by a camera of the camera device 102 and processes the images to generate biomass estimations for one or more fish.
- the biomass estimations for one or more fish can be processed to determine actions such as feed adjustment, sorting, model training, and user report feedback, among others.
- At least one of the one or more cameras of the camera device 102 includes a camera that captures images from a single viewpoint at a time.
- This type of camera may be referred to as a monocular camera.
- While a stereo camera setup can include multiple cameras, each capturing a unique viewpoint at a particular time, a monocular camera captures one viewpoint at a particular time.
- a computer processing output of a stereo camera setup can determine, based on differences in the appearance of objects in one viewpoint compared to another viewpoint at a particular time, depth information of the objects.
- the camera device 102 has one or more cameras in a stereo camera setup that are non-functional or obscured, e.g., by debris or other objects, including fish or other animals, in an environment.
- the control unit 116 can process one or more images from the camera device 102, or obtain a signal from the camera device 102, indicating that one or more cameras of the camera device 102 are obscured or non-functional or the camera device 102 is operating with a monocular camera.
- the control unit 116 can adjust a processing method based on the status of the cameras of the camera device 102, such as a stereo setup status or a monocular camera setup status.
- the camera device 102 includes a single camera.
- the single camera can be a monocular camera that captures a single viewpoint at a time.
- the camera device 102 with a single camera is more efficient to produce and maintain, more economical, and can be more robust with fewer components prone to failure.
- the system 100 also includes a feed controller unit 130 that controls the feed delivered by feed system 132.
- the feed controller unit 130 can include components configured to send control messages to actuators, blowers, conveyers, switches, or other components of the feed system 132.
- the control messages can be configured to stop, start, or change a meal provided to fish 106 in pen 104.
- the camera device 102 includes propellers to move the camera device 102 around the fish pen 104.
- the camera device 102 may use any method of movement including ropes and winches, waterjets, thrusters, tethers, buoyancy control apparatus, chains, among others.
- the camera device 102 is equipped with the control unit 116 as an onboard component, while in other implementations, the control unit 116 is not affixed to the camera device 102 and is external to the camera device 102.
- the camera device 102 may provide images 112 and 114 over a network to the control unit 116.
- the control unit 116 can provide return data, including movement commands to the camera device 102 over the network.
- Stages A through C of FIG. 1 depict image data 110, including image 112, obtained by the camera device 102 and processed by the control unit 116.
- the image 112 includes representations of the fish 113 and 115.
- While image 112 shows the fish 113 and 115 in a side profile view, images of fish obtained by the camera device 102 may include fish in any conceivable pose, including head on, reverse head on, or skewed.
- the camera device 102 obtains the image data 110 including image 112 of the fish 113 and 115 within the pen 104.
- the camera device 102 provides the data 110 to the control unit 116.
- the control unit 116 processes the images of the data 110, including the image 112.
- the control unit 116 provides the data 110 to the depth perception model 118.
- the depth perception model 118 is trained using the depth training data 124.
- the depth training data 124 can include images of one or more fish in an environment.
- the depth training data 124 can include information of features of the one or more fish and distance to the camera for each of the features.
- the depth training data 124 can include a distance to the camera of a head of a fish that is smaller than a distance to the camera of a side fin of the fish.
- the depth perception model 118 includes one or more fully or partially connected layers. Each of the layers can include one or more parameter values indicating an output of the layers. The layers of the depth perception model 118 can generate output indicating depth information of an image, such as the image 112.
- the control unit 116 trains the depth perception model 118.
- the control unit 116 can provide the depth training data 124 to the depth perception model 118.
- the depth training data 124 can include ground truth data indicating depths of features within an environment.
- the depth perception model 118 can generate an estimation of depths within an image.
- the control unit 116 can compare the estimations of the depth perception model 118 to ground truth data of the depth training data 124. Based on the comparison, the control unit 116 can adjust one or more parameters of the depth perception model 118 to adjust a subsequent output of the depth perception model 118.
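- A minimal sketch of one such training step is shown below, assuming a PyTorch-style model and mean squared error loss; the tiny convolutional network stands in for the depth perception model 118, whose actual architecture and training data are not specified here.

```python
# Sketch of one supervised training step for a monocular depth model.
# The model, data shapes, and optimizer settings are illustrative assumptions.
import torch
import torch.nn as nn

class TinyDepthNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Maps an RGB image to a single-channel depth map of the same size.
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)

model = TinyDepthNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

image = torch.rand(1, 3, 240, 320)       # stand-in for an image from data 110
true_depth = torch.rand(1, 1, 240, 320)  # stand-in for ground-truth depths

optimizer.zero_grad()
predicted_depth = model(image)               # estimate depths within the image
loss = loss_fn(predicted_depth, true_depth)  # compare estimate to ground truth
loss.backward()                              # compute parameter adjustments
optimizer.step()                             # adjust parameters for subsequent output
```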
- the control unit 116 provides output of the depth perception model 118 to the object detector 120.
- the object detector 120 obtains the output.
- the output can include a depth enhanced image of the fish 115.
- the output can include a depth enhanced image of objects depicted in the image 112, including the fish 113 and the fish 115.
- the output can include depth enhanced versions of one or more images of the data 110.
- the depth-enhanced versions of the one or more images of the data 110 improve detection of the object detector 120.
- the object detector 120 can determine, based on a depth-enhanced version of an image, a distance to a feature depicted in the image.
- the feature may represent a fish far away from the camera device 102. Without depth information, the object detector 120 may misdetect the fish as a piece of debris.
- the object detector 120 can provide more accurate identifications of objects. For example, the size and shape of fish can help identify the species and other attributes of a fish. By processing a depth-enhanced version of an image, the object detector 120 can more accurately identify fish based on a determination of the actual size of the features represented by an image.
- the depth-enhanced versions of the one or more images of the data 110 improve detection of the pose detector 122.
- depth data of a depth-enhanced version of an image can provide sufficient data for the pose detector 122 to determine a relationship between a distance to the camera 102 for one or more parts on a fish, such as the fish 115.
- For example, the pose detector 122 can determine that the tail of the fish is closer to or farther from the camera than the nose of the fish. Such a position can distort a corresponding truss network, making the lengths appear shorter depending on the angle of the fish.
- the pose detector 122 can determine such a pose with accuracy resulting in more accurate truss network generation and biomass estimation compared to estimating the pose based solely on visual appearance of the fish. For example, a slight angle in pose may not result in detectable visual features in the image potentially resulting in a truss network, generated without depth data, that represents a smaller fish than reality. However, with a depth-enhanced version of an image, the pose can be detected with accuracy even at relatively small angles.
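- A minimal sketch of how per-pixel depth enables measurements in three dimensions is shown below, assuming a simple pinhole camera model; the focal length, principal point, key-point pixels, and depths are illustrative assumptions rather than values from the system.

```python
# Sketch: back-projecting two key points into 3-D using per-pixel depth,
# then measuring the truss length between them in meters.
import math

def backproject(u, v, depth_m, fx, fy, cx, cy):
    """Convert pixel (u, v) with known depth into camera-frame x, y, z."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)

fx = fy = 800.0        # assumed focal length in pixels
cx, cy = 320.0, 240.0  # assumed principal point

# Nose and tail key points with depths taken from a depth-enhanced image.
nose = backproject(350, 260, 1.20, fx, fy, cx, cy)
tail = backproject(520, 300, 1.45, fx, fy, cx, cy)

length_m = math.dist(nose, tail)  # 3-D distance, unaffected by foreshortening
print(f"3D truss length: {length_m:.3f} m")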
- control unit 116 can process one or more images of the data 110 in aggregate or process each image of the data 110 individually.
- one or more components of the control unit 116 process items of the data individually and one or more components of the control unit 116 process the data 110 in aggregate.
- the depth perception model 118 can process each image of the data 110 individually.
- the depth perception model 118 can generate depth-enhanced images and provide the images to the object detector 120 and the pose detector 122.
- the object detector 120 and the pose detector 122 can process the depth-enhanced images of the depth perception model 118 individually and provide data to the biomass engine 127.
- the biomass engine 127 processes data from the depth perception model 118, the object detector 120, and the pose detector 122, individually. In general, processing individually as discussed herein, can include processing one or more items sequentially or in parallel using one or more processors.
- the decision engine 129 processes data in aggregate. For example, the decision engine 129 can determine, based on one or more data values indicating biomass estimates provided by the biomass engine 127, one or more decisions and related actions, as described herein. In general, processing in aggregate as discussed herein, can include processing data corresponding to two or more items of the data 110 to generate a single result. In some implementations, an item of the data 110 includes the image 112.
- the object detector 120 includes one or more feature recognition algorithms.
- the object detector 120 includes one or more machine learning models trained to detect portions of animals, such as key points on a fish, cow, or other animal. For example, the object detector 120 can obtain a portion of the image 112 and detect the fish 113 and the fish 115. The object detector 120 can detect key points of the fish 113 and 115 such as a head, nose, fins, among others. The object detector 120 can be trained using labeled images of fish or other objects.
- the object detector 120 generates bounding boxes around detected objects. For example, the object detector 120 can detect the fish 115 and generate a bounding box for the fish 115.
- the object detector 120 can similarly detect a key point on the fish 115 and store the location of the key point and provide the location as output to another component of the control unit 116.
- a bounding box for the fish 115 can indicate a position of the fish 115 within the image 112.
- Other bounding boxes can indicate positions of key points.
- the control unit 116 provides output of the object detector 120 to the pose detector 122.
- the pose detector 122 determines a pose of the fish detected by the object detector 120.
- the pose detector 122 receives one or more object detections, such as key points on a fish, from the object detector 120.
- the pose detector 122 receives one or more object detections from images including the image 112 of the fish 113 and the fish 115.
- the pose detector 122 receives depth information from the depth perception model 118.
- the object detections and the depth information enable the pose detector 122 to estimate a pose of one or more detected fish in three dimensions (3D), such as the fish 113 or the fish 115.
- the pose detector 122 includes a truss network generation module to generate truss networks for detected objects.
- a truss network generation module can generate a truss network 126 for the fish 115.
- the truss network generation module can generate the truss network 126 based on the image 112, the output of the depth perception model 118, the output of the object detector 120, or the output of the pose detector 122.
- An example of a truss network is shown in FIG. 5.
- The pose detector 122 is used to interpret distances between key points and thus determine truss lengths. For example, if the fish 115 is at a 15 degree angle from an imaginary line extending from the camera device 102, key points may appear close together, and thus truss lengths may appear smaller than their true values. By using the pose detector 122, the 3D pose of the fish, including its angle, is accounted for and accurate measurements of the fish 115 are made.
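- A short worked example of this foreshortening, under the assumption that the fish's body axis is at a 15 degree angle from the camera's line of sight, so that only the component perpendicular to that line is visible in the image; the lengths are illustrative.

```python
# Sketch: apparent vs. true length for a fish nearly aligned with the
# camera's line of sight. The 40 cm length and 15 degree angle are assumed.
import math

true_length_cm = 40.0
angle_from_line_of_sight_deg = 15.0
projection = math.sin(math.radians(angle_from_line_of_sight_deg))

apparent_length_cm = true_length_cm * projection
print(f"apparent length: {apparent_length_cm:.1f} cm")  # ~10.4 cm

# Knowing the 3-D pose (the angle), the true length can be recovered.
recovered_cm = apparent_length_cm / projection
print(f"recovered length: {recovered_cm:.1f} cm")       # 40.0 cm
```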
- the control unit 116 provides output of the pose detector 122 to the biomass engine 127.
- the biomass engine 127 obtains the output of the pose detector 122.
- the biomass engine 127 includes one or more models trained to predict a biomass of a fish based on a truss network of a fish.
- the control unit 116 can obtain ground truth data, similar to the depth training data 124, that includes one or more truss networks with corresponding known biomasses.
- the ground truth data can include a first truss network, with specific truss lengths, and a corresponding label indicating the known biomass of a fish from which the truss network was derived.
- the control unit 116 can provide truss network data to the one or more models.
- the control unit 116 can obtain output of the one or more models indicating a predicted biomass based on the truss network data.
- the control unit 116 can compare the predicted biomass to the known biomass of the ground truth data and generate an error term.
- the control unit 116 can provide the error term to the one or more models or directly adjust one or more parameters of the one or more models based on the error term.
- the one or more parameters can be adjusted to improve the accuracy of biomass estimations or reduce a difference value between the known biomass values of ground truth data and predicted values.
- the biomass engine 127 generates biomass estimations of one or more fish based on the data provided by the pose detector 122.
- the biomass engine 127 generates a biomass estimation for one or more fish in the fish pen 104, including the fish 113 and the fish 115.
- the biomass engine 127 can generate a biomass estimation of 5 kilograms for the fish 113 and 8 kilograms for the fish 115, as shown in the example of FIG. 1.
- the biomass engine 127 can generate biomass estimation data 128 including a biomass estimation for the fish 113 and the fish 115.
- the biomass estimation data 128 can include one or more additional biomass estimations for one or more additional detected fish based on one or more additional images of the data 110 obtained by the control unit 116 and processed by the control unit 116.
- the control unit 116 determines an action based on the output, including the biomass estimation data 128, of the biomass engine 127.
- the control unit 116 provides output of the biomass engine 127, including one or more biomass estimation values, to the decision engine 129.
- the decision engine 129 determines an action based on processing a single biomass estimation generated by the biomass engine 127. For example, the decision engine 129 can compare a single biomass estimation to a threshold. If the estimation satisfies the threshold, the decision engine 129 can determine an action based on the estimation satisfying the threshold.
- the decision engine 129 determines an action based on processing two or more biomass estimations generated by the biomass engine 127. For example, the decision engine 129 can determine two or more biomass estimations satisfy one or more thresholds. In some implementations, the decision engine 129 determines that a portion of the fish 106, based on data generated by the biomass engine 127, are below an expected weight or below the weights of others of the fish 106. For example, the decision engine 129 can determine subpopulations within the fish 106 and determine one or more actions based on the determined subpopulations, such as actions to mitigate or correct for issues (e.g., runting, health issues, infections, disfiguration, among others). Actions can include feed adjustment, sorting, model training, and user report feedback, among others.
- the output of the biomass engine 127 includes one or more biomass distributions.
- the biomass engine 127 can provide biomass estimation for multiple fish.
- the decision engine 129 can detect one or more features of the biomass estimations.
- the decision engine 129 can detect one or more subpopulations. Subpopulations can include runt fish, healthy fish, diseased fish, among others.
- the decision engine 129 detects a runt subpopulation based on processing the output of the biomass engine 127.
- the decision engine 129 can include one or more algorithms or trained models to detect groups within a distribution of data.
- the decision engine 129 can include one or more processors configured to perform clustering algorithms such as k-means, partitioning methods, hierarchical clustering, fuzzy clustering, density-based clustering, model-based clustering, among others.
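- A minimal sketch of one of these options, k-means clustering of per-fish biomass estimates to surface a lower-weight (possibly runt) subpopulation; assuming scikit-learn, with illustrative biomass values.

```python
# Sketch: clustering biomass estimates into two groups and reporting the
# lighter group as a candidate runt subpopulation. Values are illustrative.
import numpy as np
from sklearn.cluster import KMeans

biomass_kg = np.array([4.8, 5.1, 5.0, 4.9, 2.1, 2.3, 5.2, 2.0]).reshape(-1, 1)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(biomass_kg)
centers = kmeans.cluster_centers_.ravel()
runt_cluster = int(np.argmin(centers))  # cluster with the smaller mean biomass

runt_fish = biomass_kg[kmeans.labels_ == runt_cluster].ravel()
print(f"candidate runt subpopulation: {runt_fish}, mean {runt_fish.mean():.2f} kg")
```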
- control unit 116 determines an adjustment of feed using the feed controller unit 130 controlling the feed system 132.
- the control unit 116 can provide the output of the biomass engine 127 or a control signal to the feed controller unit 130.
- the feed controller unit 130 can either process the output of the biomass engine 127 to determine an adjustment of feed and provide a control signal to the feed system 132 or can provide the control signal provided by the control unit 116 to the feed system 132.
- the decision engine 129 does not detect a runt subpopulation.
- the decision engine 129 can detect a biomass distribution from the output of the biomass engine 127 that does or does not satisfy a biomass requirement or threshold, such as a biomass requirement for distribution or sale.
- the decision engine 129 can determine, based on features of the biomass distribution, what action to perform.
- the control unit 116 can provide a control signal to a sorting actuator of a sorting system to sort one or more fish from the fish 106 or can provide a control signal to adjust a feeding of the fish 106.
- the control unit 116 can sort the fish 106 based on biomass.
- the control unit 116 can send a signal to a sorting system that sorts fish based on one or more criteria, such as a threshold biomass, into multiple locations based on the one or more criteria.
- control unit 116 includes the feed controller unit 130.
- control unit 116 may control both the processing of the images in the data 110 and the adjustments to the feeding by controlling the feed system 132.
- the control unit 116 adjusts feeding to provide feed to a certain area of the fish pen 104.
- the obtained data 110 can include positions of the fish detected within the images of the obtained data 110.
- the control unit 116 can determine based on one or more subpopulations detected by the decision engine 129 of the control unit 116 that a given subpopulation requires additional feed.
- the control unit 116 can send a control signal to the feed system 132 or to the control unit 130 for the feed system 132 configured to adjust the location of an output of feed.
- the control unit 116 can adjust the location of an output of feed to a location of one or more fish within a particular subpopulation or an average location of the subpopulation.
- the feed system 132 includes multiple food types.
- the controller unit 130 can provide control messages to the feed system 132 to change the food type provided to the fish 106.
- the multiple food types include a medicated food type and a non-medicated food type.
- the multiple food types include food with a particular nutritional value and food with a different nutritional value.
- the controller unit 130 can determine, based on data from the control unit 116, which food to provide to the fish 106, how much food to provide, when to provide the food, and at what rate to provide the food.
- the controller unit 130 can generate a meal plan based on data from the control unit 116, such as biomass estimations or a control signal generated based on biomass estimations, where the meal plan includes one or more of: a feed type, a feed rate, a feed time, and a feed amount.
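- A minimal, hypothetical sketch of such a meal plan; the feed types, rate, time, and the one-percent-of-biomass ration rule below are illustrative assumptions, not values from the specification.

```python
# Sketch: building a meal plan from biomass-derived data. The rules and
# field values are assumed for illustration only.
def build_meal_plan(mean_biomass_kg: float, runt_detected: bool) -> dict:
    # Assumed rule: switch to a medicated feed type when a runt subpopulation
    # is detected, otherwise use a standard feed type.
    feed_type = "medicated" if runt_detected else "standard"
    # Assumed rule of thumb: daily ration of about 1% of estimated body weight.
    feed_amount_kg = round(0.01 * mean_biomass_kg, 3)
    return {
        "feed_type": feed_type,
        "feed_rate_kg_per_min": 0.5,
        "feed_time": "08:00",
        "feed_amount_kg": feed_amount_kg,
    }

print(build_meal_plan(mean_biomass_kg=4.2, runt_detected=True))
```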
- control unit 116 includes multiple computer processors.
- the control unit 116 can include a first and a second computer processor communicably connected to one another.
- the first and the second computer processor can be connected by a wired or wireless connection.
- the first computer processor can perform one or more of the operations of the depth perception model 118, the object detector 120, the pose detector 122, the biomass engine 127, or the decision engine 129.
- the first computer processor can store or provide the depth training data 124.
- the second computer processor can perform one or more of the operations of the depth perception model 118, the object detector 120, the pose detector 122, the biomass engine 127, or the decision engine 129.
- the second computer processor can store or provide the depth training data 124. Operations not performed by the first computer processor can be performed by the second computer processor or an additional computer processor. Operations not performed by the second computer processor can be performed by the first computer processor or an additional computer processor.
- the control unit 116 operates one or more processing components, such as the depth perception model 118, the object detector 120, the pose detector 122, the biomass engine 127, or the decision engine 129.
- the control unit 116 communicates with an external processor that operates one or more of the processing components.
- the control unit 116 can store the depth training data 124, or other data used to train one or more models of the processing components, or can communicate with an external storage device that stores data including the depth training data 124.
- the biomass engine 127 generates, for each fish and across multiple weight ranges, the likelihood that the fish’s weight is within a given weight range.
- the collection of likelihoods for each weight range corresponding to a fish can be represented as a distribution.
- weight ranges for a given biomass distribution may include ranges from 3 to 3.1 kilograms (kg), 3.1 to 3.2 kg, and 3.2 to 3.3 kg.
- a likelihood that the actual biomass of the fish 113 is within the first range, 3 to 3.1 kg, can be 10 percent.
- a likelihood that the biomass of the fish 113 is within the second or third range, 3.1 to 3.2 kg or 3.2 to 3.3 kg, respectively, can be 15 percent and 13 percent.
- the sum of all likelihoods across all weight ranges can be normalized (e.g., equal to a value such as 1, or a percentage such as 100 percent).
- the weight ranges are indicated as values on an x axis of a distribution representation.
- a Gaussian, or Gaussian-like form can indicate a likelihood, shown on the y axis, for each weight range across a range of weight ranges, shown on the x axis.
- the biomass engine 127 generates a specific biomass for each fish. In some implementations, the biomass engine 127 generates a biomass distribution for each fish. Instead of generating a distribution, the biomass engine 127, including one or more trained models, can obtain the data 110, including image 112, and generate an indication of the most likely biomass based on the data. In some cases, truss length measurements can be used as input data. In other cases, the biomass engine 127 can simply generate biomass indications based on obtained images.
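- A minimal sketch of a per-fish distribution over weight ranges, using the illustrative ranges and likelihoods from the example above and normalizing so the shown likelihoods sum to 1; the expectation at the end is one hypothetical way to reduce the distribution to a single biomass value.

```python
# Sketch: per-fish biomass likelihoods over weight ranges, normalized,
# plus an expected-value reduction to a point estimate. Values are illustrative.
import numpy as np

weight_ranges_kg = [(3.0, 3.1), (3.1, 3.2), (3.2, 3.3)]
raw_likelihoods = np.array([0.10, 0.15, 0.13])  # 10%, 15%, 13% for fish 113

normalized = raw_likelihoods / raw_likelihoods.sum()
for (lo, hi), p in zip(weight_ranges_kg, normalized):
    print(f"{lo:.1f}-{hi:.1f} kg: {p:.2f}")

# One way to obtain a single biomass value: the expectation over range midpoints.
midpoints = np.array([(lo + hi) / 2 for lo, hi in weight_ranges_kg])
expected_biomass = float((midpoints * normalized).sum())
print(f"expected biomass: {expected_biomass:.2f} kg")
```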
- FIG. 2 is a flow diagram showing an example of a process 200 for monocular underwater camera biomass estimation. The process 200 may be performed by one or more systems, for example, the system 100 of FIG. 1 or the system 300 of FIG. 3.
- the process 200 includes obtaining an image of a fish captured by a monocular underwater camera (202).
- the control unit 116 can obtain the data 110 from the camera device 102.
- the data 110 can include the image 112.
- the image 112 can include a depiction of the fish 115.
- the control unit 116 obtains the data 110 from another device communicably connected to the camera device 102.
- the camera device 102 can send the data 110 to one or more intermediary devices that provides the data 110 to the control unit 116.
- the process 200 includes providing the image of the fish to a depth perception model (204).
- the control unit 116 can provide the data 110 to the depth perception model 118.
- the process 200 includes obtaining data from the monocular underwater camera indicating a current operation status of the monocular underwater camera; and in response to obtaining the image of the fish and the data from the monocular underwater camera, the process 200 includes providing the image of the fish to the depth perception model.
- the control unit 116 can determine, based on data obtained from the monocular underwater camera, whether or not to process the data 110 using the depth perception model 118.
- the process 200 includes obtaining output of the depth perception model indicating a depth-enhanced image of the fish (206).
- the control unit 116 can obtain output data from the depth perception model 118.
- the output can include a depth enhanced image of the fish 115.
- the output can include a depth enhanced image of objects depicted in the image 112, including the fish 113 and the fish 115.
- the output can include depth enhanced versions of one or more images of the data 110.
- the process 200 includes determining a biomass estimate of the fish based on the output (208).
- the control unit 116 or the depth perception model 118 directly, can provide depth-enhanced versions of one or more images to the object detector 120 and the pose detector 122.
- the pose detector 122 generates a truss network, such as the network 126, and provides it to the biomass engine 127.
- the control unit 116 determines, based on a depth-enhanced image of a fish, a data set including a value that indicates a length between a first point on the fish and a second point on the fish.
- the pose detector 122 of the control unit 116 can generate a truss network 126 as shown in FIG. 1 for a fish 115.
- the truss network 126 can include one or more values indicating one or more lengths between multiple points on the fish 115.
- the lengths can be determined for one or more objects detected by the object detector 120 and based on a pose detected by the pose detector 122.
- the pose can greatly affect the accuracy of truss networks generated.
- the depth-enhanced versions of images allow the control unit 116 to accurately generate truss networks for fish based on the three-dimensional pose of a given fish.
- the control unit 116 provides a data set including a value that indicates a length between a first point on a fish and a second point on a fish to a model trained to predict biomass.
- the biomass engine can include a model trained to predict biomass.
- the control unit 116 can obtain data from the pose detector 122, including a value that indicates a length between a first point on a fish and a second point on the fish, and provide the data to the biomass engine 127.
- control unit 116 obtains output of a model trained to predict biomass as the biomass value of the fish.
- control unit 116 can obtain output of the biomass engine 127, such as the estimation data 128.
- the control unit 116 can provide the output to the decision engine 129.
- the control unit 116 detects a first point and a second point on a fish.
- the object detector 120 can include one or more feature detectors, including a feature detector for particular features of a fish including key points, such as eyes, fins, nose, among others.
- the control unit 116 provides a depth-enhanced image of a fish to a model trained to detect feature points on a fish body.
- the object detector 120 can include one or more trained models.
- a model of the one or more models can be trained with data that includes labeled features of fish.
- the model can provide predicted features of fish based on input images of fish.
- the control unit 116 or the model itself can adjust one or more parameters of the model based on a comparison between the known features provided by the labels of the training data and the predicted features of fish provided by the model during training.
- the control unit 116 detects fish within an image using a model trained to detect fish.
- the object detector 120 can include a model trained to detect fish, such as the fish 113 and the fish 115.
- the model can be trained with data that includes labeled fish within an image of one or more fish.
- the model can provide predicted fish detections, such as bounding boxes or other indications of locations and identifications of the detected fish, based on input images of fish.
- the control unit 116 or the model itself can adjust one or more parameters of the model based on a comparison between the known features provided by the labels of the training data and predicted fish detections provided by the model during training.
- the process 200 includes determining an action based on one or more biomass estimates including the biomass estimate of the fish (210).
- the action includes adjusting a feeding system providing feed to fish.
- the control unit 116 can provide the output of the biomass engine 127 or a control signal to the feed controller unit 130.
- the feed controller unit 130 can either process the output of the biomass engine 127 to determine an adjustment of feed and provide a control signal to the feed system 132 or can provide the control signal provided by the control unit 116 to the feed system 132.
- the action includes sending data including the biomass estimate to a user device, where the data is configured to, when displayed on the user device, present a user of the user device with a visual representation of the biomass estimate.
- the control unit 116 can generate a data signal that includes an indication of one or more biomass estimates, such as the estimate data 128.
- the control unit 116 waits for feedback from a user provided a visual representation of a biomass estimate to confirm an action determined by the decision engine 129, such as a feed adjustment.
- the control unit 116 obtains data from a monocular underwater camera indicating a current operation status of the monocular underwater camera and, in response to obtaining an image of a fish and the data from the monocular underwater camera, provides the image of the fish to a depth perception model.
- the camera 102 may include a dysfunctional stereo camera pair. The dysfunction can result in images without the stereo effect that provides depth data.
- the camera device 102 can send a signal to the control unit 116 indicating a camera of a stereo pair is dysfunctional after the camera device 102 determines a camera of a stereo pair has become dysfunctional.
- the control unit 116 can process images obtained from the camera device 102 using the depth perception model 118.
- the control unit 116 determines, based on the data 110 or other data provided by the camera device 102, the images do not include depth data. For example, the control unit 116 can process the data 110 and determine that images lack a stereo feature. In response, the control unit 116 can provide the images to the depth perception model 118 to determine the missing depth data.
- the processing switch can be automatic: intermittent data processing issues or environmental effects, such as fish or debris obscuring or blocking an image of a stereo pair, can be detected by the control unit 116, which can then determine to process the corresponding images with the depth perception model 118 and process them accordingly.
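- A minimal sketch of such an automatic processing switch; all function names and return values below are hypothetical stubs, not the system's actual interfaces.

```python
# Sketch: route images through a monocular depth model whenever stereo depth
# is unavailable. The stub functions only stand in for the real components.
def depth_from_stereo(image_pair):
    # Placeholder for disparity-based depth from a working stereo pair.
    return {"source": "stereo", "depth": ...}

def depth_perception_model(image):
    # Placeholder for depth estimated from a single 2-D image.
    return {"source": "monocular_model", "depth": ...}

def process(images, stereo_operational: bool):
    if stereo_operational:
        return [depth_from_stereo(pair) for pair in images]
    # Monocular fallback, e.g., after one camera of a stereo pair is obscured.
    return [depth_perception_model(img) for img in images]

print(process(["img_0", "img_1"], stereo_operational=False))
```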
- FIG. 3 is a diagram showing an example of a system 300 that is used for underwater camera biomass estimation.
- the system 300 includes a control unit 316 and an underwater camera device 302.
- the control unit 316 obtains images captured by a camera of the camera device 302 and processes the images to generate biomass estimations for one or more fish.
- the biomass estimations for one or more fish can be processed to determine actions such as feed adjustment, sorting, model training, and user report feedback, among others.
- control unit 316 includes one or more components for processing data.
- control unit 316 can include an object detector 318, truss length engine 320, biomass engine 324, and a decision engine 328.
- Components can include one or more processes that are executed by the control unit 316.
- At least one of the one or more cameras of the camera device 302 includes a camera that captures images from a single viewpoint at a time.
- This type of camera may be referred to as a monocular camera.
- While a stereo camera setup can include multiple cameras, each capturing a unique viewpoint at a particular time, a monocular camera captures one viewpoint at a particular time.
- a computer processing output of a stereo camera setup can determine, based on differences in the appearance of objects in one viewpoint compared to another viewpoint at a particular time, depth information of the objects.
- the camera device 302 has one or more cameras in a stereo camera setup that are non-functional or obscured, e.g., by debris or other objects, including fish or other animals, in an environment.
- the control unit 316 can process one or more images from the camera device 302, or obtain a signal from the camera device 302, indicating that one or more cameras of the camera device 302 are obscured or non-functional or the camera device 302 is operating with a monocular camera.
- the control unit 316 can adjust a processing method based on the status of the cameras of the camera device 302, such as a stereo setup status or a monocular camera setup status.
- As used herein, a monocular underwater camera includes stereo camera setups that have become, operationally, a monocular camera setup due to nonfunctioning elements, debris, among other causes.
- In such cases, stereo camera setups obtain images from a single viewpoint at a given time and can be considered, operationally, a monocular camera.
- the camera device 302 includes a single camera.
- the single camera can be a monocular camera that captures a single viewpoint at a time.
- the camera device 302 with a single camera is more efficient to produce and maintain, more economical, and can be more robust with fewer components prone to failure.
- the system 300 also includes a feed controller unit 330 that controls the feed delivered by feed system 332.
- the feed controller unit 330 can include components configured to send control messages to actuators, blowers, conveyers, switches, or other components of the feed system 332.
- the control messages can be configured to stop, start, or change a meal provided to fish 306 in pen 304.
- the camera device 302 includes propellers to move the camera device 302 around the fish pen 304.
- the camera device 302 may use any method of movement including ropes and winches, waterjets, thrusters, tethers, buoyancy control apparatus, chains, among others.
- the camera device 302 is equipped with the control unit 316 as an onboard component, while in other implementations, the control unit 316 is not affixed to the camera device 302 and is external to the camera device 302.
- the camera device 302 may provide images 312 and 314 over a network to the control unit 316.
- the control unit 316 can provide return data, including movement commands to the camera device 302 over the network.
- the image 312 includes representations of the fish 313 and 315.
- Images of fish obtained by the camera device 302 may include fish in any conceivable pose including head on, reverse head on, or skewed.
- the camera device 302 obtains the image data 310 including image 312 of the fish 313 and 315 within the pen 304.
- the camera device 302 provides the data 310 to the control unit 316.
- the camera device 302 obtains multiple images and provides the multiple images, including the image 312, to the control unit 316.
- the control unit 316 processes the images of the data 310, including the image 312.
- the control unit 316 provides the data 310 to the object detector 318.
- the object detector 318 can run on the control unit 316 or be communicably connected to the control unit 316.
- the object detector 318 detects one or more objects in the images of the data 310.
- the one or more objects can include large scale objects, such as fish, as well as smaller objects, such as key points of the fish.
- the object detector 318 detects the fish 313 and the fish 315 in the image 312.
- the object detector 318 can compute a bounding box for each detection.
- the object detector 318 can compute a bounding box 318a indicating a location of the fish 313 and a bounding box 318b indicating a location of the fish 315.
- the object detector 318 can compute bounding boxes for the key points 318c.
- Instead of bounding boxes, other types of data indicating a location can be used, such as a specific location specified by one or more values (e.g., x and y values in a coordinate plane of the image 312, spherical coordinates, among others).
- the object detector 318 detects key points 318c of the fish 313 and fish 315. As shown in FIG. 3, the key points 318c of the fish 313 and the fish 315 represent locations on the fish 313 and the fish 315. The locations can include a mouth, upper fin, tail, eye, among others.
- the object detector 318 can include one or more machine learning models. The machine learning models can be trained to detect key points as well as objects such as fish. In some implementations, the machine learning models are trained by the control unit 316.
- machine learning models of the object detector 318 are trained to detect key points.
- the object detector 318 can obtain images including one or more key points.
- the object detector 318 can determine a predicted location for each of the one or more key points.
- the object detector 318 can obtain ground truth data indicating an actual location for each of the one or more key points. Based on comparing the ground truth data to the predicted location for the one or more key points, the object detector 318 can determine an error term.
- the object detector 318 can then adjust one or more parameters of the machine learning models according to the error term in order to improve subsequent predictions.
- machine learning models of the object detector 318 are trained to detect fish or other animals.
- the object detector 318 can obtain images including one or more depictions of fish.
- the object detector 318 can determine a predicted location for the one or more depictions of fish.
- the object detector 318 can obtain ground truth data indicating an actual location for the depicted fish. Based on comparing the ground truth data to the predicted location for the depicted fish, the object detector 318 can determine an error term.
- the object detector 318 can then adjust one or more parameters of the machine learning models according to the error term in order to improve subsequent predictions.
- the truss length engine 320 obtains detected objects from the object detector 318.
- the control unit 316 obtains detected object data from the object detector 318 and provides the detected object data to the truss length engine 320.
- the truss length engine 320 generates truss networks 320a and 320b for the fish 313 and the fish 315, respectively.
- the truss network 320b for the fish 315 is shown in more detail in FIG. 3.
- the truss networks 320a and 320b indicate distances between locations detected on the fish 313 and 315. These distances can be associated with unique properties of the fish such as biomass, identity, or health.
- the truss networks 320a and 320b are two dimensional (2d) truss networks.
- the truss length engine 320 determines a distance between a first key point of the key points 318c and a second key point of the key points 318c based on a number of pixels in the image 312 separating the first key point from the second key point. Such a determination does not rely on depth information from the image 312. Truss lengths generated in this way may inaccurately indicate features of a fish, such as biomass, depending on the pose of the fish (e.g., truss lengths for a fish facing head on towards the camera 302 can appear smaller than they would if the fish were facing side on).
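- A minimal sketch of a 2d truss length computed purely as a pixel distance between detected key points, with no depth information; the key-point coordinates are illustrative.

```python
# Sketch: 2-D truss lengths measured as pixel distances in the image plane.
# Key-point names and coordinates are illustrative assumptions.
import math

key_points = {
    "mouth": (412, 250),
    "eye": (398, 238),
    "dorsal_fin": (320, 210),
    "tail": (205, 262),
}

def truss_length_px(a, b):
    return math.dist(key_points[a], key_points[b])

print(f"mouth-to-tail: {truss_length_px('mouth', 'tail'):.1f} px")
print(f"eye-to-dorsal-fin: {truss_length_px('eye', 'dorsal_fin'):.1f} px")
```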
- Nevertheless, the biomass engine 324 can generate accurate predictions for the population of fish 306, as historical training data likely has a similar distribution of fish poses and corresponding generated truss networks.
- the control unit 316 can generate accurate biomass distributions for the fish 306.
- the truss length engine 320 generates truss data 322 including the truss networks 320a and 320b.
- the truss data 322 includes truss networks generated for multiple fish, such as all the fish 306.
- the biomass engine 324 obtains the truss data 322.
- the control unit 316 obtains the truss data 322, including the truss networks 320a and 320b, and provides the truss data to the biomass engine 324.
- the biomass engine 324 generates a biomass distribution 326 based on the truss data 322.
- the biomass engine 324 includes one or more machine learning models trained to generate biomass distributions for one or more fish.
- the biomass engine 324 can obtain truss networks for a population of fish.
- the biomass engine 324, and one or more models therein, can generate a predicted biomass distribution for the population of fish based on the obtained truss networks.
- the biomass engine 324 can then compare the predicted biomass distribution with a known, or previously calculated, biomass distribution for the population of fish.
- the biomass engine 324 can adjust one or more parameters of the one or more machine learning models of the biomass engine 324 based on a result of comparing the predicted biomass distribution to the known biomass distribution.
- Various training techniques including gradient descent, backpropagation, among others, can be used in training models of components of the control unit 316, including the biomass engine 324 component.
- the known biomass distribution can be determined using one or more algorithms, models, or manual weighing of fish.
- the population of fish from which training data is derived can be different from the fish 306.
- the population of fish can be stored in a historical database including records from previous populations of fish.
- the control unit 316 or processors configured to operate similar to the control unit 316, generates truss data and provides the truss data to a database for later use in training.
- the control unit 316 can provide the truss data 322 to a database.
- the truss data 322 can be stored and used for later processes, such as subsequent training of one or more models.
- training data can include 2d truss networks as described herein.
- the biomass distribution 326 generated by the biomass engine 324 includes one or more biomasses, corresponding to one or more fish, of the fish 306.
- a number of fish corresponding to ranges of biomass is determined by the biomass engine 324.
- the biomass engine 324 can determine that two fish correspond to biomass range 3.5 kilogram to 3.7 kilogram and three fish correspond to the biomass range 3.7 kilogram to 3.8 kilogram.
- the ranges are predetermined.
- the ranges are dynamically chosen by the control unit 316 based on the number of fish and the distribution of biomasses.
- the biomass distribution 326 is a histogram.
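- A minimal sketch of building such a histogram, assuming NumPy; the per-fish biomass values mirror the example above (two fish between 3.5 and 3.7 kilograms, three between 3.7 and 3.8 kilograms).

```python
# Sketch: counting fish per biomass range to form a population histogram.
# The per-fish biomass values and bin edges are illustrative.
import numpy as np

per_fish_biomass_kg = np.array([3.55, 3.62, 3.71, 3.74, 3.78])

# Predetermined ranges (3.5-3.7 kg and 3.7-3.8 kg, as in the example above);
# ranges could instead be chosen dynamically, e.g. with np.histogram_bin_edges.
bin_edges = np.array([3.5, 3.7, 3.8])
counts, _ = np.histogram(per_fish_biomass_kg, bins=bin_edges)
print(counts)  # [2 3] fish per range
```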
- the biomass engine 324 generates the biomass distribution 326 and provides the biomass distribution 326 to the decision engine 328.
- the decision engine 328 obtains the biomass distribution 326.
- the control unit 316 determines an action based on the biomass distribution 326.
- the control unit 316 provides the biomass distribution 326, including one or more biomass estimation values, to the decision engine 328.
- the biomass engine 324 generates the biomass distribution 326 that includes likelihoods that a number of fish of the fish 306 are a particular biomass or within a range of biomass.
- weight ranges for the biomass distribution 326 can include ranges from 3 to 3.1 kilograms (kg), 3.1 to 3.2 kg, and 3.2 to 3.3 kg.
- a likelihood that a number of the fish 306 are within the first range, 3 to 3.1 kg, can be 30 percent.
- likelihoods that a number of the fish 306 are within the second and third ranges, 3.1 to 3.2 kg and 3.2 to 3.3 kg, can be 35 percent and 33 percent, respectively.
- the sum of all likelihoods across all weight ranges can be normalized (e.g., to equal a value, such as 3, or a percentage, such as 300 percent).
- the decision engine 328 determines that a portion of the fish 306, based on data generated by the biomass engine 324, is below an expected weight or below the weights of others of the fish 306. For example, the decision engine 328 can determine subpopulations within the fish 306 and determine one or more actions based on the determined subpopulations, such as actions to mitigate or correct for issues (e.g., runting, health issues, infections, disfiguration, among others). Actions can include feed adjustment, sorting, model training, and user report feedback, among others.
- the decision engine 328 detects one or more features of the biomass estimations of the biomass distribution 326.
- the decision engine 328 can detect one or more subpopulations.
- Subpopulations can include runt fish, healthy fish, diseased fish, among others.
- a Gaussian-like shape in the biomass distribution 326 can indicate a subpopulation.
- the biomass distribution 326 includes at least two subpopulations with one having a smaller average biomass than the other.
- the control unit 316 can determine the subpopulation with the smaller average biomass is a runt population based on a comparison with runt population criteria, such as a runt population average biomass threshold, among others.
- the decision engine 328 detects a runt subpopulation based on processing the biomass distribution 326.
- the decision engine 328 can include one or more algorithms or trained models to detect groups within a distribution of data.
- the decision engine 328 can include one or more processors configured to perform clustering algorithms such as k-means, partitioning methods, hierarchical clustering, fuzzy clustering, density-based clustering, and model-based clustering, among others.
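- As one illustration of subpopulation detection, the sketch below clusters per-fish biomass estimates with k-means (one of the clustering options listed above) and flags the lower-mean cluster as a candidate runt subpopulation; the threshold and biomass values are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

RUNT_MEAN_THRESHOLD_KG = 2.5  # hypothetical runt-population criterion

# Hypothetical per-fish biomass estimates, in kilograms.
biomasses_kg = np.array([3.6, 3.7, 3.5, 3.8, 1.9, 2.1, 2.0, 3.65]).reshape(-1, 1)

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(biomasses_kg)
cluster_means = [biomasses_kg[labels == k].mean() for k in (0, 1)]

smaller = int(np.argmin(cluster_means))
if cluster_means[smaller] < RUNT_MEAN_THRESHOLD_KG:
    runt_count = int((labels == smaller).sum())
    print(f"candidate runt subpopulation: {runt_count} fish, "
          f"mean {cluster_means[smaller]:.2f} kg")
```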
- the control unit 316 determines an adjustment of feed using the feed controller unit 330 controlling the feed system 332.
- the control unit 316 can provide the biomass distribution 326 or a control signal to the feed controller unit 330.
- the feed controller unit 330 can either process the biomass distribution 326 to determine a feed adjustment and provide a corresponding control signal to the feed system 332, or can pass the control signal provided by the control unit 316 through to the feed system 332.
- the decision engine 328 does not detect a runt subpopulation. For example, the decision engine 328 can determine that the biomass distribution 326 does or does not satisfy a biomass requirement or threshold, such as a biomass requirement for distribution or sale. The decision engine 328 can determine, based on features of the biomass distribution 326, what action to perform.
- the control unit 316 can provide a control signal to a sorting actuator of a sorting system to sort one or more fish from the fish 306 or can provide a control signal to adjust a feeding of the fish 306.
- the control unit 316 can sort the fish 306 based on biomass.
- the control unit 316 can send a signal to a sorting system that sorts fish based on one or more criteria, such as a threshold biomass, into multiple locations based on the one or more criteria.
- the control unit 316 includes the feed controller unit 330.
- the control unit 316 may control both the processing of the images in the data 310 and the adjustments to the feeding by controlling the feed system 332.
- the control unit 316 adjusts feeding to provide feed to a certain area of the fish pen 304.
- the obtained data 310 can include positions of the fish detected within the images of the obtained data 310.
- the control unit 316 can determine based on one or more subpopulations detected by the decision engine 328 of the control unit 316 that a given subpopulation requires additional feed.
- the control unit 316 can send a control signal, configured to adjust the location of an output of feed, to the feed system 332 or to the feed controller unit 330 for the feed system 332.
- the control unit 316 can adjust the location of an output of feed to a location of one or more fish within a particular subpopulation or an average location of the subpopulation.
- the feed system 332 includes multiple food types.
- the controller unit 330 can provide control messages to the feed system 332 to change the food type provided to the fish 306.
- the multiple food types include a medicated food type and a non-medicated food type.
- the multiple food types include food with a particular nutritional value and food with a different nutritional value.
- the controller unit 330 can determine, based on data from the control unit 316, which food to provide to the fish 306, how much food to provide, when to provide the food, and at what rate to provide the food.
- the controller unit 330 can generate a meal plan based on data from the control unit 316, such as biomass estimations or a control signal generated based on biomass estimations, where the meal plan includes one or more of: a feed type, a feed rate, a feed time, and a feed amount.
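- A minimal sketch of generating such a meal plan is shown below; the feed types, the ration rule, and all numeric values are illustrative placeholders rather than recommended values.

```python
def build_meal_plan(mean_biomass_kg, fish_count, runts_detected):
    """Hypothetical meal plan: feed type, rate, time, and amount."""
    feed_type = "medicated" if runts_detected else "standard"
    # Hypothetical rule of thumb: daily ration as a fraction of total biomass.
    daily_feed_kg = 0.01 * mean_biomass_kg * fish_count
    return {
        "feed_type": feed_type,
        "feed_rate_kg_per_hour": round(daily_feed_kg / 8, 2),  # spread over 8 hours
        "feed_time": "08:00",
        "feed_amount_kg": round(daily_feed_kg, 2),
    }

print(build_meal_plan(mean_biomass_kg=3.6, fish_count=5000, runts_detected=True))
```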
- the control unit 316 includes multiple computer processors.
- the control unit 316 can include a first and a second computer processor communicably connected to one another.
- the first and the second computer processor can be connected by a wired or wireless connection.
- the first computer processor can perform one or more of the operations of the object detector 318, truss length engine 320, biomass engine 324, or the decision engine 328.
- the first computer processor can store or provide training data to train one or more models of the object detector 318, truss length engine 320, biomass engine 324, or the decision engine 328.
- the second computer processor can perform one or more of the operations of the object detector 318, truss length engine 320, biomass engine 324, or the decision engine 328.
- the second computer processor can store or provide the training data to train one or more models of the object detector 318, truss length engine 320, biomass engine 324, or the decision engine 328.
- Operations not performed by the first computer processor can be performed by the second computer processor or an additional computer processor.
- Operations not performed by the second computer processor can be performed by the first computer processor or an additional computer processor.
- control unit 316 operates one or more processing components, such as the object detector 318, truss length engine 320, biomass engine 324, or the decision engine 328.
- control unit 316 communicates with an external processor that operates one or more of the processing components.
- the control unit 316 can store training data, or other data used to train one or more models of the processing components, or can communicate with an external storage device that stores data including training data.
- models of the components of the control unit 316 include one or more fully or partially connected layers. Each of the layers can include one or more parameter values indicating an output of the layers. The layers of the models can generate output for each of the components, such as the object detector 318, truss length engine 320, biomass engine 324, or the decision engine 328.
- the control unit 316 can process one or more images of the data 310 in aggregate or process each image of the data 310 individually.
- one or more components of the control unit 316 process items of the data individually and one or more components of the control unit 316 process the data 310 in aggregate.
- the object detector 318 can process each image of the data 310 individually.
- the truss length engine 320 can obtain data from the object detector 318 and process each truss network for each detected fish.
- parallel processes of the object detector 318 and the truss length engine 320 generate object detections and truss lengths corresponding to the object detections for multiple fish at a time. For example, a first process of detecting a fish object and then determining a truss network for that fish can proceed as a parallel process with a second process of detecting another fish object and then determining a truss network for that fish.
- processing individually as discussed herein can include processing one or more items sequentially or in parallel using one or more processors.
- the decision engine 328 processes data in aggregate. For example, the decision engine 328 can determine, based on one or more data values indicating biomass estimates provided by the biomass engine 324, one or more decisions and related actions, as described herein.
- processing in aggregate as discussed herein can include processing data corresponding to two or more items of the data 310 to generate a single result.
- an item of the data 310 includes the image 312.
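- The individual-versus-aggregate split described above can be sketched as follows, with hypothetical placeholder functions standing in for per-image detection plus truss generation and for the aggregate distribution estimation.

```python
from concurrent.futures import ThreadPoolExecutor

def detect_and_measure(image):
    """Placeholder for per-image object detection and truss-length generation."""
    return {"image": image, "truss_lengths": [0.30, 0.12, 0.25]}

def aggregate_distribution(per_image_results):
    """Placeholder for combining per-image results into one distribution."""
    return {"items_processed": len(per_image_results)}

images = ["image_312", "image_314"]  # items of the data 310

with ThreadPoolExecutor() as pool:
    per_image = list(pool.map(detect_and_measure, images))  # processed individually

print(aggregate_distribution(per_image))                    # processed in aggregate
```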
- FIG. 4 is a flow diagram showing an example of a process 400 for monocular underwater camera biomass estimation.
- the process 400 may be performed by one or more systems, for example, the system 100 of FIG. 1 or the system 300 of FIG. 3.
- the process 400 includes obtaining images captured by a monocular underwater camera (402).
- the control unit 316 obtains images of the image data 310 captured by the camera device 302.
- the camera device 302 can include a single camera.
- the camera device 302 with a single camera can be more efficient to produce and maintain, more economical, and can be more robust with fewer components prone to failure.
- the process 400 includes providing data corresponding to multiple fish based on the images to a trained model (404).
- the control unit 316 provides the image data 310 to the object detector 318 trained to detect objects in images of the image data 310.
- the control unit 316 provides the image data 310 to the object detector 318 and the truss length engine 320 to generate the truss data 322.
- the control unit 316 then provides the truss data 322 to the biomass engine 324.
- the biomass engine 324 includes one or more models trained to determine one or more biomass estimations based on one or more input truss networks.
- the process 400 includes obtaining output of the trained model including one or more biomass values (406).
- the control unit 316 can obtain output of the models, including the object detector 318 and the biomass engine 324.
- the output of the object detector 318 can include detected objects such as fish or key points.
- the output of the biomass engine 324 can include biomass distribution 326.
- the process 400 includes determining an action based on the one or more biomass values (408).
- the control unit 316 can determine to adjust a feeding of the fish pen 304 based on the detected objects and the biomass distribution 326.
- the decision engine 328 component of the control unit 316 can choose from one or more available actions, e.g., available actions obtained from data or provided by a user. In the example of FIG. 3, the available actions include feed adjustment.
- the control unit 316 can provide a signal to the feed controller unit 330 to adjust the feed for the fish pen 304.
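- The four stages of process 400 can be sketched end to end as follows, with each stage reduced to a hypothetical placeholder function; the sketch illustrates the data flow only, not the disclosed models or decision logic.

```python
def obtain_images(camera):                         # (402)
    return camera.read_frames()

def fish_values_from_images(images):               # (404) detection + truss lengths
    return [{"truss_lengths": [0.30, 0.12, 0.25]} for _ in images]

def biomass_values(per_fish_values):               # (406) trained-model output
    return [3.6 for _ in per_fish_values]          # placeholder estimates, kg

def determine_action(biomasses):                   # (408)
    mean_kg = sum(biomasses) / len(biomasses)
    return "adjust_feed" if mean_kg < 4.0 else "no_change"

class FakeMonocularCamera:
    """Stand-in for the camera device 302; returns placeholder frames."""
    def read_frames(self):
        return ["frame_0", "frame_1"]

frames = obtain_images(FakeMonocularCamera())
print(determine_action(biomass_values(fish_values_from_images(frames))))
```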
- the action includes sending data including the biomass distribution 326 to a user device, where the data is configured to, when displayed on the user device, present a user of the user device with a visual representation of the biomass distribution 326.
- the control unit 316 can generate a data signal that includes an indication of one or more biomass estimates, such as the biomass distribution 326.
- the control unit 316 waits for feedback from a user, who is provided a visual representation of a biomass estimate, to confirm an action determined by the decision engine 328, such as a feed adjustment.
- the control unit 316 obtains data from a monocular underwater camera indicating a current operation status of the monocular underwater camera and, in response to obtaining an image of a fish, provides the image of the fish to a trained model.
- the camera 302 may include a dysfunctional stereo camera pair. The dysfunction can result in images without the depth data provided by the stereo effect of the cameras.
- the camera device 302 can send a signal to the control unit 316 indicating a camera of a stereo pair is dysfunctional after the camera device 302 determines a camera of a stereo pair has become dysfunctional.
- the control unit 316 can process images obtained from the camera device 302 as discussed herein.
- the control unit 316 determines, based on the data 310 or other data provided by the camera device 302, the images from the camera device 302 do not include depth data. For example, the control unit 316 can process the data 310 and determine that images lack a stereo feature. In response, the control unit 316 can process the images to determine a biomass distribution without this additional depth data. For example, as discussed herein, the control unit 316 can generate a group of truss networks for a population of fish, such as the fish 306, and provide the group of truss networks to the biomass engine 324 to determine a biomass distribution for the fish 306.
- the processing switch can be automatic such that intermittent data processing issues or environmental effects, such as fish or debris obscuring or blocking an image of a stereo pair, can be detected by the control unit 316, and the control unit 316 can then determine to process the corresponding images as described herein.
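- One way such an automatic switch could look in code is sketched below, with hypothetical helper names: when one image of a stereo pair is missing or blocked, processing falls back to the monocular path described herein.

```python
def has_usable_stereo(left_image, right_image):
    """Placeholder check that both views of a stereo pair are usable."""
    return left_image is not None and right_image is not None

def process_frame(left_image, right_image, stereo_pipeline, monocular_pipeline):
    if has_usable_stereo(left_image, right_image):
        return stereo_pipeline(left_image, right_image)
    # One view is missing, dysfunctional, or blocked (e.g., by fish or
    # debris), so process the remaining view without depth data.
    remaining = left_image if left_image is not None else right_image
    return monocular_pipeline(remaining)

# Example with trivial placeholder pipelines and a blocked right camera.
result = process_frame(
    "left_frame", None,
    stereo_pipeline=lambda left, right: "stereo biomass estimate",
    monocular_pipeline=lambda image: f"monocular biomass estimate from {image}",
)
print(result)  # monocular biomass estimate from left_frame
```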
- FIG. 5 is a diagram showing an example of a truss network, e.g., truss network 126, truss network 320a, or truss network 320b.
- FIG. 5 shows truss lengths and key points computed for, e.g., the fish 115 by the system 100 shown in FIG. 1.
- the truss lengths between key points are used to extract information about the fish including a weight of the fish.
- Various trusses, or lengths between key points, of the fish can be used.
- FIG. 5 shows a number of possible truss lengths including: upper lip 502 to eye 504, upper lip 502 to leading edge dorsal fin 506, upper lip 502 to leading edge pectoral fin 508, leading edge dorsal fin 506 to leading edge anal fin 510, leading edge anal fin 510 to trailing lower caudal peduncle 512, and trailing lower caudal peduncle 512 to trailing upper caudal peduncle 514.
- Other key points and other separations, including permutations of key points mentioned, can be used.
- different key points may be generated.
- a truss network may be generated as a model.
- truss lengths not shown can be used by the system 100 or system 300.
- a truss length from the upper lip 502 to the tail 513 can be used as the length of the fish 115 and included in a collection of one or more truss length measurements and, e.g., provided to a trained model to generate a biomass distribution.
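- The named truss lengths can be assembled from detected key points as in the sketch below; the pixel coordinates are hypothetical, and the reference numerals from FIG. 5 are noted in comments.

```python
import math

# Hypothetical key point detections (pixel coordinates).
key_points = {
    "upper_lip": (120, 240),                        # 502
    "eye": (160, 225),                              # 504
    "leading_edge_dorsal_fin": (310, 180),          # 506
    "leading_edge_pectoral_fin": (230, 300),        # 508
    "leading_edge_anal_fin": (420, 310),            # 510
    "trailing_lower_caudal_peduncle": (520, 290),   # 512
    "trailing_upper_caudal_peduncle": (520, 240),   # 514
}

truss_pairs = [
    ("upper_lip", "eye"),
    ("upper_lip", "leading_edge_dorsal_fin"),
    ("upper_lip", "leading_edge_pectoral_fin"),
    ("leading_edge_dorsal_fin", "leading_edge_anal_fin"),
    ("leading_edge_anal_fin", "trailing_lower_caudal_peduncle"),
    ("trailing_lower_caudal_peduncle", "trailing_upper_caudal_peduncle"),
]

truss_lengths = {
    f"{a}->{b}": math.dist(key_points[a], key_points[b]) for a, b in truss_pairs
}
print(truss_lengths)  # collection of measurements, e.g., provided to a trained model
```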
- specific truss lengths can be used to recognize specific deformities. Deformities such as shortened operculum can be detected using truss lengths such as a truss length from the upper lip 502 to the gill.
- Processing one or more images of a fish can include determining any of the following health conditions: shortened abdomen, shortened tail, scoliosis, lordosis, kyphosis, deformed upper jaw, deformed lower jaw, shortened operculum, runting or cardiomyopathy syndrome (CMS).
- a biomass distribution includes health data related to one or more fish represented in a distribution.
- the biomass engine 127 of the system 100 shown in FIG. 1 or biomass engine 324 of the system 300 shown in FIG. 3 can generate a distribution of one or more fish that includes health probabilities as well as, or instead of, biomass data.
- the health probabilities can be used to determine various remedial actions including providing medicated feed or moving the fish to a system for treatment, such as delousing.
- FIG. 6 is a diagram illustrating an example of a computing system used for monocular underwater camera biomass estimation.
- the computing system includes computing device 600 and a mobile computing device 650 that can be used to implement the techniques described herein.
- one or more components of the system 100 or the system 300 could be an example of the computing device 600 or the mobile computing device 650, such as a computer system implementing the control unit 116 or control unit 316, devices that access information from the control unit 116 or control unit 316, or a server that accesses or stores information regarding the operations performed by the control unit 116 or control unit 316.
- the computing device 600 is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers.
- the mobile computing device 650 is intended to represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smart-phones, mobile embedded radio systems, radio diagnostic computing devices, and other similar computing devices.
- the components shown here, their connections and relationships, and their functions, are meant to be examples only and are not meant to be limiting.
- the computing device 600 includes a processor 602, a memory 604, a storage device 606, a high-speed interface 608 connecting to the memory 604 and multiple high-speed expansion ports 610, and a low-speed interface 612 connecting to a low-speed expansion port 614 and the storage device 606.
- Each of the processor 602, the memory 604, the storage device 606, the high-speed interface 608, the high-speed expansion ports 610, and the low-speed interface 612 are interconnected using various busses, and may be mounted on a common motherboard or in other manners as appropriate.
- the processor 602 can process instructions for execution within the computing device 600, including instructions stored in the memory 604 or on the storage device 606 to display graphical information for a GUI on an external input/output device, such as a display 616 coupled to the high-speed interface 608.
- multiple processors and/or multiple buses may be used, as appropriate, along with multiple memories and types of memory.
- multiple computing devices may be connected, with each device providing portions of the operations (e.g., as a server bank, a group of blade servers, or a multi-processor system).
- the processor 602 is a single threaded processor.
- the processor 602 is a multi-threaded processor.
- the processor 602 is a quantum computer.
- the memory 604 stores information within the computing device 600.
- the memory 604 is a volatile memory unit or units.
- the memory 604 is a non-volatile memory unit or units.
- the memory 604 may also be another form of computer-readable medium, such as a magnetic or optical disk.
- the storage device 606 is capable of providing mass storage for the computing device 600.
- the storage device 606 may be or include a computer-readable medium, such as a floppy disk device, a hard disk device, an optical disk device, or a tape device, a flash memory or other similar solid-state memory device, or an array of devices, including devices in a storage area network or other configurations.
- Instructions can be stored in an information carrier.
- the instructions when executed by one or more processing devices (for example, processor 602), perform one or more methods, such as those described above.
- the instructions can also be stored by one or more storage devices such as computer- or machine-readable mediums (for example, the memory 604, the storage device 606, or memory on the processor 602).
- the high-speed interface 608 manages bandwidth-intensive operations for the computing device 600, while the low-speed interface 612 manages lower bandwidth-intensive operations. Such allocation of functions is an example only.
- the high-speed interface 608 is coupled to the memory 604, the display 616 (e.g., through a graphics processor or accelerator), and to the high-speed expansion ports 610, which may accept various expansion cards (not shown).
- the low-speed interface 612 is coupled to the storage device 606 and the low-speed expansion port 614.
- the low-speed expansion port 614, which may include various communication ports (e.g., USB, Bluetooth, Ethernet, wireless Ethernet), may be coupled to one or more input/output devices, such as a keyboard, a pointing device, a scanner, or a networking device such as a switch or router, e.g., through a network adapter.
- the computing device 600 may be implemented in a number of different forms, as shown in the figure. For example, it may be implemented as a standard server 620, or multiple times in a group of such servers. In addition, it may be implemented in a personal computer such as a laptop computer 622. It may also be implemented as part of a rack server system 624. Alternatively, components from the computing device 600 may be combined with other components in a mobile device, such as a mobile computing device 650. Each of such devices may include one or more of the computing device 600 and the mobile computing device 650, and an entire system may be made up of multiple computing devices communicating with each other.
- the mobile computing device 650 includes a processor 652, a memory 664, an input/output device such as a display 654, a communication interface 666, and a transceiver 668, among other components.
- the mobile computing device 650 may also be provided with a storage device, such as a micro-drive or other device, to provide additional storage.
- Each of the processor 652, the memory 664, the display 654, the communication interface 666, and the transceiver 668, are interconnected using various buses, and several of the components may be mounted on a common motherboard or in other manners as appropriate.
- the processor 652 can execute instructions within the mobile computing device 650, including instructions stored in the memory 664.
- the processor 652 may be implemented as a chipset of chips that include separate and multiple analog and digital processors.
- the processor 652 may provide, for example, for coordination of the other components of the mobile computing device 650, such as control of user interfaces, applications run by the mobile computing device 650, and wireless communication by the mobile computing device 650.
- the processor 652 may communicate with a user through a control interface 658 and a display interface 656 coupled to the display 654.
- the display 654 may be, for example, a TFT (Thin-Film-Transistor Liquid Crystal Display) display or an OLED (Organic Light Emitting Diode) display, or other appropriate display technology.
- the display interface 656 may include appropriate circuitry for driving the display 654 to present graphical and other information to a user.
- the control interface 658 may receive commands from a user and convert them for submission to the processor 652.
- an external interface 662 may provide communication with the processor 652, so as to enable near area communication of the mobile computing device 650 with other devices.
- the external interface 662 may provide, for example, for wired communication in some implementations, or for wireless communication in other implementations, and multiple interfaces may also be used.
- the memory 664 stores information within the mobile computing device 650.
- the memory 664 can be implemented as one or more of a computer-readable medium or media, a volatile memory unit or units, or a non-volatile memory unit or units.
- An expansion memory 674 may also be provided and connected to the mobile computing device 650 through an expansion interface 672, which may include, for example, a SIMM (Single In Line Memory Module) card interface.
- the expansion memory 674 may provide extra storage space for the mobile computing device 650, or may also store applications or other information for the mobile computing device 650.
- the expansion memory 674 may include instructions to carry out or supplement the processes described above, and may include secure information also.
- the expansion memory 674 may be provided as a security module for the mobile computing device 650, and may be programmed with instructions that permit secure use of the mobile computing device 650.
- secure applications may be provided via the SIMM cards, along with additional information, such as placing identifying information on the SIMM card in a non-hackable manner.
- the memory may include, for example, flash memory and/or NVRAM memory (nonvolatile random access memory), as discussed below.
- instructions are stored in an information carrier such that the instructions, when executed by one or more processing devices (for example, processor 652), perform one or more methods, such as those described above.
- the instructions can also be stored by one or more storage devices, such as one or more computer- or machine-readable mediums (for example, the memory 664, the expansion memory 674, or memory on the processor 652).
- the instructions can be received in a propagated signal, for example, over the transceiver 668 or the external interface 662.
- the mobile computing device 650 may communicate wirelessly through the communication interface 666, which may include digital signal processing circuitry in some cases.
- the communication interface 666 may provide for communications under various modes or protocols, such as GSM voice calls (Global System for Mobile communications), SMS (Short Message Service), EMS (Enhanced Messaging Service), or MMS messaging (Multimedia Messaging Service), CDMA (code division multiple access), TDMA (time division multiple access), PDC (Personal Digital Cellular), WCDMA (Wideband Code Division Multiple Access), CDMA2000, or GPRS (General Packet Radio Service), LTE, 5G/6G cellular, among others.
- a GPS (Global Positioning System) receiver module 670 may provide additional navigation- and location-related wireless data to the mobile computing device 650, which may be used as appropriate by applications running on the mobile computing device 650.
- the mobile computing device 650 may also communicate audibly using an audio codec 660, which may receive spoken information from a user and convert it to usable digital information.
- the audio codec 660 may likewise generate audible sound for a user, such as through a speaker, e.g., in a handset of the mobile computing device 650.
- Such sound may include sound from voice telephone calls, may include recorded sound (e.g., voice messages, music files, among others) and may also include sound generated by applications operating on the mobile computing device 650.
- the mobile computing device 650 may be implemented in a number of different forms, as shown in the figure. For example, it may be implemented as a cellular telephone 680. It may also be implemented as part of a smart-phone 682, personal digital assistant, or other similar mobile device.
- Embodiments of the invention and all of the functional operations described in this specification can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them.
- Embodiments of the invention can be implemented as one or more computer program products, e.g., one or more modules of computer program instructions encoded on a computer readable medium for execution by, or to control the operation of, data processing apparatus.
- the computer readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them.
- data processing apparatus encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers.
- the apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them.
- a propagated signal is an artificially generated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal that is generated to encode information for transmission to suitable receiver apparatus.
- a computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
- a computer program does not necessarily correspond to a file in a file system.
- a program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code).
- a computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.
- processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer.
- a processor will receive instructions and data from a read only memory or a random access memory or both.
- the essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data.
- a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto optical disks, or optical disks.
- a computer need not have such devices.
- a computer can be embedded in another device, e.g., a tablet computer, a mobile telephone, a personal digital assistant (PDA), a mobile audio player, a Global Positioning System (GPS) receiver, to name just a few.
- Computer readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks.
- the processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
- embodiments of the invention can be implemented on a computer having a display device, e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor, for displaying information to the user and a keyboard and a pointing device, e.g., a mouse or a trackball, by which the user can provide input to the computer.
- Other kinds of devices can be used to provide for interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback, e.g., visual feedback, auditory feedback, or tactile feedback; and input from the user can be received in any form, including acoustic, speech, or tactile input.
- Embodiments of the invention can be implemented in a computing system that includes a back end component, e.g., as a data server, or that includes a middleware component, e.g., an application server, or that includes a front end component, e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the invention, or any combination of one or more such back end, middleware, or front end components.
- the components of the system can be interconnected by any form or medium of digital data communication, e.g., a communication network. Examples of communication networks include a local area network (“LAN”) and a wide area network (“WAN”), e.g., the Internet.
- the computing system can include clients and servers.
- a client and server are generally remote from each other and typically interact through a communication network.
- the relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
Abstract
Methods, systems, and apparatus, including computer programs encoded on computer-storage media, for monocular underwater camera biomass estimation. In some implementations, an exemplary method includes obtaining an image of a fish captured by a monocular underwater camera; providing the image of the fish to a depth perception model; obtaining output of the depth perception model indicating a depth-enhanced image of the fish; determining a biomass value estimate of the fish based on the output; and determining an action based on one or more biomass value estimates including the biomass value estimate of the fish.
Description
MONOCULAR UNDERWATER CAMERA BIOMASS ESTIMATION
CROSS-REFERENCE TO RELATED APPLICATION
[0001] This application claims the benefit of U.S. Application No. 17/716,797, filed April 8, 2022 and U.S. Application No. 17/729,703, filed April 26, 2022, the contents of which are incorporated by reference herein.
FIELD
[0002] This specification generally relates to cameras that are used for biomass estimation, and particularly to underwater cameras that are used for aquatic livestock.
BACKGROUND
[0003] A population of farmed fish may include fish of varying sizes, shapes, and health conditions. In the aquaculture context, prior to harvesting, a worker may remove some fish from the fish pen and weigh them. The manual process of removing the fish from the fish pen and weighing them is both time intensive and potentially harmful to the fish. In addition, because only a small portion of a fish population may be effectively measured in this way, the true characteristics of the population remain unknown.
SUMMARY
[0004] In general, innovative aspects of the subject matter described in this specification relate to estimating the biomass of aquatic livestock, e.g., using monocular underwater camera systems. Individual fish can be photographed using a single underwater camera. Images, or a single image, from the underwater camera can be processed using computer vision and machine learning-based techniques to identify fish within the images and to determine features, e.g., truss lengths, on the fish. Biomass estimations, e.g., for individual or groups of fish, are generated by a model (e.g., neural network, Random Forest Regressor, Support Vector Regressor, or Gaussian Process Regressor, among others) that is trained to generate predicted biomass based, e.g., on truss lengths. The biomass of fish populations may be used to control the amount of feed given to a fish population, e.g., by controlling a feed distribution system, as well as to identify and isolate runt, diseased, or other sub-populations.
[0005] One innovative aspect of the subject matter described in this specification is embodied in a method that includes obtaining an image of a fish captured by a monocular underwater camera; providing the image of the fish to a depth perception model; obtaining output of the depth perception model indicating a depth-enhanced image of the fish; determining a biomass estimate of the fish based on the output; and determining an action based on one or more biomass estimates including the biomass estimate of the fish.
[0006] Other implementations of this and other aspects include corresponding systems, apparatus, and computer programs, configured to perform the actions of the methods, encoded on computer storage devices. A system of one or more computers can be so configured by virtue of software, firmware, hardware, or a combination of them installed on the system that in operation cause the system to perform the actions. One or more computer programs can be so configured by virtue of having instructions that, when executed by data processing apparatus, cause the apparatus to perform the actions.
[0007] The foregoing and other embodiments can each optionally include one or more of the following features, alone or in combination. For instance, in some implementations, actions include determining, based on the depth-enhanced image of the fish, a data set including a value that indicates a length between a first point on the fish and a second point on the fish.
- [0008] In some implementations, determining the biomass value of the fish includes providing the data set including the value that indicates the length between the first point on the fish and the second point on the fish to a model trained to predict biomass; and obtaining output of the model trained to predict biomass as the biomass value of the fish.
[0009] In some implementations, actions include detecting the first point and second point on the fish. In some implementations, detecting the points includes providing the depth-enhanced image of the fish to a model trained to detect feature points on a fish body.
[0010] In some implementations, actions include detecting the fish within the image using a model trained to detect fish.
[0011] In some implementations, the action includes adjusting a feeding system providing feed to the fish.
[0012] In some implementations, the action includes sending data including the biomass estimate to a user device, where the data is configured to, when displayed on the user device, present a user of the user device with a visual representation of the biomass estimate.
[0013] In some implementations, actions include obtaining data from the monocular underwater camera indicating a current operation status of the monocular underwater camera; and in response to obtaining the image of the fish and the data from the monocular underwater camera, providing the image of the fish to the depth perception model.
[0014] An advantage of the methods, systems, and apparatuses described herein includes reducing reliance on hardware for obtaining depth data from an environment such as hardware for stereo cameras, light based depth detection, among others. When a setup of stereo cameras is used, depth information can be determined from the visual differences of objects captured by the two or more cameras of the stereo camera setup. However, if a stereo camera setup is damaged, is partially obscured, or is partially made non-functional, processes that rely on depth information of the stereo camera may not function properly. Similarly, stereo camera setups generally are more expensive to produce, calibrate, and maintain than single camera equivalents. Stereo camera setups can also be less efficient by requiring greater amounts of image data to be transferred to processing elements compared to single camera equivalents.
[0015] Solutions described herein specify the use of a depth perception model to determine depth based solely on a 2-dimensional image of a single camera without depth data. Other possible solutions, such as a time of flight (ToF) sensor present further issues, including environmental effects, that can affect accuracy. For example, ToF sensors or other depth sensors can detect debris within an environment, such as water, as objects for distance measurement. The variable debris in the environment can similarly present issues in determining a correlation of time for reflection and actual distance as the time may depend on the amount of debris in the environment.
[0016] Another innovative aspect of the subject matter described in this specification is embodied in a method that includes obtaining a plurality of images of fish captured by a monocular underwater camera; providing the plurality of images that were captured by the monocular underwater camera to a first model trained to detect one or more fish within the plurality of images; generating one or more values for each detected fish as a set of values; generating a biomass distribution of the fish based on the set of values; and determining an action based on the biomass distribution.
[0017] Other implementations of this and other aspects include corresponding systems, apparatus, and computer programs, configured to perform the actions of the methods, encoded on computer storage devices. A system of one or more computers can be so configured by virtue of software, firmware, hardware, or a combination of them installed on the system that in operation cause the system to perform the actions. One or more computer programs can be so configured by virtue of having instructions that, when executed by data processing apparatus, cause the apparatus to perform the actions.
[0018] The foregoing and other embodiments can each optionally include one or more of the following features, alone or in combination. For instance, in some implementations, generating the biomass distribution of the fish based on the set of values includes providing the set of values to a second model trained to estimate biomass distributions; and obtaining output of the second model as the biomass distribution of the fish.
[0019] In some implementations, the one or more values for each detected fish include a value that indicates a length between a first point on a particular fish of the fish and a second point on the particular fish.
[0020] In some implementations, actions include detecting the first point and second point on the particular fish. In some implementations, the action includes adjusting a feeding system providing feed to the fish.
[0021] In some implementations, the action includes sending data including the biomass estimate to a user device, where the data is configured to, when displayed on the user device, present a user of the user device with a visual representation of the biomass estimate.
[0022] In some implementations, actions include obtaining data from the monocular underwater camera indicating a current operation status of the monocular underwater camera; and in response to obtaining the plurality of images from the monocular underwater camera, providing the plurality of images to the first model.
[0023] An advantage of the methods, systems, and apparatuses described herein includes reducing reliance on hardware for obtaining depth data from an environment such as hardware for stereo cameras, light based depth detection, among others, e.g., by using processes to generate accurate biomass estimations from images captured by monocular or single lens cameras. When a setup of stereo cameras is used, depth information can be determined from the visual differences of objects captured by the two or more cameras of the stereo camera setup. However, if a stereo camera setup is damaged, is partially obscured, or is partially made nonfunctional, processes that rely on depth information of the stereo camera may not function properly. Similarly, stereo camera setups generally are more expensive to produce, calibrate, and maintain than single camera equivalents. Stereo camera setups can also be less efficient by requiring greater amounts of image data to be transferred to processing elements compared to single camera equivalents.
[0024] Solutions described herein specify the use of two-dimensional (2d) truss networks to determine biomass distributions for a population. Typically, truss networks are computed in three dimensions to account for changes in apparent distances between key points based on a position of a fish or other object in three dimensions. However, this approach relies on hardware for obtaining depth data from an environment such as hardware for stereo cameras, light based depth detection, among others. Solutions described herein specify the use of a model trained with 2d truss networks to generate an accurate biomass distribution prediction for a population of fish.
[0025] Other possible solutions, such as a time of flight (ToF) sensor present further issues, including environmental effects, that can affect accuracy. For example, ToF sensors or other depth sensors can detect debris within an environment, such as water, as objects for distance measurement. The variable debris in the environment can similarly present issues in determining a correlation of time for reflection and actual distance as the time may depend on the amount of debris in the environment.
[0026] The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features and advantages of the invention will become apparent from the description, the drawings, and the claims.
BRIEF DESCRIPTION OF THE DRAWINGS
[0027] FIG. 1 is a diagram showing an example of a system that is used for monocular underwater camera biomass estimation.
[0028] FIG. 2 is a flow diagram showing an example of a process for monocular underwater camera biomass estimation.
[0029] FIG. 3 is a diagram showing an example of another system that is used for monocular underwater camera biomass estimation.
[0030] FIG. 4 is a flow diagram showing an example of another process for monocular underwater camera biomass estimation.
[0031] FIG. 5 is a diagram showing an example of a truss network.
[0032] FIG. 6 is a diagram illustrating an example of a computing system used for monocular underwater camera biomass estimation.
[0033] Like reference numbers and designations in the various drawings indicate like elements.
DETAILED DESCRIPTION
[0034] FIG. 1 is a diagram showing an example of a system 100 that is used for underwater camera biomass estimation. The system 100 includes a control unit 116 and an underwater camera device 102. Generally speaking, the control unit 116 obtains images captured by a camera of the camera device 102 and processes the images to generate biomass estimations for one or more fish. The biomass estimations for one or more fish can be processed to determine actions such as feed adjustment, sorting, model training, and user report feedback, among others.
[0035] At least one of the one or more cameras of the camera device 102 includes a camera that captures images from a single viewpoint at a time. This type of camera may be referred to as a monocular camera. Where a stereo camera setup
can include multiple cameras each capturing a unique viewpoint at a particular time, a monocular camera captures one viewpoint at a particular time. A computer processing output of a stereo camera setup can determine, based on differences in the appearance of objects in one viewpoint compared to another viewpoint at a particular time, depth information of the objects.
[0036] In some implementations, the camera device 102 has one or more cameras in a stereo camera setup that are non-functional or obscured, e.g., by debris or other objects, including fish or other animals, in an environment. In some implementations, the control unit 116 can process one or more images from the camera device 102, or obtain a signal from the camera device 102, indicating that one or more cameras of the camera device 102 are obscured or non-functional or the camera device 102 is operating with a monocular camera. The control unit 116 can adjust a processing method based on the status of the cameras of the camera device 102, such as a stereo setup status or a monocular camera setup status.
[0037] In some implementations, the camera device 102 includes a single camera. For example, the single camera can be a monocular camera that captures a single viewpoint at a time. In some implementations, the camera device 102 with a single camera is more efficient to produce and maintain, more economical, and can be more robust with fewer components prone to failure.
[0038] The system 100 also includes a feed controller unit 130 that controls the feed delivered by feed system 132. The feed controller unit 130 can include components configured to send control messages to actuators, blowers, conveyers, switches, or other components of the feed system 132. The control messages can be configured to stop, start, or change a meal provided to fish 106 in pen 104.
[0039] In this example, the camera device 102 includes propellers to move the camera device 102 around the fish pen 104. In general, the camera device 102 may use any method of movement including ropes and winches, waterjets, thrusters, tethers, buoyancy control apparatus, chains, among others.
[0040] In some implementations, the camera device 102 is equipped with the control unit 116 as an onboard component, while in other implementations, the control unit 116 is not affixed to the camera device 102 and is external to the camera device 102. For example, the camera device 102 may provide images 112 and 114
over a network to the control unit 116. Similarly, the control unit 116 can provide return data, including movement commands to the camera device 102 over the network.
- [0041] Stages A through C of FIG. 1 depict image data 110, including image 112, obtained by the camera device 102 and processed by the control unit 116. The image 112 includes representations of the fish 113 and 115. Although image 112 shows the fish 113 and 115 in a side profile view, images of fish obtained by the camera device 102 may include fish in any conceivable pose including head on, reverse head on, or skewed.
[0042] In stage A, the camera device 102 obtains the image data 110 including image 112 of the fish 113 and 115 within the pen 104. The camera device 102 provides the data 110 to the control unit 116.
[0043] In stage B, the control unit 116 processes the images of the data 110, including the image 112. The control unit 116 provides the data 110 to the depth perception model 118. In some implementations, the depth perception model 118 is trained using the depth training data 124. For example, the depth training data 124 can include images of one or more fish in an environment. The depth training data 124 can include information of features of the one or more fish and distance to the camera for each of the features. For example, for a head on image of a fish, the depth training data 124 can include a distance to the camera of a head of a fish that is smaller than a distance to the camera of a side fin of the fish.
[0044] In some implementations, the depth perception model 118 includes one or more fully or partially connected layers. Each of the layers can include one or more parameter values indicating an output of the layers. The layers of the depth perception model 118 can generate output indicating depth information of an image, such as the image 112.
[0045] In some implementations, the control unit 116 trains the depth perception model 118. For example, the control unit 116 can provide the depth training data 124 to the depth perception model 118. The depth training data 124 can include ground truth data indicating depths of features within an environment. The depth perception model 118 can generate an estimation of depths within an image. The control unit 116 can compare the estimations of the depth perception model 118 to
ground truth data of the depth training data 124. Based on the comparison, the control unit 116 can adjust one or more parameters of the depth perception model 118 to adjust a subsequent output of the depth perception model 118.
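- A simplified sketch of that training step, assuming a tiny convolutional network as a stand-in for the depth perception model 118, is shown below; the architecture, loss function, and optimizer are illustrative assumptions rather than the disclosed model.

```python
import torch
from torch import nn

depth_model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 1, kernel_size=3, padding=1),  # one depth value per pixel
)
optimizer = torch.optim.Adam(depth_model.parameters(), lr=1e-4)

def depth_training_step(image_batch, ground_truth_depth):
    """Compare estimated depths to ground truth and adjust parameters."""
    predicted_depth = depth_model(image_batch)
    loss = nn.functional.l1_loss(predicted_depth, ground_truth_depth)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Hypothetical batch standing in for the depth training data 124.
images = torch.rand(4, 3, 64, 64)
depths = torch.rand(4, 1, 64, 64)
print(depth_training_step(images, depths))
```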
[0046] The control unit 116 provides output of the depth perception model 118 to the object detector 120. The object detector 120 obtains the output. The output can include a depth enhanced image of the fish 115. The output can include a depth enhanced image of objects depicted in the image 112, including the fish 113 and the fish 115. The output can include depth enhanced versions of one or more images of the data 110.
- [0047] In some implementations, the depth-enhanced versions of the one or more images of the data 110 improve detection of the object detector 120. For example, the object detector 120 can determine, based on a depth-enhanced version of an image, a distance to a feature depicted in the image. The feature may represent a fish far away from the camera device 102. Without depth information, the object detector 120 may misdetect the fish as a piece of debris. Based on the depth-enhanced versions of images, the object detector 120 can provide more accurate identifications of objects. For example, the size and shape of fish can help identify the species and other attributes of a fish. By processing a depth-enhanced version of an image, the object detector 120 can more accurately identify fish based on a determination of the actual size of the features represented by an image.
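- As an illustration of how depth enables size-based disambiguation, the sketch below applies a simple pinhole-camera relation to recover approximate physical size from a pixel measurement; the focal length is a hypothetical calibration value, not a disclosed parameter.

```python
FOCAL_LENGTH_PIXELS = 1400.0   # assumed camera calibration

def physical_size_m(size_in_pixels, depth_m):
    """Approximate physical size from pixel size and depth (pinhole model)."""
    return size_in_pixels * depth_m / FOCAL_LENGTH_PIXELS

# A 350-pixel-long object at 2.0 m depth is roughly 0.5 m long (a fish); the
# same pixel length at 0.3 m depth is roughly 0.075 m (likely debris).
print(physical_size_m(350, 2.0), physical_size_m(350, 0.3))
```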
[0048] In some implementations, the depth-enhanced versions of the one or more images of the data 110 improve detection of the pose detector 122. For example, depth data of a depth-enhanced version of an image can provide sufficient data for the pose detector 122 to determine a relationship between a distance to the camera 102 for one or more parts on a fish, such as the fish 115. The pose detector 122 can determine that the tail of the fish is more or less close compared to a nose of the fish. Such a position can distort a corresponding truss network where the lengths appear to be shorter based on the angle of the fish. Using the depth-enhanced version of an image, the pose detector 122 can determine such a pose with accuracy resulting in more accurate truss network generation and biomass estimation compared to estimating the pose based solely on visual appearance of the fish. For example, a slight angle in pose may not result in detectable visual features in the image potentially resulting in a truss network, generated without depth data, that
represents a smaller fish than reality. However, with a depth-enhanced version of an image, the pose can be detected with accuracy even at relatively small angles.
[0049] In general, the control unit 116 can process one or more images of the data 110 in aggregate or process each image of the data 110 individually. In some implementations, one or more components of the control unit 116 process items of the data individually and one or more components of the control unit 116 process the data 110 in aggregate. For example, the depth perception model 118 can process each image of the data 110 individually. The depth perception model 118 can generate depth-enhanced images and provide the images to the object detector 120 and the pose detector 122. The object detector 120 and the pose detector 122 can process the depth-enhanced images of the depth perception model 118 individually and provide data to the biomass engine 127.
[0050] In some implementations, the biomass engine 127 processes data from the depth perception model 118, the object detector 120, and the pose detector 122, individually. In general, processing individually, as discussed herein, can include processing one or more items sequentially or in parallel using one or more processors. In some implementations, the decision engine 129 processes data in aggregate. For example, the decision engine 129 can determine, based on one or more data values indicating biomass estimates provided by the biomass engine 127, one or more decisions and related actions, as described herein. In general, processing in aggregate, as discussed herein, can include processing data corresponding to two or more items of the data 110 to generate a single result. In some implementations, an item of the data 110 includes the image 112.
[0051] In some implementations, the object detector 120 includes one or more feature recognition algorithms. In some implementations, the object detector 120 includes one or more machine learning models trained to detect portions of animals, such as key points on a fish, cow, or other animal. For example, the object detector 120 can obtain a portion of the image 112 and detect the fish 113 and the fish 115. The object detector 120 can detect key points of the fish 113 and 115 such as a head, nose, fins, among others. The object detector 120 can be trained using labeled images of fish or other objects.
[0052] In some implementations, the object detector 120 generates bounding boxes around detected objects. For example, the object detector 120 can detect the fish 115 and generate a bounding box for the fish 115. The object detector 120 can similarly detect a key point on the fish 115 and store the location of the key point and provide the location as output to another component of the control unit 116. A bounding box for the fish 115 can indicate a position of the fish 115 within the image 112. Other bounding boxes can indicate positions of key points.
[0053] The control unit 116 provides output of the object detector 120 to the pose detector 122. The pose detector 122 determines a pose of the fish detected by the object detector 120. The pose detector 122 receives one or more object detections, such as key points on a fish, from the object detector 120. The pose detector 122 receives one or more object detections from images including the image 112 of the fish 113 and the fish 115. The pose detector 122 receives depth information from the depth perception model 118. The object detections and the depth information enable the pose detector 122 to estimate a pose of one or more detected fish in three dimensions (3D), such as the fish 113 or the fish 115.
[0054] In some implementations, the pose detector 122 includes a truss network generation module to generate truss networks for detected objects. For example, a truss network generation module can generate a truss network 126 for the fish 115. The truss network generation module can generate the truss network 126 based on the image 112, the output of the depth perception model 118, the output of the object detector 120, or the output of the pose detector 122. An example of a truss network is shown in FIG. 5.
[0055] The pose detector 122 is used to interpret distances between key points and thus determine truss lengths. For example, if the fish 115 is at a 15-degree angle from an imaginary line extending from the camera device 102, key points may appear closer together, and thus truss lengths may appear smaller than their true values. By using the pose detector 122, the 3D pose of the fish, including angle, is accounted for and accurate measurements of the fish 115 are made.
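To make the foreshortening concrete, the projected distance between two key points shrinks roughly by the cosine of the angle between the fish and the image plane, so an estimated pose angle allows the true length to be approximated. The sketch below is a simplification with invented numbers; it assumes a single pose angle and a known metres-per-pixel scale at the fish's depth, and is not presented as the specification's method.

```python
import math

def corrected_truss_length(apparent_length_px, pose_angle_deg, metres_per_pixel):
    """Approximate a true truss length from its projected (image-plane) length.

    apparent_length_px: pixel distance between two key points in the image.
    pose_angle_deg: angle of the fish relative to the image plane (0 = side-on).
    metres_per_pixel: scale at the fish's distance, e.g., from depth data.
    """
    projected_m = apparent_length_px * metres_per_pixel
    return projected_m / math.cos(math.radians(pose_angle_deg))

# A fish at a 15-degree angle: the projection underestimates the true length
# by about 3.5 percent, since 1 / cos(15 degrees) is roughly 1.035.
print(corrected_truss_length(apparent_length_px=400, pose_angle_deg=15.0,
                             metres_per_pixel=0.001))
```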
[0056] The control unit 116 provides output of the pose detector 122 to the biomass engine 127. The biomass engine 127 obtains the output of the pose detector 122. In some implementations, the biomass engine 127 includes one or
more models trained to predict a biomass of a fish based on a truss network of a fish. For example, the control unit 116 can obtain ground truth data, similar to the depth training data 124, that includes one or more truss networks with corresponding known biomasses. The ground truth data can include a first truss network, with specific truss lengths, and a corresponding label indicating the known biomass of a fish from which the truss network was derived.
[0057] In training one or more models of the biomass engine 127, the control unit 116 can provide truss network data to the one or more models. The control unit 116 can obtain output of the one or more models indicating a predicted biomass based on the truss network data. The control unit 116 can compare the predicted biomass to the known biomass of the ground truth data and generate an error term. The control unit 116 can provide the error term to the one or more models or directly adjust one or more parameters of the one or more models based on the error term. In general, the one or more parameters can be adjusted to improve the accuracy of biomass estimations or reduce a difference value between the known biomass values of ground truth data and predicted values.
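As a minimal sketch of this training procedure, a simple linear model from truss lengths to biomass can be fit by gradient descent on the error between predicted and known biomass. The truss lengths, biomasses, learning rate, and step count below are invented for illustration and do not come from the specification.

```python
import numpy as np

def train_biomass_model(truss_lengths, known_biomass, lr=0.5, steps=20000):
    """Fit biomass ~= truss_lengths @ w + b by gradient descent.

    truss_lengths: (n_fish, n_trusses) array of truss measurements in metres.
    known_biomass: (n_fish,) array of ground-truth biomasses in kilograms.
    """
    n, d = truss_lengths.shape
    w, b = np.zeros(d), 0.0
    for _ in range(steps):
        predicted = truss_lengths @ w + b
        error = predicted - known_biomass            # error term per fish
        w -= lr * (truss_lengths.T @ error) / n      # adjust parameters to
        b -= lr * error.mean()                       # reduce the difference
    return w, b

# Hypothetical ground truth: three fish, two truss lengths each.
X = np.array([[0.45, 0.20], [0.60, 0.27], [0.52, 0.23]])
y = np.array([3.1, 5.2, 4.0])                        # known biomass in kg
w, b = train_biomass_model(X, y)
print(np.round(X @ w + b, 2))                        # close to the known values
```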
[0058] The biomass engine 127 generates biomass estimations of one or more fish based on the data provided by the pose detector 122. In the example of FIG. 1, the biomass engine 127 generates a biomass estimation for one or more fish in the fish pen 104, including the fish 113 and the fish 115. For example, the biomass engine 127 can generate a biomass estimation of 5 kilograms for the fish 113 and 8 kilograms for the fish 115 as shown in the example of FIG. 1. The biomass engine 127 can generate biomass estimation data 128 including a biomass estimation for the fish 113 and the fish 115. The biomass estimation data 128 can include one or more additional biomass estimations for one or more additional detected fish based on one or more additional images of the data 110 obtained by the control unit 116 and processed by the control unit 116.
[0059] In stage C, the control unit 116 determines an action based on the output, including the biomass estimation data 128, of the biomass engine 127. In some implementations, the control unit 116 provides output of the biomass engine 127, including one or more biomass estimation values, to the decision engine 129.
[0060] In some implementations, the decision engine 129 determines an action based on processing a single biomass estimation generated by the biomass engine 127. For example, the decision engine 129 can compare a single biomass estimation to a threshold. If the estimation satisfies the threshold, the decision engine 129 can determine an action based on the estimation satisfying the threshold.
[0061] In some implementations, the decision engine 129 determines an action based on processing two or more biomass estimations generated by the biomass engine 127. For example, the decision engine 129 can determine two or more biomass estimations satisfy one or more thresholds. In some implementations, the decision engine 129 determines a portion of the fish 106, based on data generated by the biomass engine 127, are below an expected weight or below weights of others of the fish 106. For example, the decision engine 129 can determine subpopulations within the fish 106 and determine one or more actions based on the determined subpopulations, such as actions to mitigate or correct for issues (e.g., runting, health issues, infections, disfiguration, among others). Actions can include feed adjustment, sorting, model training, and user report feedback, among others.
[0062] In some implementations, the output of the biomass engine 127 includes one or more biomass distributions. For example, the biomass engine 127 can provide biomass estimation for multiple fish. The decision engine 129 can detect one or more features of the biomass estimations. For example, the decision engine 129 can detect one or more subpopulations. Subpopulations can include runt fish, healthy fish, diseased fish, among others.
[0063] In some implementations, the decision engine 129 detects a runt subpopulation based on processing the output of the biomass engine 127. For example, the decision engine 129 can include one or more algorithms or trained models to detect groups within a distribution of data. The decision engine 129 can include one or more processors configured to perform clustering algorithms such as k-means, partitioning methods, hierarchical clustering, fuzzy clustering, density-based clustering, model-based clustering, among others.
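A hedged sketch of one such approach is shown below: per-fish biomass estimates are clustered into two groups with k-means, and the lower-mean group is flagged if it falls below an illustrative runt threshold. The use of scikit-learn, the threshold, and the example estimates are assumptions for the example only.

```python
import numpy as np
from sklearn.cluster import KMeans

def detect_runt_subpopulation(biomass_kg, runt_mean_threshold_kg=3.0):
    """Cluster biomass estimates into two groups; return the likely runt group."""
    estimates = np.asarray(biomass_kg, dtype=float).reshape(-1, 1)
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(estimates)
    means = [estimates[labels == k].mean() for k in (0, 1)]
    smaller = int(np.argmin(means))
    if means[smaller] < runt_mean_threshold_kg:
        return estimates[labels == smaller].ravel()  # biomasses of likely runts
    return np.array([])                              # no runt subpopulation found

print(detect_runt_subpopulation([4.8, 5.1, 5.0, 2.1, 2.3, 4.9, 2.2]))
```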
[0064] In some implementations, the control unit 116 determines an adjustment of feed using the feed controller unit 130 controlling the feed system 132. The control unit 116 can provide the output of the biomass engine 127 or a control signal to the
feed controller unit 130. Depending on the data received from the control unit 116, the feed controller unit 130 can either process the output of the biomass engine 127 to determine an adjustment of feed and provide a control signal to the feed system 132 or can provide the control signal provided by the control unit 116 to the feed system 132.
[0065] In some implementations, the decision engine 129 does not detect a runt subpopulation. For example, the decision engine 129 can detect a biomass distribution from the output of the biomass engine 127 that does or does not satisfy a biomass requirement or threshold, such as a biomass requirement for distribution or sale. The decision engine 129 can determine, based on features of the biomass distribution, what action to perform.
[0066] For example, if one or more biomass estimations generated by the biomass engine 127 do not satisfy a threshold (e.g., the mean or median biomass is too large or too small), the control unit 116 can provide a control signal to a sorting actuator of a sorting system to sort one or more fish from the fish 106 or can provide a control signal to adjust a feeding of the fish 106. For example, the control unit 116 can sort the fish 106 based on biomass. The control unit 116 can send a signal to a sorting system that sorts fish based on one or more criteria, such as a threshold biomass, into multiple locations based on the one or more criteria.
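For illustration, a minimal version of such a threshold-based decision could compare a summary statistic of the biomass estimates against a target range and return an action name; the thresholds, action names, and estimates below are placeholders rather than values from the specification.

```python
import statistics

def choose_action(biomass_estimates_kg, target_min_kg=4.0, target_max_kg=6.0):
    """Map the median biomass estimate to an illustrative control action."""
    median_kg = statistics.median(biomass_estimates_kg)
    if median_kg < target_min_kg:
        return "increase_feed"        # population below the target weight range
    if median_kg > target_max_kg:
        return "sort_for_harvest"     # population above the target weight range
    return "no_change"

print(choose_action([4.8, 5.1, 5.0, 5.3, 4.9]))  # -> "no_change"
```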
[0067] In some implementations, the control unit 116 includes the feed controller unit 130. For example, the control unit 116 may control both the processing of the images in the data 110 and the adjustments to the feeding by controlling the feed system 132.
[0068] In some implementations, the control unit 116 adjusts feeding to provide feed to a certain area of the fish pen 104. For example, the obtained data 110 can include positions of the fish detected within the images of the obtained data 110. The control unit 116 can determine based on one or more subpopulations detected by the decision engine 129 of the control unit 116 that a given subpopulation requires additional feed.
[0069] The control unit 116 can send a control signal to the feed system 132 or to the control unit 130 for the feed system 132 configured to adjust the location of an output of feed. The control unit 116 can adjust the location of an output of feed to a
location of one or more fish within a particular subpopulation or an average location of the subpopulation.
[0070] In some implementations, the feed system 132 includes multiple food types. For example, the controller unit 130 can provide control messages to the feed system 132 to change the food type provided to the fish 106. In some cases, the multiple food types include a medicated food type and a non-medicated food type. In some cases, the multiple food types include food with a particular nutritional value and food with a different nutritional value.
[0071] The controller unit 130 can determine, based on data from the control unit 116, which food to provide to the fish 106, how much food to provide, when to provide the food, and at what rate to provide the food. In general, the controller unit 130 can generate a meal plan based on data from the control unit 116, such as biomass estimations or a control signal generated based on biomass estimations, where the meal plan includes one or more of: a feed type, a feed rate, a feed time, and a feed amount.
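A hedged sketch of how such a meal plan could be represented and derived from biomass data follows; the field names, the one-percent ration rule, and the rate cap are assumptions made for the example, not details from the specification.

```python
from dataclasses import dataclass

@dataclass
class MealPlan:
    feed_type: str                 # e.g., "standard" or "medicated"
    feed_rate_kg_per_min: float
    feed_time: str                 # start time, e.g., "08:00"
    feed_amount_kg: float

def build_meal_plan(mean_biomass_kg, fish_count, ration_fraction=0.01):
    """Illustrative rule of thumb: feed roughly one percent of total biomass."""
    total_biomass_kg = mean_biomass_kg * fish_count
    amount_kg = total_biomass_kg * ration_fraction
    return MealPlan(feed_type="standard",
                    feed_rate_kg_per_min=min(amount_kg / 30.0, 5.0),
                    feed_time="08:00",
                    feed_amount_kg=amount_kg)

print(build_meal_plan(mean_biomass_kg=4.9, fish_count=10000))
```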
[0072] In some implementations, the control unit 116 includes multiple computer processors. For example, the control unit 116 can include a first and a second computer processor communicably connected to one another. The first and the second computer processor can be connected by a wired or wireless connection. The first computer processor can perform one or more of the operations of the depth perception model 118, the object detector 120, the pose detector 122, the biomass engine 127, or the decision engine 129. The first computer processor can store or provide the depth training data 124.
[0073] Similarly, the second computer processor can perform one or more of the operations of the depth perception model 118, the object detector 120, the pose detector 122, the biomass engine 127, or the decision engine 129. The second computer processor can store or provide the depth training data 124. Operations not performed by the first computer processor can be performed by the second computer processor or an additional computer processor. Operations not performed by the second computer processor can be performed by the first computer processor or an additional computer processor.
[0074] In some implementations, the control unit 116 operates one or more processing components, such as the depth perception model 118, the object detector 120, the pose detector 122, the biomass engine 127, or the decision engine 129. In some implementations, the control unit 116 communicates with an external processor that operates one or more of the processing components. The control unit 116 can store the depth training data 124, or other data used to train one or more models of the processing components, or can communicate with an external storage device that stores data including the depth training data 124.
[0075] In some implementations, the biomass engine 127 generates, for each fish and across multiple weight ranges, the likelihood that the fish’s weight is within a given weight range. The collection of likelihoods for each weight range corresponding to a fish can be represented as a distribution. For example, weight ranges for a given biomass distribution may include ranges from 3 to 3.1 kilograms (kg), 3.1 to 3.2 kg, and 3.2 to 3.3 kg. A likelihood that the actual biomass of the fish 113 is within the first range, 3 to 3.1 kg, can be 10 percent. A likelihood that the biomass of the fish 113 is within the second or third range, 3.1 to 3.2 kg or 3.2 to 3.3 kg, respectively, can be 15 percent and 13 percent. In general, the sum of all likelihoods across all weight ranges can be normalized (e.g., equal to a value, such as 1, or a percentage, such as 100 percent).
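To make the normalization concrete, the small sketch below scales per-range likelihood scores so that they sum to 1; the raw scores and weight ranges are invented for illustration.

```python
def normalize_likelihoods(range_scores):
    """range_scores maps (low_kg, high_kg) weight ranges to raw scores; the
    result is scaled so the likelihoods across all ranges sum to 1."""
    total = sum(range_scores.values())
    return {rng: score / total for rng, score in range_scores.items()}

# Example raw scores for one fish over four weight ranges.
raw = {(3.0, 3.1): 10, (3.1, 3.2): 15, (3.2, 3.3): 13, (3.3, 3.4): 12}
print(normalize_likelihoods(raw))   # values sum to 1.0
```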
[0076] In some implementations, the weight ranges are indicated as values on an x-axis of a distribution representation. For example, a Gaussian or Gaussian-like form can indicate a likelihood, shown on the y-axis, for each weight range across a range of weight ranges, shown on the x-axis.
[0077] In some implementations, the biomass engine 127 generates a specific biomass for each fish. In some implementations, the biomass engine 127 generates a biomass distribution for each fish. Instead of generating a distribution, the biomass engine 127, including one or more trained models, can obtain the data 110, including image 112, and generate an indication of the most likely biomass based on the data. In some cases, truss length measurements can be used as input data. In other cases, the biomass engine 127 can simply generate biomass indications based on obtained images.
[0078] FIG. 2 is a flow diagram showing an example of a process 200 for monocular underwater camera biomass estimation. The process 200 may be performed by one or more systems, for example, the system 100 of FIG. 1 or the system 300 of FIG. 3.
[0079] The process 200 includes obtaining an image of a fish captured by a monocular underwater camera (202). For example, the control unit 116 can obtain the data 110 from the camera device 102. The data 110 can include the image 112. The image 112 can include a depiction of the fish 115. In some implementations, the control unit 116 obtains the data 110 from another device communicably connected to the camera device 102. For example, the camera device 102 can send the data 110 to one or more intermediary devices that provides the data 110 to the control unit 116.
[0080] The process 200 includes providing the image of the fish to a depth perception model (204). For example, the control unit 116 can provide the data 110 to the depth perception model 118. In some implementations, the process 200 includes obtaining data from the monocular underwater camera indicating a current operation status of the monocular underwater camera; and in response to obtaining the image of the fish and the data from the monocular underwater camera, the process 200 includes providing the image of the fish to the depth perception model. For example, the control unit 116 can determine, based on data obtained from the monocular underwater camera, whether or not to process the data 110 using the depth perception model 118.
[0081] The process 200 includes obtaining output of the depth perception model indicating a depth-enhanced image of the fish (206). For example, the control unit 116 can obtain output data from the depth perception model 118. The output can include a depth-enhanced image of the fish 115. The output can include a depth-enhanced image of objects depicted in the image 112, including the fish 113 and the fish 115. The output can include depth-enhanced versions of one or more images of the data 110.
[0082] The process 200 includes determining a biomass estimate of the fish based on the output (208). For example, the control unit 116, or the depth perception model 118 directly, can provide depth-enhanced versions of one or more images to
the object detector 120 and the pose detector 122. In some implementations, the pose detector 122 generates a truss network, such as the network 126, and provides the truss network to the biomass engine 127.
[0083] In some implementations, the control unit 116 determines, based on a depth-enhanced image of a fish, a data set including a value that indicates a length between a first point on the fish and a second point on the fish. For example, the pose detector 122 of the control unit 116 can generate a truss network 126 as shown in FIG. 1 for a fish 115. The truss network 126 can include one or more values indicating one or more lengths between multiple points on the fish 115. The lengths can be determined for one or more objects detected by the object detector 120 and based on a pose detected by the pose detector 122. As discussed herein, the pose can greatly affect the accuracy of truss networks generated. The depth-enhanced versions of images allow the control unit 116 to accurately generate truss networks for fish based on a three-dimensional pose of a given fish.
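As an illustrative sketch, once per-pixel depth is available, a length between two key points can be computed in three dimensions rather than in the image plane alone. The sketch below is a simplification that treats the in-plane scale as constant at the fish's depth; the coordinates, depths, and scale are invented.

```python
import math

def truss_length_3d(p1, p2, depth1_m, depth2_m, metres_per_pixel):
    """Distance between two key points using depths from a depth-enhanced image.

    p1, p2: (x, y) pixel coordinates of the key points.
    depth1_m, depth2_m: depth values at those pixels, in metres.
    """
    dx = (p2[0] - p1[0]) * metres_per_pixel   # in-plane horizontal separation
    dy = (p2[1] - p1[1]) * metres_per_pixel   # in-plane vertical separation
    dz = depth2_m - depth1_m                  # separation along the camera axis
    return math.sqrt(dx * dx + dy * dy + dz * dz)

# Upper-lip and tail key points of a fish angled slightly away from the camera.
print(truss_length_3d((120, 300), (520, 310), depth1_m=1.60, depth2_m=1.75,
                      metres_per_pixel=0.001))
```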
[0084] In some implementations, the control unit 116 provides a data set including a value that indicates a length between a first point on a fish and a second point on a fish to a model trained to predict biomass. For example, the biomass engine can include a model trained to predict biomass. The control unit 116 can obtain data from the pose detector 122, including a value that indicates a length between a first point on a fish and a second point on the fish, and provide the data to the biomass engine 127.
[0085] In some implementations, the control unit 116 obtains output of a model trained to predict biomass as the biomass value of the fish. For example, the control unit 116 can obtain output of the biomass engine 127, such as the estimation data 128. The control unit 116 can provide the output to the decision engine 129.
[0086] In some implementations, the control unit 116 detects a first point and a second point on a fish. For example, the object detector 120 can include one or more feature detectors, including a feature detector for particular features of a fish including key points, such as eyes, fins, nose, among others. In some implementations, the control unit 116 provides a depth-enhanced image of a fish to a model trained to detect feature points on a fish body. For example, the object detector 120 can include one or more trained models. A model of the one or more
models can be trained with data that includes labeled features of fish. The model can provide predicted features of fish based on input images of fish. The control unit 116 or the model itself can adjust one or more parameters of the model based on a comparison between the known features provided by the labels of the training data and the predicted features of fish provided by the model during training.
[0087] In some implementations, the control unit 116 detects fish within an image using a model trained to detect fish. For example, the object detector 120 can include a model trained to detect fish, such as the fish 113 and the fish 115. The model can be trained with data that includes labeled fish within an image of one or more fish. The model can provide predicted fish detections, such as bounding boxes or other indications of locations and identifications of the detected fish, based on input images of fish. The control unit 116 or the model itself can adjust one or more parameters of the model based on a comparison between the known features provided by the labels of the training data and predicted fish detections provided by the model during training.
[0088] The process 200 includes determining an action based on one or more biomass estimates including the biomass estimate of the fish (210). In some implementations, the action includes adjusting a feeding system providing feed to fish. For example, the control unit 116 can provide the output of the biomass engine 127 or a control signal to the feed controller unit 130. Depending on the data received from the control unit 116, the feed controller unit 130 can either process the output of the biomass engine 127 to determine an adjustment of feed and provide a control signal to the feed system 132 or can provide the control signal provided by the control unit 116 to the feed system 132.
[0089] In some implementations, the action includes sending data including the biomass estimate to a user device, where the data is configured to, when displayed on the user device, present a user of the user device with a visual representation of the biomass estimate. For example, the control unit 116 can generate a data signal that includes an indication of one or more biomass estimates, such as the estimate data 128. In some implementations, the control unit 116 waits for feedback from a user who has been provided a visual representation of a biomass estimate to confirm an action determined by the decision engine 129, such as a feed adjustment.
[0090] In some implementations, the control unit 116 obtains data from a monocular underwater camera indicating a current operation status of the monocular underwater camera and, in response to obtaining an image of a fish and the data from the monocular underwater camera, provides the image of the fish to a depth perception model. For example, the camera 102 may include a dysfunctional stereo camera pair. The dysfunction can result in images without the depth data provided by the stereo effect. To mitigate this situation, the camera device 102 can send a signal to the control unit 116 indicating a camera of a stereo pair is dysfunctional after the camera device 102 determines a camera of a stereo pair has become dysfunctional. In response to the signal, the control unit 116 can process images obtained from the camera device 102 using the depth perception model 118.
[0091] In some implementations, the control unit 116 determines, based on the data 110 or other data provided by the camera device 102, that the images do not include depth data. For example, the control unit 116 can process the data 110 and determine that images lack a stereo feature. In response, the control unit 116 can provide the images to the depth perception model 118 to determine the missing depth data. The processing switch can be automatic such that intermittent data processing issues or environmental effects, such as fish or debris obscuring or blocking an image of a stereo pair, can be detected by the control unit 116, and the control unit 116 can determine to process the corresponding images with the depth perception model 118 and can process the corresponding images with the depth perception model 118.
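A minimal sketch of that automatic switch is shown below; it assumes the camera device reports a status flag and that stereo depth is unavailable when a camera of the pair is dysfunctional or obscured. The argument names and the status key are hypothetical.

```python
def depth_for_frame(image, camera_status, depth_perception_model, stereo_depth):
    """Use stereo depth when available; otherwise fall back to the monocular
    depth model. camera_status is, e.g., {"stereo_ok": False}."""
    if camera_status.get("stereo_ok") and stereo_depth is not None:
        return stereo_depth                      # depth from the stereo pair
    # A camera is dysfunctional or obscured: estimate depth from one image.
    return depth_perception_model(image)
```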
[0092] FIG. 3 is a diagram showing an example of a system 300 that is used for underwater camera biomass estimation. The system 300 includes a control unit 316 and an underwater camera device 302. Generally speaking, the control unit 316 obtains images captured by a camera of the camera device 302 and processes the images to generate biomass estimations for one or more fish. The biomass estimations for one or more fish can be processed to determine actions such as feed adjustment, sorting, model training, and user report feedback, among others.
[0093] In some implementations, the control unit 316 includes one or more components for processing data. For example, the control unit 316 can include an object detector 318, truss length engine 320, biomass engine 324, and a decision
engine 328. Components can include one or more processes that are executed by the control unit 316.
[0094] At least one of the one or more cameras of the camera device 302 includes a camera that captures images from a single viewpoint at a time. This type of camera may be referred to as a monocular camera. Where a stereo camera setup can include multiple cameras each capturing a unique viewpoint at a particular time, a monocular camera captures one viewpoint at a particular time. A computer processing output of a stereo camera setup can determine, based on differences in the appearance of objects in one viewpoint compared to another viewpoint at a particular time, depth information of the objects.
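For context, depth from a stereo pair typically follows the standard pinhole relation depth = focal length x baseline / disparity; a monocular camera produces no disparity, which is the gap the learned depth model fills. The sketch below uses invented numbers.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Standard pinhole-stereo relation: depth = f * B / d."""
    return focal_length_px * baseline_m / disparity_px

# Example: 1400 px focal length, 6 cm baseline, 50 px disparity -> 1.68 m.
print(depth_from_disparity(1400.0, 0.06, 50.0))
```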
[0095] In some implementations, the camera device 302 has one or more cameras in a stereo camera setup that are non-functional or obscured, e.g., by debris or other objects, including fish or other animals, in an environment. In some implementations, the control unit 316 can process one or more images from the camera device 302, or obtain a signal from the camera device 302, indicating that one or more cameras of the camera device 302 are obscured or non-functional or the camera device 302 is operating with a monocular camera. The control unit 316 can adjust a processing method based on the status of the cameras of the camera device 302, such as a stereo setup status or a monocular camera setup status. In some implementations, a monocular underwater camera includes stereo camera setups that have become, operationally, a monocular camera setup based on nonfunctioning elements, debris, among other causes. In some cases, stereo camera setups can obtain images from a single viewpoint at a given time and be considered, operationally, a monocular camera.
[0096] In some implementations, the camera device 302 includes a single camera. For example, the single camera can be a monocular camera that captures a single viewpoint at a time. In some implementations, the camera device 302 with a single camera is more efficient to produce and maintain, more economical, and can be more robust with fewer components prone to failure.
[0097] The system 300 also includes a feed controller unit 330 that controls the feed delivered by feed system 332. The feed controller unit 330 can include components configured to send control messages to actuators, blowers, conveyers,
switches, or other components of the feed system 332. The control messages can be configured to stop, start, or change a meal provided to fish 306 in pen 304.
[0098] In this example, the camera device 302 includes propellers to move the camera device 302 around the fish pen 304. In general, the camera device 302 may use any method of movement including ropes and winches, waterjets, thrusters, tethers, buoyancy control apparatus, chains, among others.
[0099] In some implementations, the camera device 302 is equipped with the control unit 316 as an onboard component, while in other implementations, the control unit 316 is not affixed to the camera device 302 and is external to the camera device 302. For example, the camera device 302 may provide images 312 and 314 over a network to the control unit 316. Similarly, the control unit 316 can provide return data, including movement commands to the camera device 302 over the network.
[00100] Stages A through C of FIG. 3 depict image data 310, including image 312, obtained by the camera device 302 and processed by the control unit 316. The image 312 includes representations of the fish 313 and 315. Images of fish obtained by the camera device 302 may include fish in any conceivable pose including head on, reverse head on, or skewed.
[00101] In stage A, the camera device 302 obtains the image data 310 including image 312 of the fish 313 and 315 within the pen 304. The camera device 302 provides the data 310 to the control unit 316. In some implementations, the camera device 302 obtains multiple images and provides the multiple images, including the image 312, to the control unit 316.
[00102] In stage B, the control unit 316 processes the images of the data 310, including the image 312. The control unit 316 provides the data 310 to the object detector 318. The object detector 318 can run on the control unit 316 or be communicably connected to the control unit 316. The object detector 318 detects one or more objects in the images of the data 310. The one or more objects can include large scale objects, such as fish, as well as smaller objects, such as key points of the fish.
[00103] In the example of FIG. 3, the object detector 318 detects the fish 313 and the fish 315 in the image 312. The object detector 318 can compute a bounding box
for each detection. For example, the object detector 318 can compute a bounding box 318a indicating a location of the fish 313 and a bounding box 318b indicating a location of the fish 315. The object detector 318 can compute bounding boxes for key points 318c. Instead of bounding boxes, other types of data indicating a location can be used, such as a specific location specified by one or more values (e.g., x and y values in a coordinate plane of the image 312, spherical coordinates, among others).
[00104] The object detector 318 detects key points 318c of the fish 313 and fish 315. As shown in FIG. 3, the key points 318c of the fish 313 and the fish 315 represent locations on the fish 313 and the fish 315. The locations can include a mouth, upper fin, tail, eye, among others. The object detector 318 can include one or more machine learning models. The machine learning models can be trained to detect key points as well as objects such as fish. In some implementations, the machine learning models are trained by the control unit 316.
[00105] In some implementations, machine learning models of the object detector 318 are trained to detect key points. For example, the object detector 318 can obtain images including one or more key points. The object detector 318 can determine a predicted location for each of the one or more key points. The object detector 318 can obtain ground truth data indicating an actual location for each of the one or more key points. Based on comparing the ground truth data to the predicted location for the one or more key points, the object detector 318 can determine an error term. The object detector 318 can then adjust one or more parameters of the machine learning models according to the error term in order to improve subsequent predictions.
[00106] In some implementations, machine learning models of the object detector 318 are trained to detect fish or other animals. For example, the object detector 318 can obtain images including one or more depictions of fish. The object detector 318 can determine a predicted location for the one or more depictions of fish. The object detector 318 can obtain ground truth data indicating an actual location for the depicted fish. Based on comparing the ground truth data to the predicted location for the depicted fish, the object detector 318 can determine an error term. The object detector 318 can then adjust one or more parameters of the machine learning models according to the error term in order to improve subsequent predictions.
[00107] The truss length engine 320 obtains detected objects from the object detector 318. In some implementations, the control unit 316 obtains detected object data from the object detector 318 and provides the detected object data to the truss length engine 320. The truss length engine 320 generates truss networks 320a and 320b for the fish 313 and the fish 315, respectively. The truss network 320b for the fish 315 is shown in more detail in FIG. 3. In general, the truss networks 320a and 320b indicate distances between locations detected on the fish 313 and 315. These distances can be associated with unique properties of the fish such as biomass, identity, or health.
[00108] In some implementations, the truss networks 320a and 320b are two-dimensional (2D) truss networks. For example, the truss length engine 320 determines a distance between a first key point of the key points 318c and a second key point of the key points 318c based on a number of pixels in the image 312 separating the first key point from the second key point. Such a determination does not rely on depth information from the image 312. Truss lengths generated in such a way may inaccurately indicate features of a fish, such as biomass, based on a pose of the fish (e.g., truss lengths for a fish facing head-on towards the camera device 302 can appear smaller than they would if the fish were viewed side-on). However, by processing generated truss networks in aggregate, the biomass engine 324 can generate accurate predictions for the population of fish 306, as historical training data likely has a similar distribution of fish poses and corresponding generated truss networks. The control unit 316 can generate accurate biomass distributions for the fish 306.
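A minimal sketch of such a pixel-space truss length follows, using invented key-point coordinates; it illustrates that no depth information is involved in the 2D case.

```python
import math

def truss_length_2d(p1, p2):
    """Pixel-space distance between two detected key points; no depth is used."""
    return math.hypot(p2[0] - p1[0], p2[1] - p1[1])

# Illustrative key points for one fish, as (x, y) pixel coordinates.
key_points = {"mouth": (112, 240), "dorsal_fin": (260, 180), "tail": (430, 250)}
print(truss_length_2d(key_points["mouth"], key_points["tail"]))  # ~318 pixels
```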
[00109] The truss length engine 320 generates truss data 322 including the truss networks 320a and 320b. In some implementations, the truss data 322 includes truss networks generated for multiple fish, such as all the fish 306. The biomass engine 324 obtains the truss data 322. In some implementations, the control unit 316 obtains the truss data 322, including the truss networks 320a and 320b, and provides the truss data to the biomass engine 324.
[00110] The biomass engine 324 generates a biomass distribution 326 based on the truss data 322. In some implementations, the biomass engine 324 includes one or more machine learning models trained to generate biomass distributions for one or more fish. For example, the biomass engine 324 can obtain truss networks for a
population of fish. The biomass engine 324, and one or more models therein, can generate a predicted biomass distribution for the population of fish based on the obtained truss networks. The biomass engine 324 can then compare the predicted biomass distribution with a known, or previously calculated, biomass distribution for the population of fish. The biomass engine 324 can adjust one or more parameters of the one or more machine learning models of the biomass engine 324 based on a result of comparing the predicted biomass distribution to the known biomass distribution. Various training techniques, including gradient descent, backpropagation, among others, can be used in training models of components of the control unit 316, including the biomass engine 324 component.
[00111] The known biomass distribution can be determined using one or more algorithms, models, or manual weighing of fish. The population of fish from which training data is derived can be different from the fish 306. The population of fish can be stored in a historical database including records from previous populations of fish. In some implementations, the control unit 316, or processors configured to operate similar to the control unit 316, generates truss data and provides the truss data to a database for later use in training. For example, the control unit 316 can provide the truss data 322 to a database. The truss data 322 can be stored and used for later processes, such as subsequent training of one or more models. In general, training data can include 2D truss networks as described herein.
[00112] The biomass distribution 326 generated by the biomass engine 324 includes one or more biomasses, corresponding to one or more fish, of the fish 306. In the example of FIG. 3, a number of fish corresponding to ranges of biomass is determined by the biomass engine 324. For example, the biomass engine 324 can determine that two fish correspond to the biomass range 3.5 kilograms to 3.7 kilograms and three fish correspond to the biomass range 3.7 kilograms to 3.8 kilograms. In some implementations, the ranges are predetermined. In some implementations, the ranges are dynamically chosen by the control unit 316 based on the number of fish and the distribution of biomasses. In some implementations, the biomass distribution 326 is a histogram.
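As a hedged sketch, such a distribution can be built by binning per-fish biomass estimates into ranges derived from the data; the bin width and estimates below are invented for illustration.

```python
import numpy as np

def biomass_histogram(biomass_kg, bin_width_kg=0.1):
    """Bin per-fish biomass estimates into ranges chosen from the data."""
    estimates = np.asarray(biomass_kg, dtype=float)
    edges = np.arange(estimates.min(), estimates.max() + bin_width_kg, bin_width_kg)
    counts, edges = np.histogram(estimates, bins=edges)
    return list(zip(edges[:-1], edges[1:], counts))

for low, high, n in biomass_histogram([3.55, 3.62, 3.71, 3.74, 3.78, 3.69]):
    print(f"{low:.2f}-{high:.2f} kg: {n} fish")
```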
[00113] The biomass engine 324 generates the biomass distribution 326 and provides the biomass distribution 326 to the decision engine 328. The decision engine 328 obtains the biomass distribution 326. In stage C, the control unit 316
determines an action based on the biomass distribution 326. In some implementations, the control unit 316 provides the biomass distribution 326, including one or more biomass estimation values, to the decision engine 328.
[00114] In some implementations, the biomass engine 324 generates the biomass distribution 326 that includes likelihoods that a number of fish of the fish 306 are a particular biomass or within a range of biomass. For example, weight ranges for the biomass distribution 326 can include ranges from 3 to 3.1 kilograms (kg), 3.1 to 3.2 kg, and 3.2 to 3.3 kg. A likelihood that a number of the fish 306 are within the first range, 3 to 3.1 kg, can be 30 percent. A likelihood that the number of the fish 306 are within the second or third range, 3.1 to 3.2 kg or 3.2 to 3.3 kg, respectively, can be 35 percent and 33 percent. In general, the sum of all likelihoods across all weight ranges can be normalized (e.g., equal to a value, such as 3, or percent such as 300 percent).
[00115] In some implementations, the decision engine 328 determines a portion of the fish 306, based on data generated by the biomass engine 324, are below an expected weight or below weights of others of the fish 306. For example, the decision engine 328 can determine subpopulations within the fish 306 and determine one or more actions based on the determined subpopulations, such as actions to mitigate or correct for issues (e.g., runting, health issues, infections, disfiguration, among others). Actions can include feed adjustment, sorting, model training, and user report feedback, among others.
[00116] In some implementations, the decision engine 328 detects one or more features of the biomass estimations of the biomass distribution 326. For example, the decision engine 328 can detect one or more subpopulations. Subpopulations can include runt fish, healthy fish, diseased fish, among others. For example, a Gaussian-like shape in the biomass distribution 326 can indicate a subpopulation. In the example of FIG. 3, the biomass distribution 326 includes at least two subpopulations with one having a smaller average biomass than the other. The control unit 316 can determine the subpopulation with the smaller average biomass is a runt population based on a comparison with runt population criteria, such as a runt population average biomass threshold, among others.
[00117] In some implementations, the decision engine 328 detects a runt subpopulation based on processing the biomass distribution 326. For example, the decision engine 328 can include one or more algorithms or trained models to detect groups within a distribution of data. The decision engine 328 can include one or more processors configured to perform clustering algorithms such as k-means, partitioning methods, hierarchical clustering, fuzzy clustering, density-based clustering, model-based clustering, among others.
[00118] In some implementations, the control unit 316 determines an adjustment of feed using the feed controller unit 330 controlling the feed system 332. The control unit 316 can provide the biomass distribution 326 or a control signal to the feed controller unit 330. Depending on the data received from the control unit 316, the feed controller unit 330 can either process the biomass distribution 326 to determine an adjustment of feed and provide a control signal to the feed system 332 or can provide the control signal provided by the control unit 316 to the feed system 332.
[00119] In some implementations, the decision engine 328 does not detect a runt subpopulation. For example, the decision engine 328 can determine that the biomass distribution 326 does or does not satisfy a biomass requirement or threshold, such as a biomass requirement for distribution or sale. The decision engine 328 can determine, based on features of the biomass distribution 326, what action to perform.
[00120] For example, if one or more biomass estimations generated by the biomass engine 324 do not satisfy a threshold (e.g., the mean or median biomass is too large or too small), the control unit 316 can provide a control signal to a sorting actuator of a sorting system to sort one or more fish from the fish 306 or can provide a control signal to adjust a feeding of the fish 306. For example, the control unit 316 can sort the fish 306 based on biomass. The control unit 316 can send a signal to a sorting system that sorts fish based on one or more criteria, such as a threshold biomass, into multiple locations based on the one or more criteria.
[00121] In some implementations, the control unit 316 includes the feed controller unit 330. For example, the control unit 316 may control both the processing of the images in the data 310 and the adjustments to the feeding by controlling the feed system 332.
[00122] In some implementations, the control unit 316 adjusts feeding to provide feed to a certain area of the fish pen 304. For example, the obtained data 310 can include positions of the fish detected within the images of the obtained data 310. The control unit 316 can determine based on one or more subpopulations detected by the decision engine 328 of the control unit 316 that a given subpopulation requires additional feed.
[00123] The control unit 316 can send a control signal to the feed system 332 or to the control unit 330 for the feed system 332 configured to adjust the location of an output of feed. The control unit 316 can adjust the location of an output of feed to a location of one or more fish within a particular subpopulation or an average location of the subpopulation.
[00124] In some implementations, the feed system 332 includes multiple food types. For example, the controller unit 330 can provide control messages to the feed system 332 to change the food type provided to the fish 306. In some cases, the multiple food types include a medicated food type and a non-medicated food type.
In some cases, the multiple food types include food with a particular nutritional value and food with a different nutritional value.
[00125] The controller unit 330 can determine, based on data from the control unit 316, which food to provide to the fish 306, how much food to provide, when to provide the food, and at what rate to provide the food. In general, the controller unit 330 can generate a meal plan based on data from the control unit 316, such as biomass estimations or a control signal generated based on biomass estimations, where the meal plan includes one or more of: a feed type, a feed rate, a feed time, and a feed amount.
[00126] In some implementations, the control unit 316 includes multiple computer processors. For example, the control unit 316 can include a first and a second computer processor communicably connected to one another. The first and the second computer processor can be connected by a wired or wireless connection. The first computer processor can perform one or more of the operations of the object detector 318, truss length engine 320, biomass engine 324, or the decision engine 328. The first computer processor can store or provide training data to train one or
more models of the object detector 318, truss length engine 320, biomass engine 324, or the decision engine 328.
[00127] Similarly, the second computer processor can perform one or more of the operations of the object detector 318, truss length engine 320, biomass engine 324, or the decision engine 328. The second computer processor can store or provide the training data to train one or more models of the object detector 318, truss length engine 320, biomass engine 324, or the decision engine 328. Operations not performed by the first computer processor can be performed by the second computer processor or an additional computer processor. Operations not performed by the second computer processor can be performed by the first computer processor or an additional computer processor.
[00128] In some implementations, the control unit 316 operates one or more processing components, such as the object detector 318, truss length engine 320, biomass engine 324, or the decision engine 328. In some implementations, the control unit 316 communicates with an external processor that operates one or more of the processing components. The control unit 316 can store training data, or other data used to train one or more models of the processing components, or can communicate with an external storage device that stores data including training data.
[00129] In some implementations, models of the components of the control unit 316 include one or more fully or partially connected layers. Each of the layers can include one or more parameter values indicating an output of the layers. The layers of the models can generate output for each of the components, such as the object detector 318, truss length engine 320, biomass engine 324, or the decision engine 328.
[00130] In general, the control unit 316 can process one or more images of the data 310 in aggregate or process each image of the data 310 individually. In some implementations, one or more components of the control unit 316 process items of the data individually and one or more components of the control unit 316 process the data 310 in aggregate. For example, the object detector 318 can process each image of the data 310 individually. Similarly, the truss length engine 320 can obtain data from the object detector 318 and process each truss network for each detected fish. In some implementations, parallel processes of the object detector 318 and the
truss length engine 320 generate object detections and truss lengths corresponding to the object detections for multiple fish at a time. For example, a first process of detecting a fish object and then determining a truss network for that fish can proceed as a parallel process with a second process of detecting another fish object and then determining a truss network for that fish.
[00131] In general, processing individually, as discussed herein, can include processing one or more items sequentially or in parallel using one or more processors. In some implementations, the decision engine 328 processes data in aggregate. For example, the decision engine 328 can determine, based on one or more data values indicating biomass estimates provided by the biomass engine 324, one or more decisions and related actions, as described herein. In general, processing in aggregate, as discussed herein, can include processing data corresponding to two or more items of the data 310 to generate a single result. In some implementations, an item of the data 310 includes the image 312.
[00132] FIG. 4 is a flow diagram showing an example of a process 400 for monocular underwater camera biomass estimation. The process 400 may be performed by one or more systems, for example, the system 100 of FIG. 1 or the system 300 of FIG. 3.
[00133] The process 400 includes obtaining images captured by a monocular underwater camera (402). For example, the control unit 316 obtains images of the image data 310 captured by the camera device 302. The camera device 302 can include a single camera. The camera device 302 with a single camera can be more efficient to produce and maintain, more economical, and can be more robust with fewer components prone to failure.
[00134] The process 400 includes providing data corresponding to multiple fish based on the images to a trained model (404). For example, the control unit 316 provides the image data 310 to the object detector 318 trained to detect objects in images of the image data 310. In another example, the control unit 316 provides the image data 310 to the object detector 318 and the truss length engine 320 to generate the truss data 322. The control unit 316 then provides the truss data 322 to the biomass engine 324. The biomass engine 324 includes one or more models
trained to determine one or more biomass estimations based on one or more input truss networks.
[00135] The process 400 includes obtaining output of the trained model including one or more biomass values (406). For example, the control unit 316 can obtain output of the models, including the object detector 318 and the biomass engine 324. The output of the object detector 318 can include detected objects such as fish or key points. The output of the biomass engine 324 can include biomass distribution 326.
[00136] The process 400 includes determining an action based on the one or more biomass values (408). For example, the control unit 316 can determine to adjust a feeding of the fish pen 304 based on the detected objects and the biomass distribution 326. The decision engine 328 component of the control unit 316 can choose from one or more available actions, e.g., available actions obtained from data or provided by a user. In the example of FIG. 3, the available actions include feed adjustment. The control unit 316 can provide a signal to the feed controller unit 330 to adjust the feed for the fish pen 304.
[00137] In some implementations, the action includes sending data including the biomass distribution 326 to a user device, where the data is configured to, when displayed on the user device, present a user of the user device with a visual representation of the biomass distribution 326. For example, the control unit 316 can generate a data signal that includes an indication of one or more biomass estimates, such as the biomass distribution 326. In some implementations, the control unit 316 waits for feedback from a user who has been provided a visual representation of a biomass estimate to confirm an action determined by the decision engine 328, such as a feed adjustment.
[00138] In some implementations, the control unit 316 obtains data from a monocular underwater camera indicating a current operation status of the monocular underwater camera and, in response to obtaining an image of a fish, provides the image of the fish to a trained model. For example, the camera 302 may include a dysfunctional stereo camera pair. The dysfunction can result in images without the depth data provided by the stereo effect of the cameras. To mitigate this situation, the camera device 302 can send a signal to the control unit 316 indicating a camera
of a stereo pair is dysfunctional after the camera device 302 determines a camera of a stereo pair has become dysfunctional. In response to the signal, the control unit 316 can process images obtained from the camera device 302 as discussed herein.
[00139] In some implementations, the control unit 316 determines, based on the data 310 or other data provided by the camera device 302, the images from the camera device 302 do not include depth data. For example, the control unit 316 can process the data 310 and determine that images lack a stereo feature. In response, the control unit 316 can process the images to determine a biomass distribution without this additional depth data. For example, as discussed herein, the control unit 316 can generate a group of truss networks for a population of fish, such as the fish 306, and provide the group of truss networks to the biomass engine 324 to determine a biomass distribution for the fish 306.
[00140] The processing switch can be automatic such that intermittent data processing issues or environmental effects, such as fish or debris obscuring or blocking an image of a stereo pair, can be detected by the control unit 316, and the control unit 316 can determine to process corresponding images as described herein.
[00141] FIG. 5 is a diagram showing an example of a truss network, e.g., truss network 126, truss network 320a, or truss network 320b. FIG. 5 shows truss lengths and key points computed for, e.g., the fish 115 by the system 100 shown in FIG. 1. The truss lengths between key points are used to extract information about the fish including a weight of the fish. Various trusses, or lengths between key points, of the fish can be used. FIG. 5 shows a number of possible truss lengths including upper lip 502 to eye 504, upper lip 502 to leading edge dorsal fin 506, upper lip 502 to leading edge pectoral fin 508, leading edge dorsal fin 506 to leading edge anal fin 510, leading edge anal fin 510 to trailing lower caudal peduncle 512, and trailing lower caudal peduncle 512 to trailing upper caudal peduncle 514. Other key points and other separations, including permutations of key points mentioned, can be used. For different fish, or different species of fish, different key points may be generated. For any set of key points, a truss network may be generated as a model.
[00142] Other truss lengths not shown can be used by the system 100 or system 300. For example, a truss length from the upper lip 502 to the tail 513 can be used
as the length of the fish 115 and included in a collection of one or more truss length measurements and, e.g., provided to a trained model to generate a biomass distribution. In addition, specific truss lengths can be used to recognize specific deformities. Deformities such as shortened operculum can be detected using truss lengths such as a truss length from the upper lip 502 to the gill. Processing one or more images of a fish can include determining any of the following health conditions: shortened abdomen, shortened tail, scoliosis, lordosis, kyphosis, deformed upper jaw, deformed lower jaw, shortened operculum, runting or cardiomyopathy syndrome (CMS).
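A small sketch of a truss network keyed by the landmark names above is shown below, together with an invented heuristic for the shortened-operculum check; the coordinates, the selected trusses, and the ratio are illustrative only and are not taken from the specification.

```python
import math

# Illustrative key point positions (pixel coordinates) named after FIG. 5.
kp = {
    "upper_lip": (100, 250),
    "eye": (140, 235),
    "leading_edge_dorsal_fin": (260, 170),
    "leading_edge_anal_fin": (360, 300),
    "gill": (185, 255),
}

def truss(a, b):
    return math.dist(kp[a], kp[b])

lengths = {
    "upper_lip_to_eye": truss("upper_lip", "eye"),
    "upper_lip_to_dorsal_fin": truss("upper_lip", "leading_edge_dorsal_fin"),
    "dorsal_fin_to_anal_fin": truss("leading_edge_dorsal_fin", "leading_edge_anal_fin"),
    "upper_lip_to_gill": truss("upper_lip", "gill"),
}

# Invented heuristic: a lip-to-gill truss that is unusually short relative to
# the lip-to-dorsal truss could indicate a shortened operculum.
ratio = lengths["upper_lip_to_gill"] / lengths["upper_lip_to_dorsal_fin"]
print({k: round(v, 1) for k, v in lengths.items()}, "operculum ratio:", round(ratio, 3))
```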
[00143] In some implementations, a biomass distribution includes health data related to one or more fish represented in a distribution. For example, the biomass engine 127 of the system 100 shown in FIG. 1 or biomass engine 324 of the system 300 shown in FIG. 3 can generate a distribution of one or more fish that includes health probabilities as well as, or instead of, biomass data. The health probabilities can be used to determine various remedial actions including providing medicated feed or moving the fish to a system for treatment, such as delousing.
[00144] FIG. 6 is a diagram illustrating an example of a computing system used for monocular underwater camera biomass estimation. The computing system includes computing device 600 and a mobile computing device 650 that can be used to implement the techniques described herein. For example, one or more components of the system 100 or the system 300 could be an example of the computing device 600 or the mobile computing device 650, such as a computer system implementing the control unit 116 or control unit 316, devices that access information from the control unit 116 or control unit 316, or a server that accesses or stores information regarding the operations performed by the control unit 116 or control unit 316.
[00145] The computing device 600 is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The mobile computing device 650 is intended to represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smart-phones, mobile embedded radio systems, radio diagnostic computing devices, and other similar computing devices. The components shown here, their connections and relationships, and their functions, are meant to be examples only and are not meant to be limiting.
[00146] The computing device 600 includes a processor 602, a memory 604, a storage device 606, a high-speed interface 608 connecting to the memory 604 and multiple high-speed expansion ports 610, and a low-speed interface 612 connecting to a low-speed expansion port 614 and the storage device 606. Each of the processor 602, the memory 604, the storage device 606, the high-speed interface 608, the high-speed expansion ports 610, and the low-speed interface 612, are interconnected using various busses, and may be mounted on a common motherboard or in other manners as appropriate. The processor 602 can process instructions for execution within the computing device 600, including instructions stored in the memory 604 or on the storage device 606 to display graphical information for a GUI on an external input/output device, such as a display 616 coupled to the high-speed interface 608. In other implementations, multiple processors and/or multiple buses may be used, as appropriate, along with multiple memories and types of memory. In addition, multiple computing devices may be connected, with each device providing portions of the operations (e.g., as a server bank, a group of blade servers, or a multi-processor system). In some implementations, the processor 602 is a single threaded processor. In some implementations, the processor 602 is a multi-threaded processor. In some implementations, the processor 602 is a quantum computer.
[00147] The memory 604 stores information within the computing device 600. In some implementations, the memory 604 is a volatile memory unit or units. In some implementations, the memory 604 is a non-volatile memory unit or units. The memory 604 may also be another form of computer-readable medium, such as a magnetic or optical disk.
[00148] The storage device 606 is capable of providing mass storage for the computing device 600. In some implementations, the storage device 606 may be or include a computer-readable medium, such as a floppy disk device, a hard disk device, an optical disk device, a tape device, a flash memory or other similar solid-state memory device, or an array of devices, including devices in a storage area network or other configurations. Instructions can be stored in an information carrier. The instructions, when executed by one or more processing devices (for example, processor 602), perform one or more methods, such as those described above. The instructions can also be stored by one or more storage devices such as computer- or machine-readable media (for example, the memory 604, the storage device 606, or memory on the processor 602). The high-speed interface 608 manages bandwidth-intensive operations for the computing device 600, while the low-speed interface 612 manages lower bandwidth-intensive operations. Such allocation of functions is an example only. In some implementations, the high-speed interface 608 is coupled to the memory 604, the display 616 (e.g., through a graphics processor or accelerator), and to the high-speed expansion ports 610, which may accept various expansion cards (not shown). In some implementations, the low-speed interface 612 is coupled to the storage device 606 and the low-speed expansion port 614. The low-speed expansion port 614, which may include various communication ports (e.g., USB, Bluetooth, Ethernet, wireless Ethernet), may be coupled to one or more input/output devices, such as a keyboard, a pointing device, a scanner, or a networking device such as a switch or router, e.g., through a network adapter.
[00149] The computing device 600 may be implemented in a number of different forms, as shown in the figure. For example, it may be implemented as a standard server 620, or multiple times in a group of such servers. In addition, it may be implemented in a personal computer such as a laptop computer 622. It may also be implemented as part of a rack server system 624. Alternatively, components from the computing device 600 may be combined with other components in a mobile device, such as a mobile computing device 650. Each of such devices may include one or more of the computing device 600 and the mobile computing device 650, and an entire system may be made up of multiple computing devices communicating with each other.
[00150] The mobile computing device 650 includes a processor 652, a memory 664, an input/output device such as a display 654, a communication interface 666, and a transceiver 668, among other components. The mobile computing device 650 may also be provided with a storage device, such as a micro-drive or other device, to provide additional storage. Each of the processor 652, the memory 664, the display 654, the communication interface 666, and the transceiver 668, are interconnected using various buses, and several of the components may be mounted on a common motherboard or in other manners as appropriate.
[00151] The processor 652 can execute instructions within the mobile computing device 650, including instructions stored in the memory 664. The processor 652 may
be implemented as a chipset of chips that include separate and multiple analog and digital processors. The processor 652 may provide, for example, for coordination of the other components of the mobile computing device 650, such as control of user interfaces, applications run by the mobile computing device 650, and wireless communication by the mobile computing device 650.
[00152] The processor 652 may communicate with a user through a control interface 658 and a display interface 656 coupled to the display 654. The display 654 may be, for example, a TFT (Thin-Film-Transistor Liquid Crystal Display) display or an OLED (Organic Light Emitting Diode) display, or other appropriate display technology. The display interface 656 may include appropriate circuitry for driving the display 654 to present graphical and other information to a user. The control interface 658 may receive commands from a user and convert them for submission to the processor 652. In addition, an external interface 662 may provide communication with the processor 652, so as to enable near area communication of the mobile computing device 650 with other devices. The external interface 662 may provide, for example, for wired communication in some implementations, or for wireless communication in other implementations, and multiple interfaces may also be used.
[00153] The memory 664 stores information within the mobile computing device 650. The memory 664 can be implemented as one or more of a computer-readable medium or media, a volatile memory unit or units, or a non-volatile memory unit or units. An expansion memory 674 may also be provided and connected to the mobile computing device 650 through an expansion interface 672, which may include, for example, a SIMM (Single In Line Memory Module) card interface. The expansion memory 674 may provide extra storage space for the mobile computing device 650, or may also store applications or other information for the mobile computing device 650. Specifically, the expansion memory 674 may include instructions to carry out or supplement the processes described above, and may include secure information also. Thus, for example, the expansion memory 674 may be provided as a security module for the mobile computing device 650, and may be programmed with instructions that permit secure use of the mobile computing device 650. In addition, secure applications may be provided via the SIMM cards, along with additional
information, such as placing identifying information on the SIMM card in a non-hackable manner.
[00154] The memory may include, for example, flash memory and/or NVRAM memory (nonvolatile random access memory), as discussed below. In some implementations, instructions are stored in an information carrier such that the instructions, when executed by one or more processing devices (for example, processor 652), perform one or more methods, such as those described above. The instructions can also be stored by one or more storage devices, such as one or more computer- or machine-readable mediums (for example, the memory 664, the expansion memory 674, or memory on the processor 652). In some implementations, the instructions can be received in a propagated signal, for example, over the transceiver 668 or the external interface 662.
[00155] The mobile computing device 650 may communicate wirelessly through the communication interface 666, which may include digital signal processing circuitry in some cases. The communication interface 666 may provide for communications under various modes or protocols, such as GSM voice calls (Global System for Mobile communications), SMS (Short Message Service), EMS (Enhanced Messaging Service), or MMS messaging (Multimedia Messaging Service), CDMA (code division multiple access), TDMA (time division multiple access), PDC (Personal Digital Cellular), WCDMA (Wideband Code Division Multiple Access), CDMA2000, or GPRS (General Packet Radio Service), LTE, 5G/6G cellular, among others. Such communication may occur, for example, through the transceiver 668 using a radio frequency. In addition, short-range communication may occur, such as using a Bluetooth, Wi-Fi, or other such transceiver (not shown). In addition, a GPS (Global Positioning System) receiver module 670 may provide additional navigation- and location-related wireless data to the mobile computing device 650, which may be used as appropriate by applications running on the mobile computing device 650.
[00156] The mobile computing device 650 may also communicate audibly using an audio codec 660, which may receive spoken information from a user and convert it to usable digital information. The audio codec 660 may likewise generate audible sound for a user, such as through a speaker, e.g., in a handset of the mobile computing device 650. Such sound may include sound from voice telephone calls, may include recorded sound (e.g., voice messages, music files, among others) and
may also include sound generated by applications operating on the mobile computing device 650.
[00157] The mobile computing device 650 may be implemented in a number of different forms, as shown in the figure. For example, it may be implemented as a cellular telephone 680. It may also be implemented as part of a smart-phone 682, personal digital assistant, or other similar mobile device.
[00158] A number of implementations have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the disclosure. For example, various forms of the flows shown above may be used, with steps re-ordered, added, or removed.
[00159] Embodiments of the invention and all of the functional operations described in this specification can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them. Embodiments of the invention can be implemented as one or more computer program products, e.g., one or more modules of computer program instructions encoded on a computer readable medium for execution by, or to control the operation of, data processing apparatus. The computer readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them. The term “data processing apparatus” encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them. A propagated signal is an artificially generated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal that is generated to encode information for transmission to suitable receiver apparatus.
[00160] A computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language,
including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.
[00161] The processes and logic flows described in this specification can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit).
[00162] Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read only memory or a random access memory or both. The essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks. However, a computer need not have such devices. Moreover, a computer can be embedded in another device, e.g., a tablet computer, a mobile telephone, a personal digital assistant (PDA), a mobile audio player, a Global Positioning System (GPS) receiver, to name just a few. Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks;
magneto-optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
[00163] To provide for interaction with a user, embodiments of the invention can be implemented on a computer having a display device, e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor, for displaying information to the user and a keyboard and a pointing device, e.g., a mouse or a trackball, by which the user can provide input to the computer. Other kinds of devices can be used to provide for interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback, e.g., visual feedback, auditory feedback, or tactile feedback; and input from the user can be received in any form, including acoustic, speech, or tactile input.
[00164] Embodiments of the invention can be implemented in a computing system that includes a back end component, e.g., as a data server, or that includes a middleware component, e.g., an application server, or that includes a front end component, e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the invention, or any combination of one or more such back end, middleware, or front end components. The components of the system can be interconnected by any form or medium of digital data communication, e.g., a communication network. Examples of communication networks include a local area network (“LAN”) and a wide area network (“WAN”), e.g., the Internet.
[00165] The computing system can include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
[00166] While this specification contains many specifics, these should not be construed as limitations on the scope of the invention or of what may be claimed, but rather as descriptions of features specific to particular embodiments of the invention. Certain features that are described in this specification in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single
embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
[00167] Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous. Moreover, the separation of various system components in the embodiments described above should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.
[00168] Particular embodiments of the invention have been described. Other embodiments are within the scope of the following claims. For example, the steps recited in the claims can be performed in a different order and still achieve desirable results.
[00169] What is claimed is:
Claims
1. A method comprising: obtaining an image of a fish captured by a monocular underwater camera; providing the image of the fish to a depth perception model; obtaining output of the depth perception model indicating a depth-enhanced image of the fish; determining a biomass estimate of the fish based on the output; and determining an action based on one or more biomass estimates including the biomass estimate of the fish.
2. The method of claim 1, comprising: determining, based on the depth-enhanced image of the fish, a data set including a value that indicates a length between a first point on the fish and a second point on the fish.
3. The method of claim 2, wherein determining the biomass estimate of the fish comprises: providing the data set including the value that indicates the length between the first point on the fish and the second point on the fish to a model trained to predict biomass; and obtaining output of the model trained to predict biomass as the biomass estimate of the fish.
4. The method of claim 2, comprising: detecting the first point and second point on the fish.
5. The method of claim 4, wherein detecting the points comprises: providing the depth-enhanced image of the fish to a model trained to detect feature points on a fish body.
6. The method of claim 1, comprising: detecting the fish within the image using a model trained to detect fish.
7. The method of claim 1, wherein the action comprises: adjusting a feeding system providing feed to the fish.
8. The method of claim 1, wherein the action comprises: sending data including the biomass estimate to a user device, wherein the data is configured to, when displayed on the user device, present a user of the user device with a visual representation of the biomass estimate.
9. The method of claim 1, comprising: obtaining data from the monocular underwater camera indicating a current operation status of the monocular underwater camera; and in response to obtaining the image of the fish and the data from the monocular underwater camera, providing the image of the fish to the depth perception model.
10. A non-transitory, computer-readable medium storing one or more instructions executable by a computer system to perform operations comprising: obtaining an image of a fish captured by a monocular underwater camera; providing the image of the fish to a depth perception model; obtaining output of the depth perception model indicating a depth-enhanced image of the fish; determining a biomass estimate of the fish based on the output; and determining an action based on one or more biomass estimates including the biomass estimate of the fish.
11. The non-transitory, computer-readable medium of claim 10, comprising: determining, based on the depth-enhanced image of the fish, a data set including a value that indicates a length between a first point on the fish and a second point on the fish.
12. The non-transitory, computer-readable medium of claim 11, wherein determining the biomass estimate of the fish comprises:
providing the data set including the value that indicates the length between the first point on the fish and the second point on the fish to a model trained to predict biomass; and obtaining output of the model trained to predict biomass as the biomass estimate of the fish.
13. The non-transitory, computer-readable medium of claim 11, comprising: detecting the first point and second point on the fish.
14. The non-transitory, computer-readable medium of claim 13, wherein detecting the points comprises: providing the depth-enhanced image of the fish to a model trained to detect feature points on a fish body.
15. The non-transitory, computer-readable medium of claim 10, comprising: detecting the fish within the image using a model trained to detect fish.
16. The non-transitory, computer-readable medium of claim 10, wherein the action comprises: adjusting a feeding system providing feed to the fish.
17. The non-transitory, computer-readable medium of claim 10, wherein the action comprises: sending data including the biomass estimate to a user device, wherein the data is configured to, when displayed on the user device, present a user of the user device with a visual representation of the biomass estimate.
18. The non-transitory, computer-readable medium of claim 10, comprising: obtaining data from the monocular underwater camera indicating a current operation status of the monocular underwater camera; and in response to obtaining the image of the fish and the data from the monocular underwater camera, providing the image of the fish to the depth perception model.
19. A computer-implemented system, comprising: one or more computers; and one or more computer memory devices interoperably coupled with the one or more computers and having tangible, non-transitory, machine-readable media storing one or more instructions that, when executed by the one or more computers, perform one or more operations comprising: obtaining an image of a fish captured by a monocular underwater camera; providing the image of the fish to a depth perception model; obtaining output of the depth perception model indicating a depth-enhanced image of the fish; determining a biomass estimate of the fish based on the output; and determining an action based on one or more biomass estimates including the biomass estimate of the fish.
20. The system of claim 19, comprising: determining, based on the depth-enhanced image of the fish, a data set including a value that indicates a length between a first point on the fish and a second point on the fish.
21. A method comprising: obtaining a plurality of images of fish captured by a monocular underwater camera; providing the plurality of images that were captured by the monocular underwater camera to a first model trained to detect one or more fish within the plurality of images; generating one or more values for each detected fish as a set of values; generating a biomass distribution of the fish based on the set of values; and determining an action based on the biomass distribution.
22. The method of claim 21, wherein generating the biomass distribution of the fish based on the set of values comprises: providing the set of values to a second model trained to estimate biomass distributions; and obtaining output of the second model as the biomass distribution of the fish.
23. The method of claim 21, wherein the one or more values for each detected fish comprise: a value that indicates a length between a first point on a particular fish of the fish and a second point on the particular fish.
24. The method of claim 23, comprising: detecting the first point and second point on the particular fish.
25. The method of claim 21, wherein the action comprises: adjusting a feeding system providing feed to the fish.
26. The method of claim 21, wherein the action comprises: sending data including the biomass distribution to a user device, wherein the data is configured to, when displayed on the user device, present a user of the user device with a visual representation of the biomass distribution.
27. The method of claim 21, comprising: obtaining data from the monocular underwater camera indicating a current operation status of the monocular underwater camera; and in response to obtaining the plurality of images from the monocular underwater camera, providing the plurality of images to the first model.
28. A non-transitory, computer-readable medium storing one or more instructions executable by a computer system to perform operations comprising: obtaining a plurality of images of fish captured by a monocular underwater camera; providing the plurality of images that were captured by the monocular underwater camera to a first model trained to detect one or more fish within the plurality of images; generating one or more values for each detected fish as a set of values; generating a biomass distribution of the fish based on the set of values; and determining an action based on the biomass distribution.
29. The medium of claim 28, wherein generating the biomass distribution of the fish based on the set of values comprises: providing the set of values to a second model trained to estimate biomass distributions; and obtaining output of the second model as the biomass distribution of the fish.
30. The medium of claim 28, wherein the one or more values for each detected fish comprise: a value that indicates a length between a first point on a particular fish of the fish and a second point on the particular fish.
31. The medium of claim 30, wherein the operations comprise: detecting the first point and second point on the particular fish.
32. The medium of claim 28, wherein the action comprises: adjusting a feeding system providing feed to the fish.
33. The medium of claim 28, wherein the action comprises: sending data including the biomass distribution to a user device, wherein the data is configured to, when displayed on the user device, present a user of the user device with a visual representation of the biomass distribution.
34. The medium of claim 28, wherein the operations comprise: obtaining data from the monocular underwater camera indicating a current operation status of the monocular underwater camera; and in response to obtaining the plurality of images from the monocular underwater camera, providing the plurality of images to the first model.
35. A computer-implemented system, comprising: one or more computers; and one or more computer memory devices interoperably coupled with the one or more computers and having tangible, non-transitory, machine-readable media storing one or more instructions that, when executed by the one or more computers, perform one or more operations comprising:
obtaining a plurality of images of fish captured by a monocular underwater camera; providing the plurality of images that were captured by the monocular underwater camera to a first model trained to detect one or more fish within the plurality of images; generating one or more values for each detected fish as a set of values; generating a biomass distribution of the fish based on the set of values; and determining an action based on the biomass distribution.
36. The system of claim 35, wherein generating the biomass distribution of the fish based on the set of values comprises: providing the set of values to a second model trained to estimate biomass distributions; and obtaining output of the second model as the biomass distribution of the fish.
37. The system of claim 35, wherein the one or more values for each detected fish comprise: a value that indicates a length between a first point on a particular fish of the fish and a second point on the particular fish.
38. The system of claim 37, wherein the operations comprise: detecting the first point and second point on the particular fish.
39. The system of claim 35, wherein the action comprises: adjusting a feeding system providing feed to the fish.
40. The system of claim 35, wherein the action comprises: sending data including the biomass distribution to a user device, wherein the data is configured to, when displayed on the user device, present a user of the user device with a visual representation of the biomass distribution.
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/716,797 | 2022-04-08 | ||
| US17/716,797 US12254712B2 (en) | 2022-04-08 | 2022-04-08 | Monocular underwater camera biomass estimation |
| US17/729,703 US12396442B2 (en) | 2022-04-26 | 2022-04-26 | Monocular underwater camera biomass estimation |
| US17/729,703 | 2022-04-26 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2023196654A1 true WO2023196654A1 (en) | 2023-10-12 |
Family
ID=86286301
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US2023/017970 (WO2023196654A1, Ceased) | Monocular underwater camera biomass estimation | | 2023-04-07 |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2023196654A1 (en) |
Patent Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20210368748A1 (en) * | 2020-05-28 | 2021-12-02 | X Development Llc | Analysis and sorting in aquaculture |
| US20220000079A1 (en) * | 2020-07-06 | 2022-01-06 | Ecto, Inc. | Acoustics augmentation for monocular depth estimation |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN119314031A (en) * | 2024-12-17 | 2025-01-14 | 浙江大学 | A method and device for automatically estimating the length of underwater fish based on a monocular camera |
| CN119314031B (en) * | 2024-12-17 | 2025-04-15 | 浙江大学 | Automatic underwater fish body length estimation method and device based on monocular camera |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| US12254712B2 (en) | Monocular underwater camera biomass estimation | |
| US12229937B2 (en) | Underwater camera biomass prediction aggregation | |
| US11983950B2 (en) | Entity identification using machine learning | |
| US20240428607A1 (en) | Fish biomass, shape, and size determination | |
| US12272169B2 (en) | Fish biomass, shape, size, or health determination | |
| US12455367B2 (en) | Ego motion estimation based on consecutive time frames input to machine learning model | |
| Kim et al. | Real-time visual SLAM for autonomous underwater hull inspection using visual saliency | |
| KR20240053658A (en) | Systems and methods for fish volume estimation, weight estimation, and analytic value generation | |
| US20230360423A1 (en) | Underwater camera biomass distribution forecast | |
| CN105404884B (en) | Image analysis method | |
| CN112598713A (en) | Offshore submarine fish detection and tracking statistical method based on deep learning | |
| US12396442B2 (en) | Monocular underwater camera biomass estimation | |
| CN109460754A (en) | A kind of water surface foreign matter detecting method, device, equipment and storage medium | |
| CN114830177A (en) | Electronic device and method for controlling the same | |
| WO2022271256A1 (en) | Automated feeding system for fish | |
| WO2023196654A1 (en) | Monocular underwater camera biomass estimation | |
| US20200134323A1 (en) | Information processing apparatus, control method, and program | |
| CN108496185A (en) | System and method for object detection | |
| Leonardi et al. | Deep learning based keypoint rejection system for underwater visual ego-motion estimation | |
| US12347187B2 (en) | Object detection system and an object detection method in aquatic environments | |
| WO2024205983A1 (en) | Biomass estimation using monocular underwater cameras | |
| CN109389543A (en) | Statistical method, system, computing device and storage medium for bus operation data | |
| US12322173B2 (en) | Enhanced object detection | |
| CN115482352A (en) | Method and apparatus for training machine learning algorithms | |
| US20250356649A1 (en) | End-to-end differentiable fin fish biomass model |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 23721157; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 23721157; Country of ref document: EP; Kind code of ref document: A1 |