Marine Imaging Workshop Abstracts

Marine Imaging Workshop/presentations/interactive

A non-conventional approach to image annotation using the largo tool in BIIGLE 2.0

Jaime S. Davies, University of Gibraltar

Amelia E.H. Bridges, University of Plymouth

The use of annotation platforms (e.g. BIIGLE 2.0) to record and store biological data from imagery and video is now common practice, offering major benefits in terms of consistent identification and data management. Image annotation for ecological purposes typically consists of two steps: drawing a bounding box around an individual and specifying an identification label for the annotation. Combined, these steps can be extremely time-consuming, creating a bottleneck in the annotation of large datasets and accelerating annotator cognitive fatigue given the constant, complex decision-making required (i.e. identification of individuals). Here, we share the experiences of two expert annotators working on a large (~7,200 images) image dataset, where annotation data were required at the highest taxonomic resolution possible. Initial annotations were allocated to broad, low-resolution taxonomic classes that required little taxonomic knowledge (e.g. sea stars, sea urchins, crabs, fish). The built-in largo tool in BIIGLE 2.0 allows all annotations with a certain label to be viewed at once. This tool was used on these broad, low-resolution taxonomic classes to check for mis-identification and to quickly refine annotations to a higher taxonomic resolution. In contrast to the traditional approach of having multiple annotators work on separate chunks of the data independently, we believe this largo-based approach not only increases accuracy by allowing experts to work on their taxonomic specialities, but also increases annotation efficiency given the reduced complexity of decision-making during the initial annotation phase. During this talk we will outline the pros and cons of this approach, and welcome feedback from the community to develop and document recommended protocols moving forward.
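As a rough illustration of this two-pass idea (and not of BIIGLE's internals), the refinement stage amounts to regrouping existing annotations by their broad label and overwriting only the label field once an expert has reviewed each largo grid; a minimal pandas sketch with hypothetical file and column names:

```python
# Illustrative sketch of the two-pass labelling idea (not BIIGLE code).
# Assumes a hypothetical CSV export with columns: annotation_id, image, label.
import pandas as pd

annotations = pd.read_csv("annotations_broad.csv")  # pass 1: broad classes only

# Pass 2: review each broad class as a group (as largo displays them) and
# record the refined identifications decided by the taxonomic expert.
refinements = {
    101: "Asteroidea: Henricia sp.",   # annotation_id -> refined label (hypothetical)
    102: "Echinoidea: Cidaridae",
}

mask = annotations["annotation_id"].isin(refinements.keys())
annotations.loc[mask, "label"] = annotations.loc[mask, "annotation_id"].map(refinements)
annotations.to_csv("annotations_refined.csv", index=False)
```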

Diving into science communication: Leveraging underwater imagery to engage new audiences in ocean science and conservation

Susan von Thun, MBARI

Cassy Burrier, Lila Luthy, Raúl Nava, Marike Pinsonnaeult, Susan von Thun

Underwater imagery is critical to engaging and inspiring audiences to learn about the ocean and the fascinating life that calls it home. For more than a decade, MBARI’s SciComm Team has leveraged our unparalleled deep-sea video archive to share stories about deep-sea animals and the people and technology that bring this imagery back to shore. We scour 30,000 hours of expertly annotated video to find imagery that will spark curiosity in our audiences on social media. Paired with clever captions and accessible, relatable, and inclusive narration, we're inspiring a new generation of ocean explorers and stewards.

In this workshop, MBARI’s SciComm Team will share what we’ve learned as we’ve amassed hundreds of millions of views and more than 800,000 followers and subscribers on our digital channels. We’ll also let participants try their hand at developing visually rich storytelling for social media audiences. We’ll have examples of imagery to practice with, and participants are also encouraged to bring examples of images or videos from their own work.

Educational Design for Marine Imaging: Introducing Computer Vision Basics and Python

Atticus Carter, University of Washington Seattle, School of Oceanography

Sasha Seroy PhD, Kristine Prado-Casillas BS, University of Washington Seattle, School of Oceanography

As careers and internships in marine imaging continue to grow, there is a notable lack of customized educational resources for onboarding and skill-building in this specialized field. To address this gap, we present a pilot course titled "Computer Vision across Oceanography," to be taught as a special topics course during the winter quarter of 2025 and made available online indefinitely for free. This course aims to equip students with the best practices and current research in marine imaging techniques, alongside intermediate Python programming skills. Our instructional approach is grounded in constructivist learning theory, emphasizing active, self-guided exploration. Key components of the course include the use of Google Colab to facilitate interactive, cloud-based learning environments where students can engage with Python syntax and functions relevant to marine imaging. Through a combination of flipped classroom structures, synchronous activities infused with active learning, and individualized final projects, students will be encouraged to apply their learning in practical, research-driven contexts. The course design prioritizes accessibility, ensuring that learners with varying levels of prior experience can achieve similar success through higher engagement with course resources. By analyzing data from surveys, student work, assessments, and focus group feedback, we aim to continuously refine the course to better support self-guided scientific inquiry and skill acquisition. We believe that this approach to teaching computer vision and Python within the context of marine imaging will be broadly applicable and beneficial across the earth sciences and other scientific domains.

Expanding AI annotation assistance in BIIGLE

Daniel Langenkämper, Biodata Mining Group, Faculty of Technology, Bielefeld University, Germany

Martin Zurowietz (Genome Informatics Group, Bielefeld University, Germany), Tim W. Nattkemper (Biodata Mining Group, Faculty of Technology, Bielefeld University, Germany)

As the quality, resolution, and volume of underwater visual data (i.e. photos and videos) keep increasing with the number of applied platforms as well as with progress in digital camera and storage technology, the bottleneck in the analysis and interpretation of marine visual data becomes more serious. For example, the recent trend to develop so-called ocean digital twins, to monitor and model marine habitats and anthropogenic impacts, requires quantitative and/or qualitative data extracted from such visual data, so this rich source of information can be integrated in this development. In this contribution we will present new features of the open source online visual data annotation tool BIIGLE that address this bottleneck problem by implementing different kinds of AI (artificial intelligence) support. First, the most obvious approach today is to apply machine learning for automatic detection and classification of objects (such as fishes, corals, sponges etc). While BIIGLE’s deep learning-based object detection method MAIA was implemented a few years ago and is still improving, we will present other new AI-related functions of BIIGLE. One example is the Segment Anything Model-based “Magic SAM” instrument in the image annotation tool, which automates the segmentation of objects during manual annotation. Another example is an improved version of the LARGO (LAbel Review Grid Overview) tool, which is used by many users for the posterior assessment and processing of manual or computational detection and classification results. To make this process much more efficient, we implemented new sorting functions and an outlier detection based on deep learning–derived morphological similarities. A new interface supports the import and export of annotation data in the COCO (Common Objects in COntext) format, so the interplay between manual annotation and review in BIIGLE and external machine learning applications is facilitated.
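For reference, the COCO interchange mentioned above is a plain JSON structure; a minimal sketch of writing a single bounding-box annotation in that format (file names, IDs, and the example category are illustrative):

```python
# Minimal COCO-format export sketch (illustrative values).
import json

coco = {
    "images": [
        {"id": 1, "file_name": "dive42_frame_000123.jpg", "width": 1920, "height": 1080}
    ],
    "categories": [
        {"id": 1, "name": "Porifera", "supercategory": "fauna"}
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 1,
            "bbox": [640.0, 300.0, 120.0, 95.0],  # [x, y, width, height] in pixels
            "area": 120.0 * 95.0,
            "iscrowd": 0,
        }
    ],
}

with open("annotations_coco.json", "w") as f:
    json.dump(coco, f, indent=2)
```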

FathomNet Database: An open-source image database and machine learning model repository for underwater artificial intelligence

Kevin Barnard, MBARI

Brian Schlining, MBARI (USA), brian@mbari.org; Lonny Lundsten, MBARI (USA), lonny@mbari.org; Giovanna Sainz, MBARI (USA), gsainz@mbari.org; Eric Orenstein, National Oceanography Centre (UK), Eric.Orenstein@noc.ac.uk; Erin Butler, CVision AI (USA), erin.butler@cvisionai.com; Benjamin Woodward, CVision AI (USA), benjamin.woodward@cvisionai.com; Kakani Katija, MBARI (USA), kakani@mbari.org

Since its launch in 2021, FathomNet Database has served as a public, open-source database for the aggregation of expertly-labeled ocean imagery. Over the past two years, FathomNet Database has grown significantly in both size and functionality. Today, FathomNet is home to over 100,000 images and nearly 300,000 localizations, made possible by its community of over 600 individuals, with contributing institutions including MBARI, NOAA Ocean Exploration, National Geographic Society (NGS), Ocean Networks Canada (ONC), University of Hawaii, and University of Plymouth. The FathomNet website now offers new community features to support taxonomic ID discussions, annotation history tracking, ORCID integration, and mechanisms to follow contributions related to topics (animals, community members, marine regions) of interest. Additionally, the FathomNet Model Zoo provides a platform to host, organize, and serve machine learning (ML) models trained using FathomNet data. FathomNet’s open-source ecosystem of tools and services is designed to easily integrate into existing workflows at any point of the data lifecycle. We will present an overview of the FathomNet Database and its related services, with a focus on improvements released in Version 1 (July 2023). We will also discuss several general findings regarding ML usability and interoperability for ocean science with visual data based on our user-centered design process. FathomNet is an evolving public resource, and we hope to encourage further engagement from oceanographic community members as we build toward a dataset representing all ocean life.
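Programmatic access to these records is possible through the community's Python client (fathomnet-py); the sketch below assumes its images.find_by_concept helper and record fields, so treat the exact names as assumptions and consult the package documentation:

```python
# Sketch of pulling FathomNet records for one concept (assumes the
# fathomnet-py client; function and field names may differ between versions).
from fathomnet.api import images

records = images.find_by_concept("Bathochordaeus")  # expertly labeled images
print(f"{len(records)} images returned")
for rec in records[:3]:
    # Each record carries the image URL plus its localizations (bounding boxes).
    print(rec.url, len(rec.boundingBoxes or []))
```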

FathomNet Portal: Accelerating the processing of ocean visual data using artificial intelligence and community engagement

Benjamin Woodward, CVision AI

Elizabeth Corvi, MBARI/FathomNet; Laura Chrobak, MBARI; Susan Poulton, ODL; Tapas Dwivedi, ODL; Brian Tate, CVision AI; Erin Butler, CVision AI; Katherine Parker, CVision AI; Katy Croff Bell, ODL; Kakani Katija, MBARI

In order to fully explore our ocean and effectively steward the life that lives there, we need to scale up our observational capabilities. Underwater imaging, a major sensing modality for monitoring biodiversity, is being deployed widely; however, the community faces a data analysis backlog that artificial intelligence may be able to address. How can we create workflows, data pipelines, and hardware/software tools that will enable novel research themes to expand our understanding of the ocean and its inhabitants in a time of great change? FathomNet seeks to address these community needs by providing a central hub for researchers using imaging, AI, open data, and hardware/software; creating data pipelines from existing image and video data repositories; providing project tools for coordination; leveraging public participation and engagement via gamification; and sharing data products widely. These software solutions include FathomNet Portal, an online, collaborative tool for end-to-end AI-assisted processing of ocean imagery. The Portal connects users to FathomNet’s community-supported data pipelines, making it easy for researchers to process their own imagery as well as contribute back to open-source repositories to improve and accelerate automated analysis of underwater visual data worldwide. Users can directly implement FathomNet’s open-source database of machine learning models and expertly labeled ocean imagery, the FathomNet Database, into their analysis workflows. Researchers can also access a global community of ocean enthusiasts to assist with labeling and verification of annotations through the community science game, FathomVerse. Portal helps users train their own machine learning models by providing intuitive interfaces for annotation, streamlined access to the necessary computational power, and accessible tools to analyze the output. By incorporating state-of-the-art artificial intelligence and best practices in human-computer interactions, FathomNet Portal accelerates the analysis of ocean visual data to reveal actionable insights.

Image based Essential Ocean Variables: Integrating information to realize the power of big data in the Global Ocean Observing System

Henry A. Ruhl, Monterey Bay Aquarium Research Institute

As part of an interactive session, we will introduce basic concepts for operationalizing imaging from research frameworks towards contributing to international biodiversity information use, including ample question and discussion time with attendees. Biology and ecosystem information can now be readily collected across a wide range of organism sizes. These variables are at the heart of many statutory monitoring requirements for the marine environment, from protected species and understanding natural hazards to managing fisheries, offshore energy, and mineral industries. The Global Ocean Observing System (GOOS) and its Biology and Ecosystems (BioEco) Essential Ocean Variable (EOV) panel are working to add detail to implementation planning. Marine imaging can now supply EOV information across most of the BioEco EOVs. Here we will introduce the purpose and scope of GOOS BioEco EOVs. This will include discussion of feasibility and impact, and example applications of EOV data from imaging, such as seagrass and coral cover and the distribution and abundances of plankton, fishes, and benthic invertebrates. We will also cover data handling needs and tools for integrating data at national and international scales, including how to find and implement metadata standards, processing pipelines, and tools. Potential outcomes include improved global biogeochemical and ecological modeling and data to assess progress towards Convention on Biological Diversity targets.

Immersive VR ‘Dives’ for Research, Education, and Patient Care

James Lindholm, Department of Marine Science, CSU Monterey Bay

Kameron Strickland (Moss Landing Marine Laboratories), Corin Slown (Department of Biology and Chemistry, CSU Monterey Bay), and Carrie Bretz (Department of Marine Science, CSU Monterey Bay)

It is an irony of the 21st century that the ocean on which so much of human society depends (including for climate moderation, food safety, and recreation) remains fundamentally inaccessible to many communities, where stakeholders often have no direct experience with the sea and its ecosystems, regardless of its proximity. Advances in immersive virtual reality (VR) technology now make it possible for us to not only expose stakeholders to the wonders of the undersea world, but more importantly to engage them with respect to how scientific research is conducted underwater, and how it can be used to inform policy and management in support of marine conservation objectives. Our Immersive VR Dive Program has to date involved three related approaches: research, education, and patient care. We compared traditional SCUBA diver underwater visual census (UVC) surveys to data extracted from VR imagery collected simultaneously. Results indicated that many key attributes of the fish communities on rocky reefs/kelp forests of Central California were comparable between the two approaches, including species richness, diversity, and composition as well as selected measures of density, opening opportunities to engage non-divers in data collection. Dating back to the COVID-19 pandemic shelter-in-place period, we developed and implemented a curriculum package in which small teams of students (both high school and undergraduate) work together to extract data from ‘transects’ conducted in VR. The curriculum has been engaged across California, including several traditionally underserved communities, allowing students to learn how to conduct scientific research and to ask a variety of ecological and management questions. We have also utilized VR transects in a pilot project offering hospital patients the opportunity to relieve often-excruciating pain through relaxing virtual dives in a choose-your-own-adventure format. Sitting in their beds, patients immerse themselves in kelp forests, coral reefs, and shipwrecks, with new dives being added every month.

Long term stewardship of underwater video data - Is the vision of storing all imagery data at full resolution forever sustainable?

Mashkoor Malik, NOAA Ocean Exploration

Megan Cromwell (NOAA National Ocean Service), Ashley N. Marranzino (UCAR/NOAA Ocean Exploration Affiliate), Adrienne Copeland (NOAA Ocean Exploration), Sarah Groves (UCAR/NOAA Ocean Exploration Affiliate), Anna Lienesch (CISESS/NOAA NCEI Affiliate), Caitlin Ruby (CIRES/NOAA NCEI Affiliate), Kirsten Larsen (NOAA NCEI), Vidhya Gondle (STC/NOAA NCEI Affiliate), Julie Rose (julie.rose@noaa.gov)

Due to revolutionary changes in imaging resolution over the last decade, data rates now exceed 20 GB/minute (in the case of 8k video). While imagery is essential to understanding the marine environment, the question of what should be preserved in perpetuity remains unanswered. The multifold increase in data size makes it extremely difficult to retain video data at the highest resolution onboard the vessel, transfer data to shore, analyze the data, and finally archive the data. We propose to highlight current obstacles in large-volume imagery pipelines and to solicit community feedback on long-term, optimized stewardship solutions. We review three approaches to address this challenge. (1) Preserve all video at the highest resolution: assuming technology will eventually improve, the first approach is to preserve the highest possible resolution. Waiting for improved storage and processing has been a working assumption in defining current data management protocols. However, in the absence of these technical solutions, the hardware currently available will force decisions that operators make arbitrarily and that may not represent a well-thought-through approach. Technical solutions may include compression algorithms and/or optimized archival procedures and formats. (2) Information-based decision-making: assess the information content of imagery and preserve only the video data whose information would be lost if rendered at lower resolution. With the latest advancements in artificial intelligence, AI can be used to evaluate the imagery and decide whether data should be preserved. This second approach will require community consensus to define what is considered information, and will be processing intensive. (3) Arbitrary solutions: this approach selects solutions based on available resources and is perhaps the most commonly deployed approach today. Pragmatic solutions have included preserving the highest resolution at only 1 frame per second, or preserving user-defined clips at the highest resolution. As there is currently little to no consensus on defining the usefulness of video for a particular application, this approach is at best arbitrary.
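To make the scale of the problem concrete, a back-of-the-envelope comparison of full-rate video retention against a 1 frame-per-second still-image strategy (all numbers are illustrative assumptions, not measured values):

```python
# Back-of-the-envelope storage estimate (illustrative assumptions only).
GB_PER_MINUTE = 20          # ~8k video data rate quoted above
HOURS_PER_DAY = 8           # assumed camera run time per day
FRAME_GB = 0.05             # assumed size of one full-resolution still (~50 MB)

video_tb_per_day = GB_PER_MINUTE * 60 * HOURS_PER_DAY / 1000
stills_tb_per_day = FRAME_GB * 60 * 60 * HOURS_PER_DAY / 1000  # 1 frame per second

print(f"Full-resolution video: {video_tb_per_day:.1f} TB/day")
print(f"1 fps stills:          {stills_tb_per_day:.1f} TB/day")
print(f"Reduction factor:      {video_tb_per_day / stills_tb_per_day:.1f}x")
```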

MBARI's Video Annotation and Reference System (VARS): A Guide to Setup and Use

Brian Schlining, MBARI

Kevin Barnard, Nancy Jacobsen Stout, Kyra Schlining, Lonny Lundsten, Kris Walz, Larissa Lemon, Megan Bassett

The Monterey Bay Aquarium Research Institute’s (MBARI) Video Annotation and Reference System (VARS) has been used for creating and managing video and image annotations for over 20 years. The central tenets of VARS (a controlled vocabulary and a centralized archive of annotations) have proven extremely effective for generating quantitative and qualitative information from images and video. Over 551 peer-reviewed publications have been written using data from VARS. MBARI continues to invest in the development of VARS to meet the evolving needs of researchers.

Join us for this interactive workshop to learn how to use VARS in your own lab. We'll provide a comprehensive overview of the software infrastructure and guide you through the process of setting up and running a VARS system on your own computer or within your lab.

Squidle+: Revolutionising Marine Image Management, Annotation and Reuse with Advanced Collaboration and Automation

Ariell Friedman, Greybits Engineering

Jacquomo Monk: Institute for Marine and Antarctic Studies, University of Tasmania; Oscar Pizarro: Norwegian University of Science and Technology; Stefan Williams: Australian Centre for Field Robotics, University of Sydney

Squidle+ (squidle.org) is a marine image data management, discovery, and annotation platform providing access to millions of images and annotations. It supports Australia’s National Understanding Marine Imagery facility and is the world’s largest repository of openly accessible, georeferenced seafloor images with annotations. Squidle+ offers a centralised platform for managing, discovering, and annotating marine image data, with advanced workflows for collaborative analysis and expedited data delivery. Adhering to FAIR principles, it supports granular discovery, access, and sharing, and fosters scientific research, collaboration, communication, and educational outcomes. It indexes imagery from existing cloud storage repositories, reducing redundant image storage, streamlining imports, and enabling the discoverability of new survey data for various marine image data collection programs. The platform features a sharing framework that facilitates collaboration between users and external algorithms. Algorithms are set up as "users" interacting through the API backend, enabling automated processing pipelines that connect independent machine learning (ML) researchers to real-world problems with high-quality, validated training data. Conversely, it provides the marine science community with access to algorithms that reduce annotation time and improve data quality. Unlike traditional annotation tools tied to specific ML pipelines, Squidle+ offers unprecedented flexibility for ML integration. Squidle+ supports multiple standardised or user-defined vocabularies for annotation and includes tools to translate between vocabularies. This flexibility allows users to construct datasets targeting specific scientific questions and facilitates data reuse, cross-project syntheses, large-scale machine learning training, and broad summaries like National State of Environment Reporting. In this presentation, we will showcase how Squidle+ tools and features facilitate advanced collaborative annotation workflows, enabling third-party integrations, big-picture reporting, and automated annotation of marine imagery with transparency, quality control, and flexibility to accommodate diverse user needs.

Using imagery data to monitor Marine Protected Areas in UK offshore waters

Rob P. Harbour, Joint Nature Conservation Committee

The UK Governments are committed to the restoration and conservation of the rich diversity found within our marine habitats. A key tool to achieving this is through a strong, ecologically coherent network of Marine Protected Areas (MPAs) that are well managed, understood and supported by stakeholders. The Joint Nature Conservation Committee (JNCC) plays a pivotal role in delivering scientific advice to the UK Governments for MPAs located in the UK’s offshore waters. Through the UK MPA Monitoring Programme, JNCC collect empirical data to monitor, assess, and understand the health of the UK offshore seabed. The programme aims to detect and monitor change over time in the habitats and features within each MPA, attributing changes to causes where possible, and to assess the effectiveness of current management strategies. Marine imagery has rapidly become a key tool for monitoring the health of our MPA network. Using drop cameras and sleds, we collect high quality information on benthic communities and broadscale habitats over large areas, in a cost-effective and non-destructive way. Imagery also provides the opportunity for data collection from habitats where traditional extractive sampling techniques (e.g., grabbing or coring) are impractical. Utilisation of still imagery and video in the assessment of monitoring objectives requires careful planning to ensure that statistically robust conclusions may be drawn from the data. A step-by-step overview of how marine ecologists at JNCC plan, undertake, and analyse marine imagery surveys will be presented, including a case study from Wyville Thomson Ridge, a rocky plateau situated in the Atlantic Ocean. A challenging site from a monitoring perspective, it features depths of 300-1000m with a ridge that divides the warmer waters of the Rockall Trough from the cooler waters of the Faroe-Shetland Channel. The site supports diverse biological communities including sponges, corals, and beds of featherstars.

Utilizing and Expanding Tools from the Cinema Industry to Further Scientific Exploration and Research

Jeremy Childress, The Sexton Corporation

Making the ocean, especially the deep sea, accessible through advanced cinematic imaging technology is crucial for the progress of ocean conservation, exploration, and scientific missions. Over the past decade, increased support for socioecological movements has demonstrated that media and technology are major drivers of ecological change and awareness. To stay relevant amidst continuous technological advances and growing demand for interactive media, society and science must adapt to higher expectations for media realism, modularity, and quality. To reach a broader audience and advance scientific research and exploration in the context of global environmental changes, developing new technologies to deliver high-resolution imagery through innovative mediums is more critical than ever. Fortunately, cinematic camera technology has become more affordable and accessible to the average consumer over the past decade. These advancements provide high-resolution imaging, high-speed recording, low-light sensitivity, and advanced image-processing software at retail prices, aiding scientists and design agencies in meeting these heightened expectations. However, integrating these systems for innovative scientific applications requires unique collaborative efforts between scientists and design agencies to incorporate existing technologies into new and creative formats. Engineering and design agencies must innovate in this evolving field to develop new technologies that are affordable and useful for both science and exploration. The Sexton Corporation has been collaborating with various clients to utilize cinematic technologies for underwater exploration, 3D modeling, virtual reality capture, fisheries surveys, oceanographic sensing, and ecosystem monitoring for over a decade. In this talk, Sexton will provide an overview of the challenges and lessons learned from working in this field, termed "cinemascience." The audience will gain a deeper understanding of the current state and future direction of the field, technologies, and integrations, hopefully sparking new and innovative ideas.

Marine Imaging Workshop/presentations/talks

A Visualization Tool for Reproducible Image Preprocessing in Ecological Research

Tobias Ferreira, National Oceanography Centre - UK

Eric Orenstein, Jennifer Durden, Colin Sauze, Loic Van Audenhaege, Mojtaba Masoudi. All co-authors share the same affiliation: National Oceanography Centre - UK

Image preprocessing is essential for preparing data for AI models and ensuring trustworthy biodiversity metrics. Properly selecting and documenting preprocessing steps is crucial for reproducible biodiversity research. However, current software solutions are fragmented and application-specific, requiring multiple tools and platforms, and are not designed for ecologists. To address this gap, the PAIDIVER project is developing an image processing workflow builder and visualization tool specifically designed for ecologists. Our workflow manager enables users to create preprocessing workflows and ensures reproducibility by meticulously recording each step in a standardized, interoperable format. Workflow visualization is provided through a lightweight web application, accessible online for testing and constructing image preprocessing workflows. The web application features an intuitive drag-and-drop interface, allowing users to easily add, remove, and rearrange processing steps while observing the output on test images. Each step can be customized with parameters and settings, providing flexibility while maintaining simplicity. Our tool emphasizes a user-friendly interface and experience, allowing users to visualize, adjust, and preview intermediate and final outputs at each stage before applying them to larger datasets. We will eventually release a Dockerized version of the environment, enabling users to run the software on any available local machine or high-performance computing (HPC) system. Furthermore, the tool provides comprehensive documentation and tutorials, making it accessible even to those with limited technical expertise. We will also release two curated image sets from contrasting environments (pelagic and benthic), complete with metadata, hosted at BODC/CEDA and bundled with the software tool. By combining an intuitive interface with powerful functionality, PAIDIVER aims to make image preprocessing accessible, reproducible, and efficient for ecological research.
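The kind of step-by-step record such a builder produces can be as simple as an ordered list of named operations and parameters; a minimal, hypothetical sketch (not PAIDIVER's actual schema):

```python
# Hypothetical example of recording a preprocessing workflow for reproducibility
# (step names and parameters are illustrative, not the PAIDIVER schema).
import json

workflow = {
    "name": "benthic-survey-preprocessing",
    "version": "0.1",
    "steps": [
        {"operation": "resize",            "params": {"width": 2048, "height": 1536}},
        {"operation": "colour_correction", "params": {"method": "grey_world"}},
        {"operation": "crop_overlay",      "params": {"pixels_from_bottom": 60}},
    ],
}

with open("workflow.json", "w") as f:
    json.dump(workflow, f, indent=2)  # the saved file documents exactly what was run
```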

A review of seafloor annotation methods used in deep-sea benthic diversity studies

Bárbara de Moura Neves, Fisheries and Oceans Canada

Bárbara de Moura Neves (1), Marion Boulard (1), Rylan Command (1), Merlin Best (2), Katleen Robert (3), Marilyn Thorne (4), Chris Yesson (5). 1: Fisheries and Oceans Canada, Newfoundland and Labrador Region; 2: Fisheries and Oceans Canada, Pacific Region; 3: School of Ocean Technology, Fisheries and Marine Institute of Memorial University of Newfoundland; 4: Fisheries and Oceans Canada, Québec Region; 5: Institute of Zoology, Zoological Society of London

The use of seafloor imagery to study deep-sea benthic environments has increased considerably in the past decade. Methods and tools used to perform imagery annotation (i.e., extracting data from the imagery) have also evolved and depend on study objectives and available equipment. Here, we review seafloor imagery annotation protocols and methods used in deep-sea benthic diversity studies. Our objective is to produce a centralized resource for researchers working in the field. The literature review consisted of searching Google Scholar, Web of Science, and Scopus for 12 keyword combinations that include “benthic”, “camera”, “deep sea”, “diversity”, “epifauna”, “imagery”, “fauna”, “field of view”, and “megafauna”. We have restricted our search to the years 2000-2023 and to the first 40 publications for each of the keyword combinations. Manuscripts were imported into the reference manager software Mendeley. We then scanned the list obtained from the literature search and used the authors’ expert knowledge to complete relevant gaps in the results and to remove papers that fell outside of the review’s scope (e.g., SCUBA surveys). The literature is currently being scanned for >20 parameters, including: 1) whether an annotation software was used (if so, which one), 2) whether images or videos were used, 3) how taxa were identified (i.e., is it specified, are there specific ID guides?), 4) presence/absence of lasers, 5) field of view calculation method, 6) sampling unit selected (i.e., temporal vs. spatial), 7) whether classification schemes were used, 8) whether both biota and substrate were annotated, etc. As expected, our preliminary observations indicate an array of different methods and terms used, which is particularly challenging for new projects/programs. Our review will provide a summary of the different methods and strategies used in the field and hopefully better inform decision-making related to seafloor imagery annotation projects.

Aquascope: automated imaging and machine learning to understand and predict plankton community dynamics in lakes

Francesco Pomati, Eawag: Swiss Federal Institute of Aquatic Science and Technology, Aquatic Ecology

Ewa Merz (Eawag & UCSD), Pinelopi Ntetsika (Eawag), Marta Reyes (Eawag), Stefanie Merkli (Eawag), Luis Gilarranz (Eawag), Stuart Dennis (Eawag), Marco Baity-Jesi (Eawag)

Plankton communities are highly diverse and dynamic and influence important ecosystem properties at the local and global scale. They are sensitive indicators of ecosystem health and global change's effects on ecosystem services. Algal blooms, for example, are an emergent property of a complex interaction network of producers and consumers, often triggered by climate change and eutrophication. Traditional plankton monitoring based on sampling and microscopy fails to capture high-frequency dynamics and complexity in plankton communities. Here we present automated in-situ tracking and modelling of plankton interaction networks in lakes based on automated underwater microscopy, machine learning and time-series analysis. The imaging approach is embedded into a floating platform, comprising sensors for profiling water column physics and chemistry, and meteorological information. Sensor data are integrated with lake hydrodynamic models, and the whole platform is suitable for stable long-term deployments for research and water quality monitoring. A dual magnification dark-field microscope and associated image processing and classification provide real-time, high-frequency plankton data from ~10 μm to ~ 1 cm, covering virtually all the components of the planktonic food web and their morphological diversity. We find that vision transformers, with targeted augmentation, constitute the most robust models for high-accuracy plankton classification and low sensitivity to temporal dataset shifts. Comparing imaging data from the field to traditional sampling and microscopy revealed a general overall agreement in plankton diversity and abundances, despite some limitations in detection (e.g. small phytoplankton) and classification (e.g. rare and morphologically similar taxa). Using high-frequency time-series of plankton abundances, or their daily growth rates, environmental conditions and state-space reconstruction methods for modelling chaotic dynamics, our data allow us to: map interactions between taxa, reverse engineer mechanisms that drive algal blooms or biodiversity change, and forecast ecosystem properties.
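As an indication of what such a classifier involves, a generic sketch of fine-tuning a vision transformer with orientation-aware augmentation in PyTorch follows; it is not the authors' pipeline, and the class count and augmentations are assumptions:

```python
# Generic sketch: fine-tuning a vision transformer for plankton classes
# (torchvision model; augmentations and class count are illustrative).
import torch
import torch.nn as nn
from torchvision import transforms
from torchvision.models import vit_b_16, ViT_B_16_Weights

NUM_CLASSES = 50  # assumed number of plankton classes

augment = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(180),          # plankton have no preferred orientation
    transforms.ColorJitter(brightness=0.2),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

model = vit_b_16(weights=ViT_B_16_Weights.IMAGENET1K_V1)
model.heads.head = nn.Linear(model.heads.head.in_features, NUM_CLASSES)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()
# Training would then iterate over a DataLoader of augmented plankton images.
```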

Combining in situ imagery with acoustic surveys for a comprehensive view of pelagic ecosystems

Ben Grassian, Woods Hole Oceanographic Institution

Heidi Sosik, Andone Lavery, Megan Ferguson, Sidney Batchelder, E. Taylor Crockford (all WHOI)

Ocean midwaters are the largest habitat on Earth but are among the least understood marine environments due to sampling challenges. This project focuses on combining concurrently collected deep-towed imagery and shipboard acoustic measurements in the epi- and mesopelagic environment to interpret zooplankton and nekton distributions within their detailed environmental context and the scattering from mixed assemblages. We have trained a machine learning detector model for the automated detection of 13 zooplankton functional groups from a towed shadowgraph imaging system. The zooplankton detection model currently achieves an F1 score of 81% on our validation image set. We present results of taxa-specific zooplankton distributions derived from the image data and examine their distributions with respect to scattering layers detected at different frequencies within the upper 1000 m of a pelagic ecosystem in the slope waters offshore of the New England Shelf. We derived zooplankton biometrics from the image data that are useful for feature-based (e.g. size) discriminations and for generating forward scattering predictions. We describe animal zonation within the environment and varying diel habits across species and size classes from the acoustic, optical, and environmental data. Collectively this work will further enable and automate the use of optical techniques for midwater surveys and the interpretation of acoustic scattering returns from mixed zooplankton populations.
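For readers less familiar with the metric, the F1 score quoted above is the harmonic mean of detection precision and recall; a small worked example with made-up counts:

```python
# Worked example of the F1 metric (counts are made up, not the study's results).
true_positives = 810   # detections matching an expert-validated zooplankton target
false_positives = 150  # spurious detections
false_negatives = 230  # targets the detector missed

precision = true_positives / (true_positives + false_positives)
recall = true_positives / (true_positives + false_negatives)
f1 = 2 * precision * recall / (precision + recall)

print(f"precision={precision:.2f}, recall={recall:.2f}, F1={f1:.2f}")
```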

Deep sea imaging considerations for geometric and color consistent environmental monitoring

Dennis Giaya, Northeastern University

Aniket Gupta (1), Pushyami Kaveti (1), Jasen Levoy (1), Victoria Preston (1), Zhiyong Zhang (1), Dan Fornari (2), Hanumant Singh (1). 1: Northeastern University; 2: Woods Hole Oceanographic Institution

Deep-sea marine ecosystems are increasingly impacted by environmental stressors, including rising sea temperatures, plastic pollution, and habitat erosion. To quantify bulk and rate changes of an environment from a historical baseline, we must preserve a snapshot of the 3D structure and color for spatiotemporal analysis. Modern reconstruction methods, including state-of-the-art structure-from-motion techniques, neural radiance fields, and Gaussian Splatting, are generally benchmarked against in-air scenes. We demonstrate that for underwater scenes, these methods are insufficient for geometric- and color-consistent reconstructions due to outstanding challenges in robust in situ camera calibration and lighting pattern recovery. Extracting quantitative spatial data from imagery requires precise camera calibration to ensure accurate geometric and color measurements. We focus on calibration challenges specific to deep-sea underwater imaging involving practical integration of cameras in depth-capable housings and lighting fixtures onto imaging platforms. First, we show how intrinsic and extrinsic camera calibration parameters can change over the course of field missions, using calibration imagery acquired in a tank as well as benthic seafloor imagery from a stereo camera pair. Intrinsic calibration parameters can change between platform deployments whenever cameras need to be serviced within underwater housings. Extrinsic calibration parameters may change over the course of a single deployment due to flexing of the underwater platform, requiring online calibration methods to compensate. Second, using hydrothermal vent imagery acquired at the northern East Pacific Rise, we show how color rendering is impacted by different lighting patterns and how uncertainty in lighting can lead to color inconsistency across deployments. Geometric- and color-consistency challenges are critical areas for hardware, data-processing, and methodological innovations to translate imagery into scientifically viable data products. This is especially pressing as novel scientific expeditions aim to track both natural and man-made changes. From our analyses, we suggest several best practices to consider for image collection, processing, and curation.
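A minimal sketch of the intrinsic-calibration step referred to above, using OpenCV and checkerboard imagery from a tank session (board geometry and file paths are illustrative):

```python
# Minimal intrinsic camera calibration sketch with OpenCV
# (checkerboard size and image paths are illustrative).
import glob
import cv2
import numpy as np

PATTERN = (9, 6)  # inner corners of the checkerboard
objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2)

obj_points, img_points = [], []
for path in glob.glob("tank_calibration/*.png"):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, PATTERN)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

rms, K, dist, _, _ = cv2.calibrateCamera(obj_points, img_points, gray.shape[::-1], None, None)
print("RMS reprojection error:", rms)
print("Camera matrix:\n", K)  # re-estimate after any change to housing or optics
```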

Deep sea invertebrate bodies viewed through Microcomputed Tomography (microCT)

Freya Goetz, Smithsonian Institution

Marine invertebrates exhibit broad diversity in anatomy and adaptations to their environment, especially in the midwater and on the deep-sea floor. Darkness and the viscosity of water foster physical adaptations not found elsewhere. This has resulted in vast soft-body variations ideal for explorations that could inspire new soft robotic designs. Microcomputed tomography (microCT) is an exciting tool for studying these adaptations, especially now that microCT scanners are becoming more accessible and resolution capabilities are improving. Animals best suited for these analyses range in size from roughly that of a 5-gallon bucket down to a few millimeters. Resolution depends on animal size: the smallest animals can yield voxel sizes around 1.6 micrometers, approximately the thickness of a red blood cell. In practice, we cannot see individual cells but can differentiate between tissue types like muscle and nerve. Data can be generated exponentially faster than with traditional histology and without distorting anatomical structures. By staining with compounds containing metals such as tungsten, osmium, and iodine, soft tissue contrast can be enhanced to visualize internal anatomy. This means that animals can be viewed in 3D without dissection, maintaining the most life-like state. The resulting dataset of each scan is a stack of grayscale image slices through which one can virtually fly along all three axes of the animal. Data analysis possibilities are vast and open to questions involving physiology, 3D modeling, and anything structural that can be visualized or measured. I will highlight practical approaches to specimen preparation, acquisition, data analysis, and data management, along with examples of various study systems, including leech internal anatomy and hyperiid amphipod visual systems.

Developing a Portable, Cost-Effective Multimodal Imaging System for Underwater Mapping and Monitoring with Remotely Operated Vehicles

Giancarlo Troni, Monterey Bay Aquarium Research Institute (MBARI)

Sebastián Rodríguez-Martínez, MBARI; Eric Martin, MBARI; Paul Roberts, MBARI; Kevin Barnard, MBARI; Bastian Muñoz

Underwater exploration is crucial in advancing our understanding of marine environments and ecosystems. Employing multimodal imaging sensors enables us to conduct comprehensive studies with multidimensional capabilities. This work presents the ongoing developments of the CoMPAS Lab at the Monterey Bay Aquarium Research Institute (MBARI). Our focus lies in utilizing cost-effective sensors to improve perception, particularly in seafloor mapping and navigation with remotely operated vehicles (ROVs) and autonomous underwater vehicles (AUVs). A multimodal sensing platform has been developed for benthic surveys, incorporating downward-looking stereo cameras, a forward-looking sonar (FLS) for acoustic imaging, a push-broom hyperspectral camera, a laser-sheet projector, and a suite of low-cost navigation sensors. Each component provides an additional dimension to our understanding of the underwater environment. Leveraging deep learning techniques, we are improving our perception based on sensors such as FLSs and laser scanners to enhance the quality and resolution of seafloor mapping data and enable long-term monitoring of those areas. The hyperspectral system shows promise in advancing our understanding of underwater ecosystems through enhanced imaging capabilities, with more than one hundred short-bandwidth images per experiment. Additionally, the stereo cameras enable the creation of photomosaics and point cloud representations of the seabed while improving navigation through a simultaneous localization and mapping (SLAM) framework. The performance and capabilities of these systems have been validated through multiple deployments with platforms such as the MiniROV from MBARI, as well as in the exploration of the Salas and Gomez Ridge, Chile, with the Schmidt Ocean Institute’s ROV SuBastian. Through this work, we aim to validate and demonstrate the potential for scalable approaches using cost-effective sensors, setting the foundations for their integration in scalable autonomous platforms for widespread underwater mapping and monitoring, contributing to a greater understanding of marine ecosystems, and fostering advancements in oceanographic research and conservation.

Economies of Scale: The GEOMAR Seafloor Photogrammetry Workflow

Tom Kwasnitschka, GEOMAR Helmholtz Centre for Ocean Research Kiel

Armin Bernstetter, Jacob Eichler, Marco Rohleder, Oliver Jahns, Malte Eggersglüss, Carl von Brandis, Markus Schlüter, Jan Fleer - all affiliated with GEOMAR Helmholtz Centre for Ocean Research Kiel

Photogrammetry is a transformative methodology in seafloor research, both in terms of its potential for situational awareness and for the quantitative assessment of otherwise inaccessible outcrops and habitats. Recent work has focused on the economies of scale connected to particularly large, synoptic, and holistic surveys of study areas covering up to several acres, and the challenges connected to such work. We will report on recent developments at GEOMAR in this field, including the finalization of a suite of immersive fisheye cameras, metrologic water-compensating survey cameras, and a corresponding 48-channel LED video strobe lighting system. We will elaborate on the construction and first field trials of an all-new dedicated underwater vehicle to carry this sensor suite. We will show progress on immersive virtual fieldwork and quantitative measurements on photogrammetric models using the Unreal Engine in our ARENA2 visualization lab. Lastly, we will give an overview of the progress in applying this methodology to a host of ongoing marine geoscientific mapping projects.

Enhancing Pelagic Ecosystem Research with a Standardized Image Preprocessing Software Tool

Mojtaba Masoudi, Ocean BioGeosciences, National Oceanography Centre, Southampton, UK

Eric Orenstein (1), Loic Van Audenhaege (1), Tobias Ferreira (2), Colin Sauze (2), Jennifer Durden (1). 1: Ocean BioGeosciences, National Oceanography Centre, Southampton, UK; 2: Ocean Informatics, National Oceanography Centre, Liverpool, UK

When studying pelagic ecosystems, marine scientists often grapple with the challenge of managing and analyzing large volumes of imagery. These datasets are collected in diverse habitats and, often, from several imaging systems. Each of these sampling regions and instruments may require unique preprocessing protocols to effectively prepare the data for downstream ecological analyses. Practitioners are often left to piece these pipelines together from previous projects and disparate software tools, frequently resulting in disjointed and poorly documented workflows. Unlike benthic imaging, which often deals with more static environments, pelagic imaging must contend with the dynamic nature of open water habitats, where rapidly moving organisms and fluctuating light conditions complicate data capture and analysis. This can lead to inconsistencies, inefficiencies, potential errors in ecological data analysis, and an inability to compare biodiversity indicators between studies. To address these issues, we have developed a suite of software tools designed to standardize and streamline the selection of image preprocessing steps crucial for pelagic research. Our toolkit enables researchers to automate crucial tasks like adjusting brightness variations, detecting edges, and more. Furthermore, its detailed documentation helps users grasp the range of preprocessing choices available, letting them make well-informed decisions that suit their research requirements. By ensuring consistent preprocessing, the toolkit provides robust support for addressing a range of ecological inquiries in midwater, enabling more reliable ecological computations and supporting reproducible biodiversity assessments. This unified approach is expected to enhance the efficiency and reliability of data analyses in marine ecosystem studies.
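As a concrete example of the kind of step the toolkit standardizes, a brightness-normalization and edge-detection sketch in OpenCV is shown below; the parameter values are illustrative and not toolkit defaults:

```python
# Illustrative preprocessing steps: local contrast normalization then edge detection
# (parameter values are examples, not toolkit defaults).
import cv2

frame = cv2.imread("pelagic_frame.png", cv2.IMREAD_GRAYSCALE)

# Contrast-limited adaptive histogram equalization evens out brightness variations.
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
normalized = clahe.apply(frame)

# Canny edge detection highlights organism outlines for downstream segmentation.
edges = cv2.Canny(normalized, threshold1=50, threshold2=150)

cv2.imwrite("pelagic_frame_edges.png", edges)
```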

Enhancing Situational Awareness in Underwater Robotics with Real-Time Multi-Camera Mapping

Pushyami Kaveti, Norwegian University of Science and Technology (NTNU) and Northeastern University, Boston

Ambjørn Grimsrud Waldum (NTNU), Oscar Pizarro (NTNU), Hanumant Singh (NU), Martin Ludvigsen (NTNU)

Technological advances in robotics, particularly Autonomous Underwater Vehicles (AUVs) and Remotely Operated Vehicles (ROVs), have revolutionized deep ocean exploration, enabling access to unseen, inaccessible, and hazardous environments. Robust and reliable state estimation and situational awareness are crucial for both autonomous and remotely piloted operations, facilitating informed decision-making in real time. Visual imaging systems are integral to this advancement, with AUVs and ROVs typically equipped with multiple cameras. In practice, ROV pilots must manage multiple camera feeds to make operational decisions, which can be cumbersome, lead to fatigue, and require substantial training. Integrating vision-based navigation and mapping into these platforms enhances operator support and enables various downstream tasks, including online planning, virtual environment creation, scene comprehension, and potential semantic analysis. It also significantly impacts broader marine applications such as habitat monitoring, environmental sampling, and understanding ocean ecosystems. However, automating visual state estimation in underwater environments presents challenges such as visual degradation due to backscatter, illumination variability, and featureless surfaces. Other challenges include a lack of synchronization among cameras, difficulty in processing multiple streams rapidly due to computational constraints, and the variability and sometimes unknown nature of controllable camera parameters such as zoom, pan, and tilt. Existing visual mapping systems are often designed for monocular or stereo cameras and do not scale well for multi-camera systems. Our work addresses these needs with a comprehensive solution encompassing sensor synchronization and advanced algorithms. We effectively fuse multi-camera data within a Simultaneous Localization and Mapping (SLAM) framework for robust and accurate state estimation. We achieve real-time dense 3D mapping of underwater environments by leveraging modern reconstruction and rendering techniques. We present mapping results from ROV operations for subsea infrastructure inspection, showcasing the sensor setup, collected data, and generated dense maps, highlighting the practical significance and the role of real-time mapping capabilities for improving marine exploration and situational awareness.

FathomVerse: Gaming for ocean exploration

Kakani Katija, Bioinspiration Lab, Research and Development, MBARI

Lilli Carlsen (MBARI), Emily Clark (MBARI), Giovanna Sainz (MBARI), Joost Daniels (MBARI), Kevin Barnard (MBARI), Ellemieke Berings (&ranj Serious Games), Meggy Pepelanova (&ranj Serious Games), GAF van Baalen (&ranj Serious Games)

In order to fully explore our ocean and effectively steward the life that lives there, we need to increase our capacity for biological observations; massive disparities in effort between visual data collection and annotation make it prohibitively challenging to process this information. State-of-the-art approaches in automation and machine learning cannot solve this problem alone, and we must aggressively build an integrated community of educators, taxonomists, scientists, and enthusiasts to enable effective collaboration between humans and AI. FathomVerse, a mobile game designed to inspire a new wave of ocean explorers, teaches casual gamers about ocean life while improving machine learning models and expanding annotated datasets (FathomNet). Of the three billion gamers worldwide, up to 70% say they care about the environment, and FathomVerse taps into this engaged community with innovative gameplay and rich graphics that draw players into the captivating world of underwater imagery and cutting-edge ocean science. Here we will share our process of designing FathomVerse, discuss early successes from the v1 launch on May 1, 2024, and highlight some areas of future focus. In less than one month after the v1 launch, 8,000 players from 100 different countries generated more than three million annotations, which are currently being used to generate consensus labels and retrain machine learning models. Through FathomVerse, we hope to activate global audiences in high school and up, with the goal of increasing public awareness and inspiring empathy for ocean life.
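One simple way to turn many player answers into a single training label is a majority vote with a minimum-agreement threshold; the sketch below is illustrative and not FathomVerse's actual aggregation logic:

```python
# Minimal consensus-label sketch (illustrative; not the FathomVerse algorithm).
from collections import Counter

def consensus(player_labels, min_votes=5, min_agreement=0.7):
    """Return a consensus label if enough players agree, else None."""
    if len(player_labels) < min_votes:
        return None
    label, count = Counter(player_labels).most_common(1)[0]
    return label if count / len(player_labels) >= min_agreement else None

votes = ["Aurelia", "Aurelia", "Aurelia", "Chrysaora", "Aurelia", "Aurelia"]
print(consensus(votes))  # -> "Aurelia" (5/6 agreement)
```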

Ghana Marine Species Diversity App

Peter Teye Busumprah, Ministry of Fisheries and Aquaculture Development

This oral presentation addresses UN Ocean Decade Challenges 8, 9 & 10, which focus on digital representation of the ocean, creating skills, knowledge, and technology for all, and changing human relationships with the ocean. It also focuses on SDGs 9 & 14, which are centered on industry, innovation, and infrastructure, and on life below water. In Ghana, the fisheries sector is a crucial contributor to the national economy, but the country's fish stocks are facing declining trends due to overfishing and degradation. Accurate monitoring and analysis of fish species are essential to inform sustainable fishing practices and conservation efforts. However, traditional methods of data collection are often limited by inadequate resources and expertise. To address this challenge, we developed "Sea Rock Base App", a mobile-based citizen science app designed specifically for local fishermen in Ghana. The app enables fishermen to collect and submit data on fish species they encounter during their daily activities. The app includes a user-friendly interface for recording species information, photographs, and geolocation data. The submitted data are then analyzed using machine learning algorithms to identify patterns and trends in fish species distribution and abundance. The app also provides real-time feedback to fishermen on the most common species found in their fishing grounds, allowing them to make informed decisions about their fishing activities. Pilot testing of the app in four coastal communities in Ghana revealed high user adoption rates and accurate data collection. The app has also led to the identification of new fish species and areas of high conservation concern. By empowering local fishermen with data analysis tools, Sea Rock Base App aims to promote sustainable fishing practices, improve fisheries management, and support the conservation of Ghana's marine biodiversity. This citizen science approach has the potential to be replicated in other fisheries contexts worldwide, contributing to the global effort to safeguard marine ecosystems and ensure food security for future generations.

Harnessing 3D Geometrical Features for Automated Coral Habitat Classification

Larissa M. C. de Oliveira, 1. Earth & Ocean Lab, Department of Geography, University College Cork, Ireland; 2. Environmental Research Institute (ERI), University College Cork, Ireland

Patricia Schontag (3), Judith Fischer (3), Timm Schoening (3). 3: Data Science Unit (DSU), GEOMAR Helmholtz Centre for Ocean Research Kiel, Germany

Cold-water coral reefs are among the most structurally complex marine ecosystems, supporting vast biodiversity. Recent advancements in photogrammetry, such as Structure-from-Motion (SfM) and Gaussian splatting, have facilitated the creation of high-resolution 3D models of these environments. However, integrating 3D structural complexity into artificial intelligence (AI)-based classification methods remains an emerging area of research. While established 3D classification methodologies have proven effective in terrestrial applications, their potential for underwater contexts has yet to be fully explored. Measures of 3D complexity provide ecological indicators for coral reef assessments, but their use beyond the ecological landscape remains uncertain. This study investigates the application of geometrical features to enhance the automated classification of 3D reconstructions of coral reefs. By computing geometrical features at multiple scales, seabed details are captured at varying spatial resolutions, allowing for a data-driven optimisation of 3D scene classification parameters. We developed four feature sets incorporating geometric features derived from point cloud covariance matrices and primary features such as 3D coordinates and colour. These feature sets were passed through a gradient-boosted tree model to assess performance and feature importance. Our results indicate that primary features alone are insufficient for accurate classification, whereas incorporating geometrical features significantly enhances performance. Furthermore, the combination of geometrical features with colour information provided only marginal improvements. These findings underscore the robustness of geometrical features for precise 3D classification and habitat mapping in complex 3D environments like coral reefs.
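The covariance-based features mentioned above are typically eigenvalue ratios computed over each point's local neighbourhood; a generic sketch follows, in which the neighbourhood radius, input files, and model settings are all assumptions:

```python
# Generic sketch: eigenvalue-based geometric features per point, fed to a
# gradient-boosted tree (radius, feature set, and input files are illustrative).
import numpy as np
from scipy.spatial import cKDTree
from sklearn.ensemble import HistGradientBoostingClassifier

def geometric_features(points, radius=0.05):
    tree = cKDTree(points)
    feats = []
    for p in points:
        nbrs = points[tree.query_ball_point(p, radius)]
        if len(nbrs) < 3:
            feats.append([0.0, 0.0, 0.0])
            continue
        evals = np.sort(np.linalg.eigvalsh(np.cov(nbrs.T)))[::-1]  # l1 >= l2 >= l3
        l1, l2, l3 = evals / evals.sum()
        feats.append([(l1 - l2) / l1,   # linearity
                      (l2 - l3) / l1,   # planarity
                      l3 / l1])         # sphericity
    return np.asarray(feats)

points = np.load("reef_cloud_xyz.npy")     # N x 3 coordinates (example file)
labels = np.load("reef_cloud_labels.npy")  # per-point habitat classes (example file)
X = np.hstack([geometric_features(points), points])  # geometric + primary features
clf = HistGradientBoostingClassifier().fit(X, labels)
```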

High-resolution bathymetry coupled with 3D models of hydrothermal vents at the Aurora vent field - Gakkel ridge, Arctic Ocean

Tea Isler, Alfred Wegener Institute Helmholtz-Center for Polar and Marine Research Bremerhaven

Christopher R. German [Woods Hole Oceanographic Institution], Vera Schlindwein [Alfred Wegener Institute Helmholtz-Center for Polar and Marine Research Bremerhaven; University of Bremen], Tom Kwasnitschka [GEOMAR Helmholtz Centre for Ocean Research Kiel], Michael Jakuba [Woods Hole Oceanographic Institution]

At the slowest-spreading ridge on Earth, the Gakkel Ridge, evidence of hydrothermal activity was first observed in 2001 and later confirmed in 2014 by video recordings using the deep-tow camera survey system OFOS (Ocean Floor Observing System). This provided evidence for submarine venting under ice-covered oceans, an analog for possible habitability on ice-covered ocean worlds such as Saturn's moon Enceladus or Jupiter's moon Europa. In July 2023, the AUV/ROV Nereid Under Ice (NUI) was deployed from the RV Polarstern and discovered seven new submarine vents at the Aurora hydrothermal field. Bathymetry and optical data acquired during the dives allowed for a deeper understanding of this topographically complex environment. The dives primarily targeted the sampling of rocks and fluids, with cameras mainly used for navigation and the identification of new vents and species. From the bathymetry map we are now able to identify morphological features at a higher resolution than was previously possible, while the opportunistic video footage is used to create a scaled 3D model of the newly discovered hydrothermal vents using structure-from-motion techniques. This approach allows for further investigations (e.g., habitat mapping) which would otherwise not be possible using classic ship-based multibeam technologies. We present a new high-resolution bathymetric map of the Aurora vent field which, coupled with 3D models, provides a deeper understanding of the morphologically complex study area. This study highlights the usefulness of opportunistic data, especially when surveying in extreme environments, where data collection requires time-consuming operations, expensive devices, and experienced operators. The final product shows the results achieved during the most recent expedition to the Aurora vent field and also highlights gaps in areas that are yet to be explored.

High-resolution time series of plankton and carbon from underwater imaging

Klas Ove Möller , Institute of Carbon Cycles, Helmholtz-Zentrum Hereon

Ankita Vaswani, Katharina Kordubel, Daniel Blandfort, Götz Flöser (all Institute of Carbon Cycles, Helmholtz-Zentrum Hereon) and Saskia Rühl (Plymouth Marine Laboratory)

High-resolution snapshots of plankton biodiversity, behavior and carbon flux from an underwater imaging time series: Plankton, organic particles and aggregates ("marine snow") form the basis of marine ecosystems and play a fundamental role in the food webs and the biological carbon cycle of the ocean. For these reasons, it is necessary to understand the processes that influence the spatial and temporal distribution as well as the origin, characteristics and biodiversity of these organisms and particles. However, these processes are still insufficiently resolved and quantified, since marine ecosystems are characterized by immense variability and traditional measurement methods are limited in their resolution. To overcome this limitation we deployed an underwater high-resolution imaging observatory in the North Sea to investigate the dynamics and characterization of plankton and particles and to quantify the balance between biological productivity and carbon export at an unprecedented resolution (from seconds to months). To unlock the full potential of our observations, we developed new AI-based image analysis pipelines that go beyond classifying organisms into taxonomic groups to also quantify functional traits and biological phenomena from images. We here present highlights and snapshots of our unique high-resolution time series, including a one-year time series of carbon flux in a coastal environment and its underlying drivers, the biodiversity, seasonal variation and vertical migration patterns of plankton organisms, and the increase of Noctiluca and its impact on food web dynamics. These high-resolution observations of the interactions between physics, biology and geochemistry at unprecedented spatial and temporal scales contribute significantly to our understanding of the productivity, resilience and carbon storage capacity of marine ecosystems as a function of climatic changes (rise in temperature, acidification, oxygen minimum zones) and anthropogenic influences and their future developments (nutrients, economic use, loss of biodiversity).

Imaging and taxonomic identification of meiofauna: progress on the BLUEREVOLUTION project

Catherine Borremans , IFREMER (Biology and Ecology of dEEP marine ecosystems Unit (BEEP)/Deep Sea Lab)

Daniela Zeppilli (IFREMER), Abdesslam Benzinou (ENIB), Kamal Nasreddine (ENIB), Valentin Foulon (ENIB), Anthonin Martinel (ENIB), Edwin Dache (IFREMER)

The Earth’s ocean is the largest three-dimensional living space on our planet. It is crucial to life as we know it, yet we know less about the entire seafloor than we do of the surface of the moon. Infaunal benthic communities (meiofauna) comprise some of the most diverse groups of organisms on Earth, and only a very small amount of this diversity has been described by science. Despite our partial exploration of this vast domain, all marine habitats, including the deepest trenches, experience direct or indirect human impacts. Thus, the scientific community faces the need for fast and accurate monitoring studies and for baseline datasets against which to measure future changes and biodiversity losses. In this context the “Biodiversity underestimation in our bLUe planEt: artificial intelligence (AI) REVOLUTION in benthic taxonomy” (BLUEREVOLUTION) project aims at developing in- and ex-situ methods using automated or semi-automated imaging of meiofauna linked with AI-based taxonomic classification tools for high-throughput analysis of benthic diversity. Different imaging techniques - from low to high resolution - were assessed (e.g. flow imaging, focus stacking brightfield microscopy, 3D-fluorescence imaging) and developed (‘MeioCam’) to identify the best configuration(s) according to resolution (capacity for taxonomic identification), throughput, quantitative power and data volume (Foulon et al., 2024). In addition, computer vision-based methods were investigated to support the need for faster recognition and morphometric measurement of meiofauna (e.g. nematodes), as well as to compensate for the lack of available imaging datasets to train artificial intelligence models. Approaches based on deep learning and GANs are proposed in this work. The presentation will give an updated overview of the developed imaging pipeline that will ultimately allow quantitative and functional data on benthic communities to be generated at unprecedented speeds. It will produce a standardized method for building open-access reference databases together with fast and reliable tools for impact assessments and biodiversity surveys. Ref: Foulon, V., et al. 2024. Taxonomic identification of meiofauna through imaging – a game of compromise. Submitted to Limnology and Oceanography: Methods.

Improving Video Annotation Quality Assurance and Quality Control

Ashley N. Marranzino , University Corporation for Atmospheric Research | NOAA Ocean Exploration Affiliate

Sarah Groves ( University Corporation for Atmospheric Research | NOAA Ocean Exploration Affiliate), Samuel Candio (NOAA Ocean Exploration), Vanessa Steward (Ocean Networks Canada)

The increasing volume of video data on deepwater habitats provides scientists with critical information needed to close gaps in our understanding of these poorly understood environments. However, analysis of these videos requires appropriate annotations of features of interest, with the quality of the data analysis dependent on the quality of video annotations. NOAA Ocean Exploration is dedicated to exploring the unknown oceans and routinely collects video data using remotely operated vehicles (ROVs) during expeditions aboard NOAA Ship Okeanos Explorer to provide scientists with a view of unexplored deepwater habitats. All video collected during these expeditions is archived with the National Centers for Environmental Information and made publicly accessible through the Ocean Exploration Video Portal. NOAA Ocean Exploration also partners with Ocean Networks Canada (ONC), using the cloud-based annotation platform SeaTube V3 to host video and ROV sensor data from expeditions. NOAA Ocean Exploration and its partners use SeaTube V3 to annotate ROV videos for biological organisms, geological features, engineering events, and other features of interest during dives and post-expedition. Annotations can also be used to flag content for education and public outreach purposes. While the collaborative features of SeaTube V3 can allow for a more thorough and accurate annotation of video data, they also raise questions about the accuracy and completeness of those annotations. NOAA Ocean Exploration and ONC have been working together to develop and test sets of features to help determine the robustness of a set of annotations and to develop workflows for the analysis and archival of annotations once they are deemed “complete”. This presentation will highlight some of these features along with next steps the programs plan to take.

Improving marine imaging development by acknowledging contrasting values of data

Jennifer M. Durden , National Oceanography Centre

A key recurring theme in many Marine Imaging Workshop discussion sessions is the desire to share and align imagery data. Considerable effort is being put towards enabling practicalities, such as hosting and transferring large volumes of imagery and developing metadata standards, yet a lack of discussion about our contrasting values towards imagery data across the marine imaging community holds us back from realising the desired collaboration. The ways that we value data influence how, with whom and why we share and collaborate, and what we expect from those actions. Marine imaging is a multidisciplinary community, made up of engineers, software developers, scientists, data managers and other users, operating in research, government and industry; each of these groups values data (and metadata) differently. For example, a group that spent significant effort generating human-derived annotations may value those image-derived data differently than a software developer looking to train an algorithm. Other drivers of the ways that we value data include research funders, relationships with industry, contractual obligations and academic publishing requirements. I explore the different ways that data are valued across members of the marine imaging community and where conflicts arise that prevent us from achieving the desired sharing, data alignment and governance, and suggest ways to surmount these issues.

Interactive machine learning: reflections on how a human-in-the-loop approach can increase model training efficiency

Dr Laurence H. De Clippele , School of Biodiversity, One Health & Veterinary Medicine, University of Glasgow

De Clippele L.H., Dominguez Carrió C., Vad J., Vlaar T., Balsa J., Beckmann L.M., Boolukos C., Burgues I., Clark H.P., Cook E., de la Torriente Diez A., Easton B., Ferreira T.C., Guzzi A., Harris L., Kaufmann M., Mendoza J.C., Riley T.G., Villalobos V.I.

Due to technical advances in imaging technologies and cost-effective data storage solutions, there is a global increase in the amount of underwater image data. Since it is too time-consuming to analyse these data manually, marine scientists are increasingly interested in using convolutional neural networks to automate their annotations. However, deep learning is often considered a “black box” because it is difficult for the user to understand how deep neural networks make their decisions. In addition, if the user lacks an understanding of machine learning fundamentals, it is unlikely that they will be able to develop models that are effective and produced in a time-efficient manner. Interactive, “human-in-the-loop” machine learning is suggested here as a novel approach to increase the user’s understanding of fundamental machine learning concepts, drastically increase model training efficiency and accuracy, and make deep learning less of a black box. In June 2024, a Marine Animal Forest COST Action training school was organised at the University of Glasgow. This training school brought together early career researchers from across UK and European institutions. Through training models on a variety of Remotely Operated Vehicle and timelapse image datasets, reflections were gathered on how interactive machine learning can contribute to increasing the efficiency of model training. The main conclusions concerned the importance of user expertise with the target object (i.e. the species), the complexity of the habitat, the quality of the data, the abundance of the species, and the psychological effect of observing a change in model performance. Other interesting observations were made regarding the lack of model improvement when trying to “recover” from mistakes made during training, and regarding how to deal with uncertainty when creating a training dataset. A practical workflow is proposed to encourage more efficient model training decisions by users new to machine learning.

Inverse Physically Faithful Underwater Imaging

David Nakath , Kiel University and GEOMAR Helmholtz Centre for Ocean Research Kiel

Kevin Köser (Kiel University and GEOMAR Helmholtz Centre for Ocean Research Kiel)

All work presented at the Marine Imaging Workshop depends in some way on underwater imagery, which is governed by a unique image formation model. Its complexity often makes automatic processing a difficult task, so knowing the basics is greatly beneficial for practitioners defining downstream tasks on underwater data. This talk will detail the geometric and radiometric distortions that arise when cameras operate directly in water. The former emerge when a light ray passes interfaces between media with different optical densities (air--glass--water), while the latter are caused by attenuation and scattering effects inside the medium itself. Furthermore, homogeneous illumination by the Sun, inhomogeneous artificial illumination, or a mixture of both contribute another dimension of complexity to be explored. Physically based rendering approaches are particularly well suited to tackle problems in underwater vision, due to the dualism between models originally devised in physical oceanography and the medium models nowadays typically employed in physically-based raytracing. This enables us to capture underwater imagery and simultaneously measure the optical properties of the water using an established sensor suite. We can then directly synthesize images with the same medium properties and verify our rendering systems. This allows us to provide reliable synthetic image data focusing on specific problems to train, develop, and test algorithms with. With the advent of massively parallel computation on GPUs in conjunction with inverse physically-based rendering methods, it has, conversely, become possible to infer the inherent optical properties of the water body directly from images using an analysis-by-synthesis approach. The same approach can be applied to flat ports, dome ports and light sources. Being able to calibrate and simulate refraction, light and the optical properties of the water directly enables a wide range of applications such as image restoration, shadow removal and light removal directly on submerged 3D models.
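
To make the radiometric part of the formation model concrete, the following is a minimal illustrative sketch of a widely used simplified underwater image formation model, in which an observed pixel is the attenuated scene radiance plus backscatter that saturates with distance. It is a common textbook formulation, not necessarily the exact model used by the speakers; the coefficient and variable names are assumptions for illustration.

```python
import numpy as np

def underwater_image(J, z, beta_d, beta_b, B_inf):
    """Simulate an underwater image from clear-scene radiance.

    J:      clear-scene radiance, shape (H, W, 3)
    z:      per-pixel camera-to-scene distance in metres, shape (H, W)
    beta_d: per-channel attenuation coefficients, shape (3,)
    beta_b: per-channel backscatter coefficients, shape (3,)
    B_inf:  backscatter (veiling light) colour at infinite distance, shape (3,)
    """
    z = z[..., None]                                   # broadcast over colour channels
    direct = J * np.exp(-beta_d * z)                   # attenuated direct signal
    backscatter = B_inf * (1.0 - np.exp(-beta_b * z))  # scattering adds veiling light
    return direct + backscatter
```

Inverse (analysis-by-synthesis) approaches fit the parameters of such a model (here beta_d, beta_b and B_inf) so that rendered images match observed ones, which is how inherent optical properties can be inferred directly from imagery.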

Learning Ultra-high-throughput Fluorescence Microscopy-in-flow for Marine Phytoplankton Observation

Jianping Li , 1.Shenzhen Institute of Advanced Technology, CAS, Shenzhen, China; 2.University of CAS, Beijing, China.

Zhisheng Zhou 1,2; Zhenping Li 2,1. 1.Shenzhen Institute of Advanced Technology, CAS, Shenzhen, China; 2.University of CAS, Beijing, China.

The development of automated in-situ analysis technologies for photosynthetically active phytoplankton cells and colonies in natural seawater is of great significance for biological oceanography and marine ecology monitoring. However, the composition of natural seawater is highly complex. The size range of phytoplankton spans at least 3 orders of magnitude, from single cells <1μm to large diatoms or colonies >500μm. In addition, seawater also contains countless non-phytoplankton particles. These facts present enormous challenges in terms of specificity, sensitivity, and spatial resolution for imaging flow cytometers (IFC) such as CytoSense and IFCB to observe phytoplankton in situ. Ocean observation clearly favours high-throughput methods that analyze more seawater in less time to extract more realistic phytoplankton information. Since most phytoplankton are tiny, IFCs usually adopt slow flow with high magnification to obtain sufficient resolution for imaging phytoplankton. However, to enhance imaging throughput, IFCs should use higher flow rates with lower magnifications to gain increased seawater sampling capability, though this very likely comes at the cost of imaging resolution and quality. The compromise between imaging resolution and observation accuracy of current IFCs essentially limits their net throughput. To balance this trade-off, we are trying to combine "low-magnification imaging plus computational super-resolution" in a light-sheet fluorescence microscopy-in-flow system named FluoSieve. By building up a large-scale phytoplankton fluorescence image dataset, we are training a super-resolution CNN called IfPhytoSR (in-flow phytoplankton super-resolution). The preliminary results indicate that the IfPhytoSR model can restore the poorer-resolution, lower-quality images acquired by a 5× objective lens into much better counterparts, as if they were acquired by the FluoSieve system equipped with a 20× lens, for downstream tasks such as recognition or measurement, theoretically achieving a ~40-fold gain in imaging throughput. In this talk, we will report on the progress of this research.

Marimba: Efficient Marine Image Data Processing with FAIR Principles

Chris Jackett , CSIRO

Chris Jackett (CSIRO Environment), Kevin Barnard (MBARI), Nick Mortimer (CSIRO Environment), David Webb (CSIRO NCMI), Aaron Tyndall (CSIRO NCMI), Franzis Althaus (CSIRO Environment), Bec Gorton (CSIRO Environment), Ben Scoulding (CSIRO Environment)

Advancements in underwater imaging technologies have greatly enhanced marine exploration and monitoring, resulting in substantial volumes of image data. Efficient management, processing, standardization, and dissemination of this data remain challenging. Marimba is a Python framework co-developed by CSIRO and MBARI that addresses these challenges by facilitating the FAIR (Findable, Accessible, Interoperable, and Reusable) processing of marine image datasets. Marimba is built around three core constructs: Project, Pipeline, and Collection. A Project encompasses the entire data processing workflow; Pipelines provide isolated environments for each data processing stage, encapsulating all necessary logic and operations; Collections aggregate groups of data for isolated processing. Marimba offers a robust CLI and API for flexible user interaction and scripting in scientific imaging projects. It supports project structuring, file and metadata management, integrates with the iFDO standard, and provides a standard library to manage common image and video processing tasks. Dataset packaging capabilities include comprehensive processing logs that capture the full dataset processing provenance, file manifest generation, and dataset statistical summaries, with support for distribution to S3 storage buckets. Managing the entire lifecycle of marine image datasets, Marimba upholds FAIR principles, transforming raw data into structured, usable, and shareable assets for marine environmental research.

Marine snow particle imaging resolves the mechanism and magnitude of coastal carbon export

Colleen Durkin , MBARI

Danelle Cline (MBARI), Duane Edgington (MBARI), Sachithma Edirisinghe (U. Maine), Margaret Estapa (U. Maine), Nils Haentjens (U. Maine), Christine Huffard (MBARI), Paul Roberts (MBARI), Henry Ruhl (MBARI), Fernanda Lecaros Saavedra (MBARI), Melissa Omand (URI), Sebastian Sudek (MBARI)

Ocean waters are filled with marine snow: particles made of fecal pellets, aggregates, and other detritus generated by marine life. These particles are the primary food source for many deep-sea ecosystems, and are a key mechanism for ocean carbon uptake and sequestration. Constraining the spatial, temporal, and compositional variability of sinking marine snow is a fundamental challenge for both ocean biogeochemistry and deep-sea biology. We image individually resolved particles collected in samplers and also by in situ instruments to estimate the quantity of carbon export in the ocean and its ecological drivers. We apply methods in computer vision and machine learning to automate the classification of large imaging datasets into ecologically meaningful categories of marine snow. These particle classes can then be used to estimate carbon biomass and export. Coordinated observations made in Monterey Bay, California demonstrate how a suite of imaging systems can expand the scale of particle flux measurements while also resolving the mechanisms driving its variability. We deployed traditional sediment traps to collect physical samples of sinking particles, and used chemical measurements of carbon flux to ground-truth estimates generated by particle imaging. These ground-truth measurements were related to in situ particle imagers, including camera systems attached to long-range autonomous underwater vehicles, remotely operated vehicles, and neutrally buoyant floats, as well as camera systems moored on the seafloor. Together, these temporally sustained and/or spatially distributed observations will enable us to quantify carbon export in dynamic coastal environments where the ecological source and fate of sinking carbon is difficult to trace.

Object Detection for quantitative analysis of deep-sea benthic megafauna distribution: making the best of video transects

Nils Piechaud , Institute of Marine Research, Bergen, Norway

Heidi Meyer¹, Pål Buhl-Mortensen¹, Genoveva Gonzalez-Mirelis¹, Gjertrud Jensen¹, Yngve Klungseth Johansen¹, Anne Kari Sveistrup¹, Rebecca Ross¹ 1: Institute of Marine Research, Bergen, Norway

The MAREANO program maps habitats on the Norwegian seabed by analysing a large dataset of underwater imagery collected over many years. Counting individual objects and animals in videos is a common technique to sample the seabed, and its automation (partial or complete) with Computer Vision (CV) and Machine Learning (ML) would greatly help quantitative benthic ecology. Yet, many practical problems must be solved before it can reliably generate data on species distribution and abundance. An important drawback of automated annotations is the difficulty for Object Detection (OD) models to account for contextual information in the video, where the behaviour of an object over multiple frames can inform its identification. We investigated how recent innovations and open-source libraries could help address this challenge and enable an OD model to quickly generate the data required for mapping. Seabed images were analysed manually in Biigle to obtain bounding box annotations, which were used as a training set for various OD models (V8m, V8X, Fathomnet and Fathomnet’s Megalodon). Predictions from these models were used to count the number of individuals of selected taxa in videos, and these counts were compared to the manual counts. We used additional techniques (tracking, deflector classes, zonal counting) to refine and filter the predictions and maximize performance. F1 scores in training can reach 0.95 for target taxa. Automated individual counts per video can also be within 95% of the manual counts. Performance can, however, drop dramatically in challenging visibility conditions, prompting the need to manually review and filter the predictions to ensure consistency and reliability. These results show that, in the right conditions, CV can perform well and be useful to enumerate target taxa. However, the need to manually review the results remains a bottleneck and we discuss possible ways around it.
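
As a rough sketch of the detect-track-count idea described above (not the MAREANO pipeline itself), the snippet below uses an Ultralytics YOLOv8 model with its built-in tracker and counts unique track IDs per taxon; the weights file, source video and confidence threshold are placeholders, and naive counts like these would still need the manual review discussed in the abstract.

```python
from collections import defaultdict
from ultralytics import YOLO

model = YOLO("best.pt")                 # detector fine-tuned on Biigle exports (placeholder)
seen = defaultdict(set)                 # taxon name -> set of track IDs

# Stream frames, detect and track, and accumulate unique individuals per class.
for result in model.track(source="transect.mp4", conf=0.5, stream=True):
    if result.boxes.id is None:         # the tracker may return no IDs on some frames
        continue
    for cls, tid in zip(result.boxes.cls.tolist(), result.boxes.id.tolist()):
        seen[model.names[int(cls)]].add(int(tid))

counts = {taxon: len(ids) for taxon, ids in seen.items()}
print(counts)
```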

Ocean on a tabletop: A virtual reality arena for long-term migration/mapping of plankton behavior in current and future oceans

Manu Prakash , Stanford University

Gravity Machine team, Prakash Lab, Stanford University

Marine plankton exhibit diel vertical migration, with vertical displacement scales from several tens to hundreds of meters. Even at the scale of small phytoplankton and zooplankton (100 μm to a few mm), the interaction of this vertical swimming behavior with hydrodynamics affects the large-scale distribution of populations in the ocean and is thus an important component of understanding ocean ecology. However, concurrently observing organismal physiology and behavior is challenging due to the vast separation of scales involved. Resolving physiological processes requires sub-cellular (micron) resolution, while tracking freely swimming organisms implies vertical displacements of several meters. We present a simple solution to this problem in the form of a “hydrodynamic treadmill” incorporated into a table-top, scale-free vertical tracking microscope. We use this method to study the behavior of freely swimming marine plankton, both in the lab and on board a research vessel, revealing a rich space of dynamic behavioral states in marine micro-organisms. We will present datasets from five different expeditions covering all major oceans of the world. This effort has culminated in one of the largest plankton behavioral datasets on record to date.

Online Learning of Appearance Models to Enable Real-Time Visual Tracking of Arbitrary Marine Animals

Levi Cai , MIT and WHOI

Nathan McGuire (WHOI), Roger Hanlon (MBL), T. Aran Mooney (WHOI), Yogesh Girdhar (WHOI)

Collection of video and co-located telemetry information about marine animals is crucial for understanding their behaviors and even establishing conservation practices. Divers and tags are the two most commonly used approaches to gathering this type of data, but both are mostly limited to surface-dwelling species and are difficult to scale. Autonomous underwater vehicles (AUVs) equipped with cameras and edge computers directly integrated into their control systems show increasing promise for enabling collection of this type of data without the use of tags or divers. However, there is a dearth of well-labeled visual data accessible in the marine domain for many species. In this work, we propose the use of a class of visual algorithms, known as semi-supervised learning and tracking approaches, to estimate online appearance models of never-before-seen visual objects. These approaches require a scientist to initially provide a bounding box or visual template of an animal, after which tracking can proceed autonomously. We demonstrate how to incorporate these models into AUV platforms to enable tracking of marine animals, in situ and in real time, for which labeled data may not exist. Additionally, this allows us to simultaneously gather video and co-located telemetry information about them. Furthermore, we discuss strategies for deploying these types of systems at a larger scale and argue the importance of continued innovation and development of AUV platforms that are equipped with modern edge devices and stereo-vision deeply integrated into their control systems.
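
The "initialise from a single bounding box, then track online" idea can be illustrated with a minimal sketch using OpenCV's CSRT tracker as a stand-in for the learned appearance models described in the talk; this assumes opencv-contrib-python is installed, and the video path is a placeholder.

```python
import cv2

cap = cv2.VideoCapture("dive_video.mp4")
ok, frame = cap.read()

# The scientist supplies one bounding box (x, y, w, h) around the animal of interest.
bbox = cv2.selectROI("select animal", frame)

tracker = cv2.TrackerCSRT_create()
tracker.init(frame, bbox)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    found, bbox = tracker.update(frame)     # online update of the appearance model
    if found:
        x, y, w, h = map(int, bbox)
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("tracking", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```

On an AUV, the per-frame bounding box would additionally feed the vehicle's control loop so the platform keeps the animal in view.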

ROVs for Science & Communications: The Making of the Documentary “All Too Clear: Beneath the Surface of the Great Lakes”

Yvonne Drebert , Inspired Planet Productions & Boxfish Robotics

Zach Melnick, Inspired Planet Productions & Boxfish Robotics

Remotely operated vehicles (ROVs) enable us to explore previously inaccessible underwater environments and observe animal behaviours in unprecedented ways, all while incurring zero animal mortalities. ROVs also facilitate the study of hard-to-access underwater archaeological sites, thereby enriching our cultural heritage. Recent advances in ROV and camera technology allow us to “swim” further, deeper, and longer than with older models, while viewing 4K results in real time from the surface. After more than 20 years creating documentary film and television, our team has taken a deep dive into the world of ROVs as tools for both filmmaking and research. For our most recent documentary series, “All Too Clear: Beneath the Surface of the Great Lakes,” we spent more than 150 days underwater with the Boxfish Robotics “Luna” ROV. The results are a compelling underwater TV series, as well as an archive of wildlife footage that has provided scientists with new insights into the spawning, feeding, and schooling behaviours of North American Great Lakes fishes, including the first-ever recording of lake whitefish spawning in the wild. As an added bonus, in June 2023, we discovered the wreck of the steamer "Africa," lost to the depths in 1895. The story received international media attention, further raising the profile of the freshwater species and environments under the waves of the Great Lakes. We’re excited to share the successes and challenges of this work, as well as what we have learned about techniques, gear, post-processing, and other tools to optimize our chances of capturing animal behaviours. Footage example: https://vimeo.com/871465906/4caed4807a

Rate-A-Skate – Deep learning for individual recognition of Flapper Skates

John Halpin , The Scottish Association for Marine Science

Joe Marlow¹, Steven Benjamins¹, Jane Dodd², Thomas Wilding¹ (1 - Scottish Association for Marine Science, Oban, Argyll, PA37 1QA; 2 - NatureScot)

Flapper skates (Dipturus intermedius), once common in the Northeast Atlantic, are currently classified as Critically Endangered by the International Union for Conservation of Nature (IUCN). To assess the efficacy of management measures, such as dedicated marine protected areas, on skate populations, novel monitoring approaches are needed. Flapper skates can be accurately identified by their distinctive dorsal spot patterns. This has led to the development of the SkateSpotter database, managed by SAMS and NatureScot. The database collates images submitted by anglers for identification and tracking of individual skates, enabling study of skate migration and home ranges, important considerations in their management. SkateSpotter presently relies on manual identification of skates which, while accurate, is time consuming. Here we present a method for skate re-identification using a custom two-stage deep learning algorithm. The algorithm can re-identify skates with a high degree of certainty (for a test set unseen during training, 80% of images had the correct individual as the first match) and deal with the variation in angler-submitted images, for example glare and occlusion. The algorithm is being incorporated into SkateSpotter, enabling a step-change in the speed at which new skate images can be checked for a match. This will enable the cost-effective extension of the database (in time and geography), enhancing its scientific value. The combination of SkateSpotter and Rate-A-Skate represents best practice for future animal re-identification challenges, with the success of the deep learning algorithm being proportional to the quality of the images in the database.
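
The matching step of a re-identification system like this can be sketched generically: embed the dorsal spot pattern of the query photo and rank database individuals by cosine similarity. This is a hedged illustration, not the SkateSpotter code; the embed() function stands in for the second stage of a two-stage network and is assumed, not real.

```python
import numpy as np

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_matches(query_image, database, embed):
    """database: dict mapping skate_id -> stored embedding vector (np.ndarray)."""
    q = embed(query_image)                       # embedding of the query spot pattern
    scores = {sid: cosine_similarity(q, e) for sid, e in database.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# The ranked list is shown to a human for confirmation; "80% correct at rank 1"
# means the first entry was the right individual for 80% of unseen test images.
```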

Seabed field truth by imaging to protect biodiversity

Paula Vieira Castellões , CLS Brasil

Aline Wyllie Rodrigues (TotalEnergies EP Brasil); João Régis Santos-Filho (Lagemar/Universidade Federal Fluminense -UFF); Leonardo Maturo Marques da Cruz (CLS Brasil); Anderson Catarino (TotalEnergies EP Brasil); Cristine Louise Braun Moraes (Tohoku University); Fernanda Ramos de Lima (CLS Brasil); Cleverson Guizán (Lagemar/Universidade Federal Fluminense - UFF).

To safeguard seabed biogenic structures and diverse fauna, and to conserve these ecosystems and their biodiversity, it is crucial to conduct initial studies to foresee and prevent potential impacts from exploratory activities. This is a best practice and one of the requirements of the Brazilian environmental agency, IBAMA, for environmental licensing, and is also in line with TotalEnergies’ commitments to defining voluntary exclusion zones and managing local biodiversity in its new projects. These restrictions act as a catalyst for the use of techniques and tools that not only fulfill these requirements but also surpass them. To perform these studies, TotalEnergies EP Brasil adopted the approach developed by CLS Brasil in partnership with Lagemar/UFF. This approach is based on the integrated analysis of regional and local data, with subsequent verification of field truth through ROV (remotely operated vehicle) imaging, for mapping the seabed surface. This case study focuses on Blocks S-M-1711/S-M-1815 (1,296 km² in total), an area in Santos Basin, Brazil, with no prior exploratory history. The integrated and georeferenced analysis was conducted to map submarine relief and surface sediments compiled from seismic, mono- and multibeam bathymetric, and sedimentological data. Seabed imaging aimed to identify biogenic environments, especially those formed by benthic bioconstructive communities such as beds of calcareous algae and deep-water corals (known to occur in Santos Basin), and to recognize other organisms, colonies or biological structures, as well as relevant physical environmental features, including exudations and oil seeps, that could expand knowledge about the seabed, enabling more accurate and detailed mapping. The imaging was conducted at the same nine stations used for water and sediment quality characterization. If the integrated mapping had identified areas of interest for E&P activities resembling locations with bioconstructions, TotalEnergies would have applied the mitigation hierarchy (avoidance) and new imaging would have been performed.

Sweating the small stuff: Annotation methods to improve small and low cover organism estimates

Emma J. Curtis , School of Ocean and Earth Science, National Oceanography Centre, University of Southampton Waterfront Campus, Southampton, SO14 3ZH, UK; Centre for In Situ and Remote Intelligent Sensing, University of Southampton, Southampton, SO16 7QF, UK

Jennifer M. Durden, Ocean BioGeosciences, National Oceanography Centre, European Way, Southampton, SO14 3ZH, UK; Blair Thornton, Centre for In Situ and Remote Intelligent Sensing, University of Southampton, Southampton, SO16 7QF, UK, Institute of Industrial Science, The University of Tokyo, 4-6-1 Komaba Meguro-ku, Tokyo 153-8505, Japan; Brian J. Bett, Ocean BioGeosciences, National Oceanography Centre, European Way, Southampton, SO14 3ZH, UK

Small and low cover organisms can make up a large proportion of community structure and provide important insight into the resilience of ecosystems. Manually extracting measurements used to monitor these organisms with image annotation can be a high effort process and subject to human error and bias. Automating image annotation of marine images can reduce manual annotation effort and some human error, but supervised techniques and evaluating predictions rely on robust manual annotations. We present considerations for annotating small and low cover organisms from seafloor images, using images annotated for sparsely distributed cold-water coral by 11 researchers as an example case. Using at least two annotators with some overlap in annotation allows for an evaluation of annotation success and uncertainty in derived measurements. Segmenting distinct coral colonies reduced the standard deviation of cover estimates threefold compared to a grid-based cover estimation method, with no significant difference in annotation time. Size bias in manual annotation can lead to missing or incomplete annotations of small organisms and can be propagated by supervised machine learning techniques when incomplete annotations are used as training data, as demonstrated in our example case. By modelling missing sizes of colonies, we reduced the error in our coral cover estimates by up to 23%. When these modelled sizes were used to generate segments for incomplete annotations, trained machine learning predictions significantly improved. We discuss some additional methods which are either more computationally expensive or require more human effort to reduce size bias in image annotations, giving a summary of the impacts and limitations of the outlined methods to provide researchers with guidance on how to more efficiently and reliably monitor small and low cover organisms using seafloor images.
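
The difference between the two cover-estimation strategies compared above can be illustrated with a small sketch (not the authors' workflow): cover derived from segment areas versus the classic point-count method over sampled pixels; the array shapes, point count and seed are illustrative.

```python
import numpy as np

def cover_from_segments(mask):
    """mask: boolean image array where True marks pixels inside annotated coral segments."""
    return 100.0 * mask.sum() / mask.size

def cover_from_points(mask, n_points=100, seed=0):
    """Point-count method: score how many randomly placed points hit the target class."""
    rng = np.random.default_rng(seed)
    ys = rng.integers(0, mask.shape[0], n_points)
    xs = rng.integers(0, mask.shape[1], n_points)
    return 100.0 * mask[ys, xs].mean()
```

For sparse, low-cover organisms the point-count estimate is far noisier between annotators and images, which is consistent with the threefold reduction in the standard deviation of cover estimates reported for segmentation.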

Three-dimensional movements of deep-sea octopus crawling quantified using the EyeRIS plenoptic imaging system

Joost Daniels , MBARI

Crissy L. Huffard, Paul L.D. Roberts, Alana D. Sherman, Henry Ruhl, and Kakani Katija (MBARI)

Ocean animals have adapted to their underwater environment in a myriad of ways, developing sensory systems and biomechanical strategies unlike those found on land. Octopuses, in particular, use their flexible arms to swim and crawl in ways not seen in other species. Their movements have garnered considerable interest from roboticists, who attempt to emulate this highly complex yet adaptable system. However, studies have not yet quantified three-dimensional arm movements during crawling in situ, due in part to the complexity of obtaining these 3D data. We used the plenoptic imaging system EyeRIS, deployed by a remotely operated vehicle, to record the three-dimensional arm kinematics of Muusoctopus robustus at a depth of 3200 meters off the coast of Central California, USA. Using semi-automatic, markerless tracking of the arms, we quantified strain, as well as the radius and location of bends in the arms. Preliminary analysis supports the hypothesis that these octopuses prioritize simple, rotation-based arm movements to generate locomotion. This demonstrates a novel use of plenoptic cameras in underwater environments, where the captured high-resolution 3D data can be used both for detailed point tracking and for error reduction. Additional perspective cameras on the remotely operated vehicle helped reveal conserved attributes in whole-animal gait kinematics as the octopus crawled across the terrain. Despite moving over rocks of irregular sizes, their strides showed remarkable consistency, even across individuals. This represents the first in-depth study of deepwater octopus crawling kinematics, and significantly expands the existing data on in-situ octopus movement, which thus far, despite significant interest from the robotics community, has been limited.

Underwater Image Formation Ground-Truth

Patricia Schöntag , GEOMAR Helmholtz Centre for Ocean Research Kiel and Christian-Albrechts-Universität zu Kiel (CAU Kiel)

David Nakath (CAU Kiel), Mengkun She (CAU Kiel), Yifan Song (CAU Kiel), Gabriel Nolte (Geomar), Rüdiger Röttgers (Helmholtz Centre Hereon Geesthacht), Kevin Köser (CAU Kiel)

Underwater data acquisition usually demands high effort in equipment, logistics, time and cost. In particular, acquiring images for developing underwater computer vision comes with a higher dimension of complexity than in air. An essential foundation for computer vision development are image formation models that link physical properties to image appearance. For the underwater case this necessitates also considering the complex optical properties of water, such as spectral attenuation and volume scattering, in addition to light, camera and scene parameters. Thus, development and evaluation of those models require comprehensive datasets, which have not been available so far. However, physically accurate underwater image formation models can enable various applications such as generating training images for deep learning approaches, deducing environment parameters from the image, or true colour estimation. Therefore, we address this gap in evaluation data with two image sets, one from a controlled tank environment and another from open water. In both cases, light and camera parameters were acquired using methods based on calibration images, and the water parameters with spectrophotometers and volume scattering meters. To determine the underlying scene geometry and the water- and light-free texture in the open water, we deployed a known target beneath the camera. A more intricate scene was arranged in the tank, allowing geometry and texture to be reconstructed in air before the tank was filled with water. Both sets combine images that show the same scene in varying turbidity and colour. The open-water image set was acquired at 10 near- and offshore spots in the western Baltic Sea. The tank image set was in turn created by adding colorants and scattering agents to the water according to literature knowledge about Jerlov water type properties. In this way, we obtain two diverse sets of underwater images with all the corresponding image formation parameters necessary to explain their appearance.

Using seafloor imagery to study short-term temporal variation in megafaunal communities in the Clarion-Clipperton Zone

Bethany Fleming , University of Southampton; National Oceanography Centre, UK

Erik Simon-Lledó, Noelie Benoist, Alejandra Mejía-Saenz, Daniel O. B. Jones (National Oceanography Centre, UK)

Biological communities change over time, in response to environmental variability, disturbance, and biological or neutral processes. Studying temporal variability in the deep sea is challenging, as the opportunity to revisit study areas regularly or establish continuous monitoring is relatively rare. Long-term monitoring sites have demonstrated that abyssal benthic communities change in response to natural variation, such as changes in POC flux (seasonal and interannual) and decadal climatic events such as ENSO. However, temporal change in the Clarion-Clipperton Zone (CCZ) has not been extensively studied. Photo transects were collected in the NORI-D area over 3 consecutive years, providing a unique opportunity to study short-term temporal variation of megafaunal communities in the CCZ. However, differences in platforms and camera systems between surveys can make whole-community comparisons across years challenging. We developed a framework to standardise across datasets that differ in image quality, ensuring that the results reflect true temporal patterns rather than differences in methodology. To standardise across datasets, we focused on individuals >20 mm (large enough to be detected in all years) and assigned morphospecies to a detectability category (reliably detectable across all datasets; detectable but with abundances likely to be skewed by differences in image quality between years; not likely to be detectable across all datasets, i.e. small or indistinct); morphospecies in the third category were excluded from analyses. We also only retained morphospecies with a high enough density to be unlikely to be missed by chance. This framework was applied to image data from NORI-D. Megafaunal communities were expected to show relatively little variation in density, diversity and community composition between years, although temporal variation may differ between functional groups. It is important to understand how benthic communities vary over time so that future monitoring strategies can adequately take natural variability into account.
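
The filtering rules of the standardisation framework described above lend themselves to a short, hedged sketch; the file path, column names and density threshold are assumptions for illustration, not the authors' code.

```python
import pandas as pd

annotations = pd.read_csv("nori_d_annotations.csv")            # placeholder path

# Keep individuals large enough to be detected in every survey year.
annotations = annotations[annotations["size_mm"] > 20]

# Drop morphospecies in the third detectability category.
annotations = annotations[annotations["detectability"].isin(["reliable", "skewed"])]

# Retain only morphospecies dense enough to be unlikely to be missed by chance.
counts = annotations["morphospecies"].value_counts()
keep = counts[counts >= 10].index                               # illustrative threshold
annotations = annotations[annotations["morphospecies"].isin(keep)]
```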

Validating an AI-based holographic imaging platform (Aqusens) for monitoring harmful algal bloom species

Holly Bowers , Moss Landing Marine Laboratories

Maxim Batalin (Lucendi, Inc.); Alexis Pasulka (California Polytechnic State University)

Harmful algal bloom (HAB) species can have detrimental effects on ecosystems, public health, fisheries, drinking water, subsistence harvesting, aquatic recreation and tourism. The HAB field is working to expand detection capabilities within a nationally networked system for long-term monitoring and early warning (National HAB Observing Network; NHABON). To meet this need, instruments demonstrating portability, robustness, ease of use, and the ability to produce real-time data are highly desired. Our project compares and validates a new cost-effective automated cell imaging platform (Aqusens) against the Imaging FlowCytobot (IFCB). The Aqusens is based on holographic imaging principles coupled with machine learning for optimizing computation and object characterization. The Aqusens does not contain expensive or complex-to-maintain optical or mechanical components. Most of the hardware is off-the-shelf and mass-produced, allowing for scalable production and cost reduction. Accompanying software is used to control Aqusens operations and analyze collected data. It is user-friendly and does not require substantial training or maintenance. Validation of the Aqusens is being carried out through three objectives: 1) laboratory experiments with a variety of cultured HAB and non-HAB species to provide foundational platform comparisons; 2) deployments at the Santa Cruz Wharf monitoring site to test performance during varied conditions (e.g. algal blooms, upwelled sediment); and 3) San Francisco Bay cruise transects (slated for 2025) targeting the seasonal succession of phytoplankton populations. These cruises, along with the Santa Cruz Wharf monitoring station, provide avenues for new platform integration into well-established programs that offer publicly available long-term datasets. Furthermore, platform portability is allowing us to support a broad variety of stakeholders (e.g. aquaculture, educational programs). These efforts are assessing the functionality of the Aqusens in various settings and evaluating the integration of data into existing public portals (California Ocean Observing Systems Data Portal [CalOOS] / Harmful Algal Bloom Data Assembly Center [HABDAC]).

Marine Imaging Workshop/presentations/posters

AI-Derived Krill Detection and Quantification from ROV Video

Savana Stallsmith , Oregon State University

Chris Sullivan (Oregon State University); Astrid Leitner (Oregon State University and MBARI)

Artificial intelligence is becoming a popular solution for solving a variety of complex problems in a way that allows us to work smarter, not harder. One important and evolving application of AI is in marine imaging, specifically helping to widen the bottleneck that the time-consuming annotation of marine video creates for oceanographic research. While rapid advances are being made in AI for benthic marine images, the pelagic system is much more challenging because animals are highly mobile and low-contrast, with flexible, delicate morphologies. Krill are a key component of the California upwelling system food web and are highly mobile and often highly abundant from the surface to depths of 200 m. Quantifying krill from video is extremely difficult and time-consuming. To solve this issue, we trained an AI model to automatically detect and quantify krill in pelagic ROV video transects from Monterey Bay. We compare the AI-generated quantifications with pre-existing hand-quantified videos. We show that it is possible to train an AI model to detect krill and to export those detections to generate counts, though comparison to manual counts remains a challenge and requires tracking across frames. Several solutions to tracking individuals were tested, and with continuing development this approach could make video a viable method for quantifying these important prey organisms as well as other challenging midwater species.

AI-based Identification of Ocean Organisms in Imagery and Enhanced Human-Machine Collaboration for improved Video Analysis

Patrick Cooper , University Corporation for Atmospheric Research / NOAA Ocean Exploration Affiliate

Mashkoor Malik², Ashley N. Marranzino¹, Philip Hoffman² (1 - University Corporation for Atmospheric Research / NOAA Ocean Exploration Affiliate; 2 - NOAA Ocean Exploration)

Marine imagery analysis is essential for understanding and exploring underwater environments. However, manually annotating and characterizing captured footage and images is a time-consuming task for human experts. NOAA Ocean Exploration currently leverages SeaTube, a tool built by Ocean Networks Canada, for human-based video annotations. To address these limitations, NOAA Ocean Exploration is scoping and developing several tools, including “Seabot” and “Critter Detector”, to increase annotation efficiency and evaluate the potential for automated annotations. The unique data distribution of marine imagery and its sparse sampling compared to terrestrial domains require human-in-the-loop systems, and our efforts are focused on improving the efficiency of human-based annotations. Seabot uses Vision Transformers (ViTs) to categorize and localize organisms within individual images, relying on the Ocean Vision AI / FathomNet training dataset. Critter Detector operates on video footage and identifies distinct video segments of interest, entities, organisms, and debris within arbitrary video clips. Critter Detector utilizes deep learning, specifically convolutional neural networks (CNNs) and long short-term memory (LSTM) networks, to detect significant events and visually interesting content within underwater video footage, leveraging SeaTube footage and annotations for training. While Seabot emphasizes organism identification in individual images, Critter Detector is designed to utilize the recurrence relation of an RNN architecture to allow identification of significant temporal regions within video. Both of these proposed models, trained on a diverse dataset of annotated images and videos, achieve useful accuracy, reducing the time and effort required for manual analysis. As the technology advances to incorporate features such as species recognition, behavior analysis, and real-time deployment, it may provide new opportunities for marine research, targeted sampling operations and conservation efforts.

AQUA-VIS: Autonomous QUantitative Analysis and Visual Identification System

Judith Fischer , GEOMAR

Autonomous Underwater Vehicle (AUV) missions have revolutionized the exploration of the underwater world and enabled high-resolution seafloor imaging. This imagery is instrumental for monitoring benthic ecosystems, assessing biodiversity, or detecting objects such as unexploded ordnance. However, post-processing the data acquired from AUV surveys is often time-consuming, hindering immediate decision-making for subsequent mapping locations and effective environmental analysis. This work addresses this issue by developing an advanced image processing workflow tailored for AUVs that leverages open-access image processing tools. The proposed workflow aims to process incoming images in near real-time, extract features for 3D reconstruction, and detect objects in seafloor images. The workflow consists of two main steps: i) feature extraction and ii) object detection. Feature extraction is carried out using the open-source software COLMAP, which is a "general-purpose Structure-from-Motion (SfM) and Multi-View Stereo (MVS) pipeline". In the second part of the workflow, pre-trained models such as YOLO are utilized to create an object detection framework that can detect objects in incoming images. Our current focus is on training YOLO using seafloor image datasets. We aim to train models for different locations, such as the Atlantic, Pacific, and Baltic Sea, thereby creating trained models that the wider community can use. These workflows are integrated into the AUV system to ensure they can operate in near real-time conditions. Processing time is optimized to handle the large number of incoming images while maintaining high accuracy. We anticipate that the results of this framework will support the accessible use of open-source workflows by the research community. Scientists can use this tool to gain valuable insights into underwater ecosystems and make informed and fast decisions about further dive sites based on immediate reports generated by the AUV at the end of each dive.
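
A hedged sketch of how the two workflow steps could be wired together with open-source tools: COLMAP's standard command-line stages for feature extraction and sparse reconstruction, followed by YOLO detection on the incoming frames. Directory names and the region-specific weights file are placeholders, and this is an illustration of the idea rather than the AQUA-VIS implementation.

```python
import subprocess
from pathlib import Path
from ultralytics import YOLO

def run_sfm(image_dir="incoming_images", workspace="colmap_ws"):
    """Run COLMAP feature extraction, matching and sparse mapping via its CLI."""
    Path(workspace, "sparse").mkdir(parents=True, exist_ok=True)
    db = str(Path(workspace, "db.db"))
    subprocess.run(["colmap", "feature_extractor",
                    "--database_path", db, "--image_path", image_dir], check=True)
    subprocess.run(["colmap", "exhaustive_matcher",
                    "--database_path", db], check=True)
    subprocess.run(["colmap", "mapper",
                    "--database_path", db, "--image_path", image_dir,
                    "--output_path", str(Path(workspace, "sparse"))], check=True)

run_sfm()
detector = YOLO("seafloor_baltic.pt")                    # placeholder regional weights
detections = detector.predict(source="incoming_images", conf=0.4)
```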

Accelerating Marine UAV Drone Image Analysis with Sliced Detection and Clustering (MBARI SDCAT)

Danelle E. Cline , Monterey Bay Aquarium Research Institute (MBARI), Moss Landing, CA, USA

Duane R. Edgington, Thomas O'Reilly, Steven H.D. Haddock, John Phillip Ryan, Bryan Touryan-Schaefer, William J. Kirkwood, Paul R. McGill, Enoch Nicholson, and Rob S. McEwen. Monterey Bay Aquarium Research Institute (MBARI), Moss Landing, CA, USA

Uncrewed Aerial Vehicles (UAVs) can be a cost-effective solution for capturing a comprehensive view of surface ocean phenomena to study marine population dynamics and ecology. UAVs have several advantages, such as quick deployment from shore, low operational costs, and the ability to be equipped with various sensors, including visual imaging systems and thermal imaging sensors. However, analyzing high-resolution images captured from UAVs can be challenging and time-consuming, especially when identifying small objects or anomalies. Therefore, we developed a method to quickly identify a diverse range of targets in UAV images. We will discuss our workflow for accelerating the analysis of high-resolution visual images captured from a Trinity F90+ Vertical Take-Off and Landing (VTOL) drone in near-shore habitats around the Monterey Bay region in California at approximately 60 meters altitude. Our approach uses a state-of-the-art self-distillation with no labels (DINO) transformer foundation model and multi-scale, sliced object detection (SAHI) methods to locate a wide range of objects, from small to large, such as schools or individual jellyfish, flocks of birds, kelp forests or kelp fragments, small debris, occasional cetaceans, and pinnipeds. To make the data analysis more efficient, we create clusters of similar objects based on visual similarity, which can be quickly examined through a web-based interface. This approach eliminates the need for previously labeled objects to train a model, optimizing limited human resources. Our work demonstrates the useful application of state-of-the-art techniques to assist in the rapid analysis of images and how this can be used to develop a machine-learning-based recognition system for the rapid detection and classification of UAV images. All of our work is freely available as open-source code. (https://github.com/mbari-org/sdcat)
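
Sliced inference over a large UAV frame can be sketched with the open-source SAHI library; the model type, weights path, slice size and overlap below are illustrative assumptions, not MBARI's settings.

```python
from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

# Wrap a detector in SAHI's interface (weights path is a placeholder).
model = AutoDetectionModel.from_pretrained(
    model_type="yolov8",
    model_path="uav_detector.pt",
    confidence_threshold=0.3,
)

# Slice the high-resolution frame into overlapping tiles, detect in each tile,
# and merge the per-tile detections back into full-image coordinates.
result = get_sliced_prediction(
    "uav_frame.jpg",
    model,
    slice_height=1024,
    slice_width=1024,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
)

boxes = [p.bbox.to_xyxy() for p in result.object_prediction_list]
# Crops of these boxes can then be embedded (e.g. with a DINO backbone) and
# clustered by visual similarity for rapid human review, as described above.
```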

Addressing challenges in using object detection for small datasets of imagery with a data-centric approach

Caroline Chin , National Institute of Water and Atmospheric Research (NIWA), Wellington, New Zealand; Te Herenga Waka - Victoria University of Wellington, Wellington, New Zealand

Ashley Rowden (National Institute of Water and Atmospheric Research (NIWA), Wellington, New Zealand; Te Herenga Waka - Victoria University of Wellington, Wellington, New Zealand), Alan Tan (National Institute of Water and Atmospheric Research (NIWA), Auckland, New Zealand), Daniel Leduc (National Institute of Water and Atmospheric Research (NIWA), Wellington, New Zealand)

In marine imagery research, real-world applications using object detection models face challenges such as class imbalance due to heterogeneity within benthic communities, class/labelling ambiguity, and image quality (e.g., illumination). With these challenges, it is difficult to determine the optimal quantity and quality of research-derived training data, which are typically small datasets, thus impacting model performance. Furthermore, traditional object detection performance metrics such as mean Average Precision (mAP), F1-score, precision, and recall may neither comprehensively evaluate a model’s efficiency nor inform the researcher where errors or gaps exist within the training data that could be targeted for improvement. In this study, our aim was to understand benthic community patterns within the Kermadec Trench by utilising object detection models for video imagery. We created training datasets by annotating benthic megafauna from random subsets of submersible dive imagery from abyssal and hadal depths using the BIIGLE 2.0 annotation tool. Using these datasets, we evaluated the performance of YOLOv8 object detection models trained on both single- and multi-class datasets. Our methods included analysis of performance metrics and data visualisations provided within the YOLOv8 framework. Our results demonstrated the importance of analysing the training dataset to identify issues such as bounding box size, orientation, and localisation within an image that can be resolved with methods such as data augmentation. Furthermore, we observed that random sampling of annotated imagery resulted in a skewed dataset biased towards abyssal megafaunal classes, subsequently increasing class imbalance. To mitigate this issue, we undertook stratified random sampling of training imagery by depth, which balanced and improved dataset quality, particularly as faunal classes within the abyssal and hadal zones exhibited heterogeneity. Overall, our study demonstrates that by using a data-centric approach towards deep learning, researchers can relatively easily infer how small-scale training data can be improved for object detection performance.
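
The stratified random sampling step described above can be sketched in a few lines of pandas; the file path, column names, depth bins and per-zone sample size are assumptions for illustration only.

```python
import pandas as pd

frames = pd.read_csv("annotated_frames.csv")            # placeholder path

# Assign each annotated frame to a depth zone (bins are illustrative).
frames["zone"] = pd.cut(frames["depth_m"],
                        bins=[3000, 6000, 11000],
                        labels=["abyssal", "hadal"])

# Draw the same number of frames from each zone instead of sampling at random,
# so abyssal frames no longer dominate the training set.
per_zone = 500
training = (frames.groupby("zone", observed=True)
                  .apply(lambda g: g.sample(min(per_zone, len(g)), random_state=0))
                  .reset_index(drop=True))
```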

Advanced seagrass monitoring using automated image processing on underwater drones

Rod Connolly , Global Wetlands Project, Griffith University, Australia

Cèsar Herrera(1), Jasmine Rasmussen(1), Alison McNamara(1), Michael Sievers(1), Bela Stantic(2) 1. Global Wetlands Project, Griffith University, Queensland, Australia 2. Big Data Lab, Griffith University, Queensland, Australia

Advances in capacity for mapping and monitoring seagrass habitat are overcoming deficiencies in local monitoring and global distribution maps, and enabling rapid condition reporting. Underwater vehicles and drones reliably scan the seabed, collecting vast amounts of imagery in waters of any depth without putting divers at risk. We show how automated data extraction has overcome the processing bottleneck of manual data extraction. Using a combination of deep learning architectures (CNN and Transformer), we provide computer vision software that detects seagrass and records plant characteristics rapidly and reliably, anywhere in the world. Robust testing against large databases of manual records of seagrass presence and percentage cover shows high accuracy of the automated measures. Seagrass presence data are ideal for detecting meadow size and location, and percentage cover data can be used as an indicator of seagrass condition, paired with automated records of seagrass species or morphological type. The combination of underwater drone imagery and automated data extraction substantially increases the scale of seagrass surveys, even in deep or turbid waters, and improves reproducibility across sequential visits.

Algorithmic construction of topologically complex biomineral lattices via cellular syncytia

Manu Prakash , Stanford University (BioE, Biology, Oceans)

Pranav Vyas, Charlotte Brannon, Laurent Formery, Christopher J. Lowe, Manu Prakash (Stanford University)

Biomineralization is ubiquitous in both unicellular and multicellular living systems [1, 2] yet remains mechanistically elusive due to a limited understanding of the physicochemical and biomolecular processes involved [3]. Echinoderms, with their diverse architectures of calcite-based structures in the dermis [4], present the enigma of how cellular processes control the shape and form of individual structures. Specifically, in holothurians (sea cucumbers), multi-cellular clusters construct discrete single-crystal calcite ‘ossicles’ (∼100 µm length scale), with diverse morphologies both across species and even within an individual animal [5]. The local rules that might encode these unique morphologies in holothurian calcite ossicles remain largely unknown. Here we show how transport processes in a cellular syncytium impart top-down control on ossicle geometry via symmetry breaking, branching, and fusion in finite cellular clusters. As a unique example of cellular masonry, we show how coordination within a small cluster of cells builds calcite structures about an order of magnitude larger than any individual participating cell. We establish live imaging of ossicle growth in Apostichopus parvimensis juveniles, revealing how individual crystalline seeds (∼1−2 µm) grow inside a multi-cellular syncytial complex with the biomineral completely wrapped within a membrane-bound cytoplasmic sheath. Constructing a topological description of ossicle geometries from 3D micro-CT (computed tomography) data reveals the hidden growth history and conserved patterns across ossicle types. We further demonstrate that vesicle transport on the surface of the ossicle, rather than cell motility, regulates material transport to the ossicle tips via a unique cytoskeletal architecture. Finally, using reduced-order models of conserved transport on self-closing active branching networks, we highlight the hidden universality in the growth process of distinct ossicles. The system presented here serves as a unique playground merging top-down cellular physiology and classical branching morphogenesis [6] with bottom-up non-equilibrium mineralization [7] processes at the interface of living and non-living matter [8].

Annotating Benthic Megafauna of the Cobalt-Rich Manganese Seamounts of the Mid-Pacific Mountains, Hawaiian and Necker Ridges with BIIGLE

Sierra Landreth , Department of Earth, Ocean, and Atmospheric Science, Florida State University

E. Brendan Roark (Department of Geography, Texas A&M University), Virginia Biede, Nicole B. Morgan, Amy R. Baco (Department of Earth, Ocean, and Atmospheric Science, Florida State University)

Seamounts are habitats for a diverse array of benthic deep-sea megafauna. Generally, seamounts are poorly studied in the central and western Pacific (CWP), an area that is also targeted for cobalt-manganese crust mining. The Mid-Pacific Mountains (MPM) have cobalt-rich crusts but are virtually unexplored. Necker Ridge has been hypothesized to be a key stepping-stone for deep-water faunal dispersal between the MPM and the seamounts of the Northwestern Hawaiian Islands (NWHI). The goal of this project was to test this hypothesis by comparing the species composition of CWP seamount fauna among the MPM, NWHI, and Necker Ridge, while also observing the communities at potential risk from future mining activities. The remotely operated vehicle (ROV) SuBastian was used to collect video transects across 10 sites. Replicate 500 m long transects were taken at 500 m depth intervals from 1,500 to 3,500 m and annotated for morphology and taxonomy of benthic megafauna using the BioImage Indexing Graphical Labeling and Exploration (BIIGLE) web service. Video transects were converted to screen grabs collected every 10 seconds. Organisms were labeled with the Standardised Marine Taxon Reference Image Database (SMarTaR-ID) morphology and taxonomy label trees. Species composition, abundance, and diversity of the benthic community were calculated at site MPM3 to establish depth zonation. Based on preliminary results, octocorals were the most abundant taxon at all depths, with the two deepest depths dominated by unbranched isidids. There was a peak in overall faunal abundance at 2,000 m, with approximately 2,000 megafaunal individuals across 3 transects. The annotation process was repeated across all survey sites to begin comparisons among the 3 regions. Across all sites, faunal abundance was lowest at the deeper depths of 3,000 to 3,500 m. Characterizing seamount communities and their connectivity is essential to informing the conservation and management of vital deep-sea habitats targeted for mining.
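
For readers who want to reproduce the screen-grab step, a minimal frame-extraction sketch using OpenCV is shown below. The ten-second interval comes from the abstract, while the file names and fallback frame rate are assumptions.

```python
import cv2  # OpenCV; assumes the transect video is available as a local file

def grab_frames(video_path, out_prefix, interval_s=10):
    """Save one still image every `interval_s` seconds of a transect video."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0          # fall back if FPS metadata is missing
    step = max(1, int(round(fps * interval_s)))      # frames between consecutive grabs
    frame_idx, saved = 0, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if frame_idx % step == 0:
            cv2.imwrite(f"{out_prefix}_{saved:05d}.png", frame)
            saved += 1
        frame_idx += 1
    cap.release()
    return saved

# grab_frames("MPM3_transect1.mp4", "MPM3_t1")  # hypothetical file name
```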

Applications of the YOLOv8 Computer Vision Model for Long-term Underwater Ecological Monitoring

Talen Rimmer , Department of Biology, University of Victoria

Bates, C.R. (University of British Columbia); McIntosh, D., Zhang, T., Branzan Albu, A., & Juanes, F. (University of Victoria)

The world's oceans are undergoing rapid changes due to climate change and other anthropogenic impacts, affecting marine species' distribution, abundance, and behavior. Traditional ecological monitoring methods struggle to keep pace with these transformations, especially in underwater habitats. Recently, computer vision techniques have emerged to enhance the efficiency of video and image-based underwater monitoring. These methods facilitate the detection and classification of objects in visual data, potentially streamlining the process of counting and classifying marine organisms. However, the adoption of computer vision in marine ecology has been slow, partly due to its inaccessibility to ecologists and the lack of easily adaptable tools for ecological monitoring. This study investigates the application and validation of computer vision techniques for monitoring underwater pelagic macrofaunal diversity, using a case study from coastal British Columbia. Over 9000 hours of underwater video were collected from four sites over 18 months, using mounted cameras programmed to record five minutes of video every hour. A supervised computer vision model (YOLOv8) was trained with approximately 240,000 annotations to assess the presence and abundance of marine species at these sites. This approach provided high-resolution temporal data on the diversity and abundance of pelagic fish and gelatinous zooplankton at these sites. We detail the process of employing computer vision techniques for long-term underwater ecological monitoring, emphasizing accessibility for ecologists. A stepwise method for adapting computer vision techniques to achieve biodiversity monitoring objectives is presented, highlighting the strengths and limitations of our approach. Finally, we propose future directions for integrating these technologies into new and existing monitoring programs, and suggest priority areas for future research to advance the use of computer vision in underwater ecological monitoring.
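
A sketch of how per-clip detection counts could be produced with the ultralytics YOLOv8 API is given below. The weights file, clip name and confidence threshold are placeholders, and per-frame detection counts are only a proxy for abundance (they do not track unique individuals).

```python
from collections import Counter
from ultralytics import YOLO  # assumes the ultralytics package is installed

# Load a model fine-tuned on the project's annotations (hypothetical weights file).
model = YOLO("pelagic_yolov8.pt")

def detections_per_clip(video_path, conf=0.5):
    """Count per-frame detections of each class in one five-minute clip."""
    counts = Counter()
    for result in model(video_path, stream=True, conf=conf, verbose=False):
        for cls_idx in result.boxes.cls.tolist():
            counts[model.names[int(cls_idx)]] += 1
    return counts

# counts = detections_per_clip("site1_2023-06-01T12.mp4")  # hypothetical clip name
```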

Automated Imaging in Long-Term Ecological Research Reveals Taxon- and Group-Specific Bloom Anomalies in Phytoplankton

Heidi M. Sosik , Woods Hole Oceanographic Institution

E. Taylor Crockford, Emily E. Peacock, and Miraflor Santos; Woods Hole Oceanographic Institution

Coastal pelagic waters support complex and dynamic ecosystems that depend on planktonic food webs. These systems are challenging to observe at the appropriate spatial, temporal, and taxonomic scales. We are working to meet this challenge with automated microscopic imaging at the Northeast U.S. Shelf Long-Term Ecological Research (NES-LTER) site. For over a decade, we have deployed Imaging FlowCytobot both at a nearshore cabled observatory and on board research vessels surveying across the shelf seasonally. With this vast dataset (>1.5 billion images) and machine learning for image classification, we are now able to document where and when unusual blooms of specific taxa or groups of phytoplankton occur. These atypical blooms are surprisingly common in the dataset, but also highly variable both in which taxa occur and in spatial and temporal location and extent. This presentation will highlight examples of decade-scale outlier abundances of specific diatoms, dinoflagellates, and haptophytes. We will show that some events appear to be nearshore phenomena with timescales of weeks while others are localized in mid- or outer-shelf waters over similar timeframes, and furthermore that certain massive, unprecedented blooms can persist for months and extend over hundreds of kilometers along the NES. While we have an evolving understanding of the factors promoting some of these types of blooms, in many cases their causes remain unexplained.
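
As a rough illustration of how taxon-specific bloom anomalies might be flagged against a climatology, the sketch below applies a simple day-of-year z-score rule to a daily abundance series. This is an assumed, simplified criterion for illustration only, not the authors' analysis.

```python
import numpy as np
import pandas as pd

def bloom_anomalies(series: pd.Series, threshold: float = 3.0) -> pd.Series:
    """Return dates whose log abundance exceeds the day-of-year climatology
    by more than `threshold` standard deviations (illustrative rule only).

    `series` is a daily abundance time series indexed by a DatetimeIndex.
    """
    log_abund = np.log10(series + 1)
    doy = series.index.dayofyear
    clim_mean = log_abund.groupby(doy).transform("mean")
    clim_std = log_abund.groupby(doy).transform("std").replace(0, np.nan)
    z = (log_abund - clim_mean) / clim_std
    return series[z > threshold]

# abund = pd.Series(daily_counts, index=pd.DatetimeIndex(dates))  # hypothetical input
# outliers = bloom_anomalies(abund)
```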

BathyBot: The deep-sea crawler to see the unseen of the NW Mediterranean Sea

Séverine Martini , Mediterranean Institute of Oceanography (MIO)

Séverine Martini (1), Carl Gojak (2), Christian Tamburini (1), Dominique Lefèvre (1), Karim Bernardet (2), Karim Mahiouz (2), Céline Laus (2), Marc Picheral (3), Camille Catalano (3) 1- Mediterranean Institute of Oceanography 2- Division Technique de l'INSU 3- Laboratoire d'Oceanographie de Villefranche

Increasing exploration and industrial exploitation of the vast and fragile deep-ocean environment for a wide range of resources (e.g., oil, gas, fisheries, new molecules, and soon, minerals) raises global concerns about potential ecological impacts. BathyBot is a multi-instrumented deep-sea crawler deployed from a dock at 2,500 m depth, 40 km off the French coast (Mediterranean Sea), at the EMSO-LO station. BathyBot is connected to the LSPM, a deep-sea cabled observatory allowing real-time observations of the deep sea. The deployment of this benthic vehicle complements the ALBATROSS-MII pelagic line, instrumented with oceanographic sensors. Two cameras are installed on BathyBot for real-time imaging of the deep sea, as well as an Underwater Vision Profiler (UVP6) on its BathyDock and a biocamera close to the vehicle. Such instrumentation allows: 1) better understanding of the biodiversity of the deep Mediterranean Sea over time; 2) study of the dynamics of exchanges at the water column–sediment interface; and 3) citizen involvement through the dissemination of images acquired by BathyBot via the platform “Ocean Spy – Mediterranean Spy”.

C-SONIC: Cross-Sonar Image Correspondence Using Pose Supervised Learning

Akshay Hinduja , Carnegie Mellon University

S Gode*, M Kaess, Carnegie Mellon University

We present C-SONIC (Cross-SONar Image Correspondence), an enhanced network for underwater SLAM that addresses the challenges of data association in sonar-based perception. Building upon our previous SONIC framework, C-SONIC introduces cross-attention mechanisms to enable robust feature correspondence across different sonar makes and frequency modes. This advancement significantly improves performance in the complex underwater environment, where limited visibility often renders camera-based systems ineffective.

C-SONIC overcomes the limitations of multibeam imaging sonars, particularly their viewpoint-dependent measurements, by learning features that are invariant to both viewpoint changes and sonar hardware differences. Our method demonstrates superior performance in generating correspondences for diverse sonar images, facilitating more accurate loop closure constraints and place recognition. Importantly, C-SONIC's cross-modal capabilities open up new possibilities for multi-robot SLAM using heterogeneous sonar systems in underwater environments.

Cold-water coral biodiversity and distribution across oxygen minimum zones in the Galapagos Islands and La Isla del Coco: preliminary findings

Ana Belen Yanez Suarez , Memorial University of Newfoundland

Katleen Robert, Memorial University of Newfoundland

Climate change is altering the chemical and physical properties of the ocean, leading to ocean warming, acidification, and oxygen loss. In some areas of the deep sea, dissolved oxygen levels are already extremely low, and the expansion of these areas may threaten vulnerable marine ecosystems such as cold-water coral (CWC) reefs. CWC can form oases of life in the deep sea, providing habitat, food, and shelter to marine species, but are sensitive to increased temperature and low oxygen. The Eastern Tropical Pacific (ETP) presents one of the world's most pronounced oxygen minimum zones (OMZ), as well as some of the most biodiverse marine areas, such as the Galapagos, Ecuador, and La Isla del Coco, Costa Rica. These islands are surrounded by large marine protected areas (MPAs) where bottom trawling has never occurred, and their deep sea is considered pristine. Given the expected expansion of the OMZ in these critical areas, it is essential to understand the current distribution pattern of CWC and its implications for local biodiversity. This study aims to investigate whether CWC biodiversity and abundance across the OMZ differ within and between islands, and to identify the environmental factors that may support CWC resilience under low-oxygen conditions. In September 2023, the R/V Falkor (too) collected video, coral samples and environmental data using the ROV SuBastian at several sites across the islands. Preliminary observations reveal varying patterns of coral distribution, diversity, and community structure related to abiotic oceanographic factors, as well as large differences in OMZ extent between the Galapagos and La Isla del Coco. Our findings will provide essential information to strengthen the development of a regional marine conservation corridor in the ETP, currently under implementation by bordering nations, as well as improve our current knowledge of CWC adaptation and vulnerability in the Pacific.

Comparative Analysis of AUV and HOV Imagery for Assessing Deep-Sea Fish Communities on Pioneer Bank, Northwestern Hawaiian Islands

Beatriz E. Mejía-Mercado , Coastal and Marine Laboratory, Florida State University

Amy R. Baco, Florida State University

This study quantitatively compares the effects of using Autonomous Underwater Vehicles (AUV) and Human-Occupied Vehicles (HOV) to survey and characterize deep-sea fish assemblages on Pioneer Bank at depths of 450-500 m and 600 m. A series of univariate and multivariate PERMANOVA analyses were conducted to assess the effects of method (AUV vs. HOV), side (NE vs. S), and their interaction on various ecological indices, including abundance, species richness, estimated species richness, Shannon diversity, and Simpson dominance. Univariate PERMANOVA results at 450-500 m revealed that AUV transects showed significantly higher diversity and lower dominance compared to HOV transects, indicating a broader range of species captured, particularly in the northeastern region of the bank. At 600 m, AUV again recorded greater diversity, abundance, and lower dominance than HOV, with significant differences in abundance showing AUV capturing more individuals, especially from the families Gadidae, Myctophidae, and Berycidae, on both the northeastern and southern sides of the seamount. Multivariate PERMANOVA analysis at 450-500 m indicated that the assemblage structure differed significantly by method and side, though no interaction effect was found. Cluster analysis revealed considerable variability in species assemblages, particularly for HOV transects on the southern side, which exhibited distinct community structures. At 600 m, significant differences were observed by method, side, and their interaction. Specifically, two HOV transects on the southern side displayed unique assemblages, while three AUV transects on the northeastern side showed greater similarity in species composition. These preliminary findings underscore the complexity and variability of deep-sea fish communities, highlighting the influence of methodological choices on biodiversity assessments. The results emphasize the need for integrated sampling approaches to achieve comprehensive biodiversity assessments, which are essential for effective management and conservation strategies in deep-sea ecosystems.
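
For context, the sketch below runs a one-way PERMANOVA on Bray-Curtis dissimilarities with scikit-bio. Note that it handles a single factor only (the multi-factor design with interactions described above requires other software, e.g. PERMANOVA+ or vegan's adonis2), and the abundance matrix here is synthetic.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from skbio import DistanceMatrix
from skbio.stats.distance import permanova

# `abund` is a hypothetical (transects x species) abundance matrix and
# `method` labels each transect as surveyed by AUV or HOV.
abund = np.random.default_rng(0).poisson(3, size=(12, 20))
method = ["AUV"] * 6 + ["HOV"] * 6

# Bray-Curtis dissimilarity between transects, then a one-way test on method.
dm = DistanceMatrix(squareform(pdist(abund, metric="braycurtis")),
                    ids=[f"t{i}" for i in range(12)])
print(permanova(dm, grouping=method, permutations=999))
```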

Crowd-powered precision: Enhancing image segmentation with collaborative annotations

Emma J. Curtis , School of Ocean and Earth Science, National Oceanography Centre, University of Southampton Waterfront Campus, Southampton, SO14 3ZH, UK; Centre for In Situ and Remote Intelligent Sensing, University of Southampton, Southampton, SO16 7QF, UK

Jennifer M. Durden, Ocean BioGeosciences, National Oceanography Centre, European Way, Southampton, SO14 3ZH, UK; Blair Thornton, Centre for In Situ and Remote Intelligent Sensing, University of Southampton, Southampton, SO16 7QF, UK, Institute of Industrial Science, The University of Tokyo, 4-6-1 Komaba Meguro-ku, Tokyo 153-8505, Japan; Brian J. Bett, Ocean BioGeosciences, National Oceanography Centre, European Way, Southampton, SO14 3ZH, UK

Instance segmentation of organisms from marine images can be used for biodiversity estimation and as training data for supervised machine learning techniques. However, manual segmentation is subject to the human error found in other visual data extraction techniques, including misidentification, imperfect detection and inconsistent boundary delineation, which can propagate through trained machine learning predictions. User variability can lead to low segmentation agreement (intersection over union, IoU, scores of 29%), as seen in our case example of 11 researchers segmenting cold-water coral colonies. We outline two methods for combining users' segments to produce less variable annotations. The first method establishes a user consensus on the detection and identification of the annotation and then combines the distance maps of the user-drawn segments to generate an average consensus segment, which can be used to estimate organism properties or as training data for machine learning. The second method overlays the user-drawn segments over each other to create a heatmap mask with an associated user certainty for the detection, identification and delineation of the organism. These non-binary masks can be used for organism estimates or, by adapting the loss function of Mask R-CNN as an example, to train machine learning models. Both methods reduce the variability of drawn segments in our example, with 12% and 25% increases in average IoU scores when combining two and three users' segments, respectively. The value of combining user annotations at the expense of increased annotation effort, and the ease of implementing each combination technique, are summarised to advise researchers on how to create more reliable segment annotations and higher-quality training data for supervised machine learning techniques.
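
A minimal sketch of the overlay idea is given below: it computes pairwise IoU and combines several annotators' binary masks into a per-pixel certainty heatmap plus a majority-vote consensus. This simplified mask averaging stands in for, but is not identical to, the distance-map method described above.

```python
import numpy as np

def iou(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Intersection-over-union agreement between two boolean masks."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return inter / union if union else 0.0

def combine_masks(masks, majority=0.5):
    """Overlay several annotators' masks of the same colony.

    Returns (heatmap, consensus): `heatmap` is the per-pixel fraction of
    annotators who included the pixel (a soft certainty mask), and
    `consensus` keeps pixels selected by more than `majority` of annotators.
    """
    stack = np.stack([m.astype(float) for m in masks])
    heatmap = stack.mean(axis=0)
    consensus = heatmap > majority
    return heatmap, consensus

# heatmap, consensus = combine_masks([mask_user1, mask_user2, mask_user3])
```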

Deep-sea plastic debris detection using deep learning with a large scale image dataset

Takaki Nishio , Japan Agency for Marine-Earth Science and Technology (JAMSTEC)

Hideaki Saito (Japan Agency for Marine-Earth Science and Technology (JAMSTEC)), Daisuke Matsuoka (Japan Agency for Marine-Earth Science and Technology (JAMSTEC)), Ryota Nakajima (Japan Agency for Marine-Earth Science and Technology (JAMSTEC))

Marine pollution by plastic debris is a global issue, and monitoring of the debris distribution is required to formulate policies to address it and to verify their effectiveness. In particular, Japan has the world's largest seafloor area deeper than 6,000 m within its exclusive economic zone (EEZ), and it is believed that there are many "sinks" of debris on the seafloor. Accordingly, Japan plays an important role in continuous wide-area monitoring of deep-sea debris. In order to make the monitoring of deep-sea debris more efficient, the Japan Agency for Marine-Earth Science and Technology (JAMSTEC) is developing a system that uses deep learning methods to automatically detect plastic debris in deep-sea videos. A deep learning model was trained to detect macro-sized debris such as plastic bags and plastic bottles. The training dataset contains over 22,000 images sliced from videos recorded during JAMSTEC's deep-sea surveys conducted over the past 30 years, with debris information collected from the Deep-sea Debris Database (https://www.godac.jamstec.go.jp/dsdebris/j/index.html). The trained model was applied to newly recorded deep-sea videos to verify whether it could detect plastic debris. In addition, the trained model was tested on videos recorded under multiple conditions, such as the size of the detection target and turbidity, in order to determine the shooting conditions best suited for automatic detection. As a result, the automatic detection rate of visually detectable plastic debris exceeded 60%. In this presentation, we report on the latest trial results of deep-sea plastic debris detection, current issues, and future prospects.

Defining the target population to make marine image-based biological data FAIR

Jennifer M. Durden , National Oceanography Centre, Southampton, UK

Timm Schoening (DeepSea Monitoring Group, GEOMAR Helmholtz Centre for Ocean Research, Kiel, Germany), Emma J. Curtis (Ocean and Earth Science, University of Southampton, UK), Anna Downie (Centre for Environment, Fisheries and Aquaculture Science, Lowestoft, UK), Andrew R. Gates (National Oceanography Centre, Southampton, UK), Daniel O.B. Jones (National Oceanography Centre, Southampton, UK), Alexandra Kokkinaki (British Oceanographic Data Centre, National Oceanography Centre, Southampton, UK), Erik Simon-Lledó (National Oceanography Centre, Southampton, UK), Danielle Wright (British Oceanographic Data Centre, National Oceanography Centre, Southampton, UK), Brian J. Bett (National Oceanography Centre, Southampton, UK)

Marine imaging studies have unique constraints on the data collected, requiring a tool for defining the biological scope to facilitate data discovery, quality evaluation, sharing and reuse. Defining the ‘target population’ is a way of scoping biological sampling or observations by setting the pool of organisms to be observed or sampled. It is used in survey design and planning to determine statistical inference, and is critical for data interpretation and reuse (of both images and derived data). We designed a set of attributes for defining and recording the target population in biological studies using marine photography, incorporating ecological and environmental delineation and marine imaging method constraints. We describe how this definition may be altered and recorded at different phases of a project. The set of attributes records the definition of the target population in a structured metadata format to enhance data FAIRness. It is designed as an extension to the image FAIR Digital Objects (iFDO) metadata standard, and we map terms to other biological data standards where possible. This set of attributes serves a need to update ecological metadata to align with new remotely-sensed data, and can be applied to other remotely-sensed ecological image data.

Developing automated multi-modal monitoring strategies of vulnerable marine ecosystems (VMEs)

Chloe Game , University of Bergen, Norway

Associate Professor Pekka Parviainen, University of Bergen, Norway; Dr. Pål Buhl-Mortensen, Institute of Marine Research, Norway; Associate Professor Ketil Malde, University of Bergen, Norway and Institute of Marine Research, Norway; Dr. Pedro A. Rebeiro, University of Bergen, Norway; Dr. Rebecca Ross, Institute of Marine Research, Norway

Anthropogenic impacts on the marine environment are increasing as growing resource demands must be met. Recently (2024), the Norwegian government voted to open an area of the Norwegian Continental Shelf for seabed mining, putting the deep seabed, which harbours a rich diversity of ecologically and economically valuable yet vulnerable ecosystems, under serious threat. It is critical that extensive, accurate maps of the seafloor are created to establish baselines and support monitoring of impacts and recovery. Given the region's vastness and isolation, such monitoring is logistically challenging and too slow to meet the requirements. It must therefore be conducted with photography, with machine learning (ML) used to automatically identify and quantify seafloor organisms, avoiding labour-intensive manual analysis. This project aims to design and develop a multimodal deep learning (DL) model for automated identification of VMEs and VME indicator species in Norwegian waters. This model will help to optimize mapping efforts by promoting efficiency, consistency and quality of ecological data extraction. Crucially, this will better enable sustainable management of VMEs. Current monitoring efforts do not provide the scale and resolution of data required to protect and conserve these valuable seabed assets. Specifically, we are investigating how effectively off-the-shelf Vision Transformers (ViT) can automate classification of VMEs from seabed imagery compared to current approaches (CNNs). We are also exploring whether the addition of environmental data (from other sensors, such as local topography and water temperature), in a new multimodal architecture, can increase the accuracy of automated analysis and refine the level of biological detail used to make predictions, which is currently not possible for ML with images alone. Importantly, we will consider the explainability of model decisions, opening up the ‘black box’, and seek to generalize the approach across target seabed communities and datasets.
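
To make the multimodal idea concrete, the toy PyTorch module below fuses a precomputed image embedding with a few environmental covariates before classification. The embedding size, covariates and layer widths are placeholders for illustration rather than the project's architecture.

```python
import torch
import torch.nn as nn

class MultimodalVMEClassifier(nn.Module):
    """Toy fusion model: concatenate a (frozen) image-encoder embedding with
    environmental covariates (e.g. depth, temperature, slope) before a
    classification head. Dimensions are placeholders, not the project's design."""

    def __init__(self, img_dim=768, env_dim=4, n_classes=10):
        super().__init__()
        self.env_net = nn.Sequential(nn.Linear(env_dim, 32), nn.ReLU())
        self.head = nn.Sequential(
            nn.Linear(img_dim + 32, 256), nn.ReLU(), nn.Linear(256, n_classes)
        )

    def forward(self, img_embedding, env_features):
        # Late fusion: encode the environmental vector, then concatenate.
        fused = torch.cat([img_embedding, self.env_net(env_features)], dim=1)
        return self.head(fused)

# logits = MultimodalVMEClassifier()(torch.randn(8, 768), torch.randn(8, 4))
```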

Development of illuminated stereo-BRUVS for benthic marine monitoring in the NW Atlantic: Investigation of technological and operational methodologies for best practice

Jessica Sajtovich , Dalhousie University

Dr. Craig J. Brown (craig.brown@dal.ca), Dalhousie University

Imaging systems are valuable tools for sustainable oceanographic survey methods, enabling non-destructive monitoring of marine ecosystems and their associated fauna. Baited Remote Underwater Video Systems (BRUVS) represent an innovative and cost-effective monitoring technology suitable for a wide variety of marine research, including Marine Protected Area (MPA) and fisheries monitoring. Since their development in the 1990s, the use of BRUVS for marine monitoring has expanded considerably due to advancements in high-quality, low-cost digital cameras; however, the overwhelming majority of BRUVS surveys have been conducted in the photic zone within the southern hemisphere. BRUVS utilized in these regions typically lack integrated lights and record continuous video footage over short deployment durations (15-60 minutes). Significant knowledge gaps surround the use of BRUVS with integrated lighting in poor-visibility, low-light, or aphotic environments at more northern latitudes such as the North Atlantic. Considering the potential benefits of BRUVS as a marine monitoring technology, research is essential to determine their effectiveness in these regions. Additionally, in such locations, including the coastal waters of Canada, the use of BRUVS with integrated lights is desirable for MPA monitoring, due to the non-destructive nature of the technology and its suitability for monitoring the diverse range of benthic environments included in Canadian MPAs, many of which are below the photic zone. This research describes the development and application of a stereo-BRUVS with integrated lights for marine monitoring in the low-visibility coastal waters surrounding Nova Scotia. The stereo-BRUVS were constructed using in-house designed and consumer-grade elements to develop a cost-effective system capable of producing high-quality images. Depth-rated to 500 m, the systems record staggered videos, maximizing battery life and video storage and thereby providing advantages for long-term monitoring efforts.

Discovery of a mud-burrowing squid evidences the complex life-habits of abyssal Cephalopoda

Alejandra Mejia-Saenz , Interdisciplinary Centre of Marine and Environmental Research (CIIMAR), University of Porto, Matosinhos, Portugal

Bethany Fleming (2,3), Daniel O.B. Jones (2), Loïc Van Audenhaege (2), Henk-Jan Hoving (4), Erik Simon-Lledó (2); 2 Ocean BioGeosciences, National Oceanography Centre, Southampton, UK; 3 Ocean and Earth Science, University of Southampton, Southampton, UK; 4 GEOMAR, Helmholtz Centre for Ocean Research Kiel, Kiel, Germany

Cephalopods are a conspicuous component of marine ecosystems, present across all ocean depths, but little is known about the distribution, diversity, and life-habits of those dwelling in the deep sea, especially at remote abyssal depths. Here, we present the unexpected mud-burrowing behaviour of an undescribed species of whiplash squid (Mastigoteuthidae) observed in ROV videos at a depth of 4,100 m within the Clarion Clipperton Zone, an area of the abyssal central Pacific that is targeted for seabed mining. The specimen was detected motionless in the soft sediment, positioned upside-down with its elongated tentacles extending rigidly towards the water column, resembling a biogenic structure. This behaviour and orientation are in marked contrast to the head-down, tuning-fork posture typically exhibited by Mastigoteuthis species. As biogenic structures such as sponge stalks are biodiversity hotspots in the region, we hypothesise this behaviour to be a form of aggressive mimicry. Using disguise to lure prey is possibly efficient in the abyss, where food may be scarce. Alternatively, the observed behaviour may be a defence strategy against predators. This finding highlights the importance of imagery in understanding cephalopod behaviour and further expands the known diversity of their complex adaptations to life in the abyss. As cephalopods are one of the few groups thought to be purely predatory in the abyss, with an important role in benthopelagic food webs, further investigations must specifically target them. This knowledge is urgently required to protect the full functional complexity and biodiversity of the abyssal Pacific seabed from human impacts such as ocean acidification from climate change and seabed mining, which are likely to reach these remote habitats in the years to come.

Emergency Response in Rocky Intertidal Ecosystems Using Drone Imagery: A MARINe Initiative

Karah Cox-Ammann , University of California at Santa Cruz

Alexis Necarsulmer, UCSC

The Multi-Agency Rocky Intertidal Network (MARINe) collaborates to monitor 200+ rocky intertidal sites along the West Coast of North America. Recently, MARINe has integrated drone imagery with on-the-ground monitoring to bolster surveying and mapping capabilities, particularly for emergency response. Given the increasing frequency of emergencies in ecological systems due to climate change effects, our objective is to showcase the considerable utility of drones as a potent tool for enhancing emergency response capabilities. By focusing on rocky intertidal ecosystems, we aim to highlight the versatility and effectiveness of drone technology in swiftly and effectively responding to ecological emergencies. We have employed drones to respond to various emergency scenarios, including oil spills, landslides, wildfire-induced debris flows, and sea level rise impacts. Drones enable us to capture baseline imagery, track sediment movement, access unsafe areas under landslides, and create high-resolution orthomosaics and 3-D models of affected sites. Our surveys have demonstrated the effectiveness of drones in capturing baseline data before impacts and tracking alterations to habitat over time. The high-resolution imagery produced by drones allows for habitat classification and assessment of impact zones, thus informing resource managers and guiding mitigation efforts. The integration of drones into emergency response strategies in rocky intertidal ecosystems offers several benefits, including rapid and repeatable monitoring, access to remote and otherwise inaccessible areas, and detailed assessment of impacts. This initiative has implications for agencies, such as NOAA, CDFW, and National Marine Sanctuaries, involved in coastal management and conservation efforts, particularly those involved in mitigating the impacts of emergencies within critical habitats and affecting sensitive species, such as the endangered black abalone (Haliotis cracherodii).

Enabling Large Scale Coral Reef 3D Models with Accurate Color Representation using DeepSeeColor

Yogesh Girdhar , Woods Hole Oceanographic Institution

Daniel Yang, John Walsh, Stewart Jamieson

Successful applications of complex vision-based behaviours underwater have lagged behind progress in terrestrial and aerial domains. This is largely due to the degraded image quality resulting from the physical phenomena involved in underwater image formation. Spectrally-selective light attenuation drains some colors from underwater images while backscattering adds others, making it challenging to perform vision-based tasks underwater. These distortions also make image appearance depend on distance, beyond simple changes in resolution and scale. We present the DeepSeeColor algorithm, a state-of-the-art underwater image formation model with the computational efficiency enabled by modern deep learning. The proposed approach enables efficient color correction for underwater imagery, supporting novel applications such as accurate in-situ imaging using AUVs and scaling up the building of underwater 3D models with accurate color representation. We show the application of the approach to coral reef survey imagery collected by an underwater robot.
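
For intuition, the sketch below inverts a simplified, commonly used underwater image-formation model (direct attenuation plus backscatter) with fixed per-channel coefficients. DeepSeeColor itself learns these corrections with a neural network, so this is only a hand-tuned approximation, and all coefficient values shown are illustrative.

```python
import numpy as np

def correct_color(image, range_map, beta_d, beta_b, b_inf):
    """Invert a simplified underwater image-formation model, per colour channel:
    I = J * exp(-beta_d * z) + B_inf * (1 - exp(-beta_b * z)).
    `image` is an (H, W, 3) float array in [0, 1]; `range_map` is camera-to-scene
    distance in metres. Coefficients are illustrative constants, not learned."""
    z = range_map[..., None]                       # (H, W, 1), broadcast over channels
    backscatter = b_inf * (1.0 - np.exp(-beta_b * z))
    direct = np.clip(image - backscatter, 0.0, None)   # remove veiling light
    restored = direct * np.exp(beta_d * z)             # undo attenuation
    return np.clip(restored, 0.0, 1.0)

# Example call with per-channel (R, G, B) coefficients — purely illustrative values.
# corrected = correct_color(img_rgb, range_m,
#                           beta_d=np.array([0.40, 0.12, 0.10]),
#                           beta_b=np.array([0.35, 0.20, 0.18]),
#                           b_inf=np.array([0.05, 0.30, 0.40]))
```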

End-to-end processing of water column video data

Ashley N. Marranzino , University Corporation for Atmospheric Research | NOAA Ocean Exploration Affiliate

Adrienne Copeland (NOAA Ocean Exploration), Michael Ford (NOAA), Jason Gronich (Western Washington University)

The National Oceanic and Atmospheric Administration (NOAA) Office of Ocean Exploration is dedicated to exploring the unknown ocean, unlocking its potential through scientific discovery, technological advancements, partnerships, and data delivery. One of the ways in which NOAA Ocean Exploration leads national efforts to fill gaps in our basic understanding of the marine environment is by conducting exploration missions with NOAA Ship Okeanos Explorer and the remotely operated vehicles (ROVs) Deep Discoverer and Seirios. In 2014, NOAA Ocean Exploration began using ROV dives to explore the poorly documented pelagic fauna of the deep oceanic water column, working with water column biologists and taxonomic experts to determine exploration priorities and strategies. Over the past decade, 51 water column exploration dives have been conducted aboard NOAA Ship Okeanos Explorer, resulting in more than 175 hours of video data archived with the National Centers for Environmental Information (NCEI) and made available to the public through the Ocean Exploration Video Portal and SeaTube V.3. These videos give scientists an unprecedented view of water column fauna, providing valuable information on species distributions, abundances, and behaviors. However, reviewing the raw video footage is time-consuming and can require specific taxonomic expertise, thereby limiting the ultimate usability and accessibility of the data to the larger scientific community. For over 6 years, NOAA Ocean Exploration has trained undergraduate interns and partnered with taxonomic experts to develop a workflow for annotating, verifying, and analyzing water column exploration video data. Here we present the end-to-end workflow that NOAA Ocean Exploration has developed for water column exploration, including data collection during ROV dives, real-time and post-cruise annotation processes, data analysis and publication, and ongoing efforts to feed these data into machine learning algorithms.

Exploring Oceans of Visual Data: Integrating Deep Learning Computer Vision and Information Visualization for Large-Scale Ecological Assessment of Cold-Water Coral Reefs

Daniel Langenkämper , 1. Biodata Mining Group, Faculty of Technology, Bielefeld University, Germany

Ingunn Nilssen, Aksel A. Mogstad (Environmental Technology, Technology, Digital & Innovation, Equinor ASA, Norway), Tim W. Nattkemper (Biodata Mining Group, Faculty of Technology, Bielefeld University, Germany)

Growing amounts of marine video and image data are collected using different kinds of camera setups and technologies (e.g., ROV, AUV, stationary platforms) to explore and monitor marine habitats. Concurrently, ambitious ocean digital twin (ODT) projects aim for the integrative collection and management of multi-modal sensor data in large databases. The aim of ODT development is to make use of these data for analysis, visualization and modeling, together with establishing feedback loops into the marine system by providing documentation for decision makers. However, there are still few clearly defined workflows from raw marine image (or video) data to the real data products required for such ODTs. Progress in artificial intelligence and computer vision (such as convolutional neural networks) has significantly increased the potential of computational approaches for (semi-)automatic marine image analysis, and the number of papers describing their applications to marine object classification or segmentation grows year by year. Usually, these works end with an accuracy assessment and a comparison with competing algorithms, but a gap to a real data product remains. In this contribution, we address the problem of large-scale assessments of cold-water coral reefs based on ROV-collected video data. These assessments are still performed by human operators evaluating hundreds of video files, sometimes with sub-optimal visibility. We present a workflow to estimate the coverage of Desmophyllum pertusum, Paragorgia arborea, other gorgonians and sponges on reefs based on semi-automatic annotation of less than 0.0003% of the available data, training a SegFormer B3 model and applying it to automatically segment 855 ROV videos from the Haltenbanken area (Norwegian Sea). The results are displayed in a web-based, customized visual interface that supports an integrative analysis of the computed coverage data together with data and manual evaluations derived from operator reports.
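
A minimal sketch of turning per-frame segmentation masks into coverage estimates is shown below. The class indices and the simple pooling over frames are assumptions for illustration, not the workflow's exact post-processing.

```python
import numpy as np

# Hypothetical class indices for the segmentation output; the real label
# map of the trained model will differ.
CLASSES = {1: "Desmophyllum pertusum", 2: "Paragorgia arborea",
           3: "other gorgonians", 4: "sponges"}

def coverage_from_masks(pred_masks):
    """Fraction of imaged seabed pixels assigned to each class, pooled over
    all frames of a video (`pred_masks`: list of 2-D integer arrays of
    per-pixel class predictions; 0 = background/substrate)."""
    totals = {name: 0 for name in CLASSES.values()}
    n_pixels = 0
    for mask in pred_masks:
        n_pixels += mask.size
        for idx, name in CLASSES.items():
            totals[name] += int((mask == idx).sum())
    return {name: count / n_pixels for name, count in totals.items()}

# coverage = coverage_from_masks(list_of_predicted_masks)  # e.g. {'sponges': 0.03, ...}
```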

FathomNet: Accelerating the processing of visual data to understand ocean life

Elizabeth Corvi , Bioinspiration Lab, Research and Development, MBARI, USA

Lilli Carlsen (1), Laura Chrobak (1), Emily Clark (1), Susan Poulton (2), Katy Croff Bell (2), Henry Ruhl (1), Benjamin Woodward (3), Kakani Katija (1); 1 Bioinspiration Lab, Research and Development, MBARI, USA; 2 Ocean Discovery League, USA; 3 CVision AI, USA

In order to fully explore our ocean and effectively steward the life that lives there, we need to scale up our observational capabilities in both time and space. Marine biological observations and surveys of the future call for building distributed networks of underwater sensors, vehicles, and data analysis pipelines, which requires significant advances in automation. Imaging, a major sensing modality for marine biology, is being deployed on a diverse array of platforms; however, the community faces a data analysis backlog that artificial intelligence and machine learning may be able to address. How can we leverage novel computer and data science tools to automate image and video analysis in the ocean? How can we create workflows, data pipelines, and hardware/software tools that will enable novel research themes to expand our understanding of the ocean and its inhabitants in a time of great change? FathomNet seeks to address these community needs by creating a collaborative R&D program that links artificial intelligence with broad community engagement. FathomNet provides a central hub for researchers using imaging, AI, open data, and hardware/software; provides data pipelines from existing image and video data repositories; shares project tools for coordination; leverages public participation and engagement via gamification; and creates data products that are widely shared. Together, FathomNet will be used to directly accelerate the automated analysis of visual data to enable scientists, explorers, policymakers, storytellers, and the public to learn, understand, and care more about the life that inhabits our ocean.

Framing image preprocessing to build capacity and efficiency in benthic ecology

Loïc Van Audenhaege , Ocean BioGeosciences, National Oceanography Centre, European Way, Southampton, UK

Eric Orenstein, Ocean Informatics, National Oceanography Centre, European Way, Southampton, UK; Mojtaba Masoudi, Ocean BioGeosciences, National Oceanography Centre, European Way, Southampton, UK; Tobias Ferreira, Ocean Informatics, National Oceanography Centre, European Way, Southampton, UK; Colin Sauze, Ocean Informatics, National Oceanography Centre, European Way, Southampton, UK; Jennifer Durden, Ocean BioGeosciences, National Oceanography Centre, European Way, Southampton, UK

Over the past decades, benthic ecologists have been leveraging images for quantitative surveys. Image-based benthic surveys are typically conducted with static or mobile cameras collecting images at a fixed rate over various types of seabed habitat. The images are then manually or automatically annotated, and descriptors of density and biodiversity are derived from the annotations. Image preprocessing is a critical step in imaging workflows; it has an enormous impact on downstream ecological insights. However, preprocessing is an underappreciated element of imaging studies, especially as more attention has been paid to AI-based annotation techniques. There is no widely used set of best practices, nor a decision framework supporting the definition of an adequate image preprocessing workflow constrained by research goals and image sample quality. As a result, image sample preparation is usually done on an ad hoc basis and is conducted with a set of disconnected software tools for image colour correction, overlap management and standardisation of the sampling effort (among many: manual methods, ImageJ, Python, R, MATLAB, Photoshop). The complexity of decision making, coupled with inconsistent preprocessing tools, undermines a user's ability to design optimal workflows that return robust and reproducible ecological measurements. The lack of a holistic, codified approach introduces bias and makes comparison between studies difficult or impossible, undermining efforts to assess ecological dynamics at regional to global scales. The PAIDIVER project aims to tackle this issue by providing decision-making resources and software tools to support image preprocessing, specifically targeting consistent computation of biodiversity indicators. The tools are designed to support reproducibility by outputting processing steps as metadata. PAIDIVER will help benthic image users design and implement an image preprocessing workflow by supporting data standardisation (e.g. seabed area calculation) and optimising image and sample selection (e.g. quality and overlap management).
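
As one example of the data standardisation step (seabed area calculation), the sketch below estimates the seabed footprint of a downward-facing image from camera altitude and field of view under a flat-seabed assumption. Real workflows would also account for camera tilt and terrain; the numbers in the usage note are purely illustrative.

```python
import math

def image_footprint_m2(altitude_m, hfov_deg, vfov_deg):
    """Approximate seabed area covered by one nadir-looking image of a flat
    seabed, given camera altitude and horizontal/vertical field of view.
    A deliberately simple flat-seabed, zero-tilt approximation."""
    width = 2.0 * altitude_m * math.tan(math.radians(hfov_deg) / 2.0)
    height = 2.0 * altitude_m * math.tan(math.radians(vfov_deg) / 2.0)
    return width * height

# e.g. a camera 2 m off the bottom with an 80 x 60 degree field of view:
# image_footprint_m2(2.0, 80, 60) ≈ 7.75 m²
```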

From Research to Operational and Sustained Marine Imaging for Place-Based Management

Henry A Ruhl , Monterey Bay Aquarium Research Institute

HA Ruhl, Monterey Bay Aquarium Research Institute; JH Adelaars, Monterey Bay Aquarium Research Institute; CR Anderson, University of California San Diego; FL Bahr, Monterey Bay Aquarium Research Institute; R Bochenek, Axiom Data Science; FP Chavez, Monterey Bay Aquarium Research Institute; P Daniel, University of California Santa Cruz; CA Durkin, Monterey Bay Aquarium Research Institute; J Erickson, Monterey Bay Aquarium Research Institute; SHD Haddock, Monterey Bay Aquarium Research Institute; B Jones, Monterey Bay Aquarium Research Institute; K Katija, Monterey Bay Aquarium Research Institute; D Klimov, Monterey Bay Aquarium Research Institute; RM Kudela, University of California Santa Cruz; T Maughan, Monterey Bay Aquarium Research Institute; PLD Roberts, Monterey Bay Aquarium Research Institute; A Schnittger, Monterey Bay Aquarium Research Institute; AE West, Monterey Bay Aquarium Research Institute.

Biology and ecosystem information can now be readily collected across a wide range of organism sizes in conjunction with physical and biogeochemical oceanographic data. Biology and ecosystem variables are at the heart of many statutory monitoring requirements for the marine environment including for protected species and understanding natural hazards, to managing fisheries, offshore energy, and mineral industries. Synchro—launched in 2023—accelerates technology solutions for ocean research and monitoring. Its aim is to propel promising ocean observing technology from prototype to broader use and involve users of the technology from the get-go. Marine imaging is one of the focus areas of Synchro, where we are testing and evaluating various systems. These evaluations importantly include perspectives from not only scientists, but also tech developers and the users of resulting information for decision making. Additionally, many regions of the Integrated Ocean Observing System (IOOS) are working to transition imaging and artificial intelligence tools into sustained observing systems that meet Federal accreditation standards. Here we will cover cases of such transitions from research to operations (R2O) and the data lifecycle elements needed to realize broader imaging tech use. For example, a system in mid-stage technology readiness (TRL) is the Planktivore camera operating on a Long-Range Autonomous Underwater Vehicle (LRAUV). An advanced TRL system example will be discussed with the California Imaging FlowCytobot (IFCB) Network. Such examples will be given in the context of embedding such systems into sustained ocean observing efforts to improve coastal and climate resilience and inform place-based management in changing ecosystems. Cyberinfrastructure tools from FathomNet and Axiom Data Science contribute to timely access to this information. Applications include information to underpin Large Marine Ecosystem - Integrated Ecosystem Assessments, National Marine Sanctuary - Condition Reports, baseline and impact assessment for development of the offshore wind energy industry, and providing real-time information on harmful algal blooms.

High-Quality Underwater Imaging Data Achieved with Both Custom and Ready-Made Systems

Marcie Crosbie , SubC Imaging

Chad Collett, SubC Imaging; Katie Stoodley, SubC Imaging; Paul Smith, SubC Imaging

High-quality underwater imaging plays a crucial role in advancing marine research and conservation efforts. Whether opting to build an in-house camera system or purchase a commercial solution, the choice has significant implications for research efficiency and success. SubC Imaging supports both approaches and meets the unique demands of marine scientists, delivering robust imaging solutions that simplify data collection and accelerate research.

A key example of a ready-made solution is SubC's Tow Camera System, which was vital to a 2022 Antarctic expedition searching for the elusive colossal squid. The system, equipped with a 4K Rayfin Coastal camera and integrated LEDs and scaling laser, was deployed from a tourism vessel to capture footage of the squid in its natural deep-sea habitat. The expedition, organized by the non-profit KOLOSSAL, marked a significant step in researching the world's largest invertebrate, contributing to conservation efforts for the Southern Ocean and yielding over 60 hours of deep-sea footage. SubC's technology proved reliable in the extreme Antarctic environment, demonstrating its suitability for challenging marine research. This ready-to-use system enabled researchers to focus on their core work, bypassing the complexities and risks associated with custom-built solutions.

On the custom side, a SubC Rayfin Benthic camera was incorporated into a bespoke system used by the Marine Institute of Memorial University, leading to the discovery of a vibrant soft coral garden within the Funk Island Deep marine refuge in June 2024. Capturing high-definition footage of previously unknown marine biodiversity—including soft corals, sponges, and basket stars—the custom system revealed a critical cold-water habitat. This breakthrough discovery, led by Marine Institute researchers, highlights the adaptability of SubC’s technology for specialized research applications and contributes to the understanding of fragile marine ecosystems. The adaptability and reliability of SubC’s technology allowed researchers to uncover critical marine biodiversity with clarity, precision, and ease.

These case studies underscore the value of purchasing ready-made commercial systems from a trusted provider, while also highlighting how custom solutions can address specialized research needs. By opting for SubC Imaging’s advanced subsea cameras and systems, organizations gain reliable, high-resolution tools that reduce development time, enhance research outcomes, and support both off-the-shelf and bespoke applications for marine exploration and conservation.

High-resolution multispectral imaging of deep sea corals

Irene Hu , MBARI

Steve Litvin, MBARI (USA), litvin@mbari.org; Paul Roberts, MBARI (USA), proberts@mbari.org; Aaron Schnittger, MBARI (USA), aschnittger@mbari.org; Jim Barry, MBARI (USA), barry@mbari.org

Multispectral imaging, which provides spectral reflectance information for each pixel of a two-dimensional image, is a promising technique for studying the health and functionality of marine organisms. However, despite growing use in a variety of marine applications, its use in deep-sea studies has been limited by technical challenges. Here, we present the development of a macro-multispectral camera designed for the study of deep-sea corals. High-resolution reflectance information at multiple spectral bands can provide information on coral health, including potentially aiding in the delineation and quantification of gametes, or providing information on the degradation state of corals. Thus, this approach presents a new tool for advancing research and monitoring of deep-sea corals. The system uses high-powered LEDs for illumination and a monochrome camera as a detector, to provide high-resolution images of reflectance at eight different wavelengths ranging from 370 nm to 940 nm. Two iterations of the system have been developed: a laboratory version which can be used in controlled studies of known targets, and a deep-sea version for in situ deployment by ROV. The laboratory system utilizes a 25 MP camera capable of streaming images at 17 fps, and a total of 80 LEDs. The ROV system utilizes a 65 MP camera capable of streaming images at 35 fps, and a total of 500 LEDs. Both systems provide a field of view on the order of 5 cm and resolution on the order of 5-10 µm/px, and image at a distance of 20-30 cm from the object. In this poster, we share instrument details as well as examples of collected data.
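
For illustration, converting each LED band to relative reflectance typically uses dark-frame and white-reference normalisation, as in the sketch below. This is standard flat-fielding and not necessarily the exact calibration applied to this instrument; the eight-band loop is only an assumed usage pattern.

```python
import numpy as np

def reflectance(raw, dark, white):
    """Convert one spectral band to relative reflectance using dark-frame and
    white-reference images captured under the same LED band (flat-field
    normalisation). All inputs are 2-D arrays of matching shape."""
    dark = dark.astype(float)
    denom = np.clip(white.astype(float) - dark, 1e-6, None)  # avoid divide-by-zero
    return np.clip((raw.astype(float) - dark) / denom, 0.0, 1.5)

# Hypothetical usage: build an (8, H, W) reflectance cube for eight LED bands.
# cube = np.stack([reflectance(raw[b], dark[b], white[b]) for b in range(8)])
```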

Image curation in action – from the deep sea to the community

Sophie V. Schindler , GEOMAR Helmholtz Centre for Ocean Research Kiel

Henk-Jan Hoving, Jan Dierking, Julian B. Stauffer, Karl Heger & Timm Schöning; GEOMAR Helmholtz Centre for Ocean Research Kiel

Imaging has become a powerful marine scientific technique over the past decades for observing and monitoring ocean environments such as the sea floor and the open water column. Modern optical gear enables us to access the most remote ocean areas, and continuous technological advances lead to exciting new possibilities and discoveries – but also to new challenges in handling the increasingly large volumes of collected data. Marine imaging often involves multi-terabyte datasets which require adequate storage and server hardware and can limit curation and management during field work. Other challenges include structuring and standardizing on-site workflows and the efficient, sustainable curation of the gathered images. Here, we visualize a successful and efficient on-site image curation protocol, and identify challenges and suggestions for improvement. During the research expedition MSM126 “JellyWeb Madeira”, focusing on biodiversity and habitat exploration in the waters surrounding Madeira (Feb-March 2024), we gathered more than 14 TB of still image and video data, collected by the ROV PHOCA and the two towed camera platforms XOFOS (sea floor surveys) and PELAGIOS (midwater surveys). Pre-cruise and pre-deployment preparation covered an assessment of the gear used and the application and ongoing development of Best Practice Protocols ensuring the quality of technical settings as well as metadata collection. After each deployment, back-ups were created by transferring image data to a transportable MAMS (media access storage system) and archiving data on the ship's internal server. The ELEMENTS media software system was then used to manage the images on the MAMS. The GEOMAR Data Science Unit developed software for sustainable image curation, including a standardized workflow structure and a creation tool for FAIR (findable, accessible, interoperable and reusable) digital objects for images (iFDOs). These efforts aimed to have image data ready to use after fieldwork and to make the curated image data quickly accessible and publishable for researchers.

Imaging as an emergency need to understand jellyfish ecology and population dynamics along the coastal region of Kribi

Gisele Flodore Ghepdeu Youbouni , Specialized Research Station for Marine Ecosystems

Specialized Research Station for Marine Ecosystems

Jellyfish proliferations are increasing worldwide, with important socio-economic, environmental and health impacts. In many regions, such as West/Central Africa, insufficient data are available on jellyfish, which stand among the least studied marine zooplankton groups. In Cameroon, jellyfish research started in 2017 under an ongoing PhD project and has yielded the first reports on jellyfish taxonomy, diversity and ecology. This research has enabled the identification of four main jellyfish taxa, including Catostylus, Chimaerus, Chrysaora and Cyanea, as well as their seasonality and the physicochemical characteristics of their living environments. However, jellyfish are species with a metagenetic life cycle, and our research focused only on the mobile phase. The vision of this research is to improve jellyfish literacy and establish a jellyfish industry in the region as a solution to the huge jellyfish discards. Therefore, for the sake of sustainability, information regarding the polyp stage of jellyfish, jellyfish habitat, and their relationship with other marine species, especially fish and sea turtles, is needed. To achieve this purpose, marine imaging, which also represents the main method for studying the polyp stage in its natural environment, is highly recommended. This talk will give a preview of jellyfish taxonomy, diversity and ecology, and will present the challenges linked to studying jellyfish habitat and to better understanding their role in the food web.

Imaging marine snow: Hidden Comet-Tails of Marine Snow Impede Ocean-based Carbon Sequestration

Manu Prakash , Stanford University

Rahul Chajwa (Stanford), Eliott Flaum (Stanford), Kay D. Bidle (Rutgers), Benjamin Van Mooy (WHOI), Manu Prakash (Stanford)

The global carbon cycle ties together the living and non-living worlds, coupling ecosystem function to our climate. The gravity-driven downward flux of carbon in our oceans in the form of marine snow, commonly referred to as the biological pump, directly regulates our climate. The multi-scale nature of this phenomenon, the biological complexity of marine snow particles and the lack of direct observations of sedimentation fundamentally limit a mechanistic understanding of this downward flux. The absence of a physics-based understanding of the sedimentation of these multi-phase particles in a spatially and temporally heterogeneous ocean adds significant uncertainty to our carbon flux predictions. Using a newly invented scale-free vertical tracking microscope, we measure for the first time the microscopic sedimentation and detailed fluid-structure dynamics of marine snow aggregates in field settings. Microscopically resolved in-situ PIV of a large number of field-collected marine snow aggregates reveals a comet-tail-like flow morphology that is universal across a range of hydrodynamic fingerprints. Based on this dataset, we construct a reduced-order model of Stokesian sedimentation and viscoelastic distortion of mucus to understand the sinking speeds and tail lengths of marine snow dressed in mucus. We find that the presence of these mucus tails doubles the mean residence time of marine snow in the upper ocean, reducing overall carbon sequestration due to microbial remineralization. We set forth a theoretical framework within which to understand marine snow sinking flux, paving the way towards a predictive understanding of this crucial transport phenomenon in the open ocean.
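
As a point of reference for the reduced-order Stokesian model mentioned above, the classical settling velocity of a small, rigid sphere of radius $a$, excess density $\rho_p - \rho_f$, and fluid dynamic viscosity $\mu$ is

$$ v_s \;=\; \frac{2}{9}\,\frac{(\rho_p - \rho_f)\, g\, a^2}{\mu}. $$

This is included only as the standard starting point; the mucus comet tails described above modify this idealised picture, and the authors' full model is not reproduced here.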

Innovative Approaches to Great Lakes Benthic Mapping: Integrating AUV, ROV, and Drop Camera Data with Tator

Ayman Mabrouk , NOAA

Charles Menza and Tim Batista/NOAA

The Great Lakes, a vital freshwater resource, require precise and comprehensive benthic mapping to support ecological research, conservation efforts, and resource management. This presentation explores the innovative methodologies employed in the Great Lakes benthic mapping project, focusing on the integration of Autonomous Underwater Vehicles (AUVs), Remotely Operated Vehicles (ROVs), and drop camera systems. By leveraging the powerful video annotation platform Tator, we have enhanced the accuracy and efficiency of lakebed image analysis. Our approach combines high-resolution imagery from multiple sensors to create a detailed and comprehensive map of the lakebed. The AUVs provide extensive coverage and high-resolution data, while the ROVs offer maneuverability and detailed close-up imaging. Drop cameras complement these systems by capturing additional visual data in areas that are challenging to access with AUVs and ROVs. Tator’s robust annotation capabilities enable us to efficiently process and analyze vast amounts of video data, facilitating the identification and classification of benthic habitats and features. This multi-sensor integration, combined with advanced annotation techniques, offers a new paradigm in lakebed mapping, significantly improving our understanding of the Great Lakes' underwater environment. In this presentation, I will discuss the methodologies used, the challenges encountered, and the solutions implemented to overcome these challenges. I will also showcase preliminary results and highlight the potential applications of this innovative approach in future benthic mapping and marine research projects.

Low-Cost Stereo Imaging for In Situ Particle Sinking Speed Measurement on Autonomous Platforms

Fernanda Lecaros , MBARI (Monterey Bay Aquarium Research Institute)

Paul Roberts, MBARI (USA); Joost Daniels, MBARI (USA); Melissa Omand, University of Rhode Island (USA); Kakani Katija, MBARI (USA); Colleen Durkin, MBARI (USA)

The sinking speed of carbon particles in aquatic environments plays a crucial role in the Earth's carbon cycle and affects the efficiency of the ocean carbon pump, which transports carbon dioxide from the atmosphere to the deep ocean. Changes in sinking speed could impact this pump's effectiveness, ultimately influencing atmospheric carbon dioxide levels and, by extension, global climate dynamics. Measuring in situ particle sinking speed remains a significant challenge due to the small size and slow movement of particles, potential platform motion, and power limitations for camera and lighting operations. We present a low-cost imaging system designed for attachment to an isopycnal-following float, capable of capturing images of sinking particles in the water column. This dual-camera, dual-strobe system uses stereo imaging techniques to track the motion of sinking particles and precisely calculate their 3-D velocities within a specified volume of the water column. The imaging constraints have been extensively tested in simulations. Controlled tank experiments were used to image both synthetic particles and natural aggregates, demonstrating the instrument’s capability to quantify both particle size and sinking speed using a free-space imaging approach. Field measurements from a particle-rich coastal environment and their results will be discussed. Through this innovation, we aim to enhance our understanding of particle transport and its significance in ecosystem dynamics and the global carbon cycle.
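
As a rough illustration of how a calibrated stereo pair yields 3-D particle velocities (depth from disparity, back-projection, and finite differences between frames), consider the following sketch; the camera parameters and pixel coordinates are placeholders, not the instrument's actual calibration:

```python
import numpy as np

FOCAL_PX = 2400.0        # focal length in pixels (assumed)
BASELINE_M = 0.10        # stereo baseline in metres (assumed)
CX, CY = 1024.0, 768.0   # principal point in pixels (assumed)

def triangulate(u_left, v_left, u_right):
    """Back-project a matched particle from rectified left/right pixel coordinates to 3-D (m)."""
    disparity = u_left - u_right
    z = FOCAL_PX * BASELINE_M / disparity
    x = (u_left - CX) * z / FOCAL_PX
    y = (v_left - CY) * z / FOCAL_PX
    return np.array([x, y, z])

def velocity(p_t0, p_t1, dt_s):
    """3-D velocity (m/s) from the same particle matched in two consecutive stereo frames."""
    return (triangulate(*p_t1) - triangulate(*p_t0)) / dt_s

# Example: a particle matched in frames 1 s apart; +y is taken as downward here (convention assumed)
v = velocity(p_t0=(1100.0, 800.0, 1050.0), p_t1=(1101.0, 812.0, 1051.0), dt_s=1.0)
print("vertical component (m/s):", v[1])
```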

Lowering Barriers to Entry for 3D Imaging of Marine Invertebrates with a Low-Cost Build and Open Tools

Jamie Andersen M Fields , independent researcher

Jonathan Fay, Microsoft

The MBARI BioInspiration Lab’s 2023 “Lab 3DR” method was a huge step forward in rapidly collecting life-accurate data for 3D reconstruction of semi-transparent marine animals. Starting with the scanning parameters they had shown to be effective, we sought to develop a low-budget build with commodity parts and open-access software. An adjustable, minimum 0.4 mm width, 635 nm, 120 mW Uniform Powell Line Laser was acquired from Civil Lasers for illumination, and a Canon R8 camera with a 105 mm Sigma macro lens was used for imaging. 3D-printed mounts for the laser and camera were designed in Fusion 360. Software control for the rig was developed in Python and C#, which communicated with a Raspberry Pi Pico that controlled the stepper motors driving ball-screw-driven linear actuators. The total cost of the build was under $2,500, including everything except a laptop. Testing of the rig was conducted at Rosario Beach Marine Lab using a variety of local species including hydrozoan jellyfish, ctenophores, and the semi-transparent nudibranch, _Melibe leonina_. The software, parts list, and other documentation will be available on GitHub for others to reproduce the rig, image their own local species, and further adjust the methodology. Future directions include creating a 3D-printed tool for easier calibration, developing image post-processing tools in Python using the ffmpeg-python library to automate the process of cropping videos and extracting frames, as well as image stack post-processing to reduce artifacts before the data is segmented in 3D Slicer. Our hope is that making it easier to adopt this method will lead to wider-scale imaging and reconstruction of these fragile organisms, which would be valuable for education, natural history documentation, and other applications.
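
As an indication of the planned ffmpeg-python post-processing, a minimal sketch of cropping a scan video and extracting numbered frames is shown below; file names, crop window, and frame rate are placeholders rather than the rig's actual settings:

```python
import ffmpeg

def crop_and_extract_frames(video_path, out_pattern, x, y, width, height, fps=30):
    """Crop each frame to the laser-scan region and write numbered PNG frames."""
    (
        ffmpeg
        .input(video_path)
        .filter("crop", width, height, x, y)   # keep only the illuminated slice
        .filter("fps", fps=fps)                # resample to a fixed frame rate
        .output(out_pattern, start_number=0)
        .overwrite_output()
        .run(quiet=True)
    )

# Example: extract a 1024x1024 window from a hypothetical scan recording
crop_and_extract_frames("scan.mp4", "frames/frame_%05d.png", x=448, y=28, width=1024, height=1024)
```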

MINUET: Harmony from the deep - an educational and machine learning development project that generates scientific data

Deanna Soper , University of Dallas

Abigail Fritz (afritz@udallas.edu, University of Dallas), Jana Rocha (jrocha@udallas.edu, University of Dallas), Benjamin Woodward (benjamin.woodward@cvisionai.com, CVision AI), Kakani Katija (kakani@mbari.org, MBARI)

Recent efforts of the NOAA Office of Ocean Exploration and Research (OER) have yielded substantial quantities of underwater footage recorded by Remotely Operated Vehicles (ROVs), which currently takes a significant amount of time to analyze. The application of artificial intelligence (AI) has the potential to increase efficiency, reduce costs, and curate large biological data sets. In order to develop software capable of organism detection and identification, we engaged undergraduate students in AI training data set development and biological analysis. Undergraduate students used previously expert-annotated ROV video footage to localize organisms by drawing a box around each individual and then connecting the expert taxonomic identification to the organism. This had two outcomes: a biological data set that has revealed deep-sea coral abundance, distribution, and epibiont association patterns, and a training data set for AI development. Here we show that instructors, software developers, and scientists at research institutes can form close partnerships to provide technological development, education, and hands-on training for undergraduate students. These important experiences early in students' academic careers provide them with the skills needed to engage in the fields of oceanic research and data science.

Making use of unlabeled data: Comparing strategies for marine animal detection in long-tailed datasets using self-supervised and semi-supervised pre-training

Tarun Sharma , Caltech

Danelle E. Cline (MBARI) and Duane Edgington (MBARI)

This paper discusses strategies for object detection in marine images from the perspective of a practitioner working with real-world long-tail distributed datasets and a large amount of additional unlabeled data on hand. It discusses the benefits of separating the localization and classification stages, making the case for robustness in localization through the amalgamation of additional datasets, inspired by an approach widely used by practitioners in the camera-trap literature. For the classification stage, it compares strategies for using the additional unlabeled data, contrasting supervised, iteratively supervised, self-supervised, and semi-supervised pre-training approaches. Our findings reveal that semi-supervised pre-training, followed by supervised fine-tuning, yields a significantly improved balanced performance across the long-tail distribution, albeit occasionally with a trade-off in overall accuracy. These insights are validated through experiments on two real-world long-tailed underwater datasets collected by MBARI.
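
The separation of localization and classification that the paper argues for can be pictured with the following minimal inference sketch; the models and their interfaces are placeholders, not the paper's implementation:

```python
import torch

def detect_then_classify(image, localizer, classifier, class_names, score_thresh=0.5):
    """Run a class-agnostic localizer, then classify each proposed crop.

    image: CHW float tensor; localizer returns (boxes (N, 4), scores (N,)); classifier
    maps a 1CHW crop to class logits. All three are stand-ins for real models.
    """
    boxes, scores = localizer(image)
    results = []
    for box, score in zip(boxes, scores):
        if score < score_thresh:
            continue
        x0, y0, x1, y1 = (int(v) for v in box)
        crop = image[:, y0:y1, x0:x1].unsqueeze(0)
        label = class_names[int(torch.argmax(classifier(crop), dim=1))]
        results.append((box, label, float(score)))
    return results
```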

Marine Image Analysis with RootPainter

H. Poppy Clark , University of Aberdeen

Abraham George Smith (Department of Computer Science, University of Copenhagen, Universitetsparken 1, 2100 Copenhagen, Denmark) Daniel Mckay Fletcher (Rural Economy, Environment and Society, Scotland’s Rural College, West Mains Road, Edinburgh, EH9 3JG, Scotland, United Kingdom) Ann I. Larsson (Tjärnö Marine Laboratory, Department of Marine Sciences, University of Gothenburg, Strömstad, Sweden) Marcel Jaspars (Marine Biodiscovery Centre, Department of Chemistry, University of Aberdeen, Old Aberdeen, AB24 3DT, Scotland, United Kingdom) Laurence H. De Clippele (School of Biodiversity, One Health & Veterinary Medicine, University of Glasgow, Bearsden Road, G61 1QH, Scotland, United Kingdom)

RootPainter is an open-source and user-friendly interactive machine learning tool that is capable of analysing large marine image datasets quickly and accurately. This study tested the ability of RootPainter to extract the presence and surface area of the cold-water coral reef associate sponge species, Mycale lingua, in two datasets: 18,346 time-lapse images and 1,420 remotely operated vehicle video frames. New corrective annotation metrics integrated with RootPainter allow objective assessment of when to stop model training and reduce the need for manual model validation. Three highly accurate Mycale lingua models were created using RootPainter, with an average dice score of 0.94 ± 0.06. Transfer learning aided production of two of the models, increasing analysis efficiency from 6 to 16 times faster than manual annotation for time-lapse images. Surface area measurements were extracted from both datasets allowing future investigation of sponge behaviours and distributions. The ability of RootPainter to accurately segment the cold-water coral Lophelia pertusa was also tested in 3,681 ROV video frames, with model dice scores surpassing 0.80. Moving forward, interactive machine learning tools and model sharing could dramatically increase image analysis speeds, collaborative research, and our understanding of spatiotemporal patterns in biodiversity.
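
For reference, the dice scores quoted above correspond to the standard Sørensen-Dice coefficient between predicted and ground-truth masks, which can be computed as in this small sketch:

```python
import numpy as np

def dice_score(prediction, ground_truth):
    """Dice coefficient between two boolean segmentation masks of equal shape."""
    prediction = prediction.astype(bool)
    ground_truth = ground_truth.astype(bool)
    intersection = np.logical_and(prediction, ground_truth).sum()
    denominator = prediction.sum() + ground_truth.sum()
    return 1.0 if denominator == 0 else 2.0 * intersection / denominator

# Toy example with 2x2 masks
print(dice_score(np.array([[1, 1], [0, 0]]), np.array([[1, 0], [0, 0]])))  # ~0.667
```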

Microscopy in Motion in the Deep Ocean

Thom Maughan , MBARI

Paul L.D. Roberts, Brent Jones, Aaron Schnittger, Jon Erickson, Denis Klimov, Henry Ruhl, Steve Haddock, Francisco Chavez

Plankton are the foundation of the ocean food web, but observations are difficult to sustain at ecologically relevant time and space scales. Advances in in-situ imaging are expanding the toolkit of biological oceanographers, enabling better quantification of plankton communities as well as the detrital particles that together drive the ocean’s carbon cycle. This talk presents the engineering of "Planktivore," a deep-sea microscope for autonomous underwater vehicles (AUVs), and lessons learned in our first half year deploying it as an ocean instrument for science on the MBARI Long Range Autonomous Underwater Vehicle (LRAUV). Planktivore is a free-space-imaging dual microscope that spans a size range from tens of microns to centimeters at vehicle speeds of up to 1 meter per second. Storage of images is optimized using computer vision to identify and store regions of interest (ROIs) from the full image frame. Weeklong LRAUV deployments in Monterey Bay typically record 10 to 30 million plankton/particle ROIs. After deployment, the images are uploaded and processed on a GPU-enabled server, and software tools are used to tag plankton images with their scientific names for use in machine learning for automated plankton identification.
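
The on-vehicle ROI extraction step can be pictured with a simple dark-field sketch like the one below; thresholds and padding are illustrative and are not Planktivore's actual processing chain:

```python
import cv2

def extract_rois(frame_gray, min_area=50, threshold=30, pad=10):
    """Return cropped ROIs around bright connected regions in a dark-field frame."""
    _, mask = cv2.threshold(frame_gray, threshold, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    h, w = frame_gray.shape
    rois = []
    for contour in contours:
        if cv2.contourArea(contour) < min_area:
            continue
        x, y, bw, bh = cv2.boundingRect(contour)
        x0, y0 = max(x - pad, 0), max(y - pad, 0)
        x1, y1 = min(x + bw + pad, w), min(y + bh + pad, h)
        rois.append(frame_gray[y0:y1, x0:x1])
    return rois
```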

North Sea 3D – Image analysis for measuring biofouling biomass from industry ROV footage

John Halpin , Scottish Association for Marine Science

Joseph Marlow and Tom A. Wilding, Scottish Association for Marine Science

Continued developments in marine infrastructure have given rise to an increasing number of offshore man-made structures (MMS). This has become known as ‘ocean sprawl’. These structures are quickly colonised, becoming artificial reefs. To understand the role of these structures in marine systems, the colonising organisms need to be quantified, both by species and by biomass. To date, this quantification has been limited by the difficulty of performing such studies at the scales needed. In the North Sea 3D (NS3D) project we use existing footage collected by offshore energy operators with remotely operated vehicles (ROVs). This footage already exists in abundance, since structures are routinely surveyed by ROV for maintenance purposes. We use recent advances in image/video processing, Structure from Motion (SfM) photogrammetry and semantic segmentation with convolutional neural networks (CNNs), to generate 3D models of marine growth classified by taxon from standard 2D ROV video footage. These are then calibrated and used for biomass quantification. In this talk, the methodology of the associated image analysis is discussed, including the challenges of underwater photogrammetry and automated species identification, particularly when ROV footage is not acquired specifically for this task. A 3D headset will accompany the talk, letting attendees ‘walk around’ the 3D datasets we have generated.

OCEAN SPY’s citizen annotation data

Catherine Borremans , IFREMER (Biology and Ecology of dEEP marine ecosystems Unit (BEEP)/Deep Sea Lab)

Marjolaine Matabos (IFREMER), Pierre Cottais (IFREMER), Julie Tourolle (IFREMER), Antoine Carlier (IFREMER), Séverine Martini (MIO)

For the purpose of assessing and monitoring the conservation state of marine ecosystems, scientists are deploying seabed observatories and using mobile devices to acquire temporal and spatially resolved biodiversity data from imagery, complementing traditional 'stationary' sampling approaches. These monitoring programs make it possible to gather a large amount of data, particularly underwater images, representing huge volumes of data that are difficult to process. Artificial intelligence (AI) has enabled the development of algorithms to facilitate the processing of large datasets. However, the ability of machines to detect and classify objects automatically for scientific purposes depends on a learning phase based on large reference databases that have been built manually by human brains, a highly time-consuming task. In this context, involving citizens in collecting training data is an interesting solution, thanks to the important observation power that the crowd represents (Matabos et al., 2024, Preprint). Launched in 2023, the Ocean Spy project aims to engage citizens in the process of marine image annotation. It provides a web platform allowing the general public to access images collected in various marine habitats, from shallow to deep waters. Different tools and functionalities were designed to guide users in locating and identifying animals (or other subjects of interest) in the images through the annotation interface. Beyond the development of the online portal and the associated database, such an initiative requires methods for pre-processing and validating the data generated by citizen annotation, in particular to identify commonly detected organisms (resulting from multiple annotations) and to clean the database according to a threshold of agreement between participants. These are essential steps in the production of reliable, high-quality scientific knowledge, and in boosting the performance of AI methods for identifying taxa. General statistics on this first year of operation and the data analysis pipeline being fine-tuned will be presented at the workshop. Ref: Matabos, Marjolaine; Cottais, Pierre; Leroux, Riwan; Cenatiempo, Yannick; Gasne-Destaville, Charlotte; Roullet, Nicolas; Sarrazin, Jozée; Tourolle, Julie; Borremans, Catherine. Deep Sea Spy: An Online Citizen Science Annotation Platform for Science and Ocean Literacy. Available at SSRN: https://ssrn.com/abstract=4848325 or http://dx.doi.org/10.2139/ssrn.4848325
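
One simple way to turn multiple citizen annotations into consensus detections, in the spirit of the agreement threshold described above (the actual Ocean Spy pipeline may differ), is to cluster nearby clicks and keep clusters marked by a minimum fraction of participants:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def consensus_points(points, n_participants, radius_px=30, min_agreement=0.5):
    """points: (N, 2) click coordinates from all participants on one image."""
    points = np.asarray(points, dtype=float)
    if len(points) == 0:
        return []
    if len(points) == 1:
        return [points[0]] if 1 / n_participants >= min_agreement else []
    labels = fcluster(linkage(points, method="single"), t=radius_px, criterion="distance")
    consensus = []
    for label in np.unique(labels):
        cluster = points[labels == label]
        if len(cluster) / n_participants >= min_agreement:
            consensus.append(cluster.mean(axis=0))   # centroid of agreeing clicks
    return consensus

# Example: 10 participants; only the cluster of clicks near (100, 101) passes the 50% threshold
clicks = [(100, 102), (98, 101), (103, 99), (99, 100), (101, 103), (400, 250)]
print(consensus_points(clicks, n_participants=10))
```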

Object Detection of Benthic Morphospecies using Synthetic Data and Self-Training Domain Adaptation

Heather Doig , University of Sydney

Oscar Pizarro (Norwegian University of Science and Technology, University of Sydney); Stefan Williams (University of Sydney)

Scientists can observe benthic morphospecies using images taken from underwater robotic vehicles. The presence and quantity of morphospecies like endangered handfish or invasive sea urchins can be collected using neural network object detectors on high-resolution images. The detector is trained with labelled images, but its performance can suffer from two issues. First, the detector can degrade when used on images captured with different cameras, water conditions, locations, or times than the source training images, a problem known as domain shift. Second, there may be very few labelled examples of the morphospecies of interest, far fewer than are needed to train a supervised model. We propose a framework to train a benthic morphospecies detector using synthetic underwater images followed by unsupervised domain adaptation to bridge the domain gap from synthetic to real. We generated synthetic images with a customised version of Infinigen (infinigen.org) from the Princeton Vision & Learning Lab, which creates natural scenes with annotation data for training. We have customised Infinigen to mimic images from underwater robots, modelling the attenuation of light and including species of interest, such as invasive sea urchins. We train an object detector using synthetic source images, followed by an adaptation step with unlabeled target images. We tested our approach on sea urchins and other species of interest and successfully detected these species in visually complex seafloor images. The framework offers a novel approach for detecting rare and endangered morphospecies using increasingly popular 3D modelling software.
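
A minimal sketch of the self-training adaptation step could look like the following; the confidence threshold and the detector interface are placeholders for whatever detection framework is used:

```python
def self_training_round(detector, unlabeled_images, confidence_thresh=0.8):
    """One adaptation round: harvest confident detections on real images as pseudo-labels,
    then fine-tune the synthetic-trained detector on them. `predict` and `train` are
    stand-ins for the methods of an actual detector implementation."""
    pseudo_labeled = []
    for image in unlabeled_images:
        detections = detector.predict(image)           # list of (box, label, score)
        confident = [d for d in detections if d[2] >= confidence_thresh]
        if confident:
            pseudo_labeled.append((image, confident))
    detector.train(extra_data=pseudo_labeled)          # fine-tune on synthetic + pseudo-labels
    return detector, len(pseudo_labeled)
```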

Ocean Observatories Initiative Imaging Options and Expansion Capabilities

Michael F. Vardaro , University of Washington, School of Oceanography

Katharine T. Bigham & Deborah S. Kelley, University of Washington, School of Oceanography

The NSF’s Ocean Observatories Initiative (OOI) Regional Cabled Array (RCA) has been streaming real-time data to shore from a diverse array of 150 instruments since 2014. The electro-optical cable network spans the Juan de Fuca tectonic plate. One backbone cable extends ~480 km west of the Oregon coast to Axial Seamount, an active undersea volcano, and a southern branch runs 208 km along the base of the Cascadia Subduction Zone (2900 m depth) and then turns east towards the Oregon Shelf (80 m depth), crossing the Cascadia Margin. Among the cabled instruments are six SubC digital cameras that capture 3 images every 30 minutes at various seafloor and water column sites. An HD Video camera is aimed at the 300 °C Mushroom vent in the ASHES hydrothermal field within the highly dynamic caldera of Axial Seamount, located 300 miles offshore (1500 m depth). The HD camera alone has collected 47,000 hours of video, imaging the entire venting edifice every 3 hours. All OOI-RCA imagery is open access and available for viewing and downloading via the recently upgraded OOI Data Explorer image and video gallery (dataexplorer.oceanobservatories.org). Since 2016, the OOI-RCA submarine observatory has also hosted a diverse array of cabled instruments through externally funded PI projects, including multibeam sonars and a 4K camera installed at Southern Hydrate Ridge by a group from MARUM to study methane bubble plume dynamics, environmental impacts on venting, and chemosynthetic communities (Marcon et al., G3, 2021). The RCA power and bandwidth support extensive expansion capabilities, and NSF encourages the addition of instruments and platforms onto the observatory. We continue to improve our image gallery annotation and quality control procedures, and several active projects are using automated event detection, machine learning, and computer vision techniques to sort, label, and sift data out of the OOI image archive.

PAMS, a low-cost, simple-to-assemble seafloor imagery solution for monitoring corals

Thierry Baussant , NORCE Norwegian Research Centre

Christian Andreas Hansen, Alan Le Tressoler, Junyong You, Gro Fonnes, Xue-Cheng Tai, Morten Kompen*, Christina Rørvik - NORCE; * NORCE and Compelling AS

NORCE has developed PAMS (Polyp Activity Monitoring System), an innovative camera-based sensing platform and environmental-effect methodology to monitor coral welfare in the natural environment and impacts resulting from use of the ocean space. The system comprises a hardware and software delivery, with the main value attributed to expert and machine learning (ML)-based interpretation of data. Based on off-the-shelf components, PAMS can be regarded as an effective, affordable, simple-to-assemble yet modular and customizable tool for users who want to monitor corals (shallow and deep) and assess their status. Whilst solutions exist to measure coral endpoints in shallow waters, there is still a technology gap for deepwater corals, and in general it is difficult to find an appropriate parameter for monitoring corals in a non-intrusive manner that reflects any influences from their surrounding environment. High-resolution still photos can support the non-intrusive and specific identification of polyp behavioral change on corals exposed to particle plumes or to changing conditions in the water. The PAMS technology is based on years of lab research and observations dedicated to specific analysis of coral endpoints, and more specifically coral behavioral endpoints via polyp activity. It is currently being developed using specific ML modules for automatic coral recognition and classification of coral status according to polyp activity, using a high-throughput process independent of user expertise. Currently at Technology Readiness Level (TRL) 5-6, the vision for PAMS encompasses both full development and qualification for industrial applications as well as the advancement of research on marine seafloor environments where coral is found. In this workshop, we will present a detailed description of the current hardware and the status of the software, including an open-source data visualization platform for displaying and analyzing coral images. The PAMS project is funded by the Research Council of Norway.

Planktwin -- Physically Faithful Synthetic Imagery from Digital Twins of Plankton

David Nakath , Kiel University and GEOMAR Helmholtz Centre for Ocean Research Kiel

Xiangyu Weng (GEOMAR Helmholtz Centre for Ocean Research Kiel), Jan Taucher (GEOMAR Helmholtz Centre for Ocean Research Kiel), Elisa Wendt (Kiel University), Veit Dausmann (GEOMAR Helmholtz Centre for Ocean Research Kiel)

Plankton observation is a ubiquitous task in underwater imaging. A multitude of imaging systems exists, covering several optical and acoustic imaging modalities, from macroscopic to microscopic, dark to bright field, and perspective to orthographic projections. The resulting variety of imaging data hinders standardized workflows in downstream tasks like species classification, trait classification, and continuous-valued trait regression. The key challenge remains to disentangle the semantic information of the individual plankton objects from the underlying imaging-domain-specific information. To overcome this, we present a rendering-based system that can synthesize physically faithful image data of plankton in different underwater environments and with different camera types. Specifically, our pipeline features a model-learning step that disentangles plankton 3D representations from the light and optical conditions under which they were captured. This model can then be submerged in standardized Jerlov water bodies, defined by measurable attenuation and scattering parameters, under different light fields such as homogeneous sun illumination, highly heterogeneous artificial setups like light sheets, shadowgraphy, or even a mixture of all of these. Finally, the resulting scene can be captured with different simulated optical systems. This approach allows for the training of AI networks to understand the information inherent to plankton, which can then be transferred to other imaging domains or used as digital twins. It is also possible to test newly developed approaches against image data with fully known ground truth, which is barely available in underwater scenarios or otherwise has to be obtained by annotation. It also enables the design of new plankton imaging systems by virtually submerging hypothetical setups and inspecting the resulting imagery in different simulated water columns. Finally, as the system is a physically based implementation in an autodiff framework, the model parameters can be optimized with respect to, e.g., image quality criteria and implemented in an actual system.
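
A heavily simplified flavour of the "submerge in a water body" step, reduced to per-channel Beer-Lambert attenuation plus a veiling colour, is sketched below; the actual pipeline is a full physically based renderer, and the coefficients here are only loosely inspired by Jerlov-type water:

```python
import numpy as np

ATTENUATION_RGB = np.array([0.45, 0.07, 0.05])   # assumed per-channel attenuation (1/m)

def attenuate(image_rgb, path_length_m, water_rgb=(0.0, 0.05, 0.10)):
    """Attenuate a linear-RGB image over a water path and blend toward a veiling colour."""
    transmission = np.exp(-ATTENUATION_RGB * path_length_m)   # per-channel Beer-Lambert term
    return image_rgb * transmission + np.asarray(water_rgb) * (1.0 - transmission)

# Example: a mid-grey target imaged through 3 m of water loses most of its red channel
print(attenuate(np.full((1, 1, 3), 0.5), path_length_m=3.0))
```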

Publicly available and freely accessible Imagery AI Training Data set repository

Errol Ronje, NOAA National Centers for Environmental Information

Megan Cromwell (NOAA National Ocean Service), Vidhya Gondle, Yee Lau, Kirsten Larsen, Hassan Moustahfid, Rachel Medley (NOAA Ocean Exploration), Mashkoor Malik (NOAA Ocean Exploration), Patrick Cooper (University Corporation for Atmospheric Research UCAR / NOAA Ocean Exploration Affiliate), Kakani Katija, Brian Schlining (Monterey Bay Aquarium Research Institute)

Efforts to advance artificial intelligence (AI) applications, particularly in underwater image data analysis, are hindered by the absence of a curated, long-term repository for AI training data. The lack of such resources constrains collaboration and knowledge sharing across various entities, including governmental organizations, research institutions, and industries. In order to address this community-wide need for a robust, standardized, and openly accessible training data set, NOAA National Centers for Environmental Information (NCEI), NOAA Ocean Exploration, and the Monterey Bay Aquarium Research Institute have agreed to co-design and develop a public and freely available underwater imagery training data repository at NCEI using curated data from MBARI’s FathomNet project. This initiative will allow data and appropriate metadata submitted to FathomNet to be transferred to NCEI for web-accessible hosting and continuous community access. Dedicated pipeline development has begun with automation of image and metadata extraction from FathomNet image banks for hosting on a partner-managed web-accessible server. The FathomNet data and metadata will be combined into a version-controlled bi-annual archival package within NCEI and assigned a digital object identifier (DOI). This facilitates citability and replication of the database and image collection over time. This process is a pioneering effort to create, collect, and maintain AI-ready data for the oceanographic imaging community. The goal is to foster collaboration and innovation, empowering NOAA and its partners to leverage AI technologies effectively for advancing environmental research and stewardship.

Returning to the Deep: Maximizing Insights from Deep-Sea Coral and Sponge Imagery from the U.S. Pacific Islands

Savannah Goode , Deep Sea Coral Research & Technology Program, NOAA

Kiki Kamakura (Barnard College, NOAA Pacific Islands Fisheries Science Center), Beth Lumsden (NOAA Pacific Islands Fisheries Science Center), Heather Coleman (Deep Sea Coral Research & Technology Program, NOAA)

One major disadvantage to seafloor image analysis is the bottleneck created by the time required to manually annotate imagery for scientific study. It is thus uncommon for imagery to be re-analysed after the original study is completed. Consequently, it is difficult to combine imagery datasets to conduct broader-scale investigations, which can be particularly important when studying habitats that are difficult to survey, such as the deep-sea benthos. A second issue that has arisen from limited re-annotation of seafloor imagery is that many existing image datasets are composed primarily of images with non-localized annotations (i.e., records of different taxa and/or individuals not differentiated spatially within a single image). Such images cannot be readily incorporated as training data for automated annotation pipelines, nor can they be easily used by non-experts for identification purposes. We sought to address these issues of non-standardized, non-localized annotations using photos obtained from [NOAA’s National Database of Deep-Sea Corals and Sponges](https://www.ncei.noaa.gov/maps/deep-sea-corals/mapSites.htm), focusing on the U.S. Pacific Islands region due to the abundance of high-resolution images available. Using BIIGLE 2.0 (an online image annotation platform), 14,226 images were initially assessed to determine whether they comprised a single taxon or a wider ‘community’. Images labeled ‘community’ (3,230 total) were selected for re-analysis. Each image was annotated to characterize the visible substrate (using the Coastal and Marine Ecological Classification Standard ([CMECS](https://iocm.noaa.gov/standards/cmecs-home.html)) classification) and to localize the taxon of interest (using rectangular bounding boxes). This dataset will be used to: (1) investigate taxon-substrate relationships in this region, (2) provide representative _in situ_ taxon images for non-experts, (3) train automated classification tools, and (4) inform a national guide to identify important and vulnerable deep-sea habitats, currently in development by the NOAA Deep Sea Coral Research & Technology Program and the Bureau of Ocean Energy Management.

SQSIZE - Stereographic Quantification and Shadowgraph Imaging of Zooplankton and their Environment

Mehul Sangekar , X-STAR, JAMSTEC

Ariell Friedman, Greybits Engineering (Australia), ariell@greybits.com.au ; Dhugal Lindsay, X-STAR, JAMSTEC (Japan), dhugal@jamstec.go.jp 

The SQSIZE (Stereographic Quantification and Shadowgraph Imaging of Zooplankton and their Environment) system comprises two autonomous imaging systems and is designed to be deployed simultaneously on a 36-bottle CTD rosette system. It was designed primarily for gathering data on the particle field in near real time at areas being assessed for their suitability for, or currently being exploited for, deep-sea mineral extraction. We describe the system hardware and the software pipelines, which are tightly integrated to minimize the time necessary between retrieval of the CTD system and visualization of the quantitative data on particles, plankton and larger nekton imaged by the system.

Spatial distribution and hotspots of habitat forming coral and sponge communities of the Hawaiian – Emperor Seamount Chain: Use of marine imagery for spatial analyses

Virginia Biede , Florida State University

Nicole B Morgan, NOAA ; Amy R Baco, Florida State University

Deep-sea benthic community spatial distribution is a fundamental parameter for understanding communities at multiple scales. Spatial analyses inform spatial management of vulnerable ecosystems, like sponge and coral communities on seamounts exposed to bottom-contact fishing. To gain a better understanding of patch sizes on North Pacific seamounts, an existing dataset derived from surveys with the AUV Sentry was reanalyzed. The Sentry data included annotations from surveys of seafloor communities between 200 m and 800 m depth on seamounts of the Hawaiian-Emperor Seamount Chain. Sentry images were annotated for coral, sponge, crinoid, and brisingid presence, and categorized for density as sparse, medium, or abundant for each taxon. A total of 4,627 individual patches were found when the taxa were analyzed individually, 3,860 of which were coral patches. For all taxa combined, 4,399 patches were found. For most patch analyses, the largest patch was found on Koko Guyot, with a linear length of over 2 km for all fauna combined, over 1 km for the largest coral patch, and 988 m for crinoids. For sponges the largest patch was 78 m on Bank 11, and for brisingids the maximum patch size was 109 m on Kammu. Patch size of the coral, crinoid, and all-fauna groups varied by seamount (coral and crinoids: p < 0.001; all fauna: p < 0.01) and depth (p < 0.0001 for all three), but no significant pattern was found for brisingids or sponges. A second scale of analysis used Getis-Ord statistics to find hotspots of faunal abundance. Using the average nearest-neighbor distance and the median distance between patches, a neighborhood size was determined for each fauna group. A total of 850 hotspots were found across the seven seamounts, and both neighborhood size and hotspot size varied substantially by taxon, from 500 m² to 95,284 m². Future discussions of marine imaging should incorporate needs for spatial analyses at multiple scales.
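
A Getis-Ord Gi* hotspot analysis of this kind can be run with PySAL as sketched below; the coordinates, abundances, and 500 m neighbourhood are dummy placeholders, whereas the study derived neighbourhood sizes per fauna group from nearest-neighbour and median inter-patch distances:

```python
import numpy as np
from libpysal.weights import DistanceBand
from esda.getisord import G_Local

coords = np.random.rand(200, 2) * 2000          # patch centroid positions (m), dummy data
abundance = np.random.poisson(3, size=200)      # faunal abundance per patch, dummy data

w = DistanceBand(coords, threshold=500, binary=True)      # assumed neighbourhood of 500 m
gi_star = G_Local(abundance, w, star=True, permutations=999)

hotspots = (gi_star.Zs > 1.96) & (gi_star.p_sim < 0.05)   # significant high-value clusters
print(f"{hotspots.sum()} hotspot patches out of {len(coords)}")
```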

Synthesize Large-scale in situ Images for Training Deep Plankton Detection Algorithms

Jianping Li , 1. Shenzhen Institute of Advanced Technology, CAS, Shenzhen, China; 2. University of CAS, Beijing, China.

Zhenping Li 2,1. 1. Shenzhen Institute of Advanced Technology, CAS, Shenzhen, China; 2. University of CAS, Beijing, China.

The content distribution of full-frame raw images captured by in situ plankton imaging instruments is highly complex and variable, easily affected by multiple factors such as the properties of seawater; the quantity, taxonomy, posture, and morphology of underwater objects; and changes in camera focusing status and working settings. Therefore, most plankton observation methods based on in situ imaging divide the recognition task into two stages: the first locates the objects in the raw full-frame images and crops out region-of-interest (ROI) vignettes containing only one target each; the second classifies the ROI images to achieve automatic plankton recognition. However, raw image complexity can significantly impact the performance of two-stage plankton target cropping and recognition algorithms, and may even cause serious errors in the observation of the plankton of interest. Although existing in situ imaging instruments have already collected massive full-frame image data under different sea conditions, the variety of their content poses great challenges for manual annotation, making it almost impossible to obtain large-scale, high-quality datasets to train usable end-to-end deep CNN models for plankton detection. We note that it is possible to utilize deep generative models to perform large-scale raw image synthesis using a limited amount of full-frame images and annotated ROI data. Based on the images collected by our dark-field underwater imager, the IPP (Imaging Plankton Probe), across different sea regions in China and Australia, we propose a method that can quickly build a massive full-frame image dataset, IsPlanktonGC, with definite object-identity ground truth. Preliminary experimental results show that the synthesized images are visually very similar to the true in situ collected ones. This strategy is expected to be very helpful in providing a big-data foundation for developing robust downstream object detection algorithms to better observe marine plankton.
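
The bookkeeping behind such synthesis, stripped of the generative model itself, amounts to compositing annotated ROI vignettes onto empty full-frame backgrounds while recording the pasted boxes as ground truth; a schematic sketch (not the proposed method) is:

```python
import random
import numpy as np

def composite_frame(background, roi_vignettes):
    """Paste ROI vignettes at random positions; return the frame and its ground-truth boxes."""
    frame = background.copy()
    boxes = []
    frame_h, frame_w = frame.shape[:2]
    for roi in roi_vignettes:
        roi_h, roi_w = roi.shape[:2]
        x = random.randint(0, frame_w - roi_w)
        y = random.randint(0, frame_h - roi_h)
        region = frame[y:y + roi_h, x:x + roi_w]
        np.maximum(region, roi, out=region)      # dark-field assumption: bright objects dominate
        boxes.append((x, y, x + roi_w, y + roi_h))
    return frame, boxes
```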

Tackling exponential increases in deep-sea video data: VARS + AI

Lonny Lundsten , Monterey Bay Aquarium Research Institute

Nancy Jacobsen Stout (MBARI), Kyra Schlining (MBARI), Kristine Walz (MBARI), Larissa Lemon (MBARI), Megan Bassett (MBARI), Brian Schlining (MBARI), Kevin Barnard (MBARI), Danelle Cline (MBARI), Duane Edgington (MBARI)

The Monterey Bay Aquarium Research Institute (MBARI) has made significant advancements towards incorporating machine learning (ML) into the annotation and analysis of our 37-year archive of deep-sea video. Central to these efforts is the integration of the Video Annotation and Reference System (VARS) with advanced ML tools and the development of a new machine-assisted annotation workflow. VARS is a software system developed by MBARI to annotate and manage video data collected from remotely operated vehicles (ROVs), autonomous underwater vehicles (AUVs), and other camera platforms. Using VARS, researchers create detailed annotations of marine organisms, geological features, and habitats from deep-sea video footage using a structured format within a searchable relational database. Visual observations are merged with additional data collected during vehicle deployments, such as salinity, temperature, position, and depth. Historically, MBARI’s Video Lab staff have manually annotated all deep-sea videos recorded by MBARI's ROVs in a consistent manner. However, the recent, exponential increase in camera platforms used for conducting surveys has necessitated the development of innovative workflows to handle this massive influx of visual data, which is rapidly outpacing the feasibility of manual processing. To address this, we have developed new tools and workflows for 1) localizing ML model training data, 2) generating ML proposals rapidly and inexpensively, and 3) validating and editing ML generated proposals. To date, we have produced 500,000 deep-sea localizations for ML training and used the subsequent ML models to generate detections on 200 hours of ROV video and 800+ hours of AUV video (PiscivoreCam, i2MAP). These efforts underscore MBARI's commitment to utilizing cutting-edge technology to advance marine research, ultimately providing critical insights into ocean health and ecosystems while fostering collaboration among scientists.

The MBARI Low Altitude Survey System for 1-cm-scale seafloor surveys in the deep ocean

David W. Caress , MBARI

Eric J. Martin, Jennifer B. Paduan, Michael Risi, Giancarlo Troni, Andrew Hamilton, Chad Kecy, Mauro Candeloro, Justin Tucker

The Monterey Bay Aquarium Research Institute (MBARI) has developed a Low Altitude Survey System (LASS) to conduct 1-cm-scale seafloor surveys of complex terrain in the deep ocean. The LASS is integrated with Remotely Operated Vehicles (ROVs), which are operated at a 3-m standoff to obtain 5-cm lateral resolution bathymetry using a 400 kHz multibeam sonar, 1-cm resolution bathymetry using a wide-swath lidar laser scanner, and 3-mm/pixel resolution color photography using stereo still cameras illuminated by strobe lights. Surveys are typically conducted with 3-m line spacing and 0.2-m/s speed, and executed autonomously by the ROV. The instrument package is mounted on a rotating frame that is actively articulated, keeping the sensors oriented normal to the seafloor. The strobe lights, mounted on swing-arms on either side of the ROV, similarly rotate to face the seafloor. Areas of 120 m x 120 m can be covered in about 8 hours. We present three examples from surveys of (i) deep-sea soft coral and sponge communities from Sur Ridge, offshore Central California, (ii) a warm venting site hosting thousands of brooding octopus near Davidson Seamount, also offshore Central California, and (iii) a high temperature hydrothermal vent field on Axial Seamount, on the Juan de Fuca Ridge. An advantage of combining optical and acoustic remote sensing is that the lidar and cameras map soft animals while the multibeam sonar only maps the solid seafloor. Calculating the difference between lidar bathymetry and multibeam bathymetry shows the location and approximate size of soft animals such as sponges and deep-sea corals. The long-term goal for this sensor suite is to field it from a hover capable autonomous platform rather than ROVs, enabling efficient 1-cm-scale seafloor surveys in the deep ocean.
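
The lidar-minus-multibeam differencing mentioned above reduces, on co-registered grids, to a simple height comparison; a toy sketch (grid handling and the 5 cm threshold are illustrative only) is:

```python
import numpy as np

def soft_fauna_mask(lidar_grid, multibeam_grid, min_height_m=0.05):
    """Boolean mask of cells where lidar bathymetry stands above multibeam bathymetry."""
    difference = lidar_grid - multibeam_grid          # metres; same grid and datum assumed
    return np.where(np.isnan(difference), False, difference > min_height_m)

# Toy 3x3 grid with one 12-cm-tall "sponge" and one unmapped cell
lidar = np.array([[0.0, 0.0, 0.0], [0.0, 0.12, 0.0], [0.0, 0.0, np.nan]])
print(soft_fauna_mask(lidar, np.zeros((3, 3))))
```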

Through their eyes: adaptations of hyperiid amphipod eyes for the midwater

Karen J. Osborn , Smithsonian National Museum of Natural History

Jan M. Hemmi, University of Western Australia Oceans Institute

Vision in the midwater (the open ocean below the photic zone) is challenging given the relatively high attenuation of light over short distances, which results in a highly structured light field that is periodically interrupted by bioluminescence. This unfamiliar light field has led to visual adaptations in many midwater animals that can provide inspiration for underwater and low-light imaging. Hyperiid amphipods are a small group of midwater crustaceans notable for the diversity of their visual systems and their importance in the midwater food web, where they are both abundant prey and voracious predators. Having compound eyes allows hyperiids to drape their photoreceptor sheets over any shape imaginable, creating locally optimized regions in visual space for different tasks. This flexibility has led to a unique array of visual adaptations. By studying the anatomical and functional differences between the different eye designs, we learn what it takes to see in the midwater and, subsequently, how successful visual predators image this challenging habitat. To determine their visual field composition, we reconstruct the 3D eye anatomy of hyperiids from microCT scans. A custom neural network allows us to map a compound eye’s visual field in a matter of hours to days, rather than months. Combining the resulting “map” information with physiological measurements from live animals, we model what each hyperiid eye sees and how their vision changes through the light field. We test and interpret these models against what we are learning of hyperiid behavior from in situ observations and a newly designed, immersive, visual experimental chamber, to understand what each eye design is “optimized” for. This work provides a native view of the midwater and a better understanding of how animals optimize vision in the midwater under extreme size and energy constraints.

Towards Species-Level Detection of Caribbean Reef Fish

Levi Cai , WHOI

Austin Greene (WHOI), Daniel Yang (MIT and WHOI), Nadege Aoki (MIT and WHOI), Sierra Jarriel (WHOI), T. Aran Mooney (WHOI), Yogesh Girdhar (WHOI)

Visual detectors and classifiers have shown remarkable utility for animal conservation and monitoring tasks by providing rapid estimates of species abundance and biodiversity, and can also reduce labeling effort during data preprocessing. These approaches have been proposed for animals in underwater environments, such as the deep ocean with FathomNet [1]. However, their deployment in coral reef environments still faces many obstacles. As in other marine environments, there is a lack of previously annotated data, and the ecological nature of the data presents a long-tail distribution problem in which observations of the many less common species are difficult to obtain. Distinct from the more controlled setting of imaging animals in the deep ocean, coral reefs present a visually rich and complex environment. Their location in shallow waters requires visual detectors to cope with significant variations in lighting, while corals themselves provide visually confounding characteristics to ecologically interesting species. We have built a new dataset for training and evaluating visual detectors and classifiers at the species level in coral reef environments. It consists of fish-species labels from diver transect videos across 5 different reef sites spanning over 8 years in the U.S. Virgin Islands. In total there are 162 videos, from which we label 14K images. We also provide tracking information for every fish across each video, enabling novel insights into tracking challenges. Furthermore, we examine the entire pipeline, from data pre-processing, labeling, and training, to deployment. We investigate how best to utilize pre-existing fish detection datasets when applied to our specific domain, either as supplemental data or in transfer learning. We further show results of detectors trained on our dataset that are subsequently deployed on stationary camera or AUV data from the same or nearby sites, to demonstrate the utility and the challenges that still remain in deploying these systems even in similar locations. [1] K. Katija et al. “FathomNet: A global image database for enabling artificial intelligence in the ocean” Nature Scientific Reports, 2022.

Towards an AIoT-based Marine Plankton Imaging Network

Jianping Li , 2. Shenzhen Institute of Advanced Technology, CAS, Shenzhen, China; 3. University of CAS, Beijing, China; 4. Xcube Technology Co, Ltd., Shenzhen, China

Zekai Zheng 1,2, Zhenping Li 2,3, Chi Liu 2, Junqiang Jiang 2,4, Peng Liu 2,3, Liangpei Chen 2,3, Shunming Chen 2,4, Zhisheng Zhou 2,3, Ming Zhu 2,3, Haifeng Gu 5, Jiande Sun 1, Jixin Chen 6. 1. Shandong Normal University, Jinan, China; 2. Shenzhen Institute of Advanced Technology, CAS, Shenzhen, China; 3. University of CAS, Beijing, China; 4. Xcube Technology Co, Ltd., Shenzhen, China; 5. Third Institute of Oceanography, Ministry of Natural Resource, Xiamen, China; 6. Xiamen University, Xiamen, China.

In situ observation of plankton based on underwater imaging should theoretically have real-time and continuous advantages. However, current methods rely almost entirely on post-processing to convert image data into observational information. This results in significant delays in information acquisition and exerts enormous pressure on the transmission network bandwidth and real-time processing capabilities required for the massive amounts of data collected from remote instruments. To address these issues, we have developed an Artificial Intelligence of Things (AIoT) marine plankton imaging system as a proof of concept, combining edge computing and cloud computing to network multiple underwater dark-field imagers (imaging plankton probes, IPPs) distributed across various locations, aiming to leverage the high temporal resolution of in situ observations while expanding the spatial coverage of ocean observations. At the edge, a client imager first captures in situ images and then processes them through operations such as object detection, depth-of-field (DOF) extension, and feature extraction. On the cloud side, the server supports complex plankton data management and AI retrieval functions, with an intelligent application layer providing user management, device control and monitoring, and real-time data analysis and visualization. Compared to traditional methods, this AIoT-based multi-location underwater imaging approach has significant advantages. The system optimizes computing and storage resources, enhances real-time data processing and security, improves transmission reliability, and reduces deployment and maintenance costs. By deploying multiple IPPs at various locations in China and Australia for periods ranging from one to eight months, we have preliminarily demonstrated the capability of this innovative AIoT-based paradigm to monitor complex marine planktonic dynamics, such as the bloom and decay of Noctiluca scintillans. This offers a more cost-effective solution for expanding global marine plankton observation capabilities in the future.
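
The edge-to-cloud pattern described here, in which imagers transmit compact detection records rather than raw frames, can be sketched as below; the endpoint and message schema are placeholders, not the deployed system's interface:

```python
import json
import urllib.request

def post_detections(detections, station_id, endpoint="https://cloud.example.org/api/detections"):
    """Send per-frame detection summaries (not raw frames) from an edge imager to the cloud."""
    payload = json.dumps({"station": station_id, "detections": detections}).encode("utf-8")
    request = urllib.request.Request(endpoint, data=payload,
                                     headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request) as response:   # returns the HTTP status on success
        return response.status

# Example record an edge imager might emit after on-board detection (hypothetical endpoint):
# post_detections([{"timestamp": "2024-05-01T12:00:00Z", "taxon": "Noctiluca scintillans",
#                   "count": 42}], station_id="IPP-01")
```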

Towards standardised reporting of macro-litter and biotic interactions from imagery

Jaime S Davies , University of Gibraltar

Alice L Bruemmer (University of Gibraltar) Awantha Dissanayake (University of Gibraltar)

The use of marine imagery as a tool to monitor species and habitats is commonplace, but it can also provide insightful information on marine litter and its potential impacts. Monitoring and quantification of marine litter often do not include reporting of interactions between fauna and litter, meaning impacts are largely unconsidered and unknown. It is important to be able to quantify anthropogenic effects on vulnerable deep-sea habitats to ensure adequate protection. A standardised, comprehensive framework for the reporting of litter-fauna interactions from imagery was created from a literature review and includes 6 major categories: entanglement, ingestion, smothering, habitat provision, adaptive behaviour, and encountering (entanglement and smothering occur on abiotic features as well). Litter can cause long-term impacts such as smothering of corals or snagging of discarded fishing gear; the litter-fauna framework allows extraction of this additional information from imagery. The framework gives clear definitions and examples to allow standardised, repeatable implementation. This work identifies the gaps in knowledge and presents a standardised litter-fauna interaction framework for deep-sea imagery data, which was applied to a case study to test its applicability.

Understanding the Current State of Southern Ocean Benthic Ecosystems Using Deep Computer Vision

Cameron Trotter , British Antarctic Survey

Huw Griffiths (hjg@bas.ac.uk) & Rowan Whittle (roit@bas.ac.uk), British Antarctic Survey

Loss of marine biodiversity is a key issue facing the modern world. The removal of species from an environment can have profound effects on the overall ecosystem structure, though to what degree any species contributes to ecosystem stability is often unknown until they are removed. Due to its remoteness, relatively little is known about the benthic ecosystems situated in the Southern Ocean and around Antarctica. This region is also one of the most vulnerable to climate change, and is currently one of the fastest warming areas on the planet. Traditionally, our understanding of Southern Ocean biodiversity has relied on nets or devices to bring benthic organisms to the surface, though this is intrinsically destructive and fails to detail community structure. The development of underwater imaging technologies has allowed for the capturing of this information non-destructively in-situ, though localising and classifying the organisms within these images can be time-consuming and requires specialist expertise, given that many organisms found here are seen nowhere else on Earth. This has resulted in a data bottleneck, greatly limiting our understanding of the Southern Ocean’s benthic ecosystems and the effect of climate change on them. To help combat this, we present the development of a deep computer vision model trained to detect key taxa captured in Southern Ocean benthic imagery. This model is trained using only a small subset of labelled images captured from a downward facing towed camera. Once trained, the model is capable of processing unlabelled imagery in an autonomous manner, requiring only human verification of system output. This allows analysis to be performed faster and over a larger spatio-temporal range when compared to a fully-human approach, providing a clearer picture of the current state of the Southern Ocean’s benthic ecosystems. Ultimately this allows for the development of more comprehensive protection strategies for these ecosystems.

Unveiling a hidden sanctuary: first description of mesophotic gorgonian assemblages thriving in a coralligenous reef, close to a tuna fish farm in l’Ametlla de Mar (Spain)

Judith Camps-Castella , Departament de Biologia Evolutiva, Ecologia i Ciències Ambientals, Facultat de Biologia, Universitat de Barcelona, Avda. Diagonal 643, 08028 Barcelona, Spain

Patricia Prado. Instituto de Investigación en Medio Ambiente y Ciencia Marina (IMEDMAR-UCV), Universidad Católica de Valencia SVM, C/Explanada del Puerto S/n, 03710 Calpe, Alicante, Spain.

Mesophotic ecosystems (30-120 m) may be less affected by thermal stress from climate change and provide crucial refuge to vulnerable species. This study provides the first characterization of a Mediterranean coralligenous reef, locally referred to as “El Ramonet”, located near a large tuna farm in l’Ametlla de Mar (Southern Catalonia, Spain) and hosting gorgonians such as Paramuricea clavata and Eunicella spp. at 50 m depth. The biodiversity of the megabenthic assemblages, including the demographic structure of P. clavata, was described from six video transects taken at different distances from the fish farm using Remotely Operated Vehicle (ROV) video imaging. We also calculated the MAES Index to evaluate the ecological status of the coralligenous bioconstructions. Our results show that the reef hosts one of the largest colony sizes ever recorded for P. clavata in the Mediterranean Sea (ca. 113 cm), with the population mostly dominated by large colonies. Video analysis revealed 387 gorgonian colonies, with the highest abundances recorded in a transect near the fish farm (mean: 1.88 ± 0.42 colonies · m⁻²). For Eunicella spp., the highest abundance was found away from the farm (1.12 ± 0.12 colonies · m⁻²). The MAES Index revealed that the transects closest to the fish farm were in the worst condition, owing to the presence of branches with necrosis and discarded fishing gear. Further, the Shannon-Wiener diversity index for sessile fauna showed homogeneous patterns across transects (0.28-0.17), except for significantly lower values at a site near the fish farm (0.09 ± 0.03). This study highlights the significance of this unique reef as a hotspot for the conservation of P. clavata and associated species. The unique features associated with mesophotic environments might provide an advantage for these species in the face of climate change, but management and conservation initiatives are also necessary to protect the site from other local human activities.
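
For reference, the Shannon-Wiener values reported above follow the standard formulation, which can be computed from per-transect counts of sessile taxa as in this brief sketch:

```python
import numpy as np

def shannon_wiener(counts):
    """H' = -sum(p_i * ln(p_i)) over taxa with non-zero counts."""
    counts = np.asarray(counts, dtype=float)
    proportions = counts[counts > 0] / counts.sum()
    return float(-np.sum(proportions * np.log(proportions)))

# Example: a transect heavily dominated by a single taxon gives a low H'
print(shannon_wiener([95, 3, 2]))   # ~0.23
```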

Using Computer Vision to Objectively Quantify Features in Geological Drill Core

Lewis Grant , University of Southampton

Rosalind M. Coggon (University of Southampton), Blair Thornton (University of Southampton), and Damon A. H. Teagle (University of Southampton)

Ocean crust paves two-thirds of Earth’s surface, and its physicochemical exchanges with the seawater circulating through it exert a primary control on oceanic and atmospheric composition through time. Quantifying the rate and nature of this exchange of heat, elements and nutrients is imperative if we are to understand long-term climate, changing ocean conditions, and possible sites for the origin of life. Drilling into in situ ocean crust, as well as into ancient crustal slabs obducted onto land (ophiolites), provides valuable records of these interactions. Geologists painstakingly document the distributions of petrographic features in recovered cores based on visual assessments. Core description is therefore limited to the time and resources available during core recovery and is prone to human bias. As cores are often digitally imaged, we have explored the potential benefits computer vision provides to the expert geologist tasked with describing the recovered material. We have demonstrated that self-supervised machine learning methods are capable of accurately identifying and quantifying features within core images, and that the addition of spatial metadata during network training improves accuracy by as much as 50%. Using the classification output of a spatially guided contrastive learning model (GeoCLR), important geological information has been quantified through 400 m of core at a resolution impossible for a human expert to generate in the same time. This work highlights the promising potential of using machine learning to extract valuable information from large, unutilised image datasets, although improvements in core imaging best practices are needed to improve the machine readability of future core imagery. Furthermore, automated core description has wide-reaching potential to improve the efficiency of numerous industries such as geoengineering, remote sensing and mineral exploration.
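
The "spatial metadata" idea can be illustrated, in a generic form that is not the GeoCLR implementation, by selecting contrastive positive pairs from image patches that lie within a small along-core distance of one another:

```python
import random

def sample_positive_pair(patches, max_separation_m=0.1):
    """patches: list of (image, depth_in_core_m) tuples.
    Returns two spatially nearby patches to use as a contrastive positive pair."""
    anchor_idx = random.randrange(len(patches))
    anchor_img, anchor_depth = patches[anchor_idx]
    nearby = [img for i, (img, depth) in enumerate(patches)
              if i != anchor_idx and abs(depth - anchor_depth) <= max_separation_m]
    if not nearby:
        return anchor_img, anchor_img   # fall back to two augmented views of the same patch
    return anchor_img, random.choice(nearby)
```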

Using machine learning to investigate temporal dynamics of methane seep fauna at the Ocean Observatories Initiative Regional Cabled Array

Katharine T. Bigham , University of Washington

Michael F. Vardaro, University of Washington; Deborah S. Kelley, University of Washington

Methane hydrate seeps are unique chemosynthetic marine environments where organisms use subsurface-derived chemicals and volatiles rather than sunlight to create energy. Seeps are found on continental margins worldwide; >800 gas emission sites have been imaged along the Cascadia Margin. The seeps provide important services such as energy storage in the form of methane hydrates and habitats for commercially important fish stocks, and they contribute to regional biodiversity and productivity. One of these methane seeps, Southern Hydrate Ridge (SHR), has hosted a diverse suite of instrumentation on NSF’s Ocean Observatories Initiative (OOI) Regional Cabled Array (RCA), which has been streaming real-time data to shore for a decade. Among the instruments is a SubC digital still camera, which takes images every half hour, resulting in >50,000 images per year. Additionally, ROV imagery (still, HD, 4K) is collected at the site annually as part of the RCA operations and maintenance cruises. In concert, this has amounted to a collection of >37 TB of visual data that has the potential to provide important insights into the evolution of these seeps over time, as well as into the dynamics of the benthic faunal community that thrives there. Due to the time-consuming nature of manually annotating the large amount of SHR imagery that the RCA has collected and continues to collect, machine learning pipelines are being developed to annotate imagery as it is captured. This effort will be an important contribution, making the OOI-RCA imagery more accessible to researchers and the public and enabling integration with a wealth of stored and real-time environmental data. We will present preliminary results from those pipelines, discuss methods being used to integrate human verification into the pipeline, give an example of the type of data produced by the pipeline, and discuss the ecological questions that we aim to answer.