Process Discovery through Assimilation of Complex Biogeochemical Datasets

This white paper addresses two of the three focal areas identified in the white paper call: 1) Biogeochemical data acquisition and assimilation enabled by machine learning, and 3) Insight gleaned from complex data using AI. We focus on applying AI to complex biogeochemistry (BGC) data (e.g., laboratory experimental data, field manipulation data, literature data), which are an untapped source of information for improving Earth System Predictability (ESP).

Science Challenge

Laboratory and field manipulation experiments are critical for interrogating the impact of hydrological and climate perturbations on BGC processes in a controlled environment. Hydrologically driven BGC processes are a key aspect of ESP, particularly at dynamic interfaces (e.g., terrestrial-aquatic interfaces, hot spots/hot moments), because they govern the cycling of nutrients, metals, and organic matter. However, BGC experiments yield a complex array of data types (spatiotemporal observational and laboratory measurements, microscopy image data, spectroscopy data, etc.), which hampers assimilation and analysis, as well as the application of machine learning (ML). Ensuring that these data are findable, accessible, interoperable, and reusable (FAIR) is of paramount importance. AI/ML methods are poised to transform the way we incorporate complex BGC laboratory and field manipulation experimental data into earth and environmental systems models, quantify data uncertainty, design future experiments, and develop new models with unprecedented fidelity and resolution.