
ISIMIP quality control


A command line tool for the quality control of climate impact data of the ISIMIP project. It mainly covers tests of:

Setup

The application is written in Python (> 3.11) and uses only dependencies that can be installed without administrator privileges. The installation of Python (and its development packages), however, differs from operating system to operating system. Optionally, Git is needed if the application is installed directly from GitHub. The installation of Python 3 and Git for different platforms is documented here.

The tool itself can be installed via pip. Usually you want to create a virtual environment first, but this is optional. The tool also works with pipx.

setup venv on Linux/macOS/Windows WSL

python3 -m venv env
source env/bin/activate

setup venv on Windows cmd

python -m venv env
call env\Scripts\activate.bat

install from the Python Package Index (PyPI), recommended

pip install isimip-qc

update from PyPI

pip install --upgrade isimip-qc

install directly from GitHub

pip install git+https://github.com/ISI-MIP/isimip-qc

update directly from GitHub

pip install --upgrade git+https://github.com/ISI-MIP/isimip-qc
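After installing or updating, it can be useful to confirm which version ended up in the active environment. The following is a minimal sketch using only the Python standard library; the helper name `installed_version` is hypothetical and not part of isimip-qc, and the distribution name is assumed to be `isimip-qc`, as in the pip commands above.

```python
import sys
from importlib import metadata


def installed_version(package="isimip-qc"):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed in the environment running this script.
        return None


if __name__ == "__main__":
    # Report the interpreter version so it can be compared with the requirement above.
    print("Python:", ".".join(map(str, sys.version_info[:3])))
    version = installed_version()
    print("isimip-qc:", version if version else "not installed in this environment")
```

Run the script with the interpreter of the virtual environment, otherwise it reports on a different environment than the one pip installed into.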

Usage

The tool has several options which can be inspected using the help option -h, --help:

usage: isimip-qc [-h] [-c] [-m] [-O] [--unchecked-path UNCHECKED_PATH]
                 [--checked-path CHECKED_PATH]
                 [--protocol-location PROTOCOL_LOCATIONS]
                 [--log-level LOG_LEVEL] [--show-time] [--show-path]
                 [--log-path LOG_PATH] [--log-path-level LOG_PATH_LEVEL]
                 [--include INCLUDE] [--exclude EXCLUDE] [-f] [-w] [-e]
                 [--ignore-critical] [--skip-exp] [--match-only]
                 [-r [MINMAX]] [-nt] [--summary] [--fix]
                 [--fix-datamodel [FIX_DATAMODEL]] [--check CHECK]
                 [--force-copy-move] [-V]
                 schema_path

Check ISIMIP files for matching protocol definitions

positional arguments:
  schema_path           ISIMIP schema_path, e.g. ISIMIP3a/OutputData/water_global

options:
  -h, --help            show this help message and exit
  -c, --copy            copy checked files to CHECKED_PATH if no warnings or errors were found
  -m, --move            move checked files to CHECKED_PATH if no warnings or errors were found
  -O, --overwrite       overwrite files in CHECKED_PATH if present. Default is False.
  --unchecked-path UNCHECKED_PATH
                        base path of the unchecked files
  --checked-path CHECKED_PATH
                        base path for the checked files
  --protocol-location PROTOCOL_LOCATIONS
                        URL or file path to the protocol when different from official repository
  --log-level LOG_LEVEL
                        log level (CRITICAL, ERROR, WARN, CHECKING, INFO, or DEBUG) [default: CHECKING]
  --show-time           show time in console logs
  --show-path           show path in console logs
  --log-path LOG_PATH   base path for the individual log files
  --log-path-level LOG_PATH_LEVEL
                        log level for the individual log files [default: WARN]
  --include INCLUDE     patterns of files to include. Exclude those that don't match any.
  --exclude EXCLUDE     patterns of files to exclude. Include only those that don't match any.
  -f, --first-file      only process first file found in UNCHECKED_PATH
  -w, --stop-on-warnings
                        stop execution on warnings
  -e, --stop-on-errors  stop execution on errors
  --ignore-critical     allow fixing and copy/move files with critical issues found
  --skip-exp            skip test for valid experiment combination
  --match-only          only match the file name and skip all other checks
  -r [MINMAX], --minmax [MINMAX]
                        test values for valid range (slow). MINMAX denotes the length of the ordered top list of outliers
  -nt, --skip-time-span-check
                        skip check for simulated time period
  --summary             append a summary with statistics about experiments and specifiers to the output
  --fix                 try to fix warnings detected on the original files
  --fix-datamodel [FIX_DATAMODEL]
                        also fix warnings on data model found using NCCOPY or CDO (slow). Choose preferred tool per lower case argument.
  --check CHECK         perform only one particular check
  --force-copy-move     copy or move files despite errors
  -V, --version         show program's version number and exit
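When the tool is called from a script, it can help to assemble the argument list programmatically. The sketch below uses only flags listed in the help output above. The helper `build_command` and the paths `/data/unchecked` and `/data/checked` are hypothetical placeholders, not part of isimip-qc.

```python
import shlex


def build_command(schema_path, unchecked_path=None, checked_path=None,
                  copy=False, first_file=False):
    """Assemble an isimip-qc argument list from a few of the documented options."""
    command = ["isimip-qc"]
    if unchecked_path:
        command += ["--unchecked-path", unchecked_path]
    if checked_path:
        command += ["--checked-path", checked_path]
    if copy:
        # Copy files to CHECKED_PATH if no warnings or errors were found.
        command.append("--copy")
    if first_file:
        # Only process the first file found in UNCHECKED_PATH.
        command.append("--first-file")
    # The positional schema_path goes last, as in the usage line.
    command.append(schema_path)
    return command


command = build_command(
    "ISIMIP3a/OutputData/water_global",
    unchecked_path="/data/unchecked",  # hypothetical path
    checked_path="/data/checked",      # hypothetical path
    copy=True,
)
# Print the command as it would be typed in a shell.
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command)` once the tool is installed and the placeholder paths are replaced with real ones.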

The only mandatory argument is schema_path, which specifies the pattern and schema to use. The schema_path consists of the simulation_round, the product, and the sector, separated by slashes, e.g. ISIMIP3a/OutputData/water_global. If schema_path is the only argument given, the current working directory when calling the tool should be the directory containing the files to be checked.
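The structure of schema_path can be illustrated with a short sketch that splits the example value into its three components. This follows only the description given here; `SchemaPath` and `split_schema_path` are hypothetical names, and the tool's own parsing may differ.

```python
from typing import NamedTuple


class SchemaPath(NamedTuple):
    simulation_round: str
    product: str
    sector: str


def split_schema_path(schema_path):
    """Split 'simulation_round/product/sector' into its three named parts."""
    parts = schema_path.strip("/").split("/")
    # Assumption: exactly three non-empty components, as described in the text.
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            f"expected simulation_round/product/sector, got {schema_path!r}"
        )
    return SchemaPath(*parts)


schema = split_schema_path("ISIMIP3a/OutputData/water_global")
print(schema.simulation_round, schema.product, schema.sector)
```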

The options in detail