Customized Metrics

This example demonstrates how to add your own Python side Prometheus metrics.

Mosec already provides Rust side metrics.

If you need to monitor the inference process in more detail, you can add Python side metrics, for example the distribution of inference results, the duration of a CPU-bound or GPU-bound processing step, or the IPC time (computed as rust_step_duration - python_step_duration).
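As a rough illustration of the IPC time calculation, the sketch below times a stand-in for a CPU-bound step on the Python side and subtracts it from a Rust side duration. The helper names (`timed`, `ipc_time`) and the Rust side value are hypothetical. In a real deployment the Rust side number would come from Mosec's own metrics, and the Python side number could be recorded with a Prometheus Histogram.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator


@contextmanager
def timed(durations: Dict[str, float], key: str) -> Iterator[None]:
    """Record the wall-clock time of the enclosed block under `key`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        durations[key] = time.perf_counter() - start


def ipc_time(rust_step_duration: float, python_step_duration: float) -> float:
    """IPC overhead: the Rust side step time minus the Python side step time."""
    return rust_step_duration - python_step_duration


durations: Dict[str, float] = {}
with timed(durations, "python_step"):
    sum(i * i for i in range(10_000))  # stand-in for CPU-bound work

# Hypothetical Rust side measurement: assume it saw 2 ms more than Python did.
rust_step_duration = durations["python_step"] + 0.002
overhead = ipc_time(rust_step_duration, durations["python_step"])
```

The subtraction only makes sense when both durations describe the same step of the same request, so the two measurements have to be matched up before comparing them.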

This example runs a simple WSGI app as the metrics service. In each worker process, a Counter collects the inference results and exports them to the metrics service. The inference step parses the batch data and compares each value with the batch average.
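To make the inference logic concrete, here is a minimal standalone sketch of the comparison against the batch average. It uses `collections.Counter` as a stand-in for the Prometheus Counter so that it runs without a server. The `result_counts` name and the default `worker_id` value are assumptions made for illustration.

```python
from collections import Counter
from typing import List

# Stand-in for the Prometheus Counter: keys are (status, worker_id) label pairs.
result_counts: Counter = Counter()


def forward(data: List[int], worker_id: str = "1") -> List[bool]:
    """Mark each item True if it is at least the batch average."""
    avg = sum(data) / len(data)
    ans = [x >= avg for x in data]
    result_counts[("true", worker_id)] += sum(ans)
    result_counts[("false", worker_id)] += len(ans) - sum(ans)
    return ans


batch = [1, 2, 3, 6]  # the average is 3.0
print(forward(batch))  # [False, False, True, True]
print(result_counts[("true", "1")], result_counts[("false", "1")])  # 2 2
```

Because the result depends on the batch average, the same input value can be labeled differently depending on which other requests land in its batch.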

For more information about the multiprocess mode for metrics, see the Prometheus Python client documentation.
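One detail of multiprocess mode worth knowing: the Prometheus Python client stores per-process metric files in the directory named by PROMETHEUS_MULTIPROC_DIR, and its documentation recommends wiping that directory between runs. The sketch below shows one way this could be done. The helper `reset_multiproc_dir` and the file name `counter_123.db` are hypothetical, and the demonstration runs in a throwaway temporary directory rather than the real metrics directory.

```python
import pathlib
import tempfile


def reset_multiproc_dir(path: pathlib.Path) -> pathlib.Path:
    """Create the directory if needed and delete leftover *.db metric files."""
    path.mkdir(parents=True, exist_ok=True)
    for stale in path.glob("*.db"):
        stale.unlink()
    return path


with tempfile.TemporaryDirectory() as tmp:
    metric_dir = pathlib.Path(tmp) / "prometheus_multiproc_dir"
    metric_dir.mkdir()
    (metric_dir / "counter_123.db").write_bytes(b"")  # simulate a stale file
    reset_multiproc_dir(metric_dir)
    leftover = list(metric_dir.iterdir())  # empty after the reset
```

Such a reset should run once before the server starts, not inside the workers, since the workers write to the same directory while running.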

python_side_metrics.py

# Copyright 2022 MOSEC Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Example: Adding metrics service."""

import os
import pathlib
import tempfile
from typing import List

from prometheus_client import (  # type: ignore
    CollectorRegistry,
    Counter,
    multiprocess,
    start_http_server,
)

from mosec import Server, ValidationError, Worker, get_logger

logger = get_logger()

# Check the PROMETHEUS_MULTIPROC_DIR environment variable before importing Prometheus
if not os.getenv("PROMETHEUS_MULTIPROC_DIR"):
    metric_dir_path = os.path.join(tempfile.gettempdir(), "prometheus_multiproc_dir")
    pathlib.Path(metric_dir_path).mkdir(parents=True, exist_ok=True)
    os.environ["PROMETHEUS_MULTIPROC_DIR"] = metric_dir_path

metric_registry = CollectorRegistry()
multiprocess.MultiProcessCollector(metric_registry)
counter = Counter(
    "inference_result",
    "statistic of result",
    ("status", "worker_id"),
    registry=metric_registry,
)

class Inference(Worker):
    """Sample Inference Worker."""

    def __init__(self):
        super().__init__()
        self.worker_id = str(self.worker_id)

    def deserialize(self, data: bytes) -> int:
        json_data = super().deserialize(data)
        try:
            res = int(json_data.get("num"))
        except Exception as err:
            raise ValidationError(err) from err
        return res

    def forward(self, data: List[int]) -> List[bool]:
        avg = sum(data) / len(data)
        ans = [x >= avg for x in data]
        counter.labels(status="true", worker_id=self.worker_id).inc(sum(ans))
        counter.labels(status="false", worker_id=self.worker_id).inc(
            len(ans) - sum(ans)
        )
        return ans

if __name__ == "__main__":
    # Run the metrics server in another thread.
    start_http_server(5000, registry=metric_registry)

    # Run the inference server
    server = Server()
    server.append_worker(Inference, num=2, max_batch_size=8)
    server.run()

Start

python python_side_metrics.py

Test

http POST :8000/inference num=1
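If HTTPie is not available, the same request can be built with the Python standard library. This is a sketch under the assumption that the server listens on 127.0.0.1:8000. The helper name `build_inference_request` is hypothetical. Note that HTTPie's `num=1` syntax sends the value as a JSON string, which the sketch mirrors.

```python
import json
import urllib.request


def build_inference_request(
    num: str, base_url: str = "http://127.0.0.1:8000"
) -> urllib.request.Request:
    """Build a POST request whose JSON body mirrors `http POST :8000/inference num=1`."""
    body = json.dumps({"num": num}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/inference",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_inference_request("1")
# With the server running, send the request and read the raw response body:
# with urllib.request.urlopen(request, timeout=5) as response:
#     print(response.status, response.read())
```

Sending several requests with different `num` values in quick succession gives the batching a chance to group them, which makes the true/false counters more interesting to inspect.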

Check the Python side metrics

Check the Rust side metrics
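Both sets of metrics are served in the Prometheus text format, so they can also be inspected from Python. The sketch below is a simplified parser that handles plain sample lines without timestamps. The sample text is made-up illustrative data, not real server output. The Python side address follows from `start_http_server(5000, ...)` in the example, while the Rust side URL assumes Mosec serves its metrics under /metrics on the service port.

```python
import re
from typing import Dict, List, Tuple

# Matches `name{label="value",...} number` or `name number` (no timestamp).
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')

Sample = Tuple[str, Dict[str, str], float]


def parse_metrics(text: str) -> List[Sample]:
    """Return (name, labels, value) for every sample line in the text."""
    samples: List[Sample] = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip HELP/TYPE comments
            continue
        match = SAMPLE_RE.match(line)
        if match is None:  # skip lines this simplified parser cannot read
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


# Illustrative text only; the values are invented for this example.
SAMPLE_TEXT = """\
# HELP inference_result_total statistic of result
# TYPE inference_result_total counter
inference_result_total{status="true",worker_id="1"} 3.0
inference_result_total{status="false",worker_id="1"} 1.0
"""

samples = parse_metrics(SAMPLE_TEXT)

# With the example running, the live text could be fetched like this:
# import urllib.request
# python_side = urllib.request.urlopen("http://127.0.0.1:5000").read().decode()
# rust_side = urllib.request.urlopen("http://127.0.0.1:8000/metrics").read().decode()
```

The Prometheus Python client appends `_total` to counter names when exporting, which is why the counter declared as `inference_result` appears as `inference_result_total` in the sample text.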

How to build a monitoring system for Mosec

This tutorial explains how to build a monitoring system for Mosec using Prometheus and Grafana.

Prerequisites

Before starting, you need Docker and Docker Compose installed on your machine. If they are not installed, follow the official Get Docker and Docker Compose installation instructions.

Starting the monitoring system

Clone the repository containing the docker-compose.yaml file:

git clone https://github.com/mosecorg/mosec.git

Navigate to the directory containing the docker-compose.yaml file:

cd mosec/examples/monitor

Start the monitoring system by running the following command:

This command will start three containers: Mosec, Prometheus, and Grafana.

Test

Send a test request to generate metrics for Prometheus to collect.

http POST :8000/inference num=1

Accessing Prometheus

Prometheus is a monitoring and alerting system that collects metrics from Mosec. You can access the Prometheus UI by visiting http://127.0.0.1:9090 in your web browser.
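Besides the web UI, Prometheus exposes an HTTP API for instant queries at /api/v1/query. The sketch below builds such a query URL with the standard library. The helper name `build_query_url` is hypothetical, and the PromQL expression assumes that the Prometheus instance scrapes the Python side counter from the example, which may not match the scrape configuration shipped with the repository.

```python
import urllib.parse


def build_query_url(promql: str, base_url: str = "http://127.0.0.1:9090") -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": promql})


# Hypothetical query: per-status rate of the example counter over one minute.
promql = "sum by (status) (rate(inference_result_total[1m]))"
url = build_query_url(promql)

# With Prometheus running, the JSON result could be fetched like this:
# import json, urllib.request
# with urllib.request.urlopen(url, timeout=5) as response:
#     print(json.load(response)["status"])
```

The same expression can be pasted into the query box of the Prometheus UI, which is usually the quicker way to explore which metric names are actually being scraped.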

Accessing Grafana

Grafana is a visualization tool for monitoring and analyzing metrics. You can access the Grafana UI by visiting http://127.0.0.1:3000 in your web browser. The default username and password are both admin.

Stopping the monitoring system

To stop the monitoring system, run the following command:

This command will stop and remove the containers created by Docker Compose.