Accessing real-time telemetry data for extensions using the Telemetry API

The Telemetry API enables your extensions to receive telemetry data directly from Lambda in near real time. During function initialization and invocation, Lambda automatically captures telemetry, including logs, platform metrics, and platform traces.

Within the Lambda execution environment, you can subscribe your Lambda extensions to telemetry streams. After subscribing, Lambda automatically sends all telemetry data to your extensions. You then have the flexibility to process, filter, and dispatch the data to your preferred destination, such as an Amazon Simple Storage Service (Amazon S3) bucket or a third-party observability tools provider.

The following diagram shows how the Extensions API and Telemetry API link extensions to Lambda from within the execution environment. Additionally, the Runtime API connects your runtime and function to Lambda.

[Diagram: The Extensions, Telemetry, and Runtime APIs connecting to processes in the execution environment.]

Important

The Lambda Telemetry API supersedes the Lambda Logs API. While the Logs API remains fully functional, we recommend using only the Telemetry API going forward. You can subscribe your extension to a telemetry stream using either the Telemetry API or the Logs API. After subscribing using one of these APIs, any attempt to subscribe using the other API returns an error.

Extensions can use the Telemetry API to subscribe to three different telemetry streams: platform telemetry, function logs, and extension logs.

Note

Lambda sends logs and metrics to CloudWatch, and traces to X-Ray (if you've activated tracing), even if an extension subscribes to telemetry streams.

Creating extensions using the Telemetry API

Lambda extensions run as independent processes in the execution environment. Extensions can continue to run after function invocation completes. Because extensions are separate processes, you can write them in a language different from the function code. We recommend writing extensions using a compiled language such as Golang or Rust. This way, the extension is a self-contained binary that can be compatible with any supported runtime.

The following diagram illustrates a four-step process to create an extension that receives and processes telemetry data using the Telemetry API.

[Diagram: Register your extension, create a listener, subscribe to a stream, and then get telemetry.]

Here is each step in more detail:

  1. Register your extension using the Lambda Extensions API. This provides you with a Lambda-Extension-Identifier, which you'll need in the following steps. For more information about how to register your extension, see Registering your extension.
  2. Create a telemetry listener. This can be a basic HTTP or TCP server. Lambda uses the URI of the telemetry listener to send telemetry data to your extension. For more information, see Creating a telemetry listener.
  3. Using the Subscribe API in the Telemetry API, subscribe your extension to the desired telemetry streams. You'll need the URI of your telemetry listener for this step. For more information, see Sending a subscription request to the Telemetry API.
  4. Get telemetry data from Lambda via the telemetry listener. You can do any custom processing of this data, such as dispatching the data to Amazon S3 or to an external observability service.

Note

A Lambda function's execution environment can start and stop multiple times as part of its lifecycle. In general, your extension code runs during function invocations, and also for up to 2 seconds during the shutdown phase. We recommend batching the telemetry as it arrives at your listener. Then, use the Invoke and Shutdown lifecycle events to send each batch to its destination.
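
The batching recommendation in the note above can be sketched as a small mutex-protected buffer. This is a minimal illustration, not the sample extension's implementation: the `Batch` type, its `Add` and `Flush` methods, and the `send` callback are hypothetical names introduced here.

```go
package main

import (
	"fmt"
	"sync"
)

// Batch is a hypothetical in-memory buffer for telemetry events.
// The listener goroutine calls Add; the lifecycle loop calls Flush.
type Batch struct {
	mu     sync.Mutex
	events []string
}

// Add appends one raw event to the buffer.
func (b *Batch) Add(event string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.events = append(b.events, event)
}

// Flush hands every buffered event to send and then empties the buffer.
// If send fails, the events stay buffered so a later Flush can retry.
func (b *Batch) Flush(send func([]string) error) (int, error) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if len(b.events) == 0 {
		return 0, nil
	}
	if err := send(b.events); err != nil {
		return 0, err
	}
	n := len(b.events)
	b.events = nil
	return n, nil
}

func main() {
	var b Batch
	b.Add(`{"type":"function","record":"Hello World"}`)
	b.Add(`{"type":"platform.start"}`)

	// Assumption: in a real extension, Flush would run when an INVOKE or
	// SHUTDOWN lifecycle event arrives, and send would upload the batch.
	n, err := b.Flush(func(events []string) error {
		fmt.Printf("dispatching %d events\n", len(events))
		return nil
	})
	fmt.Println(n, err)
}
```

This sketch holds the lock while `send` runs, which keeps the code short but blocks `Add` during an upload. Because the shutdown phase allows only about 2 seconds, a real `send` would likely need its own timeout.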

Registering your extension

Before you can subscribe to telemetry data, you must register your Lambda extension. Registration occurs during the extension initialization phase. The following example shows an HTTP request to register an extension.

POST http://${AWS_LAMBDA_RUNTIME_API}/2020-01-01/extension/register
Lambda-Extension-Name: lambda_extension_name
{
    "events": ["INVOKE", "SHUTDOWN"]
}

If the request succeeds, the subscriber receives an HTTP 200 success response. The response header contains the Lambda-Extension-Identifier. The response body contains other properties of the function.

HTTP/1.1 200 OK
Lambda-Extension-Identifier: a1b2c3d4-5678-90ab-cdef-EXAMPLE11111
{
    "functionName": "lambda_function",
    "functionVersion": "$LATEST",
    "handler": "lambda_handler",
    "accountId": "123456789012"
}

For more information, see the Extensions API reference.
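
As a rough illustration, the registration exchange above might look like this in Go using only the standard library. The helper names `newRegisterRequest` and `register` are hypothetical. The endpoint, header names, and event list are taken from the example request and response above.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
)

// newRegisterRequest builds the registration request shown above.
// runtimeAPI is the host:port value of AWS_LAMBDA_RUNTIME_API.
func newRegisterRequest(runtimeAPI, extensionName string) (*http.Request, error) {
	body, err := json.Marshal(map[string][]string{
		"events": {"INVOKE", "SHUTDOWN"},
	})
	if err != nil {
		return nil, err
	}
	url := fmt.Sprintf("http://%s/2020-01-01/extension/register", runtimeAPI)
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Lambda-Extension-Name", extensionName)
	return req, nil
}

// register sends the request and returns the Lambda-Extension-Identifier
// response header, which later steps need.
func register(client *http.Client, runtimeAPI, extensionName string) (string, error) {
	req, err := newRegisterRequest(runtimeAPI, extensionName)
	if err != nil {
		return "", err
	}
	resp, err := client.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("register failed: %s", resp.Status)
	}
	id := resp.Header.Get("Lambda-Extension-Identifier")
	if id == "" {
		return "", fmt.Errorf("response has no Lambda-Extension-Identifier header")
	}
	return id, nil
}

func main() {
	runtimeAPI := os.Getenv("AWS_LAMBDA_RUNTIME_API")
	if runtimeAPI == "" {
		fmt.Println("AWS_LAMBDA_RUNTIME_API is not set; not running inside Lambda")
		return
	}
	id, err := register(http.DefaultClient, runtimeAPI, "lambda_extension_name")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("registered with identifier", id)
}
```

Building the request in its own function lets you inspect the method, URL, and headers without a running Lambda environment.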

Creating a telemetry listener

Your Lambda extension must have a listener that handles incoming requests from the Telemetry API. The following code shows an example telemetry listener implementation in Golang:

// Starts the server in a goroutine where the log events will be sent
func (s *TelemetryApiListener) Start() (string, error) {
    address := listenOnAddress()
    l.Info("[listener:Start] Starting on address", address)
    s.httpServer = &http.Server{Addr: address}
    http.HandleFunc("/", s.http_handler)
    go func() {
        err := s.httpServer.ListenAndServe()
        if err != http.ErrServerClosed {
            l.Error("[listener:goroutine] Unexpected stop on Http Server:", err)
            s.Shutdown()
        } else {
            l.Info("[listener:goroutine] Http Server closed:", err)
        }
    }()
    return fmt.Sprintf("http://%s/", address), nil
}

// http_handler handles the requests coming from the Telemetry API.
// Every time the Telemetry API sends log events, this function reads them from the request body
// and puts them into a synchronous queue to be dispatched later.
// Logging or printing besides the error cases below is not recommended if you have subscribed to
// receive extension logs. Otherwise, logging here will cause Telemetry API to send new logs for
// the printed lines which may create an infinite loop.
func (s *TelemetryApiListener) http_handler(w http.ResponseWriter, r *http.Request) {
    body, err := ioutil.ReadAll(r.Body)
    if err != nil {
        l.Error("[listener:http_handler] Error reading body:", err)
        return
    }

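    // Parse and put the log messages into the queue
    var slice []interface{}
    if err := json.Unmarshal(body, &slice); err != nil {
        // Malformed payloads are logged and skipped instead of being silently ignored
        l.Error("[listener:http_handler] Error parsing body:", err)
        return
    }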

    for _, el := range slice {
        s.LogEventsQueue.Put(el)
    }

    l.Info("[listener:http_handler] logEvents received:", len(slice), " LogEventsQueue length:", s.LogEventsQueue.Len())
    slice = nil
}

Specifying a destination protocol

When you subscribe to receive telemetry using the Telemetry API, you can specify a destination protocol in addition to the destination URI:

{
    "destination": {
        "protocol": "HTTP",
        "URI": "http://sandbox.localdomain:8080"
    }
}

Lambda accepts two protocols for receiving telemetry: HTTP and TCP.

Note

We strongly recommend using HTTP rather than TCP. With TCP, the Lambda platform cannot acknowledge when it delivers telemetry to the application layer. Therefore, if your extension crashes, you might lose telemetry. HTTP does not have this limitation.

Before subscribing to receive telemetry, establish the local HTTP listener or TCP port. During setup, note the following:

Configuring memory usage and buffering

Memory usage in an execution environment grows linearly with the number of subscribers. Subscriptions consume memory resources because each one opens a new memory buffer to store telemetry data. Buffer memory usage contributes to the overall memory consumption in the execution environment.

When subscribing to receive telemetry through the Telemetry API, you have the option to buffer telemetry data and deliver it to subscribers in batches. To optimize memory usage, you can specify a buffering configuration:

{
    "buffering": {
        "maxBytes": 262144,
        "maxItems": 1000,
        "timeoutMs": 100
    }
}

| Parameter | Description | Defaults and limits |
|-----------|-------------|---------------------|
| maxBytes | The maximum volume of telemetry (in bytes) to buffer in memory. | Default: 262,144. Minimum: 262,144. Maximum: 1,048,576. |
| maxItems | The maximum number of events to buffer in memory. | Default: 10,000. Minimum: 1,000. Maximum: 10,000. |
| timeoutMs | The maximum time (in milliseconds) to buffer a batch. | Default: 1,000. Minimum: 25. Maximum: 30,000. |
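
To catch an out-of-range value before sending a subscription request, an extension could check its buffering configuration against the limits listed above. The following is a minimal sketch. The `Buffering` type and its `Validate` method are hypothetical helpers, not part of any AWS SDK. The numeric bounds are copied from the parameter list above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Buffering mirrors the "buffering" object in a subscription request.
type Buffering struct {
	MaxBytes  int `json:"maxBytes"`
	MaxItems  int `json:"maxItems"`
	TimeoutMs int `json:"timeoutMs"`
}

// DefaultBuffering returns the documented default values.
func DefaultBuffering() Buffering {
	return Buffering{MaxBytes: 262144, MaxItems: 10000, TimeoutMs: 1000}
}

// Validate checks each field against its documented minimum and maximum.
func (b Buffering) Validate() error {
	checks := []struct {
		name        string
		val, lo, hi int
	}{
		{"maxBytes", b.MaxBytes, 262144, 1048576},
		{"maxItems", b.MaxItems, 1000, 10000},
		{"timeoutMs", b.TimeoutMs, 25, 30000},
	}
	for _, c := range checks {
		if c.val < c.lo || c.val > c.hi {
			return fmt.Errorf("%s=%d is outside [%d, %d]", c.name, c.val, c.lo, c.hi)
		}
	}
	return nil
}

func main() {
	// Go evaluates 256 * 1024 at compile time, so the JSON output
	// contains the plain number 262144.
	b := Buffering{MaxBytes: 256 * 1024, MaxItems: 1000, TimeoutMs: 100}
	if err := b.Validate(); err != nil {
		fmt.Println("invalid buffering config:", err)
		return
	}
	out, _ := json.MarshalIndent(map[string]Buffering{"buffering": b}, "", "    ")
	fmt.Println(string(out))
}
```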

When setting up buffering, keep these points in mind:

{  
   "time": "2022-08-20T12:31:32.123Z",  
   "type": "function",  
   "record": "Hello World"  
}  

Sending a subscription request to the Telemetry API

Lambda extensions can subscribe to receive telemetry data by sending a subscription request to the Telemetry API. The subscription request should contain information about the types of events that you want the extension to subscribe to. In addition, the request can contain delivery destination information and a buffering configuration.

Before sending a subscription request, you must have an extension ID (Lambda-Extension-Identifier). When you register your extension with the Extensions API, you obtain an extension ID from the API response.

Subscription occurs during the extension initialization phase. The following example shows an HTTP request to subscribe to all three telemetry streams: platform telemetry, function logs, and extension logs.

PUT http://${AWS_LAMBDA_RUNTIME_API}/2022-07-01/telemetry HTTP/1.1
{
   "schemaVersion": "2022-12-13",
   "types": [
        "platform",
        "function",
        "extension"
   ],
   "buffering": {
        "maxItems": 1000,
        "maxBytes": 262144,
        "timeoutMs": 100
   },
   "destination": {
        "protocol": "HTTP",
        "URI": "http://sandbox.localdomain:8080"
   }
}
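
The subscription request above could be assembled in Go roughly as follows. This is a sketch under stated assumptions: the struct and function names are hypothetical, and the request passes the extension ID in a `Lambda-Extension-Identifier` request header, which the example above does not show. The endpoint and body fields mirror the example request.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// Destination, Buffering, and Subscription mirror the JSON body shown above.
type Destination struct {
	Protocol string `json:"protocol"`
	URI      string `json:"URI"`
}

type Buffering struct {
	MaxItems  int `json:"maxItems"`
	MaxBytes  int `json:"maxBytes"`
	TimeoutMs int `json:"timeoutMs"`
}

type Subscription struct {
	SchemaVersion string      `json:"schemaVersion"`
	Types         []string    `json:"types"`
	Buffering     Buffering   `json:"buffering"`
	Destination   Destination `json:"destination"`
}

// newSubscribeRequest builds a PUT request that subscribes listenerURI
// to all three telemetry streams.
func newSubscribeRequest(runtimeAPI, extensionID, listenerURI string) (*http.Request, error) {
	sub := Subscription{
		SchemaVersion: "2022-12-13",
		Types:         []string{"platform", "function", "extension"},
		Buffering:     Buffering{MaxItems: 1000, MaxBytes: 256 * 1024, TimeoutMs: 100},
		Destination:   Destination{Protocol: "HTTP", URI: listenerURI},
	}
	body, err := json.Marshal(sub)
	if err != nil {
		return nil, err
	}
	url := fmt.Sprintf("http://%s/2022-07-01/telemetry", runtimeAPI)
	req, err := http.NewRequest(http.MethodPut, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	// Assumption: the extension ID from registration travels in this header.
	req.Header.Set("Lambda-Extension-Identifier", extensionID)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Placeholder values. In Lambda, runtimeAPI comes from AWS_LAMBDA_RUNTIME_API
	// and extensionID from the registration response.
	req, err := newSubscribeRequest("127.0.0.1:9001", "example-extension-id", "http://sandbox.localdomain:8080")
	if err != nil {
		fmt.Println("build request:", err)
		return
	}
	fmt.Println(req.Method, req.URL)
	// To send it, pass req to http.DefaultClient.Do and check the status code.
}
```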

If the request succeeds, then the subscriber receives an HTTP 200 success response.

HTTP/1.1 200 OK
"OK"

Inbound Telemetry API messages

After subscribing using the Telemetry API, an extension automatically starts to receive telemetry from Lambda via POST requests. Each POST request body contains an array of Event objects. Each Event has the following schema:

{
   time: String,
   type: String,
   record: Object
}

The following table summarizes all types of Event objects, and links to the Telemetry API Event schema reference for each event type.

| Category | Event type | Description | Event record schema |
|----------|------------|-------------|---------------------|
| Platform event | platform.initStart | Function initialization started. | platform.initStart schema |
| Platform event | platform.initRuntimeDone | Function initialization completed. | platform.initRuntimeDone schema |
| Platform event | platform.initReport | A report of function initialization. | platform.initReport schema |
| Platform event | platform.start | Function invocation started. | platform.start schema |
| Platform event | platform.runtimeDone | The runtime finished processing an event with either success or failure. | platform.runtimeDone schema |
| Platform event | platform.report | A report of function invocation. | platform.report schema |
| Platform event | platform.restoreStart | Runtime restore started. | platform.restoreStart schema |
| Platform event | platform.restoreRuntimeDone | Runtime restore completed. | platform.restoreRuntimeDone schema |
| Platform event | platform.restoreReport | Report of runtime restore. | platform.restoreReport schema |
| Platform event | platform.telemetrySubscription | The extension subscribed to the Telemetry API. | platform.telemetrySubscription schema |
| Platform event | platform.logsDropped | Lambda dropped log entries. | platform.logsDropped schema |
| Function logs | function | A log line from function code. | function schema |
| Extension logs | extension | A log line from extension code. | extension schema |
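
A listener might decode each POST body into typed events and route them by category. The sketch below assumes the Event schema and the event types listed above. `Event`, `DecodeBatch`, and `Category` are hypothetical names. The record is kept as raw JSON because its shape depends on the event type.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"time"
)

// Event matches the time/type/record schema. Record stays raw so that
// both log-line strings and platform event objects decode without loss.
type Event struct {
	Time   time.Time       `json:"time"`
	Type   string          `json:"type"`
	Record json.RawMessage `json:"record"`
}

// DecodeBatch parses one POST body, which holds a JSON array of events.
func DecodeBatch(body []byte) ([]Event, error) {
	var events []Event
	if err := json.Unmarshal(body, &events); err != nil {
		return nil, fmt.Errorf("decode telemetry batch: %w", err)
	}
	return events, nil
}

// Category maps an event type to one of the three telemetry streams.
func Category(eventType string) string {
	switch {
	case strings.HasPrefix(eventType, "platform."):
		return "platform"
	case eventType == "function", eventType == "extension":
		return eventType
	default:
		return "unknown"
	}
}

func main() {
	// Illustrative payload. The empty platform.start record is a placeholder;
	// real records carry the fields defined in the schema reference.
	body := []byte(`[
		{"time":"2022-08-20T12:31:32.123Z","type":"function","record":"Hello World"},
		{"time":"2022-08-20T12:31:32.200Z","type":"platform.start","record":{}}
	]`)
	events, err := DecodeBatch(body)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, e := range events {
		fmt.Printf("%s %-8s %s\n", e.Time.Format(time.RFC3339Nano), Category(e.Type), e.Record)
	}
}
```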