OpenMetrics: Is Prometheus unbound?

This article will introduce OpenMetrics, compare it to the Prometheus exposition format and review the current status of its implementation.

This is the written evolution of a lightning talk I gave on May 19th, 2019 at Cloud_Native Rejekts in Barcelona 🇪🇸. It was originally published on Leonardo’s Medium.

Here, you can find the accompanying slides. 👨‍🏫

Historically, the monitoring landscape has been a mess; today, it still is. It’s even worse given how software architectures have changed with all of the cloud-native principles.

As “techies”, we need to do something about this. Otherwise, we’ll remain chained up by an inability to properly observe our own platforms and applications.

The titan Prometheus unchained by Hercules, by Christian Griepenkerl.


Here comes OpenMetrics.

It’s a huge effort to create a standard specifically designed for exposing metric data and providing tools to go beyond metrics. That’s because observability also consists of logging events and correlating things.

At the moment, it’s still a draft, sandboxed by the CNCF, but the final goal is to have an RFC.

By the way, please don’t ask me when … !

Anyway, don’t get scared. It’s not complex, and not really something you’ll need to learn from scratch. It’s just a wire protocol, a lingua franca for the whole observability story, built using the Prometheus exposition format as the starting point.

The OpenMetrics journey so far

Everyone knows the Prometheus exposition format nowadays, right?

It simply displays metrics line-by-line in a text-based format, and supports the histogram, gauge, counter and summary metric types.
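To make the line-by-line layout concrete, here is a minimal sketch that renders a single counter family by hand. The function name `render_counter` and the sample metric are made up for illustration, and the sketch skips the escaping rules a real client library applies.

```python
def render_counter(name, help_text, samples):
    """Render one counter family in a Prometheus-style text layout.

    `samples` is a list of (labels_dict, value) pairs. This is an
    illustrative helper, not part of any official client library,
    and it does not escape special characters in label values.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples:
        # Sort labels so the output is stable across runs.
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


text = render_counter(
    "http_requests_total",  # hypothetical metric name
    "Total HTTP requests.",
    [({"method": "get", "code": "200"}, 1027)],
)
print(text, end="")
```

In practice you would let a client library do this for you; the point is only that each sample is one line made of a name, a set of labels and a value.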

And yes, it’s true that Prometheus is really good at doing metrics, and … nothing else. 🤷‍♂️

Prometheus has seen broad adoption, and it has happened really fast. That’s because its format is simple and easy to read, and because Prometheus enforces labels rather than hierarchies. It’s also because there was nothing else out there in the field…

What did we have before Prometheus?

The awesome SNMP? 🤦‍♂ What else? Other proprietary formats that had missing docs, were difficult to implement, imposed hierarchical and rigid data models, or had all of these flaws together?

So one could argue it’s been an easy victory! But rest assured, there are no easy wins in computers. At least not lasting ones…

The standardization

So why do we need OpenMetrics? The truth is, we don’t.

The world needs it, in particular, the part of the world which is not cloud-native. I mean the traditional networking, storage and hardware vendors.

Yes, politics is part of the world: traditional vendors may want to avoid lock-in or to appear to support external (or competing) products. They usually prefer to use official standards only, and that’s perfectly fine.

OpenMetrics is about enabling all systems to ingest and emit data in a certain wire format, and about agreeing on what that wire format should be, so that they can talk to each other about themselves over HTTP. OpenMetrics does not intend to prescribe what you must do on the other end. Its aim is to introduce the concept of n-dimensional spaces via labels into the world. By doing this, OpenMetrics aims to kill the concept of hierarchical data models, which is one of its goals.
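A tiny sketch may help show what "n-dimensional spaces via labels" buys you. The samples and the `sum_by` helper below are invented for illustration: because every sample carries its dimensions as labels, you can aggregate along any of them without knowing where that dimension would sit in a hierarchical name.

```python
from collections import defaultdict

# Hypothetical request counts; each sample is (labels, value).
samples = [
    ({"service": "api", "region": "eu", "code": "200"}, 120),
    ({"service": "api", "region": "eu", "code": "500"}, 3),
    ({"service": "api", "region": "us", "code": "200"}, 95),
    ({"service": "web", "region": "us", "code": "200"}, 40),
]


def sum_by(samples, label):
    """Sum sample values grouped by the value of a single label."""
    totals = defaultdict(int)
    for labels, value in samples:
        totals[labels[label]] += value
    return dict(totals)


print(sum_by(samples, "region"))
print(sum_by(samples, "code"))
```

With a dotted, hierarchical name you would have to agree up front on whether the region or the status code comes first; with labels, every dimension is equally addressable.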

The way to achieve this is to create an official open standard: a set of vendor neutral guidelines, with no brand, available to all to read and implement in order to foster cooperation to solve a shared problem.

OpenMetrics Novelties

The Prometheus exposition format and OpenMetrics will primarily be the same, except for some improvements:

An official IANA port assignment (probably).
A registered content-type/mime-type.

application/openmetrics-text

A new descriptor directive, UNIT, to represent the unit of a metric (it will work for all metric types except info and state sets).

# UNIT foo_seconds seconds

Single lines will always end with a LINE FEED (i.e., \n), but metric sets will need an end marker (i.e., # EOF) to help detect responses that got cut off.

# HELP test_m Bla bla bla description no one really reads
# TYPE test_m counter
# UNIT potatoes 🥔
test_m{...} x
test_m{...} x
# EOF

UNIX timestamps in seconds.
Same escape rules for HELP directive and labels.
Better handling of white spaces between tokens.
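As an illustration of why the end marker matters, here is a minimal sketch of how a consumer could detect a truncated response. The helper name `is_complete` is made up, and the check assumes the marker is the literal line `# EOF`, optionally followed by a final line feed, as in the example above.

```python
def is_complete(payload):
    """Return True if the last line of the payload is the '# EOF' marker.

    Illustrative check only: it does not validate anything else
    about the payload.
    """
    lines = payload.split("\n")
    # A trailing line feed leaves one empty element at the end; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    return bool(lines) and lines[-1] == "# EOF"


full = 'test_m{key="a"} 1\ntest_m{key="b"} 2\n# EOF\n'
cut_off = 'test_m{key="a"} 1\ntest_m{ke'

print(is_complete(full))
print(is_complete(cut_off))
```

Without the marker, a response that was cut off exactly at a line boundary would be indistinguishable from a complete one.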

OpenMetrics also introduces new metric types:

state sets for representing enums, bitmasks and, more generally, booleans.

# TYPE foo stateset
foo{entity="controller",foo="a"} 0
foo{entity="controller",foo="b"} 1
foo{entity="replica",foo="a"} 0
foo{entity="replica",foo="b"} 1
# EOF

info metrics, for monitoring how information changes across labels (and/or time).

# TYPE x info
x_info{entity="ctrl",name="pretty",version="8.1"} 1
x_info{entity="repl",name="prettier",version="8.2"} 1

gauge histograms, i.e., just histograms for gauges.
unknown, i.e., what Prometheus calls an untyped metric.
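To show how a state set maps onto samples, here is a small sketch that produces lines shaped like the state set example above: one sample per possible state, with value 1 for the active states and 0 for the others. The helper `render_stateset` is hypothetical and skips label escaping.

```python
def render_stateset(name, states, active, labels=None):
    """Render one sample line per state, 1 if active and 0 otherwise.

    The state is carried in a label named after the metric itself,
    mirroring the example in the text. Illustrative helper only.
    """
    lines = []
    for state in states:
        all_labels = dict(labels or {})
        all_labels[name] = state
        label_str = ",".join(f'{k}="{v}"' for k, v in all_labels.items())
        lines.append(f"{name}{{{label_str}}} {1 if state in active else 0}")
    return lines


for line in render_stateset("foo", ["a", "b"], {"b"}, {"entity": "controller"}):
    print(line)
```

Since `active` is a set, the same helper covers bitmask-like cases where several states are on at once.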

And lastly, exemplars, at most one per histogram’s (or gauge histogram’s) bucket.

They are a mechanism which allows you to attach the ID of a trace to a bucket, so that you can directly link to that trace.

For example, imagine that you know the latency in a bucket is more than 60 seconds and you want to know why this is happening. With an exemplar, you can go straight to the relevant trace, since the bucket carries a link to its trace ID.

foo_bucket{le="0.1"} 8 # {} 0.054
foo_bucket{le="1"} 10 # {id="9856e"} 0.67
foo_bucket{le="10"} 17 # {id="12fa8"} 9.8 1520879607.789
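To illustrate how a consumer might pull the exemplar out of lines like the ones above, here is a rough regex-based sketch. It is not a conforming parser: the function `parse_exemplar` is made up, and it assumes label values contain no quotes, braces or ` # ` sequences.

```python
import re

# sample part, then " # {labels} value [timestamp]"
EXEMPLAR_RE = re.compile(
    r"^(?P<sample>.+?) # \{(?P<labels>[^}]*)\} (?P<value>\S+)(?: (?P<ts>\S+))?$"
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_exemplar(line):
    """Return the exemplar attached to a sample line, or None if absent."""
    match = EXEMPLAR_RE.match(line)
    if match is None:
        return None
    ts = match.group("ts")
    return {
        "sample": match.group("sample"),
        "labels": dict(LABEL_RE.findall(match.group("labels"))),
        "value": float(match.group("value")),
        "timestamp": float(ts) if ts is not None else None,
    }


print(parse_exemplar('foo_bucket{le="10"} 17 # {id="12fa8"} 9.8 1520879607.789'))
```

The `id` label is what you would hand to your tracing system to look up the matching trace.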

Clearly, not every monitoring solution will support exemplars. In fact, they are designed to be optional.

For instance, the plan of Prometheus at the moment is to ignore them. OpenCensus (and soon others), on the other hand, will support them.

State of the Art

Prometheus has supported OpenMetrics since version 2.5.0, and it can scrape OpenMetrics endpoints in a transparent way.
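One way an endpoint could stay compatible with both older and newer scrapers is plain HTTP content negotiation. The sketch below assumes the scraper advertises the OpenMetrics media type in its Accept header. The function name `choose_content_type` and the exact version parameters in the returned strings are assumptions for illustration, not something the draft guarantees.

```python
OPENMETRICS_MEDIA_TYPE = "application/openmetrics-text"


def choose_content_type(accept_header):
    """Pick a response content type based on the request's Accept header.

    Illustrative only: q-values are ignored, and the version
    parameters below are assumed values.
    """
    for item in accept_header.split(","):
        media_type = item.split(";")[0].strip()
        if media_type == OPENMETRICS_MEDIA_TYPE:
            return "application/openmetrics-text; version=0.0.1; charset=utf-8"
    # Fall back to the classic Prometheus text format.
    return "text/plain; version=0.0.4; charset=utf-8"


print(choose_content_type("application/openmetrics-text; version=0.0.1,text/plain;q=0.5"))
print(choose_content_type("text/plain"))
```

A scraper that does not know about OpenMetrics simply never asks for it and keeps receiving the format it already understands.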

You can emit OpenMetrics with the official Python client, since it implements an experimental first reference parser of the existing draft for the text format.

Very important companies, such as Google and Uber, are supporting OpenMetrics and will soon implement other reference parsers.

OpenCensus, which is going to merge with OpenTracing in an effort called OpenTelemetry, will incorporate OpenMetrics.

If you want to start implementing OpenMetrics, or porting your hand-made endpoints, there is a one-liner, using the aforementioned Prometheus Python client, to verify that they conform to the current draft of the standard.

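from prometheus_client.openmetrics import parser

# A minimal OpenMetrics payload; note the "# EOF" end marker.
s = """test_packets{key="a",node="b"} 8
# EOF
"""

# The parser raises an error if the text does not conform to the draft.
for family in parser.text_string_to_metric_families(s):
    print(family)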

You could also use the official Golang library, which gained experimental support for a subset of OpenMetrics earlier this year.
And a few months ago, a CLI to test the OpenMetrics parser and the resulting exposition format was published.

Nothing more yet.

And, since it has been in the making for almost three years now, this is an issue!

We all know that simple looking things are usually difficult to do well.

What is missing is good, constant communication, and maybe tools like a test suite to develop against, so that the community can help shape it for good.

And finally give birth to it!

We recently announced full Prometheus compatibility in Sysdig Monitor, which includes support for OpenMetrics among other cool features. And lastly, if you enjoyed this article, you might also be interested in learning how to instrument your code with Prometheus metrics and OpenMetrics.