All posts by Richard "RichiH" Hartmann

Prometheus Conformance Program: First round of results

Post Syndicated from Richard "RichiH" Hartmann original https://prometheus.io/blog/2021/05/14/prometheus-conformance-results/


Today, we’re launching the Prometheus Conformance Program. While the legal paperwork still needs to be finalized, we ran tests, and we consider the below our first round of results. As part of this launch Julius Volz updated his PromQL test results.

As a quick reminder: The program is called Prometheus Conformance, software can be compliant to specific tests, which result in a compatibility rating. The nomenclature might seem complex, but it allows us to speak about this topic without using endless word snakes.

Preamble

New Categories

We found that it’s quite hard to reason about what needs to be applied to what software. To help sort my thoughts, we created an overview, introducing four new categories we can put software into:
* Metrics Exposers
* Agent/Collector
* Prometheus Storage Backends
* Full Prometheus Compatibility

Call for Action

Feedback is very much welcome. Maybe counter-intuitively, we want the community, not just Prometheus-team, to shape this effort. To help with that, we will launch a WG Conformance within Prometheus. As with WG Docs and WG Storage, those will be public and we actively invite participation.

As we alluded to recently, the maintainer/adoption ratio of Prometheus is surprisingly, or shockingly, low. In different words, we hope that the economic incentives around Prometheus Compatibility will entice vendors to assign resources in building out the tests with us. If you always wanted to contribute to Prometheus during work time, this might be the way; and a way that will have you touch a lot of highly relevant aspects of Prometheus. There’s a variety of ways to get in touch with us.

Register for being tested

You can use the same communication channels to get in touch with us if you want to register for being tested. Once the paperwork is in place, we will hand contact information and contract operations over to CNCF.

Test results

Full Prometheus Compatibility

We know what tests we want to build out, but we are not there yet. As announced previously, it would be unfair to hold this against projects or vendors. As such, test shims are defined as being passed. The currently semi-manual nature of e.g. the PromQL tests Julius ran this week mean that Julius tested sending data through Prometheus Remote Write in most cases as part of PromQL testing. We’re reusing his results in more than one way here. This will be untangled soon, and more tests from different angles will keep ratcheting up the requirements and thus End User confidence.

It makes sense to look at projects and aaS offerings in two sets.

Projects

Passing

  • Cortex 1.10.0
  • M3 1.3.0
  • Promscale 0.6.2
  • Thanos 0.23.1

Not passing

VictoriaMetrics 1.67.0 is not passing and does not intend to pass. In the spirit of End User confidence, we decided to track their results while they position themselves as a drop-in replacement for Prometheus.

aaS

Passing

  • Chronosphere
  • Grafana Cloud

Not passing

  • Amazon Managed Service for Prometheus
  • Google Cloud Managed Service for Prometheus
  • New Relic
  • Sysdig Monitor

NB: As Amazon Managed Service for Prometheus is based on Cortex just like Grafana Cloud, we expect them to pass after the next update cycle.

Agent/Collector

Passing

  • Grafana Agent 0.19.0
  • OpenTelemetry Collector 0.37.0
  • Prometheus 2.30.3

Not passing

  • Telegraf 1.20.2
  • Timber Vector 0.16.1
  • VictoriaMetrics Agent 1.67.0

NB: We tested Vector 0.16.1 instead of 0.17.0 because there are no binary downloads for 0.17.0 and our test toolchain currently expects binaries.

Prometheus Conformance Program: First round of results

Post Syndicated from Richard "RichiH" Hartmann original https://prometheus.io/blog/2021/10/14/prometheus-conformance-results/


Today, we’re launching the Prometheus Conformance Program with the goal of ensuring interoperability between different projects and vendors in the Prometheus monitoring space. While the legal paperwork still needs to be finalized, we ran tests, and we consider the below our first round of results. As part of this launch Julius Volz updated his PromQL test results.

As a quick reminder: The program is called Prometheus Conformance, software can be compliant to specific tests, which result in a compatibility rating. The nomenclature might seem complex, but it allows us to speak about this topic without using endless word snakes.

Preamble

New Categories

We found that it’s quite hard to reason about what needs to be applied to what software. To help sort my thoughts, we created an overview, introducing four new categories we can put software into:

  • Metrics Exposers
  • Agents/Collectors
  • Prometheus Storage Backends
  • Full Prometheus Compatibility

Call for Action

Feedback is very much welcome. Maybe counter-intuitively, we want the community, not just Prometheus-team, to shape this effort. To help with that, we will launch a WG Conformance within Prometheus. As with WG Docs and WG Storage, those will be public and we actively invite participation.

As we alluded to recently, the maintainer/adoption ratio of Prometheus is surprisingly, or shockingly, low. In different words, we hope that the economic incentives around Prometheus Compatibility will entice vendors to assign resources in building out the tests with us. If you always wanted to contribute to Prometheus during work time, this might be the way; and a way that will have you touch a lot of highly relevant aspects of Prometheus. There’s a variety of ways to get in touch with us.

Register for being tested

You can use the same communication channels to get in touch with us if you want to register for being tested. Once the paperwork is in place, we will hand contact information and contract operations over to CNCF.

Test results

Full Prometheus Compatibility

We know what tests we want to build out, but we are not there yet. As announced previously, it would be unfair to hold this against projects or vendors. As such, test shims are defined as being passed. The currently semi-manual nature of e.g. the PromQL tests Julius ran this week mean that Julius tested sending data through Prometheus Remote Write in most cases as part of PromQL testing. We’re reusing his results in more than one way here. This will be untangled soon, and more tests from different angles will keep ratcheting up the requirements and thus End User confidence.

It makes sense to look at projects and aaS offerings in two sets.

Projects

Passing

  • Cortex 1.10.0
  • M3 1.3.0
  • Promscale 0.6.2
  • Thanos 0.23.1

Not passing

VictoriaMetrics 1.67.0 is not passing and does not intend to pass. In the spirit of End User confidence, we decided to track their results while they position themselves as a drop-in replacement for Prometheus.

aaS

Passing

  • Chronosphere
  • Grafana Cloud

Not passing

  • Amazon Managed Service for Prometheus
  • Google Cloud Managed Service for Prometheus
  • New Relic
  • Sysdig Monitor

NB: As Amazon Managed Service for Prometheus is based on Cortex just like Grafana Cloud, we expect them to pass after the next update cycle.

Agent/Collector

Passing

  • Grafana Agent 0.19.0
  • OpenTelemetry Collector 0.37.0
  • Prometheus 2.30.3

Not passing

  • Telegraf 1.20.2
  • Timber Vector 0.16.1
  • VictoriaMetrics Agent 1.67.0

NB: We tested Vector 0.16.1 instead of 0.17.0 because there are no binary downloads for 0.17.0 and our test toolchain currently expects binaries.

On Ransomware Naming

Post Syndicated from Richard "RichiH" Hartmann original https://prometheus.io/blog/2021/06/10/on-ransomware-naming/

As per Oscar Wilde, imitation is the sincerest form of flattery.

The names “Prometheus” and “Thanos” have recently been taken up by a ransomware group. There’s not much we can do about that except to inform you that this is happening. There’s not much you can do either, except be aware that this is happening.

While we do NOT have reason to believe that this group will try to trick anyone into downloading fake binaries of our projects, we still recommend following common supply chain & security practices. When deploying software, do it through one of those mechanisms:

Unless you can reasonably trust the specific providence and supply chain, you should not use software.

As there’s a non-zero chance that the ransomware group chose the names deliberately and thus might come across this blog post: Please stop. With both the ransomware and the naming choice.

Prometheus Conformance Program: Remote Write Compliance Test Results

Post Syndicated from Richard "RichiH" Hartmann original https://prometheus.io/blog/2021/05/04/prometheus-conformance-remote-write-compliance/

As announced by CNCF and by ourselves, we’re starting a Prometheus conformance program.

To give everyone an overview of where the ecosystem is before running tests officially, we wanted to show off the newest addition to our happy little bunch of test suites: The Prometheus Remote Write compliance test suite tests the sender part of the Remote Write protocol against our specification.

During Monday’s PromCon, Tom Wilkie presented the test results from the time of recording a few weeks ago. In the live section, he already had an update. Two days later we have two more updates:
The addition of the observability pipeline tool Vector, as well as new versions of existing systems.

So, without further ado, the current results in alphabetical order are:

Sender Version Score
Grafana Agent 0.13.1 100%
Prometheus 2.26.0 100%
OpenTelemetry Collector 0.26.0 41%
Telegraf 1.18.2 65%
Timber Vector 0.13.1 35%
VictoriaMetrics Agent 1.59.0 76%

The raw results are:

--- PASS: TestRemoteWrite/grafana (0.01s)
    --- PASS: TestRemoteWrite/grafana/Counter (10.02s)
    --- PASS: TestRemoteWrite/grafana/EmptyLabels (10.02s)
    --- PASS: TestRemoteWrite/grafana/Gauge (10.02s)
    --- PASS: TestRemoteWrite/grafana/Headers (10.02s)
    --- PASS: TestRemoteWrite/grafana/Histogram (10.02s)
    --- PASS: TestRemoteWrite/grafana/HonorLabels (10.02s)
    --- PASS: TestRemoteWrite/grafana/InstanceLabel (10.02s)
    --- PASS: TestRemoteWrite/grafana/Invalid (10.02s)
    --- PASS: TestRemoteWrite/grafana/JobLabel (10.02s)
    --- PASS: TestRemoteWrite/grafana/NameLabel (10.02s)
    --- PASS: TestRemoteWrite/grafana/Ordering (26.12s)
    --- PASS: TestRemoteWrite/grafana/RepeatedLabels (10.02s)
    --- PASS: TestRemoteWrite/grafana/SortedLabels (10.02s)
    --- PASS: TestRemoteWrite/grafana/Staleness (10.01s)
    --- PASS: TestRemoteWrite/grafana/Summary (10.01s)
    --- PASS: TestRemoteWrite/grafana/Timestamp (10.01s)
    --- PASS: TestRemoteWrite/grafana/Up (10.02s)
--- PASS: TestRemoteWrite/prometheus (0.01s)
    --- PASS: TestRemoteWrite/prometheus/Counter (10.02s)
    --- PASS: TestRemoteWrite/prometheus/EmptyLabels (10.02s)
    --- PASS: TestRemoteWrite/prometheus/Gauge (10.02s)
    --- PASS: TestRemoteWrite/prometheus/Headers (10.02s)
    --- PASS: TestRemoteWrite/prometheus/Histogram (10.02s)
    --- PASS: TestRemoteWrite/prometheus/HonorLabels (10.02s)
    --- PASS: TestRemoteWrite/prometheus/InstanceLabel (10.02s)
    --- PASS: TestRemoteWrite/prometheus/Invalid (10.02s)
    --- PASS: TestRemoteWrite/prometheus/JobLabel (10.02s)
    --- PASS: TestRemoteWrite/prometheus/NameLabel (10.03s)
    --- PASS: TestRemoteWrite/prometheus/Ordering (24.99s)
    --- PASS: TestRemoteWrite/prometheus/RepeatedLabels (10.02s)
    --- PASS: TestRemoteWrite/prometheus/SortedLabels (10.02s)
    --- PASS: TestRemoteWrite/prometheus/Staleness (10.02s)
    --- PASS: TestRemoteWrite/prometheus/Summary (10.02s)
    --- PASS: TestRemoteWrite/prometheus/Timestamp (10.02s)
    --- PASS: TestRemoteWrite/prometheus/Up (10.02s)
--- FAIL: TestRemoteWrite/otelcollector (0.00s)
    --- FAIL: TestRemoteWrite/otelcollector/Counter (10.01s)
    --- FAIL: TestRemoteWrite/otelcollector/Histogram (10.01s)
    --- FAIL: TestRemoteWrite/otelcollector/InstanceLabel (10.01s)
    --- FAIL: TestRemoteWrite/otelcollector/Invalid (10.01s)
    --- FAIL: TestRemoteWrite/otelcollector/JobLabel (10.01s)
    --- FAIL: TestRemoteWrite/otelcollector/Ordering (13.54s)
    --- FAIL: TestRemoteWrite/otelcollector/RepeatedLabels (10.01s)
    --- FAIL: TestRemoteWrite/otelcollector/Staleness (10.01s)
    --- FAIL: TestRemoteWrite/otelcollector/Summary (10.01s)
    --- FAIL: TestRemoteWrite/otelcollector/Up (10.01s)
    --- PASS: TestRemoteWrite/otelcollector/EmptyLabels (10.01s)
    --- PASS: TestRemoteWrite/otelcollector/Gauge (10.01s)
    --- PASS: TestRemoteWrite/otelcollector/Headers (10.01s)
    --- PASS: TestRemoteWrite/otelcollector/HonorLabels (10.01s)
    --- PASS: TestRemoteWrite/otelcollector/NameLabel (10.01s)
    --- PASS: TestRemoteWrite/otelcollector/SortedLabels (10.01s)
    --- PASS: TestRemoteWrite/otelcollector/Timestamp (10.01s)
--- FAIL: TestRemoteWrite/telegraf (0.01s)
    --- FAIL: TestRemoteWrite/telegraf/EmptyLabels (14.60s)
    --- FAIL: TestRemoteWrite/telegraf/HonorLabels (14.61s)
    --- FAIL: TestRemoteWrite/telegraf/Invalid (14.61s)
    --- FAIL: TestRemoteWrite/telegraf/RepeatedLabels (14.61s)
    --- FAIL: TestRemoteWrite/telegraf/Staleness (14.59s)
    --- FAIL: TestRemoteWrite/telegraf/Up (14.60s)
    --- PASS: TestRemoteWrite/telegraf/Counter (14.61s)
    --- PASS: TestRemoteWrite/telegraf/Gauge (14.60s)
    --- PASS: TestRemoteWrite/telegraf/Headers (14.61s)
    --- PASS: TestRemoteWrite/telegraf/Histogram (14.61s)
    --- PASS: TestRemoteWrite/telegraf/InstanceLabel (14.61s)
    --- PASS: TestRemoteWrite/telegraf/JobLabel (14.61s)
    --- PASS: TestRemoteWrite/telegraf/NameLabel (14.60s)
    --- PASS: TestRemoteWrite/telegraf/Ordering (14.61s)
    --- PASS: TestRemoteWrite/telegraf/SortedLabels (14.61s)
    --- PASS: TestRemoteWrite/telegraf/Summary (14.60s)
    --- PASS: TestRemoteWrite/telegraf/Timestamp (14.61s)
--- FAIL: TestRemoteWrite/vector (0.01s)
    --- FAIL: TestRemoteWrite/vector/Counter (10.02s)
    --- FAIL: TestRemoteWrite/vector/EmptyLabels (10.01s)
    --- FAIL: TestRemoteWrite/vector/Headers (10.02s)
    --- FAIL: TestRemoteWrite/vector/HonorLabels (10.02s)
    --- FAIL: TestRemoteWrite/vector/InstanceLabel (10.02s)
    --- FAIL: TestRemoteWrite/vector/Invalid (10.02s)
    --- FAIL: TestRemoteWrite/vector/JobLabel (10.01s)
    --- FAIL: TestRemoteWrite/vector/Ordering (13.01s)
    --- FAIL: TestRemoteWrite/vector/RepeatedLabels (10.02s)
    --- FAIL: TestRemoteWrite/vector/Staleness (10.02s)
    --- FAIL: TestRemoteWrite/vector/Up (10.02s)
    --- PASS: TestRemoteWrite/vector/Gauge (10.02s)
    --- PASS: TestRemoteWrite/vector/Histogram (10.02s)
    --- PASS: TestRemoteWrite/vector/NameLabel (10.02s)
    --- PASS: TestRemoteWrite/vector/SortedLabels (10.02s)
    --- PASS: TestRemoteWrite/vector/Summary (10.02s)
    --- PASS: TestRemoteWrite/vector/Timestamp (10.02s)
--- FAIL: TestRemoteWrite/vmagent (0.01s)
    --- FAIL: TestRemoteWrite/vmagent/Invalid (20.66s)
    --- FAIL: TestRemoteWrite/vmagent/Ordering (22.05s)
    --- FAIL: TestRemoteWrite/vmagent/RepeatedLabels (20.67s)
    --- FAIL: TestRemoteWrite/vmagent/Staleness (20.67s)
    --- PASS: TestRemoteWrite/vmagent/Counter (20.67s)
    --- PASS: TestRemoteWrite/vmagent/EmptyLabels (20.64s)
    --- PASS: TestRemoteWrite/vmagent/Gauge (20.66s)
    --- PASS: TestRemoteWrite/vmagent/Headers (20.64s)
    --- PASS: TestRemoteWrite/vmagent/Histogram (20.66s)
    --- PASS: TestRemoteWrite/vmagent/HonorLabels (20.66s)
    --- PASS: TestRemoteWrite/vmagent/InstanceLabel (20.66s)
    --- PASS: TestRemoteWrite/vmagent/JobLabel (20.66s)
    --- PASS: TestRemoteWrite/vmagent/NameLabel (20.66s)
    --- PASS: TestRemoteWrite/vmagent/SortedLabels (20.66s)
    --- PASS: TestRemoteWrite/vmagent/Summary (20.66s)
    --- PASS: TestRemoteWrite/vmagent/Timestamp (20.67s)
    --- PASS: TestRemoteWrite/vmagent/Up (20.66s)

We’ll work more on improving our test suites, both by adding more tests & by adding new test targets. If you want to help us, consider adding more of our list of Remote Write integrations.

Introducing the Prometheus Conformance Program

Post Syndicated from Richard "RichiH" Hartmann original https://prometheus.io/blog/2021/05/03/introducing-prometheus-conformance-program/

Prometheus is the standard for metric monitoring in the cloud native space and beyond. To ensure interoperability, to protect users from suprises, and to enable more parallel innovation, the Prometheus project is introducing the Prometheus Conformance Program with the help of CNCF to certify component compliance and Prometheus compatibility.

The CNCF Governing Board is expected to formally review and approve the program during their next meeting. We invite the wider community to help improve our tests in this ramp-up phase.

With the help of our extensive and expanding test suite, projects and vendors can determine the compliance to our specifications and compatibility within the Prometheus ecosystem.

At launch, we are offering compliance tests for three components:
* PromQL (needs manual interpretation, somewhat complete)
* Remote Read-Write (fully automated, WIP)
* OpenMetrics (partially automatic, somewhat complete, will need questionnaire)

We plan to add more components. Tests for Prometheus Remote Read or our data storage/TSDB are likely as next additions. We explicitly invite everyone to extend and improve existing tests, and to submit new ones.

The Prometheus Conformance Program works as follows:

For every component, there will be a mark “foo YYYY-MM compliant”, e.g. “OpenMetrics 2021-05 compliant”, “PromQL 2021-05 compliant”, and “Prometheus Remote Write 2021-05 compliant”. Any project or vendor can submit their compliance documentation. Upon reaching 100%, the mark will be granted.

For any complete software, there will be a mark “Prometheus x.y compatible”, e.g. “Prometheus 2.26 compatible”. Relevant component compliance scores are multiplied. Upon reaching 100%, the mark will be granted.

As an example, the Prometheus Agent supports both OpenMetrics and Prometheus Remote Write, but not PromQL. As such, only compliance scores for OpenMetrics and Prometheus Remote Write are multiplied.

Both compliant and compatible marks are valid for 2 minor releases or 12 weeks, whichever is longer.