Tag Archives: Handy Tips

Saving Time with a Custom Zabbix Agent Installer

2025-12-04 Rizqi Firmansyah

Post Syndicated from Rizqi Firmansyah original https://blog.zabbix.com/saving-time-with-a-custom-zabbix-agent-installer/31843/

When managing large-scale infrastructure, the process of installing monitoring agents is often repetitive and time-consuming. Administrators must log into each server, manually run installation commands, and configure the agent to connect to the Zabbix server. To address this issue, the Zabbix Agent Deployer custom module was created. This module enables the direct installation of Zabbix agents on multiple hosts from the Zabbix Web interface.

The features of the Zabbix Agent Deployer module include:

Bulk host list input using a CSV file.
The ability to automatically add hosts to Zabbix and remotely install the Zabbix Agent on the associated hosts.
The ability to display installation log results directly within the module.

With this approach, administrators can add new hosts to the monitoring system faster and more efficiently.

Key use cases for the Zabbix Agent installer

The Zabbix Agent Deployer module enables several practical scenarios, including:

1. Faster provisioning for new servers – When adding a large number of servers, agents can be installed simultaneously without requiring a login to each machine.

2. Standardized installation – All agents are installed in the same way using a centralized script, reducing the risk of misconfiguration.

3. Easier additional provisioning – Users can provision new servers without having to configure the agent directly on each machine.

Getting started with the Zabbix Agent Deployer module

Solution overview architecture

To use this module, the main steps are:

1. Upload the custom module to the Zabbix frontend in the /usr/share/zabbix/modules/ directory.

2. Open the Administration → General → Modules page and click the Scan directory button. Locate the Zabbix agent deployer module and enable it.

3. Once activated, the Zabbix agent deployer module can be accessed in the Data Collection menu. Here’s a screenshot of the Zabbix agent deployer module.

4. Prepare a CSV file like the format below, or download a sample CSV from the module page.

With this CSV file, we will add two hosts to Zabbix to be monitored and automatically install the Zabbix agent on them.

5. Upload the CSV file to the Zabbix agent deployer module page and click Apply.

6. The Zabbix agent deployer module will handle the process of adding hosts to Zabbix and installing the Zabbix agent. The status can be seen as follows:

From the image above, server1 and server2 were successfully added to Zabbix, and the Zabbix agent installation was successful!

7. Check out the Zabbix hosts list page. Hosts will appear according to the uploaded CSV file.

Conclusion

The implementation of this custom Zabbix Agent installer extends Zabbix’s capabilities beyond its built-in functionality. The Zabbix Agent Deployer module enables a more efficient bulk host addition process, as all steps from adding hosts to Zabbix to installing the Zabbix agent can be integrated through a single page.

If you’re interested in implementing this, please contact us. Bangunindo is a premium Zabbix partner in Indonesia. We’re ready to help you design, implement, and optimize your Zabbix solution to suit your needs.

The post Saving Time with a Custom Zabbix Agent Installer appeared first on Zabbix Blog.

Aruba Central API Monitoring with Zabbix

2025-11-25 Tibor Volanszki

Post Syndicated from Tibor Volanszki original https://blog.zabbix.com/aruba-central-api-monitoring-with-zabbix/31370/

Aruba Central is a SaaS solution that allows you to manage your Enterprise Aruba network environment. Due to the increasing number of cloud migrations, we can expect that more and more Aruba customers will move their on-premises environment to it, which will also mean a change in their monitoring environment. In this article, I will show you how to switch to API-based monitoring using Aruba Central and Zabbix. All custom resources mentioned can be found in my repository.

Aruba Central’s API

OAuth 2.0 is used, so you can forget about simple token management. In the end it is great, but for monitoring purposes it is overkill. There is pretty good documentation (referred to later) on how to generate your access token, but the token expires after two hours, so you need to refresh it continually. To do this, you must use a refresh token, which gets you a new access token AND a new refresh token.

Within two hours, use the latest refresh token to repeat this action. At this point you can imagine that this is not something you can implement easily using the Zabbix GUI alone. Well, maybe with some JavaScript magic, but otherwise there is no native support for this logic at this point in time. So how can we do this? In short:

Generate your client credentials
Generate your first token
Schedule the token refresh for every two hours
Update your host macro via Zabbix API
Use the token in Zabbix HTTP agent checks
Monitor your environment based on JSONPath pre-processing
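
To make the rotation logic concrete, here is a minimal sketch of the bookkeeping behind the steps above. It is not part of the original scripts: the function names and the 10-minute safety margin are my own assumptions, and only the access_token, refresh_token, and expires_in fields are taken from the token responses handled later in this article.

```javascript
// Sketch only: isRefreshDue and rotateTokens are illustrative helper names.

// Return true once the access token is within marginSec of expiring.
function isRefreshDue(issuedAtSec, expiresInSec, nowSec, marginSec) {
    return nowSec >= issuedAtSec + expiresInSec - marginSec;
}

// Build the next stored state from a token response.
// Both tokens are replaced together, because the next refresh
// has to use the latest refresh token.
function rotateTokens(response, nowSec) {
    if (!response || !response.access_token || !response.refresh_token) {
        throw new Error("token response is missing access_token or refresh_token");
    }
    return {
        accessToken: response.access_token,
        refreshToken: response.refresh_token,
        issuedAtSec: nowSec,
        expiresInSec: response.expires_in
    };
}

// Example with made-up token values and a two-hour (7200 s) lifetime.
var state = rotateTokens({ access_token: "A1", refresh_token: "R1", expires_in: 7200 }, 1000);
console.log(isRefreshDue(state.issuedAtSec, state.expiresInSec, 1000 + 3600, 600)); // false
console.log(isRefreshDue(state.issuedAtSec, state.expiresInSec, 1000 + 6700, 600)); // true
```

The margin is there so that a slightly late scheduler run still refreshes before the token actually expires.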

Initial steps within Aruba Central

To manage your API access, you need to launch your “HPE Aruba Networking Central” application, so do NOT look into your workspace modules – the “Personal API clients” menu is NOT what we are looking for. Turn off the “New Central” view – at this point the early access version is not so useful (hopefully it will change soon).

The first time you get there, you will not see any items, but under the “My Apps & Tokens” tab you can click the “Add Apps & Tokens” button and generate a new entry. Technically, this is already enough to start monitoring your network infrastructure, but within two hours it would stop. The relevant data for us are the “Client ID” and “Client Secret.” Feel free to revoke the recently created token at the bottom area as we do not need it.

Record your credentials

For this article, I am using a simple file to store all the credentials, which will be sourced into a bash script. Please keep in mind that storing your sensitive credentials in a single file is a BAD practice! Your SECO/CISO would probably have a few words with you about it, so please consider a better approach. A more secure way would be to use a key vault solution (like Azure, AWS, Google, or HashiCorp). Anyway, let’s continue with this insecure example:

#!/bin/bash

### ZABBIX VARS ###

# URL of your zabbix instance (assuming you do not use the "/zabbix" ending, if yes, then add it to the end)
zabbix_url="https://your.zabbix.instance.net"
# Your Zabbix API token. If you do not know how to get it, check the documentation.
zabbix_api_token="1234_your_zabbix_api_key_5678"
# Create a host with a macro, remain at the "Macros" tab, turn on debug mode, look for "[hostmacroid] =>"
zabbix_macro_id="12345"

### ARUBA VARS ###
# To find yours, go here and check "Table: Domain URLs for API Gateway Access"
base_url="YOUR_ARUBA_CENTRAL_BASE_URL"
# Click on your profile in the Central app and you will find it there: 32 char long hexa string
client_id="YOUR_CLIENT_ID"
# provided in the previous step
client_secret="YOUR_CLIENT_SECRET"
# provided in the previous step
customer_id="YOUR_CUSTOMER_ID"
# your login credential
account_username="YOUR_CENTRAL_LOGIN_USERNAME"
# your login credential
account_password="YOUR_CENTRAL_LOGIN_PASSWORD"
# to be populated later
csrftoken=""
session=""
auth_code=""

Get or refresh your token and update the Zabbix host macro

The next steps are based on the official Aruba documentation, which you can find here. Please remember that there are many ways to achieve our target – this is just one example and probably not the most optimal one. Feel free to change / improve it with your code in your preferred scripting language.

The script below assumes that the file containing the credentials (previous step) is named “variables” and located in a folder named “central.”

Filename: aruba_central_token_new.sh

Purpose: To be used for first time token generation. Later, you only have to refresh your token with the script after this one.

Remarks: Aruba rate-limits this set of API calls, so you can run the script only ONCE every 30 minutes! If you made a typo somewhere, wait 30 minutes before your next attempt or tweak the result files.

#!/bin/bash

basedir=central
source $basedir/variables

curl -s --noproxy '*' -v --cookie-jar $basedir/cookie --location --request POST "$base_url/oauth2/authorize/central/api/login?client_id=$client_id" \
--header "Content-Type: application/json" \
--data-raw "{
    \"username\": \"$account_username\",
    \"password\": \"$account_password\"
}" > $basedir/result1.raw 2>&1

grep 'Added cookie' $basedir/result1.raw > $basedir/result1.filtered

csrftoken=$(grep csrftoken $basedir/result1.filtered | awk -F '"' '{print $2}')
session=$(grep session $basedir/result1.filtered | awk -F '"' '{print $2}')

curl -s --noproxy '*' --request POST "$base_url/oauth2/authorize/central/api?client_id=$client_id&response_type=code&scope=all" \
--header "Content-Type: application/json" \
--header "Cookie: session=$session" \
--header "X-CSRF-Token: $csrftoken" \
--data-raw "{
\"customer_id\": \"$customer_id\"
}" > $basedir/result2.raw

auth_code=$(cat $basedir/result2.raw | jq -r .auth_code)

curl -s --noproxy '*' --request POST "$base_url/oauth2/token" \
--header "Content-Type: application/json" \
--data "{
    \"client_id\": \"${client_id}\",
    \"client_secret\": \"${client_secret}\",
    \"grant_type\": \"authorization_code\",
    \"code\": \"${auth_code}\"         
}" > $basedir/result3.raw

refresh_token=$(cat $basedir/result3.raw | jq -r .refresh_token)
access_token=$(cat $basedir/result3.raw | jq -r .access_token)

if [ "$refresh_token" == "null" ]; then
    echo "something went wrong... exiting now"
    exit 1
fi

echo $access_token > $basedir/token_access.latest
echo $refresh_token > $basedir/token_refresh.latest

echo "access_token: $access_token"
echo "refresh_token: $refresh_token"

curl -s --request POST \
--url "$zabbix_url/api_jsonrpc.php" \
--header "Authorization: Bearer $zabbix_api_token" \
--header "Content-Type: application/json-rpc" \
--data "{\"jsonrpc\": \"2.0\",\"method\": \"usermacro.update\",\"params\": {\"hostmacroid\": \"${zabbix_macro_id}\",\"value\": \"${access_token}\"},\"id\": 1}"

rm -f $basedir/cookie

Filename: aruba_central_token_refresh.sh

Purpose: To refresh your existing token. It expects an existing refresh token in the “token_refresh.latest” file, so run the previous script once before using this one.

Remarks: You can run this script as many times as you want, but it will return new tokens only once every two hours (when the current one expires). Therefore, refreshing too frequently is pointless.

#!/bin/bash

basedir=central
source $basedir/variables

refresh_token_current=$(cat $basedir/token_refresh.latest | tr -d '\n')
refresh_token_new=""

curl -s --noproxy '*' --request POST "$base_url/oauth2/token?client_id=$client_id&client_secret=$client_secret&grant_type=refresh_token&refresh_token=$refresh_token_current" > $basedir/result4.raw

refresh_token_new=$(cat $basedir/result4.raw | jq -r .refresh_token)
access_token_new=$(cat $basedir/result4.raw | jq -r .access_token)
expires_in=$(cat $basedir/result4.raw | jq -r .expires_in)

if [ "$refresh_token_new" == "null" ]; then
    echo "something went wrong... exiting now"
    exit 1
fi

echo $access_token_new > $basedir/token_access.latest
echo $refresh_token_new > $basedir/token_refresh.latest

echo "access_token: $access_token_new"
echo "refresh_token: $refresh_token_new"
echo "expires_in: $expires_in"

curl -s --request POST \
--url "$zabbix_url/api_jsonrpc.php" \
--header "Authorization: Bearer $zabbix_api_token" \
--header "Content-Type: application/json-rpc" \
--data "{\"jsonrpc\": \"2.0\",\"method\": \"usermacro.update\",\"params\": {\"hostmacroid\": \"${zabbix_macro_id}\",\"value\": \"${access_token_new}\"},\"id\": 1}"

In my case, both the scripts and variables files are in the same “central” folder, which is in a git repository. Each time I call one of the scripts, it will record the new tokens in files, which are committed and pushed to the repo. In my own implementation, this is how I call the refresh script and sync the result with my repo:

git checkout master

basedir=central
source $basedir/variables
bash $basedir/aruba_central_token_refresh.sh

git add .
git commit -m "save the new tokens"
git push origin master

Schedule your token management

You must run your refresh script at least once every two hours. To make this happen you have many options, including:

cron (old-school, outdated way)
systemd timer (a better way, but only if it is monitored)
Jenkins / GitHub Actions / etc.
Zabbix itself, by calling your bash script

In my case, Jenkins does the scheduling and execution and the job is monitored via Zabbix.

Monitor your network infrastructure

When everything is in place, then the monitoring part is pretty simple. The usual JSONPath based logic can be used. API call documentation can be found here. The template contains only the wireless components, since I do not have my switches in Central. Implementing the switching part should not be difficult – just have a look at the “Switch” section, then clone and adjust one of your “get” items.

Screenshots

Latest data – tag based filtering:

Latest data – Site health

Latest data – Gateway info

Latest data – AP info

Triggers:

Some triggers are intentionally disabled, because they are a bit redundant. However, I wanted to cover all options. Sometimes less alerting is better if you have a ticketing system integration, otherwise your monitoring system will turn into a ticket factory.

Known issues and limitations

Since we are not querying the devices directly, some delay can be expected. Based on my recent testing, the delay compared to real time is between 3 and 10 minutes. In my test I disconnected my test environment and then triggered manual updates frequently. Some items picked up the real state earlier, some only later.

If your refresh script malfunctions for whatever reason (normally it should not), you may have to run the other script once to generate a new token, or you can go to the GUI and look up the last refresh token, which you can use to overwrite the content of the “token_refresh.latest” file.

Aruba is limiting the number of API queries to 5,000 per day. This could seem annoying, but it is way more than what you need (you should expect less than 1,000 in normal conditions, depending on your update frequency).
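
If you want to check your own setup against that limit, the arithmetic is simple. The sketch below is only an estimate: the three endpoints and the 5-minute interval are example values I picked, and I am assuming that token refresh calls count toward the same daily quota.

```javascript
// Estimate how many API calls per day a set of polled endpoints generates.
function dailyCalls(endpointCount, intervalMinutes) {
    var pollsPerDay = Math.floor(1440 / intervalMinutes); // 1440 minutes in a day
    return endpointCount * pollsPerDay;
}

// Token refreshes per day (assumed to count toward the same quota).
function dailyRefreshes(refreshIntervalHours) {
    return Math.ceil(24 / refreshIntervalHours);
}

var DAILY_LIMIT = 5000; // daily limit mentioned above

// Example: 3 "get" items polled every 5 minutes, token refreshed every 2 hours.
var total = dailyCalls(3, 5) + dailyRefreshes(2);
console.log(total, total <= DAILY_LIMIT); // 876 true
```

With these example numbers you stay below 1,000 calls per day, so there is plenty of headroom before the limit becomes a concern.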

The Zabbix API will not authorize your call unless you insert the following line into your Apache vhost configuration. This is a more generic Zabbix API issue that is not related to Aruba Central.

SetEnvIf Authorization "(.*)" HTTP_AUTHORIZATION=$1

If Aruba Central has a maintenance activity, the token refresh cycle could break. Running the token request script once should address the issue.

Summary

Aruba Central’s API is pretty decent, but if you start from zero it could take a while to get to the end of it. With this guide, my intention was to speed you up, but please do not consider my scripts and the shown example as the only or best possible way – I’m just hoping it can give you a good base for your own solution. Have fun!

Making PaperCut NG Observable with Zabbix

2025-11-18 Patrik Uytterhoeven

Post Syndicated from Patrik Uytterhoeven original https://blog.zabbix.com/making-papercut-ng-observable-with-zabbix/31244/

In most organizations, printing is an essential but often invisible service. When it works, nobody notices. When it fails, productivity stalls. That’s why monitoring your print environment is just as important as monitoring servers, databases, or network devices.

At Opensource ICT Solutions, we specialize in turning complex systems into observable services. One recent example is our integration of PaperCut NG with Zabbix. This allows IT teams to track the health of their print infrastructure in real-time — everything from server resources to individual printers and devices.

Why monitoring PaperCut matters

PaperCut NG does much more than queue print jobs. It enforces quotas, integrates with authentication systems, and manages fleets of devices. If the database runs out of connections, the disk fills up, or the license expires, users feel the impact instantly.

By integrating PaperCut with Zabbix, we make these risks visible long before they become business problems. The result is:

Proactive detection of printer errors, low toner, or license issues.
Capacity planning through trend analysis of disk usage, memory, and DB connections.
Unified visibility — PaperCut health checks appear right alongside servers, networks, and applications in Zabbix dashboards.

How the integration works

The magic happens through the PaperCut System Health API and Zabbix’s flexible data collection methods.

HTTP agent items

Zabbix fetches raw JSON data directly from PaperCut using an HTTP agent item, such as:

This single call provides a full snapshot of server health.

Dependent items + JSONPath

Instead of hammering the API with multiple requests, we extract the needed fields using dependent items with JSONPath preprocessing.

For example:

This design means one request can populate dozens of metrics, keeping monitoring both efficient and lightweight.
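
To illustrate the idea of one request feeding many metrics, here is a small sketch. The payload and its field names are hypothetical placeholders rather than the documented PaperCut schema, and the extract helper only mimics the simplest dotted form of JSONPath; in Zabbix itself the JSONPath preprocessing step does this work for you.

```javascript
// Hypothetical master item value: field names are illustrative assumptions.
var raw = JSON.stringify({
    database: { activeConnections: 12, maxConnections: 420 },
    license: { valid: true },
    disk: { freeMB: 51200 }
});

// Resolve a simple dotted path such as "$.database.activeConnections".
// Only plain object keys are handled; real JSONPath supports far more.
function extract(json, path) {
    var node = JSON.parse(json);
    var keys = path.replace(/^\$\.?/, "").split(".");
    for (var i = 0; i < keys.length; i++) {
        if (keys[i] === "") { continue; }
        if (node === null || typeof node !== "object" || !(keys[i] in node)) {
            throw new Error("path not found: " + path);
        }
        node = node[keys[i]];
    }
    return node;
}

// Each dependent item reads a different field from the same payload.
console.log(extract(raw, "$.database.activeConnections")); // 12
console.log(extract(raw, "$.license.valid")); // true
```

The payload is fetched once, and every additional metric costs only another path lookup.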

Calculated items

Some values aren’t directly available from PaperCut. In those cases, we create calculated items inside Zabbix.

For example, the percentage of active DB connections is derived as:

This allows us to set intelligent triggers like “DB connections > 90%” without requiring PaperCut to calculate it for us.
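
The calculation itself is just a ratio. The sketch below shows it outside of Zabbix, assuming two input values (active and maximum connections) that already come from dependent items; the function name and the sample numbers are made up for illustration.

```javascript
// Percentage of the connection pool in use, rounded to two decimals.
function dbConnectionPct(active, max) {
    if (typeof active !== "number" || typeof max !== "number" || max <= 0) {
        throw new Error("active and max must be numbers, and max must be positive");
    }
    return Math.round((active / max) * 10000) / 100;
}

console.log(dbConnectionPct(378, 420)); // 90
console.log(dbConnectionPct(379, 420) > 90); // true, this would cross a "> 90%" threshold
```

Guarding against a zero or missing maximum matters, because a division by zero would otherwise leave the item without a usable value.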

Low-level discovery (LLD) for devices and printers

Perhaps the most powerful part of this integration is automatic discovery.

Printer LLD → Queries /api/health/printers and creates items and triggers per printer. If a printer goes into Paper Jam or No Toner, Zabbix knows immediately.
Device LLD → Queries /api/health/devices and builds items dynamically for each discovered device, tracking states like OK, WARNING, or ERROR.

This ensures that new printers and devices are monitored automatically — no manual configuration required!
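
As a rough idea of what such a discovery rule has to produce, here is a sketch of a preprocessing-style transformation. The response shape, the printer names, and the {#PRINTER} macro name are assumptions for illustration only; check the template for the actual implementation.

```javascript
// Hypothetical response from the printers endpoint; field names are assumptions.
var value = JSON.stringify({
    printers: [
        { name: "floor1-color", status: "OK" },
        { name: "floor2-mono", status: "NO_TONER" }
    ]
});

// Turn the printer list into low-level discovery rows, one per printer.
function toLld(json) {
    var data = JSON.parse(json);
    var printers = data.printers || [];
    var lld = [];

    for (var i = 0; i < printers.length; i++) {
        // Skip entries without a name, since the name becomes the LLD macro value.
        if (printers[i] && printers[i].name) {
            lld.push({ "{#PRINTER}": printers[i].name });
        }
    }

    return JSON.stringify({ data: lld });
}

console.log(toLld(value));
// {"data":[{"{#PRINTER}":"floor1-color"},{"{#PRINTER}":"floor2-mono"}]}
```

Item and trigger prototypes can then reference the macro, so a printer that appears in the response later gets its own items on the next discovery run.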

Why this matters

Bringing all of this together, the integration turns PaperCut NG into a fully observable service inside Zabbix.

Efficiency → One API call, dozens of metrics.
Scalability → Automatic discovery of printers and devices.
Robustness → Alerts and dashboards for licenses, resources, and print queues.

For IT teams, this means fewer surprises, faster troubleshooting, and more confidence in a service that often goes unnoticed until it fails.

Our expertise

This PaperCut integration is just one example of how we at Opensource ICT Solutions help organizations unlock the full potential of Zabbix. We don’t just install monitoring – we design intelligent, scalable integrations that make hidden systems visible. Whether it’s print management, databases, custom applications, or network devices, we know how to extend Zabbix to fit your environment and give you the insights that matter most.

Feel free to download our template and documentation for free from our GitHub: https://github.com/OpensourceICTSolutions/ZabbixPapercutNG

Want to make your business-critical systems truly observable? Let’s talk about how we can tailor Zabbix to your needs.

Monitoring Website Changes with Zabbix Browser Item

2025-11-11 Adi Rusmanto

Post Syndicated from Adi Rusmanto original https://blog.zabbix.com/monitoring-website-changes-with-zabbix-browser-item/31684/

In today’s digital era, information is an asset and most of it is obtained from websites. The ability to automatically monitor website content changes has become a crucial competitive advantage, as even small changes on a website can affect business strategies, security postures, and data-driven decision-making. Accordingly, Zabbix 7.0 saw the introduction of a new feature called Browser Item, which allows users to perform advanced website monitoring using a browser.

The Browser Item feature includes the ability to:

● Capture screenshots of the current website state
● Measure website performance and availability metrics
● Extract and analyze data from web pages
● Generate automatic alerts based on detected changes or errors

This means Zabbix is no longer limited to traditional IT infrastructure monitoring. It can now also serve as a tool for monitoring strategic external information.

Key use cases for website change monitoring with Zabbix

The Zabbix Browser Item opens up many valuable use cases for organizations that want to proactively track website changes. Below are some key examples:

Monitoring release notes

Tracking vendor release notes is essential for IT teams. With Zabbix, we can automatically detect new releases, extract relevant information, and notify the appropriate team members so they can respond faster.

Tracking security advisories

Security advisories are critical for maintaining a strong security posture. By monitoring websites that publish vulnerability information using Zabbix, security teams can be promptly alerted about new threats and take timely actions to reduce risks.

Monitoring competitor websites

In a competitive market, staying informed about competitor activities is vital. Zabbix allows users to monitor competitor websites for pricing updates, new product offerings, marketing campaigns, or news announcements, while providing valuable business intelligence to support strategic decisions.

Monitoring tender announcements

Zabbix can also monitor websites for new tender announcements from government portals or business partners, ensuring our organization stays aware of the latest business opportunities.

Ensuring internal website integrity

Beyond external sites, we can also use the Browser Item to ensure the integrity and availability of our own websites. It helps detect unexpected content changes, broken links, or performance degradation that may affect the user experience or signal potential issues. Proactive monitoring helps maintain a high-quality user experience and protect our brand reputation.

Getting started with website change monitoring in Zabbix

Solution overview architecture

This diagram shows how Zabbix uses a WebDriver to capture and analyze website content. The collected data is stored in Zabbix for visualization and alerts when changes are detected.

Step-by-step configuration

In this example, we’ll monitor changes on the Nginx Security Advisories webpage.

Step 1: Prepare the WebDriver

Zabbix requires a WebDriver to perform browser-based monitoring. One commonly used option is Selenium, which can be deployed using the following Docker image:

https://hub.docker.com/r/selenium/standalone-chrome

Step 2: Configure WebDriverURL on Zabbix server or proxy

Update the WebDriverURL parameter in your Zabbix Server or Zabbix Proxy configuration to point to the Selenium service you deployed.

Step 3: Create a Browser Item in Zabbix

1. Create a host if it doesn’t already exist.

2. Add a new item with the following settings:

Type: Browser
Type of information: Text

The key part is the script section. Below is the example script.

The script uses two methods:

browser.navigate method defines the URL to be monitored
browser.findElements method specifies the page section where changes should be detected

Note: The StartBrowserPollers parameter must be enabled on the Zabbix server or proxy configuration for browser items to work. It is enabled by default with the value StartBrowserPollers=1.

Step 4: Create dependent items

The Browser Item produces a JSON result containing website data. This item serves as the master item for dependent items such as:

Extracting the latest security advisories
Capturing a website screenshot

Step 5: Create a trigger for change alerts

Create a trigger that compares the current and previous values of the “latest security advisories” item. If any change is detected, Zabbix will automatically send an alert notifying your team of the update.

Step 6: Display data on the dashboard

To visualize the monitored data, we can use the Item History widget on a Zabbix dashboard to show both the latest security advisories and the corresponding screenshot, for example.

Conclusion

The Browser Item feature in Zabbix 7.0 elevates website monitoring beyond simple availability checks. It enables comprehensive monitoring of website changes, unlocking a variety of use cases such as tracking release notes, security advisories, competitor activity, and more.

If you’re interested in implementing this capability, feel free to contact us. Bangunindo is a Zabbix Premium Partner in Indonesia, ready to help you design, implement, and optimize your Zabbix monitoring solution to fit your specific needs.

Monitoring a Starlink Dish with Zabbix

2025-10-21 Alexander Petrov-Gavrilov

Post Syndicated from Alexander Petrov-Gavrilov original https://blog.zabbix.com/monitoring-a-starlink-dish-with-zabbix/31543/

Did you realize that you can monitor a Starlink dish using just Zabbix? The idea (or rather the need) to use Starlink came to me almost as soon as I moved to a fairly rural area. Local internet providers have not yet “provided” fiber-optic or stable mobile connectivity to places like this, and while searching for a solution I accidentally discovered that Starlink was already providing service to some local companies. As I later found out, they also offered service in my area for residential customers.

To make a long story short, since internet access is crucial in the IT field, I decided to acquire and then monitor my very own Starlink dish. At first, this proved challenging because regular user data access is quite limited. However, thanks to Zabbix browser monitoring, I managed to solve it fairly easily. In this post I will share my solution with you, including the template.

Monitoring configuration

First, you need to make sure you have Zabbix installed (either a Zabbix proxy or server) on the same network that the Starlink dish and router are on. The next step is to configure Zabbix for browser monitoring.

WebDriver installation

podman run --name webdriver -d \
-p 4444:4444 \
-p 7900:7900 \
--shm-size="2g" \
--restart=always docker.io/selenium/standalone-chrome:latest

Port 4444 will be the port on which the WebDriver will be listening, and port 7900 will be used by NoVNC, which allows us to observe browser behavior in case a browser with a GUI is used.

Zabbix server/proxy configuration

After WebDriver is installed, we need to set up the communication between Zabbix and the driver. This can be done by editing the Zabbix server/proxy configuration file and updating the following parameters:

### Option: WebDriverURL 
# WebDriver interface HTTP[S] URL. For example http://localhost:4444 used with 
# Selenium WebDriver standalone server. 
# 
# WebDriverURL= 
WebDriverURL=http://localhost:4444 
### Option: StartBrowserPollers 
# Number of pre-forked instances of browser item pollers. 
# 
# Range: 0-1000 
# StartBrowserPollers=1 
StartBrowserPollers=5

With the configuration parameters in place, restart the Zabbix server/proxy to apply the changes:

systemctl restart zabbix-server

Creating a host

First, we need to navigate to the “Data collection” > “Hosts” section and create a host that represents our Starlink dish. The host in my example will look like this:

The host also has a user macro:

{$LINK} with value: http://webapp.starlink.com to point to the correct Starlink dish web app:

Creating a browser item

We will now configure our browser item to collect and monitor the list of metrics exposed in the Starlink browser app:

We are using the bare minimum here, so set the update interval to be as frequent as you need. However, I would not recommend updating more frequently than every 5 minutes. It’s also not a good idea to store the history of this item, since the data is already stored through dependent items.

The most important part of the item is the script itself:

var browser, result;
var opts = Browser.chromeOptions();

opts.capabilities.alwaysMatch['goog:chromeOptions'].args = [];
browser = new Browser(opts);
browser.setScreenSize(Number(1980), Number(1020));

try {
    var params = JSON.parse(value);
    browser.navigate(params.url);

    // Wait for the dish to report status
    Zabbix.sleep(2000);

    // Find the JSON text element(s)
    var jsonElements = browser.findElements("xpath", "//div[@id='root']/div[@class='App']/div[@class='Main']/div[2]/div[@class='Section'][2]/pre[@class='Json-Format']/div[@class='Json-Text']");
    var extractedData = [];

    for (var i = 0; i < jsonElements.length; i++) {
        var text = jsonElements[i].getText();

        // Try parsing JSON
        try {
            extractedData.push(JSON.parse(text));
        } catch (e) {
            // If not valid JSON, include raw text instead
            extractedData.push({ raw: text, error: "Invalid JSON format" });
        }
    }

    // Collect result 
    result = browser.getResult();

    // Replace with parsed JSON data
    result.extractedJsonData = extractedData.length === 1 ? extractedData[0] : extractedData;

}
catch (err) {
    if (!(err instanceof BrowserError)) {
        browser.setError(err.message);
    }
    result = browser.getResult();
}
finally {
    // Return a clean JSON object
    return JSON.stringify(result.extractedJsonData);
}

So what does this script do? It opens the Starlink web app, waits for the Starlink dish to output all the status data, and, after a bit of parsing, returns the data highlighted in the screenshot:

Now we can click on the three dots on the left of our newly created item in the items page and proceed to create dependent items for each value we are interested in!

Creating dependent items

Now we just click here:

As an example, to create an item that monitors the hardware version we can create an item like this:

With JSONPath preprocessing:

In the end we get the data in Zabbix:

All other items (except alerts) will follow the same logic – just update the item name, key, and JSONPath in preprocessing to extract the required values.

Creating dependent LLD item prototypes

To automate the alerts items creation, we can create a dependent discovery rule. In the “Discovery” section, create a new discovery rule:

With preprocessing using JavaScript:

var data = JSON.parse(value);
var alerts = data.alerts;
var lld = [];

for (var key in alerts) {
    if (alerts.hasOwnProperty(key)) {
        lld.push({
            "{#ALERT}": key
        });
    }
}

return JSON.stringify({ data: lld });

This will provide us with the following JSON data:

{
  "data": [
    {
      "{#ALERT}": "dishIsHeating"
    },
    {
      "{#ALERT}": "dishThermalThrottle"
    },
    {
      "{#ALERT}": "dishThermalShutdown"
    },
    {
      "{#ALERT}": "powerSupplyThermalThrottle"
    },
    {
      "{#ALERT}": "motorsStuck"
    },
    {
      "{#ALERT}": "mastNotNearVertical"
    },
    {
      "{#ALERT}": "slowEthernetSpeeds"
    },
    {
      "{#ALERT}": "softwareInstallPending"
    },
    {
      "{#ALERT}": "movingTooFastForPolicy"
    },
    {
      "{#ALERT}": "obstructed"
    }
  ]
}

All that’s left to do is to create a dependent item prototype:

With preprocessing, of course:

JSONPath extracts each specific alert, and “Boolean to decimal” saves us some space in the database by transforming true/false booleans into digits.
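
To show what those two preprocessing steps do to the data, here is a small stand-alone sketch. The alert names come from the discovery output above, while the true/false values and the function name are made up; I am also assuming the alerts sit under an "alerts" key, as in the discovery script.

```javascript
// Trimmed sample of the master item value with made-up alert states.
var master = JSON.stringify({
    alerts: { dishIsHeating: false, motorsStuck: false, obstructed: true }
});

// Rough equivalent of a JSONPath step that selects one alert,
// followed by the "Boolean to decimal" step.
function alertToDecimal(json, alertName) {
    var alerts = JSON.parse(json).alerts || {};

    if (!alerts.hasOwnProperty(alertName)) {
        throw new Error("unknown alert: " + alertName);
    }
    if (typeof alerts[alertName] !== "boolean") {
        throw new Error("alert value is not a boolean: " + alertName);
    }
    return alerts[alertName] ? 1 : 0;
}

console.log(alertToDecimal(master, "obstructed"));    // 1
console.log(alertToDecimal(master, "dishIsHeating")); // 0
```

Storing 1 and 0 instead of text also makes trigger expressions simpler, since they can compare against a number.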

Result

In the end, we can monitor all the data:

Even more data can be collected using exporters – if you are willing to do a bit of extra configuration, of course! Let me know if you are interested, and I will show you a completely different approach with a template.

Before I forget, the template used in this tutorial can be found here.

The post Monitoring a Starlink Dish with Zabbix appeared first on Zabbix Blog.

Running Zabbix with MariaDB and Galera Active/Active Clustering

2025-09-30 Nathan Liefting

Post Syndicated from Nathan Liefting original https://blog.zabbix.com/running-zabbix-with-mariadb-and-galera-active-active-clustering/31104/

High availability on a platform like Zabbix is a hard requirement for many users. With native high availability on the Zabbix servers, proxies, and at the frontend through various solutions for web servers, all that’s left is at the database layer. Any downtime in your MariaDB database would disrupt your monitoring availability, at the least on the frontend side of things in case of proxy buffering. Let’s have a look at the easiest way to create a high availability (HA) architecture for Zabbix using MariaDB with built-in Galera clustering – by removing single points of failure from your database and finalizing the HA puzzle for Zabbix.

Architecture overview

Let’s start off with the number one design requirement for MariaDB + Galera. For a proper quorum to be made, 3 nodes should be used in the cluster. With only two nodes in a Galera cluster, quorum rules become a bit of a headache, as Galera uses a majority vote (more than half the nodes) to decide if the cluster can still accept writes. In a two-node setup, all is good while both nodes are online. But when we lose one node, quorum is lost and that node needs to rejoin.

This makes a two-node setup fragile but not impossible, and it does work with Zabbix since we only have one Zabbix server active at a time. In a split-brain scenario where both nodes think they were the last to leave, you might have to decide which node you think has your up-to-date data. We will detail both scenarios, but the principle remains the same. We will use MariaDB as our database and Galera will be used to create a primary/primary cluster. In such a cluster, all nodes in the cluster are writeable, which is great for the Zabbix native HA.

When we look in the Zabbix database, we can see that Zabbix keeps all of its Zabbix server HA information and states in the database.

This means that whatever one Zabbix server node writes into the database will also be replicated to all other nodes in the MariaDB Galera cluster.

The design

Knowing what we know now, we can create a very simple design for a solid Zabbix HA setup with MariaDB + Galera. When we have a single Zabbix frontend and we keep to the MariaDB + Galera requirement of having 3 database nodes, we get a fairly simple setup, as seen below.

In this setup, each Zabbix server connects to its own database node and we don’t need the added complexity of load balancers. We still get automatic failover for the Zabbix servers, as they know exactly which node is active through the database. However, in this situation we are still left with 3 frontends that do not have automatic failover, simply because we do not have a database-aware Apache or NGINX. This also works in a two-database setup, with the side note that you might have quorum issues to resolve manually after an outage:

Adding onto this setup, we could install a VIP, load balancer, or something like HAProxy in front of the frontend to make a failover happen there as well. Keep in mind though, the failover needs to happen based on whether or not the web frontend can reach a writeable database.
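As a rough illustration of what “can reach a writeable database” could mean for a health check, the sketch below evaluates three Galera status variables as they would be returned by SHOW GLOBAL STATUS LIKE 'wsrep_%'. The function name and the exact set of conditions are assumptions made for this example. A real check would also have to decide how to treat states such as a donor node.

```python
def node_is_writable(status):
    """Return True when the Galera status variables suggest a writable node.

    `status` is a dict of wsrep status variables, for example the rows of
    SHOW GLOBAL STATUS LIKE 'wsrep_%' turned into name/value pairs.
    """
    return (
        status.get("wsrep_ready") == "ON"
        and status.get("wsrep_cluster_status") == "Primary"
        and status.get("wsrep_local_state_comment") == "Synced"
    )


# Illustrative values only, not captured from a real cluster.
healthy = {
    "wsrep_ready": "ON",
    "wsrep_cluster_status": "Primary",
    "wsrep_local_state_comment": "Synced",
}
partitioned = dict(healthy, wsrep_ready="OFF", wsrep_cluster_status="non-Primary")

print(node_is_writable(healthy), node_is_writable(partitioned))
```

A load balancer or VIP script could call a check like this per database node and only route frontend traffic to nodes where it returns True.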

Optional Arbitrator

If you are set on running only 2 database nodes (your wallet is thankful), but still worried about quorums, we can bring in the ARBITRATOR.

If there are only 2 Database nodes in your Galera cluster, not to worry! It’s definitely possible even while maintaining a good quorum resolution in case of outages.
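The majority rule described earlier can be sketched in a few lines of Python. This is a simplified model that treats every member, including an arbitrator, as one equal vote. Galera’s actual quorum calculation is more involved, so take it as an illustration of the principle only.

```python
def has_quorum(reachable, total):
    """Majority vote: strictly more than half of the members must be reachable."""
    return reachable > total / 2


# Members are database nodes plus an optional arbitrator; in each case we lose one member.
for label, total in (("2 nodes", 2), ("2 nodes + arbitrator", 3), ("3 nodes", 3)):
    survivors = total - 1
    print(label, "keeps quorum:", has_quorum(survivors, total))
```

With two members, the single survivor is exactly half and not a majority. Adding an arbitrator brings the member count to three, so two survivors still form a majority.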

What about load balancing?

Lastly, it is also possible to add load balancing to the mix. Let’s say, for example, you cannot add a VIP to your environment but still need your WEB servers to fail over. A load balancer can provide the solution here.

We still prefer to run the Zabbix servers with a direct database connection, but even there a load balancer could be added if you wish. However, please keep in mind that the more load balancers you add, the more complex troubleshooting might become. The whole idea about the setup without load balancers is to have a solid Zabbix setup that is easy to maintain, while providing high availability.

Conclusion

In the end, even with a minimal setup of 2 DB nodes, 2 Zabbix servers, and 2 WEB frontends, we can make a high availability setup. As we’ve shown with Galera, this setup becomes highly flexible, allowing us to run without automatic WEB failover all the way up to including complicated load balancers.

High availability doesn’t have to be overly complicated in a setup like this – it really is all about how far you want to push things. Besides that, in this setup everything is horizontally scalable on the database side. Do keep in mind, however, that Zabbix does still run in an Active/Passive setup.

I hope you enjoyed reading this blog post. If you have any questions or need help configuring anything in your Zabbix setup feel free to contact me and the team at Opensource ICT Solutions. We build a ton of cool stuff like this and more!

Nathan Liefting

https://oicts.com

The post Running Zabbix with MariaDB and Galera Active/Active Clustering appeared first on Zabbix Blog.

Building HA Zabbix with PostgreSQL and Patroni

2025-09-16 Patrik Uytterhoeven

Post Syndicated from Patrik Uytterhoeven original https://blog.zabbix.com/building-ha-zabbix-with-postgresql-and-patroni/30960/

Running a monitoring platform like Zabbix in a production environment demands reliability and resilience. When your monitoring solution is down, you’re flying blind – and for many organizations, that simply isn’t acceptable. This post introduces a robust high-availability (HA) architecture for Zabbix, using PostgreSQL, Patroni, etcd, HAProxy, Keepalived and PgBackRest. Built on RHEL 9 or derivatives, this solution combines modern open-source tools to provide automatic failover, load balancing, and seamless monitoring, all while maintaining consistency and performance.

Architecture overview

The HA design consists of multiple layers working in tandem to maintain continuity even during node or service failures:

Database Cluster Layer

2 or more nodes form the PostgreSQL cluster, managed by Patroni and coordinated using etcd. At any given time, one node is the primary (read/write), and the others are hot standbys ready to take over automatically.

Consensus layer

etcd runs on the same nodes and acts as the distributed configuration store and coordination layer for Patroni. It ensures a consistent cluster state and enables safe failover decisions.

Load balancing layer

Two HAProxy nodes provide a single point of entry for all clients (including Zabbix), routing requests to the current PostgreSQL primary. These nodes are monitored and coordinated via Keepalived to maintain a floating Virtual IP (VIP), ensuring seamless failover at the connection layer.

Backup layer

A separate backup server is responsible for running PgBackRest, which handles full and incremental backups, WAL archiving, and Point-In-Time Recovery (PITR). This server communicates securely with all database nodes over SSH.

Monitoring layer

Two Zabbix servers, running in active-passive mode, continuously monitor all layers of this stack including the HAProxy health, Patroni cluster role, and etcd status by accessing the PostgreSQL VIP for backend connectivity.

This multi-tiered setup ensures that no single failure, be it a database, load balancer, or monitoring server, brings down the monitoring platform.

Why HA matters for Zabbix

Zabbix depends heavily on its PostgreSQL database backend. Every metric, trigger, event, and alert is stored there. If PostgreSQL becomes unavailable, even briefly, data loss or monitoring blind spots can occur. That’s why introducing HA at the database layer is a crucial step when scaling Zabbix for enterprise environments.

While Zabbix itself supports HA at the application level, this architecture ensures that the database backend is also fully fault-tolerant, using modern consensus-based clustering with automatic failover.

Component overview

To achieve HA, we bring together several specialized components, each fulfilling a critical role in the system:

PostgreSQL

The relational database engine used by Zabbix. In this example setup, it runs on three nodes, forming a cluster managed by Patroni.

Patroni

Patroni is the orchestrator for the PostgreSQL cluster. It monitors node health, manages replication, promotes standbys when needed, and ensures only one writable leader exists at any time. Patroni leverages a distributed consensus store (in this case etcd, but other DCSs are possible) to coordinate decisions across the cluster.

etcd

etcd is a lightweight and highly available key-value store used by Patroni to maintain the cluster’s state. It stores leader election data, health statuses, and locks. We deploy it as a three-node cluster, co-located with the PostgreSQL nodes for convenience, though this setup can be scaled independently if needed, as etcd is very sensitive to latency.

HAProxy

To simplify application connectivity, HAProxy acts as a load balancer in front of the database cluster. It monitors the role of each node using Patroni’s REST API and routes connections to the active primary server. If the leader fails, HAProxy automatically reroutes traffic to the new primary.
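The role check HAProxy performs can be approximated with a short Python sketch. It assumes Patroni’s REST API is listening on its default port 8008 and that the /primary endpoint answers 200 only on the current leader. The node names in the usage note are hypothetical, and in the real setup HAProxy’s own HTTP health checks do this job.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=2.0):
    """Return the HTTP status code for url, or None if the node is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 503 from a standby
    except OSError:
        return None  # connection refused, timeout, DNS failure


def find_primary(nodes, probe=http_status, port=8008):
    """Return the first node whose /primary endpoint answers 200, else None."""
    for node in nodes:
        if probe(f"http://{node}:{port}/primary") == 200:
            return node
    return None
```

Calling find_primary(["pg1", "pg2", "pg3"]) against a running cluster would return the name of the current leader. The probe argument exists so the logic can be exercised without a live cluster.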

Keepalived

Keepalived provides a floating virtual IP address (VIP) across the HAProxy nodes. This VIP allows client systems, such as the Zabbix frontend, to connect to a single stable IP even if one HAProxy node fails.

PgBackRest

To protect the data itself, we use PgBackRest for full and incremental backups, as well as Point-In-Time Recovery (PITR). A dedicated backup server is included to pull and store archive logs and backups securely via SSH.

Zabbix server

Finally, we run two Zabbix servers in active-passive mode. Both are configured to connect to the PostgreSQL cluster through the VIP exposed by HAProxy. The Zabbix frontend is deployed on both nodes as well, ensuring continued accessibility through the load-balanced setup.

Topology at a glance

Here’s a simplified view of the architecture:

2 or more database nodes (PostgreSQL + Patroni + etcd)
Two HAProxy nodes, each configured with Keepalived to manage a floating virtual IP
One backup node for PgBackRest
Two Zabbix servers pointing to the PostgreSQL VIP

All systems are tied together with consistent hostname mappings, time synchronization (Chrony), and service monitoring.

Notes:

PgBackRest is directly connected to all three PostgreSQL nodes, allowing it to archive WAL segments and pull backups regardless of which node is primary.
This design enables full standby backups and supports Point-In-Time Recovery (PITR).
HAProxy ensures Zabbix always talks to the current primary node, while Patroni and etcd handle automatic failover and cluster state management.

Design rationale

This setup prioritizes resilience and self-healing. If any single component fails (a database node, a load balancer, or even a monitoring server), the system continues to function.

Using Patroni with etcd ensures that failovers are handled automatically, without human intervention. HAProxy ensures client traffic is always routed to the current primary, while Keepalived ensures that this routing layer itself is highly available.

We opted for PgBackRest over simple scripts or base backups because it provides not just efficient incremental backups, but also full WAL archiving and point-in-time recovery, which are invaluable for both disaster recovery and debugging.

Lastly, we chose to integrate Zabbix itself into this HA design, treating it not just as an application but as a fully resilient service able to monitor itself, so to speak.

Real-world considerations

Resource planning: While our nodes run comfortably, scaling this setup to heavy workloads requires careful tuning of memory, I/O, and PostgreSQL parameters.
etcd placement: Although we run etcd co-located with the database nodes in this example, separating etcd onto dedicated infrastructure is ideal for large-scale environments. This avoids resource contention and preserves quorum in extreme failure scenarios.
Monitoring the monitors: Zabbix itself must be monitored. In our setup, each component, including etcd, Patroni, and PostgreSQL, exposes health endpoints that can be used by Zabbix agents or scripts to generate alerts on replication lag, cluster health, and failover events.

Conclusion

This architecture provides a solid foundation for running Zabbix in a fault-tolerant, production-ready environment. It not only ensures high availability for the database layer but also offers flexibility, observability, and operational safety.

Whether you’re running internal infrastructure monitoring or offering Zabbix as a managed service, adopting this type of HA setup removes single points of failure and gives you peace of mind — all using open-source technologies that are battle-tested and widely supported.

If you need assistance with the migration or want to ensure best practices for scaling and optimizing Zabbix, don’t hesitate to reach out to OICTS. We are a Zabbix Premium Partner operating globally, with offices in the USA, UK, Netherlands, and Belgium, and we’re ready to help you every step of the way.

The post Building HA Zabbix with PostgreSQL and Patroni appeared first on Zabbix Blog.

Revolutionizing Zabbix Maintenance with Artificial Intelligence

2025-09-11 Grover Taipe

Post Syndicated from Grover Taipe original https://blog.zabbix.com/revolutionizing-zabbix-maintenance-with-artificial-intelligence/31284/

Can you imagine being able to schedule maintenance in Zabbix by simply telling a program: “I need to put the web server in maintenance tomorrow from 8 to 10 with ticket 100-178306”? That’s exactly what the Artificial Intelligence (AI) Scheduler Zabbix project I’ve developed does!

What problem does it solve?

Anyone who has worked with Zabbix knows that scheduling maintenance can sometimes be tedious, especially when you need to:

Configure complex routine maintenance
Handle Zabbix API bitmasks for specific days of the week or month
Search for specific hosts or groups
Document associated tickets

This project eliminates that friction by allowing the use of natural language to create both one-time and routine maintenance.

The magic behind the code

Conversational artificial intelligence

The system integrates both OpenAI GPT-4 and Google Gemini to interpret natural language requests. The AI doesn’t just understand what you want to do, but automatically:

Detects servers, groups, and dates
Identifies ticket numbers (XXX-XXXXXX format)
Automatically calculates complex Zabbix bitmasks
Generates contextual responses with examples

Fig. 1. Adding the AI Scheduler widget to your Zabbix dashboard

Advanced routine maintenance

What really stands out is its ability to handle complex patterns. Here are some practical examples that work:

“Daily backup for srv-backup from 2 to 4 AM with ticket 200-8341 until February 2027”
“Thursday and Friday maintenance from 5 to 7 AM until January 2027”
“Cleanup on the first Sunday of each month with ticket 100-178306 until December 2026”

Fig. 2. AI-generated maintenance summary with all calculated parameters

Elegant architecture

The project uses a three-layer architecture:

Frontend: Custom widget for Zabbix
Backend: Flask API with AI integration
Zabbix: Native API to create maintenance

Fig. 3. Maintenance successfully created and visible in Zabbix interface

Super-simple installation

One of the best features is how easy it is to get it running:

cp .env.example .env

You only need to configure your Zabbix URL and AI API key:

docker compose up -d --build

And that’s it! You have an AI assistant working.

Multi-instance support

For organizations with multiple Zabbix servers, the project includes configuration for up to 5 simultaneous instances, each with its own configuration.

What impresses me most

Intelligent date detection

The system understands natural expressions like:

“Tomorrow from 8 to 10” → Next date with specific schedule
“Sunday from 2 to 4 AM” → Next Sunday at those hours
“24/08/25 10:00am” → Automatically converts the format

Automatic Bitmask management

Zabbix API bitmasks can be notoriously complicated. This system calculates them automatically:

Thursday and Friday = 8 + 16 = 24
Sundays only = 64
First week of the month with specific configuration
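The arithmetic behind these values is a plain bitwise OR over per-day bits, with Monday as 1 and Sunday as 64. The sketch below shows that calculation and wraps it in a weekly time period dictionary. The field names follow the maintenance time period object of the Zabbix API as I understand it. The function names are made up for this example and are not taken from the project’s code.

```python
DAY_BITS = {
    "monday": 1, "tuesday": 2, "wednesday": 4, "thursday": 8,
    "friday": 16, "saturday": 32, "sunday": 64,
}


def dayofweek_mask(days):
    """Combine day names into the bitmask used for weekly maintenance periods."""
    mask = 0
    for day in days:
        mask |= DAY_BITS[day.lower()]
    return mask


def weekly_timeperiod(days, start_hour, end_hour):
    """Build a weekly time period (assumed field names, see lead-in)."""
    return {
        "timeperiod_type": 3,                       # 3 = weekly
        "every": 1,                                 # every week
        "dayofweek": dayofweek_mask(days),
        "start_time": start_hour * 3600,            # seconds after midnight
        "period": (end_hour - start_hour) * 3600,   # duration in seconds
    }


print(weekly_timeperiod(["Thursday", "Friday"], 5, 7))
```

For the “Thursday and Friday maintenance from 5 to 7 AM” request this yields a dayofweek of 24, a start_time of 18000, and a period of 7200.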

Fig. 4. Complex weekly maintenance scheduling with automatic bitmask calculation

Why is it important?

This project represents a natural evolution in systems administration. Instead of memorizing complex syntax or navigating multiple menus, you simply describe what you need in natural language. It’s especially valuable for:

Operations teams handling multiple maintenance tasks
Companies that need to document associated tickets
Organizations with complex maintenance patterns

The future is here

Projects like this demonstrate how artificial intelligence can make complex technical tools more accessible without sacrificing functionality. It’s not just automation – it’s intelligence applied to real infrastructure problems. If you work with Zabbix and are tired of manually configuring maintenance, this project is definitely worth checking out. It’s open source, well documented, and solves a real problem that many of us face every day. You can find the complete project on GitHub.

The post Revolutionizing Zabbix Maintenance with Artificial Intelligence appeared first on Zabbix Blog.

Migrating from PRTG to Zabbix: A High-Level Guide

2025-09-02 Patrik Uytterhoeven

Post Syndicated from Patrik Uytterhoeven original https://blog.zabbix.com/migrating-from-prtg-to-zabbix-a-high-level-guide/30845/

For companies looking to migrate from PRTG Network Monitor to Zabbix, one of the most critical aspects is ensuring a smooth migration of monitored devices and configurations. While there is no official tool to directly migrate between the two platforms, creating a bridge using custom export/import scripts allows for an effective, large-scale migration. This blog post outlines a practical approach to achieving that migration based on the export/import methodology we at Opensource ICT Solutions previously implemented for one of our clients.

Why migrate?

While PRTG offers an intuitive interface and is popular for its ease of use, Zabbix provides:

Greater flexibility and scalability
Full open-source licensing
More powerful automation and templating
A robust API for integrations
Lower costs, especially since Paessler was sold to an investor

These features make Zabbix an attractive choice for teams looking to scale or standardize on open-source infrastructure.

Migration overview

The migration involves two key steps:

Exporting PRTG device information
Importing data into Zabbix

Because the two systems are conceptually and structurally different, we focused our scripts on migrating what is most transferable: device names, IP addresses, and interface types. SNMP versions or PRTG-specific sensor details were excluded or simplified where not applicable to Zabbix. PRTG, for example, will only export probes that have an OID that was not built into PRTG but added later, which makes our export incomplete. This does not mean we did a partial migration; it just means these items were not included in the automated approach.

Step 1: Exporting from PRTG

We developed a Python-based script that interacts with the PRTG API to extract monitored device data and export it to a CSV file. The script filters out irrelevant objects and organizes the output for easy Zabbix processing.

This creates a clean CSV, like this:

Device Name, IP Address, Interface Type
zabbix-server,10.0.0.10,agent
ServerA,192.168.0.2,SNMP
ServerA,192.168.0.2,agent
core-switch,192.168.0.1,SNMP

This file serves as a clean, structured inventory of monitored devices.

Note: SNMP version fields were excluded in the final export, as Zabbix does not currently display or rely on an SNMP version in the same way PRTG does.

Step 2: Importing into Zabbix

Using Zabbix’s API, we created an import script that reads the CSV and:

Creates host entries
Assigns them to the appropriate host group
Adds relevant interfaces (e.g., Agent, ILO, SNMP, or a combination of …)

Each host is configured based on its detected interface type in PRTG.

On the Zabbix side, we used the Zabbix API to automate the creation of hosts, interfaces, and template assignment. The import script reads the CSV line-by-line and takes action based on the interface type.
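As a minimal sketch of that line-by-line logic, the Python below turns the example CSV into host.create parameter sets, grouping rows by device so that a host with both an agent and an SNMP row gets two interfaces. It is not the script we used. The group ID, the SNMP version, and the community macro are placeholders, and only the agent and SNMP types from the sample are handled.

```python
import csv
import io

# Interface type codes and default ports from the Zabbix API host interface object.
INTERFACE_TYPES = {"agent": (1, "10050"), "snmp": (2, "161")}

SAMPLE = """Device Name, IP Address, Interface Type
zabbix-server,10.0.0.10,agent
ServerA,192.168.0.2,SNMP
ServerA,192.168.0.2,agent
core-switch,192.168.0.1,SNMP
"""


def build_hosts(csv_text, groupid):
    """Return one host.create parameter dict per device name in the CSV."""
    hosts = {}
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    for row in reader:
        name = row["Device Name"]
        kind = row["Interface Type"].lower()
        type_id, port = INTERFACE_TYPES[kind]
        host = hosts.setdefault(
            name, {"host": name, "groups": [{"groupid": groupid}], "interfaces": []}
        )
        iface = {"type": type_id, "main": 1, "useip": 1,
                 "ip": row["IP Address"], "dns": "", "port": port}
        if kind == "snmp":
            # Placeholder SNMP details; the export does not carry a version.
            iface["details"] = {"version": 2, "community": "{$SNMP_COMMUNITY}"}
        host["interfaces"].append(iface)
    return list(hosts.values())


for params in build_hosts(SAMPLE, groupid="42"):  # "42" is a made-up group ID
    print(params["host"], [i["type"] for i in params["interfaces"]])
```

Each resulting dictionary could then be sent as the params of a host.create call with an API token that is allowed to create hosts.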

Considerations and “gotchas”

Templates: We didn’t add templates, as there is no 1:1 solution – PRTG has a different concept and adding a standard template would be possible but probably not the best solution.
Host Groups: For ease of use and given the limited time we had, we added all hosts to a temporary host group made for the migration. Although we do have scripts that take the groups out of PRTG and create them in Zabbix, in this particular migration it was not needed.
Permissions: The API token used in the import script must have sufficient privileges to create hosts.

What is NOT migrated

Because of fundamental differences between the platforms, the following are not directly migrated:

Historical data or sensor readings: Mainly because the customer had no hard requirement for it.
Custom PRTG notifications or dependencies: It was easier to manually re-create them.
Maps or dashboards: The Zabbix approach is so different that it was easier to recreate it manually (and improve).
Sensors: Zabbix is working with a different concept.

Post-migration tips

Validation: After the import, verify that each host is reachable and monitored correctly in Zabbix.
Discovery: Consider using Zabbix’s LLD (Low-Level Discovery) to dynamically find interfaces, disks, or other entities.
Housekeeping: Disable PRTG monitoring only after confirming Zabbix is fully operational.

Conclusion

Migrating from PRTG to Zabbix is not a one-click operation, but with some scripting, planning, and experience from a partner like us, it can be done efficiently and with minimal disruption. The custom export/import scripts act as a reliable bridge between the two systems, allowing for a clean transfer of your monitoring inventory. From there, Zabbix’s automation and scalability features can help take your monitoring to the next level.

The post Migrating from PRTG to Zabbix: A High-Level Guide appeared first on Zabbix Blog.

Running Zabbix with PostgreSQL and PG Auto Failover

2025-08-12 Patrik Uytterhoeven

Post Syndicated from Patrik Uytterhoeven original https://blog.zabbix.com/running-zabbix-with-postgresql-and-pg-auto-failover/31026/

Running a monitoring platform like Zabbix in a production environment requires bulletproof availability at the database layer. Any downtime in PostgreSQL, even for seconds, can disrupt monitoring visibility, triggering blind spots in alerts and data collection.

This post introduces a streamlined High-Availability (HA) architecture for Zabbix using PostgreSQL, pg_auto_failover, HAProxy, and PgBackRest. Built on RHEL 9 or derivatives, this architecture removes single points of failure and automates failover using minimal external dependencies, making it a strong candidate for modern observability backends.

Architecture overview

This HA design simplifies deployment by using a dedicated monitor node to orchestrate automatic failover between two PostgreSQL database nodes. With pg_auto_failover, we avoid the need for complex consensus layers like etcd or Consul while still achieving fast, reliable failover and recovery.

Database layer

Two PostgreSQL nodes are deployed in a primary/secondary configuration. These nodes are registered with a dedicated pg_auto_failover monitor, which continuously checks node health and replication status. In the event of a failure, the monitor promotes the secondary to primary with no manual intervention.

Each node is securely configured using scram-sha-256 authentication and SSL certificates (self-signed or your own) to ensure encrypted communication within the cluster.

Monitor node (Arbiter)

The monitor node is a lightweight PostgreSQL instance that runs the pgautofailover extension. It holds state information about all participating nodes and acts as the arbiter during failover events. It requires only one node, reducing complexity compared to consensus-based DCS (Distributed Configuration Store) systems like etcd or ZooKeeper.

Load balancing layer

Two HAProxy nodes route all client (Zabbix) connections to the current PostgreSQL primary. A lightweight HTTP service on each DB node reports its current role (primary or not) and allows HAProxy to determine which node is writable. These proxies are kept highly available using Keepalived, which manages a shared Virtual IP (VIP) across both proxy servers.

This way, applications like Zabbix always connect to a stable endpoint, even during failover events.
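A role-reporting service of this kind can be very small. The sketch below is one possible shape and not the service used in this setup. It answers 200 on a primary and 503 on a standby, so an HTTP health check can tell them apart. The check_recovery callable is hypothetical and would wrap a query such as SELECT pg_is_in_recovery(). Port 9877 is used here only because it is the example health check port mentioned in this post.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def status_for(in_recovery):
    """200 for a writable primary, 503 for a standby still in recovery."""
    return 503 if in_recovery else 200


def make_handler(check_recovery):
    """Build a request handler around a callable that returns True on a standby."""

    class RoleHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            code = status_for(check_recovery())
            body = b"primary\n" if code == 200 else b"standby\n"
            self.send_response(code)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return RoleHandler


def serve(check_recovery, port=9877):
    """Serve role checks until interrupted (blocks the calling thread)."""
    HTTPServer(("0.0.0.0", port), make_handler(check_recovery)).serve_forever()
```

Starting it with serve(my_recovery_check) on each database node gives HAProxy one endpoint per node to poll.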

Backup layer

Backups are handled using PgBackRest, deployed on a dedicated backup server. This server connects to both PostgreSQL nodes over SSH and performs the following:

Full and incremental backups
WAL archiving
Point-In-Time Recovery (PITR)

Passwordless SSH and proper pgbackrest.conf mappings are set up to support seamless interaction regardless of which node is currently primary.

Component overview

Component	Role
PostgreSQL	Relational backend storing all Zabbix metrics, alerts, events
pg_auto_failover	Ensures continuous availability by promoting replicas automatically
Monitor Node	Decides failover based on health checks and cluster state
HAProxy	Routes client traffic to the current primary
Keepalived	Provides VIP failover between HAProxy nodes
PgBackRest	Performs PITR-capable backups from any node
Zabbix Server	Connects to PostgreSQL via VIP to ensure continuity

Topology at a glance

Design

Unlike Patroni, which requires a distributed configuration store like etcd, pg_auto_failover uses a dedicated monitor node that simplifies orchestration. This setup reduces the operational burden while still delivering robust failover, automatic reconfiguration, and synchronization safeguards, including:

synchronous_standby_names to enforce replication integrity
Service integration with systemd for reliable restarts
Failover detection with minimal latency

This design also ensures SSL-enabled encrypted communication, self-healing role changes, and full observability using Zabbix itself, which can be configured to monitor the PostgreSQL cluster through exposed health endpoints.

Real-world considerations

Upgrade Planning: The pg_auto_failover version in RPM repos may lag behind the latest upstream features like set_monitor_setting. Pin the package version if consistency is required.
Network Security: Only HAProxy nodes are allowed to query the internal role-check API on the DB nodes using custom firewall rules.
Cluster Hygiene: Always clean up config folders (~postgres/.config/pg_autoctl/…) if a node is misconfigured or needs to rejoin.
SELinux: Configure SELinux using semanage and audit2allow to allow custom ports (e.g., 9877 for health checks).
Hybrid Logging: Set up PostgreSQL to log to both journald and traditional log files via stderr + logging_collector.

Conclusion

This architecture strikes a balance between simplicity and resilience. While Patroni is great for large-scale, multi-region setups requiring distributed consensus, pg_auto_failover offers a lighter-weight solution that covers most enterprise needs without complex dependencies.

By layering the following…

PostgreSQL 17
pg_auto_failover with a single monitor
HAProxy + Keepalived for VIP failover
PgBackRest for backups

…you can then confidently run Zabbix in a highly available and secure fashion with minimal operational overhead.

If you’re considering implementing this setup or migrating from a single-node database backend, reach out to Opensource ICT Solutions, a Zabbix Premium Partner with global presence in the USA, the UK, the Netherlands, and Belgium. We can help you architect, deploy, and monitor Zabbix environments that scale with your needs.

The post Running Zabbix with PostgreSQL and PG Auto Failover appeared first on Zabbix Blog.

Migrating Nagios to Zabbix: Lessons Learned

2025-08-05 Nathan Liefting

Post Syndicated from Nathan Liefting original https://blog.zabbix.com/migrating-nagios-to-zabbix-lessons-learned/30917/

Recently, a new customer of ours at Opensource ICT Solutions asked whether we could migrate their Nagios instance to Zabbix. Because Nagios and Zabbix are very different in their storage methods, we told them that we would have to investigate and see if we could come up with a viable solution. It wasn’t long until we found a way to do it and started building a script to get it done.

The customer’s wishes

No loss of any Nagios configuration data
Historic performance data migrated to Zabbix
Existing problems migrated from Nagios
Nagios XI to be disabled entirely, as the license is expiring

The customer was clear in their wishes – we needed to turn off Nagios, but without losing historic data. As such, they wanted all their old data visible in Zabbix instead of having Nagios running somewhere as a backup. This meant that a script had to be built to get that Nagios data out and into Zabbix.

The configuration data

The good part here is that it starts simple. When we dive into the Nagios configuration data, we clearly see that Nagios has hosts just like Zabbix. They just have a slightly different build than our usual Zabbix hosts. For example, we can see three different names for a host in Nagios:

Host Name
Alias = Host name
Display Name = Visible name

That immediately gives us a good way to hook up Nagios names to Zabbix host and visible names.

When we then take a look at the checks and how they are executed in Nagios, we also see similarities with Zabbix. In the end, both of them are monitoring solutions, of course. However, Nagios works more in a command execution kind of way, which is good for our migration. We can take this command and find an equivalent item in Zabbix. For example, the check_icmp command can easily be translated into the Zabbix simple checks icmpping, icmppingloss, and icmppingsec.

For the check_tcp command we can do a similar translation, making sure we use the simple check net.tcp.service whenever this command is executed on a Nagios host.

Because of the big differences between Nagios and Zabbix, this does mean we need to make some manual translations between the Nagios commands and Zabbix items. Depending on your Nagios instance, this could be a big task. Luckily for us, this was a smaller instance with only ICMP and TCP port checks.
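Such a manual translation can be kept in a small lookup table. The Python below is a sketch that covers only the two commands discussed here. The function name is made up, and the net.tcp.service key is written with the generic tcp service and an explicit port, which is my assumption about how the check would be expressed.

```python
# Nagios command name -> Zabbix simple check keys (only the cases from this post).
COMMAND_TO_KEYS = {
    "check_icmp": ["icmpping", "icmppingloss", "icmppingsec"],
}


def zabbix_keys(command, port=None):
    """Return the Zabbix item keys that stand in for one Nagios command."""
    if command == "check_tcp":
        if port is None:
            raise ValueError("check_tcp needs the port it was checking")
        return [f"net.tcp.service[tcp,,{port}]"]
    if command not in COMMAND_TO_KEYS:
        # Anything else still needs a manual translation.
        raise NotImplementedError(f"no Zabbix equivalent defined for {command}")
    return list(COMMAND_TO_KEYS[command])


print(zabbix_keys("check_icmp"))
print(zabbix_keys("check_tcp", port=443))
```

Raising on unknown commands keeps unmapped checks visible instead of silently dropping them during the migration.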

The history (i.e. performance) data

Now that we know how to start creating our hosts and items, we need to understand how Nagios is storing its data. Zabbix has a big centralized MariaDB or PostgreSQL database, which makes it easy to parse through and work with our data. Unfortunately, Nagios instances use a different technique. Nagios stores data in .rrd (Round Robin Database) files and with it a .xml file to interpret the RRD file. The RRD files are not centralized like a Zabbix database, but they are more manageable in terms of storage size. We can see an RRD file per type of check in Nagios, which means we will have to grab the data from that file while understanding what it is going to belong to in Zabbix.

To see the data in the RRD file, we can use the rrdtool command line tool and its fetch function.

rrdtool fetch /usr/local/nagios/share/perfdata/BeNeLux-Host-Name/Availability.rrd LAST --start -30d --end now | grep -v "nan"

We can now see that this specific RRD file contains 8 columns, 7 of which hold a performance value. The first column contains the timestamp in Unix time, which is ideal because it can be stored in the Zabbix database as is. The other 7 columns are harder to interpret, because the output does not tell us what each value represents. This is where the .xml file comes into play. The XML file belongs with the RRD file and describes what the RRD file contains.

In this XML file we will find all of the required host information, which is great for creating the host in Zabbix. It also contains the check information, so we can also use this file to create the items in Zabbix. The biggest thing we will have to keep in mind is to make sure that the XML and RRD file match up in terms of number of RRD entries and columns. Column 1 in the RRD file will match with the first entry in our XML.
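As a rough sketch of the column matching described above, the following function pairs each value column of the rrdtool output with a label. It assumes the usual text layout of rrdtool fetch (a header line, then one "timestamp: value value ..." line per row) and assumes the label list has already been read from the XML file in the right order; the label names used here are made up.

```python
import math


def parse_rrd_fetch(output, labels):
    """Pair each value column with its label and skip rows that contain NaN."""
    rows = []
    for line in output.splitlines():
        ts, sep, rest = line.partition(":")
        if not sep or not ts.strip().isdigit():
            continue  # header line with data source names, or a blank line
        values = [float(v) for v in rest.split()]
        if len(values) != len(labels):
            raise ValueError("XML entries and RRD columns do not match up")
        if any(math.isnan(v) for v in values):
            continue  # same effect as the grep -v "nan" filter
        rows.append((int(ts), dict(zip(labels, values))))
    return rows


sample = """                          rta              pl

1700000000: 1.5000000000e+00 0.0000000000e+00
1700000300: nan nan
"""
print(parse_rrd_fetch(sample, ["rta", "pl"]))
```

Raising an error on a column count mismatch catches the case where the XML and RRD files do not line up before any data is imported.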

Let’s create a script

With the host, item and history data identified, we can start to create a script. In our case we decided to create a Python import tool. As Zabbix comes with some limitations in terms of which hostnames we can use (which are different from the limitations in Nagios), we need to sanitize our hostnames slightly.
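A minimal sketch of such a sanitizing step is shown below. It assumes the Zabbix rule that technical host names may contain only alphanumerics, spaces, dots, dashes, and underscores, with no leading or trailing spaces; the choice of an underscore as the replacement character is arbitrary and not taken from the original script.

```python
import re


def sanitize_hostname(nagios_name):
    """Replace characters Zabbix does not accept in a host name with '_'."""
    cleaned = re.sub(r"[^0-9A-Za-z .\-_]", "_", nagios_name)
    return cleaned.strip()  # leading and trailing spaces are not allowed


print(sanitize_hostname("web/01:prod"))
print(sanitize_hostname(" BeNeLux-Host-Name "))
```

The original Nagios name can still be kept as the visible name in Zabbix, which is less restrictive.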

Then all we need to do is parse through all the XML files and create new hosts in Zabbix through the Zabbix API.
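The host creation step could look roughly like the request builder below. host.create is a real Zabbix API method, but the group ID, IP address, and host names are placeholders, and the code only builds the JSON-RPC body; sending it to api_jsonrpc.php with an API token is left out on purpose.

```python
import json


def host_create_request(host, visible_name, group_id, ip, request_id=1):
    """Build a JSON-RPC body for the Zabbix API method host.create."""
    return {
        "jsonrpc": "2.0",
        "method": "host.create",
        "params": {
            "host": host,            # sanitized technical name
            "name": visible_name,    # original Nagios display name
            "groups": [{"groupid": group_id}],
            "interfaces": [{
                "type": 1,           # agent interface, used here only for its IP
                "main": 1,
                "useip": 1,
                "ip": ip,
                "dns": "",
                "port": "10050",
            }],
        },
        "id": request_id,
    }


# Placeholder values for illustration only.
body = host_create_request("BeNeLux-Host-Name", "BeNeLux Host Name", "2", "192.0.2.10")
print(json.dumps(body, indent=2))
```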

It will be a very similar process for our items, as we parse through our XML file and create all of the required items in Zabbix through the API.

We can even create the triggers straight from the XML file by parsing through the different severities already set up in Nagios.

Once everything is created in Zabbix, the Python script can now start using RRDTool to parse through the RRD file, making sure to keep the XML file structure in mind when parsing through the columns.

This script can now create the hosts, the items, the triggers, and then import all of the data. We can see the hosts being created and data being imported.

The benefit of importing history data into Zabbix while the triggers already exist can be seen below.

All of the triggers will trigger and be resolved based on the data imported, meaning that we can create problems with historic data. This means that not only do we have our historic data, but also all the problems with the correct duration as they are now discovered from the actual imported data.

To make this possible we can use the Zabbix sender tool. It has an option to include a timestamp with every historic value that is imported.

Our Python script grabs the values from the RRD file and then converts them into a new _HOST_.sender file. This file will be sent to the Zabbix server using the Zabbix sender tool.

Looking at the file, we can see it contains only the name of the host, the unixtime stamp, and the actual value to send.

All we need to do is make our script send this file to the correct item in Zabbix.
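A sketch of how the sender file and the matching command line could be produced is shown below. It assumes the standard zabbix_sender input format with timestamps (host, item key, timestamp, value on one line) and the -T and -i options; the item key nagios.rta, the server name, and the file name are hypothetical.

```python
def sender_line(host, key, timestamp, value):
    """Format one line of a zabbix_sender input file that carries timestamps."""
    return f'"{host}" {key} {timestamp} {value}'


def sender_command(server, path):
    """Build the zabbix_sender call; -T tells it each line has a timestamp."""
    return ["zabbix_sender", "-z", server, "-T", "-i", path]


# Hypothetical item key and file name, for illustration only.
print(sender_line("BeNeLux-Host-Name", "nagios.rta", 1700000000, 1.5))
print(" ".join(sender_command("zabbix.example.com", "BeNeLux-Host-Name.sender")))
```

The command is returned as a list so it can be passed to subprocess.run without shell quoting issues.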

Manual template and item creation

The last step will be our cleanup. We decided that we would start dirty with a one-on-one data import from Nagios. This means hostnames, item names, and trigger names are imported straight from Nagios. No templates will be created in Zabbix by the tool either, skipping the Zabbix best practice to use templates for all hosts.

We did this to make the initial import easier and not go overboard with scripting. It’s easier to have a messy Zabbix to clean up than to script everything perfectly in Python. Time is valuable.

What we did afterwards is create all the templates manually to take over the items as is from the hosts. For example, we can translate the ICMP ping and TCP stuff easily into a template.

After doing so, we do end up with some bad looking templates, but we can now start cleaning up.

We can also start creating normal trigger names and clean up…

…while changing our dynamic port names for something more expected as well.

And that’s it!

The post Migrating Nagios to Zabbix: Lessons Learned appeared first on Zabbix Blog.

How to Install Zabbix on Windows with a Linux Subsystem

2025-07-08 Alexander Petrov-Gavrilov

Post Syndicated from Alexander Petrov-Gavrilov original https://blog.zabbix.com/how-to-install-zabbix-on-windows-with-a-linux-subsystem/30311/

It’s a very well known fact that Zabbix can only be installed on Linux. But what if you are in a Windows environment and getting a Linux machine is not so simple or even possible? This can obstruct the implementation of Zabbix, or at least significantly delay it. Not only that, building a POC outside of the future environment makes data procurement a lot more complicated. Is there a way to work around this and get Zabbix as close to Windows as we possibly can?

WSL

WSL/WSL 2 is a fast and easy solution for installing and using Zabbix in a smaller Windows-dominant environment, be that a POC or a small company office. WSL 2 runs a real Linux kernel in a lightweight VM while being optimized for Windows. This means a faster start, lower resource consumption, and the ability to share files with Windows directly, meaning you can use Windows File explorer to find and manage the VM files.

WSL 2 also allows you to use the Linux CLI while working with Windows (e.g. running vim from a Windows terminal and editing Windows files directly). At this point, you may be asking yourself, “Why not Hyper-V or VirtualBox?” Those are definitely options too, but they are quite heavy on system resources. In addition, boot times are a bit longer and sharing files between a host and a guest OS is clunkier.

Maybe Docker Desktop then? It’s an absolutely valid option, but that would require a bit of Docker knowledge and you would still be using WSL, technically speaking. So, with that said, WSL is definitely the fastest and most reliable way to spin up a Zabbix instance in a Windows-focused environment.

We will use WSL 2, but as a note WSL 1 is also available. Here are the differences:

WSL 2 is usually the better performer overall, especially for dev environments. It also has better Linux compatibility.
WSL 1 Linux files aren’t isolated, which can make them more accessible. In WSL 2, Linux runs in a virtual disk (ext4), so Linux and Windows files are more separate. Integration is still pretty good, however.
WSL 2 has better Linux compatibility – systemd, iptables, etc.
WSL 1 shares the same IP as Windows, while WSL 2 is a VM, so some network configuration is required.
With WSL 1 you can see Linux running processes in Task Manager. WSL 2 will have processes isolated.

Installing Zabbix using WSL

Install WSL

Open PowerShell as an Administrator and run:

PS C:\Windows\system32> wsl --install

If you already have WSL 1 installed, update it:

PS C:\Windows\system32> wsl --update

You can also set WSL 2 as default:

PS C:\Windows\system32> wsl --set-default-version 2

Install your preferred Linux distribution (e.g. Ubuntu, Debian, Oracle Linux) from the Microsoft Store, or download it directly. I will be using Oracle Linux 9.4.

You can also download the RootFS tarball from the preferred distribution portal, but then the process will be a bit different. Create a folder using PowerShell:

PS C:\Windows\system32> mkdir C:\WSL\OracleLinux9

Copy the .tar.xz file to this folder, then run:

PS C:\Windows\system32> wsl --import OracleLinux9 C:\WSL\OracleLinux9 .\oraclelinux9-rootfs.tar.xz --version 2

After the image is installed or imported, start Oracle Linux using PowerShell:

PS C:\Windows\system32> oraclelinux94

When installation is finished, you are prompted to create a default UNIX user account and a password for it. The username does not need to match your Windows username. I’ll set it to “zabbix” of course, but you can choose any other name.

Enter new UNIX username: zabbix
New password: <your-password>
passwd: all authentication tokens updated successfully.
Installation successful!

Now OracleLinux is ready for use!

Prepare the system

You will be logged in to the new environment immediately. If you log out, run the following in PowerShell to log in again:

PS C:\Windows\system32> oraclelinux94

Once logged in, first double-check that the selected OS is indeed installed. Run the following in the PowerShell window, which now serves as your VM CLI access point:

[zabbix@PC-NAME ~]$ cat /etc/os-release
NAME="Oracle Linux Server"
VERSION="9.4"
ID="ol"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="9.4"
PLATFORM_ID="platform:el9"
PRETTY_NAME="Oracle Linux Server 9.4"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:oracle:linux:9:4:server"
HOME_URL="https://linux.oracle.com/"
BUG_REPORT_URL=https://github.com/oracle/oracle-linux

With that confirmed, make sure all OS updates are installed:

[zabbix@PC-NAME ~]$ sudo dnf update -y

When the update process is finished, you will need to decide whether you would like to use systemd or not (this may increase booting time). I will enable systemd. To do this, edit the wsl.conf on the Linux subsystem:

[zabbix@PC-NAME ~]$ sudo vi /etc/wsl.conf

Add to the newly created file:

[boot]
systemd=true

Reboot the images (this command will reboot all of them):

PS C:\Windows\system32> wsl.exe --shutdown

Start your Linux distribution again:

PS C:\Windows\system32> oraclelinux94

Install Zabbix database

We will need to prepare the database engine. Again, any supported database engine can be used. In this case, I will install and configure MariaDB:

[zabbix@PC-NAME ~]$ sudo dnf install -y mariadb-server mariadb
[zabbix@PC-NAME ~]$ sudo systemctl enable --now mariadb

Confirm MariaDB is running:

[zabbix@PC-NAME ~]$ systemctl status mariadb

mariadb.service - MariaDB 10.5 database server
     Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; preset: disabled)
     Active: active (running) since Tue 2025-04-29 12:39:54 EEST; 3min 55s ago
       Docs: man:mariadbd(8)
             https://mariadb.com/kb/en/library/systemd/
   Main PID: 235 (mariadbd)
     Status: "Taking your SQL requests now..."
      Tasks: 9 (limit: 26213)
     Memory: 109.6M
     CGroup: /system.slice/mariadb.service
             └─235 /usr/libexec/mariadbd --basedir=/usr

After confirmation, secure it a bit by creating a root password and answering the prompts as shown below:

[zabbix@PC-NAME ~]$ sudo mysql_secure_installation 

Enter current password for root (enter for none):
OK, successfully used password, moving on...

Setting the root password or using the unix_socket ensures that nobody
can log into the MariaDB root user without the proper authorisation.

You already have your root account protected, so you can safely answer 'n'.

Switch to unix_socket authentication [Y/n] n
 ... skipping.

You already have your root account protected, so you can safely answer 'n'.

Change the root password? [Y/n] Y
New password:
Re-enter new password:
Password updated successfully!
Reloading privilege tables..
 ... Success!

Remove anonymous users? [Y/n] Y
 ... Success!

Disallow root login remotely? [Y/n] Y
 ... Success!

Remove test database and access to it? [Y/n] Y


Reload privilege tables now? [Y/n] Y

 ... Success!

Cleaning up...

All done!  If you've completed all of the above steps, your MariaDB
installation should now be secure.

Thanks for using MariaDB!

Now to create the Zabbix database. Log in to MariaDB:

[zabbix@PC-NAME ~]$ sudo mysql -u root -p 
Enter password: <enter your password, won’t be visible>

Follow the steps from the Zabbix installation page:

MariaDB [(none)]> create database zabbix character set utf8mb4 collate utf8mb4_bin;
MariaDB [(none)]> create user zabbix@localhost identified by '<custom-password>';
MariaDB [(none)]> grant all privileges on zabbix.* to zabbix@localhost;
MariaDB [(none)]> set global log_bin_trust_function_creators = 1;

MariaDB [(none)]> quit;

Installing Zabbix

Install the Zabbix repository:

[zabbix@PC-NAME ~]$ sudo dnf install https://repo.zabbix.com/zabbix/7.0/centos/9/x86_64/zabbix-release-latest-7.0.el9.noarch.rpm
[zabbix@PC-NAME ~]$ sudo dnf clean all

Proceed to install the Zabbix server, frontend, and agent:

[zabbix@PC-NAME ~]$ sudo dnf -y install zabbix-server-mysql zabbix-web-mysql zabbix-apache-conf zabbix-sql-scripts zabbix-selinux-policy zabbix-agent
...
 zabbix-agent-7.0.12-release1.el9.x86_64 zabbix-apache-conf-7.0.12-release1.el9.noarch zabbix-selinux-policy-7.0.12-release1.el9.x86_64 zabbix-server-mysql-7.0.12-release1.el9.x86_64 zabbix-sql-scripts-7.0.12-release1.el9.noarch
 zabbix-web-7.0.12-release1.el9.noarch zabbix-web-deps-7.0.12-release1.el9.noarch zabbix-web-mysql-7.0.12-release1.el9.noarch

Complete!

Now import the initial database schema:

[zabbix@PC-NAME ~]$ zcat /usr/share/zabbix-sql-scripts/mysql/server.sql.gz | mysql -u zabbix -p zabbix
Enter password: <enter your DB user password and wait until you will see the next line appear>
[zabbix@PC-NAME ~]$

Disable the log_bin_trust_function_creators option after import has finished:

[zabbix@PC-NAME ~]$ sudo mysql -u root -p
Enter password: <enter your root password>
MariaDB [(none)]>  set global log_bin_trust_function_creators = 0;
MariaDB [(none)]>  quit;

Add your Zabbix user database password to the Zabbix server configuration file:

[zabbix@PC-NAME ~]$ sudo vi /etc/zabbix/zabbix_server.conf

### Option: DBPassword
#       Database password.
#       Comment this line if no password is used.
#
# Mandatory: no
# Default:
DBPassword=<your-DB-user-password>

Start the Zabbix server, agent, and web server processes and enable them to start at boot:

[zabbix@PC-NAME ~]$ sudo systemctl restart zabbix-server zabbix-agent httpd php-fpm
[zabbix@PC-NAME ~]$ sudo systemctl enable zabbix-server zabbix-agent httpd php-fpm
Created symlink /etc/systemd/system/multi-user.target.wants/zabbix-server.service → /usr/lib/systemd/system/zabbix-server.service.
Created symlink /etc/systemd/system/multi-user.target.wants/zabbix-agent.service → /usr/lib/systemd/system/zabbix-agent.service.
Created symlink /etc/systemd/system/multi-user.target.wants/httpd.service → /usr/lib/systemd/system/httpd.service.
Created symlink /etc/systemd/system/multi-user.target.wants/php-fpm.service → /usr/lib/systemd/system/php-fpm.service.

Installation of the backend is now finished, but we still need the frontend.

Exposing and installing the Zabbix frontend for WSL

Since WSL 2 does not expose services to localhost by default, you need to determine the WSL IP:

[zabbix@PC-NAME ~]$ ip addr show eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:15:5d:47:32:c6 brd ff:ff:ff:ff:ff:ff
    inet 172.29.128.155/20 brd 172.29.143.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::215:5dff:fe47:32c6/64 scope link
       valid_lft forever preferred_lft forever

Look for an IP like 172.x.x.x, then using your browser go to:

http://<WSL_IP>/zabbix

In this example, that would be:

http://172.29.128.155/zabbix

You can also port forward WSL to localhost with netsh in PowerShell:

PS C:\Windows\system32> netsh interface portproxy add v4tov4 listenport=8080 listenaddress=127.0.0.1 connectport=80 connectaddress=<WSL_IP>

Then you will be able to access Zabbix from http://localhost:8080/zabbix. Now, just finish the standard frontend setup and Zabbix is ready to use!
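If you would rather not copy the address by hand, the lookup and the netsh command above can be combined in a small helper. This is only a sketch: the function names are made up, and it simply parses the text printed by ip addr show eth0 and fills the result into the same netsh command.

```python
import re


def wsl_ipv4(ip_addr_output):
    """Return the first IPv4 address found in `ip addr show eth0` output."""
    match = re.search(r"^\s*inet (\d+\.\d+\.\d+\.\d+)/\d+", ip_addr_output, re.MULTILINE)
    return match.group(1) if match else None


def portproxy_command(wsl_ip, listen_port=8080):
    """Build the netsh port forwarding command for the given WSL address."""
    return (
        "netsh interface portproxy add v4tov4 "
        f"listenport={listen_port} listenaddress=127.0.0.1 "
        f"connectport=80 connectaddress={wsl_ip}"
    )


sample = """2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:15:5d:47:32:c6 brd ff:ff:ff:ff:ff:ff
    inet 172.29.128.155/20 brd 172.29.143.255 scope global eth0
    inet6 fe80::215:5dff:fe47:32c6/64 scope link
"""
print(portproxy_command(wsl_ipv4(sample)))
```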

WSL advantages

Some extra advantages you get with this approach include clearer resource usage visibility:

Direct access to the Linux subsystem files through File explorer with your favorite Windows tools:

As you can see, docker is here as well. System and configuration files are also visible and editable:

Now you can proceed with building your Zabbix or Zabbix POC, (almost) without needing to leave your regular Windows environment!

The post How to Install Zabbix on Windows with a Linux Subsystem appeared first on Zabbix Blog.

Podman Container Monitoring with Prometheus Exporter, part 2

2025-06-12 Janis Eidaks

Post Syndicated from Janis Eidaks original https://blog.zabbix.com/podman-container-monitoring-with-prometheus-exporter-part-2/30538/

In the first part of this post, we explored how to get data with HTTP agent from the Prometheus Podman exporter and use the same item data for the Podman pods Discovery rule as well as item and trigger prototypes. In part 2 of the same series, we’ll learn how to discover and monitor Podman containers.

Creating a template discovery rule

I will create another discovery rule for container discovery. This discovery rule is also based on the same item [Podman info] in the template – Podman containers by HTTP and Prometheus (you can check part one of this series to find out how to configure it). The parameters of the discovery rule are shown below. This discovery rule will allow us to discover the container name and ID.

Template: Podman containers by HTTP and Prometheus

▲ Discovery rule
  ▪ Name:                   Container discovery
  ▪ Type                    Dependent item
  ▪ Key:                    training.containers.discovery
  ▪ Master item             Podman containers by HTTP and Prometheus: Podman info
  ▪ Delete lost resources  After 10d
  ▪ Disable lost resources Immediately
♯ Preprocessing
  ▪ Prometheus to JSON     podman_container_info
♦ LLD Macros
  ▪ {#CONTAINER.ID}        $.labels.id
  ▪ {#CONTAINER.NAME}      $.labels.name
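To make the discovery rule above easier to picture, here is a rough Python imitation of what it produces. It is not how Zabbix implements the Prometheus to JSON step; it is a simplified parser that handles plain "name{labels} value" lines without timestamps, and the sample exporter lines are made up.

```python
import re

SAMPLE_RE = re.compile(r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)$')
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def discover_containers(text):
    """Build LLD rows from podman_container_info lines ($.labels.id / $.labels.name)."""
    rows = []
    for line in text.splitlines():
        match = SAMPLE_RE.match(line.strip())
        if not match or match.group("name") != "podman_container_info":
            continue  # comments, blank lines, and other metrics are ignored
        labels = dict(LABEL_RE.findall(match.group("labels")))
        rows.append({
            "{#CONTAINER.ID}": labels["id"],
            "{#CONTAINER.NAME}": labels["name"],
        })
    return rows


sample = '''# HELP podman_container_info Container information.
podman_container_info{id="abc123",image="docker.io/library/mysql:8",name="mysql-server",pod_id="",pod_name="",ports=""} 1
podman_container_state{id="abc123",pod_id="",pod_name=""} 2
'''
print(discover_containers(sample))
```

Each row corresponds to one discovered container, and the two macros are then substituted into the item and trigger prototypes.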

Fig 1. Discovery rule: Container discovery

Fig 2. Discovery rule: Container discovery preprocessing tab

Fig 3. Discovery rule: Container discovery LLD macros tab

Next, different dependent item prototypes are created in this container discovery rule. As the Prometheus Podman exporter provides a lot of different metrics about the containers, I will create multiple such items: state, health, creation date, input/output network traffic information, and so on. So, check out what metrics can be acquired and use what is relevant for you.

You can also add a description to each item prototype. I am interested only in metrics that carry the discovered container ID macro, and I am not interested in the values of the other labels, such as pod_id and pod_name, so I use =~".*", which matches any value. I will show the configuration screenshots of one of the item prototypes.

These item prototypes are similar to each other, with some minor differences, such as Prometheus patterns, or in some cases, with a different master item (item prototype as master item).

Fig 4. Discovery rule preprocessing step: Prometheus to JSON with pattern podman_container_info

Fig 5. Discovery rule LLD macros: assigning relevant JSONPATH to LLD macros

Creating a template discovery rule: Item prototypes

After the containers have been discovered, we have to create item prototypes. These prototypes will also be dependent item prototypes and will use the same item as the discovery rule: Podman info. The Prometheus Podman exporter returns a lot more metrics for the containers than it did for the pods.

You can get container metrics such as container health, state, creation date, disk read/write, memory usage, network usage, and more. In this blog post, I have added most of them, so check what metrics are relevant to your monitoring needs and start monitoring.

Fig 6. Low-level discovery rule and item prototypes based on the same item.

The screenshots of the item prototype are shown below.

Fig 7. Container state item prototype tab

Fig 8. Container state item prototype tag tab

Fig 9. Container state item prototype preprocessing tab

Remember, you can also test these item prototypes in the preprocessing step – just copy the Prometheus exporter data and set the relevant macro to the value you want to check.
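Outside of the frontend, the same kind of check can be approximated with a few lines of Python. The sketch below mimics a Prometheus pattern that selects one metric by its id label and returns the value; it is a simplified stand-in for the Zabbix preprocessing step, and the sample lines and IDs are invented.

```python
import re


def metric_value(text, metric, container_id):
    """Return the value of `metric` whose id label equals container_id, else None."""
    pattern = re.compile(r"^" + re.escape(metric) + r"\{([^}]*)\}\s+(\S+)$")
    for line in text.splitlines():
        match = pattern.match(line.strip())
        if not match:
            continue
        # Parse the labels properly so that pod_id is never mistaken for id.
        labels = dict(re.findall(r'(\w+)="([^"]*)"', match.group(1)))
        if labels.get("id") == container_id:
            return float(match.group(2))
    return None


sample = '''podman_container_state{id="aaa",pod_id="bbb",pod_name="zabbix"} 2
podman_container_state{id="bbb",pod_id="",pod_name=""} 5
'''
print(metric_value(sample, "podman_container_state", "bbb"))
```

The other labels are ignored here, which plays the same role as the =~".*" matchers in the item prototype patterns.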

The configuration parameters of the item prototypes are shown below. There are a lot of metrics you can monitor, but remember to monitor what is relevant and necessary for you.

Template: Podman containers by HTTP and Prometheus; Discovery rule: Container discovery

● Item prototype #1
  ▪ Name: 		Container health: [{#CONTAINER.NAME}]
  ▪ Type 		Dependent item
  ▪ Key: 		container.health[{#CONTAINER.NAME}]
  ▪ Type of inf: 	Numeric (float)
  ▪ Master item		Podman containers by HTTP and Prometheus: Podman info
  ▪ Units: 
♦ Tags (name:value) 	
  ▪ Container:{#CONTAINER.NAME}	
  ▪ Metric:health		
♯ Preprocessing
  ▪ Prometheus pattern 	podman_container_health{id="{#CONTAINER.ID}",pod_id=~".*",pod_name=~".*"} value

● Item prototype #2
  ▪ Name: 		Container state: [{#CONTAINER.NAME}]
  ▪ Type 		Dependent item
  ▪ Key: 		container.state[{#CONTAINER.NAME}]
  ▪ Type of inf: 	Numeric (float)
  ▪ Master item		Podman containers by HTTP and Prometheus: Podman info
  ▪ Units: 		
♦ Tags (name:value)  			
  ▪ Container:{#CONTAINER.NAME}	
  ▪ Metric:state		
♯ Preprocessing
  ▪ Prometheus pattern	podman_container_state{id="{#CONTAINER.ID}",pod_id=~".*",pod_name=~".*"} value

● Item prototype #3
  ▪ Name: 		Created at: [{#CONTAINER.NAME}]
  ▪ Type 		Dependent item
  ▪ Key: 		container.created[{#CONTAINER.NAME}]
  ▪ Type of inf: 	Numeric (unsigned)
  ▪ Master item		Podman containers by HTTP and Prometheus: Podman info
  ▪ Units: 		unixtime
♦ Tags (name:value) 		
  ▪ Container:{#CONTAINER.NAME}	
  ▪ Metric:created		
♯ Preprocessing
  ▪ Prometheus pattern 	podman_container_created_seconds{id="{#CONTAINER.ID}",pod_id=~".*",pod_name=~".*"} value

● Item prototype #4
  ▪ Name: 		Disk read per second: [{#CONTAINER.NAME}]
  ▪ Type 		Dependent item
  ▪ Key: 		container.disk.read[{#CONTAINER.NAME}]
  ▪ Type of inf: 	Numeric (unsigned)
  ▪ Master item		Podman containers by HTTP and Prometheus: Podman info
  ▪ Units: 		B
♦ Tags (name:value) 	
  ▪ Container:{#CONTAINER.NAME}	
  ▪ Metric:disk_read		
♯ Preprocessing
  ▪ Prometheus pattern	podman_container_block_input_total{id="{#CONTAINER.ID}",pod_id=~".*",pod_name=~".*"} value
  ▪ Change per second

● Item prototype #5
  ▪ Name: 		Disk write per second: [{#CONTAINER.NAME}]
  ▪ Type 		Dependent item
  ▪ Key: 		container.disk.write[{#CONTAINER.NAME}]
  ▪ Type of inf: 	Numeric (unsigned)
  ▪ Master item		Podman containers by HTTP and Prometheus: Podman info
  ▪ Units: 		B
♦ Tags (name:value) 	
  ▪ Container:{#CONTAINER.NAME}	 
  ▪ Metric:disk_write		
♯ Preprocessing
  ▪ Prometheus pattern	podman_container_block_output_total{id="{#CONTAINER.ID}",pod_id=~".*",pod_name=~".*"} value
  ▪ Change per second

● Item prototype #6
  ▪ Name: 		Exit code: [{#CONTAINER.NAME}]
  ▪ Type 		Dependent item
  ▪ Key: 		container.exit_code[{#CONTAINER.NAME}]
  ▪ Type of inf: 	Numeric (float)
  ▪ Master item		Podman containers by HTTP and Prometheus: Podman info
  ▪ Units: 			
♦ Tags 			
  ▪ Container:{#CONTAINER.NAME}	
  ▪ Metric:exit_code	
♯ Preprocessing
  ▪ Prometheus pattern	podman_container_exit_code{id="{#CONTAINER.ID}",pod_id=~".*",pod_name=~".*"} value

● Item prototype #7
  ▪ Name: 		Image tags: [{#CONTAINER.NAME}]
  ▪ Type 		Dependent item
  ▪ Key: 		container.image.tags[{#CONTAINER.NAME}]
  ▪ Type of inf: 	Character
  ▪ Master item		Podman containers by HTTP and Prometheus: Podman info
  ▪ Units: 			
♦ Tags 			
  ▪ Container:{#CONTAINER.NAME}	
  ▪ Metric:tag
♯ Preprocessing
  ▪ Prometheus pattern	podman_container_info{id="{#CONTAINER.ID}",image=~".*",name=~".*",pod_id=~".*",pod_name=~".*",ports=~".*"} label image
  ▪ Regular expression	\.*(\/.\w.*)	\1

● Item prototype #8
  ▪ Name: 		Memory usage: [{#CONTAINER.NAME}]
  ▪ Type 		Dependent item
  ▪ Key: 		container.mem[{#CONTAINER.NAME}]
  ▪ Type of inf: 	Numeric (unsigned)
  ▪ Master item		Podman containers by HTTP and Prometheus: Podman info
  ▪ Units: 		B
♦ Tags 			
  ▪ Container:{#CONTAINER.NAME}	
  ▪ Metric:mem		
♯ Preprocessing
  ▪ Prometheus pattern podman_container_mem_usage_bytes{id="{#CONTAINER.ID}",pod_id=~".*",pod_name=~".*"} value

● Item prototype #9
  ▪ Name: 		Network input dropped: [{#CONTAINER.NAME}]
  ▪ Type 		Dependent item
  ▪ Key: 		container.net.in.drop[{#CONTAINER.NAME}]
  ▪ Type of inf: Numeric (unsigned)
  ▪ Master item		Podman containers by HTTP and Prometheus: Podman info
  ▪ Units: 		packets
♦ Tags 			
  ▪ Container:{#CONTAINER.NAME}	
  ▪ Metric:net_in_drop		
♯ Preprocessing
  ▪ Prometheus pattern	podman_container_net_input_dropped_total{id="{#CONTAINER.ID}",pod_id=~".*",pod_name=~".*"} value

● Item prototype #10
  ▪ Name: 		Network input errors: [{#CONTAINER.NAME}]
  ▪ Type 		Dependent item
  ▪ Key: 		container.net.in.errors[{#CONTAINER.NAME}]
  ▪ Type of inf: 	Numeric (unsigned)
  ▪ Master item		Podman containers by HTTP and Prometheus: Podman info
  ▪ Units: 		
♦ Tags 			
  ▪ Container:{#CONTAINER.NAME}	
  ▪ Metric:net_in_err		
♯ Preprocessing
  ▪ Prometheus pattern	podman_container_net_input_errors_total{id="{#CONTAINER.ID}",pod_id=~".*",pod_name=~".*"} value

● Item prototype #11
  ▪ Name: 		Network input total: [{#CONTAINER.NAME}]
  ▪ Type 		Dependent item
  ▪ Key: 		container.net.in.total[{#CONTAINER.NAME}]
  ▪ Type of inf: 	Numeric (unsigned)
  ▪ Master item		Podman containers by HTTP and Prometheus: Podman info
  ▪ Units: 		B
♦ Tags 			
  ▪ Container:{#CONTAINER.NAME}	
  ▪ Metric:net_in_tot
♯ Preprocessing
  ▪ Prometheus pattern	podman_container_net_input_total{id="{#CONTAINER.ID}",pod_id=~".*",pod_name=~".*"} value

● Item prototype #12
  ▪ Name: 		Network input per second: [{#CONTAINER.NAME}]
  ▪ Type 		Dependent item
  ▪ Key: 		container.net.in.change[{#CONTAINER.NAME}]
  ▪ Type of inf: 	Numeric (float)
  ▪ Master item		prototype - Network input total: [{#CONTAINER.NAME}] 
  ▪ Units: 		Bps
♦ Tags 			
  ▪ Container:{#CONTAINER.NAME}	
  ▪ Metric:net_in_change
♯ Preprocessing
  ▪ Change per second

● Item prototype #13
  ▪ Name: 		Network output dropped: [{#CONTAINER.NAME}]
  ▪ Type 		Dependent item
  ▪ Key: 		container.net.out.drop[{#CONTAINER.NAME}]
  ▪ Type of inf: 	Numeric (unsigned)
  ▪ Master item		Podman containers by HTTP and Prometheus: Podman info
  ▪ Units: 		
♦ Tags 			
  ▪ Container:{#CONTAINER.NAME}	
  ▪ Metric:net_out_drop	
♯ Preprocessing
  ▪ Prometheus pattern	podman_container_net_output_dropped_total{id="{#CONTAINER.ID}",pod_id=~".*",pod_name=~".*"} value

● Item prototype #14
  ▪ Name: 		Network output errors: [{#CONTAINER.NAME}]
  ▪ Type 		Dependent item
  ▪ Key: 		container.net.out.errors[{#CONTAINER.NAME}]
  ▪ Type of inf: 	Numeric (unsigned)
  ▪ Master item		Podman containers by HTTP and Prometheus: Podman info
  ▪ Units: 		
♦ Tags 			
  ▪ Container:{#CONTAINER.NAME}	
  ▪ Metric:net_out_err	
♯ Preprocessing
  ▪ Prometheus pattern	podman_container_net_output_errors_total{id="{#CONTAINER.ID}",pod_id=~".*",pod_name=~".*"} value

● Item prototype #15
  ▪ Name: 		Network output total: [{#CONTAINER.NAME}]
  ▪ Type 		Dependent item
  ▪ Key: 		container.net.out.total[{#CONTAINER.NAME}]
  ▪ Type of inf: 	Numeric (unsigned)
  ▪ Master item		Podman containers by HTTP and Prometheus: Podman info
  ▪ Units: 		B
♦ Tags 			
  ▪ Container:{#CONTAINER.NAME}	
  ▪ Metric:net_out_tot	
♯ Preprocessing
  ▪ Prometheus pattern	podman_container_net_output_total{id="{#CONTAINER.ID}",pod_id=~".*",pod_name=~".*"} value

● Item prototype #16
  ▪ Name: 		Network output per second: [{#CONTAINER.NAME}]
  ▪ Type 		Dependent item
  ▪ Key: 		container.net.out.change[{#CONTAINER.NAME}]
  ▪ Type of inf: 	Numeric (float)
  ▪ Master item		prototype - Network output total: [{#CONTAINER.NAME}]
  ▪ Units: 		Bps
♦ Tags 			 
  ▪ Container:{#CONTAINER.NAME}	
  ▪ Metric:net_out_change
♯ Preprocessing
  ▪ Change per second

● Item prototype #17
  ▪ Name: 		Rootfs size: [{#CONTAINER.NAME}]
  ▪ Type 		Dependent item
  ▪ Key: 		container.rootfs.size[{#CONTAINER.NAME}]
  ▪ Type of inf: Numeric (unsigned)
  ▪ Master item		Podman containers by HTTP and Prometheus: Podman info
  ▪ Units: 		B
♦ Tags 			
  ▪ Container:{#CONTAINER.NAME}	
  ▪ Metric:rootfs
♯ Preprocessing
  ▪ Prometheus pattern	podman_container_rootfs_size_bytes{id="{#CONTAINER.ID}",pod_id=~".*",pod_name=~".*"} value

● Item prototype #18
  ▪ Name: 		Total system CPU time: [{#CONTAINER.NAME}]
  ▪ Type 		Dependent item
  ▪ Key: 		container.cpu.time[{#CONTAINER.NAME}]
  ▪ Type of inf: 	Numeric (float)
  ▪ Master item		Podman containers by HTTP and Prometheus: Podman info
  ▪ Units: 		s
♦ Tags 			
  ▪ Container:{#CONTAINER.NAME}	
  ▪ Metric:sys_time
♯ Preprocessing
  ▪ Prometheus pattern	podman_container_cpu_system_seconds_total{id="{#CONTAINER.ID}",pod_id=~".*",pod_name=~".*"} value
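Several of the item prototypes above use the Change per second preprocessing step on counters that only grow. As a reminder of the arithmetic, here is a small sketch of that calculation, including the documented behaviour of storing nothing when the new value is smaller than the previous one (for example after a container restart). The function name and sample numbers are illustrative.

```python
def change_per_second(prev_value, prev_time, value, time):
    """(value - prev_value) / (time - prev_time), or None when no rate can be stored."""
    if time <= prev_time or value < prev_value:
        return None  # counter reset or unusable timestamps: nothing is stored
    return (value - prev_value) / (time - prev_time)


# 3000 bytes received over a 300 second interval.
print(change_per_second(1000, 0, 4000, 300))
# Counter went backwards, e.g. after a restart.
print(change_per_second(4000, 300, 500, 600))
```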

Creating a template discovery rule: Trigger prototype

I have created a user macro {$CONTAINER.RUNNING.STATE} on the template with a value of 2, which corresponds to the container’s running state. After that, create a trigger prototype to check whether the container is in any state other than running.

Template: Podman containers by HTTP and Prometheus; Discovery rule: Container discovery

◘ Trigger prototypes
  ▪ Name:               Container [{#CONTAINER.NAME}] state has changed from running
  ▪ Severity:           Warning
  ▪ Expression:         last(/Podman containers by HTTP and Prometheus/container.state[{#CONTAINER.NAME}])<>{$CONTAINER.RUNNING.STATE}
  ▪ PROBLEM event generation mode: Single
  ▪ OK event closes: All problems
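The behaviour of this trigger prototype with the Single event generation mode can be pictured with a short simulation. This is a simplified model and not how Zabbix evaluates triggers internally; the state values other than 2 are arbitrary sample numbers.

```python
def problem_events(states, running_state=2):
    """Emit one PROBLEM per transition away from running and one OK on recovery."""
    events = []
    in_problem = False
    for clock, state in states:
        is_problem = state != running_state  # last(...) <> {$CONTAINER.RUNNING.STATE}
        if is_problem and not in_problem:
            events.append((clock, "PROBLEM"))
        elif not is_problem and in_problem:
            events.append((clock, "OK"))
        in_problem = is_problem
    return events


# (timestamp, container state) pairs; the container leaves state 2 and returns to it.
history = [(1, 2), (2, 3), (3, 5), (4, 2)]
print(problem_events(history))
```

Two consecutive non-running values produce only one problem event, which matches the Single setting.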

Once all of this is done, you will get a problem event whenever a container and/or pod status changes from running.

Fig 10. Generated problem events when the podman pod and container change states.

Technically, I could also create a trigger for container health; however, because all of the container health values I receive are -1 (meaning unknown), it makes little sense to create a trigger that would fire right away. You can also add additional item/trigger prototypes to the template. If everything is set up as expected, you should see something like the screenshot below after the LLD rule execution.

Fig 11. Example of the mysql-server container and zabbix pod item values.

Summary

You can now monitor both Podman pods and containers by following both posts in this series. The container LLD rule and its item prototypes reuse the same template item, Podman info, that was created in the first part.

The post Podman Container Monitoring with Prometheus Exporter, part 2 appeared first on Zabbix Blog.

Podman Container Monitoring with Prometheus Exporter, part 1

2025-06-10 Janis Eidaks

Post Syndicated from Janis Eidaks original https://blog.zabbix.com/podman-container-monitoring-with-prometheus-exporter-part-1/30513/

In part one of this blog post, I will show you how to monitor Podman pods using HTTP agent item to retrieve data from the Prometheus Podman exporter. Let’s get started!

Installing and checking Prometheus Podman exporter

First, you will need to install and enable the Prometheus Podman exporter (my OS is CentOS Stream release 9). Then, check that the service is active and running.

# dnf install -y prometheus-podman-exporter

# systemctl enable prometheus-podman-exporter --now

# systemctl status prometheus-podman-exporter

You can check that you are getting the data from the exporter with either the curl command from the machine/VM where the Prometheus podman exporter is installed and started:

# curl http://localhost:9882/metrics

Fig 1. Output of Prometheus podman exporter in CLI

Or through the browser (replace abc with the machine’s IP/DNS ): abc:9882/metrics.

Fig 2. Output of Prometheus Podman exporter in browser

A line starting with # is a comment and contains an explanation regarding the metric; in this case, podman_container_block_input_total will return data in bytes. In Figure 2, after the comments, you can see several podman_container_block_input_total metrics, one for each container, with different container IDs, pod IDs, and pod names listed in each metric. The metric’s value is displayed on the right side after curly brackets.
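The comment lines can also be collected programmatically, which is handy when you want a quick list of the metrics the exporter offers. The sketch below assumes the usual "# HELP <metric> <text>" layout of the Prometheus text format; the sample help text is made up and not copied from the exporter.

```python
def metric_help(text):
    """Map each metric name to the text of its '# HELP' comment line."""
    help_by_metric = {}
    for line in text.splitlines():
        parts = line.split(None, 3)
        if len(parts) == 4 and parts[0] == "#" and parts[1] == "HELP":
            help_by_metric[parts[2]] = parts[3]
    return help_by_metric


sample = '''# HELP podman_container_block_input_total Example help text (bytes).
# TYPE podman_container_block_input_total counter
podman_container_block_input_total{id="abc123",pod_id="",pod_name=""} 4096
'''
print(metric_help(sample))
```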

Creating a template and template items

Next, I will create a template named Podman containers by HTTP and Prometheus, which will hold all of the entities (everything will be created on the template). In the template, I will create an item called Podman info, which will gather all of the necessary data at defined intervals. This approach is convenient from a data collection standpoint, as the same item data will be used for both the Low-Level Discovery rule and the item prototypes. During testing, you can set “History” to store data for some time; once everything works as expected, set “History” to not keep any data.

The item Podman info parameters are as follows:

Template: Podman containers by HTTP and Prometheus

○ Item
  ▪ Name:                  Podman info
  ▪ Type:                  HTTP agent
  ▪ Key:                   podman.info
  ▪ Type of information:   Text
  ▪ URL:                   http://{HOST.CONN}:9882/metrics
  ▪ Request type:          GET
  ▪ Update interval:       5m
  ▪ Required status codes: 200
  ▪ History:               Do not store
◊ Tags
  ▪ Podman:raw

At this moment, this item will contain just raw data, without any preprocessing steps applied. The IP address will be taken from any host interface added to the host. You will get an error message if the host has no interface.

Fig 4. Error on the host with the linked template without any interface

If you do not want to add an interface to the host, you can define a user macro on the template level and use that user macro in the item’s URL. After linking the template to the host, just change the user macro value on the host to the correct IP address or DNS name.

Fig 6. Template item for data gathering with user macro instead of built in macro from host interface

I can also create an item to determine the number of containers created. To do this, I can count the occurrences of a specific Prometheus pattern in the master item. Here, I will use the podman_container_state metric. Likewise, I could use a different metric, such as podman_container_info, and count its occurrences. The parameters of the Container count item are as follows:

Template: Podman containers by HTTP and Prometheus

○ Item
  ▪ Name:                Container count
  ▪ Type:                Dependent item
  ▪ Key:                 container.count
  ▪ Type of information: Numeric (unsigned)
  ▪ Master item:         Podman containers by HTTP and Prometheus: Podman info
◊ Tags
  ▪ Containers:total
♯ Preprocessing
  ▪ Prometheus pattern     podman_container_state     count

Fig 7. Template item preprocessing step for counting the total number of containers
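The counting step can be pictured with a short Python sketch: it counts how many sample lines carry a given metric name, which is roughly what the count option of the Prometheus pattern step yields for an unfiltered pattern. The sample text and IDs are invented, and this is an illustration of the idea rather than Zabbix's implementation.

```python
# Made-up exporter output: three containers, one extra metric of another name.
SAMPLE = '''\
# HELP podman_container_state Container current state.
# TYPE podman_container_state gauge
podman_container_state{id="19286a13dc23",pod_id="",pod_name=""} 2
podman_container_state{id="22e3d69be889",pod_id="482943e6e969",pod_name="web"} 2
podman_container_state{id="390ac740fa80",pod_id="482943e6e969",pod_name="web"} 5
podman_container_info{id="19286a13dc23",name="standalone"} 1
'''


def count_metric(text, metric_name):
    """Count sample lines whose metric name equals metric_name exactly."""
    count = 0
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The metric name ends at the first "{" (labels) or space (value).
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name == metric_name:
            count += 1
    return count


if __name__ == "__main__":
    print(count_metric(SAMPLE, "podman_container_state"))
```

Because every container produces exactly one podman_container_state line, the number of matching lines equals the number of containers.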

Creating a Discovery rule in template

Next, the LLD rule will be created to discover Podman pods. It will be a dependent LLD rule based on a Podman info item with a preprocessing step to convert the Prometheus pattern data to JSON format. The caveat is that the LLD discovery will be executed as frequently as the data is received for the item. If there are a lot of hosts with such a template, there will be a lot of LLD processes executed, which can put a strain on your Zabbix instance.

To rectify this issue, I will add a preprocessing step: discard unchanged with heartbeat (as there are no dynamic parameters in the extracted pattern; otherwise, we would need to filter out dynamically changing information). For LLD rules, the recommended interval is around 1h. Additionally, LLD macros will be created from selected JSONPath expressions. The parameters of the LLD rule are shown below.

Template: Podman containers by HTTP and Prometheus

▲ Discovery rule
  ▪ Name:                   POD discovery
  ▪ Type                    Dependent item
  ▪ Key:                    training.pod.discovery
  ▪ Master item             Podman containers by HTTP and Prometheus: Podman info
  ▪ Delete lost resources  After 10d
  ▪ Disable lost resources Immediately
♯ Preprocessing
  ▪ Prometheus to JSON     podman_pod_info
  ▪ Discard unchanged with heartbeat 1h
♦ LLD Macros
  ▪ {#POD.ID}              $.labels.id
  ▪ {#POD.NAME}            $.labels.name

Fig 9. Discovery rule: Pod discovery preprocessing tab

Fig 10. Discovery rule: Pod discovery LLD macros tab
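The effect of the discard unchanged with heartbeat step can be modeled in a few lines of Python. This is a simplified mental model, assuming timestamps in seconds and a class name chosen for this example; it is not Zabbix source code.

```python
class DiscardUnchangedWithHeartbeat:
    """Pass a value on only if it changed or the heartbeat period has elapsed."""

    def __init__(self, heartbeat_seconds):
        self.heartbeat = heartbeat_seconds
        self.last_value = None
        self.last_passed_at = None

    def process(self, value, now):
        """Return the value if it is passed on, or None if it is discarded."""
        changed = value != self.last_value
        expired = (
            self.last_passed_at is None
            or now - self.last_passed_at >= self.heartbeat
        )
        if changed or expired:
            self.last_value = value
            self.last_passed_at = now
            return value
        return None


if __name__ == "__main__":
    step = DiscardUnchangedWithHeartbeat(3600)  # 1h heartbeat, as in the rule
    # The master item delivers data every 5 minutes (300 seconds).
    for now in range(0, 7201, 300):
        if step.process('[{"id": "482943e6e969"}]', now) is not None:
            print("LLD rule processed at", now, "seconds")
```

With identical data arriving every 5 minutes, the discovery is processed only once per hour instead of 12 times.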

The block diagram below will show how the data is transformed. First, a preprocessing step is applied to the data to convert the Prometheus pattern to JSON format, as all data for LLD must be supplied in JSON format.

In the example below, the matching queried pattern is returned in JSON format after this preprocessing step.

Fig 11. Discovery rule preprocessing step: Prometheus to JSON with pattern podman_pod_info

After the preprocessing step, we can assign specific JSONPath values to LLD macros.

Fig 12. Discovery rule LLD macros: assigning relevant JSONPath to LLD macros
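The two stages (Prometheus to JSON, then JSONPath to LLD macros) can be sketched in Python as follows. The sample lines, IDs, and label values are invented, and the JSON objects are a simplified approximation of what the preprocessing step produces (a name, a value, a labels object, and the raw line); treat the exact field set as an assumption.

```python
import json
import re

# Made-up exporter output with two pods.
SAMPLE = '''\
# HELP podman_pod_info Pod information
# TYPE podman_pod_info gauge
podman_pod_info{id="482943e6e969",infra_id="e9ee4b7b2f1a",name="web"} 1
podman_pod_info{id="790fc8ec1a2b",infra_id="3c1f0d9a7b55",name="db"} 1
podman_pod_state{id="482943e6e969"} 4
'''

LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def prometheus_to_json(text, metric_name):
    """Convert all sample lines of one metric into a JSON array of objects."""
    rows = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        head, _, value = line.rpartition(" ")
        name, _, label_part = head.partition("{")
        if name != metric_name:
            continue
        rows.append({
            "name": name,
            "value": value,
            "labels": dict(LABEL_RE.findall(label_part)),
            "line_raw": line,
        })
    return json.dumps(rows)


def to_lld_rows(json_text):
    """Mimic the LLD macros tab: $.labels.id and $.labels.name per object."""
    return [
        {"{#POD.ID}": row["labels"]["id"], "{#POD.NAME}": row["labels"]["name"]}
        for row in json.loads(json_text)
    ]


if __name__ == "__main__":
    print(to_lld_rows(prometheus_to_json(SAMPLE, "podman_pod_info")))
```

Each discovered pod becomes one row of macro values, which is what the prototypes later consume.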

Creating a template Discovery rule: item prototypes

Now that we have discovered the macros we are interested in, the discovered macros can be used for further prototype (ITEM/HOST/TRIGGER) creation. In this example, I am using the same master item for LLD discovery and the dependent item prototypes, because it is convenient for me, and all the information is available in one item. But usually, there are scenarios where you have to use one item’s data for discovery and the data of another item for populating the prototype values.

In this case, I am interested in the pod ID, when the pod was created, the number of containers in the pod, and the state of the pod. Therefore, I will create the item prototypes and use the LLD macro in the name, key, and preprocessing step. Zabbix will cycle through the discovered LLD macro values and create the items based on the prototype by replacing the LLD macro with discovered values. Although you can set matching item prototype names (which will be confusing), you still have to use the LLD macro in the item key so that different item keys are generated – otherwise, you will get an error regarding duplicate keys. The item prototype parameters are given below.

Fig 13. Low-level discovery rule and item prototypes based on the same item.

Template: Podman containers by HTTP and Prometheus; Discovery rule: POD discovery

○ Item prototype #1
  ▪ Name:                POD ID: [{#POD.NAME}]
  ▪ Type:                Dependent item
  ▪ Key:                 pod.id[{#POD.NAME}]
  ▪ Type of information: Character
  ▪ Master item:         Podman containers by HTTP and Prometheus: Podman info
♦ Tags
  ▪ Metric:ID
  ▪ Pod:{#POD.NAME}
♯ Preprocessing
  ▪ Prometheus pattern     podman_pod_containers{id="{#POD.ID}"}          label    id

○ Item prototype #2
  ▪ Name:                POD state: {#POD.NAME}
  ▪ Type:                Dependent item
  ▪ Key:                 pod.state[{#POD.NAME}]
  ▪ Type of information: Numeric (float)
  ▪ Master item:         Podman containers by HTTP and Prometheus: Podman info
  ▪ Value mapping:       POD state
♦ Tags
  ▪ Metric:state
  ▪ Pod:{#POD.NAME}
♯ Preprocessing
  ▪ Prometheus pattern     podman_pod_state{id="{#POD.ID}"}      value

○ Item prototype #3
  ▪ Name:                POD created at: [{#POD.NAME}]
  ▪ Type:                Dependent item
  ▪ Key:                 pod.created[{#POD.NAME}]
  ▪ Type of information: Numeric (unsigned)
  ▪ Units:               unixtime
  ▪ Master item:         Podman containers by HTTP and Prometheus: Podman info
♦ Tags
  ▪ Metric:created
  ▪ Pod:{#POD.NAME}
♯ Preprocessing
  ▪ Prometheus pattern     podman_pod_created_seconds{id="{#POD.ID}"}     value

○ Item prototype #4
  ▪ Name:                POD container count: [{#POD.NAME}]
  ▪ Type:                Dependent item
  ▪ Key:                 pod.count[{#POD.ID}]
  ▪ Type of information: Numeric (unsigned)
  ▪ Master item:         Podman containers by HTTP and Prometheus: Podman info
♦ Tags
  ▪ Metric:count
  ▪ Pod:{#POD.NAME}
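The way prototypes turn into real items can be sketched in Python: every discovered row of LLD macro values is substituted into each prototype's name and key. The function names and sample rows are hypothetical; the sketch only illustrates why the key must contain an LLD macro.

```python
def expand(template, lld_row):
    """Replace every LLD macro in the template with its discovered value."""
    result = template
    for macro, value in lld_row.items():
        result = result.replace(macro, value)
    return result


def create_items(prototypes, lld_rows):
    """Build {item key: item name} from prototypes, rejecting duplicate keys."""
    items = {}
    for row in lld_rows:
        for proto in prototypes:
            key = expand(proto["key"], row)
            if key in items:
                raise ValueError(f"duplicate item key: {key}")
            items[key] = expand(proto["name"], row)
    return items


# Invented discovery result with two pods.
LLD_ROWS = [
    {"{#POD.ID}": "482943e6e969", "{#POD.NAME}": "web"},
    {"{#POD.ID}": "790fc8ec1a2b", "{#POD.NAME}": "db"},
]

PROTOTYPES = [
    {"name": "POD state: {#POD.NAME}", "key": "pod.state[{#POD.NAME}]"},
    {"name": "POD container count: [{#POD.NAME}]", "key": "pod.count[{#POD.ID}]"},
]

if __name__ == "__main__":
    for key, name in create_items(PROTOTYPES, LLD_ROWS).items():
        print(key, "->", name)
```

A prototype whose key contains no LLD macro expands to the same key for every pod, so the second pod triggers the duplicate key error.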

On the template, I have also created a value map for deciphering the numerical pod state codes to text strings for better clarity.

Fig 14. Value mapping for the POD state item

Here are some screenshots of the POD state item prototype:

Fig 15. POD state item prototype: item prototype tab

Fig 16. POD state item prototype: tag tab

Fig 17. POD state item prototype: preprocessing tab

Creating a template Discovery rule: trigger prototype

We can also create a trigger prototype to generate an alert if there is something wrong with the pod. I have created a user macro {$POD.RUNNING.STATE} on the template with a value of 4, which corresponds to the running state.

Template: Podman containers by HTTP and Prometheus; Discovery rule: POD discovery

◘ Trigger prototypes:
  ▪ Name:               POD [{#POD.NAME}] state has changed from running
  ▪ Severity:           Warning
  ▪ Expression: last(/Podman containers by HTTP and Prometheus/pod.state[{#POD.NAME}])<>{$POD.RUNNING.STATE}
  ▪ PROBLEM event generation mode: Single
  ▪ OK event closes: All problems

Fig 18. Trigger prototype based on POD state item value
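As a rough illustration of how this expression behaves over time, the Python sketch below compares each new state value with the running state (4) and records an event only when the trigger changes between OK and PROBLEM, which is how I read the Single event generation mode. The state values other than 4 are arbitrary numbers used for the example.

```python
POD_RUNNING_STATE = 4  # value of {$POD.RUNNING.STATE} on the template


def trigger_events(values, running_state=POD_RUNNING_STATE):
    """Return (event, value) pairs for each OK/PROBLEM transition."""
    events = []
    in_problem = False
    for value in values:
        is_problem = value != running_state  # last(...) <> {$POD.RUNNING.STATE}
        if is_problem and not in_problem:
            events.append(("PROBLEM", value))
        elif not is_problem and in_problem:
            events.append(("OK", value))
        in_problem = is_problem
    return events


if __name__ == "__main__":
    print(trigger_events([4, 4, 3, 3, 4, 5]))
```

Two consecutive non-running values produce a single PROBLEM event rather than one per value.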

Once you link the template to the host and execute the LLD rule, you should start seeing the Podman pods (if you have them), similar to the screenshot below.

Fig 19. Latest data for the host with the linked template

Summary

This blog post shows how to get data with HTTP agent from Prometheus Podman exporter and use the same item data for the Discovery rule as well as item and trigger prototypes. Check out part 2 of this series to find out how to discover and monitor Podman containers.

The post Podman Container Monitoring with Prometheus Exporter, part 1 appeared first on Zabbix Blog.

Database Monitoring using Zabbix agent 2 – Part 1, SQL

2025-06-03 Alexander Petrov-Gavrilov

Post Syndicated from Alexander Petrov-Gavrilov original https://blog.zabbix.com/database-monitoring-using-zabbix-agent-2-part-1-sql/30381/

If you find yourself needing additional flexibility when it comes to database monitoring, Zabbix agent 2 may be exactly what you need. Keep reading to see which features make it ideal for database monitoring and find out how to best use them for your own purposes.

What is a database?

If you’ve been using Zabbix for a while, you know that a database is an organized collection of data that is stored and accessed electronically. That data can be historical, configuration, business, social media-related, etc. A database, or rather a database management system (DBMS), allows you to store, manage, and retrieve information efficiently.

Types of DBMS

We can separate DBMS into multiple types. Depending on how data is stored, retrieved, and managed, there can be quite a few, but we will limit ourselves to the four most common:

Relational databases (or RDBMS) use tables and SQL.
- MySQL
- MariaDB
- PostgreSQL
- Oracle

NoSQL databases store data in formats like JSON, key-value pairs, or graphs.
- MongoDB
- Redis
- InfluxDB
- ElasticSearch

Cloud databases use cloud platforms for scalability.
- Amazon RDS
- Azure SQL

Time-series databases (TSDB) are optimized for time-stamped data.
- TimescaleDB
- InfluxDB

But what unites all those database engines? They can all be monitored by Zabbix!

Database monitoring

Database monitoring is important for a variety of reasons, the most common of which are to get a precise overview of database and application performance. Since databases can be a vital part of multiple departments and applications, poor performance may impact an entire company and its users, leading to unsatisfactory results on all sides.

To avoid such situations, the set of metrics we should monitor for database engines can include:

Database environment metrics
- CPU performance
- Memory usage
- Drive capacity
- Disk latency

Database performance metrics
- Query performance
- Transaction/operations/indexing
- Connections

Application and/or business related data
- Number of users
- Transactions
- Inventory
- Configuration

Why Zabbix agent 2?

Zabbix Agent 2 includes multiple features that enhance its flexibility:

Task queue management that respects both schedules and task concurrency.
Concurrent active checks using threads.
Multiple metrics that are unique to Zabbix agent 2.
Easier extension through Go plugins.

Plugins in Zabbix Agent 2 are written in the Go programming language and provide a flexible, native way to extend the agent’s functionality. These plugins communicate directly with databases using their native APIs or libraries, which allows for correct and efficient performance monitoring.

But Zabbix agent 2 provides even more flexibility when focusing on database monitoring, allowing us to:

Limit query execution
Control the session time
Configure encryption between Zabbix agent and database
Control cache mode

All database data is collected using the best approach for the monitored database.

MySQL monitoring relies on the Go-MySQL-Driver
PostgreSQL integration is managed through the pgx driver

The list goes on for supported database engines:

MySQL / MariaDB
PostgreSQL
Oracle
MSSQL
MongoDB
Redis
Memcached

Monitoring SQL databases

Database environment

In this part we will focus on how to monitor and retrieve data from SQL databases and SQL database-related parameters. Monitoring SQL database environment metrics with Zabbix agent 2 is as straightforward as monitoring any virtual or physical machine with an OS. All we need to do is add the repo:

# dnf install https://repo.zabbix.com/zabbix/7.0/centos/9/x86_64/zabbix-release-latest-7.0.el9.noarch.rpm

Install the agent:

# dnf install zabbix-agent2

Then, make sure that connections from Zabbix server to Zabbix agent 2 are allowed using the Server parameter:

### Option: Server
#       List of comma delimited IP addresses, optionally in CIDR notation, or DNS names of Zabbix servers and Zabbix proxies.
#       Incoming connections will be accepted only from the hosts listed here....
# Mandatory: no
# Default:
# Server=
Server=127.0.0.1,server-dns.example.com

Finally, link one of the many templates available out of the box:

SQL database performance metrics

What about the actual DB performance metrics? There are plenty of approaches we can take using Zabbix agent 2.

Out-of-the-box templates are available for multiple databases that can be monitored by Zabbix agent 2:

Each of the templates uses a database-native way to get precise performance data, such as SHOW GLOBAL STATUS for MySQL or dbStats for MongoDB. Each template also provides instructions on how to prepare the database for monitoring. Let’s take MySQL/MariaDB as an example:

Create a MySQL user for monitoring (<password> at your discretion) and give this user enough permissions for monitoring:

mysql> CREATE USER 'zbx_monitor'@'%' IDENTIFIED BY '<password>';
mysql> GRANT REPLICATION CLIENT,PROCESS,SHOW DATABASES,SHOW VIEW ON *.* TO 'zbx_monitor'@'%';

In order to collect replication metrics, MariaDB Enterprise Server 10.5.8-5 and above and MariaDB Community Server 10.5.9 and above require the SLAVE MONITOR privilege to be set for the monitoring user. The command then looks like this:

mysql> GRANT REPLICATION CLIENT,PROCESS,SHOW DATABASES,SHOW VIEW,SLAVE MONITOR ON *.* TO 'zbx_monitor'@'%';

Then create a host to represent your MySQL/MariaDB and link the “MySQL by Zabbix agent 2” template:

Configure the Macros on the same host:

And the data will start pouring in!

You can find instructions for other databases here.

SQL database internal data monitoring

A default template will tell us a lot about performance, but what if we also need application data, something that is stored in the database? For example:

Number of orders
Logged in users
Host count
List of failed transactions
Amount of media uploaded

Zabbix agent 2 lets users collect custom SQL query results with the help of configuration files and a specific item key:

<dbtype>.custom.query[connString,<user>,<password>,queryName,<args...>]
• dbtype - mysql, pgsql, oracle, or mssql;
• connString - URI or session name;
• user, password - database login credentials;
• queryName - name of a custom query, matches the SQL file name without the .sql extension;
• args - one or several comma-separated arguments to pass to a query.

The main idea of this key is to construct efficient queries that can return multiple values. The values returned will be automatically transformed to JSON, which is both easier to preprocess and use for LLD creation.

I will add a simple query to find all hosts and their main interface availability in Zabbix:

SELECT hosts.host,interface.available FROM zabbix.hosts JOIN zabbix.interface ON hosts.hostid=interface.hostid WHERE hosts.status IN (0,1) AND hosts.flags IN (0,4) AND interface.main=1;
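To show the shape of the result such a query produces once each row is turned into a JSON object, here is a small Python sketch. It runs a simplified version of the query against an in-memory SQLite database with a hypothetical, heavily reduced stand-in for the hosts and interface tables; the schema, the rows, and the choice to render every value as a string are assumptions made for the example, not a description of how Zabbix agent 2 works internally.

```python
import json
import sqlite3


def query_to_json(conn, sql, args=()):
    """Run a query and return its rows as a JSON array of objects."""
    cursor = conn.execute(sql, args)
    columns = [description[0] for description in cursor.description]
    rows = [
        {column: str(value) for column, value in zip(columns, row)}
        for row in cursor.fetchall()
    ]
    return json.dumps(rows)


# Hypothetical, reduced schema and data for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hosts (hostid INTEGER PRIMARY KEY, host TEXT, status INTEGER, flags INTEGER);
CREATE TABLE interface (interfaceid INTEGER PRIMARY KEY, hostid INTEGER, main INTEGER, available INTEGER);
INSERT INTO hosts VALUES (1, 'Zabbix server', 0, 0), (2, 'Test environment', 0, 0), (3, 'Old template', 3, 0);
INSERT INTO interface VALUES (1, 1, 1, 1), (2, 2, 1, 0), (3, 3, 1, 1);
""")

QUERY = """
SELECT hosts.host, interface.available
FROM hosts JOIN interface ON hosts.hostid = interface.hostid
WHERE hosts.status IN (0, 1) AND hosts.flags IN (0, 4) AND interface.main = 1
ORDER BY hosts.hostid
"""

RESULT = query_to_json(conn, QUERY)

if __name__ == "__main__":
    print(RESULT)
```

One query returns the status of every host at once, so a single master item can feed both the discovery rule and all dependent items.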

First I need to create a directory for custom queries:

# mkdir /etc/zabbix/zabbix_agent2.d/plugins.d/custom_queries

Now I will create an .sql file with a query and paste the mentioned query into the file:

# nano /etc/zabbix/zabbix_agent2.d/plugins.d/custom_queries/interfaces.sql

Now I will edit the MySQL plugin .conf file and set a custom queries path:

# nano /etc/zabbix/zabbix_agent2.d/plugins.d/mysql.conf
### Option: Plugins.Mysql.CustomQueriesPath
#       Full pathname of a directory containing *.sql* files with custom queries.
#
# Mandatory: no
# Default:
# Plugins.Mysql.CustomQueriesPath=
Plugins.Mysql.CustomQueriesPath=/etc/zabbix/zabbix_agent2.d/plugins.d/custom_queries/

Save the changes and restart Zabbix agent 2 to apply them:

# systemctl restart zabbix-agent2

Before adding the item using the web interface, it is always a good idea to test it:

zabbix_agent2 -t mysql.custom.query["tcp://localhost:3306","zbx_monitor","<password>","interfaces"]

The output is now an easy-to-work-with JSON array (beautified here):

[
  {
    "available": "1",
    "host": "Zabbix server"
  },
  {
    "available": "1",
    "host": "Test environment"
  },
  {
    "available": "1",
    "host": "MySQL database"
  },
  {
    "available": "1",
    "host": "MongoDB database"
  },
  {
    "available": "1",
    "host": "PostgreSQL database"
  },
  {
    "available": "1",
    "host": "Customer portal"
  }
]

Now, I’m sure the data is collected and can be used for LLD. I can create a new item on the MySQL database host to collect this data:

Since I know what kind of data will be returned, I can create a dependent Discovery rule on the same host:

The LLD macros tab will help to transform the current JSON to the LLD-suitable JSON, replacing “host” with {#HOST}.

Interface LLD item macros

After adding the discovery itself, we can create the dependent item prototype, which will allow us to discover all hosts and their status:

Preprocessing here is a must, and it needs to be flexible enough to extract each individual host interface status:

Interface status item prototype preprocessing

Now after adding the item prototype, we can check the results:

An item can be further enhanced using value mapping to specify that 1 means available and 0 means not available.
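The chain described above (LLD macro mapping, per-host preprocessing, and value mapping) can be sketched in Python. The function names are invented for this example, and the JSONPath mentioned in the comment is an assumption about what such a preprocessing step could look like, since the original screenshot is not reproduced here.

```python
import json

# Master item value, in the same shape as the custom query output.
RAW = json.dumps([
    {"available": "1", "host": "Zabbix server"},
    {"available": "0", "host": "Test environment"},
])

VALUE_MAP = {1: "available", 0: "not available"}


def discover_hosts(raw_json):
    """LLD macros step: expose the 'host' field as {#HOST}."""
    return [{"{#HOST}": row["host"]} for row in json.loads(raw_json)]


def interface_status(raw_json, host):
    """Item prototype preprocessing: pick 'available' for one host.

    Roughly what a JSONPath such as
    $[?(@.host=='{#HOST}')].available.first() would extract (assumed form).
    """
    for row in json.loads(raw_json):
        if row["host"] == host:
            return int(row["available"])
    raise KeyError(host)


if __name__ == "__main__":
    for lld_row in discover_hosts(RAW):
        host = lld_row["{#HOST}"]
        status = interface_status(RAW, host)
        print(f"{host}: {status} ({VALUE_MAP[status]})")
```

Every discovered host gets its own dependent item, yet the database is queried only once per update of the master item.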

With this approach, any internal database data can be extracted and monitored. In part 2 we will see how NoSQL databases can be monitored for both performance and internal data using Zabbix agent 2.

If you’d like more information on database monitoring, please don’t hesitate to sign up for our training course in Advanced Zabbix Database Monitoring, which covers multiple approaches to collecting database-related performance metrics and data using Zabbix Agent 2, ODBC, and API requests, as well as optimizing data collection by introducing dependent low-level discovery for minimal performance impact.

The post Database Monitoring using Zabbix agent 2 – Part 1, SQL appeared first on Zabbix Blog.

Let Zabbix be your Lucky Lady for Lotto Numbers

2025-05-23 Janne Pikkarainen

Post Syndicated from Janne Pikkarainen original https://blog.zabbix.com/let-zabbix-be-your-lucky-lady-for-lotto-numbers/30251/

Many years ago, maybe around 2017 or 2018, one of my ex-colleagues (Hi, Kevin!) said that I would probably even use Zabbix to come up with the winning lotto numbers. Just to strike back, I did exactly that with a small “easter egg” in Zabbix containing the lotto numbers – a quick bash script feeding the Zabbix item.

Let’s return to that topic, but use a Zabbix Script item type instead. Also, let’s take a look at a few other details that help with monitoring.

Let’s create a host and a new template

To begin with, I created a new template and a new host. Here’s the host, nothing more needed than a name and my fancy template:

The template has only one item:

For the script, here’s what ChatGPT came up with. JavaScript is not my strongest skill, so for a fun little experiment this AI vibe coding should be good enough.

// Generate an array of numbers from 1 to 40
var numbers = [];
for (var i = 1; i <= 40; i++) {
   numbers.push(i);
}

// Shuffle the array using the Fisher-Yates algorithm
for (var i = numbers.length - 1; i > 0; i--) {
   var j = Math.floor(Math.random() * (i + 1));
   var temp = numbers[i];
   numbers[i] = numbers[j];
   numbers[j] = temp;
}

// Select the first 7 numbers from the shuffled array
var lotto = numbers.slice(0, 7);

// Optionally sort the selected numbers
lotto.sort(function(a, b) { return a - b; });

// Return the lottery numbers as a space-separated string
return lotto.join(" ");

So, that’s it! Well, almost – to stop Zabbix from coming up with new numbers all the time, here’s a very nice feature of Zabbix.

Custom intervals

If you set your Update interval to 0, you can use Custom intervals. This way, the new numbers will only be generated once per week every Monday at 8:00 to kick off your work week (assuming that your country has the lotto only once per week, of course).
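To make the schedule concrete, the Python sketch below computes when a "every Monday at 8:00" check would run next. In Zabbix this would be expressed as a scheduling custom interval; I believe wd1h8 is the matching value, but treat that string as an assumption, since the screenshot with the actual setting is not included here. The function name is made up for the example.

```python
from datetime import datetime, timedelta


def next_monday_8am(now):
    """Return the first Monday 08:00 strictly after the given moment."""
    candidate = now.replace(hour=8, minute=0, second=0, microsecond=0)
    days_ahead = (0 - candidate.weekday()) % 7  # Monday is weekday 0
    candidate += timedelta(days=days_ahead)
    if candidate <= now:
        candidate += timedelta(days=7)  # this week's slot has already passed
    return candidate


if __name__ == "__main__":
    print(next_monday_8am(datetime(2025, 5, 23, 12, 0)))
```

Between those moments the item keeps its last value, so the same set of numbers stays on the dashboard for the whole week.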

Naturally, in the actual business world, this kind of exact scheduled monitoring can be extremely helpful as well. If you have something you don’t need to check all the time but only during business days and hours, or only on weekends, or only once per day, this is a handy way of doing it.

Does it work?

You know the answer — of course it does! Now when I search for “lotto”, click on “Latest data” and force the check to happen immediately by clicking on “Execute now”, this happens.

Time to dashboard it

I could peek at the values from Latest data, but that would be boring. With a dashboard, it’s a bit more entertaining…

I hope this post gave you some new ideas or maybe even introduced you to Script item type. If you win some major money with this trick, don’t forget to buy me a coffee!

The post Let Zabbix be your Lucky Lady for Lotto Numbers appeared first on Zabbix Blog.

Build a Culture of Monitoring and Get Buy-In with Zabbix

2025-05-06 Michael Kammer

Post Syndicated from Michael Kammer original https://blog.zabbix.com/build-a-culture-of-monitoring-and-get-buy-in-with-zabbix/30085/

In today’s fast-paced, interconnected IT world, simply waiting for something to fail before fixing it isn’t good enough. A proactive approach to monitoring, which aims to identify and address potential issues before they escalate into major disruptions, is a necessity rather than a luxury.

Here at Zabbix, we’ve got plenty of reason to believe that we offer the most flexible monitoring solution available on the market today. However, choosing the best monitoring tool for your organization’s needs is only half the battle – you also need to get buy-in from team members who may not understand the need for monitoring, may be fearful of and resistant to change, and may not be familiar with the technologies behind monitoring.

In this post, we’ll take a look at a few strategies you can use to help win over lukewarm or hesitant colleagues and build a culture of monitoring. We’ll also explore how choosing Zabbix for your monitoring needs can make each strategy a bit easier to implement.

Strategy 1: Explain the “why”

One of the first questions that you can anticipate during any change initiative is simply, “what for?” The ethos of “don’t fix what isn’t broken” runs strong in the tech community, and unless you go above and beyond to explain why monitoring matters, your team will remain skeptical.

Zabbix can help by providing the evidence you need to bolster your case. We’ve got plenty of testimonials available from tech communities worldwide (including PeerSpot, Gartner, and Capterra), and no matter what field you’re in or how big your team is, we’ve most likely got a case study or two available that shows how monitoring with Zabbix was a game changer for a company like yours.

All of this should help you explain the rationale for the change in an open and transparent way. When it comes to monitoring, sharing details on costs, expected benefits, and what will happen if no change is made will build understanding around why monitoring is necessary and why monitoring with Zabbix is the right answer for your team’s needs.

Strategy 2: Show your team what’s in it for them

One of the most effective ways to get employee buy-in for monitoring is by highlighting the benefits it will bring to individual employees. Show how monitoring can simplify their tasks, improve efficiency, and enhance their work experience, and give them concrete examples of how the technology can make their jobs easier or help them to deliver better results.

We recently had a large managed services provider (MSP) use our monitoring solution as a true “force multiplier”, allowing them to monitor their systems, automate tasks based on real-time events, and provide immediate responses to issues without manual intervention. Their engineers report higher job satisfaction now that they no longer have to be “on call” at all hours to solve simple issues, while management has seen productivity skyrocket thanks to the team’s newfound ability to find potential issues before they become real problems.

Strategy 3: Turn important stakeholders into monitoring champions

Determine who monitoring will impact and who needs to be kept informed. This might be team leaders, IT staff, end users, and/or an executive sponsor. Getting input from these groups early on will help you anticipate needs and concerns, and you’ll also want to identify influential employees who are enthusiastic about monitoring and get them to help you promote it.

A great way to help them do so is by encouraging them to attend one (or more) Zabbix events – we’ve got free meetings, online meetups, regional conferences, or even our yearly Summit in Latvia. No matter where you happen to be located, there’s a pretty good chance that we’ll soon be bringing your key people a chance to network with like-minded professionals from multiple industries, expand their knowledge, get answers to their questions, and explore how Zabbix can work for them.

Strategy 4: Provide adequate training

Equipping employees with the skills and knowledge they need to get the most out of a monitoring system means gaining a solid understanding of their current capabilities and then finding out which gaps you most urgently need to fill. Chances are, you’ll need to provide guidance, documentation, hands-on demonstrations, and access to experts – and this is another area where Zabbix has you covered.

Zabbix Certified trainings are designed to help your people learn Zabbix inside and out, giving them the practical knowledge they’ll need to increase their productivity and performance. When you explore our training options, you’ll find a wide variety of courses, everything from one-day sessions that cover the basics to week-long sessions that guarantee users the ability to tackle any Zabbix challenge on their own.

In addition, we’ve got plenty of other free resources available to teams and individuals looking to upskill, including our famously active forum, blog, webinars, and newsletter.

Conclusion

Building a culture of monitoring requires commitment from every level of an organization. By choosing Zabbix as the guide to your monitoring journey and following the strategies outlined in this article, you and your team can successfully implement and maintain a robust monitoring strategy that will help you achieve your organization’s IT goals.

To learn more about what Zabbix can do for you, visit our website.

The post Build a Culture of Monitoring and Get Buy-In with Zabbix appeared first on Zabbix Blog.

Getting Started with Zabbix – Hosts, Items, and Triggers

2025-04-29 Arturs Lontons

Post Syndicated from Arturs Lontons original https://blog.zabbix.com/getting-started-with-zabbix-hosts-items-and-triggers/30190/

Hosts, items, and triggers are some of the most basic concepts in Zabbix. To successfully configure their monitoring workflows, Zabbix users need to have a clear understanding of how these entities are used. This article is aimed at Zabbix beginners and should help anyone better understand the basics of Zabbix while providing guidance on how to start monitoring your initial set of hosts.

Hosts

Hosts are top-level entities in Zabbix and represent your monitored endpoints. Whenever we need to monitor a device, web application, service, or anything else – we start by creating a host.

The host acts as a container for our items (representing the metrics we wish to collect) and triggers (problem threshold definitions). These entities can be created directly on the host or inherited from predefined templates.

Every host has 2 mandatory parameters – its unique name and at least a single host group. Host groups are used for grouping, filtering, and assigning read/write permissions to hosts. Hosts are not limited when it comes to the number of host groups they are assigned to.

A simple Linux server host with an agent interface and a Linux template

An interface might also be required, depending on the type of items we will create on the host. Interfaces define host addresses and, in case of SNMP interfaces, some additional authentication and security parameters.

There are 4 types of interfaces in total, representing 4 different data collection methods:

Agent
SNMP
JMX
IPMI

Zabbix supports other types of data collection methods, but for these 4 methods in particular an interface is required on the host. Other data collection methods define endpoint addresses directly in the item configuration or use push data collection (trapping) where Zabbix is not required to know the endpoint address.

Templates

Templates contain a set of predefined items and triggers and can be linked to hosts. This enables the standardization of monitoring workflows in your environment. Changes made on the template will be immediately applied on the hosts to which the template is linked. Zabbix comes prepackaged with over 300 templates for a variety of vendors and endpoint types.

Zabbix users aren’t limited to just the official templates – anyone can create their own templates with items and triggers tailored to the requirements of a particular environment. We also recommend adjusting the official templates – disable the unnecessary items and adjust the triggers so they don’t generate any unnecessary noise.

Items

Items are used to define the metrics that we wish to collect, and are configured on hosts or templates. Items can be of various types. The type of the item usually defines the protocol and the methods used to collect metrics via this item. Some examples of item types:

Zabbix agent
SNMP agent
SNMP trap
Simple check
HTTP agent
IPMI agent
JMX agent
SSH agent
…and many others.

The key of the item is used to specify what particular metric should be collected. There are some exceptions to this – for example, for SNMP agent items it’s the OID field, while the key can be written arbitrarily. The key should be unique per host.

The key uses a <key>[<parameters>] format. For example, if we wish to collect available memory by utilizing Zabbix agent, we will use the vm.memory.size[available] item key. If we wish to collect available memory in percent, we would use the vm.memory.size[pavailable] item key. A quick item key reference is available by pressing Select next to the Key field. You can find more about the available item keys and other configuration details in our documentation.
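The key format can be illustrated with a tiny Python parser that splits a key into its name and parameter list. It is a deliberately simplified sketch with a made-up function name: it does not handle quoted parameters or nested arrays, which real item keys may contain.

```python
def parse_item_key(item_key):
    """Split '<key>[<parameters>]' into (key, [parameters])."""
    name, bracket, rest = item_key.partition("[")
    if not bracket:
        return name, []  # a key without parameters
    if not rest.endswith("]"):
        raise ValueError(f"missing closing bracket: {item_key}")
    params = [param.strip() for param in rest[:-1].split(",")]
    return name, params


if __name__ == "__main__":
    print(parse_item_key("vm.memory.size[available]"))
    print(parse_item_key("vm.memory.size[pavailable]"))
```

Both example keys share the name vm.memory.size and differ only in the parameter, which is why they count as two distinct keys on the same host.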

The update interval specifies how often metrics should be collected for this key, and the history/trend storage periods define for how long the collected data should be retained.

Triggers

Once we have configured our items, we should create triggers to react to item values reaching problem thresholds. First, let’s define a simple trigger name. The name should be simple enough for our Zabbix administrators to understand the goal of the trigger simply by glancing at it.

Trigger reaction to low available memory over the last 10 minutes

The event name field is used to define the name with which our problems will be displayed. Since the problem event name is often used not just in Zabbix but also in the alerts that your administrators will receive in their mailboxes or via messaging and ITSM systems, the event name should be more descriptive, giving general details about the problematic situation.

Operational data fields are used to display information about the current state of items analyzed by the trigger. By default, the field will display the current value of our item (available memory, for example). This allows users to compare the current item values with item values at the time of problem creation and decide if any additional intervention is necessary to resolve the problem.

The expression field defines the logic behind detecting a problem. Here, we can either type in the expression manually or press the add button and build the expression by selecting the item that we wish to analyze – plus one of the various functions used for analysis. For example, the last function is used to analyze only the last received value and can generate a lot of noise when used for resource monitoring. Meanwhile, average, minimum, and maximum functions can be used to analyze values less sensitively over time. There are many more functions available for a variety of more advanced use cases – from string analysis functions to predictive functions and many others.

A large selection of functions can be used in trigger expressions to detect problems
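To illustrate why a windowed function is less noisy than last, the sketch below applies both checks to ten made-up one-minute samples of available memory in percent. The sample values, the 10% threshold, and the host name in the commented expressions are assumptions for illustration; the arithmetic is done in shell here, not by Zabbix.

```shell
# Ten hypothetical one-minute samples of available memory, in percent.
samples="35 34 36 33 35 8 34 36 35 9"
threshold=10

# Roughly what last(/Linux-server/vm.memory.size[pavailable])<10 looks at: the newest value only.
last=$(printf '%s\n' $samples | tail -n 1)

# Roughly what avg(/Linux-server/vm.memory.size[pavailable],10m)<10 looks at: the mean of the window.
avg=$(printf '%s\n' $samples | LC_ALL=C awk '{ sum += $1 } END { printf "%.1f", sum / NR }')

# Succeeds when the value is below the threshold (awk handles the decimal comparison).
fires() { LC_ALL=C awk -v v="$1" -v t="$2" 'BEGIN { exit !(v < t) }'; }

if fires "$last" "$threshold"; then echo "last=$last: problem"; else echo "last=$last: ok"; fi
if fires "$avg" "$threshold"; then echo "avg=$avg: problem"; else echo "avg=$avg: ok"; fi
```

With these samples, the last-based check fires on a single dip, while the 10-minute average stays well above the threshold.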

Once the trigger is created, it will be recalculated every time any of the related items receive a new value.

This article covered only the basics of host, item, and trigger configuration. There are many more options for more advanced use cases. If you’re interested or need help with more advanced Zabbix features, please check out the tutorials, how-tos, and case studies on our blog and YouTube channel.

The post Getting Started with Zabbix – Hosts, Items, and Triggers appeared first on Zabbix Blog.

Enhancing Visualizations in Zabbix with the ECharts Module

2025-04-17 Matheus da Silva Andrade

Post Syndicated from Matheus da Silva Andrade original https://blog.zabbix.com/enhancing-visualizations-in-zabbix-with-the-echarts-module/30199/

One of the great advantages of Zabbix is its extensible and modular architecture. This allows the platform to be enhanced with third-party modules, significantly expanding its functionalities without compromising the stability of the core system. The ECharts-Zabbix module is an excellent example of this flexibility in action.

What is the ECharts-Zabbix module?

ECharts-Zabbix is a module that adds customizable widgets to Zabbix, using the ECharts library to create interactive and dynamic visualizations of your monitoring data. This module complements Zabbix’s standard visual capabilities, enabling richer and more informative graphical representation of complex monitoring environments.

What are the key features available with ECharts in Zabbix?

By integrating ECharts and Zabbix, you gain access to:

Multiple chart types (line, bar, pie, gauge, scatter, heatmap, and more)
Complete customization of colors, styles, legends, and tooltips
Fluid animations for a better user experience
Compatibility with Zabbix light and dark themes
Direct integration with data without the need for external tools
Responsive visualizations that adapt to different screen sizes
Helper functions for data formatting and dynamic color generation

Installation and configuration

Installing modules in Zabbix is easy thanks to the platform’s flexibility:

Download the module from the official repository
Extract the files to the modules directory of your Zabbix frontend
In the Zabbix frontend, go to Administration > General > Modules
Find the ECharts-Zabbix module in the list and click “Enable”
The widget will be available for use in Zabbix dashboards and screens
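The extraction step above can be sketched as a small shell helper. This is only a sketch under assumptions: it supposes the module was downloaded as a .tar.gz archive, and both the archive name and the modules path (for example /usr/share/zabbix/modules) depend on your installation.

```shell
# Unpack a downloaded module archive into the frontend's modules directory.
# Usage: install_frontend_module <archive.tar.gz> <modules_dir>
# Both arguments are placeholders; the helper name is hypothetical.
install_frontend_module() {
  archive=$1
  modules_dir=$2
  [ -f "$archive" ] || { echo "archive not found: $archive" >&2; return 1; }
  [ -d "$modules_dir" ] || { echo "modules directory not found: $modules_dir" >&2; return 1; }
  tar -xzf "$archive" -C "$modules_dir"
}

# Example: install_frontend_module ./echarts-module.tar.gz /usr/share/zabbix/modules
```

After extraction, the module still has to be enabled in the frontend as described in the steps above.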

Practical use cases

Server performance monitoring with Gauge charts

Gauge charts are ideal for visualizing metrics such as CPU, memory, and disk usage. The flexibility of Zabbix combined with ECharts allows you to create impressive visual panels that clearly show the current state of these metrics:

```javascript
const field = context.panel.data.series[0].fields[0];
const value = field.value;

const gaugeData = [{
  value: value,
  name: field.name,
  title: {
    offsetCenter: ['0%', '30%']
  },
  detail: {
    offsetCenter: ['0%', '60%']
  }
}];

return {
  backgroundColor: 'transparent',
  series: [{
    type: 'gauge',
    startAngle: 90,
    endAngle: -270,
    center: ['50%', '50%'],
    radius: '90%',
    pointer: {
      show: false
    },
    progress: {
      show: true,
      overlap: false,
      roundCap: true,
      clip: false,
      itemStyle: {
        borderWidth: 0
      }
    },
    axisLine: {
      lineStyle: {
        width: 20,
        color: [[1, 'rgba(255,255,255,0.1)']]
      }
    },
    splitLine: {
      show: false
    },
    axisTick: {
      show: false
    },
    axisLabel: {
      show: false
    },
    data: gaugeData,
    title: {
      fontSize: 14,
      fontWeight: 'normal'
    },
    detail: {
      width: 80,
      height: 20,
      fontSize: 14,
      fontWeight: 'normal',
      borderWidth: 0
    }
  }]
};
```

Liquid fill chart example

This chart type is great for visualizing percentage-based metrics, like disk usage or SLA compliance, in a visually appealing way:

```javascript
// Bail out early if the widget data is not in the expected shape
if (!context.panel.data.series || !context.panel.data.series[0] || !context.panel.data.series[0].fields) {
    console.error('Data not available in the expected format');
    return {};
}

const field = context.panel.data.series[0].fields[0];

return {
    backgroundColor: 'transparent',
    series: [{
        type: 'liquidFill',
        // liquidFill expects a fraction between 0 and 1
        data: [field.value / 100],
        radius: '80%',
        color: ['#91cc75'],
        backgroundStyle: {
            color: 'rgba(255, 255, 255, 0.1)'
        },
        label: {
            formatter: function() {
                return field.name + '\n' + field.value.toFixed(2) + field.units;
            },
            fontSize: 28,
            color: 'black'
        },
        outline: {
            show: false
        }
    }]
};
```

Below are some other visualization examples available on our GitHub:

Colors and gradients

You can use simple hexadecimal colors or create sophisticated gradients:

```javascript
// Linear gradient
new echarts.graphic.LinearGradient(0, 0, 0, 1, [
  { offset: 0, color: '#83bff6' },
  { offset: 1, color: '#188df0' }
])
```

Number formatting

Format your numerical data as needed:

```javascript
// 2 decimal places
formatter: function(value) {
  return value.toFixed(2) + field.units;
}

// Using context helper
formatter: function(value) {
  return context.helpers.formatNumber(value, 2) + field.units;
}
```

Element positioning

Precisely control where elements are displayed:

```javascript
// Centered
offsetCenter: [0, '70%']

// Custom grid
grid: {
  top: '5%',
  left: '3%',
  right: '4%',
  bottom: '3%',
  containLabel: true
}
```

The Zabbix module ecosystem

Zabbix has a growing ecosystem of modules and integrations, developed by both the community and specialized companies like Monzphere, which contributes the ECharts-Zabbix module. This development dynamic demonstrates how Zabbix has evolved to become a truly extensible platform.

To learn more about the ECharts-Zabbix module and other solutions for Zabbix, you can visit our official GitHub repository or Monzphere’s website.

Conclusion

Zabbix’s modular architecture is one of its greatest differentiators, allowing the platform to grow and adapt to the specific needs of each monitoring environment. The ECharts-Zabbix module is an excellent example of how this flexibility can be leveraged to transform the data visualization experience in Zabbix.

For modern monitoring environments where clear and effective data visualization is essential, the combination of Zabbix with specialized modules represents a complete and adaptable solution. Try expanding your Zabbix with the ECharts module and discover how it can transform your monitoring dashboards!

The post Enhancing Visualizations in Zabbix with the ECharts Module appeared first on Zabbix Blog.

Deploying Zabbix Components with Docker and Docker Compose

2025-04-08 Janis Eidaks

Post Syndicated from Janis Eidaks original https://blog.zabbix.com/deploying-zabbix-components-with-docker-and-docker-compose/30025/

Installing Zabbix from packages can feel overwhelming because of the number of configuration options. The detailed and comprehensive documentation helps you check the purpose of each option, which values it accepts, and whether it is required for your planned deployment. There are already quite a few official Zabbix blog posts about Zabbix in containers; this post shows how additional Zabbix components can be set up in a Docker environment, with docker run and docker compose examples.

For those who would prefer to use Zabbix in a containerized environment such as Docker, or who want to try out Zabbix quickly, this guide is for you (you can also check out the other Zabbix Docker blog posts). You can also mix and match Zabbix components installed from packages or built from source with those running in containers.

To set up Docker itself, please follow the official Docker installation guide.

For those who are trying out Zabbix for the first time, here is a brief overview of the Zabbix architecture that should make the rest of this guide easier to follow.

Zabbix consists of 3 main components (the bare minimum to get started):

Zabbix Server – responsible for everything related to data collection, trigger evaluation, event generation, and alerting.
Zabbix Frontend – responsible for the configuration (modifying or changing the configuration of the monitoring targets) and visualization (dashboards, graphs, tables, and widgets).
Database – this is where the Zabbix configuration and monitoring history data are stored.

You can monitor your targets with the bare minimum setup; however, more comprehensive monitoring can be achieved by using the C-based Zabbix agent or the Go-based Zabbix agent 2 in combination with templates, user parameters, and more. To set up the minimum necessary Zabbix components, you can use the example in the guide.

There are also official guides available on the Zabbix documentation page (for both docker run and docker compose) and on Docker Hub and GitHub.

As of this writing, these official Zabbix Docker components are available from the Docker Hub page:

Zabbix Server (with MySQL/PostgreSQL database)
Zabbix Proxy (with MySQL/SQLite3 database)
Zabbix Frontend (Apache/Nginx with MySQL/PostgreSQL DB)
Zabbix Agent (TLS encryption)
Zabbix Agent2 (TLS encryption)
Zabbix Java Gateway
Zabbix SNMP traps
Zabbix Web Service

Tags are used to select which OS an image is based on, as well as which Zabbix component version you wish to use. If you specify only the tag latest, you will get the latest Zabbix version based on Alpine Linux. The images based on Alpine Linux are more lightweight than those based on other distributions.

When something does not work as expected or fails, check the container logs! This helps with debugging and narrows down the cause of an issue. When debugging, you can also pass additional options, such as the number of lines to show, a since or until timestamp, or a flag to follow the log output.

# docker logs --tail 50 container_name_or_id

    --details        Show extra details provided to logs
-f, --follow         Follow log output
    --since string   Show logs since timestamp (e.g. "2013-01-02T13:23:37Z") or relative (e.g. "42m" for 42 minutes)
-n, --tail string    Number of lines to show from the end of the logs (default "all")
-t, --timestamps     Show timestamps
    --until string   Show logs before a timestamp (e.g. "2013-01-02T13:23:37Z") or relative (e.g. "42m" for 42 minutes)
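The options above can be combined with ordinary shell filtering. The helper below is a minimal sketch; its name, the 15-minute default window, and the grep pattern are assumptions chosen for illustration.

```shell
# Print recent log lines from a container that look like errors.
# Usage: recent_errors <container_name_or_id> [since]
# The helper name, default window, and grep pattern are illustrative assumptions.
recent_errors() {
  container=$1
  since=${2:-15m}
  # docker logs replays the container's stdout and stderr, so merge both before filtering.
  docker logs --timestamps --since "$since" "$container" 2>&1 |
    grep -iE 'error|cannot|failed' || true
}

# Example: recent_errors zabbix-proxy-active-01 1h
```

The trailing `|| true` keeps the helper from reporting failure when no line matches, which is the healthy case.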

In some rare cases, when there is a container issue (everything else is correct, worked before, etc.), restarting the docker service can sometimes solve the issue.

So, what is different if you have only used Zabbix installed from packages? The examples below illustrate the differences in configuration options based on different Zabbix deployment methods: a) package-based/compiled installation, b) docker run command, and c) docker compose file example. First of all, you will have to specify environment variables in the docker run command or docker compose file. The list of available environment variables for each docker image is available in both docker hub and Github.

A). Package-based config

# vi /etc/zabbix/zabbix_server.conf
...
DBName=zabbix
DBUser=zabbix_usr
DBPassword=zabbix_pwd
...

B). Docker run config

docker run --name zbxsrv -t \
...
-e MYSQL_DATABASE=zabbix \
-e MYSQL_USER=zabbix_usr \
-e MYSQL_PASSWORD=zabbix_pwd \
...

C). Docker compose config

# vi /../...yaml
...
  environment:
   - MYSQL_DATABASE=zabbix
   - MYSQL_USER=zabbix_usr
   - MYSQL_PASSWORD=zabbix_pwd

The environment variables are represented as key-value pairs, e.g., VAR=VAL. The values can optionally be unquoted or double-quoted. If some environment variable value contains special characters, you will need to escape them. To properly escape them, check out the docker documentation page.

You can create custom, user-defined networks to connect multiple containers to the same network. On such networks, containers can resolve each other by name or alias. If needed, you can assign a specific IP address to a container (if the address is already used, you will get an error).

# docker network create --subnet 172.20.0.0/16 --ip-range 172.20.240.0/20 zabbix-net
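Running the create command a second time fails because the network already exists, so a script may want to check first. The helper below is a sketch with a hypothetical name; the subnet and range are the ones from the command above, and the static address in the commented example is an arbitrary pick from that range.

```shell
# Create the user-defined network only if it does not exist yet.
# Usage: ensure_network [name]
# The helper name is an assumption; subnet and range mirror the command above.
ensure_network() {
  net=${1:-zabbix-net}
  if docker network inspect "$net" >/dev/null 2>&1; then
    echo "network $net already exists"
  else
    docker network create --subnet 172.20.0.0/16 --ip-range 172.20.240.0/20 "$net"
  fi
}

# A container can then be given a fixed address from that subnet with --ip, e.g.:
# docker run --name zabbix-java-gateway-proxy --network=zabbix-net --ip 172.20.240.10 \
#   -d zabbix/zabbix-java-gateway:alpine-7.2.4
```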

Docker run

In this section, we have an example of docker run commands for two Zabbix components: Zabbix proxy and Java gateway. When using custom, user-defined networks, you can use container names for communication between containers instead of IP addresses. Here, the container name is used for the Zabbix Java gateway instead of its IP address. You can set a static IP address for your container or let Docker assign one, but make sure nothing breaks if the container later receives a different IP address. This can become an issue if you use an IP address in some configuration fields instead of a container name.

Many parameters are specified as environment variables with the -e option. Also, 3 ports are published on your host machine. To keep the SQLite3 database file after the container is deleted, the container directory holding the database file is mounted to a host directory (the proxy database is usually just a buffer that holds data until it is sent to the Zabbix server).

docker run --name zabbix-proxy-active-01 \
-e ZBX_HOSTNAME="Zabbix-proxy-active-01" \
-e ZBX_SERVER_HOST=46.101.140.98 \
-e ZBX_PROXYMODE="0" \
-e ZBX_JAVAGATEWAY_ENABLE=true \
-e ZBX_JAVAGATEWAY=zabbix-java-gateway-proxy \
-e ZBX_JAVAGATEWAYPORT=10052 \
-e ZBX_STARTJAVAPOLLERS=5 \
--network=zabbix-net \
-e ZBX_LISTENPORT=10101  \
-p 10101:10101 \
-p 10050:10050 \
-p 10051:10051 \
-v /var/lib/zabbix/db_data:/var/lib/zabbix/db_data \
--restart unless-stopped \
--init -d zabbix/zabbix-proxy-sqlite3:alpine-7.2.4

docker run --name zabbix-java-gateway-proxy \
--network=zabbix-net \
--restart unless-stopped \
-d zabbix/zabbix-java-gateway:alpine-7.2.4

You can start each of these Zabbix components using the docker run command; however, any change to the container configuration will require you to stop the container, delete it, and execute the docker run command again. Alternatively, you can create a docker compose file and write the necessary configuration in YAML format. When you need to change the container configuration, run the docker compose down command to remove the containers, edit the docker compose file, and run the docker compose up command to start them again with the new configuration:

docker compose -f ./docker_compose_v3_proxy.yaml down
docker compose -f ./docker_compose_v3_proxy.yaml up -d

If you have not mounted a volume or directory to the container for the data you want to keep, you can copy the data from the container to your host. Otherwise, that data will be gone if you delete the container or use the docker compose down command. It is therefore important to set up persistent storage for the data that needs retaining, so you don’t lose it when the container configuration changes. You also need to expose the ports of the necessary services for the appropriate components (if they are set up on separate hosts): zabbix-server, zabbix-proxy, zabbix-agent/zabbix-agent2 (default ports: 10050 for Zabbix agent passive mode, 10051 for Zabbix agent active mode, a separate port for the proxy, 10052 for Java gateway).
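Copying data out of a container before removing it can be done with docker cp, which works whether or not the container is running. The helper name and the backup directory in this sketch are assumptions; the container path is the one used for the proxy database in the examples above.

```shell
# Copy a directory out of a container before removing it.
# Usage: backup_container_dir <container> <path_in_container> <host_backup_dir>
# The helper name and the backup location are illustrative assumptions.
backup_container_dir() {
  container=$1
  src=$2
  dest=$3
  mkdir -p "$dest" || return 1
  docker cp "$container:$src" "$dest"
}

# Example: backup_container_dir zabbix-proxy-active-01 /var/lib/zabbix/db_data ./proxy-backup
```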

Here we have the same docker run options written to a docker compose file, including the environment variables, mounted directories, and exposed ports. You can specify as many services as needed and start them with a single docker compose command.

docker_compose_v3_proxy.yaml

services:
  zabbix-proxy-active-01:
    image: "${PROXY_SQLITE3_IMAGE}:${ALPINE_IMAGE_TAG}"
    environment:
      ZBX_HOSTNAME: Zabbix-proxy-active-01
      ZBX_SERVER_HOST: ${ZBX_SERVER_HOST}
      ZBX_PROXYMODE: 0
      ZBX_LISTENPORT: 10101
      ZBX_JAVAGATEWAY_ENABLE: "true"
      ZBX_JAVAGATEWAY: zabbix-java-gateway-proxy
      ZBX_JAVAGATEWAYPORT: 10052
      ZBX_STARTJAVAPOLLERS: 5
    volumes:
      - /var/lib/zabbix/db_data:/var/lib/zabbix/db_data:rw
    networks:
      - backend
    ports:
      - 10101:10101
      - 10050:10050
      - 10051:10051
    restart: unless-stopped

  zabbix-java-gateway-proxy:
    image: "${JAVA_GW_IMAGE}:${ALPINE_IMAGE_TAG}"
    networks:
      - backend
    restart: unless-stopped

networks:
  backend:
    name: zabbix-net
    external: true

.env

PROXY_SQLITE3_IMAGE=zabbix/zabbix-proxy-sqlite3
JAVA_GW_IMAGE=zabbix/zabbix-java-gateway
ALPINE_IMAGE_TAG=alpine-7.2.4
ZBX_SERVER_HOST=46.101.140.98

You can also use official Zabbix-supplied docker compose files, try them out, and modify them as needed.

You can read more about the official docker compose files here.

Containerized Zabbix components allow us to test different scenarios within Docker:

Creating HA Zabbix-server nodes
Creating multiple proxies
Creating multiple agents
Adding more Java gateways
Creating multiple frontends
Easily configure Browser monitoring
Configure SNMP traps
Easily make scheduled reports

Deploying multiple redundant Zabbix servers

To enable HA Zabbix server mode, modify both the Zabbix-server container and Zabbix-frontend container configuration environment variables.

For the HA Zabbix server mode, add 2 environment variables:

ZBX_HANODENAME
ZBX_NODEADDRESS

All of the containers are attached to the user-defined network, so I will use the container name in the ZBX_NODEADDRESS option instead of a static address, as it will be resolved by Docker. If you need a different listen port for the trapper, define it with the environment variable ZBX_LISTENPORT. You can omit the port in ZBX_NODEADDRESS, as the ZBX_LISTENPORT value (default is 10051) will be applied automatically.

Here is the docker run example for the Zabbix-server HA mode.

docker run --name zabbix-server-mysql-ha1 -t \
-e DB_SERVER_HOST="mysql-server" \
-e MYSQL_DATABASE="zabbix" \
-e MYSQL_USER="zabbix" \
-e MYSQL_PASSWORD="zabbix_pwd" \
-e MYSQL_ROOT_PASSWORD="root_pwd" \
-e ZBX_HANODENAME="zabbix-server-HA1" \
-e ZBX_NODEADDRESS="zabbix-server-mysql-ha1" \
--network=zabbix-net \
-p 10151:10051 \
--restart unless-stopped \
-d zabbix/zabbix-server-mysql:alpine-7.2.4

docker run --name zabbix-server-mysql-ha2 -t \
-e DB_SERVER_HOST="mysql-server" \
-e MYSQL_DATABASE="zabbix" \
-e MYSQL_USER="zabbix" \
-e MYSQL_PASSWORD="zabbix_pwd" \
-e MYSQL_ROOT_PASSWORD="root_pwd" \
-e ZBX_HANODENAME="zabbix-server-HA2" \
-e ZBX_NODEADDRESS="zabbix-server-mysql-ha2" \
--network=zabbix-net \
-p 10251:10051 \
--restart unless-stopped \
-d zabbix/zabbix-server-mysql:alpine-7.2.4

From the frontend container, remove these two environment variables:

ZBX_SERVER_HOST
ZBX_SERVER_PORT

With both variables removed, the frontend docker run command looks like this:

docker run --name zabbix-web-nginx-mysql -t \
-e DB_SERVER_HOST="mysql-server" \
-e MYSQL_DATABASE="zabbix" \
-e MYSQL_USER="zabbix" \
-e MYSQL_PASSWORD="zabbix_pwd" \
-e MYSQL_ROOT_PASSWORD="root_pwd" \
--network=zabbix-net \
-p 80:8080 \
--restart unless-stopped \
-d zabbix/zabbix-web-nginx-mysql:alpine-7.2.4

Once both container configurations are modified, you should be able to see the currently added HA server nodes and their states without issues.

Fig. 1. Containers of HA Zabbix server containers

Fig. 2. Dashboard – system information

You can also execute commands on the container:

# docker exec -it container_name_or_id sh -c "zabbix_server -R ha_status"

Fig. 3. Executing command on container

It’s possible to allocate an interactive pseudo-TTY shell by adding the -ti option and specifying a shell after the container name or ID.

# docker exec -ti container_name_or_id /bin/bash

Fig. 4. Executing command from within container

You can also start multiple proxies at once in docker. This can help to offload preprocessing to the proxy, gather data from the targets behind the firewall, and send collected data back to the Zabbix server, only requiring one port.

Fig. 5. Overall block diagram of Zabbix monitoring opportunities

Deploying multiple Zabbix proxies

First, you must choose the proxy mode and set the environment variable ZBX_PROXYMODE.

For active mode proxy, please define the server host address for a single server or addresses separated by a semicolon in the case of HA Zabbix server configuration (example shown below).

docker run --name zabbix-proxy-active-01 \
-e ZBX_HOSTNAME="Zabbix-proxy-active-01" \
-e ZBX_SERVER_HOST="zabbix-server-mysql-ha1;zabbix-server-mysql-ha2;zabbix-server-mysql-ha3" \
-e ZBX_PROXYMODE="0" \
--network=zabbix-net \
-e ZBX_LISTENPORT=10101  \
-p 10101:10101 \
-v /var/lib/zabbix/db_data:/var/lib/zabbix/db_data \
--restart unless-stopped \
--init -d zabbix/zabbix-proxy-sqlite3:alpine-7.2.4

For passive mode proxy, define the server host address for a single server or addresses separated by a comma in the case of HA Zabbix server configuration (example shown below).

docker run --name zabbix-proxy-passive-01 \
-e ZBX_HOSTNAME="Zabbix-proxy-passive-01" \
-e ZBX_SERVER_HOST="zabbix-server-mysql-ha1,zabbix-server-mysql-ha2,zabbix-server-mysql-ha3" \
-e ZBX_PROXYMODE="1" \
--network=zabbix-net \
-e ZBX_LISTENPORT=10102 \
-p 10102:10102 \
-v /var/lib/zabbix/db_data:/var/lib/zabbix/db_data \
--restart unless-stopped \
--init -d zabbix/zabbix-proxy-sqlite3:alpine-7.2.4

docker_compose_v3_proxies.yaml

services:
  zabbix-proxy-active-01:
    image: "${PROXY_SQLITE3_IMAGE}:${ALPINE_IMAGE_TAG}"
    environment:
      ZBX_HOSTNAME: zabbix-proxy-active-01
      ZBX_SERVER_HOST: zabbix-server-mysql-ha1;zabbix-server-mysql-ha2;zabbix-server-mysql-ha3
      ZBX_PROXYMODE: 0
      ZBX_LISTENPORT: 10101
    volumes:
      - /var/lib/zabbix/db_data:/var/lib/zabbix/db_data:rw
    networks:
      - backend
    ports:
      - 10101:10101
    restart: unless-stopped

  zabbix-proxy-passive-01:
    image: "${PROXY_SQLITE3_IMAGE}:${ALPINE_IMAGE_TAG}"
    environment:
      ZBX_HOSTNAME: zabbix-proxy-passive-01
      ZBX_SERVER_HOST: zabbix-server-mysql-ha1,zabbix-server-mysql-ha2,zabbix-server-mysql-ha3
      ZBX_PROXYMODE: 1
      ZBX_LISTENPORT: 10102
    volumes:
      - /var/lib/zabbix/db_data:/var/lib/zabbix/db_data:rw
    networks:
      - backend
    ports:
      - 10102:10102
    restart: unless-stopped
networks:
  backend:
    name: zabbix-net
    external: true

.env

PROXY_SQLITE3_IMAGE=zabbix/zabbix-proxy-sqlite3
JAVA_GW_IMAGE=zabbix/zabbix-java-gateway
ALPINE_IMAGE_TAG=alpine-7.2.4
ZBX_SERVER_HOST=46.101.140.98

The proxy name in the frontend must be the same as the value set in proxy environment variable ZBX_HOSTNAME! Also, in frontend for active proxies, you don’t need to add the proxy address.

Next, you can set hosts to be monitored by Zabbix-proxies, but make sure to update the agent configuration, so agents accept connections from proxy.
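Updating the agent configuration usually means changing the Server option (passive checks) and the ServerActive option (active checks). The helper below is a sketch under assumptions: its name is hypothetical, it only rewrites lines that already exist uncommented in the file, and the agent must be restarted afterwards for the change to take effect.

```shell
# Point an agent's Server and ServerActive options at a proxy.
# Usage: point_agent_at_proxy <agent_conf> <proxy_address> [proxy_port]
# The helper name and the default port are illustrative assumptions.
point_agent_at_proxy() {
  conf=$1
  proxy=$2
  port=${3:-10051}
  [ -f "$conf" ] || { echo "config not found: $conf" >&2; return 1; }
  # Rewrite existing, uncommented Server= and ServerActive= lines; keep a .bak copy.
  sed -i.bak \
    -e "s|^Server=.*|Server=$proxy|" \
    -e "s|^ServerActive=.*|ServerActive=$proxy:$port|" \
    "$conf"
}

# Example: point_agent_at_proxy /etc/zabbix/zabbix_agentd.conf zabbix-proxy-active-01 10101
```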

Fig. 6. Hosts monitored by proxy

Fig. 7. List of proxies and hosts monitored by them

Configuring Proxy groups

You can create as many proxy containers as necessary in Docker, and you can also create proxy groups for load balancing (it is based on the number of hosts per proxy).

First, create a proxy group in the frontend:

Set proxy group name
Select failover period
Minimum number of proxies

Fig. 8. Creating a new proxy group

Next, add proxies to the proxy group, and specify the address for active agents and port for the active agents.

Fig. 9. Adding proxy to proxy group

Do not forget to change the Zabbix agent configuration for hosts now monitored through the proxy group (add the IPs/DNS names of the proxies in the group to the Server and ServerActive options).

Fig. 10. Creating a new host and monitoring it through proxy group

You can see additional information regarding the proxies in the frontend section Administration → Proxies.

Fig. 11. List of all configured proxies and those belonging to proxy group

Adding more Java gateways

A Zabbix server or proxy can communicate with only one Zabbix Java gateway; however, you are not limited in how many Zabbix proxies you create together with a Zabbix Java gateway. You can make an unlimited number of pairs, each consisting of a Zabbix proxy and a Zabbix Java gateway.

For the containerized Zabbix server, you will need to add these 4 environment variables:

ZBX_JAVAGATEWAY_ENABLE=true
ZBX_JAVAGATEWAY=zabbix-java-gateway-server
ZBX_JAVAGATEWAYPORT=10052
ZBX_STARTJAVAPOLLERS=5

And start the Java gateway for the zabbix-server in docker:

docker run --name zabbix-java-gateway-server -t \
--network=zabbix-net \
--restart unless-stopped \
-d zabbix/zabbix-java-gateway:alpine-7.2.4

Or if you want to add java gateway to the Zabbix proxy, then add these 4 environment variables to Zabbix proxy in docker:

ZBX_JAVAGATEWAY_ENABLE=true
ZBX_JAVAGATEWAY=zabbix-java-gateway-proxy
ZBX_JAVAGATEWAYPORT=10052
ZBX_STARTJAVAPOLLERS=5

And start the java gateway as a container:

docker run --name zabbix-java-gateway-proxy -t \
--network=zabbix-net \
--restart unless-stopped \
-d zabbix/zabbix-java-gateway:alpine-7.2.4

And here we have a host monitored by zabbix-agent2 through zabbix-proxy-active-02:

Fig. 12. Host monitored by proxy with configured Java gateway

Upgrading docker proxies with SQLite3 database

If you have older Zabbix components already running in docker and you have upgraded the server, you will also need to upgrade the proxies.

If you have a container created from the proxy zabbix-proxy-sqlite3 image and want to upgrade it, you will lose the existing data stored in the SQLite3 database. For most users, the database functions as a buffer to temporarily keep the data until it’s sent to Zabbix server and the loss of the proxy database file data is of no consequence.

Once you have updated the image for the container, the proxy will detect the old database version on startup. If the directory containing the database file is mounted, the proxy will delete the database file and create a new one. This will impact those who keep data after sending it to the Zabbix server and use the data from the proxy database for other purposes.

Fig. 13. Database upgrade for proxy container with SQLite3 database

Upgrading docker proxies with MySQL database

To upgrade the MySQL database for the proxy, log in to the MySQL database and set the log_bin_trust_function_creators flag to 1. Then change the proxy image version to a newer one and start the container.

mysql> set global log_bin_trust_function_creators = 1;

If you have not set the flag, the database upgrade will fail with an error.

Fig. 14. Failed database upgrade for proxy with MySQL database

Replace the previous version of the proxy image with the new one, check the log file, and check the docker logs to see when the database schema upgrade has finished. After the upgrade, set the flag back to 0.

mysql> set global log_bin_trust_function_creators = 0;

The upgrade has been successful, and the proxy service has started after that.

Fig. 15. Successful database upgrade for proxy with MySQL database

An official Docker image for the proxy with PostgreSQL database support is not available due to the extensive number of existing images and different versions.

Deploying multiple frontends

You can launch as many frontends as you need if you are experiencing a sudden surge in Zabbix users. Just specify which port to assign for it and you are good to go (don’t forget to also open the port in the firewall).

docker run --name zabbix-web-nginx-mysql1 -t \
-e DB_SERVER_HOST="mysql-server" \
-e MYSQL_DATABASE="zabbix" \
-e MYSQL_USER="zabbix" \
-e MYSQL_PASSWORD="zabbix_pwd" \
-e MYSQL_ROOT_PASSWORD="root_pwd" \
--network=zabbix-net \
-p 80:8080 \
--restart unless-stopped \
-d zabbix/zabbix-web-nginx-mysql:alpine-7.2.4

Fig. 16. One started Zabbix frontend container in docker

docker run --name zabbix-web-nginx-mysql2 -t \
-e DB_SERVER_HOST="mysql-server" \
-e MYSQL_DATABASE="zabbix" \
-e MYSQL_USER="zabbix" \
-e MYSQL_PASSWORD="zabbix_pwd" \
-e MYSQL_ROOT_PASSWORD="root_pwd" \
--network=zabbix-net \
-p 81:8080 \
--restart unless-stopped \
-d zabbix/zabbix-web-nginx-mysql:alpine-7.2.4

Fig. 17. Two started Zabbix frontend containers in docker

docker run --name zabbix-web-nginx-mysql3 -t \
-e DB_SERVER_HOST="mysql-server" \
-e MYSQL_DATABASE="zabbix" \
-e MYSQL_USER="zabbix" \
-e MYSQL_PASSWORD="zabbix_pwd" \
-e MYSQL_ROOT_PASSWORD="root_pwd" \
--network=zabbix-net \
-p 82:8080 \
--restart unless-stopped \
-d zabbix/zabbix-web-nginx-mysql:alpine-7.2.4

Fig. 18. Three started Zabbix frontend containers in docker

Fig. 19. Multiple frontends accessed through different ports
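Since the three commands above differ only in the container name and the host port, they can be generated in a loop. The helper below is a sketch with a hypothetical name; the image, network, and environment values are copied from the commands above.

```shell
# Start several identical frontend containers on consecutive host ports.
# Usage: start_frontends <count> [first_port]
# The helper name is an assumption; all docker options mirror the commands above.
start_frontends() {
  count=$1
  first_port=${2:-80}
  i=1
  while [ "$i" -le "$count" ]; do
    port=$((first_port + i - 1))
    docker run --name "zabbix-web-nginx-mysql$i" -t \
      -e DB_SERVER_HOST="mysql-server" \
      -e MYSQL_DATABASE="zabbix" \
      -e MYSQL_USER="zabbix" \
      -e MYSQL_PASSWORD="zabbix_pwd" \
      -e MYSQL_ROOT_PASSWORD="root_pwd" \
      --network=zabbix-net \
      -p "$port:8080" \
      --restart unless-stopped \
      -d zabbix/zabbix-web-nginx-mysql:alpine-7.2.4 || return 1
    i=$((i + 1))
  done
}

# Example: start_frontends 3      (publishes host ports 80, 81, and 82)
```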

Browser monitoring

Browser monitoring setup has never been easier! Just add two parameters to zabbix-server container config:

ZBX_WEBDRIVERURL=selenium:4444
ZBX_STARTBROWSERPOLLERS=2

And start the web driver in the docker (with a standalone chrome browser):

docker run --name selenium -t \
--network=zabbix-net \
--restart unless-stopped \
-p 4444:4444 \
--shm-size="1g" \
-d selenium/standalone-chrome:latest
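Before pointing Zabbix at the web driver, it can help to confirm that Selenium is ready. The sketch below queries the server's /status endpoint with curl; the helper name and the default URL (the port published above, on localhost) are assumptions.

```shell
# Ask the Selenium server whether it is ready to accept sessions.
# Usage: webdriver_ready [base_url]
# The helper name and default URL are illustrative assumptions.
webdriver_ready() {
  url=${1:-http://localhost:4444}
  # Selenium answers on /status with JSON that contains "ready": true once it is up.
  curl -fsS "$url/status" | grep -Eq '"ready"[[:space:]]*:[[:space:]]*true'
}

# Example: webdriver_ready && echo "WebDriver is ready"
```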

Next step: create a new host, add the template, specify which page to monitor with Macro values, and it’s DONE!!!!

Fig. 20. Creating host for monitoring website

Fig. 21. Screenshot of the monitored website

SNMP traps

For SNMP traps to work, the same directory must be shared between the zabbix-server and zabbix-snmptraps containers. On the Zabbix server side, you need to explicitly set the environment variable ZBX_ENABLE_SNMP_TRAPS to true and mount the directory /var/lib/zabbix/snmptraps.

You also need to add the same volume to the snmptrap container.
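On the server side, the two settings described above might look like the sketch below. It is not taken from the original setup: the helper name is hypothetical, the container name, database values, and image tag are borrowed from the other examples in this post, and only the ZBX_ENABLE_SNMP_TRAPS variable and the mounted directory come from the text above.

```shell
# Start a Zabbix server container that reads traps written by the snmptraps container.
# Usage: start_server_with_traps [host_trap_dir]
# The helper name is an assumption; other values are reused from earlier examples.
start_server_with_traps() {
  trap_dir=${1:-/var/lib/zabbix/snmptraps}
  docker run --name zabbix-server-mysql -t \
    -e DB_SERVER_HOST="mysql-server" \
    -e MYSQL_DATABASE="zabbix" \
    -e MYSQL_USER="zabbix" \
    -e MYSQL_PASSWORD="zabbix_pwd" \
    -e ZBX_ENABLE_SNMP_TRAPS="true" \
    -v "$trap_dir:/var/lib/zabbix/snmptraps" \
    --network=zabbix-net \
    -p 10051:10051 \
    --restart unless-stopped \
    -d zabbix/zabbix-server-mysql:alpine-7.2.4
}

# Example: start_server_with_traps /var/lib/zabbix/snmptraps
```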

Then run the snmptraps container (make sure there is no permission issue for the directory):

docker run --name zabbix-snmptraps -t \
-v /var/lib/zabbix/snmptraps:/var/lib/zabbix/snmptraps:rw \
--network=zabbix-net \
-p 162:1162/udp \
--restart unless-stopped \
-d zabbix/zabbix-snmptraps:alpine-7.2-latest

Fig. 22. Received SNMP trap message

Scheduled reports

You can also easily configure scheduled reports by adding 2 additional environment variables to the Zabbix-server. In my case, both of these containers are in the same custom user network, therefore I will use the container name zabbix-web-service in the ZBX_WEBSERVICEURL option.

ZBX_STARTREPORTWRITERS=5
ZBX_WEBSERVICEURL=http://zabbix-web-service:10053/report

Start the Zabbix web service and specify these 2 parameters as well (you can skip them if the defaults are used). You can also allow any incoming connections by setting ZBX_ALLOWEDIP=0.0.0.0/0; however, we discourage this.

ZBX_ALLOWEDIP=zabbix-server-mysql
ZBX_LISTENPORT=10053
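The port in ZBX_WEBSERVICEURL on the server side has to match ZBX_LISTENPORT on the web service side. One way to keep the two from drifting apart is to derive the URL from the port, as in the rough sketch below. The function name webservice_url is hypothetical; the container name and port are the ones used in this article.

```shell
#!/bin/sh
# Hypothetical helper: build the report URL from the web service container
# name ($1) and its listen port ($2, default 10053).
webservice_url() {
    name="$1"
    port="${2:-10053}"
    printf 'http://%s:%s/report\n' "$name" "$port"
}

# Define the port once and reuse it for both containers.
ZBX_LISTENPORT=10053
ZBX_WEBSERVICEURL="$(webservice_url zabbix-web-service "$ZBX_LISTENPORT")"
echo "ZBX_WEBSERVICEURL=$ZBX_WEBSERVICEURL"
```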

Before testing scheduled reports, make sure you have enabled and configured the email media type.

Fig. 23. Configured and enabled media type

It is also a good idea to test the media type and check that you have received the test email.

Fig. 24. Successful media type test response

Fig. 25. Received test response on the selected media type.

Next, configure the user media where the scheduled report will be sent.

Fig. 26. Media type defined for the user

Last, but not least, set the frontend URL in the Administration/General/Other section. In my case, I use the container name of the frontend and specify the port.

for Apache: http://<server_ip_or_name>/zabbix
for Nginx: http://<server_ip_or_name>

Fig. 27. Configured frontend address for the Frontend URL option

Next, create a scheduled report based on the dashboard of your choice.

Fig. 28. Configuring scheduled report

Check that you have received the test report in your mail.

Fig. 29. Successful scheduled report test

Fig. 30. Received scheduled report test in the email

Now you know how to set up scheduled reports!

Docker container monitoring

You can also monitor Docker containers with a containerized Zabbix instance*

* Disclaimer: If the Docker service is not running, Zabbix monitoring will not function either, and you will not receive notifications and alerts.

You can also monitor your Docker instance with Zabbix agent 2. However, you will need to install Zabbix agent 2 on the host, either from a package or by building it from source.

You will also need to give the zabbix user access to the docker.sock file. Just add the zabbix user to the docker group:

# usermod -aG docker zabbix

Otherwise, you will get an error message in items:

Cannot fetch data: Get "http://1.28/info": dial unix /var/run/docker.sock: connect: permission denied.
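If the error persists after running usermod, it is worth checking that the group change is actually in place. The sketch below assumes the socket is owned by the docker group, as implied above; the helper name has_group is made up for illustration. Note also that a process that is already running normally keeps its old group list, so the agent most likely has to be restarted before the new membership takes effect.

```shell
#!/bin/sh
# Hypothetical helper: succeed if group $1 appears in the space-separated
# group list $2 (the format printed by "id -nG").
has_group() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Usage on the Docker host:
#   has_group docker "$(id -nG zabbix)" && echo "zabbix is in the docker group"
```

Padding both sides with spaces makes the match exact, so a group such as dockerroot is not mistaken for docker.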

Go back to the frontend and create a host for monitoring the Docker containers:

Link template: Docker by Zabbix agent 2
Add the host to a host group
Specify the host address or DNS name, set the correct Connect to option, and specify the agent port (10050 if the default port is used).

Fig. 31. Configuring the host for monitoring the docker container

Now, if something goes wrong with the other containers, Zabbix will detect it. But to be notified of an issue, don’t forget to enable and configure the media, user media, media templates, and trigger actions, so that you receive alerts.

Fig. 32. Latest data for the docker host

Thank you for reading – I hope you’ve found this article helpful and informative!

The post Deploying Zabbix Components with Docker and Docker Compose appeared first on Zabbix Blog.