All posts by Brian van Baekel

NetBox and Zabbix – An Integration that Just Fits

Post Syndicated from Brian van Baekel original https://blog.zabbix.com/netbox-and-zabbix-an-integration-that-just-fits/31404/

If you are running Zabbix, you know that it can be a tedious job to add hosts, link templates, and (even harder) make sure it is consistent with your CMDB. What if you already have a CMDB? In that case, it means you need to synchronize the CMDB with Zabbix…manually? Of course not!

Before we continue – this blog post and plugin both belong to Opensource ICT Solutions. We specialize in Zabbix (it’s our core business!) and as such try to make a living out of this open-source product. The plugin we will discuss is open source, and as such we do not have a commercial benefit from it – it’s brought to you by us, as a way to give back to the community (and maybe score some consultancy opportunities).

If you are familiar with NetBox already, it’s time to get excited. If you are not familiar with it, NetBox provides a powerful “single source of truth” for managing everything in your network: IP address management (IPAM), data center infrastructure management (DCIM), device inventory, rack layouts, cabling, virtual assets, and more. It’s built under the Apache 2.0 license, so the core software is fully open source, with an active community contributing plugins, integrations, and custom extensions. The platform is highly flexible – you can add custom fields, enforce custom validation and protection rules, integrate via REST and GraphQL APIs, and run multiple automations.

How cool would it be if you could use that in combination with Zabbix? If creating a new entity in your CMDB (your single source of truth) also synced it to Zabbix, you could focus on one product and always be assured that your monitoring is complete.

What are we solving?

Many of our customers use NetBox as their CMDB and Zabbix as their monitoring solution. The challenge they run into is keeping NetBox and Zabbix in sync — a task engineers don’t usually enjoy.

For customers who don’t use a CMDB (or at least not NetBox), there’s always the uncertainty of whether a host in Zabbix has the right templates and macros applied. While Zabbix does allow bulk updates, you still need detailed knowledge of each device’s role to keep things consistent.

NetBox, on the other hand, already stores much richer context about configuration items. A device or virtual machine can have a role, device type, tenant, and even its site or location defined. All that’s missing is a way to leverage this information to make sure those devices are monitored correctly in Zabbix.

On top of that, this approach makes it simple – if a device is registered in the CMDB (and therefore something you’re responsible for), it’s also monitored in the right way. From a project delivery perspective, documentation only needs to be done once, and it ensures that it’s actually done. In short: if it’s not in the CMDB, it’s not monitored — and therefore not our responsibility.

It also means the project delivery engineer(s) don’t necessarily need to know in depth how Zabbix works: as long as they can populate the CMDB – the monitoring will be taken care of automatically.

What did we develop?

In short, a native plugin for NetBox that communicates with the Zabbix API. Through that API, it gathers information such as the templates and macros that exist in your Zabbix environment. This is completely API-based, so in NetBox you just add a new Zabbix server and let it synchronize:

Zabbix netbox sync
Screenshot about a new Zabbix server in NetBox

At this point, nothing fancy happens. It is just establishing the connection and synchronizing templates, macros, etc. The rest of the configuration is done in your NetBox instance.
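To give an idea of what an API-based synchronization like this involves, here is a minimal Python sketch of the JSON-RPC request a client could send to the Zabbix API to list templates. This is not the plugin's own code: the helper name `build_template_request` and the selected output fields are assumptions for illustration. `template.get` is the standard Zabbix API method for retrieving templates.

```python
import json


def build_template_request(request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 payload asking Zabbix for template IDs and names."""
    return {
        "jsonrpc": "2.0",
        "method": "template.get",
        "params": {"output": ["templateid", "name"]},
        "id": request_id,
    }


payload = build_template_request()
# The serialized body would be POSTed to <zabbix-url>/api_jsonrpc.php,
# authenticated with an API token.
body = json.dumps(payload)
print(body)
```

The sketch only builds the request, so it runs without a Zabbix server. How the token is passed depends on the Zabbix version in use.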

How does it look?

We’ve got the native NetBox menu items, and for those already familiar with NetBox, the list below shows nothing new except for the “Zabbix” option:

  • Organization – Define sites, locations, and tenants to structure your infrastructure
  • Racks – Manage physical racks and their layout in data centers
  • Devices – Inventory of physical and virtual devices like servers, routers, and switches
  • Connections – Model physical cabling and logical connections between devices
  • Wireless – Manage wireless LANs, SSIDs, and related equipment
  • IPAM – IP Address Management: subnets, prefixes, IPs, and VRFs
  • VPN – Configure tunnels, peers, and VPN terminations
  • Virtualization – Track clusters, virtual machines, and virtual interfaces
  • Circuits – Manage provider circuits, WAN links, and related contracts
  • Power – Define power feeds, panels, and outlet connections
  • Provisioning – Support for building and automating device/service onboarding
  • Customization – Extend NetBox with custom fields, rules, and UI tweaks
  • Operations – Tools for workflows, jobs, and operational tasks
  • Admin – Administrative settings for users, groups, and global configuration

The Zabbix menu is new here and actually gives us control over what is present in Zabbix. The objects here should look familiar if you know Zabbix:

  • Servers
  • Proxies
  • Proxy Groups
  • Templates
  • Macros
  • Tags
  • Hostgroups
  • Maintenance

NetBox menu including Zabbix plugin

In the various NetBox native objects, there will be information regarding the Zabbix setup.

Is it available already?

Of course it is, otherwise this blog post would’ve been completely useless! Installation can be done via https://pypi.org/project/nbxsync.

We released our NetBox plugin under the GNU Affero General Public License v3 (AGPL-3.0) because it best protects both our work and the community. Unlike permissive licenses, AGPL ensures that anyone who modifies or extends the plugin must share their changes under the same license, even if the software is only offered as a service. This prevents closed forks, guarantees improvements flow back into the community, and aligns with the collaborative spirit of NetBox and Zabbix.

While AGPL still allows use in commercial environments, it prevents organizations from profiting off private modifications without contributing back. In short, AGPL-3.0 keeps the plugin fair, transparent, and truly open source. This is also the license Zabbix uses, so the community is already familiar with it.

We’ve open-sourced and released the code on our Opensource ICT Solutions GitHub, and it can be found here: https://github.com/OpensourceICTSolutions/nbxsync

We think documentation is important, as we’ve often had to discover for ourselves how something works due to a lack of documentation. We want to keep you out of that situation and have therefore created extensive documentation for this project. Obviously, we can help you when you are lost, but as that costs us time as well, it won’t be a free service. The documentation is available here: https://nbxsync.com.

As we think it’s great to work on a project together, we welcome community contributions. However, in order to accept any pull requests, please create an issue on our GitHub repo first. Please do read our development guidelines and understand that we are more than happy to incorporate suggestions/pull requests if they benefit the wider community.

Can I configure it myself?

Yes. We will assume you’ve got NetBox in place already. If not, please follow the official NetBox documentation to install it: https://netboxlabs.com/docs/netbox/installation/.

As it’s a native plugin, the installation is straightforward and well documented by NetBox: https://netboxlabs.com/docs/netbox/plugins/installation/. In our documentation, we provide the plugin-specific configuration. If this feels daunting, we’re more than happy to assist you with it as part of our consultancy offering.
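For orientation, enabling a NetBox plugin generally comes down to installing the package and listing it in NetBox's configuration.py. The fragment below is a sketch under assumptions: it assumes the plugin module name matches the PyPI package name (nbxsync), and it leaves the plugin settings empty because the available keys are described in the nbxsync documentation, not here.

```python
# configuration.py (NetBox), after installing the package with: pip install nbxsync
# Assumption: the plugin module name is the same as the PyPI package name.
PLUGINS = [
    "nbxsync",
]

# Plugin-specific settings would go here. They are deliberately left empty in
# this sketch; consult the nbxsync documentation for the supported keys.
PLUGINS_CONFIG = {
    "nbxsync": {},
}
```

The NetBox plugin installation guide linked above also covers the remaining steps, such as running the database migrations and restarting the NetBox services.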

So, with NetBox in place and the plugin installed, let’s actually walk through the NetBox configuration to give you a feeling of how it works. We will have to configure quite a bit in NetBox as a foundation, which hopefully is done already if you’ve got NetBox implemented in your organization.

In any case, we need to add one or multiple new Zabbix servers. We open the Zabbix menu and click on “Servers” where we add this server:

NetBox Zabbix Server configuration

Once added, NetBox will automatically synchronize with the Zabbix server and retrieve its templates, ready to be used! The macros are synchronized along with the templates, so they are also available in NetBox.

NetBox dictates that devices should be in a site, so we start with that. Under Organization → Sites we create a new site. A few fields are mandatory and populated in the screenshot below:

NetBox Sites Configuration

Name, Slug, and Status are mandatory. In a production setup, you probably want to populate some other fields as well, such as Tenant, Region, etc. But we are not writing a NetBox tutorial and as such we will completely ignore that. Once you are done, click on “Create” at the bottom of the configuration.

After the site has been created, it is time to add a Manufacturer under the menu “Devices.”

In this case we will add Cisco as a vendor:

Netbox Manufacturers configuration

Once done, click on “Create” at the bottom of the configuration. Of course you can (or should) add multiple vendors – all that you actually use!

The next step is the device type. We know the vendor now, but it is equally important to know what type of device we are monitoring. As such, we add a device type, again under the main menu “Devices.” In this example, we are going to add a CBS220 switch:

NetBox device type configuration

Once again, click on “Create” when you are done.

Last but not least, we need to add a device role. The device role is an important attribute because it helps us clearly define the function of the device within the network. By categorizing devices based on their role (such as router, switch, firewall, server, or access point) we create a structured overview that makes it much easier to manage, monitor, and troubleshoot the environment. Assigning roles also ensures consistency, improves documentation quality, and allows us to quickly identify the purpose of each device in larger infrastructures.

We go to Devices → Device Roles and add the role from there:

NetBox Device Role

Now we can finally add the device itself! This is what it is all about – the work we’ve done before is really just laying the foundation for this moment. We add a device which will eventually become a host in Zabbix, with all related properties pushed from its NetBox configuration.

So we navigate to Devices → Devices and add it from there:

NetBox Device configuration (truncated some fields)

After we save the device by clicking “Create,” NetBox immediately takes us to the newly created device’s detail page. Here we can see an overview of all the information we have just entered, such as the device name, role, site, rack position, and other attributes. This page acts as the central point for managing and extending the device configuration.

From here, we can add interfaces, assign IP addresses, connect cables, or link the device to virtual resources. In other words, once created the device record becomes the foundation for documenting its place and function in the network.

NetBox device overview with Zabbix options

In this screenshot, we can see already that there is a new tab “Zabbix” (just under the device name) and we’ve also got a new button “Sync Zabbix.”

In the “Zabbix” tab we should assign this device to a Zabbix server, as by default it will not get assigned to any. You might think this is a bit strange, especially if you’ve got only one Zabbix server. However, the mindset during development was that NetBox is typically used by MSPs, which have multiple Zabbix servers and might even need to assign multiple Zabbix servers to one device for operational reasons.

We open the tab “Zabbix” and click on “Add” next to the Zabbix Servers. A new configuration page opens and we select the server we just added:

NetBox Zabbix server assignment

When you click on “Create” the server is assigned. We can of course add a template to the device directly, but as we already know the vendor and type, there should be some inheritance!

Let’s go back to Devices → Manufacturers and click on the name of the vendor (Cisco) we just added. You will see that this object also got a new “Zabbix” tab. In this tab you can configure which templates, hostgroups, tags, and macros should always be used for this vendor. Here we will just add the template to this vendor, to show inheritance:

Netbox template inheritance

Once you’ve clicked on Create, navigate back to the device we made and observe how the template is inherited. As Zabbix also requires a host group and an interface, we are going to configure that now.

We will start with the host group, so click on Zabbix → Hostgroups. There we create a new one as per the screenshot below. One thing may look unusual in our configuration: we use Jinja2 templates instead of static names.

The object name is “Device site” but the actual value will resolve to the site name we created (OICTS HQ) earlier. The power here lies in the variables – if we create a new device for another site and link this hostgroup, it will automatically resolve to the correct site name with no need for static configurations anymore!
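The idea of a templated host group name can be illustrated with a few lines of Python. This sketch deliberately does not use Jinja2 itself: it is a tiny stand-in renderer built on the standard library, and the template expression `{{ site.name }}`, the context layout, and the second site name are hypothetical. The variables that are actually available are described in the plugin documentation.

```python
import re


def render(template: str, context: dict) -> str:
    """Resolve {{ dotted.path }} placeholders against nested dicts.

    A minimal stand-in for Jinja2 rendering, for illustration only.
    """
    def lookup(match: re.Match) -> str:
        value = context
        for key in match.group(1).split("."):
            value = value[key]
        return str(value)

    return re.sub(r"\{\{\s*([\w.]+)\s*\}\}", lookup, template)


hostgroup_template = "{{ site.name }}"             # hypothetical template value
device_a = {"site": {"name": "OICTS HQ"}}          # the site created earlier
device_b = {"site": {"name": "Branch Office"}}     # made-up second site

print(render(hostgroup_template, device_a))  # OICTS HQ
print(render(hostgroup_template, device_b))  # Branch Office
```

One host group definition thus yields a different group name per device, depending on the site the device belongs to.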

Of course, the host group should be assigned to a Zabbix server again:

NetBox Zabbix hostgroups

The next step is to create a Zabbix host interface, which is essential for monitoring and communication between Zabbix and the device. To do this, we leverage the IPAM (IP Address Management) functionality within NetBox.

IPAM provides a structured way to manage and allocate addresses across the network, ensuring consistency and avoiding conflicts. In this case, we navigate to IPAM → IP Addresses and add a new IP address that will serve as the management interface for the device. This IP address will later be linked to the Zabbix host configuration, allowing monitoring data to flow seamlessly.

NetBox IPAM config – IP address

If we now go back to Devices, open the device we want to configure, and select the “Zabbix” tab, we should add a host interface and a host group. Click on Add for the respective config and populate the minimum fields. For the host interface, that looks like this:

NetBox Zabbix host Interface

For the host group, there are fewer fields to fill in compared to other objects. All you need to do is select the appropriate group from the available options. This keeps the process straightforward and avoids unnecessary configuration.

Once saved, the host group will be correctly linked and ready for use in Zabbix:

NetBox Zabbix hostgroups

So the final result looks like this. At this point, all of the required elements have been configured in NetBox and properly linked to the Zabbix environment. The device now has its host group, host interface, and templates assigned, giving us a complete picture of how it will appear in monitoring.

What we see here is essentially the end-to-end outcome of the earlier configuration steps, where NetBox acts as the single source of truth and Zabbix automatically inherits the correct setup.

NetBox Zabbix device overview

Now it’s time to actually synchronize the device with Zabbix. At the top of the device detail page, right next to the device name, there is a button labeled “Sync Zabbix.” By clicking this button, NetBox will push all the information we’ve configured—such as interfaces, templates, and host groups—directly into Zabbix.

Within a few seconds, the host is created and fully ready for monitoring, without any manual setup inside Zabbix. With the heavy lifting automated, you can sit back and relax knowing that the device has been synchronized correctly.

Actually, let’s head over to Zabbix and confirm the synchronization:

Zabbix host overview from NetBox

Brilliant! The host is there, the template is linked, the host group automatically was set to “OICTS HQ” and the interface also looks correct. Monitoring will start and we did not touch Zabbix itself!

Want to see it in action?

Can do! We’ve created a YouTube video for you to actually see how it works. On top of that, we plan to host webinars regarding this plugin as well. You can register for all our webinars for free via the Zabbix website.

Is this it?

No! Actually there is a lot more we can do with this NetBox plugin, but it’s just that this blog post is not the correct place to show it all. Just to give you an idea, we can set maintenance from NetBox, which automatically will sync it to Zabbix. This way we again have a single source of truth and make sure we can see from a helicopter view where the impact is.

Furthermore, automatic synchronization can be set up so that any changes in Zabbix are overridden by the NetBox configuration. This way, we make sure there is no drift between NetBox and Zabbix. It also guarantees that if engineers forget to manually synchronize, no harm is done. However, the manual sync button will always be there, as nobody wants to wait to fix the monitoring when changes are made!

In addition, the plugin fully supports proxies and proxy groups – just as you know them from Zabbix. We just haven’t shown them here to keep this post somewhat short.

Roadmap

Although this project is just a side gig (we still dedicate our resources to Zabbix) we of course have a vision and roadmap that we would like to chase.

One major feature that’s on the roadmap is to show host problems in NetBox. By retrieving the current problems for a given host and showing them in NetBox, we should be able to limit the time spent in Zabbix even further. Our goal is to realize a “Single Pane of Glass” (just as NetBox is the “Single Source of Truth”).

The post NetBox and Zabbix – An Integration that Just Fits appeared first on Zabbix Blog.

Monitoring Zabbix Security Advisories

Post Syndicated from Brian van Baekel original https://blog.zabbix.com/monitoring-zabbix-security-advisories/28672/

Zabbix plays a crucial role in monitoring all kinds of “things” – IoT devices, domains, cloud infrastructures and more. It can also be integrated with third-party solutions – for example, with Oxidized for configuration backup monitoring. Given the nature of Zabbix, it usually contains a lot of confidential information as well as (more importantly) some kind of elevated access to network elements while being used by operators, engineers, and customers. This requires that Zabbix as a product should be as secure as possible.

Zabbix has upped their security game and is actively working with HackerOne to take full advantage of the reach of their global community by providing a bug bounty program. And though it doesn’t happen too often, from time to time a security issue arises in Zabbix or one of its dependencies, warranting the release of a Security Advisory.

The issue

Zabbix typically releases a Security Advisory and might even assign a CVE to the issue. Cool, that is what we expect from reputable software developers. They even inform their customers with support contracts before publishing the advisory, in order to allow them to patch installations beforehand.

Unfortunately, if you don’t have a support contract you’re expected to find out about these security advisories on your own, either by monitoring the Security Advisory page or by monitoring the published CVEs for Zabbix. NIST has a public API that can be used and that works well, but the issue with CVEs is that they are often incomplete and thus useless. For example, CVE-2024-22119 contains far less information than the advisory.

Currently, Zabbix does not publish an API for their Security Advisories. There is the public tracker which contains all entries and can be queried via API, but because it is unstructured text, it is really hard to parse.

The solution

We want to automatically be notified of new security advisories, and the only data source that contains all data in a structured way is the Zabbix Security Advisory page. However, structured doesn’t mean easily parseable – in fact, it is just raw HTML. We could try to solve this issue in Zabbix, but the easier solution in this case is to scrape the page and generate a JSON file which then can be parsed by Zabbix to achieve our goal, which is automated notifications of new advisories.

Webscraping

We’ve chosen to scrape the Zabbix site using Rust, utilizing the Scraper crate to parse the HTML and extract the relevant parts we want. Without going into too much detail, the interesting information is stored in 2 tables, one with the table-simple class applied and one with the table-vertical class applied. Using CSS selectors (which is what the Scraper crate requires), we can retrieve the information we want.

This information is then stored in a struct, which gets added to a hashmap. The result is stored in a vector, which is added to a struct, which eventually is used to generate the JSON we require. Phew.

The resulting JSON is easily parseable by Zabbix:

{
  "last_updated": {
    "secs": _unix timestamp in seconds_,
    "nanos": _nanoseconds_
  },
  "reports": [
    _list of reports_
  ]
}

The ‘reports’ array contains one entry per advisory, and each entry has the following layout. Unsurprisingly, this closely matches the information that is available on the Zabbix Security Advisory page:

    {
      "_zbxref_": {
        "zbxref": "_zbxref_",
        "cveref": "CVE-XXXX-XXXX",
        "score": X.X,
        "synopsis": "_synopsis_",
        "description": "_description_",
        "vectors": "_vectors_",
        "resolution": "_resolution_",
        "workaround": "_workaround_",
        "acknowledgement": "_acknowledgement_",
        "components": [
          _list of components_,
          _list of components_
        ],
        "affected_version": [
          {
            "affected": "_version_",
            "fixed": "_version_"
          }
        ]
      }
    }

Now, we could provide you with the code of the scraping tool and wish you good luck with making sure the tool runs every X hours and somehow, somewhere stores the resulting JSON for Zabbix to parse. That would be the easy way out, right?

Instead, we’ve chosen to host the Rust program as an AWS Lambda function, triggered every 2 hours by the AWS EventBridge Scheduler and with some code added to the Rust program (function?) to upload the resulting JSON to an AWS S3 bucket. This chain of AWS products not only makes sure that our cloud bill increases, but also guarantees we don’t have to host (and maintain!) anything ourselves.

The result? Just one HTTP GET away…

Template

TL;DR: Download the template here.

Now that the data is available in JSON, it’s fairly easy to parse it using Zabbix. Using the HTTP Agent data collection, we download the JSON from AWS. The URI is stored in the {$ZBX_ADVISORY_URI} macro, which allows for easy modification. By default, it points to the JSON file hosted on AWS S3. This retrieval is done by the Retrieve the Zabbix Security Advisories item, which acts as the source for every other operation. It retrieves the JSON every hour, and with the JSON being generated every 2 hours, the maximum delay between Zabbix publishing a new advisory and you getting it into Zabbix is 3 hours.

The Retrieve the Zabbix Security Advisories item acts as a master item for the Last Updated item. This item uses a JSONPath preprocessing step to extract the information we want: $.last_updated.secs. The resulting data is stored as unixtime so that we mere mortals can easily read when the last update of the JSON file was performed.

A trigger is configured for this item to ensure that the JSON file isn’t too old. The trigger JSON Feed is out of date has the following expression:
now()-last(/Zabbix Security Advisories/zbx_sec.last_updated)>{$ZBX_ADVISORY_UPDATE_INTERVAL}*{$ZBX_ADVISORY_UPDATE_THRESHOLD}

By default, {$ZBX_ADVISORY_UPDATE_INTERVAL} is set to 2 hours (which is the interval at which the file gets updated by our tool) and {$ZBX_ADVISORY_UPDATE_THRESHOLD} is set to 3. So, when the JSON file hasn’t been updated within the last 6 hours, this trigger fires.
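The arithmetic behind these defaults can be sketched in Python. This is an illustration of the staleness check described above (the age of the feed compared against interval times threshold), not the template itself; the function name and the example timestamp are made up.

```python
import time

UPDATE_INTERVAL = 2 * 3600  # mirrors {$ZBX_ADVISORY_UPDATE_INTERVAL}: 2 hours, in seconds
UPDATE_THRESHOLD = 3        # mirrors {$ZBX_ADVISORY_UPDATE_THRESHOLD}


def feed_is_stale(last_updated_secs, now=None):
    """Return True when the feed is older than interval * threshold (6 hours)."""
    now = time.time() if now is None else now
    return (now - last_updated_secs) > UPDATE_INTERVAL * UPDATE_THRESHOLD


now = 1_700_000_000  # arbitrary example timestamp
print(feed_is_stale(now - 5 * 3600, now))  # False: the feed is 5 hours old
print(feed_is_stale(now - 7 * 3600, now))  # True: the feed is 7 hours old
```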

The item Number of advisories uses the same principle, where a JSONPath preprocessing step is used to extract the information we want: $.reports. As $.reports is an array, we can apply functions to it, in this case .length(), which returns an integer. This number is used in the associated trigger A new Zabbix Security Advisory has been published, which simply fires when the value changes.

This is all very cool, but the JSON has a lot more information, including details about each report. In order to get these details into Zabbix, we use a discovery rule to ‘loop’ through the JSON and create items based on what we’ve discovered: Discover Advisories. This rule uses (again) a JSONPath preprocessing step to get the details we want: $.reports[*][*]. Based on the resulting data (which is a single report in this case), 2 LLD macros are assigned: {#ZBXREF} – based on the JSONPath $.zbxref and {#CVEREF} – based on the JSONPath $.cveref.
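What the discovery rule does with the feed can be mimicked in plain Python, which may help when reasoning about the JSONPath expressions. The sample data below is made up (the advisory and CVE references are placeholders) and only follows the report layout shown earlier; Zabbix performs these steps itself through preprocessing and LLD macros.

```python
import json

# Made-up sample feed following the layout described earlier
feed = json.loads("""
{
  "reports": [
    {"ZBV-EXAMPLE-1": {"zbxref": "ZBV-EXAMPLE-1", "cveref": "CVE-0000-0001"}},
    {"ZBV-EXAMPLE-2": {"zbxref": "ZBV-EXAMPLE-2", "cveref": "CVE-0000-0002"}}
  ]
}
""")

# $.reports[*][*]: every value of every object in the reports array
advisories = [detail for report in feed["reports"] for detail in report.values()]

# One discovery row per advisory, mapping the LLD macros to $.zbxref and $.cveref
lld_rows = [{"{#ZBXREF}": a["zbxref"], "{#CVEREF}": a["cveref"]} for a in advisories]

# $.reports.length(): the value behind the Number of advisories item
advisory_count = len(feed["reports"])

print(lld_rows)
print(advisory_count)  # 2
```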

For each discovered report, 8 items are created. They all work using the same principle, so I will only describe one: Advisory {#ZBXREF} / {#CVEREF} – Acknowledgement. This item uses the master item Retrieve the Zabbix Security Advisories, just like all other items described so far. JSONPath is once again used to get the information we want. The expression $.reports[*]["{#ZBXREF}"].acknowledgement.first() provides exactly what we need, where we combine an LLD macro ({#ZBXREF}) and a JSONPath function (.first()) to first ‘select’ the correct advisory in the JSON and then retrieve the value.
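As a rough Python equivalent of that JSONPath expression, the sketch below selects one advisory by its reference and returns a single field. The helper name `advisory_field` and the sample values are hypothetical; in Zabbix, the LLD macro is substituted into the expression before the preprocessing step runs.

```python
# Made-up sample data following the report layout described earlier
feed = {
    "reports": [
        {"ZBV-EXAMPLE-1": {"zbxref": "ZBV-EXAMPLE-1", "acknowledgement": "Reporter A"}},
        {"ZBV-EXAMPLE-2": {"zbxref": "ZBV-EXAMPLE-2", "acknowledgement": "Reporter B"}},
    ]
}


def advisory_field(feed, zbxref, field):
    """Rough equivalent of $.reports[*]["<zbxref>"].<field>.first()."""
    matches = [
        report[zbxref][field]
        for report in feed["reports"]
        if zbxref in report and field in report[zbxref]
    ]
    # .first() returns the first match; here None signals that nothing matched
    return matches[0] if matches else None


print(advisory_field(feed, "ZBV-EXAMPLE-2", "acknowledgement"))  # Reporter B
```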

All other 7 items work like this, with one exception: Advisory {#ZBXREF} / {#CVEREF} – Components. The ‘components’ value in the JSON file is actually an array with 1 or more items, describing which components might be affected. But we cannot store arrays in Zabbix, so we use another preprocessing step to convert the array into a string. A few lines of JavaScript are all we need:

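// 'value' holds the JSON array produced by the preceding preprocessing step
var components = JSON.parse(value);
// Array toString() joins the elements into one comma-separated string
return components.toString();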

First, we parse the JSON input (‘value’) into an array and then apply the JavaScript .toString() function to it. The toString method of arrays calls join() internally, which returns one string containing each array element separated by commas, which is exactly what we want.

To make working with these advisories easier, each item has the component tag applied, with the value zabbix_security. If the item belongs to an advisory, the advisory tag is added with the value of {#ZBXREF} (which is the advisory number/name). To make things even better, the type tag is also applied, with the actual type being, for example, ‘workaround’ or ‘description.’ That way, we can easily filter on all Zabbix Security items, on all items for a single advisory, or on all Zabbix Security items of a given type (‘score,’ et cetera) to quickly gain insight into the different advisories and their score, synopsis, description, components, et cetera.

Dashboard

The tags on the items allow for filtering, but with Zabbix 7.0 we can use all great new nifty features, such as the Item Navigator widget combined with the Item Value widget. Let’s take a look at what configuring such a dashboard might look like if you set up the Item Navigator widget as follows:

Item Navigator configuration

And then ‘link’ the Item Value widget to it:

You should get a somewhat decent dashboard. It isn’t perfect (given that the Item Value widget only seems to be able to display a single line of text) but it’s something.

Disclaimer

Though we use this functionality ourselves, this all comes without any guarantee. The technology used to retrieve data (screen scraping) is mediocre at best and could break at any moment if and when Zabbix changes the layout of their page.

The post Monitoring Zabbix Security Advisories appeared first on Zabbix Blog.

Making Patient Care Easier with Zabbix and Open-Future

Post Syndicated from Brian van Baekel original https://blog.zabbix.com/making-patient-care-easier-with-zabbix-and-open-future/28406/

The Antwerp University Hospital (UZA) is a university center known for top clinical and customer-friendly patient care, high-quality academic training, and groundbreaking scientific research with an important international dimension. The UZA has 593 hospital beds in 26 nursing units, as well as 41 highly specialized medical services where more than 800,000 patients are consulted every year and over 4,000 employees, including 642 doctors. Keep reading to see how Zabbix premium partner Open-Future rises to the challenge of monitoring this massive IT infrastructure.

The challenge

Due to the large number of users connecting on a daily basis, the UZA’s Zabbix server was set up as a virtual machine with a front-end separate from the Zabbix server and database. Splitting the front-end from the Zabbix server allows them to use dedicated resources for the front-end and the Zabbix server.

Most of the monitoring is done by Zabbix agents on Linux and Windows. To check whether the applications are working as they should, the Open-Future team leverages UserParameters and database monitoring with Zabbix Agent 2. For some more specific monitoring cases, we also make use of custom SQL scripts.

Because one server can have multiple teams responsible for just the application or the OS, getting the correct information to the right team proved to be a challenge. A simple solution was the creation of different trigger actions for every team that included only the triggers that were needed. Unfortunately this proved to be very difficult to manage over time and error-prone when changes were needed.

The solution

By making extensive use of tags in Zabbix, our team could add labels to the items and link them back to the correct user groups. This made it easier to send the right information to the correct teams and allowed them to both drastically reduce the number of actions that had to be created and simplify the actions that were created.
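The principle of tag-based routing can be sketched in a few lines of Python. The tag name `team`, its values, and the user group names below are hypothetical and not taken from the UZA setup; in Zabbix itself this logic lives in trigger action conditions that match on tags.

```python
# Hypothetical mapping from a "team" tag value to a Zabbix user group name
TEAM_TO_GROUP = {
    "linux": "Linux admins",
    "dba": "Database team",
    "app": "Application team",
}


def recipients(event_tags):
    """Return the user groups to notify for a problem, based on its tags."""
    return sorted(
        {
            TEAM_TO_GROUP[tag["value"]]
            for tag in event_tags
            if tag["tag"] == "team" and tag["value"] in TEAM_TO_GROUP
        }
    )


# A problem on a shared server that only concerns the database team
problem_tags = [
    {"tag": "team", "value": "dba"},
    {"tag": "component", "value": "postgresql"},
]
print(recipients(problem_tags))  # ['Database team']
```

With this approach, adding a team means adding one tag value and one mapping entry, instead of maintaining a separate action with its own hand-picked list of triggers.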

The results

Zabbix has proven itself as a powerful and versatile monitoring and management platform that allows our team to gain real-time insight into the performance of the UZA’s IT infrastructure and applications. Zabbix’s ability to collect and visualize various types of data (including network traffic, server load, application performance, and more) makes it easy to identify and resolve issues before they impact operations or patient care.

At present, Open-Future monitors about 1,400 hosts, a mix of Windows, Linux, and bare-metal systems monitored by proxies. This allows us to monitor more than 10,000 metrics with more than 55,000 triggers to notify us in case of any potential issues. We make use of custom templates, plugins, and scripts to gather all needed information.

The impact of Zabbix on our operational efficiency cannot be overstated. Automated alerts and reporting functionality let us respond quickly to incidents and issues, which reduces downtime and maximizes the availability of critical systems. This has direct benefits for the UZA’s patients, as we can make sure that vital systems like electronic medical records are always available and that the quality of care is maintained at the highest level.

The post Making Patient Care Easier with Zabbix and Open-Future appeared first on Zabbix Blog.

Monitoring Configuration Backups with Zabbix and Oxidized

Post Syndicated from Brian van Baekel original https://blog.zabbix.com/monitoring-configuration-backups-with-zabbix-and-oxidized/28260/

As a Zabbix partner, we help customers worldwide with all their Zabbix needs, ranging from building a simple template all the way to (massive) turn-key implementations, trainings, and support contracts. Quite often during projects, we get the question, “How about making configuration backups of our network equipment? We need this, as another tool was also capable of doing this!”

The answer is always the same – yes, but no. Yes, technically it is possible to get your configuration backups in Zabbix. It’s not even that hard to set up initially. However, you really should not want configuration backups. Zabbix is not made for them, and you will run into limitations within minutes. As you can imagine, the customer is never happy with this limitation, and some actively start to question where we think the limitation is to see if it is a limitation for them as well. So we simply set up an SSH agent item and get that config out:

Voila! Once per hour Zabbix will log in to that device, execute the command ‘show full-configuration,’ and get back the result. Frankly, it just works. You check the Monitoring -> Latest data section of the host and see that there is data for this item. Problem solved, right?

No. As a matter of fact, this is where the problems start. Zabbix allows us to store up to 64KB of data in an item value. The above screenshot is of a (small) FortiGate firewall, and its config, when stored in a text file, is just over 1.1MB. So, Zabbix truncates the data, which renders the backup useless – a restore will never work. At the same time, Zabbix is not sanitizing the output, so all secrets are left in it.

To make it even worse, it’s challenging to make a diff of different config versions/revisions – that feature is just not there. Most of the time, the customer is at this point convinced that Zabbix is not the right tool and the next question pops up – “Now what? How can we fix this?” This is where our added value is presented, as we do have a solution here which is rather affordable (free) as well.

The solution is Oxidized, which is basically Rancid on steroids. This project started years ago and is released under the Apache 2.0 license. We found it by accident, started playing around with it, and never left it. The project is written in Ruby and available on GitHub (https://github.com/ytti/oxidized). Incidentally, if you (or your company) have Ruby devs and want to give something back to the community, the Oxidized project is looking for extra maintainers!

At this point, we show our customers the GUI of Oxidized, which in our case involves just making backups of a few firewalls:

So we have the name, the model, and (in this case) just one group. The status shows whether the backup was successful or not, along with the last update and when the last change was detected. At the same time, under actions, we can get the full config file, look at previous revisions (and diff them), and use an ‘execute now’ option.

Looking at the versions, it’s simply showing this:

This is already giving us a nice idea of what is going on. We see the versions and dates at a glance, but the moment we check the diff option, we can easily see what was actually changed:

The perfect solution, except that it is not integrated with Zabbix. That means double administration and a lot of extra work, combined with the inevitable errors – devices not added, credential mismatches, connection errors, etc. Luckily, we can easily change the format of the above information from GUI to JSON by just adding ‘.json’ at the end of the URL:

http://<IP/DNS>:<PORT>/nodes.json

This will give the following output:

[
  {
    "name": "fw-mid-01",
    "full_name": "fw-mid-01",
    "ip": "192.168.4.254",
    "group": "default",
    "model": "FortiOS",
    "last": {
      "start": "2024-06-13 06:46:14 UTC",
      "end": "2024-06-13 06:46:54 UTC",
      "status": "no_connection",
      "time": 40.018852483
    },
    "vars": null,
    "mtime": "unknown",
    "status": "no_connection",
    "time": "2024-06-13 06:46:54 UTC"
  },
  {
    "name": "FW-HUNZE",
    "full_name": "FW-HUNZE",
    "ip": "192.168.0.254",
    "group": "default",
    "model": "FortiOS",
    "last": {
      "start": "2024-06-13 06:46:54 UTC",
      "end": "2024-06-13 06:47:04 UTC",
      "status": "success",
      "time": 10.029043912
    },
    "vars": null,
    "mtime": "2024-06-13 06:47:05 UTC",
    "status": "success",
    "time": "2024-06-13 06:47:04 UTC"
  }
]

As you might know, Zabbix is perfectly capable of parsing JSON and creating items and triggers out of it. A master item, dependent LLD (https://blog.zabbix.com/low-level-discovery-with-dependent-items/13634/), and within minutes you’ve got Oxidized making configuration backups while Zabbix is monitoring and alerting on the status:

At this point we’re getting close to a nice integration, but we haven’t overcome the double configuration management yet.
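
To make the idea concrete, here is a minimal Python sketch of what the dependent items and triggers effectively do with the nodes.json output: pick out every node whose last backup did not succeed. The sample data is a trimmed copy of the output above; the function name is ours and purely illustrative, not part of Oxidized or Zabbix.

```python
import json

# Trimmed copy of the nodes.json output shown above.
NODES_JSON = """
[
  {"name": "fw-mid-01", "model": "FortiOS", "status": "no_connection",
   "time": "2024-06-13 06:46:54 UTC"},
  {"name": "FW-HUNZE", "model": "FortiOS", "status": "success",
   "time": "2024-06-13 06:47:04 UTC"}
]
"""


def failed_backups(raw: str) -> list[str]:
    """Return the names of nodes whose last backup did not succeed."""
    return [node["name"] for node in json.loads(raw) if node["status"] != "success"]


print(failed_backups(NODES_JSON))
```

In Zabbix itself, the same check would live in a trigger prototype on the discovered ‘status’ item rather than in a script.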

Oxidized can read its list of devices from multiple sources, including a CSV file, SQL (for example SQLite or MySQL), or HTTP. The easiest is a CSV file – just make sure you’ve got all information in the correct columns and it works. An example:

Oxidized config:

source:
  default: csv
  csv:
    file: /var/lib/oxidized/router.db
    delimiter: !ruby/regexp /:/
    map:
      name: 0
      ip: 1
      model: 2
      username: 3
      password: 4
    vars_map:
      enable: 5

CSV file:

rtr01.local:192.168.1.1:ios:oxidized:5uP3R53cR3T:T0p53cR3t
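
To show how the column mapping lines up with the CSV line, here is a small Python sketch that splits the example line on the configured delimiter and applies the same ‘map’ and ‘vars_map’ indexes. It only illustrates the mapping; it is not how Oxidized (which is written in Ruby) implements its CSV source.

```python
import re

# Column mapping copied from the Oxidized config above.
FIELD_MAP = {"name": 0, "ip": 1, "model": 2, "username": 3, "password": 4}
VARS_MAP = {"enable": 5}
DELIMITER = re.compile(r":")


def parse_router_db_line(line: str) -> dict:
    """Split one router.db line and label the columns according to the maps."""
    columns = DELIMITER.split(line.strip())
    node = {field: columns[index] for field, index in FIELD_MAP.items()}
    node["vars"] = {var: columns[index] for var, index in VARS_MAP.items()}
    return node


node = parse_router_db_line("rtr01.local:192.168.1.1:ios:oxidized:5uP3R53cR3T:T0p53cR3t")
print(node["name"], node["model"], node["vars"]["enable"])
```

In this sketch, a value that itself contains a colon (a password, for instance) would shift every column after it, which is one more reason to look for a different source.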

Great, now we have to configure 2 places (Zabbix and Oxidized) and we get a username/password in cleartext in a CSV file. What about SQL as a source, and letting it connect to Zabbix? From there we should be able to get information regarding the hostname, but somehow we need the credentials as well. That’s not a default piece of information in Zabbix, but user macros can save us here.

So on our host we add 2 extra macros:

At the same time, we need to tell Oxidized what kind of device it is. There are multiple ways of doing this, obviously: a tag, a user macro, host groups, you name it. We chose to place a tag on the host:

Now we make sure that Oxidized is only taking hosts with the tag ‘oxidized’ and extract from them the host name, IP address, model, username, and password:

+-----------+----------------+---------+------------+------------------------------+
| host      | ip             | model   | username   | password                     |
+-----------+----------------+---------+------------+------------------------------+
| fw-mid-01 | 192.168.4.254  | fortios | <redacted> | <redacted>                   |
| FW-HUNZE  | 192.168.0.254  | fortios | <redacted> | <redacted>                   |
+-----------+----------------+---------+------------+------------------------------+

This way, we simply add our host in Zabbix, add the SSH credentials, and Oxidized will pick it up the next time a backup is scheduled. Zabbix will immediately start monitoring the status of those jobs and alert you if something fails.
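
The selection logic can be sketched in a few lines of Python. Note that this is not the SQL source described above: the sketch works on host records shaped like the output of the Zabbix API method host.get (with interfaces, tags, and macros selected), simply because that is easy to show without a database. The macro names and the idea that the tag value holds the Oxidized model are assumptions for illustration; the post does not show them.

```python
# Hypothetical names: the actual macro names are not shown in the post, and
# using the tag value as the Oxidized model is an assumption.
USER_MACRO = "{$OXIDIZED_USER}"
PASS_MACRO = "{$OXIDIZED_PASS}"
MODEL_TAG = "oxidized"

# Sample records in the shape returned by host.get (made-up values).
SAMPLE_HOSTS = [
    {
        "host": "fw-mid-01",
        "interfaces": [{"ip": "192.168.4.254"}],
        "tags": [{"tag": "oxidized", "value": "fortios"}],
        "macros": [
            {"macro": USER_MACRO, "value": "backup"},
            {"macro": PASS_MACRO, "value": "not-a-real-password"},
        ],
    },
    {
        "host": "web-01",
        "interfaces": [{"ip": "192.168.4.10"}],
        "tags": [],
        "macros": [],
    },
]


def oxidized_rows(hosts: list[dict]) -> list[dict]:
    """Keep only tagged hosts and flatten them into Oxidized-style rows."""
    rows = []
    for host in hosts:
        tags = {tag["tag"]: tag["value"] for tag in host.get("tags", [])}
        if MODEL_TAG not in tags:
            continue  # host is not meant to be backed up by Oxidized
        macros = {macro["macro"]: macro["value"] for macro in host.get("macros", [])}
        rows.append({
            "host": host["host"],
            "ip": host["interfaces"][0]["ip"],
            "model": tags[MODEL_TAG],
            "username": macros.get(USER_MACRO, ""),
            "password": macros.get(PASS_MACRO, ""),
        })
    return rows


rows = oxidized_rows(SAMPLE_HOSTS)
print([row["host"] for row in rows])
```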

This blog post is not meant as a complete integration write-up, but rather as a way to give some insight into how we as a partner operate in the field, taking advantage of the flexibility of the products we work with. This post should give you enough information to build it yourself, but of course we’re always available to help you or just build it as part of our consultancy offering.

The post Monitoring Configuration Backups with Zabbix and Oxidized appeared first on Zabbix Blog.

Monitor new Zabbix releases natively

Post Syndicated from Brian van Baekel original https://blog.zabbix.com/monitor-new-zabbix-releases-natively/28105/

In this blog post, I’ll guide you through building your own template to monitor the latest Zabbix releases directly from the Zabbix UI. Follow the simple walkthrough to learn how.

Introduction

With the release of Zabbix 7.0, it is possible to see which Zabbix version you are running and what the latest version is:

A great improvement, obviously, but (at least in 7.0.0rc1) I am missing triggers to notify me. Perhaps just as interesting, nothing is shown about older versions.

Once I saw the above screenshot, I became curious about where that data actually came from, and what’s available. A quick deep-dive into the source code ( https://git.zabbix.com/projects/ZBX/repos/zabbix/browse/ui/js/class.software-version-check.js#18 ) gave away the URL that is used for this feature ( https://services.zabbix.com/updates/v1 ). Once you visit that URL, you will get nicely formatted JSON output:

{
  "versions": [
    {
      "version": "5.0",
      "end_of_full_support": true,
      "latest_release": {
        "created": "1711361454",
        "release": "5.0.42"
      }
    },
    {
      "version": "6.0",
      "end_of_full_support": false,
      "latest_release": {
        "created": "1716274679",
        "release": "6.0.30"
      }
    },
    {
      "version": "6.4",
      "end_of_full_support": false,
      "latest_release": {
        "created": "1716283254",
        "release": "6.4.15"
      }
    }
  ]
}

And as you may know, Zabbix is quite good at parsing JSON formatted data. So, I built a quick template to get this going and be notified once a new version is released.
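
Before building it in Zabbix, it can help to see the intended logic in plain code. The following Python sketch works on a trimmed copy of the JSON above and answers the question the template should answer: is there a newer release for the branch I am running? The function names are ours; the version comparison assumes purely numeric versions such as 6.4.15 and would not handle suffixes like ‘rc1’.

```python
import json

# Trimmed copy of the JSON output shown above.
UPDATES = json.loads("""
{"versions": [
  {"version": "6.0", "end_of_full_support": false,
   "latest_release": {"created": "1716274679", "release": "6.0.30"}},
  {"version": "6.4", "end_of_full_support": false,
   "latest_release": {"created": "1716283254", "release": "6.4.15"}}
]}
""")


def latest_release(updates: dict, branch: str):
    """Return the latest release for a branch like '6.4', or None if unknown."""
    for entry in updates["versions"]:
        if entry["version"] == branch:
            return entry["latest_release"]["release"]
    return None


def as_tuple(version: str) -> tuple:
    """Turn '6.4.15' into (6, 4, 15) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def is_outdated(running: str, updates: dict) -> bool:
    """True if a newer release exists in the same branch as the running version."""
    branch = ".".join(running.split(".")[:2])
    latest = latest_release(updates, branch)
    if latest is None:
        return False  # branch not listed, nothing to compare against
    return as_tuple(running) < as_tuple(latest)


print(is_outdated("6.4.12", UPDATES))
```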

In the examples below I used my 6.4 instance, but this also works on 6.0 and of course 7.0.

Template building

Before we jump into the building part, it’s important to think of the best approach for this template. I think there are 2:

  • Create 1 HTTP item and a few dependent items for the various versions
  • Create 1 HTTP item, a LLD rule and a few item prototypes.

I prefer the LLD route, as it makes the template as dynamic as possible (less time to maintain it), and it is also more fun to build!

Let’s go.

First, you go to Data Collection -> Templates and create a new template there:

Of course, you can change the name of the template and the group. It’s completely up to you.

Once the template is created, it’s still an empty shell and we need to populate it. We will start with a normal item of type HTTP agent:

(note: screenshot is truncated)

We need to add 3 query fields:

  • ‘type’ with value ‘software_update_check’
  • ‘version’ with value ‘1.0’
  • ‘software_update_check_hash’ with a 64-character value: you can do funny things here 😉 For the example, I just used ‘here_are_exact_64_characters_needed_as_a_custom_hash_for_zabbix_’

As we go for the LLD route, I already set the “History storage period” to “Do not keep history”. While you are still building the template, it’s advised to keep the history so that you’ve got data to work with for the rest of the template. Once everything works, go back to this item and change the History storage period.

In the above screenshot, you can see I applied 2 preprocessing steps already.

The first is to replace the text ‘versions’ with ‘data’. This is done because Zabbix expects an array ‘data’ for its LLD logic. That ‘data’ is not available, so I just replaced the text ‘versions’. Quick and dirty.
The second preprocessing step is a “discard unchanged with heartbeat”. As long as there is no new release, I do not care about the data that came in, yet I want to store it once per day to make sure the item is still working. With this approach, we monitor the URL every 30 minutes so we get ‘quick’ updates but still do not use a lot of disk space.
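
The two steps can be mimicked in Python to show their effect. This is a rough sketch of the behaviour described above, not Zabbix code: the class and function names are ours, and timestamps are passed in explicitly so the example stays deterministic.

```python
def rename_versions_to_data(raw: str) -> str:
    """Step 1: plain text replacement of 'versions' with 'data'."""
    return raw.replace("versions", "data")


class DiscardUnchangedWithHeartbeat:
    """Step 2: store a value only if it changed or the heartbeat has expired."""

    def __init__(self, heartbeat_seconds: int):
        self.heartbeat = heartbeat_seconds
        self.last_value = None
        self.last_stored_at = None

    def process(self, value: str, now: int):
        """Return the value if it should be stored, otherwise None."""
        changed = value != self.last_value
        expired = (
            self.last_stored_at is None
            or now - self.last_stored_at >= self.heartbeat
        )
        if changed or expired:
            self.last_value = value
            self.last_stored_at = now
            return value
        return None


# Poll every 30 minutes for two days with an unchanged reply and a 1d heartbeat.
step = DiscardUnchangedWithHeartbeat(heartbeat_seconds=86400)
reply = rename_versions_to_data('{"versions": []}')
stored_at = [t for t in range(0, 2 * 86400 + 1, 1800) if step.process(reply, t) is not None]
print(stored_at)
```

Out of 97 polls, only three values are stored: the first one and one per heartbeat after that.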

The result of the preprocessing configuration:

Now it’s time to hit the “test all steps” button and see if everything works. The result you’ll get is:

{
  "data": [
    {
      "version": "5.0",
      "end_of_full_support": true,
      "latest_release": {
        "created": "1711361454",
        "release": "5.0.42"
      }
    },
    {
      "version": "6.0",
      "end_of_full_support": false,
      "latest_release": {
        "created": "1716274679",
        "release": "6.0.30"
      }
    },
    {
      "version": "6.4",
      "end_of_full_support": false,
      "latest_release": {
        "created": "1716283254",
        "release": "6.4.15"
      }
    }
  ]
}

This is almost identical to the information directly from the URL, except that ‘versions’ is replaced by ‘data’. Great. So as soon as you save this item, the releases are being monitored (don’t forget to link the template to a host, otherwise nothing will happen)!
At the same time, this information is not useful yet, as it’s just a blob of text. We need to parse it, and LLD is the way to go.

In the template, we go to “Discovery rules” and click on “Create discovery rule” in the upper right corner.
Now we create a new LLD rule, which is not going to query the website itself, but will get its information from the HTTP agent we’ve just created.

In the above screenshot, you see how it’s configured: a name, the type ‘Dependent item’, some key (just because Zabbix requires one), and the HTTP agent item we just created as the master item.

Now all data from the HTTP agent item is pushed into the LLD rule as soon as it’s received, and we need to create LLD macros out of it. So in the discovery rule, you jump to the third tab, ‘LLD macros’, and add a new macro there:

{#VERSION} with JSONPath $..version.first()
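
As a rough mental model of what this LLD macro does: every element of the ‘data’ array becomes one discovered entity, and the JSONPath is evaluated against that element to fill {#VERSION}. A minimal Python sketch of that behaviour, on a trimmed copy of the data (the function name is ours):

```python
import json

# Trimmed copy of the preprocessed master item value.
LLD_INPUT = json.loads("""
{"data": [
  {"version": "5.0", "latest_release": {"release": "5.0.42"}},
  {"version": "6.0", "latest_release": {"release": "6.0.30"}},
  {"version": "6.4", "latest_release": {"release": "6.4.15"}}
]}
""")


def lld_rows(document: dict) -> list[dict]:
    """One discovered entity per element of 'data', exposing {#VERSION}."""
    rows = []
    for element in document["data"]:
        # Equivalent of $..version.first() on this element:
        # take the 'version' value found in it.
        rows.append({"{#VERSION}": element["version"]})
    return rows


print(lld_rows(LLD_INPUT))
```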

Once this is done, save the LLD rule and let’s create some item prototypes.

The first item prototype is the most complicated, the rest are “copy/paste”, more or less.

We create a new item prototype that looks like this:

As the type is dependent and it is getting all its information from the HTTP agent master item, preprocessing is needed to filter out only the specific piece of information we want. You go to the preprocessing tab and add a JSONPath step there:

For copy/paste purposes: $.data.[?(@.version=='{#VERSION}')].latest_release.created.first()
There is quite some magic happening in that step. We tell it to use a JSONPath to find the correct value, but there is also a lookup:

[?(@.version=='{#VERSION}')]

What we are doing here is telling it to go into the data array and look for the object whose ‘version’ key has the value {#VERSION}. Of course, that {#VERSION} LLD macro is going to be replaced dynamically by the discovery rule with the correct version. Once it has found that version object, it goes in, finds the object ‘latest_release’, and from that object takes the value of ‘created’. Now we get back the epoch time of that release, and in the item we parse that with the unit ‘unixtime’.
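
For readers who find JSONPath filters hard to read, here is the same lookup written out in Python for the 6.4 branch, on a trimmed copy of the data. The helper name is ours; the conversion at the end shows what the ‘unixtime’ unit does with the epoch value.

```python
import json
from datetime import datetime, timezone

# Trimmed copy of the preprocessed master item value.
DOC = json.loads("""
{"data": [
  {"version": "6.0", "end_of_full_support": false,
   "latest_release": {"created": "1716274679", "release": "6.0.30"}},
  {"version": "6.4", "end_of_full_support": false,
   "latest_release": {"created": "1716283254", "release": "6.4.15"}}
]}
""")


def first_match(doc: dict, version: str, *path: str):
    """Find the first object in 'data' with the given version and follow the path."""
    for row in doc["data"]:
        if row["version"] == version:       # [?(@.version=='6.4')]
            value = row
            for key in path:                # .latest_release.created
                value = value[key]
            return value                    # .first()
    raise LookupError(f"version {version} not found")


created = int(first_match(DOC, "6.4", "latest_release", "created"))
released_at = datetime.fromtimestamp(created, tz=timezone.utc)
print(created, released_at.isoformat())
```

The other two item prototypes follow the same pattern with a different path, for example ‘end_of_full_support’ or ‘latest_release’ followed by ‘release’.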

Save the item, and immediately clone it to create the 2nd item prototype to get the support state:

Here we change the type of information and of course the preprocessing should be slightly different as we are looking for a different object:

JSONPath:

$.data.[?(@.version=='{#VERSION}')].end_of_full_support.first()

Save this item as well, and let’s create our last item to get the minor release number presented:

The preprocessing is again slightly different:

JSONPath:

$.data.[?(@.version=='{#VERSION}')].latest_release.release.first()

At this point you should have 1 master item, 1 LLD rule and 3 Item prototypes.

Now, create a new host, and link this template to it. Fairly quickly you should see data coming in, nicely parsed:

The only thing that is missing now is a trigger to get alerted once a new version has been released, so let’s go back into the template, discovery rule and find the trigger prototypes. Create a new one that looks like this:

Since we populated the event name as well, our problem name will reflect the most recent version already:

Enjoy your new template! 🙂

The post Monitor new Zabbix releases natively appeared first on Zabbix Blog.

Monitoring Juniper Mist wireless network

Post Syndicated from Brian van Baekel original https://blog.zabbix.com/monitoring-juniper-mist-wireless-network/19093/

As a Premium Zabbix partner, Opensource ICT Solutions is building Zabbix solutions all over the world. That means we have customers with a broad variety of requirements and thoughts on how to monitor things, which metrics are important, and how to alert on them. If one of those customers approaches us with a question concerning a task the likes of which we have never done before, it’s a challenge. And we love challenges! This blog post will cover one such challenge that we solved some time ago.

Quanza is a leading infrastructure operator offering a broad portfolio of services to completely take over the management of networks, data centers and cloud services. With more than 70 colleagues and at least as many specializations, everyone at Quanza works towards the same goal: designing, building, and operating an optimal IT infrastructure. Exactly like you would expect it… and then some. Quanza understands that you prefer to focus on your own innovation. By continuously mapping out your wishes, Quanza provides customized solutions that keep your network up and running 24×7. Today and in the future.

With a relentless focus on mission-critical environments, often of relevance to society, Quanza has an impressive line-up of customers. Some enterprises that chose to partner up with Quanza are SURF, Payvision, the Volksbank, and the Amsterdam Internet Exchange (AMS-IX), one of the world’s largest internet hubs.

Recently, customers started asking Quanza to embed Juniper MIST products for wired and wireless networks in their service portfolio. In order to fully support the network’s lifecycle (build, operate and innovate), the Juniper MIST products will need to be monitored by their 24×7 NOC. This is where we came into play, with our Zabbix knowledge.

We quickly decided to combine the knowledge Quanza has of the Juniper MIST equipment and API and our Zabbix knowledge to build the best possible monitoring solution.

SNMP or cloud?

The Juniper MIST solution is a cloud-based solution that provides a single pane of management for Juniper Networks products. As it’s cloud-based, it’s not a “traditional” network solution. As such, SNMP is not an option for device monitoring as they are communicating only with “the cloud” and we cannot access them directly like we used to do with traditional network equipment.

So, we started to investigate other options. One of the most common options right now is talking to some sort of API and pulling the metrics from that API. With the Zabbix “HTTP agent” item type, this is no problem at all. Unfortunately, that’s not how the MIST API works. It pushes data instead of letting you pull it (actually, pulling is possible, but it doesn’t scale at all). Now, the Zabbix HTTP agent item type allows trapping, but only in a specific Zabbix sender format. Of course, the MIST API does not support that format.

This means we have a problem. SNMP is not available. Pulling data is not a viable, scalable option. Pushing the data is an option, but Zabbix does not understand that.

Since we are not talking about some sort of proprietary monitoring tool which is completely closed and way too static, there is always a solution with Zabbix as long as you’re creative enough.

Getting data into Zabbix

We needed some middleware. Something that was able to receive that data from MIST and convert it into something that we can push into Zabbix.

That’s exactly what we did. We, together with Quanza, built a middleware that uses an API token to authenticate against the MIST API endpoint. Once the authentication is successful, the middleware is allowed to subscribe to certain “channels”. These channels provide event and performance data. You can compare it with MQTT, where a subscription to channels/topics is needed to get the information you are interested in.

Mist Middleware explained

  • Step 1:  Authenticate using an API token.
  • Step 2: Subscribe to channels
  • Step 3: Receive performance and event data
  • Step 4: Filter out only the relevant (performance) data for Zabbix
  • Step 5: Push into Zabbix
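
Steps 4 and 5 can be sketched in Python as follows. This is not the actual middleware: the list of relevant fields, the trapper item key, and the host name are hypothetical, and the sketch only builds the packet in the Zabbix sender format without opening a connection. The packet layout (the ‘ZBXD’ header, a flag byte, the length fields, and a ‘sender data’ JSON request) follows the Zabbix sender protocol as we understand it.

```python
import json
import struct

# Hypothetical choices for illustration only.
RELEVANT_KEYS = ("cpu_util", "port_stat", "radio_stat", "env_stat")
TRAPPER_KEY = "mist.performance"


def filter_relevant(message: dict) -> dict:
    """Step 4: keep only the performance fields we want in Zabbix."""
    return {key: message[key] for key in RELEVANT_KEYS if key in message}


def build_sender_packet(host: str, stats: dict) -> bytes:
    """Step 5: wrap the filtered data in a Zabbix sender request."""
    request = {
        "request": "sender data",
        "data": [{"host": host, "key": TRAPPER_KEY, "value": json.dumps(stats)}],
    }
    body = json.dumps(request).encode("utf-8")
    # Header: "ZBXD", protocol flag 0x01, then the data length and a
    # reserved field, both as 4-byte little-endian integers.
    return b"ZBXD\x01" + struct.pack("<II", len(body), 0) + body


message = {"mac": "<MAC ADDRESS>", "cpu_util": 2, "some_event": "ignored"}
packet = build_sender_packet("AP-site-01", filter_relevant(message))
print(len(packet), packet[:5])
```

Sending the whole filtered document as one value means one push per Access Point; everything else can then be split out on the Zabbix side.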

Once we had this in place, the MIST part was finished. We had our data and were able to push it into the monitoring solution.

Parsing in Zabbix

So, right now we have the data available for Zabbix. Time to find a neat way to use it. As the environments (both inventory and the types of equipment that are used) might be dynamic, we definitely do not want to apply any manual work to monitor newly added sites/equipment.

That means that low-level discovery rules are pretty much the only viable solution.

Here we go:

Describing host prototypes

Within Zabbix, we configure 1 host (the Discovery host) and apply a template on that host, with exactly 1 LLD rule: Query our middleware, and based on the information received, create new hosts (Host prototypes).

The data that is received looks like this:

{
"NODEID":"<NODEID>",
"NAME":"AP-<SITE>-<NUMBER>",
"SITENAME":"<SITENAME>",
"SITEID":"<SITEID>",
"MAC":"<MAC ADDRESS>",
"ORGNAME":"<ORG NAME>"
},

Those new/discovered hosts will have the names of the AP and corresponding organization and location (in Mist: site). We also link a template to the discovered host and add it to a Host group with the variables we’ll need later, such as the organization, site name, siteID etc.

So, we need to parse those JSON elements. Luckily, within the LLD rule config Zabbix provides the option to parse them into LLD macros. For example, the node ID is parsed into {#ID} with the JSONPath $.NODEID:

LLD macro configuration

Once this process is complete, we have a new host per AP. Of course, there is no data on that host and querying the middleware or Mist is a bad idea. Scalability will be extremely problematic with more than a few organizations and sites configured in the Mist environment. As we’re building this with a big network integrator, scalability is a thing and we do not want to risk having a noticeable performance impact by using polling.

How about pushing data from the middleware into Zabbix? Once the data is received from Mist by the middleware, it’s parsed, filtered and then it pushes out whatever must be pushed out to Zabbix. We decided the best option is to push per host as we have those already available in Zabbix.

Now we should ensure two things:

    • Do not overwhelm Zabbix with data being pushed in
    • Get all the data with the least number of ‘pushes’ into Zabbix

Again, the flexibility of Zabbix is extremely useful here. On the AP hosts, there is a template with exactly 1 trapper item: receive performance data. From there, everything will be handled by the Zabbix ‘Master/Dependent’ item concept. We then extract data like temperatures, CPU load, memory usage, etc.

At the same time, we receive data regarding network usage (interface statistics) and radio information. As we do not know upfront how many network interfaces and radios there are on a particular Access Point, we do not want to hard-code such information. Here we are combining the concept of low-level discovery with dependent items. (The following blog post covers the logic behind such an approach: Low-Level Discovery with Dependent items – Zabbix Blog.)

Using ‘low-level discovery with dependent Items’, all relevant items are created ‘dynamically’ in such a way that a change on the MIST side (for example a new type of Access Point) doesn’t require changes on the Zabbix side. Monitoring starts within minutes and you’ll never miss any problem that might arise!
Just to give you an idea of the flow:
The Master Item gets a JSON format like this (and we’ve parsed only a small portion here) pushed into it from the middleware:

{
"mac":"<MAC ADDRESS>",
"model":"<MODEL>",
"port_stat":{
"eth0":{
"up":true,
"speed":1000,
"full_duplex":true,
"tx_bytes":37291245,
"tx_pkts":169443,
"rx_bytes":123742282,
"rx_pkts":779249,
"rx_errors":0,
"rx_peak_bps":14184,
"tx_peak_bps":5444
}
},
"cpu_util":2,
"cpu_user":652611,
"cpu_system":901455,
"radio_stat":{
"band_5":{
"num_clients":<CLIENTS>,
"channel":<CHANNEL>,
"bandwidth":0,
"power":0,
"tx_bytes":0,
"tx_pkts":0,
"rx_bytes":0,
"rx_pkts":0,
"noise_floor":<NOISE>,
"disabled":true,
"usage":"5",
"util_all":0,
"util_tx":0,
"util_rx_in_bss":0,
"util_rx_other_bss":0,
"util_unknown_wifi":0,
"util_non_wifi":0
}
},
"env_stat":{
"cpu_temp":<CPU TEMP>,
"ambient_temp":<AMBIENT TEMP>,
"humidity":0,
"attitude":0,
"pressure":0
}
}

Within the Master item, we’re basically not parsing anything, it’s just there to receive the values and push them into the Dependent items. In the dependent items, we start “cherry-picking” only those metrics that we would like to see. As it’s JSON format, preprocessing step “JSONPath” comes in handy. At the same time, we’re looking into efficiency, so a second step is added: discard unchanged with heartbeat (1d):

Example: Getting out the statistics of the 2.4 GHz band radio:

Item prototype preprocessing

Of course, this has to be done with all items.
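
To give an idea of what the discovery rules and dependent items do with the master item value, here is a small Python sketch working on a trimmed copy of the JSON above. It discovers interfaces and radios from the keys of ‘port_stat’ and ‘radio_stat’ and then picks single metrics by path. The macro names {#IFNAME} and {#RADIO} are assumptions for illustration; the real template may use different ones.

```python
import json

# Trimmed copy of the value pushed into the master item.
MASTER_VALUE = json.loads("""
{
  "port_stat": {"eth0": {"up": true, "speed": 1000,
                         "rx_bytes": 123742282, "tx_bytes": 37291245}},
  "cpu_util": 2,
  "radio_stat": {"band_5": {"disabled": true, "tx_bytes": 0}}
}
""")


def discover(payload: dict, section: str, macro: str) -> list[dict]:
    """LLD: one entity per key found under the given section."""
    return [{macro: name} for name in payload.get(section, {})]


def dependent_value(payload: dict, *path: str):
    """Dependent item: follow a path, like a JSONPath preprocessing step."""
    value = payload
    for key in path:
        value = value[key]
    return value


interfaces = discover(MASTER_VALUE, "port_stat", "{#IFNAME}")
radios = discover(MASTER_VALUE, "radio_stat", "{#RADIO}")
rx_bytes = dependent_value(MASTER_VALUE, "port_stat", "eth0", "rx_bytes")
print(interfaces, radios, rx_bytes)
```

Because the entities come from whatever keys are present, an Access Point with a second interface or an extra radio band would be picked up without touching the template.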

So far, we’ve heavily focussed on the technical part, but Zabbix does have quite a few options to visualize the data as well. As we’re waiting on the next LTS release, we have only set up a very small dashboard with a few widgets. One of the better ones:

number_clients

Here we’re using the new graph type widget, but instead of plotting the number of clients per AP, we’re plotting a dataset with an “aggregate” function. Of course, if we look at the dashboard widgets, there are many more things that can be visualized…

Efficiency and security considerations

As we were building this, we had 2 main considerations:

    • Efficiency
    • Security

Efficiency matters because we anticipate that Quanza will be responsible for quite a few MIST environments on top of the current ones in the near future, and because the MIST API enforces a strict limit on the number of API calls. As such, it is really important to keep the number of API calls as low as possible. Next to that, with every new Access Point added, the load on the Zabbix server increases. That is not really a problem, as Zabbix is perfectly capable of monitoring thousands of metrics simultaneously, though it has its limits. And you do not want to hit those limits in a production environment with the only solution being migration to beefier hardware.

Security-wise, this challenge had a few things going on, since we’re talking to an externally exposed API. MIST can invoke webhooks. This might’ve been a bit easier (we explored it, but there were of course other things to keep in mind while going down that road), but the main concern was the requirement that Zabbix, or an interface to Zabbix, be exposed to the internet. That didn’t look too appealing and required a bit more maintenance. The preferable solution was to create that middleware, where we have full control over which queries are executed, how the API token is protected, which connections are established, and so on.

Conclusion

Although this question was challenging, together with Quanza we created a scalable, secure, and dynamic solution. Zabbix is flexible enough to facilitate the tricks required to provide reliable monitoring and alerting in an efficient and secure manner. We strongly believe the only limitation is your own creativity and this case proves that once again.

Quanza can now ensure the availability of their customers’ Juniper MIST-based networks, and in case something breaks, their 24×7 manned NOC will be able to take whatever action is required to restore it – all thanks to the flexibility of Zabbix.

The post Monitoring Juniper Mist wireless network appeared first on Zabbix Blog.

Low-Level Discovery with Dependent items

Post Syndicated from Brian van Baekel original https://blog.zabbix.com/low-level-discovery-with-dependent-items/13634/

Low-level discovery (LLD) was introduced in Zabbix 2.0 and is still one of the all-time favorites. Before LLD was available, adding items was all manual work: adding new disks, new interfaces, network ports on switches, and everything else was manual labor. Then LLD came around, and suddenly we were able to ‘discover’ entities and, based on those discovered entities, add new items, triggers, and such automatically.

Contents

  • Low-Level Discovery setup
  • Dependent items
  • Combining Low-Level Discovery and Dependent items
  • Conclusion

For a video guide, check out the Zabbix YouTube here: Zabbix: Low Level Discovery with Dependent items – YouTube

Low-Level Discovery setup

Let’s go over the idea of Low-Level Discovery first.

For the sake of clarity, we will stick with the default Zabbix agent item. Of course, as we will discover, it’s only the format that matters for Zabbix to consider a response as LLD information. Let’s use the built-in agent key vfs.fs.discovery. Once we force the Zabbix agent to execute this item, it will reply with something like this:

[{"{#FSNAME}":"/sys","{#FSTYPE}":"sysfs"},{"{#FSNAME}":"/proc","{#FSTYPE}":"proc"},{"{#FSNAME}":"/dev","{#FSTYPE}":"devtmpfs"},{"{#FSNAME}":"/sys/kernel/security","{#FSTYPE}":"securityfs"},{"{#FSNAME}":"/dev/shm","{#FSTYPE}":"tmpfs"},{"{#FSNAME}":"/dev/pts","{#FSTYPE}":"devpts"},{"{#FSNAME}":"/run","{#FSTYPE}":"tmpfs"},{"{#FSNAME}":"/sys/fs/cgroup","{#FSTYPE}":"tmpfs"},{"{#FSNAME}":"/sys/fs/cgroup/systemd","{#FSTYPE}":"cgroup"},{"{#FSNAME}":"/sys/fs/pstore","{#FSTYPE}":"pstore"},{"{#FSNAME}":"/sys/firmware/efi/efivars","{#FSTYPE}":"efivarfs"},{"{#FSNAME}":"/sys/fs/bpf","{#FSTYPE}":"bpf"},{"{#FSNAME}":"/sys/fs/cgroup/net_cls,net_prio","{#FSTYPE}":"cgroup"},{"{#FSNAME}":"/sys/fs/cgroup/devices","{#FSTYPE}":"cgroup"},{"{#FSNAME}":"/sys/fs/cgroup/hugetlb","{#FSTYPE}":"cgroup"},{"{#FSNAME}":"/sys/fs/cgroup/memory","{#FSTYPE}":"cgroup"},{"{#FSNAME}":"/sys/fs/cgroup/rdma","{#FSTYPE}":"cgroup"},{"{#FSNAME}":"/sys/fs/cgroup/freezer","{#FSTYPE}":"cgroup"},{"{#FSNAME}":"/sys/fs/cgroup/cpu,cpuacct","{#FSTYPE}":"cgroup"},{"{#FSNAME}":"/sys/fs/cgroup/cpuset","{#FSTYPE}":"cgroup"},{"{#FSNAME}":"/sys/fs/cgroup/perf_event","{#FSTYPE}":"cgroup"},{"{#FSNAME}":"/sys/fs/cgroup/blkio","{#FSTYPE}":"cgroup"},{"{#FSNAME}":"/sys/fs/cgroup/pids","{#FSTYPE}":"cgroup"},{"{#FSNAME}":"/sys/kernel/tracing","{#FSTYPE}":"tracefs"},{"{#FSNAME}":"/sys/kernel/config","{#FSTYPE}":"configfs"},{"{#FSNAME}":"/","{#FSTYPE}":"xfs"},{"{#FSNAME}":"/sys/fs/selinux","{#FSTYPE}":"selinuxfs"},{"{#FSNAME}":"/proc/sys/fs/binfmt_misc","{#FSTYPE}":"autofs"},{"{#FSNAME}":"/dev/hugepages","{#FSTYPE}":"hugetlbfs"},{"{#FSNAME}":"/dev/mqueue","{#FSTYPE}":"mqueue"},{"{#FSNAME}":"/sys/kernel/debug","{#FSTYPE}":"debugfs"},{"{#FSNAME}":"/sys/fs/fuse/connections","{#FSTYPE}":"fusectl"},{"{#FSNAME}":"/boot","{#FSTYPE}":"ext4"},{"{#FSNAME}":"/boot/efi","{#FSTYPE}":"vfat"},{"{#FSNAME}":"/home","{#FSTYPE}":"xfs"},{"{#FSNAME}":"/run/user/0","{#FSTYPE}":"tmpfs"}]

When we put this in a more readable format (truncated) it will look like this:

[
{
"{#FSNAME}":"/sys",
"{#FSTYPE}":"sysfs"
},
{
"{#FSNAME}":"/proc",
"{#FSTYPE}":"proc"
},
{
"{#FSNAME}":"/dev",
"{#FSTYPE}":"devtmpfs"
},
{
"{#FSNAME}":"/sys/kernel/config",
"{#FSTYPE}":"configfs"
},
{
"{#FSNAME}":"/",
"{#FSTYPE}":"xfs"
},
{
"{#FSNAME}":"/boot",
"{#FSTYPE}":"ext4"
},
{
"{#FSNAME}":"/home",
"{#FSTYPE}":"xfs"
}
]

In this format it suddenly becomes clear: we have the {#FSNAME} macro, with the name of a filesystem, combined with the type, captured in {#FSTYPE}.

Perfect! We feed this information into Zabbix, and LLD magic will happen.
Based on the Item prototypes, new items per {#FSNAME} will be added, and monitoring will start on those items.
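
What “new items per {#FSNAME}” means can be shown with a few lines of Python: the macros in an item prototype key are replaced with the values of each discovered entity. The prototype key below uses the standard agent key vfs.fs.size with the ‘pused’ mode as an example; the function name is ours.

```python
# A few of the discovered entities from the output above.
DISCOVERED = [
    {"{#FSNAME}": "/", "{#FSTYPE}": "xfs"},
    {"{#FSNAME}": "/boot", "{#FSTYPE}": "ext4"},
    {"{#FSNAME}": "/home", "{#FSTYPE}": "xfs"},
]

# Example item prototype key: used space in percent per filesystem.
PROTOTYPE_KEY = "vfs.fs.size[{#FSNAME},pused]"


def expand(prototype: str, entity: dict) -> str:
    """Replace every LLD macro in the prototype with the entity's value."""
    key = prototype
    for macro, value in entity.items():
        key = key.replace(macro, value)
    return key


item_keys = [expand(PROTOTYPE_KEY, entity) for entity in DISCOVERED]
print(item_keys)
```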

Looking at the Item prototypes, they look a lot like normal items:

So, we have one discovery rule that is responsible for providing the LLD information, and then the created ‘normal’ items to query the filesystem statistics. As you can imagine, with just 5 filesystems and 1 metric per filesystem, queried once per minute, that is no problem. But what if we have 50 filesystems, 7 metrics per filesystem, and they get queried every 10 seconds? That’s a lot of queries against the host! Not only does that add load to the Zabbix server, but obviously also to the monitored host. It works, but is it ideal? It certainly isn’t!
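
The difference between those two scenarios is easy to put into numbers. A quick back-of-the-envelope calculation in Python (the function name is ours):

```python
def queries_per_minute(filesystems: int, metrics_per_fs: int, interval_seconds: int) -> float:
    """Number of individual item checks the host has to answer per minute."""
    return filesystems * metrics_per_fs * 60 / interval_seconds


small = queries_per_minute(5, 1, 60)    # 5 filesystems, 1 metric, once per minute
large = queries_per_minute(50, 7, 10)   # 50 filesystems, 7 metrics, every 10 seconds
print(small, large)
```

That is 5 checks per minute in the first case against 2,100 per minute (35 per second) in the second, for a single host.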

So we’ve basically just set up this:

Dependent items

But then Zabbix introduced dependent items. Let’s take a quick look at what they are.

We have one master item that gathers all information (in bulk) and propagates that information to all the dependent items. On those dependent items we just do the cherry picking and filtering of the relevant metrics. Let’s put this to work and see how that goes.

So we create an item, with, in this case, the http agent type, which will collect the following information regarding the server status in a single request:

ServerVersion: Apache/2.4.37 (centos)
ServerMPM: event
Server Built: Nov  4 2020 03:20:37
CurrentTime: Monday, 08-Mar-2021 14:35:20 CET
RestartTime: Monday, 08-Mar-2021 11:04:09 CET
ParentServerConfigGeneration: 1
ParentServerMPMGeneration: 0
ServerUptimeSeconds: 12671
ServerUptime: 3 hours 31 minutes 11 seconds
Load1: 0.01
Load5: 0.03
Load15: 0.00
Total Accesses: 1182
Total kBytes: 10829
Total Duration: 95552
CPUUser: 5.01
CPUSystem: 7.34
CPUChildrenUser: 0
CPUChildrenSystem: 0
CPULoad: .0974667
Uptime: 12671
ReqPerSec: .0932839
BytesPerSec: 875.14
BytesPerReq: 9381.47
DurationPerReq: 80.8393
BusyWorkers: 1
IdleWorkers: 99
Processes: 4
Stopping: 0
BusyWorkers: 1
IdleWorkers: 99
ConnsTotal: 4
ConnsAsyncWriting: 0
ConnsAsyncKeepAlive: 0
ConnsAsyncClosing: 0
Scoreboard: _________________________________________________________________________________________W__________............................................................................................................................................................................................................................................................................................................

 

Now, we create some dependent items, that depend on that first item (which we will call the Master item). Every time the Master item receives information, the complete reply will be pushed to the dependent items, without any altering of that data. So the master and dependent items are identical when no preprocessing is applied. That’s why on the dependent items we apply preprocessing to filter relevant information, for example, the BusyWorkers:
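The preprocessing screenshot is not reproduced here. In Zabbix itself this would typically be a ‘Regular expression’ preprocessing step, for instance with the pattern BusyWorkers: (\d+) and output \1; that exact pattern is an assumption. The same cherry picking can be sketched on the command line, with extract_metric being a hypothetical helper:

```shell
# Print the first value reported for a metric in the server status text.
# $1 = metric name, stdin = the text collected by the master item
extract_metric() {
    awk -F': ' -v name="$1" '$1 == name && !seen { print $2; seen = 1 }'
}

printf 'Load1: 0.01\nBusyWorkers: 1\nIdleWorkers: 99\n' | extract_metric BusyWorkers   # prints 1
```

Every dependent item runs its own small filter like this against the same reply, so the web server is only queried once.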

Perfect. We query the host once, get all the metrics in bulk, and then parse them in Zabbix using preprocessing. Say bye to excessive load on the monitored host… (and thanks to the preprocessing processes within Zabbix, no problem on the Zabbix server side either).

Combining Low-Level Discovery and Dependent items

Ok, and what if we combine these two concepts? LLD with dependent items? Wouldn’t that be the ultimate goal? Automatically creating new items without putting extra load on the monitored host? Let’s get this going!

To stick with the first example of LLD, we will discover filesystems, but this time with the newly introduced vfs.fs.get key instead of the vfs.fs.discovery key. Once we force the agent to execute this key, we will see this reply:

[{"fsname":"/dev","fstype":"devtmpfs","bytes":{"total":1940963328,"free":1940963328,"used":0,"pfree":100.000000,"pused":0.000000},"inodes":{"total":473868,"free":473487,"used":381,"pfree":99.919598,"pused":0.080402}},{"fsname":"/dev/shm","fstype":"tmpfs","bytes":{"total":1958469632,"free":1958469632,"used":0,"pfree":100.000000,"pused":0.000000},"inodes":{"total":478142,"free":478141,"used":1,"pfree":99.999791,"pused":0.000209}},{"fsname":"/run","fstype":"tmpfs","bytes":{"total":1958469632,"free":1892040704,"used":66428928,"pfree":96.608121,"pused":3.391879},"inodes":{"total":478142,"free":477519,"used":623,"pfree":99.869704,"pused":0.130296}},{"fsname":"/sys/fs/cgroup","fstype":"tmpfs","bytes":{"total":1958469632,"free":1958469632,"used":0,"pfree":100.000000,"pused":0.000000},"inodes":{"total":478142,"free":478125,"used":17,"pfree":99.996445,"pused":0.003555}},{"fsname":"/","fstype":"xfs","bytes":{"total":95516360704,"free":55329644544,"used":40186716160,"pfree":57.926877,"pused":42.073123},"inodes":{"total":46661632,"free":46535047,"used":126585,"pfree":99.728717,"pused":0.271283}},{"fsname":"/boot","fstype":"ext4","bytes":{"total":1023303680,"free":705544192,"used":247296000,"pfree":74.046435,"pused":25.953565},"inodes":{"total":65536,"free":65497,"used":39,"pfree":99.940491,"pused":0.059509}},{"fsname":"/home","fstype":"xfs","bytes":{"total":5358223360,"free":5286903808,"used":71319552,"pfree":98.668970,"pused":1.331030},"inodes":{"total":2621440,"free":2621428,"used":12,"pfree":99.999542,"pused":0.000458}},{"fsname":"/run/user/0","fstype":"tmpfs","bytes":{"total":391692288,"free":391692288,"used":0,"pfree":100.000000,"pused":0.000000},"inodes":{"total":478142,"free":478137,"used":5,"pfree":99.998954,"pused":0.001046}}]

And if we format it to be more readable, it will look like this (truncated):

[
  {
    "fsname":"/",
    "fstype":"xfs",
    "bytes":{
      "total":95516360704,
      "free":55329644544,
      "used":40186716160,
      "pfree":57.926877,
      "pused":42.073123
    },
    "inodes":{
      "total":46661632,
      "free":46535047,
      "used":126585,
      "pfree":99.728717,
      "pused":0.271283
    }
  },
  {
    "fsname":"/home",
    "fstype":"xfs",
    "bytes":{
      "total":5358223360,
      "free":5286903808,
      "used":71319552,
      "pfree":98.668970,
      "pused":1.331030
    },
    "inodes":{
      "total":2621440,
      "free":2621428,
      "used":12,
      "pfree":99.999542,
      "pused":0.000458
    }
  }
]

Per filesystem, we get the original information (FSNAME and FSTYPE), but also the statistics of these filesystems… bulk metrics! So, we create a normal item (which will serve as the master item) that gets all those metrics in a single query:

Once we’ve got this data in Zabbix, we feed it into the LLD rule, giving this LLD rule the dependent LLD type:

Of course there are no ready-to-use LLD macros in this data, but since it is in JSON format, it shouldn’t be too hard to create them with the ‘LLD macros’ option in the frontend and the relevant JSONPath expressions:

Note: Technically we do not need to create the {#FSTYPE} macro to get this working!
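The screenshot with the macro definitions is not reproduced here. As a sketch, assuming the vfs.fs.get output shown above, the ‘LLD macros’ tab could map each macro to a JSONPath that is evaluated against one array element at a time:

```text
{#FSNAME}   $.fsname
{#FSTYPE}   $.fstype
```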

Once this is done, we should be ready to create the item prototypes for this LLD rule. The data is there, macros are available, nothing is going to stop us now!

Let’s move on to item prototypes. But of course, we do not want to poll that remote host again per discovered filesystem. That means we will make this item prototype of the dependent item type as well, pointing it back to the master item we’ve created.

For the first item prototype, we want to obtain the total size per filesystem:

But, as I mentioned earlier: a dependent item without any preprocessing is identical to the master item and of course that would be wrong in this case. We just want to see the total bytes per filesystem and not all the collected statistics. In the configuration above we already know what to get out, so the Type of information and Units are filled already. What is not visible on that screenshot is the preprocessing rule that we need. Here the ‘JSONPath’ preprocessing step comes in handy since we receive JSON data. We would like to get out this part for our item (truncated):

[
  {
    "fsname":"/",
    "fstype":"xfs",
    "bytes":{
      "total":95516360704,
      "free":55329644544,
      "used":40186716160,
      "pfree":57.926877,
      "pused":42.073123

So, if we try to get this information out using JSONPath, it should look like: $.bytes.total.first() but this will match on any filesystem, so we need to configure it a bit more specifically, like: $[?(@.fsname=='/')].bytes.total.first()

As you can see, the JSONPath is a bit more complex here. We are forcing it to match on @.fsname=='/' and from that entity, get out the bytes.total. Now, to make it even more complex, we shouldn’t hardcode the filesystem in the JSONPath since we’re working with item prototypes. It should be the LLD macro {#FSNAME} instead!
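Putting that together, the JSONPath preprocessing parameter on the item prototype would read roughly as follows. Zabbix replaces {#FSNAME} with the discovered value when it creates each item, so the item for /home ends up filtering on fsname=='/home':

```text
$[?(@.fsname=='{#FSNAME}')].bytes.total.first()
```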

Now we save this item prototype, grab a cup of coffee (or just force a config_cache_reload on the server) and just wait for the magic to happen.

We’ve now built this setup:

 

So the master item will get values (i.e. obtain bulk data every minute) and push it into the LLD rule. From there, as per item prototypes, items will be created and those are populated from the master item as well, filtering out only the relevant metrics using Preprocessing.

So far, so good, but we have one small problem to solve: We want to get metrics every minute or so, but since all those metrics will get pushed into the LLD rule, we might be adding unnecessary extra load due to the high frequency. Luckily, solving that problem is not too hard. Navigate to the discovery rule, go to the ‘Preprocessing’ tab and add a ‘Discard unchanged with heartbeat’ step with a parameter of 1h, or an even larger interval!
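To make the effect of that step concrete, here is a small sketch that mimics the ‘Discard unchanged with heartbeat’ logic outside Zabbix. The throttle function and its input format (one ‘timestamp value’ pair per line, with the value being a single token) are invented for illustration:

```shell
# Let a value through only if it is the first one, differs from the previous
# one, or the heartbeat has elapsed since the last value that was kept.
# $1 = heartbeat in seconds, stdin = "<unix_time> <value>" lines
throttle() {
    awk -v hb="$1" '
        NR == 1 || $2 != last || $1 - kept >= hb { print; last = $2; kept = $1 }
    '
}

printf '0 a\n60 a\n120 b\n180 b\n3780 b\n' | throttle 3600
```

With a one-hour heartbeat only the first value, the changed value, and the first value at least an hour after the last kept one get through. Keep in mind that the step compares whole values, so it only discards when the value reaching the discovery rule is actually identical to the previous one.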

This is insane! With just one poll/query to a host, we will utilize the power of LLD and dependent items, getting all metrics while adding only minimal extra load on that host.

 

Conclusion

That’s it. If you’ve set up everything correctly, you should now get out quite a few filesystem metrics without adding any extra performance overhead on the host by performing unnecessary data requests.

Of course, if you need help optimizing your Zabbix environment, support contracts, consultancy, or training, we from Opensource ICT Solutions are always available to assist you in every possible way, worldwide, 24×7.

Thanks for reading this blog post, see you in the next one.

Getting your notifications via Signal

Post Syndicated from Brian van Baekel original https://blog.zabbix.com/getting-your-notifications-via-signal/13286/

Recently, WhatsApp pushed their new privacy policy, announcing that they would share more data with Facebook. This caused an exodus to other platforms, where Signal is one of the more popular ones, alongside Telegram. Both are great alternatives, but I prefer Signal due to the open-source part, end-to-end encryption, and last but not least: their business model (living on donations instead of selling your data).

Typically, Zabbix sends notifications to whatever medium you’ve chosen when a problem is detected. We all know the email messages, the various webhook integrations with Slack/MS Teams/Jira, etc., perhaps even some text message integrations and such. Now, if we’re migrating to Signal, we suddenly have access to the Signal API and can utilize it to receive Zabbix notifications. Nice!

There is only one drawback. You need a separate phone number to register against Signal. Don’t use your own phone number – unless you want to lose the ability to use Signal ;(

There are various ways to get a phone number for this purpose:

  • Use the phone number of your current SMS gateway
  • Use the company phone number (a lot of cloud PBX are providing the option to receive the verification email)
  • Purchase a prepaid phone number.
  • Use a service like Twilio

You just need to receive one text message; the rest of the communication will go via the internet.

Time to get rid of WhatsApp and move to Signal! But… how do you use Signal to get your notifications?

Signal-cli

Although we could build everything from scratch, talking to the Signal API directly, there is a nice implementation available that lets us talk to Signal within a few minutes: signal-cli

Its GitHub page is very comprehensive when it comes to getting signal-cli installed, but of course it does not cover anything Zabbix-related.

Configuration tasks

For this guide, we’re using:

  • CentOS 8
  • Zabbix 5.2

signal-cli installation

First, let’s install the signal-cli utility. To do so, we need to resolve its Java dependency by installing the OpenJDK package:

dnf -y install java-11-openjdk-devel.x86_64

After this installation, we should be good to continue with the installation of signal-cli. According to their installation guide, this should be sufficient:

export VERSION="0.7.3"
wget https://github.com/AsamK/signal-cli/releases/download/v"${VERSION}"/signal-cli-"${VERSION}".tar.gz
sudo tar xf signal-cli-"${VERSION}".tar.gz -C /opt
sudo ln -sf /opt/signal-cli-"${VERSION}"/bin/signal-cli /usr/local/bin/

At the time of writing, the most recent version is 0.7.3, and that’s what we’re installing here. If in the future a new version is released, of course you should install that!

If everything went as expected, we should be able to register ourselves with Signal.

signal-cli registration

Since we want to execute these commands by Zabbix, we must make sure the registration is done with the correct user on the Zabbix server, otherwise you will get the following error message:

Unregistered user error

(ERROR App – User +19293771253 is not registered.)

In order to prevent this error, let’s do the registration against Signal as the zabbix user:

Important: The USERNAME (your phone number) must include the country calling code, i.e. the number must start with a “+” sign. Replace everything between the < > in the following examples with your own values.

runuser -l zabbix -c 'signal-cli -u <NUMBER> register'

Now, check for incoming text messages on this phone number. Within seconds you should receive a 6-digit code in the following format: xxx-xxx

Once you’ve received the text, it’s time to complete the registration:

runuser -l zabbix -c 'signal-cli -u <NUMBER> verify <CODE>'

Since we’re running these commands as a different user, we won’t see the output of them. Let’s just test!

Sending messages from the command line is straightforward:

runuser -l zabbix -c 'signal-cli -u <NUMBER> send -m <MESSAGE> <RECEIVER NUMBER>'

You will see the message id as output. Simply ignore it, since it’s not relevant at this point.

Within seconds:

It works! Great.

So now we’ve got this part covered, time to get the AlertScript set up, before heading to the frontend.

Zabbix AlertScript setup

Ok, so now we’ve got the registration done, we need to make sure Zabbix can utilise it. To do so, we use a very old method. Although it would’ve made more sense to use the webhook option, that would have meant building the communication with Signal from scratch.

So AlertScripts it is. In your terminal/SSH session with the Zabbix server open a new file with this command: vi /usr/lib/zabbix/alertscripts/signal.sh and insert the following contents:

#!/bin/bash
signal-cli -u '+19293771253' send -m "$1" "$2"

That’s right, just 2 lines. After saving the file, change the owner and set the permissions:

chown zabbix:zabbix /usr/lib/zabbix/alertscripts/signal.sh
chmod 700 /usr/lib/zabbix/alertscripts/signal.sh
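If you prefer the script to fail loudly on bad input, a slightly more defensive variant could look like the sketch below. This is not part of the original setup: the send_alert function and the SIGNAL_CLI override are hypothetical additions, and the sender number is simply the one used in this post.

```shell
#!/bin/bash
# Sketch of a more defensive signal.sh. SIGNAL_CLI is a hypothetical override
# (useful for dry runs); the sender number is the one used in this post.
SIGNAL_CLI=${SIGNAL_CLI:-signal-cli}
SENDER='+19293771253'

send_alert() {
    message=$1
    recipient=$2
    if [ -z "$message" ] || [ -z "$recipient" ]; then
        echo "usage: signal.sh <message> <recipient>" >&2
        return 2
    fi
    # The recipient must be "+" followed by digits only (country code included)
    case ${recipient#+} in
        "$recipient" | '' | *[!0-9]*)
            echo "invalid recipient: $recipient" >&2
            return 2
            ;;
    esac
    "$SIGNAL_CLI" -u "$SENDER" send -m "$message" "$recipient"
}

# Run only when arguments were passed, as Zabbix does for alert scripts
if [ "$#" -gt 0 ]; then
    send_alert "$@"
fi
```

Setting SIGNAL_CLI=echo prints the command that would be run instead of sending anything, which is handy for checking the quoting of messages with spaces.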

And now it’s time to move to our frontend.

Zabbix mediatype configuration

In the frontend, go to Administration -> Media types and create a new media type:

Signal Mediatype

Name: Signal
Type: Script
Script name: signal.sh
Script parameters:
    {ALERT.MESSAGE}
    {ALERT.SENDTO}

Don’t forget to configure some message templates as well (second tab in the media type configuration). You can just use the defaults if you click on ‘Add’.

Zabbix media configuration

Next step. Navigate to Administration -> Users (or just open your own user profile) and create a new media:

Type: Signal
Sendto: <your number>
When active / severity as per needs

Important: The USERNAME (your phone number) must include the country calling code, i.e. the number must start with a “+” sign

We’re almost there, just some configuration on the actions remains.

Zabbix action configuration

This step is only needed if you are currently sending notifications via a specific media type. If you configured the ‘send only to’ option to ‘- All -’ there is nothing to change, and it will work straight away!

Otherwise, navigate to Configuration -> Actions, find the action you want to change, and in the Operations, Recovery operations and Update operations change the ‘send only to’ option to ‘Signal’.

Save your action and it’s time to test – Generate some problem to confirm the implementation actually works.

Wrap up

That’s it. By now you should have a working implementation where Zabbix is sending notifications to Signal. The setup was extremely straightforward and easy to configure. Nevertheless, if you need help getting this going, we (Opensource ICT Solutions) offer consultancy services as well, and are more than happy to help you out!