Tag Archives: Business Service monitoring

What is Network Monitoring? Everything You Need to Know

Post Syndicated from Michael Kammer original https://blog.zabbix.com/what-is-network-monitoring-everything-you-need-to-know/26539/

Your company’s network is the glue that bonds your enterprise together. The technology of networking is growing more stable and reliable all the time, but it doesn’t mean you can leave your network unattended – quality network monitoring is an absolute must-have.

What are network monitoring systems?

At its most basic, network monitoring is a critical IT process where all networking components (as well as key performance indicators like network hardware CPU utilization and network bandwidth) are continuously and proactively monitored to improve performance, eliminate bottlenecks, and prevent network congestion and downtime.

Put more simply, it’s the act of keeping an eye on all the connected elements that are relevant to your business. That means all your hardware and software resources, including routers, switches, firewalls, servers, PCs, printers, phones, and tablets.

A network monitoring system is a set of software tools that lets you program this action. It allows you to constantly monitor your network infrastructure by doing systematic tests to look for issues and notifying you if any are found. A good system makes monitoring your network easy by:

  • Allowing you to see all information in dashboards
  • Generating reports on demand
  • Sending alerts
  • Displaying the monitoring data you need in easy-to-read graphs

What are some key benefits of network monitoring?

A quality network monitoring solution allows you to:

Benchmark standard performance

Monitoring gives you the visibility to benchmark your network’s everyday performance. It also makes it easy to spot any fluctuations in performance, which in turn allows you to identify any unwanted changes.

Effectively allocate resources

IT teams need a clear understanding of the source of problems. They also need the ability to minimize tedious troubleshooting and put in place proactive measures to stay ahead of IT outages. To use a plumbing analogy, monitoring lets them fix cracks before a leak happens.

Identify security threats

Preventing security breaches is a major challenge for any organization. As attacks become increasingly more sophisticated and difficult to trace, detecting and mitigating any form of network threat before it escalates is critical. Network monitoring makes it easier to protect data and systems by providing early warning of any suspicious anomalies.

Manage a changing IT environment

New technologies like internet-enabled sensors, wireless devices, and cloud technologies make it harder for IT teams to track performance fluctuations or suspicious activity. A network monitoring solution can:

  • Give IT teams a comprehensive inventory of wired and wireless devices
  • Make it easy to analyze long-term trends
  • Help you get the most out of your available assets

Proactively detect and resolve issues before they affect users

Monitoring a network closely allows an organization to quickly resolve issues and prevent major disruptions. This means fewer interruptions to operations and better utilization of IT resources.

Deploy new technology and system upgrades successfully

Thanks to monitoring, IT teams can learn how equipment has performed over time and use trend analysis to see whether current technology can scale to meet business needs. This can:

  • Give a clear picture of whether a network is able to support the launch of a new technology
  • Mitigate any risks associated with a major change
  • Easily demonstrate ROI by providing comprehensive metrics

What are some different types of network monitoring?

Different types of monitoring exist depending on what exactly needs to be monitored. Some of the most common include fault monitoring, log monitoring, network performance monitoring, configuration monitoring, and availability monitoring.

Fault monitoring

As the name suggests, fault monitoring involves finding and reporting faults in a computer network. It is crucial for maintaining uninterrupted network uptime and is essential to keeping all programs and services running smoothly.

Log monitoring

Resources such as servers, applications, and websites continuously generate logs, which can:

  • Provide valuable insights into user activity
  • Help a business comply with regulations
  • Promptly resolve incidents
  • Boost network security

Network performance monitoring (NPM)

NPM tracks monitoring parameters like latency, network traffic, bandwidth usage, and throughput, with the goal of optimizing user experience. NPM tools provide valuable information that can be used to minimize downtime and troubleshoot network issues.

Configuration monitoring

Monitoring network configuration involves keeping track of the software and firmware in use on the network and making sure that any inconsistencies are identified and addressed. This prevents any gaps in visibility or security.

Network availability monitoring

Availability monitoring is the monitoring of all IT infrastructure to determine the uptime of devices. By consistently monitoring devices and servers, organizations can receive alerts when there is a network crash or when a device becomes unavailable. ICMP, SNMP, and Syslogs are the most commonly used availability monitoring techniques.

How does it work?

Network monitoring uses multiple techniques to test the availability and functionality of a network. Here are a few of the most common techniques used to collect data for monitoring software:

Ping

A ping is the simplest technique that monitoring software uses to test hosts within a network. The monitoring system sends out a signal and records:

  • Whether the signal was received,
  • How long it took the host to receive the signal
  • Whether any signal data was lost

That data is then used to determine:

  • Whether the host is active
  • How efficient the host is
  • The transmission time and packet loss experienced when communicating with the host
  • Any other vital information

Simple network management protocol (SNMP)

SNMP is the most widely used protocol for modern network management systems. It uses monitoring software to monitor individual devices in a network. In this system, each monitored device has SNMP agent monitoring software that sends information about the device’s performance to the monitoring solution, which collects this information in a database and then analyzes it for errors.

Syslog

Syslog is an automated messaging system that sends messages when an event affects a network device. Technicians can set up devices to send out messages when the device encounters an error, shuts down unexpectedly, encounters a configuration failure, and more. These messages often contain information that can be used for system management as well as security systems.

Scripts

Scripts are simple programs that collect basic information and instruct the network to perform an action within certain conditions. They can fill gaps in monitoring software functionality, performing scheduled tasks such as resetting and reconfiguring a public access computer every night.

Scripts can also be used to collect data, sending out an alert if results don’t fall within certain thresholds. Network managers will usually set these thresholds, programming the network software to send out an alert if data indicates issues, including:

  • Slow throughput
  • High error rates
  • Unavailable devices
  • Slower-than-usual response times

How can businesses benefit from network monitoring?

Here are 5 ways that quality network monitoring can benefit any business:

Increased reliability

The main function of any monitoring solution is to show whether a device is working or not. A proactive approach to maintaining a healthy network will keep tech support requests and downtime to an absolute minimum.

Improved visibility

Having complete visibility of all your hardware and software assets allows you to easily monitor the health of your network. Monitoring tracks the data moving along cables and through servers, switches, connections, and routers. In the event of a problem, your IT team can identify the root cause and fix the issue quickly.

Enhanced performance

Network monitoring software lets you know which parts of your network are being properly used, overused, or underused. You can also uncover unnecessary costs that can be eliminated or identify a network component that needs upgrading.

Stricter compliance

Today’s IT teams need to meet strict regulatory and protection standards in increasingly complex networks. The latest compliance guidelines recommend actively watching for changes in normal system behavior and unusual data flow. The data provided by monitoring tools makes it easy to assess your entire system and deliver a service that meets all required standards.

Greater profitability

Network monitoring makes businesses more productive by saving network management time and lowering operating costs. If your team is aware of current and impending issues, you can reduce downtime and increase productivity and efficiency.

The Zabbix advantage

At Zabbix, we’ve perfected an enterprise IT infrastructure monitoring software that can deploy anywhere and monitor any device, system, or app in any environment while providing comprehensive data protection, easy integration, and unlimited visualization options.

You can also count on complete transparency, a predictable release cycle, a vibrant and active user community, and an outstanding user experience.

Everything we do scales easily, so we’re able to grow right along with you. What’s more, we offer a comprehensive range of professional services, including implementation, integration, custom development, consulting services, technical support, and a full suite of training programs.

The best part? Because Zabbix is open-source, it’s not just affordable – it’s free. Get in touch with us to find out more and get started on the path to maximum network efficiency today.

FAQ

What is an example of basic network monitoring?

An example of basic network monitoring is a network engineer collecting real-time data from a data center and setting up alerts when a problem (such as a device failure, a temperature spike, a power outage, or a network capacity issue) appears.

What is network monitoring used for?

Network monitoring can:

• Determine whether a network is running optimally in real time
• Proactively identify deficiencies and optimize efficiency
• Catch and repair problems before they impact operations
• Reduce downtime and make sure employees have access to the resources they need
• Boost the availability of APIs and webpages
• Optimize network performance and availability

What is the most popular network monitoring program?

Some of the most popular network monitoring programs available on the market include:

• Zabbix
• SolarWinds Network Performance Monitor
• Auvik
• Datadog
• ManageEngine OpManager
• Site24x7
• Checkmk
• Progress WhatsUp Gold
• Microsoft Resource Monitor
• Wireshark
• Nagios
• Ntop
• Cacti
• FreeNATS
• Icinga

What are the key steps in network monitoring?

A network monitoring process includes all phases involved in executing efficient network monitoring. These phases include:

  • Locating all key network components
  • Actively monitoring the components
  • Creating alerts for component health and metrics
  • Making a plan for managing issues
  • Analyzing generated reports
  • Adjusting the process as necessary

The post What is Network Monitoring? Everything You Need to Know appeared first on Zabbix Blog.

Real Life Business Service Monitoring

Post Syndicated from Alexander Petrov-Gavrilov original https://blog.zabbix.com/real-life-business-service-monitoring/24915/

Learn more about Zabbix business service monitoring features and check out our real-life use cases. The article is based on a Zabbix Summit 2022 speech by Aleksandrs Petrovs-Gavrilovs.

Business service monitoring with Zabbix

Hello everyone, my name is Alex and today I am going to write about Advanced business service and SLA monitoring and the related use cases.

Some of you may already be familiar with business services and the core idea behind them. In the vast majority of organizations, we have services that we provide to our customers or/and for internal use. The availability of those services is usually based either on hardware, software or people’s presence and availability. 

But no matter how well our monitoring is configured, there are times when we can miss how each specific device affects our business and that is where business service monitoring can help us.

With the help of business service monitoring it is possible to see what exactly is going on with your business depending on the state of every single part of your infrastructure. This allows us, the admins and service owners, to understand what it really means when a piece of hardware breaks or a device becomes unreachable. With business service monitoring, we see what exactly impacts our business and how severe the situation is, including calculating SLA (Service Level Agreement) and evaluating it against the defined SLO (Service Level Objective).

Business service hierarchy example

So let’s check out some examples of what business real-life business services may look like.

An average service tree example

In this example, we have a service tree that is based on support services. It has phones and phones are plugged into PBX, while PBX is plugged into the switch. And this is just one example, in reality, we could have a more complex infrastructure consisting of containers, CRM services and so on. And we of course monitor all of them, but what if we are interested in the business perspective as well?

To see the business perspective we need to go to the new service section in the main menu, where we can create and view the service tree itself. In addition, in the same section, we can configure the actions, which enable us to react in cases when something happens with one of the services.
We can also specify the SLO we strive to achieve and see SLA reports on the current situation.

Basic service overview

The service view also lets us see if we have problems affecting our services and track their root cause.

Service with active problems

Defining which service is affected by what problem is done by utilizing problem tags, which essentially link them together. Services can also have their own tags, which we use to group services and understand how one service relates to another. We can also use service tags to build an SLA report or execute actions in case a service is affected by a problem. Permissions too are based on service tags, allowing to creation of different views for different users.

But those are just the basics – what’s more interesting are the actual use cases. Let’s take a look at how Zabbix users actually use business service monitoring to their advantage based on real business examples. 

Business service tree for a financial institution

Real business service use cases can be helpful examples that can help you design your own Zabbix business service trees. Maybe you already have a similar business of your own and you need that starting point for everything to “click” – that starting point can be a real-life example.

Example service tree of a bank

The first example will seem a bit convoluted while actually being very straightforward. Here we can see an actual financial institution business service tree. You can see they have quite a lot of different interconnected services. First look at the service tree raw schema may be a bit confusing, but in Zabbix it’s pretty straightforward.

The internal service is connected to emails and emails are related to customer services at the same since we do need to communicate with the customers, not only internally! In addition, we also have to define services representing the underlying systems and applications which support our email services. That is easy to do with Zabbix services.

Easy to read e-mail service state

Imagine now, if you don’t have the services functionality at all, how fast can you check the status of the email service when all you have is only a list of problems for multiple devices? How can you check the service statistics for an entire year? That was the question that the service owners and administrators had in this use case and they solved it by defining Zabbix business service trees.

The “root” service

We start by defining the root business service – Financial institution. It is linked to 15 main services. The 15 services are grouped into internal or external ones. The lower-level services also contain the sub-services that the main services are based on. I.e., we have an Accounting service based on specific VM availability, where the accounting software resides on.

Detailed service tree

The services are divided into specific categories so the service owners can read the situation a lot easier without spending a lot of time figuring out which problem causes which situation. With a single click, the service owners can immediately see which components or child services each service is based on and the actual service SLA. This also gives the benefit of displaying the root cause problem and being able to quickly identify which child services are causing issues with a particular business service.

Parent-Child service relationship

Don’t forget, that the business service trees can be multi-level –  child services can have their own child services and services can also be interconnected with each other. For example – in the Parent-Child service relationship screenshot, we can see that we have an Accounting service. Accounting uses Microsoft services. Microsoft services are also used internally. So what happens when Microsoft services stop working? We will know that accounting will be affected, the internal services will be affected and we will see the exact chain of events – what and how exactly went wrong in the organization and which components need fixing.

Service state configuration

Services can have a varying impact on your business. Some services are more critical than others. Additional rules enable Zabbix to take the potential service impact into account. The first two additional rules analyze the percentage of affected child services and set the severity of the service problem accordingly.
But if the two most critical services are affected, that will immediately become a disaster. For example, online banking – you can imagine that any bank now has an online banking service and if it goes down – all the customers will be affected; it could even hit the news, not only monitoring. So of course they want to immediately know about that kind of a disaster, and with Zabbix services – they will. By defining additional rules and service weights, you can react to problems preemptively and fix the issues before they cause downtime for your end users.

SLA reporting

In Zabbix, we can choose for what periods SLA should be calculated – daily, weekly, monthly, yearly, or a mixed selection of those. Based on our selection, we can see real-time reports for services and as an example, by the end of the year or a day, understand what needs the most attention and review the service performance. Or to put in a closer-to-reality example – find out by accounting reports if the licenses were renewed in time so that the software which is used by accounting is always available.  We can also build a dashboard that will contain the reports, showing what is the current summary for the service so they can plan, buy new software, buy a new license and get new hardware and always be ahead again of whatever might happen.

Service state dashboard

Service permissions in user roles can be used to create different service views. This can be used to hide sensitive service information or simply display the services at the required level of detail. For example, a more detailed view can be provided for internal support users since they will need as much information as possible to fix any service-related issues. Separate views can be provided for Accounting and Management teams, showing only the relevant data to ensure a quick and reliable decision-making process. 

What if we want to make things even more simple for our Accounting and Management teams? We can use actions and scheduled report functionality to deliver the required information to the user’s mailbox without having them periodically log into Zabbix.

Service permissions

Business service tree for an MSP

Another example is an MSP (managed service provider) service tree. This use case is encountered pretty frequently and the tree is always easy to read even in the raw schema view as this:

Manager Service Provider service tree

We use a hosting company for our example. The company provides a particular set of services for its users. There are also some internal services that can also be used by the customers – for example, Zabbix itself.

Zabbix can be a great tool in MSP scenarios since it’s straightforward to provide customers with access to Zabbix and build a dashboard view with the latest statistics related to a particular user.
In this example, we can see the main service which is hosting, divided across customers, where each customer is a branch of that tree, using the hosting services the company provides. We also see that monitoring is a service itself because in this case customers also have the advantage of using Zabbix to get detailed information about the services they use and their current state. Seeing the current level of SLA for the servers they use and does it match the expectations.

Customer overview

The MSP, of course, retains the full view of the customers and all customers are equally important and deserve a proper quality of service so of course each customer will have an equal weight assigned to them. As soon as any customer has a problem, the related service will be marked with a high-level severity on the service tree. This way, the MSP will immediately see which customer is affected, making it possible to assist them as quickly as possible.

If you have a bigger environment – maybe you have hundreds of customers, you may opt out of defining service weights in your configuration since the number of services changes very rapidly. How can we react to global issues then?
We can use percentage rules instead of reacting to just the static weight number. This way, we can check is the problem related to a single customer or is it something global and everyone is now affected.

Root cause view in the services will allow you to start fixing everything immediately. Meanwhile, each customer can be informed individually using the service actions and conditions. This should be easy to do if we have properly named or tagged the services.

Customer service configuration

Don’t forget to define the permissions so that any customer, as Mooyani here, can have access to their Services immediately after login, ensuring that information not only remains private but also relevant for the current user.

Customer view

All information for Customers can be placed on their personal dashboards where they can see all the details whenever they need to. Monitoring the traffic going through their VMs, resource usage, application statuses and any other monitored entities. Don’t forget that service SLA reports can also be placed on Zabbix dashboards. This way your customers can see that the MSP meets the terms defined in the agreement and everything is performing as expected. 

To summarize – monitoring your infrastructure is great from any perspective, including business monitoring. it’s always a good idea to provide this view as an MSP to your customers, so they can see we meet the standards we define for ourselves and course promise for our users.

What’s Up, Home? – Zabbix the Weatherman

Post Syndicated from Janne Pikkarainen original https://blog.zabbix.com/whats-up-home-zabbix-the-weatherman/20897/

This week, I advanced my project on multiple fronts, so welcome to this little smorgasbord of different topics. In my future posts, I will go deeper into each topic as my project goes forward.

Zabbix the weatherman

Let me begin with a monitoring blooper.

As Zabbix has very well-working forecast/prediction functions for your usual IT capacity trending, I tried what happens if I let it predict the outdoor temperature based on recent temperatures. On my first try, this did not go as I planned.

You see, currently, here in Finland the temperatures change a lot during a 24 hours period: from nightly -10C or below temperatures to maybe +5C to +10C during the day. As I asked Zabbix to predict the weather based only on one hour of data one day ago, this did not go as planned.

OK, clearly the one hour worth of data was too little. What if ask Zabbix to base its forecast on one week worth of data?

The prediction slightly improves — at least it won’t predict a nuclear winter anymore — but only slightly. Zabbix in its little mind has no idea that the weather could get warmer due to the springtime. Or, in case Zabbix was right, I’ll let you know in a week.

Average data for Joe Average

As my monitoring setup collects more data, one thing I can get out of it will be averages. What’s the average temperature? What’s the average for this and that?

Above shows the average data for the last 24 hours, and on my Grafana dashboard the values change dynamically based on the time period I choose on it.

Who wouldn’t need home SLA reports?

Everybody knows how The Suits love their reports. I have this mental image where I think during their mornings they are like

[x] coffee
[x] warm bread
[x] orange juice
[x] classical music
[x] latest reports

And oh dear, their morning is ruined if the [x] is missing from the last entry. Poor Suits.

Anyway, as the recent Zabbix 6.0 brought us revamped Business Services Monitoring, why not use it for home monitoring, too? This part includes very much work in progress, but I will show you the current results.

When I’m finished, each room will be configured as its own Business Service. For now, I only have entered the room names and some other stuff. There is only one room with some actual content, for now, and it’s our bedroom. What happens if I click on it?

I will get to see if the lights and temperature are OK, both from a technical standpoint and for their values. In case the status would not be OK, the root cause column would show me the reason why everything is not OK — though I would not need to click my way this far, the data would be shown on the previous page already.

As for SLAs (Service Level Agreement, for example, if you promise that your service will be available 99.9% of the time, it better be or your customer will be a sad panda and yell at you), those are also a work in progress. Zabbix can be let to generate daily/weekly/whatever SLA reports for any of the configured Business Services. I have yet to build them, but I have one for my home router already.

Come on, it’s sunny, let’s go out, Zabbix!

True story: this morning my wife asked that could I add pollen monitoring to Zabbix. My non-technical wife is getting excited about home monitoring, too! (I think she’s only pretending. Still AWESOME!)
I still need to add pollen monitoring — the data is available as open data — but I initialized The Great Outdoors Monitoring in two other areas.

Where’s my train?

Just before creating this post, I proved to myself that I can show live train data on Grafana. I sure got a screenful, as I have not played around with GraphQL too much, and for now, I got way more trains than I planned to get, and the data contains extra fields I need to filter out with Grafana’s Organise Fields. Still, connection established! Wooooo!

What’s for lunch?

Only added one lunch restaurant for now, but in theory, I will receive an alert whenever the restaurant posts its new weekly lunch menu. Zabbix is configured to be a good netizen though and it will only try to fetch the menu every one hour on Monday morning, no point to poll them all week, so let’s see how this will work.

That’s all for now. See you next week!

I have been working at Forcepoint since 2014 and I am a walking monitoring unit. — Janne Pikkarainen

* Please note, that this blog post was originally written a few months ago, in early Spring, and the temperature records do not correspond to the actual weather at the time of publication.

The post What’s Up, Home? – Zabbix the Weatherman appeared first on Zabbix Blog.

Handy Tips #28: Keeping track of your services with business service monitoring

Post Syndicated from Arturs Lontons original https://blog.zabbix.com/handy-tips-28-keeping-track-of-your-services-with-business-service-monitoring/20307/

Configure and deploy flexible business services and monitor the availability of your business and its individual components.

The availability of a business service tends to depend on the state of many interconnected components. Therefore, detecting the current state of a business service requires a sufficiently complex and flexible monitoring logic.

Define flexible business service trees and stay informed about the state of your business services:

  • Business services can depend on an unlimited number of underlying components
  • Select from multiple business service status propagation rules

  •  Calculate the business service state based on the weight of the business service components
  • Receive alerts whenever your business service is unavailable

Check out the video to learn how to configure business service monitoring.

How to configure business service monitoring:

  1. Navigate to Services – Services
  2. Click the Edit button and then click the Create service button
  3. For this example, we will define an Online store business service
  4. Name your service and mark the Advanced configuration checkbox
  5. Click the Add button under the Additional rules
  6. Set the service status and select the conditions
  7. For this example, we will set the status to High
  8. We will use the condition “If weight of child services with Warning status or above is at least 6
  9. Set the Status calculation rule to Set status to OK
  10. Press the Add button
  11. Press the Add child service button
  12. For this example, we will define Web server child services
  13. Provide a child service name and a problem tag
  14. For our example, we will use node name Equals node # tag
  15. Mark the Advanced configuration checkbox and assign the service weight
  16. Press the Add button
  17. Repeat steps 12 – 17 and add additional child services
  18. Simulate a problem on your services so the summary weight is >= 6
  19. Navigate to Services – Services and check the parent service state

Tips and best practices
  • Service actions can be defined in the Services → Service actions menu section
  • Service root cause problem can be displayed in notifications with the {SERVICE.ROOTCAUSE} macro
  • Service status will not be propagated to the parent service if the status propagation rule is set to Ignore this service
  • Service-level tags are used to identify a service. Service-level tags are not used to map problems to the service

The post Handy Tips #28: Keeping track of your services with business service monitoring appeared first on Zabbix Blog.

Top 10 reasons to migrate to Zabbix 6.0 LTS by Dmitry Krupornitsky / Zabbix Summit Online 2021

Post Syndicated from Arturs Lontons original https://blog.zabbix.com/top-10-reasons-to-migrate-to-zabbix-6-0-lts-by-dmitry-krupornitsky-zabbix-summit-online-2021/18445/

Today we will take a look at the top 10 reasons to migrate to Zabbix 6.0 LTS. We will discuss features and changes included not only in Zabbix 6.0 LTS but also in intermediate major versions – Zabbix 5.2 and Zabbix 5.4.

The full recording of the speech is available on the official Zabbix Youtube channel.

High availability

With Zabbix 6.0 LTS, native support for Zabbix server high availability clusters is finally here. High availability setups can protect you from software and hardware failures and allow you to minimize downtime while performing maintenance tasks. Before Zabbix 6.0 LTS, users were required to use a dedicated piece of clustering software to enable high availability. Most users used a combination of Corosync + pacemaker software. This required additional knowledge related to these tools, to ensure a proper high availability cluster setup, configuration, maintenance, and other tasks related to managing your Zabbix high availability cluster. You could also use other 3rd party vendor solutions, but such solutions also require additional knowledge and in many cases incur additional licensing costs.

The native Zabbix server high availability cluster is an opt-in solution that provides high availability for the Zabbix server component. This solution consists of multiple Zabbix server instances – nodes, where each node is configured separately and uses the same database. Each node has two modes of operation – active or standby. Only a single node can be active at a time. The standby nodes do not perform any data collection, data processing, or any other Zabbix server activities. The standby nodes do not listen for connection on ports and have a minimal number of connections established to the Zabbix backend database. The high availability nodes are compatible with one another across different minor Zabbix server versions.

Learn how to deploy your own Zabbix server high availability cluster by following the steps provided in our Zabbix Summit blog post dedicated to this topic.

New Zabbix interface options

Zabbix 6.0 LTS provides multiple Zabbix interface improvements. One of the major changes that the users will notice when switching to Zabbix 6.0 LTS is the migration from screens to dashboards. The screens will be migrated to dashboards automatically during the upgrade. Dashboards consist of multiple highly customizable widgets, which can be placed on a dashboard with a click of a button. With Zabbix 6.0 LTS many new widgets will be available for different purposes – more flexible views of your metrics with the Single item value widget, a Geomap widget for a better overview of your infrastructure state, Top N/Bottom N views provide a whole new way to look at your metrics and more.

Now you will be able to save your favorite problem filters and access your filters in tabs for more simple filtering of the commonly accessed problem views.

Zabbix 6.0 LTS introduces timezone configuration on a per-user basis. Users can now have their preferred timezone configured via the user settings in the Zabbix frontend. The same is also true for language – this can also now be configured individually for each user.

The Zabbix frontend is now more customizable than ever. There are several ways in which you can customize your Zabbix frontend:

  • Replace the Zabbix logo with your company’s branding
  • Hide links to Zabbix support/integration pages
  • Set a custom help page link
  • Change the copyright notice in the footer of the frontend.

Implementing these changes requires customizing the underlying PHP code – we tried to make this as simple and accessible as possible, so you can quickly make the necessary changes yourself.

There are also many other Interface improvements, such as multi-page dashboards, third-level menus, graph improvements, and many others.

Improved security

Security is always something that we focus on when developing Zabbix. Zabbix 6.0 LTS brings many new security-related improvements and features:

  • User roles allow you to define roles with granular permissions related to the frontend access and the actions that each user role is permitted to perform
    • Roles are still based on user types – Zabbix User, Admin, Super admin, and user type restrictions still apply, but can be further customized per each role
    • User group to host group permissions (Read, Read/Write, Deny) still need to be used in combination with roles to ensure granular access to your data
    • For example, now we can define users that have access to host configuration but restrict access to other configuration sections.

In Zabbix 6.0 LTS it is possible to define custom password complexity requirements for Zabbix frontend logins. We can define password length/complexity policies and prohibit the usage of easy to guess common passwords.

The Zabbix API has also seen some security improvements. Now it is possible to generate a persistent API token for a particular user, define an expiration date and use the token in your API calls, without the need to regularly re-issue a new API token.

Zabbix 5.2 release also added the ability to store sensitive information in an external vault. As of the release of Zabbix 6.0 LTS, only HashiCorp Vault is supported, but CyberArk Vault support is also coming in Zabbix 6.2 release.

A set of architectural and structural measures have been taken to completely restructure the Zabbix Audit log. The updated Audit log entry contains records of all configuration changes made by the Zabbix server and Zabbix frontend. The new Audit log also contains additional filtering options, such as filtering Audit log entries based on the operation during which the changes were performed. The new Audit log is not only more detailed but also reworked with minimum performance impact in mind.

Scalability improvements

Many scalability improvements have been introduced between the Zabbix 5.0 LTS release and Zabbix 6.0 LTS release. These improvements not only improve the performance of existing Zabbix instances but also lay the groundwork for the design of upcoming features in later releases.

Previously, trend-based trigger functions would always use database queries to obtain the required data. Starting from Zabbix 5.4, a new type of cache – Trend function cache, has been introduced. This cache stores the results of calculated trend functions. When processing the trend functions, the Zabbix server will check the Trend function cache for the cached results. In case of failure, the Zabbix server will read the data from the database and cache the results.

The scalability improvements allow for better parallel data processing on Zabbix servers with heavy loads. Zabbix Instances with tens of thousands or more new values per second will greatly benefit from the improved performance.

The introduction of the graceful startup of the Zabbix server can help you improve performance and prevent unwanted downtimes, especially with large distributed environments. Whenever a Zabbix server gets started up after downtime, the existing Zabbix proxies start sending the data backlog to the Zabbix server. it is extremely important to maintain the stability and performance of the Zabbix server during this time window. Graceful startup improves the Zabbix server data backlog handling logic during such situations.

To prevent unwanted delays and other issues when using zabbix_get and zabbix_sender command-line tools, it is now possible to define a custom Timeout parameter for these tools.

Advanced business service monitoring

The new Busines service monitoring features allow Zabbix users to not only define complex service trees but also receive alerts in situations where the status of a business service has been changed. This is valuable to every user that wishes to monitor their business services, no matter how simple or complex the service is.

Combined with a large number of new and improved service status calculation rules. By defining custom service weights and advanced service status propagation rules, the business services can be defined in an extremely flexible fashion. Services are also not linked to individual triggers anymore, instead, we use tag-based service mapping to map our services to problem events.

The service functionality has also received scalability improvements. Zabbix can support the monitoring of over 100 000 business services. The scalability improvements have been implemented from both the UI/UX and the performance perspectives.

The old all-or-nothing business service permission approach has been redesigned to a granular read/write permissions for individual business services. This is not only an improvement from the security perspective, but also adds the ability to define services in a multi-tenant fashion, where each tenant has access only to the services that they own.

With the redesign of the business services, we have added the support for root cause analysis, allowing users to see the underlying problem which caused a particular service to change its state.

You can read more about Business service monitoring in our Zabbix Summit blog post dedicated to this topic.

Tag and template improvements

Item applications have been replaced with tags. This design decision adds consistency to filtering, mapping, grouping, and other tag-related functions when it comes to different Zabbix entities. Tags can also be used to provide additional information related to your entities in a manner that is much more flexible than it was with applications.

Universal template IDs introduced for each of the template elements allow you to define much more robust template management workflows, especially when you combine this with a CI/CD template management approach. These IDs are unique and can be used to match a particular template entity, such as item, trigger, graph, and so on. By utilizing the Universal template IDs, Zabbix now understands which entity we are trying to update, which entity no longer exists, whether it is a new entity or we are adjusting an existing entity. The default template export format is now YAML, though JSON and XML formats are still supported. This was done to improve the template management usability since the YAML format is more user-friendly and easier to edit manually. All of the official Zabbix templates available on the Zabbix git page have already been converted to the YAML format.

The redesign of the templates has also allowed us to improve the visualization of the changes made when importing a template. Now users can see the list of changes in a diff-like display and understand the impact that the template import will have on the  Zabbix entities.

Value maps have been moved to host and template levels. This is another design decision that we made to enable support for fully self-contained templates, that are easy to manage and deploy, and can be easily imported into different Zabbix environments. While global value maps might be easy to manage in small environments, this is not the case in larger environments, where different teams are working with a single or between multiple Zabbix instances. Therefore, the global value maps have been removed.

Reporting and visualization

With the addition of Scheduled reports functionality, any dashboard can now be converted into a scheduled report. While this feature was originally added in Zabbix 5.4, with the release of Zabbix 6.0 LTS and a set of new widgets, the reporting functionality has gained a lot of additional value that these widgets grant specifically from the reporting perspective. Users can create scheduled reports and receive them in their mailbox at a specific time either on a daily, weekly, monthly, or yearly basis. The time period for which the report will provide the information can also be selected.

The new Geographical map widget allows you to quickly deploy a geomap with an overview of the state of your infrastructure. The geomap widget supports filters, so we can display only a particular part of your infrastructure. Zabbix uses an open-source Javascript interactive maps library called Leaflet and supports multiple map providers such as OpenStreetMap, OpenTopoMap, USGS US Topo, and more. Users also have the ability to define and use a custom map tile provider. The map will display your infrastructure and also highlight any detected problems as well as display problem counters. This is a major step forward from the old approach, which required users to use the regular map functionality together with Zabbix API scripting, to provide information on a geographical map.

Advanced problem detection

Zabbix 5.4 release introduced a new unified syntax for defining trigger expressions, calculated, and aggregated items. There are multiple benefits that come with the new trigger syntax. First off – the syntax is now unified and can be used for defining triggers, calculated items, and providing values in maps or graph names. The syntax also has a more functional approach, instead of being object-oriented. This allows us to solve many complex use cases, for example dynamically calculate or aggregate a value from all hosts tagged with a specific tag or belonging to a specific host group. Aggregated item type has also been removed and users can now define aggregate checks under the calculated item type.

New monitoring functionality and integrations

As with every major release, Zabbix 6.0 LTS comes with a set of new items and improves the functionality of already existing items:

  • It is now possible to monitor SSL certificate validity and expiration data, such as the expiry date, issuer, version, subject, and more
  • New Zabbix Agent 2 metrics allow you to collect file owner information, file properties, extended interface info, extended TCP info, SHA2 hashes for files, and more
  • New templates for NGINX+, HPE/Dell servers, CISCO ASAv, Cloudflare

Finally – Zabbix 6.0 LTS

Many of our users and customers prefer sticking with the LTS releases instead of upgrading between each major version. As with every LTS release, there are major benefits to sticking with Zabbix 6.0 LTS:

  • LTS release receive thorough testing and full long term support
    • 3 years of full support – general, critical, and security fixes/improvements
    • 5 years of limited support – critical and security fixes

Questions

Q: Which of the current versions are still supported and for how long are they going to remain supported? What updates can we expect these versions to receive?

A: Currently we have three supported major versions available. Zabbix 5.4, which will not be supported after the release of Zabbix 6.0 LTS. We also still provide support for Zabbix 5.0 LTS and Zabbix 4.0 LTS. Zabbix 5.0 LTS will continue receiving full support until the middle of 2023 and limited support until the middle of 2025, while Zabbix 4.0 LTS will receive limited support until November 2023.

 

Q: Could you elaborate on how tags are more flexible than applications and are there any other benefits to using tags?

A: Zabbix already supports tags for most of the essential Zabbix objects, such as triggers, hosts, host prototypes, and templates. With the introduction of tags for items, tags can now be found everywhere. This way you can have tags that provide different additional information and assign values for your objects. Tags have several usages – for example, we can use them to mark events. If we have an item with a tag, this tag will mark any problem related to this item. Problem events will inherit tags from the whole tag chain – hosts, templates, triggers, items, and more. Further down the line, we can use our actions to react to specific tags. If you recall, Business services are also mapped to problems based on the tag mapping. Of course, tags can also be used for filtering and grouping different Zabbix objects.

 

Q: Is there a guideline to the migration process from an older version to Zabbix 6.0 LTS? Is there a change list that I can look at to see what other features have received an overhaul?

A: Regarding the upgrade itself – our documentation contains guidelines for both upgrading from packages and upgrading from sources. The documentation may also contain upgrade notes regarding any extra steps or precautions required when upgrading to a particular version. Regarding the feature changes – we recommend reading through the major version release notes. For example, if you’re upgrading from Zabbix 5.0 LTS to Zabbix 6.0 LTS, make sure to familiarize yourself not only with the Zabbix 6.0 LTS release notes, but also read through the Zabbix 5.2 and Zabbix 5.4 release notes, since changes introduced in these versions will also be a part of Zabbix 6.0 LTS.

The post Top 10 reasons to migrate to Zabbix 6.0 LTS by Dmitry Krupornitsky / Zabbix Summit Online 2021 appeared first on Zabbix Blog.

Gaining new insights with Business service monitoring by Aleksandrs Petrovs-Gavrilovs / Zabbix Summit Online 2021

Post Syndicated from Arturs Lontons original https://blog.zabbix.com/gaining-new-insights-with-business-service-monitoring-by-aleksandrs-petrovs-gavrilovs-zabbix-summit-online-2021/17973/

Zabbix 6.0 LTS comes with a complete redesign of the service monitoring. From improved business service scalability to advanced service status calculation logic and alerting. Let’s take a look at the Business Service monitoring feature and how you can use it to ensure full transparency for your business services.

The full recording of the speech is available on the official Zabbix Youtube channel.

Business services can be quite complex. They tend to consist of many different moving parts with redundancy and failover mechanism in place, all of which need to be taken into consideration when we wish to analyze the current status of our services.

BSM Checklist

Let’s take a look at what needs to be done so we can successfully define and monitor our business service:

  • First, we have to define what exactly is our business service and what components does it consist of?
  • We need to understand what are our expectations when it comes to service uptime. When should the service be up and running? What are the acceptable downtimes? Should it run 24/7/365 or maybe it’s a service that is critical only during our working hours?
  • Once we know what needs to be monitored, we need to make sure that we are collecting the data that reflects the status of different service components.
  • Finally – we have to find a suitable tool to track and measure our service.

Define your business

Let’s take a look at how a business may look like. As I mentioned before – business services can consist of many different components. Let’s take a look at an example of how business services may look like:

The tree structure here represents our Business services. We can see that we have classified the services into two branches – Internal services and User services. The User services consist of components such as Websites, Helpdesk services, Phones. These general services are based on lower-level components such as the actual physical phones for the phone service, underlying software for the Website and Helpdesk services, and so on.

This can make things quite complicated since usually, organizations will have many more components to take care of. That’s why, let’s see how we can simplify this tree and define our services in a more simple manner, like the service tree below:

Now we are left with only 3 levels for our services. Let’s take a look at how we can move this to Zabbix:

Here we can see a high-level view of our services. Once again we have our Internal services and User services. These here high-level services consist of child services and define what these components consist of and what their SLAs should be. We can also define tags to provide additional details to our services – which customer uses the service, the type of service, maybe even the location that the service is used in – this part is completely up to your imagination.

Once you have defined the services, their respective components and have linked them to the problems by using tags, you will finally be able to see the full picture. Zabbix will display not only the status of the service but also the root cause of the problem. This way we can provide service status information not only on the service owner level but also provide information that your technical staff can use to fix the issue.

Configuring SLAs

Configuring Business Service monitoring can be done from the MonitoringServices section. In Zabbix 6.0 LTS you are not required to start defining the service tree from the root service. Now you can define your own root level services. To create a service, all we have to do is switch to the Edit mode by clicking the Edit button in the upper right corner of the services screen and click the Create service button right next to it. We have also made some additional changes to the service section UI/UX. Now you also have multiple fast edit buttons next to each service. You can use them to Add a child service, edit an existing service, or delete an existing service.

Next, let’s take a look at the actual service creation steps.

  • We need to provide a name for our service
  • If the service is not a top-level service you have to select a parent service
  • Define problem tags. Problems tagged with the matching tags will affect the service status
  • Define the status calculation rule

Major improvements have been made to status calculation rules. We still support the old logic of the Use the most critical of child services / Most critical if all children have problems / Set status to ok, but there are also many advanced service status calculation rules.

  • Now we have the ability to select a specific status (Warning, Average, High, and so on) for our service in case of a problem
  • Select the number of children, More than/Less than N children, Percentage of children that should be affected for the parent service status change to take place
  • Define weights for child services and perform status changes based on the weight of the affected child services

Child services can also apply different propagation rules for the parent service

  • Child services can Increase or decrease the parent status service status by N severities, ignore the child service, apply a fixed status or apply the status depending on the problem severity

For our example let’s use an HA cluster use case. HA clusters consist of multiple nodes – for our example, we will use 3 nodes.

  • First, we define that the HA cluster consists of 3 nodes – 3 child services.
  • Each node will have equal weight – 1
  • On the parent service, we will define multiple status rules
    • If the weight of the child services is 1 (1 node is down) – the parent service will change its status to Warning
    • If the weight of the child services is 2 (2 nodes are down) – the parent service will change its status to Average
    • If the weight of the child services is 3 (all nodes are down) – the parent service will change its status to Disaster

In the above image, we can see how the corresponding status change will look like in the Services section. Note that we can also see the root cause of the parent service status change in the Root cause column.

We also have the ability to define the acceptable SLAs as well as SLA calculation uptime and downtime periods for our services. We have the option to define scheduled uptimes and downtimes, during which SLA should or shouldn’t be calculated (Such as weekends, for example), as well as one-time downtimes for one-time maintenance purposes.

Services can utilize tags to provide additional information about your services, such as the service type, service customer, service location, and more. On top of that, tags can also be used in the Service action condition logic, so you can define granular alerting logic for your service status changes.

The Child services tab allows you to quickly look at the related child services, their problem tags, and status calculation rules.

Child services can also be crosslinked between multiple parent services. This means that you don’t have to duplicate and recreate child services if they are used as a component of multiple parent services.

Track, solve and measure

Once we have configured our service, what remains is keeping track of our service statuses, SLAs and staying notified about service status changes and their root cause.

For this purpose, it is vital to secure access to our services. This is especially critical for MSPs, which may have multiple customers and each customer should have access only to the services related to that particular customer. To that end, the Roles section has also received an update related to the Service permissions. We can now define Read-Write and Read access to either specific services or services marked with a particular tag.

The Root cause section displays the root cause problems that affected the service status change. You will be able to click on the root cause problem and open it in the Problems section for further analysis of what caused your services to change their status and which host has been affected by it.

Previously I mentioned alerting on service status change, so let’s dig deeper into that. In Zabbix 6.0 LTS we have added a new type of action – Service actions. Zabbix can now react to service status changes and notify you when a service changes its status. The Service action conditions can analyze if a status has been changed on a particular service, a service that matches or contains a specific string in its name, tag, or tag value. If the conditions are true, Zabbix can send out an email, deliver a phone call or an SMS, create a ticket in your helpdesk system or perform any other alerting and notification workflow.

Many other BSM features are coming as we continue the development of Zabbix 6.0 LTS:

  • SLA graphical visualizations with support for over 100k services
  • Daily, Monthly, Weekly SLA reports
  • New service tree and SLA reporting widgets available from the dashboard
  • Service tree import and export
  • Impact analysis – see which service affects other related services in what way.

Questions

Q: Will the existing services be migrated to Zabbix 6.0 LTS?
A: The existing services will be migrated to Zabbix 6.0 LTS during the upgrade. All of the configuration for the existing services will stay intact after the migration.

Q: Does host maintenance suppress service calculation in Zabbix 6.0 LTS?
A: Host maintenance will not affect the service calculation. If you wish to define maintenance periods for your services –  use scheduled or one-time downtime options when configuring an individual service.

Q: How are the Fixed status and Ignore this service calculation rules going to work?
A: Fixed status services will not change their status no matter what happens to the child services – the service status will remain fixed. As for Ignore this service – the service status change will be ignored and will not affect the parent services.

What’s new in Zabbix 6.0 LTS by Artūrs Lontons / Zabbix Summit Online 2021

Post Syndicated from Arturs Lontons original https://blog.zabbix.com/whats-new-in-zabbix-6-0-lts-by-arturs-lontons-zabbix-summit-online-2021/17761/

Zabbix 6.0 LTS comes packed with many new enterprise-level features and improvements. Join Artūrs Lontons and take a look at some of the major features that will be available with the release of Zabbix 6.0 LTS.

The full recording of the speech is available on the official Zabbix Youtube channel.

If we look at the Zabbix roadmap and Zabbix 6.0 LTS release in particular, we can see that one of the main focuses of Zabbix development is releasing features that solve many complex enterprise-grade problems and use cases. Zabbix 6.0 LTS aims to:

  • Solve enterprise-level security and redundancy requirements
  • Improve performance for large Zabbix instances
  • Provide additional value to different types of Zabbix users – DevOPS and ITOps teams, Business process owner, Managers
  • Continue to extend Zabbix monitoring and data collection capabilities
  • Provide continued delivery of official integrations with 3rd party systems

Let’s take a look at the specific Zabbix 6.0 LTS features that can guide us towards achieving these goals.

Zabbix server High Availability cluster

With the release of Zabbix 6.0 LTS, Zabbix administrators will now have the ability to deploy Zabbix server HA cluster out-of-the-box. No additional tools are required to achieve this.

Zabbix server HA cluster supports an unlimited number of Zabbix server nodes. All nodes will use the same database backend – this is where the status of all nodes will be stored in the ha_node table. Nodes will report their status every 5 seconds by updating the corresponding record in the ha_node table.

To enable High availability, you will first have to define a new parameter in the Zabbix server configuration file: HANodeName

  • Empty by default
  • This parameter should contain an arbitrary name of the HA node
  • Providing value to this parameter will enable Zabbix server cluster mode

Standby nodes monitor the last access time of the active node from the ha_node table.

  • If the difference between last access time and current time reaches the failover delay, the cluster fails over to the standby node
  • Failover operation is logged in the Zabbix server log

It is possible to define a custom failover delay – a time window after which an unreachable active node is considered lost and failover to one of the standby nodes takes place.

As for the Zabbix proxies, the Server parameter in the Zabbix proxy configuration file now supports multiple addresses separated by a semicolon. The proxy will attempt to connect to each of the nodes until it succeeds.

Other HA cluster related features:

  • New command-line options to check HA cluster status
  • hanode.get API method to obtain the list of HA nodes
  • The new internal check provides LLD information to discover Zabbix server HA nodes
  • HA Failover event logged in the Zabbix Audit log
  • Zabbix Frontend will automatically switch to the active Zabbix server node

You can find a more detailed look at the Zabbix Server HA cluster feature in the Zabbix Summit Online 2021 speech dedicated to the topic.

Business service monitoring

The Services section has received a complete redesign in Zabbix 6.0 LTS. Business Service Monitoring (BSM) enables Zabbix administrators to define services of varying complexity and monitor their status.

BSM provides added value in a multitude of use cases, where we wish to define and monitor services based on:

  • Server clusters
  • Services that utilize load balancing
  • Services that consist of a complex IT stack
  • Systems with redundant components in place
  • And more

Business Service monitoring has been designed with scalability in mind. Zabbix is capable of monitoring over 100k services on a single Zabbix instance.

For our Business Service example, we used a website, which depends on multiple components such as the network connection, DB backend, Application server, and more. We can see that the service status calculation is done by utilizing tags and deciding if the existing problems will affect the service based on the problem tags.

In Zabbix 6.0 LTS there are many ways how service status calculations can be performed. In case of a problem, the service state can be changed to:

  • The most critical problem severity, based on the child service problem severities
  • The most critical problem severity, based on the child service problem severities, only if all child services are in a problem state
  • The service is set to constantly be in an OK state

Changing the service status to a specific problem severity if:

  • At least N or N% of child services have a specific status
  • Define service weights and calculate the service status based on the service weights

There are many other additional features, all of which are covered in our Zabbix Summit Online 2021 speech dedicated to Business Service monitoring:

  • Ability to define permissions on specific services
  • SLA monitoring
  • Business Service root cause analysis
  • Receive alerts and react on Business Service status change
  • Define Business Service permissions for multi-tenant environments

New Audit log schema

The existing audit log has been redesigned from scratch and now supports detailed logging for both Zabbix server and Zabbix frontend operations:

  • Zabbix 6.0 LTS introduces a new database structure for the Audit log
  • Collision resistant IDs (CUID) will be used for ID generation to prevent audit log row locks
  • Audit log records will be added in bulk SQL requests
  • Introducing Recordset ID column. This will help users recognize which changes have been made in a particular operation

The goal of the Zabbix 6.0 LTS audit log redesign is to provide reliable and detailed audit logging while minimizing the potential performance impact on large Zabbix instances:

  • Detailed logging of both Zabbix frontend and Zabbix server records
  • Designed with minimal performance impact in mind
  • Accessible via Zabbix API

Implementing the new audit log schema is an ongoing effort – further improvements will be done throughout the Zabbix update life cycle.

Machine learning

New trend functions have been added which utilize machine learning to perform anomaly detection and baseline monitoring:

  • New trend function – trendstl, allows you to detect anomalous metric behavior
  • New trend function – baselinewma, returns baseline by averaging data periods in seasons
  • New trend function – baselinedev, returns the number of standard deviations

An in-depth look into Machine learning in Zabbix 6.0 LTS is covered in our Zabbix Summit Online 2021 speech dedicated to machine learning, anomaly detection, and baseline monitoring.

New ways to visualize your data

Collecting and processing metrics is just a part of the monitoring equation. Visualization and the ability to display our infrastructure status in a single pane of glass are also vital to large environments. Zabbix 6.0 LTS adds multiple new visualization options while also improving the existing features.

  • The data table widget allows you to create a summary view for the related metric status on your hosts
  • The Top N and Bottom N functions of the data table widget allow you to have an overview of your highest or lowest item values
  • The single item widget allows you to display values for a single metric
  • Improvements to the existing vector graphs such as the ability to reference individual items and more
  • The SLA report widget displays the current SLA for services filtered by service tags

We are proud to announce that Zabbix 6.0 LTS will provide a native Geomap widget. Now you can take a look at the current status of your IT infrastructure on a geographic map:

  • The host coordinates are provided in the host inventory fields
  • Users will be able to filter the map by host groups and tags
  • Depending on the map zoom level – the hosts will be grouped into a single object
  • Support of multiple Geomap providers, such as OpenStreetMap, OpenTopoMap, Stamen Terrain, USGS US Topo, and others

Zabbix agent – improvements and new items

Zabbix agent and Zabbix agent 2 have also received some improvements. From new items to improved usability – both Zabbix agents are now more flexible than ever. The improvements include such features as:

  • New items to obtain additional file information such as file owner and file permissions
  • New item which can collect agent host metadata as a metric
  • New item with which you can count matching TCP/UDP sockets
  • It is now possible to natively monitor your SSL/TLS certificates with a new Zabbix agent2 item. The item can be used to validate a TLS/SSL certificate and provide you additional certificate details
  • User parameters can now be reloaded without having to restart the Zabbix agent

In addition, a major improvement to introducing new Zabbix agent 2 plugins has been made. Zabbix agent 2 now supports loading stand-alone plugins without having to recompile the Zabbix agent 2.

Custom Zabbix password complexity requirements

One of the main improvements to Zabbix security is the ability to define flexible password complexity requirements. Zabbix Super admins can now define the following password complexity requirements:

  • Set the minimum password length
  • Define password character requirements
  • Mitigate the risk of a dictionary attack by prohibiting the usage of the most common password strings

UI/UX improvements

Improving and simplifying the existing workflows is always a priority for every major Zabbix release. In Zabbix 6.0 LTS we’ve added many seemingly simple improvements, that have major impacts related to the “feel” of the product and can make your day-to-day workflows even smoother:

  • It is now possible to create hosts directly from MonitoringHosts
  • Removed MonitoringOverview section. For improved user experience, the trigger and data overview functionality can now be accessed only via dashboard widgets.
  • The default type of information for items will now be selected automatically depending on the item key.
  • The simple macros in map labels and graph names have been replaced with expression macros to ensure consistency with the new trigger expression syntax

New templates and integrations

Adding new official templates and integrations is an ongoing process and Zabbix 6.0 LTS is no exception here’s a preview for some of the new templates and integrations that you can expect in Zabbix 6.0 LTS:

  • f5 BIG-IP
  • Cisco ASAv
  • HPE ProLiant servers
  • Cloudflare
  • InfluxDB
  • Travis CI
  • Dell PowerEdge

Zabbix 6.0 also brings a new GitHub webhook integration which allows you to generate GitHub issues based on Zabbix events!

Other changes and improvements

But that’s not all! There are more features and improvements that await you in Zabbix 6.0 LTS. From overall performance improvements on specific Zabbix components, to brand new history functions and command-line tool parameters:

  • Detect continuous increase or decrease of values with new monotonic history functions
  • Added utf8mb4 as a supported MySQL character set and collation
  • Added the support of additional HTTP methods for webhooks
  • Timeout settings for Zabbix command-line tools
  • Performance improvements for Zabbix Server, Frontend, and Proxy

Questions and answers

Q: How can you configure geographical maps? Are they similar to regular maps?

A: Geomaps can be used as a Dashboard widget. First, you have to select a Geomap provider in the Administration – General – Geographical maps section. You can either use the pre-defined Geomap providers or define a custom one. Then, you need to make sure that the Location latitude and Location longitude fields are configured in the Inventory section of the hosts which you wish to display on your map. Once that is done, simply deploy a new Geomap widget, filter the required hosts and you’re all set. Geomaps are currently available in the latest alpha release, so you can get some hands-on experience right now.

Q: Any specific performance improvements that we can discuss at this point for Zabbix 6.0 LTS?

A: There have been quite a few. From the frontend side – we have improved the underlying queries that are related to linking new templates, therefore the template linkage performance has increased. This will be very noticeable in large instances, especially when linking or unlinking many templates in a single go.
There have also been improvements to Server – Proxy communication. Specifically – the logic of how proxy frees up uncompressed data. We’ve also introduced improvements on the DB backend side of things – from general improvements to existing queries/logic, to the introduction of primary keys for history tables, which we are still extensively testing at this point.

Q: Will you still be able to change the type of information manually, in case you have some advanced preprocessing rules?

A: In Zabbix 6.0 LTS Zabbix will try and automatically pick the corresponding type of information for your item. This is a great UX improvement since you don’t have to refer to the documentation every time you are defining a new item. And, yes, you will still be able to change the type of information manually – either because of preprocessing rules or if you’re simply doing some troubleshooting.

Zabbix 6.0 LTS – The next great leap in monitoring by Alexei Vladishev / Zabbix Summit Online 2021

Post Syndicated from Alexei Vladishev original https://blog.zabbix.com/zabbix-6-0-lts-the-next-great-leap-in-monitoring-by-alexei-vladishev-zabbix-summit-online-2021/17683/

The Zabbix Summit Online 2021 keynote speech by Zabbix founder and CEO Alexei Vladishev focuses on the role of Zabbix in modern, dynamic IT infrastructures. The keynote speech also highlights the major milestones leading up to Zabbix 6.0 LTS and together we take a look at the future of Zabbix.

The full recording of the speech is available on the official Zabbix Youtube channel.

Digital transformation journey
Infrastructure monitoring challenges
Zabbix – Universal Open Source enterprise-level monitoring solution
Cost-Effectiveness
Deploy Anywhere
Monitor Anything
Monitoring of Kubernetes and Hybrid Clouds
Data collection and Aggregation
Security on all levels
Powerful Solution for MSPs
Scalability and High Availability
Machine learning and Statistical analysis
More value to users
New visualization capabilities
IoT monitoring
Infrastructure as a code
Tags for classification
What’s next?
Advanced event correlation engine
Multi DC Monitoring
Zabbix Release Schedule
Zabbix Roadmap
Questions

Digital transformation journey

First, let’s talk about how Zabbix plays a role as a part of the Digital Transformation journey for many companies.

As IT infrastructures evolve, there are many ongoing challenges. Most larger companies for example have a set of legacy systems that require to be integrated with more modern systems. This results in a mix of legacy and new technologies and protocols. This means that most management and monitoring tools need to support all of these technologies – Zabbix is no exception here.

Hybrid clouds, containers, and container orchestration systems such as K8S and OpenShift have also played an immense part in the digital transformation of enterprises. It has been a very major paradigm shift – from physical machines to virtual machines, to containers and hybrid parts. We certainly must provide the required set of technologies to monitor such environments and the monitoring endpoints unique to them.

The rapid increase in the complexity of IT infrastructures caused by the two previous points requires our tools to be a lot more scalable than before. We have many more moving parts, likely located in different locations that we need to stay aware of. This also means that any downtime is not acceptable – this is why the high availability of our tools is also vital to us.

Let’s not forget that with increased complexity, many new potential security attack vectors arise and our tools need to support features that can help us with minimizing the security risks.

But making our infrastructures more agile usually comes at a very real financial cost. We must not forget that most of the time we are working with a dedicated budget for our tools and procedures.

Infrastructure monitoring challenges

The increase in the complexity of IT infrastructures also poses multiple monitoring challenges that we have to strive to overcome:

  • Requirements for scalability and high availability for our tools
    • The growing number of devices and networks as well as the increased complexity of IT infrastructures
  • Increasingly complex infrastructures often force us to utilize multiple tools to obtain the required metrics
    • This leads to a requirement for a single pane of glass to enable centralized monitoring
  • Collecting values is often not enough – we need to be able to leverage the collected data to gain the most value out of it
  • We need a solution that can deliver centralized visualization and reporting based on the obtained data
  • Our tools need to be hand-picked so that they can deliver the best ROI in an already complex infrastructure

Zabbix – Universal Open Source enterprise-level monitoring solution

Zabbix is a Universal free and Open Source enterprise-level monitoring solution. The tool comes at absolutely no cost and is available for everyone to try out and use. Zabbix provides the monitoring of modern IT infrastructures on multiple levels.

Universal is the term that we are focusing on. Given the open-source nature of the product, Zabbix can be used in infrastructures of different sizes – from small and medium organizations to large, globe-spanning enterprises. Zabbix is also capable of delivering monitoring of the whole IT stack – from hardware and network monitoring to high-level monitoring such as Business Service monitoring and more.

Cost-Effectiveness

Zabbix delivers a large set of enterprise-grade features at no cost! Features such as 2FA, Single sign-on solutions, no restrictions when it comes to data collection methods, number of monitored devices and services, or database size.

  • Exceptionally low total cost of ownership
    • Free and Open Source solution with quality and security in mind
    • Backed by reliable vendors, a global partner network, and commercial services, such as the 24/7 support
    • No limitations regarding how you use the software
    • Free and readily available documentation, HOWTOs, community resources, videos, and more.
    • Zabbix engineers are easy to find and hire for your organization
    • Cost is fully under your control – Zabbix Commercial services are under fixed-price agreements

Deploy Anywhere

Our users always have the choice of where and how they wish to deploy Zabbix. With official packages for the most popular operating systems such as RHEL, Oracle Linux, Ubuntu, Raspberry Pi OS, and more. With official Helm charts, you can quickly also deploy Zabbix in a Kubernetes cluster or in your OpenShift instance. We also provide official Docker container images with pre-installed Zabbix components that you can deploy in your environment.

We also provide one-click deployment options for multiple cloud service providers, such as Amazon AWS, Microsoft Azure, Google Cloud, Openstack, and many other cloud service providers.

Monitor Anything

With Zabbix, you can monitor anything – from legacy solutions to modern systems. With a large selection of official solutions and substantial community backing our users can be sure that they can find a suitable approach to monitor their IT infrastructure components. There are hundreds of ready-to-use monitoring solutions by Zabbix.

Whenever you deploy a new IT solution in your enterprise, you will want to tie it together with the existing toolset. Zabbix provides many out of the box integrations for the most popular ticketing and alerting systems

Recently we have introduced advanced search capabilities for the Zabbix integrations page, which allows you to quickly lookup the integrations that currently exist on the market. If you visit the Zabbix integrations page and look up a specific vendor or tool, you will see a list of both the official solutions supported by Zabbix and also a long list of community solutions backed by our users, partners, and customers.

Monitoring of Kubernetes and Hybrid Clouds

Nowadays many existing companies are considering migrating their existing infrastructure to either solutions such as Kubernetes or OpenShift, or utilizing cloud service providers such as Amazon AWS or Microsoft Azure.

I am proud to announce, that with the release of Zabbix 6.0 LTS, Zabbix will officially support out-of-the-box monitoring of OpenShfit and Kubernetes clusters.

Data collection and Aggregation

Let’s cover a few recent features that improve the out-of-the-box flexibility of Zabbix by a large margin.

Synthetic monitoring is a feature that was introduced a year ago in Zabbix version 5.2 and it has already become quite popular with our user base. The feature enables monitoring of different devices and solutions over the HTTP protocol. By using synthetic monitoring Zabbix can connect to your HTTP endpoints, such as cloud APIs, Kubernetes, and OpenShift APIs, and other HTTP endpoints, collect the metrics and then process them to extract the required information. Synthetic monitoring is extremely transparent and flexible – it can be fine-tuned to communicate with any HTTP endpoints.

Another major feature introduced in Zabbix 5.4 is the new trigger syntax. This enables our users to define much more flexible trigger expressions, supporting many new problem detection use cases. In addition, we can use this syntax to perform flexible data aggregation operations. For example, now we can aggregate data filtered by wildcards, tags, and host groups, instead of specifying individual items. This is extremely valuable for monitoring complex infrastructures, such as Kubernetes or cloud environments. At the same time, the new syntax is a lot more simple to learn and understand when compared to the old trigger syntax.

Security on all levels

Many companies are concerned about security and data protection when it comes to the tools that they are using in their day-to-day tasks. I’m happy to tell you that Zabbix follows the highest security standards when it comes to the development and usage of the product.

Zabbix is secure by design. In the diagram below you can see all of the Zabbix components, all of which are interconnected, like Zabbix Agent, Server, Proxy, Database, and Frontend. All of the communication between different Zabbix components can be encrypted by using strong encryption protocols like TLS.

If you’re using Zabbix Agent, the agent does not require root privileges. You can run Zabbix Agent under a normal user with all of the necessary user level restrictions in place. Zabbix agent can also be restricted with metric allow and deny lists, so it has access only to the metrics which are permitted for collection by your company policies.

The connections between the Zabbix database backend and the Zabbix Frontend and Zabbix Server also support encryption as of version 5.0 LTS.

As for the frontend component – users can add an additional security layer for their Zabbix frontends by configuring 2FA and SSO logins. Zabbix 6.0 LTS also introduces flexible login password complexity requirements, which can reduce the security breach risk if your frontend is exposed to the internet. To ensure that Zabbix meets the highest standards of the company security compliance, the new Audit log, introduced in Zabbix 6.0 LTS, is capable of logging all of the Zabbix Frontend and Zabbix Server operations.

For an additional security layer – sensitive information like Usernames, Passwords, API keys can be stored in an external vault. Currently, Zabbix supports secret storage in the HashiCorp Vault. Support for the CyberArk vault will be added in the Zabbix 6.2 release.

Another Zabbix feature – the Zabbix API, is often used for the automation of day-to-day configuration workflows, as well as custom integrations and data migration tasks. Zabbix 5.4 added the ability to create API tokens for particular frontend users with pre-defined token expiration dates.

In Zabbix 5.2 we added another layer for the Zabbix Frontend user permissions – User Roles. Now it is possible to define granular user roles with different types of rights and privileges, assigned to specific types of users in your organization. With User Roles, we can define which parts of the Zabbix UI the specific user role has access to and which UI actions the members of this role can perform. This can be combined with API method restrictions which can also be defined for a particular role.

Powerful Solution for MSPs

When we combine all of these features, we can see how Zabbix becomes a powerful solution for MSP customers. MSPs can use Zabbix as an added value service. This way they can provide a monitoring service for their customers and get additional revenue out of it. It is possible to build a customer portal which is a combination of User Roles for read-only access to dashboards and customized UI, rebranding option – which was just introduced in Zabbix 6.0 LTS, and a combination of SLA reporting together with scheduled PDF reports, so the customers can receive reports on a weekly, daily or monthly basis.

Scalability and High Availability

With a growing number of devices and ever-increasing network complexity, Scalability and High availability are extremely important requirements.

Zabbix provides Load balancing options for Zabbix UI and Zabbix API. In order to scale the Zabbix Frontend and Zabbix API, we can simply deploy additional Zabbix Frontend nodes, thus introducing redundancy and high availability.

Zabbix 6.0 LTS comes with out-of-the-box support for the Zabbix Server High Availability cluster. If one of the Zabbix Server nodes goes down, Zabbix will automatically switch to one of the standby nodes. And the best thing about the Zabbix Server High Availability cluster – it takes only 5 minutes to get it up and running. the HA cluster is very easy to configure and use.

One of the features in our future roadmap is introducing support for the History API to work with different time-series DB backends for extra efficiency and scalability. Another feature that we would like to implement in the future is load balancing for Zabbix Servers and Zabbix Proxies. Combining all of these features would truly make Zabbix a cloud-native application with unlimited horizontal scalability.

Machine learning and Statistical analysis

Defining static trigger thresholds is a relatively simple task, but it doesn’t scale too well in dynamic environments. With Machine Learning and Statistical Analysis, we can analyze our data trends and perform anomaly detection. This has been greatly extended in Zabbix 6.0 LTS with Anomaly Detection and Baseline Monitoring functionality.

Zabbix 6.0 Adds an extended set of functions for trend analysis and trend prediction. These support multiple flexible parameters, such as the ability to define seasonality for your data analysis. This is another way how to get additional insights out of the data collected by Zabbix

More value to users

When I think about the direction that Zabbix is headed in, and look at the Zabbix roadmap, one of the main questions I ask is “How can we deliver more value to our enterprise users?”

In Zabbix 6.0 LTS we made some major steps to make Zabbix fit not only for infrastructure monitoring but also fit for Business Service monitoring – the monitoring of services that we provide for our end-users or internal company users. Zabbix 6.0 LTS comes with complex service level object definitions, real-time SLA reporting, multi-tenancy options, Business Service alerting options, and root cause and Impact analysis.

New visualization capabilities

It is important to present the collected data in a human-readable way. That’s why we invest a lot of time and effort in order to improve the native visualization capabilities. In Zabbix 6.0 LTS we have introduced Geographical Maps together with additional widgets for TOP N reporting and templated and multi-page dashboards.

The introduction of reports in Zabbix 5.2 allowed our users to leverage their Zabbix Dashboards to generate scheduled PDF reports with respect to user permissions. Our users can generate daily, weekly, monthly or yearly reports and send them to their infrastructure administrators or customers.

IoT monitoring

With the introduction of support for Modbus and MQTT protocols, Zabbix can be used to monitor IoT devices and obtain environmental information from different sensors such as temperature, humidity, and more. In addition, Zabbix can now be used to monitor factory equipment, building management systems, IoT gateways, and more.

Infrastructure as a code

With IT infrastructures growing in scale, automation is more important than ever. For this reason, many companies prefer preserving and deploying their infrastructure as code. With the support of YAML format for our templates, you can now keep them in a git repository and by utilizing CI/CD tools you can now deploy your templates automatically.

This enables our users to manage their templates in a central location – the git repository, which helps users to perform change management and versioning and then deploy the template to Zabbix by using CI/CD tools.

Tags for classification

Over the past few versions, we have made a major push to support tags for most Zabbix entities. The switch from applications to tags in Zabbix 5.4 made the tool much more flexible. Tags can now be used for the classification of items, triggers, hosts, business services. The tags that the users define can also be used in alerting, filtering, and reporting.

What’s next?

You’re probably wondering – what’s coming next? What are the main vectors for the future development of Zabbix?

First off – we will continue to invest in usability. While the tool is made by professionals for professionals, it is important for us to make using the tool as easy as possible. Improvements to the Zabbix Frontend, general usability, and UX can be expected very soon.

We plan to continue to invest in the visualization and reporting capabilities of Zabbix. We want all data collected by our monitoring tool to provide information in a single pane of glass. This way our users can see the full picture of their environment while also seeing the root cause analysis for the ongoing problems that we face. This way we can get most of the data that Zabbix collects.

Extending the scope of monitoring is an ongoing process for us. We would like to implement additional features for compliance monitoring. I think that we will be able to introduce a solution for application performance monitoring very soon. We’d like to make log monitoring more powerful and comprehensive. monitoring of public and private clouds is also very important for us, given the current IT paradigms.

We’d like to make sure that Zabbix is absolutely extendable on all levels. While we can already extend Zabbix with different types of plugins, webhooks, and UI modules there’s more to come in the near future.

The topic of high availability, scalability, and load balancing is extremely important to us. We will continue building on the existing foundations to make Zabbix a truly cloud-native solution.

Advanced event correlation engine

Advanced event processing is a really important topic. When we talk about a monitoring solution, we pay very much attention to the number of metrics that we are collecting. We mustn’t forget, that for large-scale environments the number of events that we generate based on those metrics is also extremely important. We need to keep control and manage the ever-growing number of different events coming from different sources. This is why we would like to focus on noise reduction, specifically – root cause analysis.

For this reason, we can expect Zabbix to introduce an advanced event correlation model in the future. This model should have the ability to filter and deduplicate the events as well as perform event enrichment, thus leading to a much better root cause analysis.

Multi DC Monitoring

Currently, Multi DC monitoring can be done with Zabbix by deploying a distributed Zabbix instance that utilizes Zabbix proxies. But there are use cases, where it would be more beneficial to have multiple Zabbix servers deployed across different datacenters – all reporting to a single location for centralized event processing, centralized visualization, and reporting as well as centralized dashboards. This is something that is coming soon to Zabbix.

Zabbix Release Schedule

Of course, the burning question is – when is Zabbix 6.0 LTS going to be released? And we are very close to finalizing the next LTS release. I would expect Zabbix 6.0 LTS to be officially released in January 2022.

As for Zabbix 6.2 and 6.4 – these releases are still planned for Q2 and Q4, 2022. The next LTS release – Zabbix 7.0 LTS is planned to be released in Q2, 2023.

Zabbix Roadmap

If you want to follow the development of Zabbix – we have a special page just for that – the Zabbix Roadmap. Here you can find up-to-date information about the development plans for Zabbix 6.2, 6.4, and 7.0 LTS. The Roadmap also represents the current development status of Zabbix 6.0 LTS.

Questions

Q: What would you say is the main benefit of why users should migrate from Zabbix 5.0/4.0 or older versions to 6.0 LTS?

A: I think that Zabbix 6.0 LTS is a very different product – even when you compare it with the relatively recent Zabbix 5.0 LTS. It comes with many improvements, some of which I mentioned here in my keynote. For example, Business Service monitoring provides huge added value to enterprise customers.

With the new trigger syntax and the new functions related to anomaly detection and baseline monitoring our users can get much more out of the data that they already have in their monitoring tool.

The new visualization options – multiple new widgets, geographical maps, scheduled PDF reporting provide a lot of added value to our end-users and to their customers as well.

Q: Any plans to make changes on the Zabbix DB backend level – make it more scaleable or completely redesign it?

A: Right now we keep all of our information in a relational database such as MySQL or PostgreSQL. We have added the support for TimescaleDB which brings some huge advantages to our users, thanks to improved data storage and performance efficiency.

But we still have users that wish to connect different storage engines to Zabbix – maybe specifically optimized to keep time-series data. Actually, this is already on our roadmap. Our plan is to introduce a unified API for historical data so that if you wish to attach your own storage, we just have to deploy a plugin that will communicate both with our historical API and also talk to the storage engine of your choosing. This feature is coming and is already on our Roadmap.

Q: What is your personal favorite feature? Something that you 100% wanted to see implemented in Zabbix 6.0 LTS?

A: I see Zabbix 6.0 LTS as a combination of Zabbix 5.2, 5.4, and finally the features introduced directly in Zabbix 6.0 LTS. Personally, I think that my favorite features in Zabbix 6.0 LTS are features that make up the latest implementation of Anomaly detection.

We could be at the very beginning of exploring more advanced machine learning and statistical analysis capabilities, but I’m pretty sure that with every new release of Zabbix there will be new features related to machine learning, anomaly detection, and trend prediction.

This could provide a way for Zabbix to generate and share insights with our users. Analysis of what’s happening with your system, with your metrics – how the metrics in your system behave.

Summary of Zabbix Summit Online 2021, Zabbix 6.0 LTS release date and Zabbix Workshops

Post Syndicated from Arturs Lontons original https://blog.zabbix.com/summary-of-zabbix-summit-online-2021-zabbix-6-0-lts-release-date-and-zabbix-workshops/17155/

Now that the Zabbix Summit Online 2021 has concluded, we are thrilled to report we hosted attendees from over 3000 organizations from more than 130 countries all across the globe.

This year, the main focus of the speeches was the upcoming Zabbix 6.0 LTS release, as well as speeches focused on automating Zabbix data collection and configuration, Integrating Zabbix within existing company infrastructures, and migrating from legacy tools to Zabbix. 21 speakers in total presented their use cases and talked about new Zabbix features during the Summit with over 8 hours of content.

In case you missed the Summit or wish to come back to some of the speeches – both the presentations (in PDF format) and the videos of the speeches are available on the Zabbix Summit Online 2021 Event page.

Zabbix 6.0 LTS release date

As for Zabbix 6.0 LTS – as per our statement during the event, you can expect Zabbix 6.0 LTS to release in early 2022. At the time of this post, the latest pre-release version is Zabbix 6.0 Alpha 7, with the first Beta version scheduled for release VERY soon. Feel free to deploy the latest pre-release version and take a look at features such as Geomaps, Business Service monitoring, improved Audit log, UX improvements, Anomaly detection with Machine Learning, and more! The list of the latest released Zabbix 6.0 versions as well as the improvements and fixes they contain is available in the Release notes section of our website.

Zabbix 6.0 LTS Workshops

The workshops will focus on particular Zabbix 6.0 LTS features and will be available once the Zabbix 6.0 LTS is released. The workshops will provide a unique chance to learn and practice the configuration of specific Zabbix 6.0 LTS features under the guidance of a certified Zabbix trainer at absolutely no cost! Some of the topics covered in the workshops will include – Deploying Zabbix server HA cluster, Creating triggers for Baseline monitoring and Anomaly detection, Displaying your infrastructure status on Geomaps, Deploying Business Service monitoring with root cause analysis, and more!

Upcoming events

But there’s more! On December 9 2021 Zabbix will host PostgreSQL Monitoring Day with Zabbix & Postgres Pro. The speeches will focus on monitoring PostgreSQL databases, running Zabbix on PostgreSQL DB backends with TimescaleDB, and securing your Zabbix + PostgreSQL instances. If you’re currently using PostgreSQL DB backends r plan to do so in the future – you definitely don’t want to miss out!

As for 2022 – you can expect multiple meetups regarding Zabbix 6.0 LTS features and use cases, as well as events focused on specific monitoring use cases. More information will be publicly available with the release of Zabbix 6.0 LTS.