Tag Archives: Zabbix Summit

Expand your Knowledge at Zabbix Summit 2025

Post Syndicated from Michael Kammer original https://blog.zabbix.com/expand-your-knowledge-at-zabbix-summit-2025/31168/

October is just around the corner, and that annual shift into Q4 can mean only one thing – it’s almost Summit time! Zabbix Summit 2025 will take place on October 8-10 in Riga, Latvia at the Radisson Blu Hotel Latvija, and it’s shaping up to be the perfect blend of established traditions and fresh approaches – we’ve been at this for a (very lucky) 13 years now, and we’d like to think we’ve kept the aspects of the Summit experience that everyone knows and loves while adding a few twists! Here’s what you can expect for the price of admission:

Top-tier presentations from Zabbix leaders and experts

The learning begins with Zabbix Founder and CEO Alexei Vladishev’s keynote speech, which promises to be an “info drop” full of details about upcoming releases, new features, and what Alexei sees on the horizon for Zabbix. From there, it will be time for over 30 main stage speakers spread across two days of conference action. Some of the highlights include:

Presentations from Zabbix experts on topics like:

  • Turning playbooks into automated action plans
  • Streaming metrics for multiple tenants without chaos
  • Syncing systems painlessly
  • Maintaining control over massive amounts of Zabbix data
  • Detecting and responding to security threats before they escalate

Deep dives that will show you how to:

  • Spot the blind spots in large-scale networks (and fix them)
  • Keep tabs on Zabbix itself (after all, even monitoring needs monitoring)
  • Take full control of tag management
  • Use Zabbix Proxy to scale without breaking a sweat

Practical case studies, including:

  • Turning sensor data into insights with AI
  • Keeping SAP environments and multisite clusters in check
  • Transforming enterprise-level monitoring
  • Supercharging operations via migration projects
  • Making discovery, correlation, and AI work together for smart monitoring in action

Expect all this, plus inside information from the Zabbix team on the path to becoming a Zabbix partner and how Zabbix services can help you scale efficiently. As if that weren’t enough, this year’s Summit will also feature special guest Dylan Beattie! A Software Development Consultant and Founder of Ursatile, Dylan is an international keynote speaker, and a long-time contributor to the open-source community.

At the Summit, Dylan will give a talk titled “Open Source, Open Minds. The Cost of Free Software.” Expect stories about why developers choose to give their code away, what happens when they change their minds, the quirks of licenses and legalities, and the big question of whether open source can ever be truly sustainable.

Dedicated Dev and Community tracks

Created by developers and for developers, the Dev Track makes its debut this year and brings together some of the top minds on the Zabbix development team to cover topics as diverse as extending Zabbix Agent 2 with custom plugins, enhanced widget development, and template design best practices.

For attendees of a slightly less technical persuasion, the Community Track is there to facilitate author-led discussions about community-driven content and resources, like the Zabbix Book. Assembled by longtime Zabbix enthusiasts Patrik Uytterhoeven, Brian van Baekel, and Nathan Liefting, the Zabbix Book will get its own breakout room, where Summit attendees can brainstorm in small groups about how to improve the book via new ideas and topics.

Hands-on workshops

The Summit experience has always been about finding opportunities to put theory into practice, and this year’s workshops showcase the latest features and use cases in action. Attendees will be able to dive into workshops on AI-powered monitoring with Zabbix and ESP32, nested LLDs (low-level discovery), reducing alert noise, diagnosing performance issues with Diaginfo, and using Netflow integration via H5 Network. It’s a rare opportunity to confirm your knowledge retention by performing real-world tasks under the guidance of workshop hosts and their assistants.

Training and certification (yes, with discounts!)

A Zabbix Summit is the perfect place to get recognized as a Zabbix specialist or professional by taking part in Zabbix Certified Training sessions and exams at bargain prices. These one-day courses will be held from October 6 through October 13:

  • Automation and Integration with Zabbix API
  • Advanced Zabbix Database Monitoring
  • Advanced Zabbix SNMP Monitoring
  • Zabbix Certified Specialist Upgrade
  • Zabbix Certified Professional Upgrade

If you find yourself in Riga after the Summit, it’s worth your time to take part in the full Zabbix Certified Specialist course scheduled for October 13-17. Please remember that you can choose more than one training course and also keep in mind that you can attend the courses (without the 50% Summit discount) even if you’re not joining us at the Summit. You can register for all training sessions and exams here.

Networking and community building

A big part of what makes a Zabbix Summit a Zabbix Summit is the vibe – a big, global community coming together to catch up with old friends, welcome new members, and celebrate a certain open-source monitoring solution that brings us all together. That atmosphere of conviviality is exactly what makes a Summit such a one-of-a-kind networking opportunity. We’ve put together an open house visit and three evening events that are the ideal places to connect with like-minded monitoring enthusiasts, show off your skills, or get your company’s name in front of industry decision-makers.

This year’s Zabbix Open House on October 8 is your chance to see where the magic happens – drop by our offices and chat with our team members, grab yourself a coffee in our kitchen, and take part in a quiz that will teach even the most seasoned Zabbix fans a few new fun facts.

No summit would be complete without its events, and the opening event of Zabbix Summit 2025 on October 8 will take place at Riga’s renowned Monkey Club, with delicious fusion cuisine, a broad selection of cocktails and beverages, and a chance to unwind in style with your fellow Summit attendees.

The main event on October 9 is hosted by the Tallinn Quarter Hangar, which boasts a concert hall as well as a modern, open-plan street food kitchen and bar that are guaranteed to offer something for everyone.

On October 10, Zabbix Summit 2025 will wrap up at downtown Riga’s Burzma food hall, which offers 10 restaurants and a bar serving up a broad range of flavors from every corner of the globe. It’s the perfect location to relive Summit highlights in the company of your fellow Zabbix enthusiasts, and we’re looking forward to seeing you there!

Can’t make it? There’s always YouTube

A Zabbix Summit is one of those “you had to be there” events, but if you can’t make it to Riga, no worries – as in previous years, we’re going to be livestreaming all the speeches on our YouTube channel! Find out more and subscribe to the livestream here.

The post Expand your Knowledge at Zabbix Summit 2025 appeared first on Zabbix Blog.

Zabbix Summit 2024 in Review

Post Syndicated from Michael Kammer original https://blog.zabbix.com/zabbix-summit-2024-in-review/28901/

In what has become a highly anticipated annual tradition, Zabbix employees, partners, users, and just plain fans from every corner of the globe showed up in Riga on October 4 and 5 for Zabbix Summit 2024, celebrating a very special open-source monitoring solution that unites us all.

The 12th in-person version of our premier yearly event saw delegates arrive from 48 different countries, and just as every year the atmosphere was like a family reunion, with old friends reconnecting, remote colleagues meeting for the first time in person, and plenty of good vibes all around.

In case you couldn’t manage to make it to Riga and participate, fear not – we’ve put together this post to try to give you a taste of what Zabbix Summit 2024 was all about. As long-time Zabbix veterans say, “There’s nothing like a Zabbix Summit!”

And now, a word from our sponsors

Zabbix Summit 2024 could never have happened without the assistance of our featured sponsors, all part of Zabbix’s official partner network:

initMAX – Diamond Sponsor
IntelliTrend – Platinum Sponsor
IZI-IT – Platinum Sponsor
Quadrata – Platinum Sponsor
Allenta – Gold Sponsor
Metricio – Gold Sponsor
Docomo Business – Gold Sponsor
SRA OSS – Silver Sponsor
Inqbeo – Lunch and coffee break sponsor

Opening doors and minds alike

The day before the Summit, our team welcomed dozens of guests to our office for the traditional pre-Summit Open-Door day. We provided a whiteboard where attendees could leave their thoughts about Zabbix, set up a special Zabbix quiz, and organized a guided tour of the office. Countless questions were asked and answered, endless cups of coffee were poured, and a friendly, welcoming vibe was established that lasted through the end of the Summit and beyond.

Live from the main stage

This year’s Summit hosted 36 speakers who gave 34 speeches. Allowing our audience to ask questions during live Q&A sessions proved so popular last year that there was little doubt we’d continue it this year, as it promotes audience participation and keeps the speakers themselves on their toes. Here are brief recaps of a few of the standout speeches:

Zabbix Cloud and the way forward

Our CEO and Founder Alexei Vladishev kicked off the presentations with a keynote speech that introduced Zabbix Cloud and listed all the features that make it a secure, flexible, and functional alternative to the Zabbix we all know and love. Alexei also gave a short preview of the upcoming Zabbix 7.2 version. Stay tuned!

An intro to Zabbix Cloud

Zabbix Head of Product Dmitrijs Lamberts provided a deeper dive into Zabbix Cloud, sharing detailed information on how Zabbix Cloud works, providing insight into the pricing tiers, and explaining all the features and benefits that have the Zabbix community buzzing with excitement. The talk whetted the audience’s appetite for the live demo he conducted on Day 2 of the Summit.

Using Zabbix to monitor solar energy

Mitsuhiro Ono of the Toyota Motor Corporation and Toshihiro Akamatsu of SRA OSS showed how they achieve distributed monitoring with Zabbix. They also demonstrated a case study that showed exactly how they use their Zabbix dashboard to provide the kind of detailed solar energy oversight that makes the adoption of green power in Japan possible.

Keeping tabs on MariaDB

On Day 2 of the Summit, Anders Karlsson of the MariaDB Corporation discussed how to monitor and manage MariaDB Server Clusters running with the MariaDB MaxScale database proxy. He also demonstrated how MariaDB MaxScale (which also monitors the MariaDB Servers) comes into the picture and touched on topics like managing failover, monitoring database traffic, and routing and load balancing.

Meeting security challenges with Zabbix

Gabriele Minniti and Vincenzo Morrone of Whysecurity demonstrated how the power of the Zabbix API can combine with the APIs of the vendors they work with to create centralized dashboards for controlling the cybersecurity posture of all their customers, no matter what technologies are being used.

The business track

For the first time at Zabbix Summit 2024, we constructed a second stage for slightly less technical, more business-oriented presentations. The speeches delivered there were among the most thought-provoking and fascinating of the Summit, and did much to help us reach new audiences. Here are a few highlights:

An evolving IT monitoring landscape

Zabbix LatAm CEO Luciano Alves gave a well-received talk that focused on the latest trends in the global monitoring market, presenting the results of a survey that took the pulse of over 100 global enterprise organizations.

Zabbix for managed service providers (MSPs)

Andre Morton of AGM Network Consultancy explained why having a flexible and scalable monitoring solution is vital for MSPs and showed how a variety of different features, from authentication mechanisms to automatic remediation and visualization, work together to make Zabbix the perfect monitoring solution for MSPs.

Yes, we still had time for fun!

The Zabbix community works together, innovates together, and when it’s time to let off steam they have fun together at our Summit networking events, of which there were three this year:

  • This year’s welcome event was held at the architecturally stunning National Library of Latvia – or as Latvians call it, “The Castle of Light.” Tasty beverages and delicious food were on the menu, as was a guided tour of the library itself.
  • The main event was held at Riga’s famed Fantadroms Concert and Event Space, where attendees could dance to live music and enjoy more food and drinks as they caught up with friends and forged valuable connections with their counterparts in other organizations.
  • We sent the community on their way with a closing event at the Burzma food hall in Riga’s old town, with a cornucopia of food and music from around the world as well as plenty of opportunities for attendees to relive the Summit, have a few laughs, and wish each other well until next year!

Couldn’t make it this year? No problem!

At this point, you’re probably regretting that you didn’t manage to attend the Summit, but don’t worry – you can recreate the atmosphere in the privacy of your own home or office!

Recordings of both days are available on Zabbix’s YouTube channel:

Streaming – Zabbix Summit 2024 Day 1 
Streaming – Zabbix Summit 2024 Day 2

The slides and texts of the presentations are also available for reference and download here.

Whether you attended in person or streamed the Summit online, we hope that you had a great time, learned a lot, and are eager to do it all again in 2025!

The post Zabbix Summit 2024 in Review appeared first on Zabbix Blog.

Blending Zabbix and AI with Tomáš Heřmánek

Post Syndicated from Michael Kammer original https://blog.zabbix.com/blending-zabbix-and-ai-with-tomas-hermanek/28832/

Zabbix Summit 2024 is only a few days away, which means that it’s time for the last of our interviews with Summit speakers. Our final chat this year is with Tomáš Heřmánek, the CEO and Founder of initMAX s.r.o. We asked him about his beginnings in the tech industry, how he got started with Zabbix, and how AI will change the game for monitoring in general and Zabbix in particular.

Please tell us a bit about yourself and the journey that led you to initMAX.

My journey in the IT field started with small ISPs and later took a significant leap into the world of Linux and application management, where the need for effective monitoring became evident. I worked for a company that prioritized high-quality open-source solutions, and it was during this time that we adopted Zabbix version 1.8 as a replacement for Nagios, which we found to be inflexible. Shortly after our deployment, Zabbix 2.0 was released. It introduced JMX monitoring, which was crucial for us. Since then, Zabbix has been our go-to solution for monitoring.

I set a personal goal to master this outstanding monitoring system and participated in the first official Zabbix training in the Czech Republic, where I earned my initial certifications as a Zabbix Specialist and Professional on version 3.0. The training experience drew me deeper into the world of Zabbix, especially after meeting a burgeoning group of enthusiasts in the country. I felt compelled to give back to the community that had supported me.

How long have you been using Zabbix? What kind of Zabbix-related tasks does your team tackle on a daily basis?

When I started my own company, becoming a Zabbix partner was a natural choice. To further contribute to the community, I pursued the Expert and Trainer certifications. It was the most challenging 14 days of my life, but it was worth it. For anyone serious about Zabbix, I highly recommend participating in official training sessions and actively engaging with the community through forums, local groups, Telegram, WhatsApp, and blogs. This commitment to supporting and further strengthening the community is also why we created our own wiki, which is accessible to everyone without restrictions.

Can you give us a few clues about what we can expect to hear during your Zabbix Summit presentation?

This year, I have prepared a demonstration for the Zabbix Summit showcasing how we integrate AI into our operations, including various modifications to the web interface that allow us to automate and streamline routine tasks. Besides showcasing these innovations, we will also be making some parts of our work available to the public. The main focus of my presentation will be on problem identification, automating the creation of preprocessing steps, and using a chatbot for creating hosts, reading configurations, and making modifications. Essentially, it’s a smart assistant and guide all in one.

The final section, which we find the most challenging, deals with automated event correlation and the creation of a topology, from which the correlations are partially derived and evaluated. We are using the new Zabbix 7.0 feature – root cause and symptoms – for visualization in Zabbix. Our goal is to showcase not only the capabilities of Zabbix in combination with AI, but also to contribute back to the community by sharing some of these developments freely.

In your experience, does Zabbix lend itself easily to enhancement via AI?

AI is something that truly fascinates us and is currently shaping the world. From our experience, we believe that the possibilities are limited only by our imagination. In the future, I can envision AI autonomously discovering elements that need to be monitored, integrating them into Zabbix, and configuring everything necessary for effective monitoring.

What changes do you think AI will bring to the world of monitoring in general over the next decade or so?

I foresee a shift in our roles, moving away from traditional IT tasks towards a focus on idea generation, control, and the customization of artificial intelligence. As AI continues to evolve, it will not only enhance automation but also empower us to explore and implement innovative solutions more effectively.

The post Blending Zabbix and AI with Tomáš Heřmánek appeared first on Zabbix Blog.

Zabbix for MSPs with Andre Morton

Post Syndicated from Michael Kammer original https://blog.zabbix.com/zabbix-for-msps-with-andre-morton/28748/

To help make sure that everyone’s up to speed with Zabbix Summit 2024 speakers and their topics, we’re continuing our series of interviews with Andre Morton of AGM Network Consultancy LTD. Keep reading to learn how he feels Zabbix can alleviate the typical pain points of managed service providers (MSPs), see how he uses Zabbix to maintain control of his network, and find out what he appreciates most about Zabbix.

Please tell us a bit about yourself and the journey that led you to AGM Network Consultancy LTD.

I started out studying Network Engineering at the University of Greenwich, and then went on to undertake a Master’s of Networks and Security at the University of Kent. During my Master’s, I was a one-man IT Team for a Child Care agency spanning the UK. I then went on to work at three small IT companies/MSPs, being the only Network Engineer at each company and managing networks with 80 – 200 customers.

How long have you been using Zabbix? How has it impacted your everyday tasks?

I have been using Zabbix for about 10 years now. At first, I just used it to get insights via SNMP. Then I began using it to create visual troubleshooting aids for myself and non-networking team members. Finally, I began using Zabbix as my main inventory gathering tool for networking and infrastructure devices. When it comes to that, Zabbix has enabled me to control how I want to monitor the network, avoiding vendor limitations and allowing me to build my own scripts to run tests and actions that I would not otherwise be able to do.

Can you give us a few clues about what we can expect to hear during your Zabbix Summit presentation?

I may have to condense some things, as I don’t want to be too technical or take too long! I’ll definitely talk about what drew me to Zabbix, how I used Zabbix to turn problems that require a large amount of attention and time into scripts that can identify the problems and capture the problem states, and how Zabbix dashboards help me to get a clear overview of customer and site problems/general status. I’ll also speak about scripts that we now use to troubleshoot and undertake remote actions, give examples of what the value of the monitoring data is to MSPs before and after the problem, and let everyone in on my upcoming plans for Zabbix, which include webhooks from the map, scripts, Zabbix’s place in our bespoke systems, and network automation.

What, in your opinion, are the biggest pain points MSPs have, and how can Zabbix help alleviate them?

I’d say that there are two big pain points that Zabbix is of assistance with – providing troubleshooting time for big problems, and making sure that historical data is ready for troubleshooting.

What do you appreciate the most about Zabbix in your role?

Zabbix allows me to drastically reduce the amount of administration and troubleshooting that I have to undertake and provides a live inventory of devices (software/firmware details). Thanks to Zabbix, I don’t have to use multiple tools or log into multiple devices to get software and firmware version details.
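As a rough illustration of the live inventory described above, the sketch below builds a Zabbix API `host.get` request that asks for a few inventory fields and flattens a response into a host-to-software table. The host names, version strings, and sample response are invented for illustration, authentication and the HTTP call are left out, and hosts with inventory disabled are assumed to come back with an empty list. It is not necessarily how Andre’s own setup works.

```python
import json


def build_inventory_request(request_id=1):
    """Build a JSON-RPC body for the Zabbix API host.get method."""
    return {
        "jsonrpc": "2.0",
        "method": "host.get",
        "params": {
            "output": ["hostid", "host"],
            # Ask only for the inventory fields we care about.
            "selectInventory": ["os", "software", "hardware"],
        },
        "id": request_id,
    }


# Hypothetical response data; real values would come from the API.
SAMPLE_RESULT = [
    {"hostid": "10101", "host": "core-sw01",
     "inventory": {"os": "ExampleOS", "software": "fw 1.2.3", "hardware": "Model A"}},
    # Assumption: a host with inventory disabled returns an empty list.
    {"hostid": "10102", "host": "edge-fw01", "inventory": []},
]


def software_by_host(result):
    """Map each host name to its recorded software/firmware string."""
    table = {}
    for host in result:
        inventory = host.get("inventory") or {}
        table[host["host"]] = inventory.get("software", "unknown")
    return table


request_body = json.dumps(build_inventory_request())
versions = software_by_host(SAMPLE_RESULT)
print(request_body)
print(versions)
```

Keeping the request builder separate from the response handling means the same table logic can be exercised against saved responses without touching a live Zabbix server.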

The post Zabbix for MSPs with Andre Morton appeared first on Zabbix Blog.

Monitoring MariaDB Clusters and MaxScale with Anders Karlsson

Post Syndicated from Michael Kammer original https://blog.zabbix.com/monitoring-mariadb-clusters-and-maxscale-with-anders-karlsson/28718/

The heart and soul of a Zabbix Summit is the wide range of expert speakers who show up each year to share their experience, knowledge, and discoveries. Accordingly, we’re continuing our series of interviews with Summit 2024 speakers by having a chat with MariaDB Sales Engineer Anders Karlsson. He’ll grace our stage at Summit 2024 to talk about his four decades of work experience and share how he uses a variety of Zabbix features to monitor MariaDB clusters and MariaDB MaxScale.

Please tell us a bit about yourself and the journey that led you to MariaDB.

I have been working with databases nearly all of my professional life, which is more than 40 years by now. My first IT job was as a system administrator on a development system for Telco equipment running UNIX on a PDP-11/70. This was fun, and I got to use UNIX very early (the early 1980s), and I was also there at the start of the Internet (by emailing through UUCP to the US and then through what was then the Internet).

Following that, I joined another Telco company, which used a rather unknown database technology called Oracle (version 4.1.4). When this company moved their operations from Stockholm (where I lived) to Luxembourg, I decided to leave and look for other opportunities. I heard that Oracle was looking for people and I got a job there as a support engineer. At Oracle I soon got involved with lots of things beyond Tech Support – I was a trainer, a consultant, and eventually a sales engineer.

I left Oracle in the early 1990s to join a small application development company as a developer, but this really wasn’t for me, so I soon left and joined Informix instead. I was at Informix until 1996 or so and then I worked for some other small companies around the end of the millennium. Next, I joined forces with a couple of old friends to develop a database solution. This wasn’t very successful, and I still needed a job.

I first ended up with TimesTen before they ran out of luck. After a year or so of freelancing, I was approached by an old friend from the Informix days who was now the sales manager for MySQL in Scandinavia. I joined MySQL in 2004 as a sales engineer and was there until Oracle took over. I then worked for a small Swedish startup for a couple of years, but I missed sales engineering, so when I got an offer to join MariaDB in 2012 I said yes.

How long have you been using Zabbix? What kind of Zabbix tasks do you get up to on a daily basis?

I have known about Zabbix and used it occasionally for a while, but while preparing for Zabbix Summit 2024 I have gotten to use it “in anger” a bit more. There are pros and cons to it, but in general I like it. It does have a lot of “Open Source” feel to it, but that is not really an issue for me.

Can you give us a few clues about what we can expect to hear during your Zabbix Summit presentation?

I will focus on monitoring MariaDB Clusters running Galera Cluster and the MariaDB MaxScale database proxy. Monitoring individual MariaDB servers is easy out of the box with Zabbix, but when you have a cluster you have to monitor certain cluster-wide attributes. MariaDB MaxScale keeps track of the state of each server in the cluster in detail, as well as of the cluster as a whole, and I will show how to pull cluster-wide data from MaxScale using the MaxScale REST/JSON API and how to use that to build triggers and graphs in Zabbix. I will finish up by doing a demo of this with MariaDB MaxScale and a Galera Cluster.
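To make the idea of cluster-wide data a little more concrete, here is a minimal sketch that reduces a MaxScale-style server list to a handful of numbers. The sample payload is hypothetical and only imitates the general shape of a `GET /v1/servers` response from the MaxScale REST API (a `data` array whose entries carry a comma-separated `state` string); the server names, states, and output keys are assumptions, and this is not the approach shown in the talk.

```python
import json
from collections import Counter

# Hypothetical sample shaped like a MaxScale REST API server list;
# the server names and states are made up for illustration.
SAMPLE_RESPONSE = """
{
  "data": [
    {"id": "db1", "type": "servers", "attributes": {"state": "Master, Synced, Running"}},
    {"id": "db2", "type": "servers", "attributes": {"state": "Slave, Synced, Running"}},
    {"id": "db3", "type": "servers", "attributes": {"state": "Down"}}
  ]
}
"""


def summarize_cluster(payload: str) -> dict:
    """Reduce per-server states to a few cluster-wide numbers."""
    servers = json.loads(payload)["data"]
    flags = Counter()
    for server in servers:
        # The state is a comma-separated list of flags, e.g. "Slave, Synced, Running".
        for flag in server["attributes"]["state"].split(","):
            flags[flag.strip()] += 1
    return {
        "servers_total": len(servers),
        "servers_running": flags["Running"],
        "servers_synced": flags["Synced"],
        "masters": flags["Master"],
    }


summary = summarize_cluster(SAMPLE_RESPONSE)
print(json.dumps(summary))
```

A flat summary like this is the kind of value a trigger can compare against a threshold, for example alerting when `servers_synced` drops below the expected node count.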

What led you to the topic of Monitoring MariaDB Clusters and MariaDB MaxScale with Zabbix?

The main thing was that although there are community-provided Zabbix templates for MariaDB MaxScale, and Galera can be monitored largely by the Zabbix agent, using these typically does not provide as much in terms of cluster-wide monitoring as I would like. It’s important to know how the reads and writes are distributed, what the state of the database cluster is, etc.

How do you see the role of Zabbix in MariaDB in the near future? Are you planning to use it for any other new tasks?

My next goal is to see if I can write a blog for MariaDB on Zabbix monitoring with some emphasis on MariaDB MaxScale.

The post Monitoring MariaDB Clusters and MaxScale with Anders Karlsson appeared first on Zabbix Blog.

Reducing Alert Noise with Birol Yildiz

Post Syndicated from Michael Kammer original https://blog.zabbix.com/reducing-alert-noise-with-birol-yildiz/28643/

Zabbix Summit 2024 is almost here, and we’re giving you a sneak peek into what you can expect to see on our main stage this year via a series of short interviews with a few of the eminent speakers who will grace us with their presence. First up is Birol Yildiz, the CEO and Co-founder of ilert GmbH and a man who is deeply passionate about keeping alert noise and fatigue to a minimum.

Please tell us a bit about yourself and the journey that led you to ilert GmbH.

My journey in the tech industry began with a deep passion for creating solutions that simplify and improve the lives of IT professionals. Before co-founding ilert GmbH, I spent over a decade working in various IT roles, ranging from software development to operations. I noticed that while monitoring systems were becoming increasingly sophisticated, the process of alert management and incident response was lagging behind.

This gap inspired me to create ilert, a platform focused on bridging that divide by optimizing alerting processes and reducing response times. Our goal at ilert has always been to empower teams with the tools they need to stay ahead of incidents, ensuring that their systems run smoothly and efficiently.

How long have you been using Zabbix? What kind of Zabbix-related tasks are you involved in on a daily basis?

Zabbix has been an integral part of ilert since 2018, when we first developed one of our early integrations with the platform. Recognizing its popularity among our customer base, we enhanced this integration in 2020, transforming it into a native integration and solidifying our partnership with Zabbix as a technology partner. Since then, Zabbix has become one of the most popular integrations within ilert.

On a daily basis, my involvement with Zabbix includes overseeing the continued optimization of our integration, ensuring that it meets the evolving needs of our users. I work closely with our development and support teams to identify and implement improvements based on user feedback and the latest developments from Zabbix.

Can you give us a few clues about what we can expect to hear during your Zabbix Summit presentation?

Alert fatigue has long been a significant challenge for the DevOps community, often leading to decreased efficiency and increased stress among professionals. In my presentation, we will explore innovative strategies that leverage AI to mitigate alert noise.

I’ll be discussing how to maximize the efficiency of your incident response process by leveraging Zabbix with advanced alerting and on-call management tools like ilert. I’ll share insights on reducing alert fatigue, improving incident response times, and ensuring that critical alerts reach the right people at the right time.

This talk will be particularly valuable for DevOps engineers looking to optimize their alert management systems and reduce the cognitive load caused by alert fatigue. Zabbix administrators will find it insightful, especially if they are interested in integrating advanced AI techniques into their monitoring workflows to achieve better performance and reliability.

Moreover, AI and machine learning enthusiasts will gain practical knowledge about applying AI in IT monitoring and alerting, making this session a comprehensive resource for anyone looking to advance their alert management strategies.

Reducing alert noise is something that’s on almost everyone’s wish list, but was there any particular incident or aspect of your professional life that made you want to focus on this topic?

Absolutely. There was a specific incident early in my career that left a lasting impact on me. We were using a monitoring system that generated a significant number of alerts, most of which were non-critical. One weekend, a critical issue was buried in a flood of low-priority alerts, leading to a delayed response and significant downtime for the business.

This incident underscored the importance of not just having a monitoring system in place but ensuring that it was configured to minimize noise and prioritize what truly matters. That experience drove me to focus on creating solutions that help teams filter out the noise and respond quickly to what’s really important, which is a core principle behind ilert’s offerings.
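The principle of minimizing noise and prioritizing what matters can be sketched in a few lines. The example below is a deliberately simple stand-in (plain deduplication plus a severity cutoff), not ilert’s product logic or an AI technique. The alert fields, host names, and threshold are assumptions; the severity numbers follow the Zabbix scale, where 0 is “Not classified” and 5 is “Disaster”.

```python
# Hypothetical alert stream; severities use the Zabbix 0-5 scale.
SAMPLE_ALERTS = [
    {"host": "web01", "trigger": "High CPU load", "severity": 2},
    {"host": "web01", "trigger": "High CPU load", "severity": 2},
    {"host": "db01", "trigger": "Replication stopped", "severity": 5},
    {"host": "sw01", "trigger": "Interface down", "severity": 4},
    {"host": "sw01", "trigger": "Interface down", "severity": 4},
]


def triage(alerts, min_severity=3):
    """Collapse repeats of the same host/trigger pair, drop alerts below
    min_severity, and return the rest with the most severe first."""
    merged = {}
    for alert in alerts:
        key = (alert["host"], alert["trigger"])
        if key not in merged:
            # Copy the alert so the input list is left untouched.
            merged[key] = {**alert, "count": 1}
        else:
            merged[key]["count"] += 1
            merged[key]["severity"] = max(merged[key]["severity"], alert["severity"])
    kept = [a for a in merged.values() if a["severity"] >= min_severity]
    return sorted(kept, key=lambda a: a["severity"], reverse=True)


result = triage(SAMPLE_ALERTS)
for alert in result:
    print(alert["severity"], alert["host"], alert["trigger"], f"x{alert['count']}")
```

Here five raw alerts shrink to two entries, with the critical database problem at the top instead of buried in the list.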

Are there any other similar issues that you can envision tackling with Zabbix?

Yes, beyond reducing alert noise, there’s a lot of potential in enhancing the collaboration between teams during incidents. For example, automating incident communication and resolution processes is an area where I see great value. By integrating Zabbix with incident management platforms like ilert, teams can not only reduce noise but also streamline communication, ensuring that the right people are involved at the right time and that resolution steps are clear and actionable.

Another area is optimizing the way multiple on-call teams work together using Zabbix and incident response platforms like ilert. In many organizations, different teams are responsible for specific sets of host groups in Zabbix, and it’s crucial that each team only receives alerts for the services they are directly responsible for. These are just a few examples of how we can continue to evolve our approach to incident management in conjunction with Zabbix.
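The host group idea above can be illustrated with a small routing sketch. The group names, team names, and fallback rule are all hypothetical, and a real setup would more likely express this through Zabbix actions and the incident platform’s own routing configuration than through custom code.

```python
from collections import defaultdict

# Hypothetical mapping of Zabbix host groups to on-call teams.
TEAM_BY_HOST_GROUP = {
    "Databases": "dba-oncall",
    "Network devices": "netops-oncall",
    "Web servers": "web-oncall",
}
# Assumed catch-all so that no alert is silently dropped.
FALLBACK_TEAM = "ops-catchall"

SAMPLE_ALERTS = [
    {"name": "Replication lag on db01", "host_groups": ["Databases"]},
    {"name": "Link down on sw01", "host_groups": ["Network devices", "Site Riga"]},
    {"name": "Disk full on backup01", "host_groups": ["Backups"]},
]


def route_alerts(alerts):
    """Group alert names by the team responsible for their host groups."""
    routed = defaultdict(list)
    for alert in alerts:
        teams = {
            TEAM_BY_HOST_GROUP[group]
            for group in alert["host_groups"]
            if group in TEAM_BY_HOST_GROUP
        }
        # Alerts with no owning team go to the fallback queue.
        for team in sorted(teams or {FALLBACK_TEAM}):
            routed[team].append(alert["name"])
    return dict(routed)


routed = route_alerts(SAMPLE_ALERTS)
for team, names in routed.items():
    print(team, names)
```

Each team sees only the alerts for host groups it owns, while anything unmapped lands in one visible place rather than paging everyone.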

The post Reducing Alert Noise with Birol Yildiz appeared first on Zabbix Blog.

What’s in Store at Summit ’24?

Post Syndicated from Michael Kammer original https://blog.zabbix.com/whats-in-store-at-summit-24/28649/

October means different things to different people – it’s springtime in the Southern Hemisphere, autumn in the Northern Hemisphere, and Summit time if you’re a member of the Zabbix community! Summit time, of course, means the biggest of all Zabbix events, gathering the global Zabbix community in one place to have fun together and learn as much as we can from each other. Zabbix Summit 2024 will take place on October 3-5 in Riga at the Radisson Blu Hotel Latvija. Keep reading to find out more about what you can expect this year.

All new main stage presentations

During Zabbix Summit 2024, you’ll be able to catch a variety of presentations from top industry thought leaders. You’ll learn all about the latest Zabbix features, explore use cases from multiple industries, check out the latest integrations, and have the chance to get your questions answered during live Q&A sessions.

The Summit agenda will feature speeches on nearly any Zabbix-related topic that you can imagine, but this year we’ll also have a fresh focus on the potential of artificial intelligence, with presentations on topics like “New Approaches to Reduce Alert Noise with Zabbix and AIOps” and “Leveraging AI for Synthetic Web Monitoring” as well as a more business-focused group of speeches covering topics related to open-source integration and Zabbix for MSPs.

Hands-on learning in Zabbix Summit workshops

Zabbix Summit workshops are the ideal place to put the theory you learn during presentations into practice. You can check out the latest features and use cases in action, while performing a variety of real-world tasks under the guidance of workshop hosts and their assistants – many of whom are also featured presenters at this year’s Summit.

All you’ll need to do is bring your own laptop – depending on the topic covered in the particular workshop, an SSH client and a web browser may also be required. All workshop sessions will take place on the morning of October 5 (Day 2 of the Summit) and will begin at 10AM.

Zabbix Certified Training sessions and exams

Do you have a lifetime of monitoring experience, but are too shy to let everyone know it? When you attend Zabbix Summit 2024, you’ll be able to prove your skills as a Zabbix specialist or professional by taking part in Zabbix Certified Training sessions and exams. If you’re looking for more specific topics to dive into, the following one-day courses will also be held from October 2 through October 4:

  • Automation and Integration with Zabbix API
  • Advanced Problem and Anomaly Detection with Zabbix
  • Advanced Zabbix Data Pre-Processing
  • Advanced Zabbix SNMP Monitoring

If you don’t mind extending your stay in Riga just a bit longer (and seriously, why would you?), you’ll also be able to take the full Zabbix Certified Specialist or Professional courses scheduled for October 9-13. Please remember that you can choose more than one training course, and it’s possible to attend the courses (without the 10% Summit discount) even if you’re not attending the Summit.

You can sign up for all training sessions and exams here.

The Zabbix Summit Feedback and Testimonial corner

Just as at last year’s Summit, you’ll be able to share your Zabbix story with the rest of the Zabbix community at our Feedback and Testimonial corner. Sharing a testimonial or leaving a review will give you a chance to collect a piece of exclusive Zabbix Summit 2024 merchandise!

Exclusive items, cool new designs, and unique gadgets at our merchandise shop

Speaking of merch, you’ll be pleased to know that not only will exclusive Zabbix Summit merchandise be available at a special stand throughout the event, but we’ll also have an online platform that will allow you to pre-order your merchandise and pick it up at the Summit. We’ve got 5 exclusive new t-shirt designs, 4 fresh sock designs, brand-new beanies, and the usual assortment of gadgets, hoodies, and other merch that our fans have come to know and love – most of which has gotten a new look for this year’s Summit as well.

Three incredible Zabbix Summit 2024 networking events

There’s a lot to take in and consider at a Zabbix Summit, but don’t worry – we’ve also made sure to give you plenty of time to network with your fellow Zabbix fans by organizing three big events that you won’t want to miss!

  • The Zabbix Summit 2024 welcome event will be held at the famous National Library of Latvia – or as Latvians call it, “The Castle of Light.” You’ll enjoy tasty beverages, delicious food, and a guided tour of the library as you mingle with fellow Zabbix enthusiasts and industry experts, making this the perfect way to kick off this year’s Summit.
  • You’ll want to prepare yourself for a truly unforgettable experience as the Zabbix Summit main event unfolds. We’re sure that you’ll find Riga’s famous Fantadroms Concert and Event Space to be the ideal place to forge valuable connections with like-minded professionals – while indulging in a unique array of culinary delights, refreshing beverages, and great music.
  • After all that, we’ll send you on your way with a closing event that will be the perfect grand finale to a Summit that you won’t soon forget! Located in the heart of Old Riga, Burzma is a food hall that spans 1,500 square meters across the entire fourth floor of a bustling shopping mall. With stunning rooftop views to inspire your dining experience, Burzma offers 10 restaurants and a bar serving up a diverse range of culinary delights.

A chance to see where the magic happens during our Open-Door day

In what has become a popular tradition, Zabbix will host an Open-Door day on Thursday, October 3 from 1PM to 3PM local time. You’ll be able to chat with Zabbix team members, tour our headquarters, and take part in a fun activity designed to help you learn more about Zabbix.

Booths galore!

As usual, the Zabbix team will have multiple booths in the conference hall where you can meet our engineers and developers and get your questions answered by the people who know best. Our Summit sponsors will have booths of their own as well, where you can enjoy a unique opportunity to interact with them on a personal level and get the lowdown on the solutions they offer.

Special events for support customers

All Zabbix support customers are invited to meet our team at a special Zabbix client lunch on October 3 at 14:00 (EEST), with the exact location to be announced at a later date. What’s more, Enterprise and Global support customers are also invited to the Zabbix roadmap Q&A session with Zabbix CEO and Founder Alexei Vladishev on October 5 at 10AM. You’ll learn about our software development plans and be able to raise questions or make suggestions based on your experience – definitely an opportunity you won’t want to miss!

Which Zabbix Summit ticket is right for you?

If you want to enjoy the full Zabbix Summit experience (conference, accommodation, food, even airport transfers), the Full Participation ticket package is definitely for you.

For loyal users who have contributed so much to our product over the years, the Zabbix Fan package is definitely the way to go – it includes everything you’ll get with the Full Participation package, plus a special official fan package that will guarantee you bragging rights in your office once you return from Riga.

If you’re only there for the sessions, the Hall only pass is ideal. If you enjoy both learning and networking with our team and enthusiasts from around the world, we think you’ll find the Hall and Networking pass to be perfect for your needs.

Want to bring a friend or partner along to the Summit? No problem – get a Zabbix Summit Travel Companion pass for them so you can stay together and attend networking events, while we handle the rest of their Riga experience.

The Companion pass includes 3 nights’ accommodation in the Radisson Blu Latvija hotel (in the same room as the Summit attendee), 3 breakfasts, and 3 networking events, but that’s not all – we’ll also include an exclusive tour of Riga on October 4 with an English-speaking guide.

The tour features a visit to the Ethnographic Open-Air Museum of Latvia, and runs from approximately 10AM to 4PM, including lunch and some workshop activities at the museum. You can learn more about the museum here.

Visit this page to sign up for the ticket package of your choice.

Livestreaming on YouTube

We hope to see you soon in Riga, but if you can’t make it, don’t worry – as in previous years, we’re going to be livestreaming the speeches on our YouTube channel! Stay tuned for more details.

The post What’s in Store at Summit ‘24? appeared first on Zabbix Blog.

Zabbix Summit: A celebration of all things monitoring and open-source

Post Syndicated from Arturs Lontons original https://blog.zabbix.com/zabbix-summit-a-celebration-of-all-things-monitoring-and-open-source/21738/

Many of us have visited a number of different conferences over the years. The setting and the goal of the conferences can vary by a large degree – from product presentations to technology stack overviews and community get-togethers. Zabbix Summit is somewhat special in that it aims to combine all of the aforementioned goals and present them in a friendly, inclusive, and approachable manner.

As an open-source product with a team consisting of open-source enthusiasts, it is essential for us to ensure that the core tenets of what we stand for are also represented in the events that we host, especially so for Zabbix Summit. Our goal is for our attendees to feel right at home and welcome during the Summit – no matter if you’re a hardened IT and monitoring professional or just a beginner looking to chat and learn from the leading industry experts.

Connecting with the Zabbix community

Networking plays a large part in achieving the goals that we have set up for the event. From friendly banter during coffee breaks and speeches (you never know when a question will turn into a full-fledged discussion) to the evening fun-part events – all of this helps us build our community and encourages people to help each other and mutually contribute to each other’s projects.

Of course, the past two years have challenged our preconceptions of how such an event can be hosted in a way where we achieve our usual goals. While hosting a conference online can make things a bit simpler (everyone is already in the comfort of their home or office and organizers don’t have to spend time and other resources renting a venue, for example) the novelty of “online events” can wear off quite quickly. The conversations don’t flow as naturally as they do in person. Perusing through a list of attendees in Zoom isn’t quite the same as noticing a friend or recognizing an acquaintance while standing in line at the snack bar. As for the event speakers – steering your presentation in the correct direction can be quite complex without observing the emotional feedback of your audience. Are they bored? Are they excited? Is everyone half asleep 5 minutes in? Who knows.

With travel and on-premise events slowly becoming a part of our lives again, we’re excited to get back to our usual way of hosting Zabbix Summit. In 2022, it will be held on-premises in Riga, Latvia on October 7-8, and we can’t wait to interact with our community members, clients, and partners face-to-face again!

Making the best Zabbix Summit yet

As with every Zabbix Summit, this year’s event will build on the knowledge and feedback we have gained in previous years to make this year’s Summit the best it has ever been. This year will be special for us – we will be celebrating the 10th anniversary of the Zabbix Summit hosted on-premises! In addition to conducting the event on-site, we will also be live-streaming the event online, so if you can’t meet us in person – tune in and say hello to the Zabbix team virtually!

Zabbix Summit 2019 conference venue

Over the years we have managed to define a set of criteria for the Zabbix Summit speeches with the goal of providing content that can deliver unique value to our attendees. As a Zabbix certified trainer, a Zabbix fan, and a long-time Zabbix user, I know that there are certain types of speeches that immediately attract my attention:

  • In-depth Zabbix functionality overviews from Zabbix experts or Zabbix team members
  • Unique business monitoring use cases
  • Custom Zabbix integrations, applications, and extensions
  • How Zabbix is used in the context of the latest IT trends (e.g.: Kubernetes, cloud environments, configuration management tools such as Ansible and Chef)
  • Designing and scaling Zabbix deployments for different types of large and distributed environments

This is something that we try to put extra focus on for the Zabbix Summit. Speeches like these are bound to encourage questions from the audience and serve as a great demonstration of using Zabbix outside the proverbial box that is simple infrastructure monitoring.

Looking back at Zabbix Summit 2021, we had an abundance of truly unique speeches that can serve as guidelines for complex monitoring use cases. Some of the speeches that come to mind are Wolfgang Alper’s Zabbix meets television – Clever use of Zabbix features, where Wolfgang talked about how Zabbix is used in the broadcasting industry to collect Graylog entries and even monitor TV production trucks!

Not to mention the custom solution used for host identification and creation in Zabbix called Omnissiah, presented during the last year’s Zabbix Summit by Jacob Robinson.

As Zabbix has greatly expanded its set of features since the previous year’s summit, this year we expect the speeches to cover an even larger scope of topics related to many different industries and technology stacks.

Workshops – what to expect

Workshops are a whole different story. In an environment where we can have participants coming from different IT backgrounds with very different skill sets, it’s important to make the workshop interesting, while at the same time making it accessible to everyone.

Zabbix workshop session at the Zabbix Summit 2019

There are a few ways we go about this to ensure the best possible workshop experience for our Zabbix Summit attendees:

  • Use native Zabbix features to configure and deploy unique use cases
  • Focus on a thorough analysis of a particular feature, uncovering functionality that many users may not be aware of
  • Demonstrate the latest or even upcoming Zabbix features
  • Interact with the audience and be open to questions and discussions

In the vast majority of cases, this allows us to keep a smooth pace during the workshop while also having fun and discussing the potential use cases and the functionality of the features on display.

Becoming Zabbix certified during Zabbix Summit 2022

But why stop at workshops? During the Zabbix Summit conferences, we always give our attendees a chance to test their knowledge by attempting to pass the Zabbix certified user, specialist, or professional certification exams. The exams not only test your proficiency in Zabbix but can also reveal some missing pieces in your Zabbix knowledge that you can discuss with the Zabbix community right on the spot. Receiving a brand new Zabbix certificate is also a great way to start your day, won’t you agree?

A moment of jubilation for our freshly certified Zabbix specialists and professionals

This year the Summit attendees will also get the chance to participate in Zabbix one-day courses focused on problem detection, Zabbix security, Zabbix API, and data pre-processing. Our trainers will walk you through each of these topics from A-Z and they’re worth checking out both for Zabbix beginners as well as seasoned Zabbix veterans. I can attest that by the end of the course you will have a list of features that you will want to try out in your own infrastructure – and I’m saying that as a Zabbix-certified expert.

As for those who already have Zabbix 5.0 certifications – we’ve got a nice surprise in store for you too. We will be holding Zabbix certified specialist and professional upgrade courses, which will get you up to speed with the latest Zabbix 6.0 features and upgrade your certification level to Zabbix 6.0 certified specialist and professional.

Scaling up the Zabbix Summit

But we haven’t slumbered for the last two years of working and hosting events remotely. We have continued growing as a team and expanding our partner and customer network. Who knows what surprises October will bring, but currently our plan is for Zabbix Summit 2022 to reflect our growth.

Zabbix team at the Zabbix Summit 2019

Currently, we stand to host approximately 500 attendees on-site and expect the online viewership to reach approximately 7000 unique viewers from over 80 countries all across the globe.

With over 20 speakers from industries such as banking and finance, healthcare and medical, IT & Telecommunications, and an audience consisting of system administrators, engineers, developers, technical leads, and system architects, Zabbix Summit is the monitoring event for knowledge sharing and networking across different industries and roles.

The fun part

Spending the major part of the day networking and partaking in knowledge sharing can be an amazing experience, but when all is said and done, most of us will want to unwind after an eventful day at the conference. The Zabbix Summit conference fun part events are where you will get to strengthen your bonds with other fellow Zabbix community members and simply relax in an informal atmosphere.

Zabbix Summit 2019 Sunset afterparty

The Zabbix Summit fun part consists of three parties.

  • Kick off Zabbix Summit 2022 by joining the Zabbix team and your fellow conference attendees for an evening of social networking and fun over cocktails and games at the Meet & Greet party.
  • Join the main networking event to mark the 10th anniversary of the Zabbix Summit. Apart from good vibes, cool music, and like-minded people, expect the award ceremony honoring the most loyal Zabbix Summit attendees, fun games to play, and other entertaining activities.
  • Celebrate the end of the Zabbix Summit 2022 by attending the closing party where you can network with conference peers and discuss the latest IT trends with like-minded people in a relaxed atmosphere.

Zabbix Summit 2019 Main party

Invite a travel companion

Zabbix Summit is also a great chance to take a friend or a loved one to the conference. The conference premises are located in the very heart of Riga – perfect for strolling through and exploring Riga Old Town.

If you’re interested in a more guided experience for your companion, we invite you to register for the Travel companion upgrade. Your travel companion will get to enjoy the Riga city tour followed by a lunch with the rest of the guests accompanying the Zabbix conference participants. Last time, we treated our travel companions to a delightful tour of the Riga Central Market, accompanied by the renowned Latvian chef Martins Sirmais and full of local food tasting. Our team is preparing something special for this year as well. The tour will take place on October 7 during the conference.

Visit the Zabbix offices

Are you a fan of the product and what we stand for? Why not pay us a visit and attend the Zabbix open doors day on October 6 from 13:00 till 15:00? Take a tour of the office and sit down with us for an informal chat and a cup of coffee or tea. There won’t be any speeches, workshops, or presentations, just friendly conversations with the Zabbix team, our partners, and the community to warm up before the Summit. There might, however, be friendly foosball and office badminton tournaments if any volunteers appear.

Welcoming our community members at the Zabbix Summit 2019 Open Doors day

All things said and done – Zabbix Summit is not only about deep technical knowledge and opinion sharing on monitoring. It is and has always been primarily a celebration of the Zabbix community. It is the community feedback that largely shapes the Zabbix Summit and helps us build upcoming events on the foundations laid in the previous year. Throughout the years Zabbix Summit has grown into much more than a simple conference – it’s an opportunity to travel, visit us, connect with like-minded people and spend a couple of days in a relaxed atmosphere in the heart of a beautiful Northern European city.

The post Zabbix Summit: A celebration of all things monitoring and open-source appeared first on Zabbix Blog.

Build Zabbix Server HA Cluster in 10 minutes by Kaspars Mednis / Zabbix Summit Online 2021

Post Syndicated from Kaspars Mednis original https://blog.zabbix.com/build-zabbix-server-ha-cluster-in-10-minutes-by-kaspars-mednis-zabbix-summit-online-2021/18155/

With the native Zabbix server HA cluster feature added in Zabbix 6.0 LTS, it is now possible to quickly configure and deploy a multi-node Zabbix Server HA cluster without using any external tools. Let’s take a look at how we can deploy a Zabbix server HA cluster in just 10 minutes.

The full recording of the speech is available on the official Zabbix Youtube channel.

Why Zabbix needs HA

Let’s start by defining what the term high availability entails:

  • A system runs in high availability mode if it does not have a single point of failure
  • A single point of failure is a component whose failure halts the whole system
  • Redundancy is a requirement in systems that use high availability. In our case, we need a redundant component to which we can fail over in case the currently active component encounters an issue.
  • The failover process needs to be transparent and automated

In the case of the Zabbix components, the single point of failure is our Zabbix server. Even though Zabbix in itself is very stable, you can still encounter scenarios when a crash happens due to OS level issues or something more trivial – like running out of disk space. If your Zabbix server goes down, all of the data collection, problem detection, and alerting is stopped. That’s why it’s important to have some form of high availability and redundancy for this particular Zabbix component.

How to choose HA for Zabbix

Before the addition of native HA cluster support in Zabbix 6.0 LTS it was possible to use 3rd party HA solutions for Zabbix. This caused an ongoing discussion – which 3rd party solution should I use and how should I configure it for Zabbix components? On top of this, you would also have a new layer of software that requires proper expertise to deploy, configure and manage. There are also cloud-based HA options, but most of the time these incur an extra cost.

Not having the required expertise for the 3rd party high availability tools can cause unwanted downtimes or, at worst, can cause inconsistencies in the Zabbix DB backend. Here are some of the potential scenarios that can be caused by a misconfigured high availability solution:

  • The automatic failover may not be configured properly
  • A split-brain scenario with two nodes running concurrently, potentially causing inconsistencies in the Zabbix database backend
  • Misconfigured STONITH (Shoot the other node in the head) scenarios – potentially causing both nodes to go down

Native Zabbix HA solution

The Zabbix 6.0 LTS native high availability solution is easy to set up, and all of the required steps are covered in the Zabbix documentation. The native solution does not require any additional expertise and will continue to be officially supported, updated, and improved by Zabbix. It also doesn’t require any new software components – the information about the Zabbix server node status is stored in the Zabbix database backend.

How Zabbix cluster works

To enable the native high availability cluster for our servers, we first need to start the Zabbix server component in the high availability mode. To achieve this, we need to look at the two new parameters in the /etc/zabbix/zabbix_server.conf configuration file:

  • HANodeName – specify an arbitrary name for your Zabbix server cluster node
  • ExternalAddress – specify the address of the cluster node

Once you have made the changes and added these parameters, don’t forget to restart the Zabbix server cluster nodes to apply the changes.
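
As a rough illustration, the two parameters could be generated per node as shown below. This is only a sketch in Python: the helper `render_ha_settings` is hypothetical and not part of Zabbix, and the node names and addresses are the example values used in this post. Only the parameter names HANodeName and ExternalAddress come from the Zabbix server configuration file.

```python
# Hypothetical helper that renders the two HA-related lines for
# /etc/zabbix/zabbix_server.conf. Only the parameter names are real;
# the function and the sample values are illustrative assumptions.


def render_ha_settings(node_name: str, external_address: str) -> str:
    """Return the HA lines to add to the Zabbix server configuration file."""
    if not node_name:
        # Without HANodeName the server does not start in cluster mode.
        raise ValueError("HANodeName must not be empty in HA mode")
    return f"HANodeName={node_name}\nExternalAddress={external_address}\n"


# Example cluster: node name -> external address (values assumed for the example).
nodes = {
    "zbx-node1": "node1.example.com",
    "zbx-node2": "node2.example.com",
}

# Dictionary keys are unique, which mirrors the rule that node names must be unique.
snippets = {name: render_ha_settings(name, addr) for name, addr in nodes.items()}

for name, snippet in snippets.items():
    print(f"# --- lines for {name} ---")
    print(snippet)
```

Each node receives its own pair of lines, and the server on that node has to be restarted afterwards for the settings to take effect.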

Zabbix HA Node name

Let’s take a look at the HANodeName parameter. This is the most important configuration parameter – it is mandatory to specify it if you wish to run your Zabbix server in the high availability mode.

  • This parameter is used to specify the name of the particular cluster node
  • If the HANodeName is not specified, Zabbix server will not start in the cluster mode
  • The node name needs to be unique on each of your nodes

In our example, we can observe a two-node cluster, where zbx-node1 is the active node and zbx-node2 is the standby node. Both of these nodes will send their heartbeats to the Zabbix database backend every 5 seconds. If one node stops sending its heartbeat, another node will take over.

Zabbix HA Node External Address

The second parameter that you will also need to specify is the ExternalAddress parameter.

In our example, we are using the address node1.example.com. The purpose of this parameter is to let the Zabbix frontend know the address of the currently active Zabbix server since the Zabbix frontend component also constantly communicates with the Zabbix server component. If this parameter is not specified, the Zabbix frontend might not be able to connect to the active Zabbix server node.

Zabbix frontend setup

Seasoned Zabbix users might know that the Zabbix frontend has its own configuration file, which usually contains the Zabbix server address and the Zabbix server port for establishing connections from the Zabbix frontend to the Zabbix server. If you are using the Zabbix high availability cluster, then you will have to comment these parameters out since instead of being static, now they depend on the currently active Zabbix server node and will be obtained from the Zabbix backend database.
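
The change to the frontend configuration file can be sketched as a small text transformation. The Python snippet below is an illustration only: the sample file content is an assumed layout, and the helper `comment_out_server_settings` is hypothetical. The only names taken from this post are `$ZBX_SERVER` and `$ZBX_SERVER_PORT`.

```python
import re

# Assumed fragment of a frontend configuration file, for illustration only.
SAMPLE_CONF = """\
$DB['TYPE'] = 'MYSQL';
$ZBX_SERVER = 'localhost';
$ZBX_SERVER_PORT = '10051';
$ZBX_SERVER_NAME = 'Zabbix';
"""

# Matches uncommented assignments to $ZBX_SERVER or $ZBX_SERVER_PORT.
# Other settings such as $ZBX_SERVER_NAME are deliberately left alone.
_PATTERN = re.compile(r"^([ \t]*)(\$ZBX_SERVER(?:_PORT)?[ \t]*=.*)$", re.MULTILINE)


def comment_out_server_settings(conf_text: str) -> str:
    """Prefix the static server address and port assignments with '// '."""
    return _PATTERN.sub(r"\1// \2", conf_text)


patched = comment_out_server_settings(SAMPLE_CONF)
print(patched)
```

Running the function a second time leaves the text unchanged, because lines that are already commented out no longer match the pattern.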

Putting it all together

In the above example, we can see that we have two nodes – zbx-node1, which is currently active, and zbx-node2. These nodes are reachable at the external addresses node1.example.com and node2.example.com for zbx-node1 and zbx-node2 respectively. We can see that we have also deployed multiple frontends. Each of these frontend nodes will connect to the Zabbix backend database, read the address of the currently active node and proceed to connect to that node.

Zabbix HA node types

Zabbix server high availability cluster nodes can have one of the following statuses:

  • Active – The currently active node. Only one node can be active at a time
  • Standby – The node is currently running in standby mode. Multiple nodes can have this status
  • Shutdown – The node was previously detected, but it has been gracefully shut down
  • Unreachable – Node was previously detected but was unexpectedly lost without a shutdown. This can be caused by many different reasons, for example – the node crashing or having network issues

In normal circumstances, you will have an active node and one or more standby nodes. Nodes in shutdown mode are also expected if, for example, you’re performing some maintenance tasks on these nodes. On the other hand, if an active node becomes unreachable, this is when one of the standby nodes will take over.

Zabbix HA Manager

How can we check which node is currently active and which nodes are running in standby mode? First off, we can see this in the Zabbix frontend – we will take a look at this a bit later. We can also check the node status from the command line. On every node – whether active or standby – you will see that the zabbix_server and ha manager processes have been started. The ha manager process checks the high availability node status in the database every 5 seconds and is responsible for taking over if the active node fails.

On the other hand, the currently active Zabbix server node will have many other processes – data collector processes such as pollers and trappers, history and configuration syncers, and many other Zabbix child processes.

Zabbix HA node status

The System information widget has received some changes in Zabbix 6.0 LTS. It is now capable of displaying the status of your Zabbix server high availability cluster and its individual nodes.

The widget can display the current cluster mode, which is enabled in our example, and provides a list of all cluster nodes. In our example, we can see that we have 3 nodes – 1 active node, 1 stopped node, and 1 node running in standby mode. This way we can not only see the status of our nodes but also their names, addresses, and last access times.

Switching Zabbix HA node

Switching between nodes is done manually. Once you stop the currently active Zabbix server node, another node will automatically take over. Of course, you need to have at least one more node running in standby status, so it can take over from the stopped active node.

How failover works

All nodes report their status every 5 seconds. Whenever you shut down a node, it goes into a shutdown state and in 5 seconds another node will take over. But if a node fails the workflow is a bit different. This is where something called a failover delay is taken into account. By default, this failover delay is 1 minute. The standby node will wait for one minute for the failed active node to update its status and if in one minute the active node is still not visible, then the standby node will take over.
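
The difference between a graceful shutdown and a failure can be modeled in a few lines. The Python sketch below is a simplified model for illustration, not the actual Zabbix implementation: the class `ActiveNodeView`, the function `standby_should_take_over`, and the status labels are assumptions, while the default one-minute delay comes from the description above.

```python
from dataclasses import dataclass

DEFAULT_FAILOVER_DELAY_S = 60  # default failover delay of 1 minute


@dataclass
class ActiveNodeView:
    """Simplified view of the active node as a standby node might see it."""
    status: str                     # "active" or "shutdown" (labels assumed here)
    seconds_since_last_seen: float  # time since the node last updated its status


def standby_should_take_over(active, failover_delay_s=DEFAULT_FAILOVER_DELAY_S):
    """Decide whether a standby node should become the active node."""
    if active.status == "shutdown":
        # Graceful stop: there is no need to wait for the failover delay.
        return True
    # Crash or network loss: wait until the failover delay has fully elapsed.
    return active.seconds_since_last_seen >= failover_delay_s


print(standby_should_take_over(ActiveNodeView("shutdown", 0)))  # graceful stop
print(standby_should_take_over(ActiveNodeView("active", 30)))   # still within the delay
print(standby_should_take_over(ActiveNodeView("active", 65)))   # delay exceeded
```

In this model a shorter failover delay means a faster takeover after a crash, at the cost of a higher chance of reacting to a short network interruption.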

Zabbix cluster tuning

It is possible to adjust the failover delay by using the ha_set_failover_delay runtime command. The supported range of the failover delay is from 10 seconds to 15 minutes. In most cases the default value of 1 minute will work just fine, but there could be some exceptions and it very much depends on the specifics of your environment.

We can also remove a node by using the ha_remove_node runtime command. This command requires us to specify the ID of the node that we wish to remove.
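
As a hedged sketch, the runtime commands can be assembled and range-checked before they are run. The Python helpers below are hypothetical; they assume that runtime commands are passed to the `zabbix_server` binary with the `-R` option and that the delay accepts a seconds suffix. The node ID in the example is a placeholder.

```python
from __future__ import annotations

MIN_DELAY_S = 10       # lower bound of the supported range: 10 seconds
MAX_DELAY_S = 15 * 60  # upper bound of the supported range: 15 minutes


def failover_delay_command(delay_seconds: int) -> list[str]:
    """Build the argument list that sets the failover delay (hypothetical helper)."""
    if not MIN_DELAY_S <= delay_seconds <= MAX_DELAY_S:
        raise ValueError(
            f"failover delay must be between {MIN_DELAY_S} and {MAX_DELAY_S} seconds"
        )
    return ["zabbix_server", "-R", f"ha_set_failover_delay={delay_seconds}s"]


def remove_node_command(node_id: str) -> list[str]:
    """Build the argument list that removes a node by its ID (hypothetical helper)."""
    if not node_id:
        raise ValueError("a node ID is required")
    return ["zabbix_server", "-R", f"ha_remove_node={node_id}"]


print(" ".join(failover_delay_command(120)))
print(" ".join(remove_node_command("<node-id>")))  # placeholder ID
```

The resulting lists could be handed to `subprocess.run` on the server host. Building them separately keeps the range check testable without touching a running server.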

Connecting agents and proxies

Connecting Zabbix agents to your cluster

Now let’s talk about how we can connect Zabbix agents and proxies to your Zabbix cluster. First, let’s take a look at the passive Zabbix agent configuration.

  • Passive Zabbix agents require all nodes to be written in the configuration file under the Server parameter
  • Nodes are specified in a comma-separated list

Once you specify the list of all nodes, the passive Zabbix agent will accept connections from all of the specified nodes.

What about the active Zabbix agents?

  • Active Zabbix agents require all nodes to be written in the configuration file under the ServerActive parameter
  • Nodes need to be separated by semicolons

Notice the difference – comma-separated list for passive Zabbix agents and nodes separated by semicolons for active Zabbix agents!
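
The separator difference is easy to get wrong, so here is a minimal Python sketch that builds both lines from one list of nodes. The helper `agent_server_lines` is hypothetical and the addresses are the example ones from this post. Only the parameter names Server and ServerActive come from the agent configuration file.

```python
def agent_server_lines(nodes):
    """Return the (passive, active) agent configuration lines for a cluster."""
    if not nodes:
        raise ValueError("at least one cluster node is required")
    passive = "Server=" + ",".join(nodes)        # comma-separated for passive checks
    active = "ServerActive=" + ";".join(nodes)   # semicolon-separated for active checks
    return passive, active


cluster_nodes = ["node1.example.com", "node2.example.com"]  # example addresses
passive_line, active_line = agent_server_lines(cluster_nodes)

print(passive_line)
print(active_line)
```

Generating both lines from the same list helps keep the two parameters in sync when a node is added to or removed from the cluster.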

Connecting Zabbix proxies to your cluster

Proxy configuration is very similar to the agent configuration. Once again – we can have a proxy running either in passive mode or active mode.

For the passive Zabbix proxies, we need to list our cluster nodes under the Server parameter in the proxy configuration file. These nodes should be specified in a comma-separated list. This way the proxies will accept connections from any Zabbix server node. As for the active Zabbix proxies – we need once again to list our nodes under the Server parameter, but this time the node names will be separated by semicolons.

Conclusion – Setting up Zabbix HA cluster

Let’s conclude by going through all of the steps that are required to set up a Zabbix server HA cluster.

  • Start Zabbix server in high availability mode on all of your Zabbix server cluster nodes – this can be done by providing the HANodeName parameter in the Zabbix server configuration file
  • Comment out the $ZBX_SERVER and $ZBX_SERVER_PORT in the frontend configuration file
  • List your cluster nodes in the Server and/or ServerActive parameters in the Zabbix agent configuration file for all of the Zabbix agents
  • List your cluster nodes in the Server parameter for all of your Zabbix proxies
  • For other monitoring types, such as SNMP – make sure your endpoints accept connections from all of the Zabbix server cluster nodes
  • And that’s it – Enjoy!

Zabbix HA workshop and training

Wish to learn more about the Zabbix server high availability cluster and get some hands-on experience with the guidance of a Zabbix certified trainer? Take a look at the following options!

  • The Zabbix server high availability workshop will be hosted shortly after the release of Zabbix 6.0 LTS, which is currently planned for January 2022. One of the workshop sessions will be focused specifically on Zabbix server high availability cluster configuration and troubleshooting.
  • The Zabbix Certified Professional training course covers the Zabbix server HA cluster configuration and troubleshooting. This is also a great opportunity to discuss your own Zabbix use cases and infrastructure with a Zabbix certified trainer. Feel free to check out our Zabbix training page to learn more!

Questions

Q: What about the high availability for the Zabbix frontend? Is it possible to set it up?
A: Yes, this has been supported since Zabbix 5.2. Deploy as many Zabbix frontend nodes as you require and make sure the external address is configured properly, so that the Zabbix frontends are able to connect to the Zabbix servers. That’s all!

Q: Does high availability cause a performance impact on the network or the Zabbix backend database?
A: No, this should not be the case. The heartbeats that the cluster nodes send to the database backend are extremely small messages that get recorded in one of the smaller Zabbix database tables, so the performance impact should be negligible.

Q: What is the best practice when it comes to migrating from a 3rd party solution such as PCS/Corosync/Pacemaker to the native Zabbix server high availability cluster? Any suggestions on how that can be achieved?
A: The most complex part here is removing the existing high availability solution without breaking anything in the existing environment. Once that is done, all you have to do is upgrade your Zabbix instance to Zabbix 6.0 LTS and follow the configuration steps described in this post. Remember that if you’re performing an upgrade instead of a fresh install, the configuration files will not contain the new configuration parameters, so they will have to be added manually.

Gaining new insights with Business service monitoring by Aleksandrs Petrovs-Gavrilovs / Zabbix Summit Online 2021

Post Syndicated from Arturs Lontons original https://blog.zabbix.com/gaining-new-insights-with-business-service-monitoring-by-aleksandrs-petrovs-gavrilovs-zabbix-summit-online-2021/17973/

Zabbix 6.0 LTS comes with a complete redesign of the service monitoring. From improved business service scalability to advanced service status calculation logic and alerting. Let’s take a look at the Business Service monitoring feature and how you can use it to ensure full transparency for your business services.

The full recording of the speech is available on the official Zabbix YouTube channel.

Business services can be quite complex. They tend to consist of many different moving parts with redundancy and failover mechanisms in place, all of which need to be taken into consideration when we wish to analyze the current status of our services.

BSM Checklist

Let’s take a look at what needs to be done so we can successfully define and monitor our business service:

  • First, we have to define exactly what our business service is and which components it consists of.
  • We need to understand what our expectations are when it comes to service uptime. When should the service be up and running? What are the acceptable downtimes? Should it run 24/7/365 or maybe it’s a service that is critical only during our working hours?
  • Once we know what needs to be monitored, we need to make sure that we are collecting the data that reflects the status of different service components.
  • Finally – we have to find a suitable tool to track and measure our service.

Define your business

Let’s take a look at what a business may look like. As I mentioned before, business services can consist of many different components. Here is an example:

The tree structure here represents our Business services. We can see that we have classified the services into two branches – Internal services and User services. The User services consist of components such as Websites, Helpdesk services, Phones. These general services are based on lower-level components such as the actual physical phones for the phone service, underlying software for the Website and Helpdesk services, and so on.

This can get quite complicated, since organizations usually have many more components to take care of. So let’s see how we can simplify this tree and define our services in a simpler manner, like the service tree below:

Now we are left with only 3 levels for our services. Let’s take a look at how we can move this to Zabbix:

Here we can see a high-level view of our services. Once again we have our Internal services and User services. These high-level services consist of child services, which define what the components consist of and what their SLAs should be. We can also define tags to provide additional details about our services, such as which customer uses the service, the type of service, or even the location that the service is used in. This part is completely up to your imagination.

Once you have defined the services, their respective components and have linked them to the problems by using tags, you will finally be able to see the full picture. Zabbix will display not only the status of the service but also the root cause of the problem. This way we can provide service status information not only on the service owner level but also provide information that your technical staff can use to fix the issue.

Configuring SLAs

Business Service monitoring is configured in the Monitoring → Services section. In Zabbix 6.0 LTS you are no longer required to start defining the service tree from a single root service; you can define your own root-level services. To create a service, switch to Edit mode by clicking the Edit button in the upper right corner of the Services screen, then click the Create service button right next to it. We have also made some additional changes to the Services section UI/UX: each service now has fast edit buttons next to it, which you can use to add a child service, edit the service, or delete it.

Next, let’s take a look at the actual service creation steps.

  • We need to provide a name for our service
  • If the service is not a top-level service you have to select a parent service
  • Define problem tags. Problems tagged with the matching tags will affect the service status
  • Define the status calculation rule

Major improvements have been made to status calculation rules. We still support the old logic of the Use the most critical of child services / Most critical if all children have problems / Set status to ok, but there are also many advanced service status calculation rules.

  • Now we have the ability to select a specific status (Warning, Average, High, and so on) for our service in case of a problem
  • Trigger the parent service status change when more than or fewer than N children, or a given percentage of children, are affected
  • Define weights for child services and perform status changes based on the weight of the affected child services

Child services can also apply different propagation rules to the parent service:

  • A child service can increase or decrease the parent service status by N severities, be ignored, apply a fixed status, or apply the status depending on the problem severity

For our example let’s use an HA cluster use case. HA clusters consist of multiple nodes – for our example, we will use 3 nodes.

  • First, we define that the HA cluster consists of 3 nodes – 3 child services.
  • Each node will have equal weight – 1
  • On the parent service, we will define multiple status rules
    • If the weight of the child services is 1 (1 node is down) – the parent service will change its status to Warning
    • If the weight of the child services is 2 (2 nodes are down) – the parent service will change its status to Average
    • If the weight of the child services is 3 (all nodes are down) – the parent service will change its status to Disaster

In the above image, we can see what the corresponding status change will look like in the Services section. Note that we can also see the root cause of the parent service status change in the Root cause column.
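The weight rules from the 3-node example can be sketched in Python as follows. This is an illustration of the logic only, not how Zabbix implements it; the rule table, the node names, and the function name are assumptions made for the example.

```python
# Severity names ordered from least to most severe.
SEVERITIES = ["OK", "Not classified", "Information", "Warning",
              "Average", "High", "Disaster"]

# Rules from the example: (minimum total weight of failed children, parent status).
RULES = [(1, "Warning"), (2, "Average"), (3, "Disaster")]


def parent_status(child_weights, failed_children, rules=RULES):
    """Return the parent status for a given set of failed child services."""
    failed_weight = sum(child_weights[name] for name in failed_children)
    status = "OK"
    for min_weight, rule_status in rules:
        matches = failed_weight >= min_weight
        # Keep the most severe status among all matching rules.
        if matches and SEVERITIES.index(rule_status) > SEVERITIES.index(status):
            status = rule_status
    return status


# Three hypothetical cluster nodes, each with weight 1.
weights = {"node1": 1, "node2": 1, "node3": 1}
for failed in ([], ["node1"], ["node1", "node3"], ["node1", "node2", "node3"]):
    print(failed, "->", parent_status(weights, failed))
```

Giving a node a larger weight in the same table would let a single failure of that node push the parent straight to a higher severity.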

We can also define the acceptable SLAs, as well as the uptime and downtime periods used in SLA calculation for our services. We have the option to define scheduled uptimes and downtimes, during which SLA should or shouldn’t be calculated (such as weekends), as well as one-time downtimes for one-time maintenance purposes.
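To illustrate how excluded periods change the result, here is a rough Python sketch of an SLA percentage calculation. It assumes that all problem time falls inside the counted (non-excluded) part of the period; the function name and the numbers are hypothetical, and the exact calculation Zabbix performs may differ.

```python
def sla_percent(period_seconds, problem_seconds, excluded_seconds=0):
    """Percentage of counted time during which the service was up.

    excluded_seconds: scheduled or one-time downtime that should not
    count towards the SLA (for example, weekends).
    Assumption: problem_seconds lies entirely within the counted time.
    """
    counted = period_seconds - excluded_seconds
    if counted <= 0:
        raise ValueError("no time left in the period after exclusions")
    uptime = counted - problem_seconds
    return 100.0 * uptime / counted


week = 7 * 24 * 3600
weekend = 2 * 24 * 3600
# 72 minutes of problems during a week in which weekends are excluded.
print(sla_percent(week, 72 * 60, excluded_seconds=weekend))
```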

Services can utilize tags to provide additional information about your services, such as the service type, service customer, service location, and more. On top of that, tags can also be used in the Service action condition logic, so you can define granular alerting logic for your service status changes.

The Child services tab allows you to quickly look at the related child services, their problem tags, and status calculation rules.

Child services can also be crosslinked between multiple parent services. This means that you don’t have to duplicate and recreate child services if they are used as a component of multiple parent services.

Track, solve and measure

Once we have configured our service, what remains is keeping track of our service statuses, SLAs and staying notified about service status changes and their root cause.

For this purpose, it is vital to secure access to our services. This is especially critical for MSPs, which may have multiple customers and each customer should have access only to the services related to that particular customer. To that end, the Roles section has also received an update related to the Service permissions. We can now define Read-Write and Read access to either specific services or services marked with a particular tag.

The Root cause section displays the root cause problems that affected the service status change. You will be able to click on the root cause problem and open it in the Problems section for further analysis of what caused your services to change their status and which host has been affected by it.

Previously I mentioned alerting on service status change, so let’s dig deeper into that. In Zabbix 6.0 LTS we have added a new type of action – Service actions. Zabbix can now react to service status changes and notify you when a service changes its status. The Service action conditions can analyze if a status has been changed on a particular service, a service that matches or contains a specific string in its name, tag, or tag value. If the conditions are true, Zabbix can send out an email, deliver a phone call or an SMS, create a ticket in your helpdesk system or perform any other alerting and notification workflow.

Many other BSM features are coming as we continue the development of Zabbix 6.0 LTS:

  • SLA graphical visualizations with support for over 100k services
  • Daily, Monthly, Weekly SLA reports
  • New service tree and SLA reporting widgets available from the dashboard
  • Service tree import and export
  • Impact analysis – see which service affects other related services in what way.

Questions

Q: Will the existing services be migrated to Zabbix 6.0 LTS?
A: The existing services will be migrated to Zabbix 6.0 LTS during the upgrade. All of the configuration for the existing services will stay intact after the migration.

Q: Does host maintenance suppress service calculation in Zabbix 6.0 LTS?
A: Host maintenance will not affect the service calculation. If you wish to define maintenance periods for your services, use the scheduled or one-time downtime options when configuring an individual service.

Q: How are the Fixed status and Ignore this service calculation rules going to work?
A: Fixed status services will not change their status no matter what happens to the child services – the service status will remain fixed. As for Ignore this service – the service status change will be ignored and will not affect the parent services.

What’s new in Zabbix 6.0 LTS by Artūrs Lontons / Zabbix Summit Online 2021

Post Syndicated from Arturs Lontons original https://blog.zabbix.com/whats-new-in-zabbix-6-0-lts-by-arturs-lontons-zabbix-summit-online-2021/17761/

Zabbix 6.0 LTS comes packed with many new enterprise-level features and improvements. Join Artūrs Lontons and take a look at some of the major features that will be available with the release of Zabbix 6.0 LTS.

The full recording of the speech is available on the official Zabbix YouTube channel.

If we look at the Zabbix roadmap and Zabbix 6.0 LTS release in particular, we can see that one of the main focuses of Zabbix development is releasing features that solve many complex enterprise-grade problems and use cases. Zabbix 6.0 LTS aims to:

  • Solve enterprise-level security and redundancy requirements
  • Improve performance for large Zabbix instances
  • Provide additional value to different types of Zabbix users – DevOps and ITOps teams, business process owners, and managers
  • Continue to extend Zabbix monitoring and data collection capabilities
  • Provide continued delivery of official integrations with 3rd party systems

Let’s take a look at the specific Zabbix 6.0 LTS features that can guide us towards achieving these goals.

Zabbix server High Availability cluster

With the release of Zabbix 6.0 LTS, Zabbix administrators will now have the ability to deploy Zabbix server HA cluster out-of-the-box. No additional tools are required to achieve this.

Zabbix server HA cluster supports an unlimited number of Zabbix server nodes. All nodes will use the same database backend – this is where the status of all nodes will be stored in the ha_node table. Nodes will report their status every 5 seconds by updating the corresponding record in the ha_node table.

To enable High availability, you will first have to define a new parameter in the Zabbix server configuration file: HANodeName

  • Empty by default
  • This parameter should contain an arbitrary name of the HA node
  • Providing a value for this parameter will enable Zabbix server cluster mode

Standby nodes monitor the last access time of the active node from the ha_node table.

  • If the difference between last access time and current time reaches the failover delay, the cluster fails over to the standby node
  • Failover operation is logged in the Zabbix server log

It is possible to define a custom failover delay – a time window after which an unreachable active node is considered lost and failover to one of the standby nodes takes place.
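The failover decision described above boils down to a single time comparison. The Python sketch below illustrates that comparison only and is not the server's actual code; the function name is hypothetical, and the 60-second delay is an example value that should be replaced with your configured failover delay.

```python
def should_fail_over(active_lastaccess, now, failover_delay=60):
    """Return True when the active node should be considered lost.

    active_lastaccess: last access time of the active node (Unix seconds),
                       as a standby node would read it from the ha_node table.
    now:               current time (Unix seconds).
    failover_delay:    allowed silence in seconds (example value).
    """
    return (now - active_lastaccess) >= failover_delay


# The active node last reported at t=1000.
print(should_fail_over(1000, 1005))   # a heartbeat interval later
print(should_fail_over(1000, 1060))   # silent for the full failover delay
```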

As for the Zabbix proxies, the Server parameter in the Zabbix proxy configuration file now supports multiple addresses separated by a semicolon. The proxy will attempt to connect to each of the nodes until it succeeds.

Other HA cluster related features:

  • New command-line options to check HA cluster status
  • hanode.get API method to obtain the list of HA nodes
  • The new internal check provides LLD information to discover Zabbix server HA nodes
  • HA Failover event logged in the Zabbix Audit log
  • Zabbix Frontend will automatically switch to the active Zabbix server node
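As a hedged illustration of the hanode.get method mentioned in the list, the Python snippet below only builds a JSON-RPC request body and does not send it. The token is a placeholder, and the request layout (an "auth" field in the body) is an assumption based on the Zabbix 6.0 JSON-RPC API, so check the API documentation for your version before relying on it.

```python
import json


def build_hanode_get_request(auth_token, request_id=1):
    """Build a JSON-RPC 2.0 body that asks for the list of HA nodes."""
    return {
        "jsonrpc": "2.0",
        "method": "hanode.get",
        # "extend" asks the API to return all available fields.
        "params": {"output": "extend"},
        # Placeholder token; the "auth" field placement is an assumption.
        "auth": auth_token,
        "id": request_id,
    }


payload = build_hanode_get_request("<your-api-token>")
print(json.dumps(payload, indent=2))
```

The resulting body would be sent as an HTTP POST with a JSON content type to the API endpoint of your frontend.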

You can find a more detailed look at the Zabbix Server HA cluster feature in the Zabbix Summit Online 2021 speech dedicated to the topic.

Business service monitoring

The Services section has received a complete redesign in Zabbix 6.0 LTS. Business Service Monitoring (BSM) enables Zabbix administrators to define services of varying complexity and monitor their status.

BSM provides added value in a multitude of use cases, where we wish to define and monitor services based on:

  • Server clusters
  • Services that utilize load balancing
  • Services that consist of a complex IT stack
  • Systems with redundant components in place
  • And more

Business Service monitoring has been designed with scalability in mind. Zabbix is capable of monitoring over 100k services on a single Zabbix instance.

For our Business Service example, we used a website, which depends on multiple components such as the network connection, DB backend, Application server, and more. We can see that the service status calculation is done by utilizing tags and deciding if the existing problems will affect the service based on the problem tags.

In Zabbix 6.0 LTS there are many ways to perform service status calculations. In case of a problem, the service state can be changed to:

  • The most critical problem severity, based on the child service problem severities
  • The most critical problem severity, based on the child service problem severities, only if all child services are in a problem state
  • The service is set to constantly be in an OK state

The service status can also be changed to a specific problem severity if:

  • At least N or N% of child services have a specific status
  • The combined weight of the affected child services, based on the weights defined for each service, calls for it
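The "N% of child services" condition can be sketched in Python like this. The sketch treats "have a specific status" as "are at that status or worse", which is an assumption, and the function name and severity table are illustrative rather than taken from Zabbix.

```python
# Numeric severities, from OK (-1) up to Disaster (5).
SEVERITY = {"OK": -1, "Not classified": 0, "Information": 1,
            "Warning": 2, "Average": 3, "High": 4, "Disaster": 5}


def percent_rule_matches(child_statuses, threshold, min_percent):
    """True if at least min_percent of children are at `threshold` or worse."""
    if not child_statuses:
        return False
    affected = sum(1 for status in child_statuses
                   if SEVERITY[status] >= SEVERITY[threshold])
    return 100.0 * affected / len(child_statuses) >= min_percent


children = ["OK", "High", "Disaster", "OK"]
print(percent_rule_matches(children, "High", 50))   # 2 of 4 children affected
print(percent_rule_matches(children, "High", 75))
```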

There are many other additional features, all of which are covered in our Zabbix Summit Online 2021 speech dedicated to Business Service monitoring:

  • Ability to define permissions on specific services
  • SLA monitoring
  • Business Service root cause analysis
  • Receive alerts and react on Business Service status change
  • Define Business Service permissions for multi-tenant environments

New Audit log schema

The existing audit log has been redesigned from scratch and now supports detailed logging for both Zabbix server and Zabbix frontend operations:

  • Zabbix 6.0 LTS introduces a new database structure for the Audit log
  • Collision resistant IDs (CUID) will be used for ID generation to prevent audit log row locks
  • Audit log records will be added in bulk SQL requests
  • Introducing Recordset ID column. This will help users recognize which changes have been made in a particular operation

The goal of the Zabbix 6.0 LTS audit log redesign is to provide reliable and detailed audit logging while minimizing the potential performance impact on large Zabbix instances:

  • Detailed logging of both Zabbix frontend and Zabbix server records
  • Designed with minimal performance impact in mind
  • Accessible via Zabbix API

Implementing the new audit log schema is an ongoing effort – further improvements will be done throughout the Zabbix update life cycle.

Machine learning

New trend functions have been added which utilize machine learning to perform anomaly detection and baseline monitoring:

  • New trend function – trendstl, allows you to detect anomalous metric behavior
  • New trend function – baselinewma, returns baseline by averaging data periods in seasons
  • New trend function – baselinedev, returns the number of standard deviations
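To give a feel for what "number of standard deviations" means in the last item, here is a small Python sketch. It is not the Zabbix implementation: the choice of population standard deviation, the function name, and the sample values are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def deviations_from_baseline(season_values, last_value):
    """How many standard deviations last_value lies from the seasonal mean.

    season_values: values for the same data period in preceding seasons
                   (for example, the same hour on previous days).
    """
    baseline = mean(season_values)
    spread = pstdev(season_values)  # population standard deviation
    if spread == 0:
        # All previous seasons were identical; any difference is unbounded.
        return 0.0 if last_value == baseline else float("inf")
    return abs(last_value - baseline) / spread


# Hypothetical CPU load for the same hour on the four previous days.
print(deviations_from_baseline([10, 10, 14, 14], 16))
```

A trigger could then fire when the returned number exceeds a chosen threshold, such as 3 deviations.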

An in-depth look into Machine learning in Zabbix 6.0 LTS is covered in our Zabbix Summit Online 2021 speech dedicated to machine learning, anomaly detection, and baseline monitoring.

New ways to visualize your data

Collecting and processing metrics is just a part of the monitoring equation. Visualization and the ability to display our infrastructure status in a single pane of glass are also vital to large environments. Zabbix 6.0 LTS adds multiple new visualization options while also improving the existing features.

  • The data table widget allows you to create a summary view for the related metric status on your hosts
  • The Top N and Bottom N functions of the data table widget allow you to have an overview of your highest or lowest item values
  • The single item widget allows you to display values for a single metric
  • Improvements to the existing vector graphs such as the ability to reference individual items and more
  • The SLA report widget displays the current SLA for services filtered by service tags

We are proud to announce that Zabbix 6.0 LTS will provide a native Geomap widget. Now you can take a look at the current status of your IT infrastructure on a geographic map:

  • The host coordinates are provided in the host inventory fields
  • Users will be able to filter the map by host groups and tags
  • Depending on the map zoom level – the hosts will be grouped into a single object
  • Support of multiple Geomap providers, such as OpenStreetMap, OpenTopoMap, Stamen Terrain, USGS US Topo, and others

Zabbix agent – improvements and new items

Zabbix agent and Zabbix agent 2 have also received some improvements. From new items to improved usability – both Zabbix agents are now more flexible than ever. The improvements include such features as:

  • New items to obtain additional file information such as file owner and file permissions
  • New item which can collect agent host metadata as a metric
  • New item with which you can count matching TCP/UDP sockets
  • It is now possible to natively monitor your SSL/TLS certificates with a new Zabbix agent2 item. The item can be used to validate a TLS/SSL certificate and provide you additional certificate details
  • User parameters can now be reloaded without having to restart the Zabbix agent

In addition, a major improvement to introducing new Zabbix agent 2 plugins has been made. Zabbix agent 2 now supports loading stand-alone plugins without having to recompile the Zabbix agent 2.

Custom Zabbix password complexity requirements

One of the main improvements to Zabbix security is the ability to define flexible password complexity requirements. Zabbix Super admins can now define the following password complexity requirements:

  • Set the minimum password length
  • Define password character requirements
  • Mitigate the risk of a dictionary attack by prohibiting the usage of the most common password strings
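The three kinds of requirements listed above can be sketched as a simple Python check. This is an illustration of the idea, not Zabbix's validation code; the deny list, the default minimum length, and the specific character rules are assumptions.

```python
import string

# Hypothetical deny list; a real one would be far longer.
COMMON_PASSWORDS = {"password", "123456", "qwerty", "zabbix"}


def password_problems(password, min_length=8):
    """Return a list of reasons why the password is not acceptable."""
    problems = []
    if len(password) < min_length:
        problems.append("too short")
    has_upper = any(c.isupper() for c in password)
    has_lower = any(c.islower() for c in password)
    if not (has_upper and has_lower):
        problems.append("needs upper- and lowercase letters")
    if not any(c.isdigit() for c in password):
        problems.append("needs a digit")
    if not any(c in string.punctuation for c in password):
        problems.append("needs a special character")
    if password.lower() in COMMON_PASSWORDS:
        problems.append("too common")
    return problems


print(password_problems("Str0ng!Passw0rd"))
print(password_problems("password"))
```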

UI/UX improvements

Improving and simplifying the existing workflows is always a priority for every major Zabbix release. In Zabbix 6.0 LTS we’ve added many seemingly simple improvements that have a major impact on the “feel” of the product and can make your day-to-day workflows even smoother:

  • It is now possible to create hosts directly from Monitoring → Hosts
  • Removed the Monitoring → Overview section. For improved user experience, the trigger and data overview functionality can now be accessed only via dashboard widgets.
  • The default type of information for items will now be selected automatically depending on the item key.
  • The simple macros in map labels and graph names have been replaced with expression macros to ensure consistency with the new trigger expression syntax

New templates and integrations

Adding new official templates and integrations is an ongoing process, and Zabbix 6.0 LTS is no exception. Here’s a preview of some of the new templates and integrations that you can expect in Zabbix 6.0 LTS:

  • F5 BIG-IP
  • Cisco ASAv
  • HPE ProLiant servers
  • Cloudflare
  • InfluxDB
  • Travis CI
  • Dell PowerEdge

Zabbix 6.0 also brings a new GitHub webhook integration which allows you to generate GitHub issues based on Zabbix events!

Other changes and improvements

But that’s not all! There are more features and improvements that await you in Zabbix 6.0 LTS. From overall performance improvements on specific Zabbix components, to brand new history functions and command-line tool parameters:

  • Detect continuous increase or decrease of values with new monotonic history functions
  • Added utf8mb4 as a supported MySQL character set and collation
  • Added the support of additional HTTP methods for webhooks
  • Timeout settings for Zabbix command-line tools
  • Performance improvements for Zabbix Server, Frontend, and Proxy
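As an illustration of what the monotonic history functions in the first item detect, here is a short Python sketch of a continuous-increase check. It shows the concept only; the function name and the strict flag are assumptions for the example and do not describe the Zabbix function signature.

```python
def is_monotonic_increase(values, strict=False):
    """True if every value is >= the previous one (or > when strict).

    Sequences with fewer than two values are trivially monotonic.
    """
    pairs = list(zip(values, values[1:]))
    if strict:
        return all(a < b for a, b in pairs)
    return all(a <= b for a, b in pairs)


# Hypothetical disk usage samples, oldest first.
print(is_monotonic_increase([40, 42, 42, 47]))
print(is_monotonic_increase([40, 42, 42, 47], strict=True))
print(is_monotonic_increase([40, 39, 45]))
```

The decreasing variant is the same check with the comparisons reversed.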

Questions and answers

Q: How can you configure geographical maps? Are they similar to regular maps?

A: Geomaps can be used as a Dashboard widget. First, you have to select a Geomap provider in the Administration – General – Geographical maps section. You can either use the pre-defined Geomap providers or define a custom one. Then, you need to make sure that the Location latitude and Location longitude fields are configured in the Inventory section of the hosts which you wish to display on your map. Once that is done, simply deploy a new Geomap widget, filter the required hosts and you’re all set. Geomaps are currently available in the latest alpha release, so you can get some hands-on experience right now.

Q: Any specific performance improvements that we can discuss at this point for Zabbix 6.0 LTS?

A: There have been quite a few. From the frontend side – we have improved the underlying queries that are related to linking new templates, therefore the template linkage performance has increased. This will be very noticeable in large instances, especially when linking or unlinking many templates in a single go.
There have also been improvements to Server – Proxy communication. Specifically – the logic of how proxy frees up uncompressed data. We’ve also introduced improvements on the DB backend side of things – from general improvements to existing queries/logic, to the introduction of primary keys for history tables, which we are still extensively testing at this point.

Q: Will you still be able to change the type of information manually, in case you have some advanced preprocessing rules?

A: In Zabbix 6.0 LTS Zabbix will try and automatically pick the corresponding type of information for your item. This is a great UX improvement since you don’t have to refer to the documentation every time you are defining a new item. And, yes, you will still be able to change the type of information manually – either because of preprocessing rules or if you’re simply doing some troubleshooting.

Summary of Zabbix Summit Online 2021, Zabbix 6.0 LTS release date and Zabbix Workshops

Post Syndicated from Arturs Lontons original https://blog.zabbix.com/summary-of-zabbix-summit-online-2021-zabbix-6-0-lts-release-date-and-zabbix-workshops/17155/

Now that the Zabbix Summit Online 2021 has concluded, we are thrilled to report we hosted attendees from over 3000 organizations from more than 130 countries all across the globe.

This year, the main focus of the speeches was the upcoming Zabbix 6.0 LTS release, as well as automating Zabbix data collection and configuration, integrating Zabbix within existing company infrastructures, and migrating from legacy tools to Zabbix. In total, 21 speakers presented their use cases and talked about new Zabbix features during the Summit, delivering over 8 hours of content.

In case you missed the Summit or wish to come back to some of the speeches – both the presentations (in PDF format) and the videos of the speeches are available on the Zabbix Summit Online 2021 Event page.

Zabbix 6.0 LTS release date

As for Zabbix 6.0 LTS – as per our statement during the event, you can expect Zabbix 6.0 LTS to release in early 2022. At the time of this post, the latest pre-release version is Zabbix 6.0 Alpha 7, with the first Beta version scheduled for release VERY soon. Feel free to deploy the latest pre-release version and take a look at features such as Geomaps, Business Service monitoring, improved Audit log, UX improvements, Anomaly detection with Machine Learning, and more! The list of the latest released Zabbix 6.0 versions as well as the improvements and fixes they contain is available in the Release notes section of our website.

Zabbix 6.0 LTS Workshops

The workshops will focus on particular Zabbix 6.0 LTS features and will be available once the Zabbix 6.0 LTS is released. The workshops will provide a unique chance to learn and practice the configuration of specific Zabbix 6.0 LTS features under the guidance of a certified Zabbix trainer at absolutely no cost! Some of the topics covered in the workshops will include – Deploying Zabbix server HA cluster, Creating triggers for Baseline monitoring and Anomaly detection, Displaying your infrastructure status on Geomaps, Deploying Business Service monitoring with root cause analysis, and more!

Upcoming events

But there’s more! On December 9, 2021, Zabbix will host PostgreSQL Monitoring Day with Zabbix & Postgres Pro. The speeches will focus on monitoring PostgreSQL databases, running Zabbix on PostgreSQL DB backends with TimescaleDB, and securing your Zabbix + PostgreSQL instances. If you’re currently using PostgreSQL DB backends or plan to do so in the future, you definitely don’t want to miss out!

As for 2022 – you can expect multiple meetups regarding Zabbix 6.0 LTS features and use cases, as well as events focused on specific monitoring use cases. More information will be publicly available with the release of Zabbix 6.0 LTS.

Zabbix 6.0 LTS at Zabbix Summit Online 2021

Post Syndicated from Arturs Lontons original https://blog.zabbix.com/zabbix-6-0-lts-at-zabbix-summit-online-2021/16115/

With Zabbix Summit Online 2021 just around the corner, it’s time to have a quick overview of the 6.0 LTS features that we can expect to see featured during the event. The Zabbix 6.0 LTS release aims to deliver some of the long-awaited enterprise-level features while also improving the general user experience, performance, scalability, and many other aspects of Zabbix.

Native Zabbix server cluster

Many of you will be extremely happy to hear that Zabbix 6.0 LTS release comes with out-of-the-box High availability for Zabbix Server. This means that HA will now be supported natively, without having to use external tools to create Zabbix Server clusters.

The native Zabbix Server cluster will have a speech dedicated to it during the Zabbix Summit Online 2021. You can expect to learn about the inner workings of the HA solution, its configuration, and of course the main benefits of using the native HA solution. You can also take a look at the in-development version of the native Zabbix server cluster in the latest Zabbix 6.0 LTS alpha release.

Business service monitoring and root cause analysis

Service monitoring is also about to go through a significant redesign, focusing on delivering additional value by providing robust Business service monitoring (BSM) features. This is achieved by delivering significant additions to the existing service status calculation logic. With features such as service weights, service status analysis based on child problem severities, ability to calculate service status based on the number or percentage of children in a problem state, users will be able to implement BSM on a whole new level. BSM will also support root cause analysis – users will be informed about the root cause problem of the service status change.

All of this and more, together with examples and use cases will be covered during a separate speech dedicated to BSM. In addition, some of the BSM features are available in the latest Zabbix 6.0 LTS alpha release – with more to come as we continue working on the Zabbix 6.0 release.

Audit log redesign

The Audit log is another existing feature that has received a complete redesign. With the ability to log each and every change performed both by the Zabbix Server and Zabbix Frontend, the Audit log will become an invaluable source of audit information. Of course, the redesign also takes performance into consideration – the redesign was developed with the least possible performance impact in mind.

The audit log is constantly in development and the current Zabbix 6.0 LTS alpha release offers you an early look at the feature. We will also be covering the technical details of the new audit log implementation during the Summit and will explain how we are able to achieve minimal performance impact with major improvements to Zabbix audit logging.

Geographical maps

With Geographical maps, our users can finally display their entities on a geographical map based on the coordinates of the entity. Geographical maps can be used with multiple geographical map providers and display your hosts with their most severe problems. In addition, geographical maps will react dynamically to Zoom levels and support filtering.

The latest Zabbix 6.0 Alpha release includes the Geomap widget – feel free to deploy it in your QA environment, check out the different map providers, filter options and other great features that come with this widget.

Machine learning

When it comes to problem detection, Zabbix 6.0 LTS will deliver multiple new trend functions. A specific set of functions provides machine learning functionality for anomaly detection and baseline monitoring.

The topic will be covered in depth during Zabbix Summit Online 2021. We will look at the configuration of the new functions and also take a deeper dive into the logic and algorithms used under the hood.
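
To give a feel for what baseline monitoring means, here is a minimal sketch. It is not the algorithm Zabbix uses; it simply assumes that a value is compared against the same period in previous seasons (for example, the same hour on previous days) and expresses the difference in standard deviations. The function name and inputs are hypothetical.

```python
from statistics import mean, stdev


def baseline_deviation(current, seasonal_values):
    """Number of sample standard deviations between `current` and the seasonal mean.

    `seasonal_values` holds the values observed in the same period of
    earlier seasons (an assumption of this sketch).
    """
    if len(seasonal_values) < 2:
        raise ValueError("need at least two seasonal values")
    spread = stdev(seasonal_values)
    baseline = mean(seasonal_values)
    if spread == 0:
        # All past values were identical: any difference is infinitely unusual.
        return 0.0 if current == baseline else float("inf")
    return abs(current - baseline) / spread
```

A trigger built on such a value could, for instance, fire when the deviation exceeds 3, although the right threshold depends on the metric.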

During the Zabbix Summit Online 2021, we will also cover many other new features, such as:

  • New Dashboard widgets
  • New items for Zabbix Agent
  • New templates and integrations
  • Zabbix login password complexity settings
  • Performance improvements for Zabbix Server, Zabbix Proxy, and Zabbix Frontend
  • UI and UX improvements
  • New history and trend functions
  • And more!

Not only will you get an early look at many new features not yet available in the latest alpha release, but you will also have a great chance to learn the inner workings of the new features, the upgrade and migration process to Zabbix 6.0 LTS, and much more!

We are extremely excited to share all of the new features with our community, so don’t miss out – take a look at the full Zabbix Summit Online 2021 agenda and register for the event by visiting our Zabbix Summit page, and we will see you at the Zabbix Summit Online 2021 on November 25!

Zabbix migration in a mid-sized bank environment

Post Syndicated from Angelo Porta original https://blog.zabbix.com/zabbix-migration-in-a-mid-sized-bank-environment/13040/

A real CheckMK/LibreNMS to Zabbix migration for a mid-sized Italian bank (1,700 branches, many thousands of servers and switches). The customer needed a very robust architecture, plus ancillary services around the Zabbix engine to keep the configuration reliable and error-free.

Content

I. Bank monitoring landscape (1:45)
II. Zabbix monitoring project
III. Questions & Answers (19:40)

Bank monitoring landscape

The bank is one of the 25 largest European banks by market capitalization and one of the 10 largest banks in Italy by:

  • branch network,
  • loans to customers,
  • direct funding from customers,
  • total assets.

At the end of 2019, at least 20 different monitoring tools were in use at the bank:

  • LibreNMS for networking,
  • CheckMK for non-Microsoft servers,
  • Zabbix for some limited areas inside DCs,
  • Oracle Enterprise Monitor,
  • Microsoft SCCM,
  • custom monitoring tools (periodic plain counters, direct HTML page access, complex dashboards, etc.)

Each alert triggered hundreds of emails to different people, which made it impossible to actually monitor the environment. There was no central monitoring, and monitoring efforts were fragmented.

The bank requirements:

  • Single pane of glass for two Data Centers and branches.
  • Increased monitoring capabilities.
  • Secured environment (end-to-end encryption).
  • More automation and audit features.
  • Separate monitoring of two DCs and branches.
  • No direct monitoring: all traffic via Zabbix Proxy.
  • Revised and improved alerting schema/escalation.
  • Parallel with CheckMK and LibreNMS for a certain period of time.

Why Zabbix?

The bank chose Zabbix over its competitors for many reasons:

  • better feature coverage across the network/server/software environment;
  • opportunity to integrate with other internal bank software;
  • continuous enhancements on every Zabbix release;
  • the best integration with automation software (Ansible); and
  • the personnel's previous experience and skills.

Zabbix central infrastructure — DCs

First, we had to design a single infrastructure able to monitor many thousands of devices across two data centers and the branches, handling a large number of items and thousands of new values per second.

The architecture is now based on two database servers clustered using Patroni and etcd, as well as many Zabbix proxies (one for each environment: preproduction, production, test, and so on). There are two Zabbix servers, one for the DCs and another for the branches. We also suggested deploying a third Zabbix server to monitor the two main Zabbix servers. The DC database is replicated on the branches DB server, while the branches DB is replicated on the server handling the DCs using Patroni, so two copies of each database are available at any point in time. The two data centers are located more than 50 kilometers apart. In this picture, the focus is on DC monitoring:

Zabbix central infrastructure — DCs

Zabbix central infrastructure — branches

In this picture the focus is on branches.

Before starting the project, we planned one proxy for each branch, that is, roughly 1,500 proxies. We changed this initial choice during implementation and reduced the number of branch proxies to four.

Zabbix central infrastructure — branches

Zabbix monitoring project

New infrastructure

Hardware

  • Two-node bare-metal cluster for the PostgreSQL DB.
  • Two bare-metal Zabbix engines, each with 2 Intel Xeon Gold 5120 2.2 GHz, 14C/28T processors, 4 NVMe disks, 256 GB RAM.
  • A single VM for Zabbix MoM.
  • Another bare-metal server for database backups.

Software

  • OS RHEL 7.
  • PostgreSQL 12 with the TimescaleDB 1.6 extension.
  • Patroni Cluster 1.6.5 for managing Postgres/TimescaleDB.
  • Zabbix Server 5.0.
  • Proxies for metrics collection (5 for each DC and 4 for branches).

Zabbix templates customization

We started from the official Zabbix 5.0 templates, then removed many metrics and adjusted the templates with the large number of monitored servers and devices in mind. We have:

  • added throttling and keepalive tuning for massive monitoring;
  • relaxed some triggers and their recovery conditions to avoid false positives and false negatives;
  • developed a new custom template module for Linux Multipath monitoring;
  • developed a new custom template for NFS/CIFS monitoring (ZBXNEXT-6257);
  • developed a new custom Webhook for event ingestion on third-party software (CMS/Ticketing).
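
The throttling and keepalive tuning mentioned in the list above can be sketched as follows. This is a simplified illustration of the idea behind discarding unchanged values with a heartbeat, not Zabbix source code and not the bank's configuration; the class and method names are hypothetical.

```python
class HeartbeatThrottle:
    """Store a value only if it changed or the heartbeat period has elapsed."""

    def __init__(self, heartbeat_seconds):
        self.heartbeat_seconds = heartbeat_seconds
        self._last_value = None
        self._last_stored_at = None  # timestamp of the last stored value

    def should_store(self, timestamp, value):
        first = self._last_stored_at is None
        changed = first or value != self._last_value
        expired = (
            not first
            and timestamp - self._last_stored_at >= self.heartbeat_seconds
        )
        if changed or expired:
            # Remember what was stored so later samples are compared against it.
            self._last_value = value
            self._last_stored_at = timestamp
            return True
        return False
```

With a 60-minute heartbeat and a sample every 5 minutes, an item whose value never changes is stored once at the start and then once per hour, while any change is stored immediately.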

Zabbix configuration and provisioning

  • An essential part of the project was Zabbix configuration and provisioning, which was handled using Ansible tasks and playbooks. This allowed us to distribute and automate agent installation, to associate templates with hosts according to their role in the environment, and to assign host groups using the CMDB.
  • We have also developed some custom scripts, for instance, to keep users aligned with Active Directory.
  • We implemented single sign-on using Active Directory Federation Services and Zabbix SAML 2.0 in order to interface with Microsoft Active Directory.
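
The project used Ansible for provisioning; the Python sketch below only illustrates the underlying idea of deriving templates from a host's role and building a Zabbix API `host.create` request from a CMDB record. The mapping, the IDs, and the CMDB record fields are hypothetical, and the request layout (including `proxy_hostid` and the `auth` field) assumes the Zabbix 5.0 JSON-RPC API.

```python
# Hypothetical role-to-template mapping; the template IDs are placeholders.
ROLE_TEMPLATES = {
    "linux-server": ["10001"],
    "branch-switch": ["10218"],
}


def build_host_create_request(cmdb_record, proxy_hostid, auth_token, request_id=1):
    """Build a host.create JSON-RPC request body from a CMDB record.

    cmdb_record is assumed to look like:
    {"hostname": "srv01", "ip": "192.0.2.10", "role": "linux-server", "groupids": ["2"]}
    """
    role = cmdb_record["role"]
    if role not in ROLE_TEMPLATES:
        raise ValueError(f"no templates defined for role {role!r}")
    return {
        "jsonrpc": "2.0",
        "method": "host.create",
        "params": {
            "host": cmdb_record["hostname"],
            # Single Zabbix agent interface (type 1) on the default port.
            "interfaces": [{
                "type": 1, "main": 1, "useip": 1,
                "ip": cmdb_record["ip"], "dns": "", "port": "10050",
            }],
            "groups": [{"groupid": g} for g in cmdb_record["groupids"]],
            "templates": [{"templateid": t} for t in ROLE_TEMPLATES[role]],
            # All monitoring goes through a proxy, as the bank required.
            "proxy_hostid": proxy_hostid,
        },
        "auth": auth_token,
        "id": request_id,
    }
```

The returned dictionary can be serialized with `json.dumps` and posted to the API endpoint; keeping the role mapping in one place means that a change of template assignment needs only one edit.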

 

Issues found and solved

During the implementation, we found and solved many issues.

  • A dedicated proxy for each of the 1,500 branches turned out to be too expensive to maintain and support. So we decided to deploy fewer proxies and managed to connect all the devices in the branches using only four proxies.
  • Following deployment of all the metrics and the templates associated with over 10,000 devices, the data center database exceeded 3.5 TB. To decrease the size of the database, we tuned throttling and keep-alive: we increased the keep-alive from 15 to 60 minutes and lowered the sample interval to 5 minutes.
  • There is no official Zabbix Agent for the Solaris 10 operating system. So, we needed to recompile and test this agent extensively.
  • The preprocessing step is not available for NFS stale status (ZBXNEXT-6257).
  • We needed to increase the maximum length of user macros to 2,048 characters on the server side (ZBXNEXT-2603).
  • We needed to request support for user macros in JavaScript preprocessing (ZBXNEXT-5185).
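
The effect of the keep-alive change on database growth can be estimated with simple arithmetic. The sketch below assumes the worst case for storage savings to matter, namely an item whose value never changes, so that only keep-alive values are written; the function name is hypothetical and the real reduction depends on how often values actually change.

```python
import math


def stored_values_per_day(sample_interval_min, heartbeat_min=None):
    """Values written per day for an item whose value never changes."""
    minutes_per_day = 24 * 60
    if heartbeat_min is None:
        # No throttling: every sample is stored.
        return minutes_per_day // sample_interval_min
    # A value is stored on the first sample at or after the heartbeat expiry.
    samples_per_store = max(1, math.ceil(heartbeat_min / sample_interval_min))
    return minutes_per_day // (samples_per_store * sample_interval_min)


print(stored_values_per_day(5))      # 288 without throttling
print(stored_values_per_day(5, 15))  # 96 with a 15-minute keep-alive
print(stored_values_per_day(5, 60))  # 24 with a 60-minute keep-alive
```

Under these assumptions, moving the keep-alive from 15 to 60 minutes cuts the stored values for an unchanged item by a factor of four.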

Project deliverables

  • The project was started in April 2020, and massive deployment followed in July/August.
  • At the moment, we have over 5,000 monitored servers in two data centers and over 8,000 monitored devices in branches — servers, ATMs, switches, etc.
  • Currently, the data center database is less than 3.5 TB, and the branches’ database is about 0.5 TB.
  • We monitor two data centers with over 3,800 NVPS (new values per second).
  • Decommissioning of LibreNMS and CheckMK is planned for the end of 2020.

Next steps

  • To complete the data center monitoring for other devices — to expand monitoring to networking equipment.
  • To complete branch monitoring for switches and Wi-Fi AP.
  • To implement Custom Periodic reporting.
  • To integrate with C-level dashboard.
  • To tune alerting and escalation to send the right messages to the right people so that messages will not be discarded.

Questions & Answers

Question. Have you considered upgrading to Zabbix 5.0 and using TimescaleDB compression? Which TimescaleDB features are you most interested in: partitioning or compression?

Answer. We plan to upgrade to Zabbix 5.0 later. First, we need to carry out stress testing of our infrastructure. So, we might wait for a minor release and then activate compression.

We use Postgres solutions for database, backup, and cluster management (Patroni), and TimescaleDB is important for managing all this data efficiently.

Question. What is the expected NVPS for this environment?

Answer. Nearly 4,000 for the main DC and about 500 for the branches — a medium-large instance.

Question. What methods did you use to migrate from your numerous different solutions to Zabbix?

Answer. We used the easy method: we installed everything from scratch, as migrating from so many different solutions would have been a complex task. Most of the time, we ran all the monitoring solutions in parallel to check whether Zabbix could collect the same monitoring information.