Tag Archives: How-to

Podman Container Monitoring with Prometheus Exporter, part 1

2025-06-10 Janis Eidaks

Post Syndicated from Janis Eidaks original https://blog.zabbix.com/podman-container-monitoring-with-prometheus-exporter-part-1/30513/

In part one of this blog post, I will show you how to monitor Podman pods using HTTP agent item to retrieve data from the Prometheus Podman exporter. Let’s get started!

Table of Contents

Installing and checking Prometheus Podman exporter

First, you will need to install and enable the Prometheus Podman exporter (my OS is CentOS Stream release 9). Then, check that the service is active and running.

# dnf install -y prometheus-podman-exporter

# systemctl enable prometheus-podman-exporter –now

# systemctl status prometheus-podman-exporter

You can check that you are getting the data from the exporter with either the curl command from the machine/VM where the Prometheus podman exporter is installed and started:

# curl http://localhost:9882/metrics

Fig 1. Output of Prometheus podman exporter in CLI

Or through the browser (replace abc with the machine’s IP/DNS ): abc:9882/metrics.

Fig 2. Output of Prometheus Podman exporter in browser

A line starting with # is a comment and contains an explanation regarding the metric; in this case, podman_container_block_input_total will return data in bytes. In Figure 2, after the comments, you can see several podman_container_block_input_total metrics, one for each container, with different container IDs, pod IDs, and pod names listed in each metric. The metric’s value is displayed on the right side after curly brackets.

Creating a template and template items

Next, I will create a template Podman containers by HTTP and Prometheus where I will put all of the entities (everything will be created on the template). In the template, I will create an item Podman info, which will gather all of the necessary data at defined intervals. This approach will be convenient from a data collection standpoint as the same item data will be used for LLD and item prototypes. During testing, you can set “History” to store data for some time, and when everything is working as expected, then set “History” not to keep any data. This item will be used for the Low-Level Discovery rule and the item prototype.

The item Podman info parameters are as follows:

Template: Podman containers by HTTP and Prometheus

○ Item
  ▪ Name:         Podman info
  ▪ Type          HTTP agent
  ▪ Key:          podman.info
  ▪ Type of inf   text
  ▪ URL           http://{HOST.CONN}:9882/metrics
  ▪ Request type  GET
  ▪ Update int.   5m
  ▪ Req status c. 200
  ▪ History       Do not store
◊ Tags
  ▪ Podman:raw

At this moment, this item will contain just raw data, without any preprocessing steps applied. The IP address will be taken from any host interface added to the host. You will get an error message if the host has no interface.

Fig 4. Error on the host with the linked template without any interface

If you do not want to add an interface to the host, you can define a user macro on the template level and use that user macro in the items URL. After adding the template to the host, just modify the user macro value on the host to correct IP/DNS name.

Fig 6. Template item for data gathering with user macro instead of built in macro from host interface

I can also create an item to determine the number of containers created. I can count specific Prometheus pattern occurrences in the master item to determine this. For this, I will use the podman_container_state parameter. Likewise, I could use different parameters, such as podman_container_info, and count the occurrences of such a pattern. The parameters of the item container count:

Template: Podman containers by HTTP and Prometheus

○ Item
  ▪ Name:         Container count
  ▪ Type          Dependent item
  ▪ Key:          container.count
  ▪ Type of inf   Numeric (unsigned)
  ▪ Master item   Podman containers by HTTP and Prometheus: Podman info
◊ Tags
  ▪ Containers:total
♯ Preprocessing
  ▪ Prometheus pattern     podman_container_state     count

Fig 7. Template item preprocessing step for counting the total number of containers

Creating a Discovery rule in template

Next, the LLD rule will be created to discover Podman pods. It will be a dependent LLD rule based on a Podman info item with a preprocessing step to convert the Prometheus pattern data to JSON format. The caveat is that the LLD discovery will be executed as frequently as the data is received for the item. If there are a lot of hosts with such a template, there will be a lot of LLD processes executed, which can put a strain on your Zabbix instance.

To rectify this issue, I will add a preprocessing step: discard unchanged with heartbeat (as there are no dynamic parameters in the extracted pattern, otherwise we would need to filter out dynamically changing information). For LLD discovery, the recommended interval is around 1h. Additionally, LLD macros will be created from selected JSNOPath variables. The parameters of the LLD rule are shown below.

Template: Podman containers by HTTP and Prometheus

▲ Discovery rule
  ▪ Name:                   POD discovery
  ▪ Type                    Dependent item
  ▪ Key:                    training.pod.discovery
  ▪ Master item             Podman containers by HTTP and Prometheus: Podman info
  ▪ Delete lost resources  After 10d
  ▪ Disable lost resources Immediately
♯ Preprocessing
  ▪ Prometheus to JSON     podman_pod_info
  ▪ Discard unchanged with heartbeat 1h
♦ LLD Macros
  ▪ {#POD.ID}              $.labels.id
  ▪ {#POD.NAME}            $.labels.name

Fig 9. Discovery rule: Pod discovery preprocessing tab

Fig 10. Discovery rule: Pod discovery LLD macros tab

The block diagram below will show how the data is transformed. First, a preprocessing step is applied to the data to convert the Prometheus pattern to JSON format, as all data for LLD must be supplied in JSON format.

In the example below, the matching queried pattern is returned in JSON format after this preprocessing step.

Fig 11. Discovery rule preprocessing step: Prometheus to JSON with pattern podman_pod_info

After the preprocessing step, we can assign specific JSONPATH values to LLD macros.

Fig 12. Discovery rule LLD macros: assigning relevant JSONPATH to LLD macros

Creating a template Discovery rule: item prototypes

Now that we have discovered the macros we are interested in, the discovered macros can be used for further prototype (ITEM/HOST/TRIGGER) creation. In this example, I am using the same master item for LLD discovery and the dependent item prototypes, because it is convenient for me, and all the information is available in one item. But usually, there are scenarios where you have to use one item’s data for discovery and the data of another item for populating the prototype values.

In this case, I am interested in the pod ID, when the pod was created, the number of containers in the pod, and the state of the pod. Therefore, I will create the item prototypes and use the LLD macro in the name, key, and preprocessing step. Zabbix will cycle through the discovered LLD macro values and create the items based on the prototype by replacing the LLD macro with discovered values. Although you can set matching item prototype names (which will be confusing), you still have to use the LLD macro in the item key so that different item keys are generated – otherwise, you will get an error regarding duplicate keys. The item prototype parameters are given below.

Fig 13. Low-level discovery rule and item prototypes based on the same item.

Template: Podman containers by HTTP and Prometheus; Discovery rule: POD discovery

○ Item prototype #1
  ▪ Name:         POD ID: [{#POD.NAME}]
  ▪ Type          Dependent item
  ▪ Key:          pod.id[{#POD.NAME}]
  ▪ Type of inf   Character
  ▪ Master item   Podman containers by HTTP and Prometheus: Podman info
♦ Tags
  ▪ Metric:ID
  ▪ Pod:{#CONTAINER.NAME}
♯ Preprocessing
  ▪ Prometheus pattern     podman_pod_containers{id="{#POD.ID}"}          label    id

○ Item prototype #2
  ▪ Name:         POD state: {#POD.NAME}
  ▪ Type          Dependent item
  ▪ Key:          pod.state[{#POD.NAME}]
  ▪ Type of inf   Numeric (float)
  ▪ Master item   Podman containers by HTTP and Prometheus: Podman info
  ▪ Value mapping POD state
♦ Tags
  ▪ Metric:state
  ▪ Pod:{#CONTAINER.NAME}
♯ Preprocessing
  ▪ Prometheus pattern     podman_pod_state{id="{#POD.ID}"}      value

○ Item prototype #3
  ▪ Name:         POD created at: [{#POD.NAME}]
  ▪ Type          Dependent item
  ▪ Key:          pod.created[{#POD.NAME}]
  ▪ Type of inf   Numeric (unsigned)
  ▪ Units         unixtime
  ▪ Master item   Podman containers by HTTP and Prometheus: Podman info
♦ Tags
  ▪ Metric:created
  ▪ Pod:{#CONTAINER.NAME}
♯ Preprocessing
  ▪ Prometheus pattern     podman_pod_created_seconds{id="{#POD.ID}"}     value

○ Item prototype #4
  ▪ Name:         POD container count: [{#POD.NAME}]
  ▪ Type          Dependent item
  ▪ Key:          pod.count[{#POD.ID}]
  ▪ Type of inf   Numeric (unsigned)
  ▪ Master item   Podman containers by HTTP and Prometheus: Podman info
♦ Tags
  ▪ Metric:count
  ▪ Pod:{#CONTAINER.NAME}

On the template, I have also created a value map for deciphering the numerical pod state codes to text strings for better clarity.

Fig 14. Value mapping for the POD state item

Here are some screenshots of the POD state item prototype, shown below.

Fig 15. POD state item prototype: item prototype tab

Fig 16. POD state item prototype: tag tab

Fig 17. POD state item prototype: preprocessing tab

Creating a template Discovery rule: trigger prototype

We can also create a trigger prototype to generate an alert if there is something wrong with the pod. I have created a user macro {$POD.RUNNING.STATE} on the template with a value of 4, which corresponds to the running state.

Template: Podman containers by HTTP and Prometheus; Discovery rule: POD discovery

◘ Trigger prototypes:
  ▪ Name:               POD [{#POD.NAME}] state has changed from running
  ▪ Severity:           Warning
  ▪ Expression: last(/Podman containers by HTTP and Prometheus/pod.state[{#POD.NAME}])<>{$POD.RUNNING.STATE}
  ▪ PROBLEM event generation mode: Single
  ▪ OK event closes: All problems

Fig 18. Trigger prototype based on POD state item value

Once you link the template to the host and execute the LLD rule, you should start seeing the Podman pods ( if you have them), similar to the screenshot below.

Fig 19. Latest data for the host with the linked template

Summary

This blog post shows how to get data with HTTP agent from Prometheus Podman exporter and use the same item data for the Discovery rule as well as item and trigger prototypes. Check out part 2 of this series to find out how to discover and monitor Podman containers.

The post Podman Container Monitoring with Prometheus Exporter, part 1 appeared first on Zabbix Blog.

Mastering Amazon Q Developer Part 1: Crafting Effective Prompts

2025-05-19 Will Matos

Post Syndicated from Will Matos original https://aws.amazon.com/blogs/devops/mastering-amazon-q-developer-part-1-crafting-effective-prompts/

As organizations increasingly adopt AI-powered tools to enhance developer productivity, your ability to effectively communicate with these assistants becomes a valuable skill. This guide explores how you can craft prompts that deliver accurate, useful results when working with Amazon Q Developer.

Your success with Amazon Q Developer depends directly on how well you communicate with it. Through my work as a Principal Specialist Solutions Architect on the Next Generation Developer Experience team at AWS, I’ve observed that developers experience varying degrees of success based primarily on their approach to prompt construction. The difference between a vague request and a well-structured prompt can be the difference between wasted time and a productivity breakthrough.

Recent McKinsey research reveals that developers can complete tasks up to twice as fast with generative AI when using proper prompting techniques [1]. Even more impressive, developers tackling complex tasks are 25-30% more likely to complete them within given time-frames when using these tools effectively. These productivity gains aren’t automatic—they depend on mastering the art and science of prompt engineering.

Based on patterns observed across numerous customer interactions, this guide provides practical techniques to help you maximize the value of your AI-assisted development experience. You’ll learn how to transform your interactions to consistently produce helpful, relevant assistance that can dramatically improve your development workflow.

Key Takeaways

Structure your prompts with clear context, specific requirements, and desired output format
Include relevant technical details about your environment and constraints
Avoid vague requests and provide specific examples when possible
Use the provided prompt template to ensure consistent results

Getting Started with Amazon Q Developer

Already using Amazon Q Developer? Great! This guide will help you get more value from your interactions. If you haven’t set up Amazon Q Developer yet, check out the getting started guide.

Understanding the Impact of Good Prompts

The rapid adoption of AI technologies makes prompt engineering skills essential for today’s developers. McKinsey’s latest global survey reveals that 65% of organizations regularly use generative AI, nearly double from their previous survey. When developers master prompt engineering, they’re 25-30% more likely to complete complex tasks within given timeframes.

What Makes an Effective Prompt?

Specific Request: State exactly what you need
Clear Background: Describe your project, requirements, and constraints
Additional Context: Provide code, configuration, or other additional context
Expected Output: Specify how you want the information presented

Here’s how this works in practice:

Poor prompt:

How do I deploy a container on AWS?

Effective prompt:


I need to deploy a containerized Node.js e-commerce application that handles 
50,000 daily users with peak loads during promotional events.
Requirements:
- High availability across multiple regions
- MongoDB for persistence
- Auto-scaling capabilities

Please provide:
1. AWS architecture diagram
2. List of required services with configurations
3. Security best practices
4. Operational monitoring recommendations

Common Patterns to Avoid

Short or Vague Requests:

Add Docs
Make this better
Check this

'Add docs' simple prompt with generic response.

Not much to go on here. Amazon Q Developer will likely provide generic documentation.

'Check this' simple prompt with generic response.

Another vague prompt with a generic response.

Overly Broad Questions:

How do I use AWS?
What's the best practice?
Help with Lambda

Image showing the Amazon Q Developer IDE Chat panel where the user entered the vague prompt: 'Help with Lambda'. Amazon Q Developer responds by asking clarifying questions.

The prompt is so vague that Amazon Q Developer responds by asking clarifying questions.

Image showing the Amazon Q Developer chat pane where the user entered the prompt: "Create a Lambda function that processes S3 events."

The more specific prompt allows Amazon Q Developer to provide a more precise response.

Remember: The quality of information you receive directly correlates with the quality of the information you provide.

Proven Techniques for Better Results

To help you apply these principles consistently, I’ve developed a template structure that incorporates all the key elements of an effective prompt. This framework can be adapted for various scenarios and serves as a starting point for your interactions with Amazon Q Developer. While Amazon Q Developer will fill in some parts of this context (see the next post in this series), you just need to make sure this information is available.

These are the principles demonstrated in the template:

Technical Context Requirements
1. Specify your technology stack and versions
2. Include environment details
3. Mention compliance requirements
4. Define scale expectations
Example Specifications
1. Include relevant code snippets
2. Paste error messages
3. Reference configuration files
4. Show current architecture
Output Format Guidelines
1. Request specific documentation formats
2. Ask for diagrams when needed
3. Specify code language preferences
4. Indicate level of detail needed

Image showing the Amazon Q Developer chat panel with the user submitted prompt: "Document the requirements for an application that will process images. Format as a technical requirements document using markdown markup. Output as a single markdown code-block." The response is much more detailed, and aligns with the user's request.

The specification of the output format ensure the response is what you expect.

Quick Reference Prompt Template

Use this template to structure your prompts:


[Business Context] 
- Project description: 
- Performance requirements: 
- Compliance needs: 
- Scale expectations: 

[Technical Details] 
- Current technology stack: 
- Versions/dependencies: 
- Technical constraints: 
- Environment details: 

[Specific Request] 
- Task description: 
- Expected outcome: 
- Special considerations: 

[Output Format] 
- Desired format: 
- Level of detail: 
-  Examples needed: 
- Additional requirements:

Best Practices for Daily Use

Successfully working with Amazon Q Developer requires consistent application of proven practices. These guidelines, developed through extensive customer interactions, will help you maximize the value of your AI-assisted development experience.

Start with clear business objectives
Include relevant technical constraints
Specify performance requirements
Request specific output formats
Provide examples when possible

Through extensive customer interactions, we’ve found that following these practices consistently produces better results and reduces the need for follow-up clarification.

Take Action Now

Try the prompt template with your next Amazon Q Developer request
Bookmark the Amazon Q Developer documentation
Join the AWS Developer Community to share experiences
Enable Amazon Q Developer in your AWS account if you haven’t already

Additional Resources

What’s Next?

In the next part of this series, we’ll explore advanced context management in Amazon Q Developer and dive into the new prompt catalog features. You’ll learn how to:

Build and maintain context across multiple interactions
Use the prompt catalog effectively
Handle complex, multi-step development tasks
Optimize responses for your specific use cases

Stay tuned, and start applying these techniques today to transform how you build on AWS!

About the author:

Interactive Dashboard Creation for Large Organizations and MSPs

2025-05-19 Arturs Lontons

Post Syndicated from Arturs Lontons original https://blog.zabbix.com/interactive-dashboard-creation-for-large-organizations-and-msps/30132/

Dashboard widgets have received substantial improvements in the latest Zabbix releases – everything from brand-new widgets to greatly expanding upon existing widget features. The post will cover some of the new improvements as well as lesser-known dashboard and widget features, while discussing multiple dashboard use cases targeted at large organizations and MSPs.

Table of Contents

Broadcast and listen capabilities

Zabbix widgets can be used to not only display static data, but they can also be linked together by using widget broadcast and listen capabilities. Depending on the built-in capabilities, widgets can either broadcast data (such as the item, host, event, or time interval selected in the widget) or listen and display the selected data points – multiple widgets support both broadcast and listen capabilities.

Widgets can broadcast and listen for the following entities:

Hosts
Host groups
Time periods
Items
Events
Maps

Zabbix documentation contains the full list of widget broadcast and listen capabilities.

Navigator widgets

Host and Item navigator widgets serve as simple examples of broadcast widgets. The sole purpose of these widgets is to display an organized, interactive list of hosts or items. The selected hosts and items can be broadcast to other widgets such as graphs, gauges, problem widgets, an item value widget, and many others.

In addition to regular widget filters based on hosts, host groups, and tags, navigator widgets can be configured to group hosts or items based on tags, host groups, and existing problem severities. This can be used to provide an organized overview of hosts or items based on MSP clients, organization departments, and any other grouping.

Hosts grouped by MSP clients based on host group names

Any combination of widgets from the above table can be used to create interactive dashboards. For example, you could combine the Item value widget listen capabilities with the Geomap widget broadcast capabilities to display item values for hosts selected on the Geomap.

Broadcast hosts from the Geomap widget to Item value widgets

Dashboard-level host broadcast

Host overrides can also be performed on a dashboard level. Once you have set the Override host setting in your widgets to Dashboard, you can select the host in the top right corner of the dashboard. After the host is selected, the widgets will start displaying information related to the selected host.

Host information can also be broadcast on Dashboard level

Selecting non-existing items

One final thing you should consider when implementing widgets with host/item sources from broadcast widgets is what happens if the selected item does not exist on the selected host. In that case, your widget will display a message “No permission to referred object or it does not exist!” – the same error message the users will see if they lack the read permissions on an item. Ideally, you’d want to define widget filters and broadcast/listen configuration in a way where such errors can be avoided – especially if Zabbix is used as a central monitoring hub for users from multiple departments or organizations.

The item value widget displays an error message since the selected item does not exist on the selected host

The Zabbix graph widget has a variety of advanced features that can enable many new use cases and provide new insights based on the collected item values.

Data sets

The graph widget utilizes data sets to select, match, and group items that would be displayed in the graph. There are two types of data sets – item pattern and item list. When using item list data sets, you have to individually select each item that you wish to display on the graph. On the other hand, item pattern data sets provide more flexibility. Here we can utilize wildcards in host and item names to match items and hosts by name. This is especially useful for items discovered by low-level discovery in dynamic environments. With item pattern data sets, the addition or removal of items matching the pattern will be automatically reflected in the graph.

Trigger and problem display

Detected problems and trigger thresholds can also be displayed in dashboard graph widgets. The time periods during which a trigger related to the displayed items has been in a problem state will be highlighted in red. The graphs also provide an option to display a trigger line for triggers utilizing last, min, max, and avg functions.

Graph widget can display a trigger line and highlight periods during which a problem was active

Aggregation

The ability to aggregate data directly within the widget can be an extremely useful tool for gaining new insights from existing data. With graph widget aggregations, it is possible to aggregate each individual item (for example, displaying hourly averages for network traffic on each interface) or the whole data set (total hourly traffic from all interfaces).

Time shift

The time shift feature is useful for visually comparing current values with values collected some time in the past. For example, we could compare the current CPU load on our application server with the CPU load for the same time period yesterday. This could allow us to detect unexpected deviations just by glancing over the graph.

Missing data

Finally, the graph widget enables Zabbix users to choose how they wish to display missing values. Values for items could be missing for a variety of reasons – anything from data collection errors to various preprocessing workflows that could discard item values by design. Accordingly, it makes sense to design your graphs with the correct representation of missing data in mind.

Missing values in graphs can be displayed in the following formats:

Treat missing values as 0
Do not display missing values
Connect the last known value with the current value
Treat missing values as the last known value

Missing values are selected to not be displayed

Threshold values can be defined for multiple widgets to make the visualization of data more dynamic. This way, Zabbix dashboards can instantly highlight resources exceeding warning/critical thresholds, services in unexpected states, unreachable endpoints, and a variety of other issues. As of Zabbix 7.2, widget thresholds are available only for numeric item values.

Widgets with threshold support

Multiple widgets provide the ability to define value thresholds:

Item value
Gauge
Top hosts
Top items
Honeycomb

Thresholds can be defined in widget configuration. By defining one or multiple thresholds, we specify that whenever values for the selected item reach or exceed the threshold, they will be highlighted in the selected color.

Item value widget can be used to highlight problematic resources or services

Thresholds are useful for not only highlighting the problematic items in Item value or Gauge widgets, but can also be used to provide a broader view of overall resource utilization with Top hosts and Top items widgets. Since we aren’t limited to a single item, Top hosts and Top items widgets enable us to do a surface-level correlation by looking at the utilization of various resources and highlighting the resources nearing critical utilization thresholds.

Top hosts and Top items widgets can display a comprehensive overview of host resource usage

Another way to display and highlight our infrastructure state on a larger scale is by using the Honeycomb widget. The Honeycomb widget utilizes item patterns to display the matching item values. Here, thresholds can be combined with color interpolation to provide a more dynamic view of our environment. The Honeycomb widget is also capable of broadcasting the selected item and host, which enables us to quickly gain more information about the problematic host by clicking the corresponding cell in the widget.

Honeycomb widgets provide a dynamic overview of enterprise resource usage by supporting color interpolation features

Dashboards for MSPs

The previous sections have already highlighted a variety of features, useful widgets, and widget features for large organizations and MSPs. But let’s not forget that MSPs require granular access permission and control features. MSPs must also ensure that each client’s information (Hosts, items, dashboards) is fully isolated and secure from outside access.

Dashboard visibility

Each dashboard can be deployed either as a public or a private dashboard. Public dashboards are available to every user in read-only mode, while private dashboards require explicit read and write permissions for users who need access to them. MSPs can utilize private client organization dashboards to allow each client to view information about their environment in multiple views while completely hiding the dashboards assigned to other organizations.

Private dashboards require explicit read/write permissions

Host permissions

Dashboard visibility is only the first access control layer. Even when a Zabbix user has access to a dashboard, we must ensure that the user also belongs to a user group that has at least read permissions on the hosts displayed on a dashboard. Without at least read permissions, the hosts will not be displayed in dashboard widgets. This way, MSPs can utilize a single dashboard where each organization’s users can only see the information related to their environments, as opposed to having many duplicate dashboards, where each has a custom host filter that matches just the particular organization’s hosts.

User group-to-host group permissions have a direct impact on host visibility in dashboards

Restricting access to widgets

Access to each widget can also be restricted in Zabbix. This can be done globally by disabling widget modules under Administration—General—Modules or by disabling access to modules on an individual user role level. This can come in handy if the Zabbix environment in question enables users from various departments or organizations to create their own dashboards or edit existing ones. In addition, we may also have some custom community or in-house widgets which are utilized only by Zabbix administrators. which we may want to restrict access to.

If a Zabbix user opens a dashboard containing the restricted widget, the widget will be replaced with the message “No permissions to referenced object or it does not exist!” Ideally, it is recommended to avoid situations where users encounter such widgets, since such a message can be confusing to a user not familiar with various Zabbix permission and access error messages.

Access to modules can be restricted per each user role

Dashboard ownership

Dashboard ownership can also play a role in our user onboarding and offboarding process. Dashboard owners can edit permissions on the dashboards they own, but this can add an extra step in our user offboarding process since dashboards cannot remain without an owner! Therefore, before deleting a Zabbix user, we need to ensure that either their dashboards have also been removed or have their owners be changed. If we attempt to delete a user who is also a dashboard owner, Zabbix will display an error message.

Users who are owners of an existing dashboard cannot be deleted

This article touches upon only a few of the latest and lesser-known features useful to MSPs and large organizations. There are many more advanced ways of utilizing Zabbix widgets, permissions, tags, low-level discovery rules, and many other features that come in handy to organizations of various sizes, utilizing Zabbix for a variety of use cases. Follow our blog, watch the latest Zabbix videos on our YouTube channel, and check out our on-premise and online events to learn more about the flexibility of Zabbix data collection, alerting, and visualization features.

The post Interactive Dashboard Creation for Large Organizations and MSPs appeared first on Zabbix Blog.

Optimizing Incident Management with Zabbix and PagerDuty

2025-05-13 Zabbix LatAm

Post Syndicated from Zabbix LatAm original https://blog.zabbix.com/optimizing-incident-management-with-zabbix-and-pagerduty/30114/

When monitoring environments, we sometimes need to rely on third-party tools to better manage functionality and optimize responses to alerts. Let’s explore how to integrate Zabbix with PagerDuty, a real-time incident management solution designed to improve the reliability of digital services, including best practices and configuration details.

Table of Contents

What is PagerDuty?

PagerDuty is a real-time incident management platform designed to help IT teams react quickly to critical events. The tool helps organizations automate and manage incident response through a system of alerts, escalation, and coordination between teams. When a problem is detected in the system, PagerDuty notifies the responsible individuals and ensures that corrective action is taken quickly. This reduces downtime and improves operational efficiency. Integration with monitoring tools such as Zabbix makes it easy to identify issues before they impact users.

Some of PagerDuty’s key features include:

• Integration with monitoring tools (such as Zabbix)
• Notifications in multiple channels (email, SMS, calls)
• Automatic escalation of incidents to ensure agile responses
• Event analysis to improve the detection of recurring problems

How to integrate PagerDuty with Zabbix

In PagerDuty, go to “Services” and click on “Service Directory.” Create a new service.

Give it a proper name and description.

Accept the escalation terms and click “Next.”

On the next screen, select “Intelligent” and the “Auto-pause incident notifications” option, then click “Next.”

The next step is to add the Zabbix Webhook service, which will allow integration with Zabbix, and then click “Next.”

In Services > Service Directory, select the name of the service. In the “Integrations” tab, copy the integration token that is generated.

It is important to note that the PagerDuty webhook only shows the option of Zabbix versions 5.0 to 5.2, but it works correctly in later versions such as Zabbix 7.2, which was tested without any issues.

On Zabbix Server, go to Alerts > Media types > PagerDuty. Enter the integration token, the Zabbix URL, and select “Update.”

Send a test message to confirm that the integration is working correctly.

In the PagerDuty application, verify that the test alert was received.

To send notifications, you need to grant permissions to a user in Zabbix. Go to Users > Create User. In the “Media” tab, select PagerDuty as the notification method. Set the severity of the alerts you want to receive.

Subsequently, set up a Trigger Action in Alerts > Actions > Trigger Actions to define what types of alerts will be received (either by item or trigger) according to the needs of your team.

Best practices for integrating Zabbix and PagerDuty

• Customize notifications: Set rules to send only truly critical alerts, avoiding unnecessary notifications.
• Optimize escalations: Set up escalation rules so that alerts reach the right people at the right time.
• Monitor key metrics: Measure incident response times and adjust workflows as needed.
• Automate incident responses: Use PagerDuty’s capabilities to perform automated tasks in response to specific events.
• Notify about service failures: Use PagerDuty to start running recovery scripts, send notifications to the responsible teams, or even escalate the problem to a higher level if there is no solution in a stipulated length of time.

Conclusion

Zabbix’s integration with PagerDuty allows you to monitor the status of critical services in real time, even outside of working hours. This facilitates rapid incident response and improves your IT team’s ability to react.

This combination not only optimizes incident management but also helps minimize downtime, improve operational efficiency, and ensure the reliability of monitored systems.

With proper configuration and best practices, integrating Zabbix with PagerDuty can become essential for the proactive management of your technological infrastructure.

The post Optimizing Incident Management with Zabbix and PagerDuty appeared first on Zabbix Blog.

Migrating a CDK v1 Application to CDK v2 with Amazon Q Developer

2025-04-30 Dr. Rahul Sharad Gaikwad

Post Syndicated from Dr. Rahul Sharad Gaikwad original https://aws.amazon.com/blogs/devops/migrating-a-cdk-v1-application-to-cdk-v2-with-amazon-q-developer/

Introduction:

AWS Cloud Development Kit (AWS CDK) is an open-source software development framework for defining cloud infrastructure in code and provisioning it through AWS CloudFormation. As of June 1, 2023, AWS CDK version 1 is no longer supported. To avoid the potential issues that come with using an outdated version and to take advantage of the latest features and improvements, we highly recommend upgrading to AWS CDK version 2.

Amazon Q Developer, a generative AI-powered assistant for software development, enhances the efficiency of software development teams. It facilitates the creation of deployment-ready infrastructure as code (IaC) for AWS CloudFormation, AWS CDK, and Terraform. By using Amazon Q, developers can accelerate IaC development, enhance code quality, and decrease the likelihood of configuration errors.

This post demonstrates how Amazon Q Developer helps in upgrading the existing AWS CDK v1 application to AWS CDK v2.

Prerequisites

An AWS Builder ID or an AWS IAM Identity Center login controlled by your organization
A supported IDE, like Visual Studio Code
The AWS Toolkit IDE extension
Authenticate and Connect
Nodejs
AWS CDK v1
AWS CDK v2

Planning

In this blog post, I will explore a code example where I have created a VPC, Subnets, and an ECS Fargate cluster using AWS CDK version 1. I will then explain how you can use Amazon Q to transform the code from CDK v1 to CDK v2.

1. In order to initiate this process, I have begun by asking Amazon Q Developer for the necessary steps to migrate from CDK version 1 to version 2, which are outlined below.

Can you provide the steps to migrate from cdk version 1 to version 2?

Amazon Q Developer outlining the comprehensive process to upgrade AWS CDK applications from version 1 to version 2.

2. In the above screenshot Amazon Q Developer outlined several steps we can take to make the necessary changes. The first step is to update the dependencies. If I need guidance on how to update the dependencies, I can ask the Amazon Q Developer again for help by asking the steps regarding updating dependencies as below .

Can you provide the steps to update dependencies?

Amazon Q Developer offering detailed, AI-powered guidance to upgrade project dependencies by analyzing the existing codebase, identifying outdated or deprecated libraries and frameworks, and recommending precise updates to ensure compatibility with newer language versions.

3. After updating the dependencies, the next step is to update the import statements. To get guidance on how to update the import statements, I can ask the Amazon Q Developer assistant again for help by asking the steps regarding how to import statements as shown below.

@workspace Can you provide the steps to update import statements?

Amazon Q Developer advises on updating import statements by analyzing the current code context and guiding developers to replace legacy or outdated import paths with the latest.

In the above screenshot if you have noticed I have added @workspace before the question which automatically includes the most relevant chunks of my workspace code as context.

4. If any errors occur while updating the code as recommended by Amazon Q Developer, I can use Amazon Q Developer to debug the issue and provide the needed inputs to resolve it.

Amazon Q Developer diagnosing issues by analyzing error messages and AWS resource states, providing natural language explanations of root causes such as permission errors and misconfigurations.

5. Once I have finished the required steps, I can deploy the application using version 2 of the AWS CDK by running the cdk deploy command.

Deployment of the updated AWS CDK version 2 application, involving synthesizing CDK stacks to generate CloudFormation templates and deployment artifacts, bootstrapping the AWS environment to provision necessary resources.

6. In addition to its other capabilities, Amazon Q offers code review functionality. To initiate a code review, simply select Amazon Q and use the /review command. I’ll then have the option to review either the active files or the entire open workspace. Select your preference, and Amazon Q will analyze your project and provide comprehensive review results.

Amazon Q Developer performs comprehensive code analysis by reviewing your entire codebase or real-time code as you write, identifying security vulnerabilities, code quality issues, and deployment risks.

7. Amazon Q Developer can also generate documentation, including README files. To create documentation, select Amazon Q and enter the /doc command. Amazon Q will automatically generate a README file for your project. I can then review the generated documentation, accept the changes, or provide specific instructions for further modifications.

Amazon Q Developer automatically generates a comprehensive README file for the entire project by analyzing the codebase, project structure, and dependencies within the selected folder in the IDE.

Conclusion

In this blog, I demonstrated how Amazon Q Developer can simplify and accelerate the upgrade process from AWS CDK version 1 to version 2, ensuring your cloud infrastructure remains secure, efficient, and aligned with the latest AWS innovations. AWS CDK v2 offers a streamlined, consolidated library with improved performance and ongoing support, making infrastructure management easier and more reliable.

By leveraging Amazon Q Developer, a generative AI-powered assistant, teams can automate Infrastructure as Code development, enhance code quality, and minimize configuration errors. Together, these tools empower development teams to confidently modernize and scale their AWS environments, turning the upgrade process into a seamless opportunity for innovation and growth.

Resources

To learn more about Amazon Q Developer, see the following resources:

To learn more about the AWS CDK, see the following resources:

About the authors:

How to help prevent hotlinking using referer checking, AWS WAF, and Amazon CloudFront

2025-04-17 Alex Smith

Post Syndicated from Alex Smith original https://aws.amazon.com/blogs/security/how-to-prevent-hotlinking-by-using-aws-waf-amazon-cloudfront-and-referer-checking/

Note: This post was first published April 21, 2016. The updated version aligns with the latest version of AWS WAF (AWS WAF v2) and includes screenshots that reflect the changes in the AWS console experience.

AWS WAF Classic has been deprecated and will be end-of-life (EOL) in September 2025. This update describes how to use the latest version of AWS WAF (WAFv2) to help prevent hotlinking. Updates have been made to the screenshots to reflect the changes in the AWS Management Console for AWS WAF.

Hotlinking—also known as inline linking—is a form of content leeching where an unauthorized third-party website embeds links to resources originally referenced in a primary site’s HTML. The third-party website doesn’t incur the cost of hosting the content, which means that your website can be charged for the content other sites use. It also results in slow loading times, lost revenue, and potential legal issues.

Now, you can use AWS WAF to help prevent hotlinking. AWS WAF is a web application firewall that’s closely integrated with Amazon CloudFront—a content delivery network (CDN)—and can help protect your web applications from common web exploits that could affect application availability, compromise security, and consume excessive resources. In this blog post, I show you how to help prevent hotlinking by using header inspection in AWS WAF, while still taking advantage of the improved user experience from a CDN such as CloudFront.

Solution overview

You can address hotlinking in various ways. For instance, you can validate the Referer header (sent by a browser to indicate to the server which page the visitor was referred from) at your web server (for example, by using the Apache module mod_rewrite), and either issue a redirect back to your site’s main page or return a 403 Forbidden error to the visitor’s browser.

If you’re using a CDN such as CloudFront to speed up your site’s delivery of content, validating the Referer header at the web server becomes less practical. The CDN stores a copy of your content in the edge of its network of servers, so even if your web server validates the original request’s headers (in this case, the referer), additional requests for that content must be validated by the CDN itself, because they are unlikely to reach the origin web server.

Figure 1 illustrates this process.

Figure 1: Request – response flow showing instances of a cache-miss and a cache-hit

The process shown in Figure 1 is as follows:

A request is received from a user client (1) at a CloudFront edge location (2).
The edge location attempts to return a cached copy of the file requested. This request, if fulfilled from the cache, is considered a cache hit.
1. In the case of a cache miss—when the content is either not in the edge or is not valid (for example, if the content is out of date)—the request is forwarded to the origin (3) (such as an Amazon Simple Storage Service (Amazon S3) bucket) for a new copy of the object.
2. In the case of a cache hit, the origin cannot apply validation logic to the user’s request, because the edge server doesn’t need to contact the origin to fulfil the user’s request.

In the next section, I show you how to inspect the client-request headers using AWS WAF to allow or block requests at the CDN.

Solution implementation—two approaches

This post includes two ways to set up AWS WAF to help prevent hotlinking:

Using a separate subdomain: Static files (such as images or styling components) to be protected are moved to a separate subdomain such as static.example.com so that you only need to validate the Referer header.
Using the same domain: Static files are located under a directory on the same domain. This solution includes how to extend this example to check for an empty Referer header.

The choice of approach will depend on how your site is structured and the level of protection you want to implement. The first approach enables you to set up a Referer header check to make sure that requests for the images only come from an allowlisted sub-domain, while the second approach has an additional check for an empty Referer header. The second approach extends the first approach and allows for some flexibility for users to share direct links to the image while still preventing unaffiliated third-party sites from embedding the image links on their websites.

Terms

The following list includes key terms used in this post:

AWS WAF configurations consist of a web access control list (web ACL), associated with a given CloudFront distribution.
Each web ACL is a collection of one or more rules, and each rule can have one or more match conditions.
Match conditions are made up of one or more filters, which inspect components of the request (such as its headers or URI) to match for certain conditions.
Case-sensitivity: HTTP header names are case-insensitive. Referer and referer point to the same HTTP header. HTTP header values, however, are case-sensitive.

Prerequisites

You must have a CloudFront distribution set up before configuring an AWS WAF web ACL. For information about how to set up a CloudFront distribution with an S3 bucket as an origin, see Configure distributions.

Approach 1: A separate subdomain

In this example, you create an AWS WAF rule set that contains a single rule with a single match condition, which in turn has a single filter. The match condition checks the Referer header and verifies that it contains a given value. If the request matches the condition specified in the rule, the traffic is allowed. Otherwise, the AWS WAF rule blocks the traffic.

For this example, because all the static files are on a separate subdomain (static.example.com) accessed only from the site example.com, you will block hotlinking for any file that don’t have a referer that ends with example.com.

Use the following steps to set this up using the AWS WAF console.

Step 1: Create and name a new web ACL

Sign in to the AWS WAF console.
If you have not created a web ACL before, Choose Create web ACL on the AWS WAF console landing page.
Because you want to associate the web ACL with a CloudFront distribution, select Amazon CloudFront distributions as the Resource type.
1. Enter a Name for the web ACL that you’re creating. For this example, I used the name sample-webacl. The page will automatically populate an associated Amazon CloudWatch metric name. CloudWatch is a monitoring service that allows you to gather and report on metrics of various services. This CloudWatch metric can be used later to report on how your newly created AWS WAF configuration is being used.
2. After you have supplied the name of the web ACL, you can select the available AWS resources to be protected by this web ACL. In this example, you will fill that in later, so leave this field blank for now.
3. By default, AWS WAF can inspect up to 16 KB of the web request body with additional values of 32, 48, and 64 KB for an additional cost. Leave the web request Body size limit at the default value of 16 KB.
4. Choose Next.

Figure 2: Describing the web ACL and associating it to resources

Step 2: Create a string match condition on Referer header

AWS WAF ACLs can use AWS managed rule groups, rule groups from AWS Marketplace providers, or you can write your own rules and rule groups. For this example, you will create your own rules and rule groups.

In the AWS WAF console, choose Add rules, and select Add my own rules and rule groups to create the string match condition.

Figure 3: Add rules and rule groups
This will bring you to the Rule visual editor page. The default Rule type will be set to Rule builder which you can leave unchanged. In the Rule builder section, select Regular rule.

Figure 4: Rule type and Rule builder
The next step is to construct a string match condition to match on the Referer header. Under Name, enter a name for the rule, such as Referer-check. Make sure that If a request is set to doesn’t match the statement (NOT). The string match condition is a negative match which means that if the Referer header field value does not match the value specified in the rule, the request will be blocked. This makes sure that requests for static.example.com which only originate from example.com are allowed. In the Statement section, use the following settings:
1. Inspect: Select Single header.
2. Header field name: Enter referer as the value.
3. Match type: Select Exactly matches string.
4. String to match: Enter example.com as the value.
5. Text transformation: Select Lowercase. This isn’t required for most modern browsers, but is a good practice because HTTP header field values are case sensitive.
Figure 5: Rule name and statement
In the Action section, select Block as the Action. Choose Add Rule.

Figure 6: Rule Action

In the preceding rule statement, you’re configuring AWS WAF to inspect a header with the name Referer and checking if the value of the header matches the static string example.com. If the value of the Referer header is not example.com, then the request is blocked.
The next page is Add rules and rule groups. It shows the following attributes of the web ACL:
1. AWS WAF rules that have been added to the web ACL.
2. Web ACL capacity units (WCUs).
3. Default web ACL action.
4. Token domain list.
5. Because you’re only adding one rule to this web ACL, choose Next.
  
  Figure 7: Rules and rule groups, WCUs, and default web ACL action
On the next page, you will set the rule priority. Because you added only one rule, you will not need to adjust the rule priority. If there is more than one rule, you can select a particular rule and use the Move up or Move down options to organize the rule order. Choose Next.

Figure 8: Set rule priority
The Configure metrics page details can be left at the default values. Choose Next to proceed to the final step.

Figure 9: Configure metrics
The final step is to review the web ACL details. If you need to change one of the settings of the web ACL, you can choose Edit step for the corresponding step. Choose Create web ACL to finalize creating the AWS WAF web ACL.

Figure 10: Review and create web ACL

Step 3: Associate the new rule with the relevant CloudFront distribution

You can now associate AWS resources with the web ACL that you created in the previous steps. In this case, the AWS resource is a CloudFront distribution.

In the AWS WAF console, choose Web ACLs in the navigation pane. Select the web ACL named sample-webacl that you created previously.

Figure 11: Select a Web ACL to configure
Choose Add AWS resources.

Figure 12: Add AWS resources
Eligible AWS resources will be displayed in the pop-up page. Select the CloudFront distribution from the Resources list. Choose Add to associate the ACL sample-webacl with the CloudFront distribution.

Figure 13: Select CloudFront distribution to associate with sample-webacl
The next page is the Web ACLs page, which will show the CloudFront distribution selected in the previous step in the Associated AWS resources section.

Figure 14: Web ACLs and Associated AWS resources

Test the referer check rule

You’re ready to test the web ACL that you created by issuing a cURL command from the command line and confirming that the referer check is matched correctly. When you request files without the allowlisted Referer header, the requests are blocked at the CDN. However, valid requests still are allowed through.

When a third party embeds your content (request blocked at the CDN)

» curl –H "Referer: example.net -I https://static.example.com/favicon.ico
« HTTP/1.1 403 Forbidden

When you embed your content (request allowed through the CDN)

» curl –H "Referer: example.com -I https://static.example.com/favicon.ico
« HTTP/1.1 200 OK

Note: With Approach 1, you must make the request with an allowlisted Referer header. In this example, all paths are filtered.

Approach 2: All content under the same domain, with filtering by path

In the second approach, you allow a blank Referer header and filter by a given URL path. To do this, you will create an AWS WAF web ACL that contains multiple rules with additional match conditions, which in turn are comprised of multiple filters. As with the first approach, the match condition looks at the Referer header; however, you will validate the header in two ways. First, you validate whether the request contains the expected header, and if it does not, you apply the second validation, which checks to see whether it has a URL style Referer header. This enables you to access the assets directly in a browser when the assets aren’t embedded elsewhere in a website but still provides protection against hotlinking.

Accessing an image directly in the browser can be useful in situations where users might want to share the link to the image directly, thus helping to prevent a negative user experience when sharing the image link with other users. This approach makes it an improvement over the first approach where requests for the images must originate from the sub-domain.

You will also validate the path used in the request (in this example /wp-content), which allows AWS WAF to protect individual folders under a single domain name.

Step 1: Decide what to protect

As in the first approach, rather than filter on everything under a domain, you will filter based on the path. In this case, /wp-content. This allows you to protect your uploaded content that sits under /wp-content, but without having to put the content into a separate subdomain.

Step 2: Create and name a new web ACL

You can use the web ACL that you created for Approach 1, or you can repeat Step 1 of Approach 1 to create a new web ACL.

Step 3: Create string match conditions on the referer

For Approach 2, the assumption is that everything exists under a single domain, so instead of using the catch-all example.com, use the more secure https://example.com/ and mark the header as Starting with https://example.com.

Because you’re explicitly filtering on one header, you must watch out for two things:

Switching between www.example.com and example.com in your application.
Switching between https:// and http:// in your application.

If either of these switches occurs, you will see a 403 Forbidden error returned instead of your embedded files. In this example, all content is delivered directly through https://example.com/.

For this example, you will construct two rules, each of which will contain multiple string match conditions. AWS WAF allows for conditional match conditions within a rule so you can create nested logic statements. For example, a rule evaluation is true if all the statements within a rule statement are evaluated to true.

First rule: Validate a Referer header:

For this rule, you will set the following match conditions and AWS WAF actions:

Rule name: Validate-Referer-header

If Referer header value starts with https://example.com

AND

If URI path starts with /wp-content

THEN

ALLOW request

Open the AWS Management Console for AWS WAF and navigate to WAF & Shield.
Choose Web ACLs in the navigation pane and select Global (CloudFront) as the AWS Region.

Figure 15: Web ACLs and AWS Regions
The page will refresh to show the Web ACL sample-webacl that you created in the preceding Step 2. Select sample-webacl.

Figure 16: Web ACLs list
Select the Rules tab.

Figure 17: Web ACL rules
Choose Add rules and select Add my own rules and rule groups. If you’re reusing the web ACL created in Approach 1, delete the Referer-check rule before adding new rules.

Figure 18: Add rules and rule groups
For Rule type, select Rule builder.

Figure 19: Rule type
In the Rule section, use the following settings:
1. Name: Enter Validate-referer-header as the value.
2. Type: Select Regular rule.
3. If a request: Select matches all the statements (AND).
Figure 20: Rule name and match condition
In the Statement 1 section, use the following settings:
1. Inspect: Select Single header.
2. Header field name: Enter referer as the value.
3. Match type: Select Starts with string.
4. String to match: Enter https://example.com as the value.
5. Text transformation: Select Lowercase.
Figure 21: First string match condition
Create the second string match condition (Statement 2). For the URL itself, you want to protect content under /wp-content, so you will create a string match to validate that case using the same steps as for the first string match condition, with two changes:
1. For Inspect, select URI path.
2. For String to match, enter /wp-content as the value.
Figure 22: Second string match condition
Change the Action to Allow and choose Add Rule at the bottom of the page.

Figure 23: Set the Action to Allow
In the Set rule priority page, choose Save.

Figure 24: Save the rule

Second rule: Validate without a Referer header

For the second rule, you will set the following match conditions and rule actions:

Rule name: Validate- with-no-Referer-header

If Referer header contains ://

AND

If URI path starts with /wp-content

THEN

BLOCK request

The second rule is similar to the first rule, but it matches when the Referer header value includes ://. You use this match condition to check whether the Referer header has been set at all. If it has, you block the request.

In the Web ACL page, choose Add rules and select Add my own rules and rule groups to be taken to the Rule type page.

Figure 25: Create the second rule
For Rule type and Rule builder, use the following settings:
1. Rule type: Select Rule builder.
2. Name: Enter Validate-with-no_Referer-header as the value.
3. Type: Select Regular rule.
4. If a request: Select matches all the statements (AND).
Figure 26: Set the rule type and matching
For Statement 1, use the following settings:
1. Inspect: Select Single header.
2. Header field name: Enter Referer as the value.
3. Match type: Select Contains string.
4. String to match: Enter ://
Figure 27: Configure Statement 1
For Statement 2, use the following settings:
1. Inspect: Select URI path.
2. Match type: Select Starts with string.
3. String to match: Enter /wp-content as the value.
Figure 28: Configure Statement 2
For Action, keep the default setting of Block and choose Add Rule.

Figure 29: Add rule
The resulting Set rule priority page will list the rules in the sample-webacl web ACL and will look like the following figure. It shows the name of the rule, the rule priority, the web capacity units (WCUs) and the AWS WAF response. Choose Save.

Figure 30: Rule priority and web ACL units used the web ACL.

The Rules tab will now show both of the rules that you added with their corresponding AWS WAF actions in addition to the default action of Allow for requests that don’t match one of the rules.

Figure 31: Rules tab of sample-webacl web ACL

Step 4: Associate the new rules with the relevant CloudFront distribution

Select the Associated AWS Resources tab and choose Add AWS resources.

Figure 32: Add AWS resources
Select the relevant CloudFront distribution and choose Add.

Figure 33: Select the CloudFront distribution
The web ACLs page will show the CloudFront distribution in the Associated AWS resources tab.

Figure 34: Associated AWS resources

Test the rules

Similar to Approach 1, you have filtering at the CDN, but this time the filtering is based on the path and direct linking is allowed (without a Referer header).

You can use cURL to verify that the new AWS WAF web ACL correctly protects your content. Use the –H argument to send a different Referer header to the CloudFront distribution, which allows you to test as if you are embedding the website content in an unauthorized page.

When a third party embeds your content

» curl –H "Referer: https://example.net/" -I https://example.com/wp-content/uploads/2013/03/shareable-image.jpg
« HTTP/1.1 403 Forbidden

When your content is directly linked (with no Referer)

» curl -I https://example.com/wp-content/uploads/2013/03/shareable-image.jpg
« HTTP/1.1 200 OK

When you embed your content

» curl –H "Referer: https://example.com/" -I https://example.com/wp-content/uploads/2013/03/shareable-image.jpg
« HTTP/1.1 200 OK

Conclusion

AWS WAF is a web application firewall that lets you monitor and control the HTTP(S) requests that are forwarded to your protected web application resources. In this post, you saw how to use the AWS WAF custom rule builder feature to prevent content hotlinking to protect your website’s content hosted in an Amazon S3 bucket.

The two approaches demonstrated in this post provide you with ways to implement a robust referer check solution that helps prevent unauthorized third-party websites from linking back to static assets on your website, thus helping to prevent increased bandwidth costs, bad user experience, and degraded performance because of resource leeching. Following the concept of least privilege, you can further restrict the AWS WAF rules to apply only to certain image file extensions (such as .jpg or .png).

While referer checking helps prevent unaffiliated sites from backlinking to your site’s images and benefitting by using your site’s bandwidth, more sophisticated exploits can carefully craft a request to bypass the referer check. Other web request mechanisms, such as web browser plugins, server-to-server requests that forge referer header values, or privacy-based web browsers may also cause inconsistencies in accurately evaluating the referer header value. Be aware of such inconsistencies and consider using additional private content mechanisms such as signed URLs and token authentication.

Web browsers don’t have a mechanism to validate if a Referer header has been tampered with. Referer checking should be implemented as part of a broader web application security strategy by using AWS WAF application protection rules, Bot Control, Fraud Control, and Distributed Denial of Service (DDOS) protection. Effective web traffic monitoring using AWS WAF logs, Amazon CloudWatch metrics, and web ACL traffic dashboards will help ensure that bad actors aren’t bypassing the AWS WAF rules that you have set up to protect your web traffic.

You can use AWS WAF to build on top of the referer check to implement more advanced content protection solutions such as rate-limiting, bot mitigation, and DDOS mitigations to further secure your website against a wide range of exploits.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this solution or its implementation, start a new thread on the AWS WAF forum.

Alex Smith was the original author of this post in 2016.

Using Amazon Q Developer CLI for custom Java application transformations

2025-04-08 Dinesh Prabakaran

Post Syndicated from Dinesh Prabakaran original https://aws.amazon.com/blogs/devops/using-amazon-q-developer-cli-for-custom-java-application-transformations/

In today’s rapidly evolving software landscape, maintaining and modernizing Java applications is a critical challenge for many organizations. As new Java versions are released and best practices evolve, the need for efficient code transformation becomes increasingly important. Amazon Q Developer transformation for Java using the Command Line Interface (CLI) presents a powerful alternative to integrated development environments (IDEs), offering unique advantages in scenarios requiring batch processing, CI/CD integration, headless environments, and custom automation workflows. By leveraging the CLI, development teams can perform consistent, scalable, and easily reproducible transformations across extensive codebases.

One key difference between CLI and IDE-based transformations lies in the standardization and customization capabilities. With CLI transformations, teams can define and enforce standardized transformation rules across the entire organization, ensuring consistency in code modernization efforts. This standardization is particularly valuable for large teams or distributed development environments. Additionally, the CLI approach allows for deeper customization of transformation rules, enabling teams to tailor the modernization process to their specific needs and coding standards. Whether updating deprecated APIs, migrating to newer Java versions, or enforcing coding standards, Amazon Q Developer’s CLI transformations provide a flexible and powerful solution.

This blog will explore how to use Amazon Q Developer’s CLI capabilities to create custom transformations for upgrading Java applications. We’ll dive into the process of defining transformation rules, executing them across your codebase, and demonstrate how to customize these transformations to meet specific requirements. By the end of this blog, you’ll have a clear understanding of how to leverage Amazon Q Developer’s CLI for Java transformations, enabling you to modernize your applications more efficiently and with greater control. You’ll be equipped to standardize your transformation processes across teams and projects while also customizing them to fit your unique requirements.

Pre-requisites

Before you begin a transformation, see the prerequisites for transformation on the command line with Amazon Q Developer.

Note: The Amazon Q Developer command line tool for transformation (qct cli) is distinct from the Amazon Q Developer CLI – while qct cli is specifically designed for code transformations, the Amazon Q Developer CLI provides features such as autocompletion, Amazon Q chat, inline ZShell completion, etc.

These pre-requisites ensure that you have all the necessary tools and permissions to use Amazon Q Developer’s CLI capabilities for custom transformations on your Java applications.

About the Application

This sample project will be used to demonstrate the Amazon Q Developer CLI code transformation feature in action. It’s a Java 1.8 based microservice application that displays a free list of movies for the month using configuration stored in AWS AppConfig service. Originally open sourced in 2020, it intentionally uses legacy versions of libraries (Spring Boot 2.x, Log4j 2.13.x, Mockito 1.x, Javax and Junit 4) to showcase the upgrade process. The application includes a dependency on another module built in both Java 1.8 and 17, specifically to demonstrate how post transformation steps can be used to modernize your application’s internal dependencies. You can download this sample project to experiment with the CLI upgrade feature in your own environment.

Overview

You will use Amazon Q Developer command line tool for transformation to perform custom transformations on a Java application. This will involve:

Configure Amazon Q command line tool for transformation.
Use pre-transformation template to identify unused imports and variables and remove them before transformation.
After pre-transformation, upgrade the application to Java 17 to leverage the latest features.
Use post transformation template to

- Modify the POM file to point internal dependencies to the latest Java 17 or 21 (1p) version.
- Modify your code to replace deprecated methods from internal dependencies.
- Identify System.out.println statements and replace them with a proper logger framework.

Walkthrough

Setting up the transformation environment

First, ensure you have the Amazon Q command line tool for transformation installed. You can verify this by running:

which qct

Figure 1: Output of 'which qct' command showing the installation path of Amazon Q Developer CLI
Figure 1: which qct

If it’s not installed, follow the installation instructions in the Amazon Q Developer documentation.

Configuring Amazon Q command line tool and authenticate

Configure the Amazon Q command line tool for transformation by running the qct configure command:

qct configure

This command will:

Prompt you to specify the JDK path for Java 8, 11, 17 and 21. You only need to specify the path to the JDK of the Java version you are upgrading.
Two options are available for authentication
- Option 1 authenticates with IAM credentials stored in your AWS CLI profile. Refer Figure 2.
- Figure 2: qct configure – Authenticate with IAM
  - Provide the AWS CLI profile to use for the IAM authentication. You can specify a specific profile name or press enter to use the default profile.
  - Provide a file path that will point to a CSV file which will be used to add tags for your transformation (optional). The CSV must have two columns, with headers titled key and value, where tag key-value pairs are listed.
- Option 2 authenticates with IDC (IAM Identity Center). Refer Figure 3.
- Figure 3: qct configure – Authenticate with IDC
  - Prompt you to provide the Start URL to authenticate to Amazon Q Developer Pro. The Start URL can be obtained from the console in the Q Developer > Settings.
  - Provide the AWS Region
If you’re upgrading your code’s Java version, you have the option to receive your code suggestions from Amazon Q in one commit or multiple commits. Amazon Q will split the upgraded code into multiple commits by default. If you want all your code changes to appear in one commit, enter the letter ‘O’ for one commit when prompted.

For more information on how Amazon Q splits up the code changes, see Reviewing the transformation summary and accepting changes.

Customizing transformations

You can customize transformations by providing custom logic in the form of ast-grep rules that Amazon Q uses to make changes to your code.

To start with customization, create an Orchestrator file where you provide the paths to the custom transformation files. The Orchestrator file is a YAML file containing paths to custom pre-transformation and post-transformation files, which contain ast-grep rules that will run before and after transformation.

Here’s an example:

orchestrator_qct_cli.yaml

name: orchestrator_qct_cli
description: My collection of custom transformations to run before and after a transformation.

pre_qct_actions:
  ast-grep:
    rules:
      - custom-transformation-pre-qct.yaml

post_qct_actions:
  ast-grep:
    rules:
      - custom-transformation-post-qct.yaml

The pre-transformation rules file shown below is used to

Identify unused local variables declarations and remove them
Identify unused import declarations and remove them

This custom transformation cleans up unused local variable declarations and imports, helping in reducing the number of lines that will be considered for transformation as demonstrated in the following example:

custom-transformation-pre-qct.yaml

id: no-unused-vars
language: java
rule:
    kind: local_variable_declaration
    all:
        - has:
              has:
                  kind: identifier
                  pattern: $IDENT
        - not:
              precedes:
                  stopBy: end
                  has:
                      stopBy: end
                      any:
                          - { kind: identifier, pattern: $IDENT }
                          - { has: {kind: identifier, pattern: $IDENT, stopBy: end}}
fix: ''

--- # this is YAML doc separator to have multiple rules in one file

id: no-unused-imports
rule:
    kind: import_declaration
    all:
        - has:
            has:
                kind: identifier
                pattern: $IDENT
        - not:
            precedes:
                stopBy: end
                has:
                    stopBy: end
                    any:
                        - { kind: type_identifier, pattern: $IDENT }
                        - { has: {kind: type_identifier, pattern: $IDENT, stopBy: end}}
fix: ''

The post-transformation rules file serves multiple purposes and helps customers seamlessly upgrade their first-party (1P) dependencies after QCT transforms their application to Java 17 or 21:

When your project uses internal AWS dependencies that have been upgraded to Java 17 or 21, these rules automatically update your POM file to use the latest compatible versions. This eliminates manual dependency version updates and resolves build errors related to private dependencies.
After updating the POM file, the rules automatically modify your code to replace deprecated methods from internal dependencies with their latest supported versions, ensuring compatibility with the upgraded dependencies.
The rules identify System.out.println statements and replace them with a proper logger framework, improving application observability.
This automated approach significantly simplifies the migration process by handling both the application transformation and internal dependency updates in one streamlined operation.

These rules help modernize your codebase and ensure compatibility with updated dependencies.

custom-transformation-post-qct.yaml

id: update-movie-service-util-java17
language: html
rule:
    pattern: |-
        <dependency>
            <groupId>org.amazonaws.samples</groupId>
            <artifactId>movie-service-utils</artifactId>
            <version>$VERSION</version>
        </dependency>
constraints:
    VERSION:
        regex: "^0\\.1\\.0"
fix: |-
    <dependency>
        <groupId>org.amazonaws.samples</groupId>
        <artifactId>movie-service-utils</artifactId>
        <version>0.2.0</version>
    </dependency>

--- # this is YAML doc separator to have multiple rules in one file
id: update-movie-service-util-method
language: java
rule:
    pattern: |-
        MovieUtils.isValidMovieName($MOVIE_NAME)
fix: |-
    MovieUtils.isValidMovie($MOVIE_NAME, movieId)

--- # this is YAML doc separator to have multiple rules in one file
id: sysout-to-logger
language: java
rule:
    pattern: System.out.println($MATCH)
fix: logger.info($MATCH)

Both pre- and post-transformation enhances logging capabilities, increases configurability, improves error handling, and makes the application more production-ready, while the use of transformation rules automates the process, saving time and reducing errors in large codebases.

Executing the transformation

Now, run the transformation using the Amazon Q Developer CLI:

qct transform --source_folder <path-to-folder>
    --custom_transformation_file <path-to-orchestrator-file> 
    --target_version <your-target-java-version>

Here,

–source_folder points to the path of the folder containing the Java application that needs to be transformed from version 8 to either 17 or 21.

–custom_transformation_file specifies the path to the orchestrator file (orchestrator_qct_cli.yaml).

–target_version refers to the target Java version to which the application will be transformed. It can be either JAVA_17 or JAVA_21.

If you have common requirements across all the applications you’re transforming, it’s better to store this file in a shared location and use an absolute path during transformation.

For applications with specific requirements, you can include the orchestrator file in the application’s codebase and use a relative path.

Once you execute the transform command if you choose to authenticate with IDC, it will prompt to authenticate providing a URL using the credentials set up in identity center that have access to Amazon Q Developer Pro:
Figure 4: Screenshot showing Amazon Q Developer CLI transform command prompting for IAM Identity Center authentication with a browser verification code
Figure 4: qct transform authentication with the IAM Identity provider

After logging in through the browser, verify if the code matches with the code in the CLI and approve access to Amazon Q Developer.

Once the request is approved – enter Y in the command line to proceed with transformation:

Before starting the transformation, the agent verifies if you have at least the minimum supported version of Maven for transformation

Figure 5: Screenshot of terminal output showing Amazon Q Developer CLI transform command execution, displaying pre-processing steps and transformation job initiation
Figure 5: qct transform starts with pre-processing followed with transformation job

First ast-grep command is run using the pre-transformation template before transformation and once it’s successful, the Q Developer transformation job begins.

Figure 6: Screenshot of terminal showing successful completion of Amazon Q Developer CLI transformation, displaying the newly created Git branch name containing the transformed code
Figure 6: qct transform completed with the results committed to a new branch

After the transformation is complete the changes are saved to a local branch and the branch name can be obtained from CLI output.

After successful transformation ast-grep post-transformation step is executed and the branch is updated with the custom transformed code.

Reviewing and applying the changes

After the transformation is complete, review the changes in the new branch that’s in the CLI output. Use “git branch” to view the new branch created with the transformed files.

git branch

Figure 7: Screenshot of terminal output from 'git branch' command highlighting the newly created branch containing the transformed code
Figure 7: git branch – shows the new branch containing transformation result

Now compare the transformed branch with the source branch in our case its change-branch.

git diff change-branch

Figure 8: Screenshot of Git diff output showing Spring Boot dependency version update from 2.0.5 to 3.3.8 in the project's POM file
Figure 8: Spring boot upgraded to 3.3.8 from 2.0.5

Figure 8: Screenshot of Git diff output showing Java version configuration change from 8 to 17 in the project's build configuration
Figure 9: Java upgraded from 8 to 17

You can see the application is upgraded to Java 17, Spring Boot upgraded to 3.3.0 form 2.0.5, and the internal dependency movie-service-utils upgraded to 0.2.0 version to support Java 17.

Figure 10: Screenshot of Git diff output showing two code changes: removal of unused variables and replacement of System.out.println statements with logger.info calls
Figure 10: Unused variable removed also System.out.println replaced with Logging framework

Figure 11: Screenshot of Git diff output highlighting the removal of unused Java import statements from the source code
Figure 11: Unused imports removed

Figure 12: Screenshot of Git diff output showing MovieUtils.isValidMovie method signature change with updated parameters to align with new dependency version
Figure 12: MobieUtils.isValidMovie method updated to match the updated internal dependency

During pre-transformation

Unused local variables are identified and removed from the code.
Unused imports are identified and removed

During post-transformation

Using custom transformation, the sysout statements are replaced with logger framework.
The internal dependency MovieUtils.isValidMovieName method is updated with the required parameters that are required for the latest version using custom post transformation template.

Other changes of the Java 8 to 17 transformation are mentioned here.

Pre-transformation, transformation, and post-transformation changes are committed separately to help users compare differences and identify changes made in each step.

If you’re satisfied with the results, you can create a pull request from the branch which contains the transformed code to your corresponding release branch.

Troubleshooting

When working with Amazon Q Developer CLI transformations, you might encounter some common issues. Here’s how to address them:

Unstaged Git Commits
Before running a transformation, make sure to stash or commit any pending changes to your local branch. This ensures a clean working directory for the transformation process.
Clearing the Working Directory
If a transformation fails, clear the workspace located at ~/.aws/qcodetransform/transformation_projects/<your project name> before retrying the transformation. This step is only necessary for failed transformations.

Clean up

To clean up after the transformation:

Remove the user or group access to the Amazon Q Developer Pro application
Unsubscribe from Amazon Q Developer Pro

Call to Action

Ready to modernize your Java applications? Here’s how to get started:

Install Amazon Q Developer command line tool for transformation
Review the prerequisites for the command line transformation with Amazon Q Developer.
Download the sample project to practice transformations in a safe environment

Conclusion

Using Amazon Q Developer’s CLI capabilities for custom transformations provides a powerful and flexible way to upgrade Java applications. This approach allows you to automate the modernization process, saving time and reducing the risk of manual errors.

By leveraging custom rules, you can tailor the transformation to your specific needs, whether it’s updating deprecated methods, migrating to new APIs, or applying best practices across your codebase.

As you continue to work with Amazon Q Developer, explore more advanced transformation scenarios and integrate this process into your development workflow for ongoing modernization efforts.

About the authors

Deploying Zabbix Components with Docker and Docker Compose

2025-04-08 Janis Eidaks

Post Syndicated from Janis Eidaks original https://blog.zabbix.com/deploying-zabbix-components-with-docker-and-docker-compose/30025/

Installing Zabbix from packages can feel overwhelming, due to the availability of different configuration options. The detailed and comprehensive documentation certainly helps to check the purpose of these multiple options, what values can be set in their fields, and if one is required for your planned deployment. There are quite a few official Zabbix blog posts about Zabbix in containers, and this post is aimed at showcasing how additional Zabbix components can be easily set up in a docker environment, along with docker run and docker compose examples.

Table of Contents

For those who would prefer to use Zabbix in a containerized environment such as Docker, or who want to try out Zabbix quickly, this guide is for you (you can also check out the other Zabbix Docker blog posts). You can also mix and match Zabbix components installed from packages or built from source with those running in containers.

Please follow the official guide on how to set up the docker here.

To better understand the Zabbix architecture for those who are trying out Zabbix for the first time, I will give you an overview that should make it much easier to follow and understand Zabbix.

Zabbix consists of 3 main components (the bare minimum to get started):

Zabbix Server – responsible for everything related to data collection, trigger evaluation, event generation, and alerting.
Zabbix Frontend – responsible for the configuration (modifying or changing the configuration of the monitoring targets) and visualization (dashboards, graphs, tables, and widgets).
Database – this is where the Zabbix configuration and monitoring history data are stored.

You can monitor your targets with the bare minimum setup; however, more comprehensive and complete monitoring can be achieved by using the C-based Zabbix-agent or GO-based Zabbix-agent2 in combination with templates, user parameters, and more. To set up the minimum necessary Zabbix components, you can use this example in the guide.

There are also official guides available on the Zabbix documentation page (for both: the docker run and docker compose) or the Docker/Github.

As of this writing , these official Zabbix docker components are available from the docker hub page:

Zabbix Server (with MySQL/PostgreSQL database)
Zabbix Proxy (with MySQL/SQLite3 database)
Zabbix Frontend (Apache/Nginx with MySQL/PostgreSQL DB)
Zabbix Agent (TLS encryption)
Zabbix Agent2 (TLS encryption)
Zabbix Java Gateway
Zabbix SNMP traps
Zabbix Web Service

Tags are used to select which OS container an image will be based on, as well as which Zabbix component version you wish to employ. If you only specify tag value – latest, you will get the latest Zabbix version based on the Alpine Linux. The images based on Linux Alpine are more lightweight than the other distros.

When something does not work as expected or fails, check the container error logs! This will be useful for debugging purposes and will help to narrow down the cause of an issue. Additionally, when debugging you can also specify additional options, such as specific lines of log, timestamp since or until, or following the log file content.

# docker logs --tail 50 container_name_or_id

    --details        Show extra details provided to logs
-f, --follow         Follow log output
    --since string   Show logs since timestamp (e.g. "2013-01-02T13:23:37Z") or relative (e.g. "42m" for 42 minutes)
-n, --tail string    Number of lines to show from the end of the logs (default "all")
-t, --timestamps     Show timestamps
    --until string   Show logs before a timestamp (e.g. "2013-01-02T13:23:37Z") or relative (e.g. "42m" for 42 minutes)

In some rare cases, when there is a container issue (everything else is correct, worked before, etc.), restarting the docker service can sometimes solve the issue.

So, what is different if you have only used Zabbix installed from packages? The examples below illustrate the differences in configuration options based on different Zabbix deployment methods: a) package-based/compiled installation, b) docker run command, and c) docker compose file example. First of all, you will have to specify environment variables in the docker run command or docker compose file. The list of available environment variables for each docker image is available in both docker hub and Github.

A). Package-based config

# vi /etc/zabbix/zabbix_server.conf
...
DBName=zabbix
DBUser=zabbix_usr
DBPassword=zabbix_pwd
...

B).Docker run config

docker run --name zbxsrv -t \
...
-e MYSQL_DATABASE=zabbix\
-e MYSQL_USER=zabbix_usr \
-e MYSQL_PASSWORD=zabbix_pwd\
...

C). Docker compose config

# vi /../...yaml
...
  environment:
   MYSQL_DATABASE=zabbix
   MYSQL_USER=zabbix_usr
   MYSQL_PASSWORD=zabbix_pwd

The environment variables are represented as key-value pairs, e.g., VAR=VAL. The values can optionally be unquoted or double-quoted. If some environment variable value contains special characters, you will need to escape them. To properly escape them, check out the docker documentation page.

You can create custom, user-defined networks to connect multiple containers to the same network. On such networks, containers can resolve each other by name or alias. If needed, you can assign a specific IP address to a container (if the address is already used, you will get an error).

# docker network create --subnet 172.20.0.0/16 --ip-range 172.20.240.0/20 zabbix-net

Docker run

In this section, we have an example of docker run commands for two Zabbix components: Zabbix proxy and Java gateway. When using custom, user-defined networks, you can use container names for communication between containers instead of using IP addresses. Here, instead of defining the IP address for Zabbix Java gateway, the container name is used. You can set a static IP address for your container or let docker do it for you, but confirm if the change of the IP address will not cause issues in case your container gets a different IP address. This can become an issue if you use an IP address in some configuration fields instead of a container name.

A lot of parameters are specified using environment variables with the option -e. Also, 3 different ports are exposed on your host machine. To keep the SQLite3 database file upon container deletion, the container directory containing database file is mounted to host directory (the proxy DB is usually used as a buffer storage before sending data to Zabbix server and usually is not used to store data beyond the moment when the data is sent).

docker run --name zabbix-proxy-active-01 \
-e ZBX_HOSTNAME="Zabbix-proxy-active-01" \
-e ZBX_SERVER_HOST=46.101.140.98 \
-e ZBX_PROXYMODE="0" \
-e ZBX_JAVAGATEWAY_ENABLE=true \
-e ZBX_JAVAGATEWAY=zabbix-java-gateway-proxy \
-e ZBX_JAVAGATEWAYPORT=10052 \
-e ZBX_STARTJAVAPOLLERS=5 \
--network=zabbix-net \
-e ZBX_LISTENPORT=10101  \
-p 10101:10101 \
-p 10050:10050 \
-p 10051:10051 \
-v /var/lib/zabbix/db_data:/var/lib/zabbix/db_data \
--restart unless-stopped \
--init -d zabbix/zabbix-proxy-sqlite3:alpine-7.2.4

docker run --name zabbix-java-gateway-proxy \
--network=zabbix-net \
--restart unless-stopped \
-d zabbix/zabbix-java-gateway:alpine-7.2.4

You can start each of these Zabbix components using the docker run command, however, any change to the container configuration will require you to stop the container, delete it, and execute the docker run command again. You also have another option – you could create a docker compose file and write the necessary configuration in yaml format. When you need to add some changes to the container configuration, run the docker compose down command to remove containers, edit the docker compose file, and run docker compose up command to start them up again with the new configuration:

docker compose -f ./docker_compose_v3_proxy.yaml down
docker compose -f ./docker_compose_v3_proxy.yaml up -d

If you have not mounted volume or directory to container for the data you want to keep, you can copy the data from the container to your host. Otherwise, that data will be gone if you delete the container or use the docker compose down command. So, it is important to set up the persistent storage/volume for the data that needs retaining, so you don’t lose important data from the container when container configuration is changed. You also need to expose the ports for the necessary services for the appropriate components (if they are set up on on separate hosts): zabbix-server, zabbix-proxy, zabbix-agent/zabbix-agent2 (default ports: 10050 for Zabbix agent passive mode, 10051 for Zabbix-agent active mode, some different port for proxy, 10052 for Java gateway).

Here we have the same docker run options written to docker compose file, including the environment variables, mounted directories and exposed ports. You can specify as many services as needed and start them just with docker compose command.

docker_compose_v3_proxy.yaml

services:
  zabbix-proxy-active-01:
    image: "${PROXY_SQLITE3_IMAGE}:${ALPINE_IMAGE_TAG}"
    environment:
      ZBX_HOSTNAME: Zabbix-proxy-active-01
      ZBX_SERVER_HOST: ${ZBX_SERVER_HOST}
      ZBX_PROXYMODE: 0
      ZBX_LISTENPORT: 10101
      ZBX_JAVAGATEWAY_ENABLE: true
      ZBX_JAVAGATEWAY: zabbix-java-gateway-proxy
      ZBX_JAVAGATEWAYPORT: 10052
      ZBX_STARTJAVAPOLLERS: 5
    volumes:
      - /var/lib/zabbix/db_data:/var/lib/zabbix/db_data:rw
    networks:
      - backend
    ports:
      - 10101:10101
      - 10050:10050
      - 10051:10051
    restart: unless-stopped

  zabbix-java-gateway-proxy:
    image: "${JAVA_GW_IMAGE}:${ALPINE_IMAGE_TAG}"
    networks:
      - backend
    restart: unless-stopped

networks:
  backend:
    name: zabbix-net
    external: true

.env

PROXY_SQLITE3_IMAGE=zabbix/zabbix-proxy-sqlite3
JAVA_GW_IMAGE = zabbix/zabbix-java-gateway
ALPINE_IMAGE_TAG=alpine-7.2.4
ZBX_SERVER_HOST=46.101.140.98

You can also use official Zabbix-supplied docker compose files, try them out, and modify them as needed.

You can read more about the official docker compose files here.

Containerized Zabbix components allow us to use test different scenarios within the docker:

Creating HA Zabbix-server nodes
Creating multiple proxies
Creating multiple agents
Adding more Java gateways
Creating multiple frontends
Easily configure Browser monitoring
Configure SNMP traps
Easily make scheduled reports

Deploying multiple redundant Zabbix servers

To enable HA Zabbix server mode, modify both the Zabbix-server container and Zabbix-frontend container configuration environment variables.

For the HA Zabbix server mode, add 2 environment variables:

ZBX_HANODENAME
ZBX_NODEADDRESS

All of the containers are set with the user-defined network, therefore I will use the container name in the ZBX_HANODENAME option instead of the static address, as it will be resolved by docker. If you need to use a different listen port for the trapper, you need to define it using the environment variable ZBX_LISTENPORT. You can omit the port in variable ZBX_HANODENAME, as the ZBX_LISTENPORT (default is 10051) will be applied automatically.

Here is the docker run example for the Zabbix-server HA mode.

docker run --name zabbix-server-mysql-ha1 -t \
-e DB_SERVER_HOST="mysql-server" \
-e MYSQL_DATABASE="zabbix" \
-e MYSQL_USER="zabbix" \
-e MYSQL_PASSWORD="zabbix_pwd" \
-e MYSQL_ROOT_PASSWORD="root_pwd" \
-e ZBX_HANODENAME="zabbix-server-HA1" \
-e ZBX_NODEADDRESS="zabbix-server-mysql-ha1" \
--network=zabbix-net \
-p 10151:10051 \
--restart unless-stopped \
-d zabbix/zabbix-server-mysql:alpine-7.2.4

docker run --name zabbix-server-mysql-ha2 -t \
-e DB_SERVER_HOST="mysql-server" \
-e MYSQL_DATABASE="zabbix" \
-e MYSQL_USER="zabbix" \
-e MYSQL_PASSWORD="zabbix_pwd" \
-e MYSQL_ROOT_PASSWORD="root_pwd" \
-e ZBX_HANODENAME="zabbix-server-HA2" \
-e ZBX_NODEADDRESS="zabbix-server-mysql-ha2" \
--network=zabbix-net \
-p 10251:10051 \
--restart unless-stopped \
-d zabbix/zabbix-server-mysql:alpine-7.2.4

From the frontend container, remove these two environment variables:

ZBX_SERVER_HOST
ZBX_SERVER_PORT

docker run --name zabbix-web-nginx-mysql -t \
-e ZBX_SERVER_HOST="zabbix-server-mysql" \
-e ZBX_SERVER_PORT=10051
-e DB_SERVER_HOST="mysql-server" \
-e MYSQL_DATABASE="zabbix" \
-e MYSQL_USER="zabbix" \
-e MYSQL_PASSWORD="zabbix_pwd" \
-e MYSQL_ROOT_PASSWORD="root_pwd" \
--network=zabbix-net \
-p 80:8080 \
--restart unless-stopped \
-d zabbix/zabbix-web-nginx-mysql:alpine-7.2.4

Once both container configurations are modified, you should be able to see the currently added HA server nodes and their states without issues.

Fig. 1. Containers of HA Zabbix server containers

Fig. 2. Dashboard – system information

You can also execute commands on the container:

# docker exec -it container_name_or_id sh -c "zabbix_server -R ha_status"

Fig. 3. Executing command on container

Containers of HA Zabbix server containers

I’t’s possible to allocate an interactive pseudo-TTY shell, by adding option -ti and specifying shell after the container name or id.

# docker exec -ti container_name_or_id /bin/bash

Fig. 4. Executing command from within container

You can also start multiple proxies at once in docker. This can help to offload preprocessing to the proxy, gather data from the targets behind the firewall, and send collected data back to the Zabbix server, only requiring one port.

Fig. 5.Overall block diagram of Zabbix monitoring opportunities

Deploying multiple Zabbix proxies

First, you must choose the proxy mode and set the environment variable ZBX_PROXYMODE.

For active mode proxy, please define the server host address for a single server or addresses separated by a semicolon in the case of HA Zabbix server configuration (example shown below).

docker run --name zabbix-proxy-active-01 \
-e ZBX_HOSTNAME="Zabbix-proxy-active-01" \
-e ZBX_SERVER_HOST="zabbix-server-mysql-ha1;zabbix-server-mysql-ha2;zabbix-server-mysql-ha3" \
-e ZBX_PROXYMODE="0" \
--network=zabbix-net \
-e ZBX_LISTENPORT=10101  \
-p 10101:10101 \
-v /var/lib/zabbix/db_data:/var/lib/zabbix/db_data \
--restart unless-stopped \
--init -d zabbix/zabbix-proxy-sqlite3:alpine-7.2.4

For passive mode proxy, define the server host address for a single server or addresses separated by a comma in the case of HA Zabbix server configuration (example shown below).

docker run --name zabbix-proxy-passive-01 \
-e ZBX_HOSTNAME="Zabbix-proxy-passive-01" \
-e ZBX_SERVER_HOST="zabbix-server-mysql-ha1,zabbix-server-mysql-ha2,zabbix-server-mysql-ha3" \
-e ZBX_PROXYMODE="1" \
--network=zabbix-net \
-e ZBX_LISTENPORT=10102 \
-p 10102:10102 \
-v /var/lib/zabbix/db_data:/var/lib/zabbix/db_data \
--restart unless-stopped \
--init -d zabbix/zabbix-proxy-sqlite3:alpine-7.2.4

docker_compose_v3_proxies.yaml

services:
  zabbix-proxy-active-01:
    image: "${PROXY_SQLITE3_IMAGE}:${ALPINE_IMAGE_TAG}"
    environment:
      ZBX_HOSTNAME: zabbix-proxy-active-01
      ZBX_SERVER_HOST: zabbix-server-mysql-ha1;zabbix-server-mysql-ha2;zabbix-server-mysql-ha3
      ZBX_PROXYMODE: 0
      ZBX_LISTENPORT: 10101
    volumes:
      - /var/lib/zabbix/db_data:/var/lib/zabbix/db_data:rw
    networks:
      - backend
    ports:
      - 10101:10101
    restart: unless-stopped

  zabbix-proxy-passive-01:
    image: "${PROXY_SQLITE3_IMAGE}:${ALPINE_IMAGE_TAG}"
    environment:
      ZBX_HOSTNAME: zabbix-proxy-passive-01
      ZBX_SERVER_HOST: zabbix-server-mysql-ha1,zabbix-server-mysql-ha2,zabbix-server-mysql-ha3
      ZBX_PROXYMODE: 1
      ZBX_LISTENPORT: 10102
    volumes:
      - /var/lib/zabbix/db_data:/var/lib/zabbix/db_data:rw
    networks:
      - backend
    ports:
      - 10102:10102
    restart: unless-stopped
networks:
  backend:
    name: zabbix-net
    external: true

.env

PROXY_SQLITE3_IMAGE=zabbix/zabbix-proxy-sqlite3
JAVA_GW_IMAGE = zabbix/zabbix-java-gateway
ALPINE_IMAGE_TAG=alpine-7.2.4
ZBX_SERVER_HOST=46.101.140.98

The proxy name in the frontend must be the same as the value set in proxy environment variable ZBX_HOSTNAME! Also, in frontend for active proxies, you don’t need to add the proxy address.

Next, you can set hosts to be monitored by Zabbix-proxies, but make sure to update the agent configuration, so agents accept connections from proxy.

Fig. 6. Hosts monitored by proxy

Fig. 7.List of proxies and hosts monitored by them

Configuring Proxy groups

You can create as many proxy containers as necessary in Docker, and you can also create proxy groups for load balancing (it is based on the number of hosts per proxy).

First, create a proxy group in the frontend:

Set proxy group name
Select failover period
Minimum number of proxies

Fig. 8.Creating a new proxy group

Next, add proxies to the proxy group, and specify the address for active agents and port for the active agents.

Fig. 9. Adding proxy to proxy group

Do not forget to change Zabbix agent configuration for hosts now monitored through the proxy group (add proxy groups IPs/DNS to Server and ServerActive options).

Fig. 10. Creating a new host and monitoring it through proxy group

You can see additional information regarding the proxies in the Frontend section: Administration/ Proxies.

Fig. 11. List of all configured proxies and those belonging to proxy group

Adding more Java gateways

Zabbix server or proxy can communicate with only one Zabbix java gateway, however, you are not limited tin how many Zabbix proxies you create together with Zabbix Java Gateway. You can make an unlimited number of pairs, consisting of Zabbix proxy with Zabbix Java Gateway.

For the containerized Zabbix server, you will need to add these 4 environment variables:

ZBX_JAVAGATEWAY_ENABLE=true
ZBX_JAVAGATEWAY=zabbix-java-gateway-server
ZBX_JAVAGATEWAYPORT=10052
ZBX_STARTJAVAPOLLERS=5

And start the Java gateway for the zabbix-server in docker:

docker run --name zabbix-java-gateway-server -t \
--network=zabbix-net \
--restart unless-stopped \
-d zabbix/zabbix-java-gateway:alpine-7.2.4

Or if you want to add java gateway to the Zabbix proxy, then add these 4 environment variables to Zabbix proxy in docker:

ZBX_JAVAGATEWAY_ENABLE=true
ZBX_JAVAGATEWAY=zabbix-java-gateway-proxy
ZBX_JAVAGATEWAYPORT=10052
ZBX_STARTJAVAPOLLERS=5

And start the java gateway as a container:

docker run --name zabbix-java-gateway-proxy -t \
--network=zabbix-net \
--restart unless-stopped \
-d zabbix/zabbix-java-gateway:alpine-7.2.4

And here we have a host, monitored by zabbix-agent2 through zabbix-proxy-active-02

Fig. 12. Host monitored by proxy with configured Java gateway

Upgrading docker proxies with SQLite3 database

If you have older Zabbix components already running in docker and you have upgraded the server, you will also need to upgrade the proxies.

If you have a container created from the proxy zabbix-proxy-sqlite3 image and want to upgrade it, you will lose the existing data stored in the SQLite3 database. For most users, the database functions as a buffer to temporarily keep the data until it’s sent to Zabbix server and the loss of the proxy database file data is of no consequence.

Once you have updated the image for the container, the proxy will detect the existing old database version on startup. If the directory is mounted to database file, it will delete the database file and create a new one. This will impact those who keep data after sending it to Zabbix server and use the data from the proxy database for other purposes.

Fig. 13. Database upgrade for proxy container with SQLite3 database

Upgrading docker proxies with MySQL database

To upgrade the MySQL database for proxy, log in in the MySQL database, set the log_bin_trust_function_creators flag to 1. Change the proxy image version to a newer one and start the container.

mysql> set global log_bin_trust_function_creators = 1;

If you have not set the flag, you will receive an error of database upgrade.

Fig. 14. Failed database upgrade for proxy with MySQL database

Replace the previous version of the proxy image with the new one, check the log file, and check the docker logs to see when the database schema upgrade has finished. After the upgrade, set the flag back to 0.

mysql> set global log_bin_trust_function_creators = 0;

The upgrade has been successful, and the proxy service has started after that.

Fig. 15. Successful database upgrade for proxy with MySQL database

An official docker image for the proxy with Postgresql database support is not available due to the extensive number of existing images and different versions.

Deploying multiple frontends

You can launch as many frontends as you need if you are experiencing a sudden surge in Zabbix users. Just specify which port to assign for it and you are good to go (don’t forget to also open the port in the firewall).

docker run --name zabbix-web-nginx-mysql1 -t \
-e DB_SERVER_HOST="mysql-server" \
-e MYSQL_DATABASE="zabbix" \
-e MYSQL_USER="zabbix" \
-e MYSQL_PASSWORD="zabbix_pwd" \
-e MYSQL_ROOT_PASSWORD="root_pwd" \
--network=zabbix-net \
-p 80:8080 \
--restart unless-stopped \
-d zabbix/zabbix-web-nginx-mysql:alpine-7.2.4

Fig. 16. One started Zabbix frontend container in docker

docker run --name zabbix-web-nginx-mysql2 -t \
-e DB_SERVER_HOST="mysql-server" \
-e MYSQL_DATABASE="zabbix" \
-e MYSQL_USER="zabbix" \
-e MYSQL_PASSWORD="zabbix_pwd" \
-e MYSQL_ROOT_PASSWORD="root_pwd" \
--network=zabbix-net \
-p 81:8080 \
--restart unless-stopped \
-d zabbix/zabbix-web-nginx-mysql:alpine-7.2.4

Fig. 17. Two started Zabbix frontend containers in docker

docker run --name zabbix-web-nginx-mysql3 -t \
-e DB_SERVER_HOST="mysql-server" \
-e MYSQL_DATABASE="zabbix" \
-e MYSQL_USER="zabbix" \
-e MYSQL_PASSWORD="zabbix_pwd" \
-e MYSQL_ROOT_PASSWORD="root_pwd" \
--network=zabbix-net \
-p 82:8080 \
--restart unless-stopped \
-d zabbix/zabbix-web-nginx-mysql:alpine-7.2.4

Fig. 18. Three started Zabbix frontend containers in docker

Fig. 19. Multiple frontends accessed through different ports

Browser monitoring

Browser monitoring setup has never been easier! Just add two parameters to zabbix-server container config:

ZBX_WEBDRIVERURL=selenium:4444
ZBX_STARTBROWSERPOLLERS=2

And start the web driver in the docker (with a standalone chrome browser):

docker run --name selenium -t\
--network=zabbix-net \
--restart unless-stopped \
-p 4444:4444 \
--shm-size="1g" \
-d selenium/standalone-chrome:latest

Next step: create a new host, add the template, specify which page to monitor with Macro values, and it’s DONE!!!!

Fig. 20. Creating host for monitoring website

Fig. 21. Screenshot of the monitored website

SNMP traps

For the snmptraps to work, the same directory must be shared among the zabbix-server and zabbix-snmptrap container. On the Zabbix-server side, you need to explicitly set snmp environment variable ZBX_ENABLE_SNMP_TRAPS to true and mount directory /var/lib/zabbix/snmptraps.

You also need to add the same volume to the snmptrap container.

And run the snmptraps container (make sure there is no permission issue for the directory)

docker run --name zabbix-snmptraps -t \
-v /var/lib/zabbix/snmptraps:/var/lib/zabbix/snmptraps:rw \
--network=zabbix-net \
-p 162:1162/udp \
--restart unless-stopped \
-d zabbix/zabbix-snmptraps:alpine-7.2-latest

Fig. 22. Received SNMP trap message

Scheduled reports

You can also easily configure scheduled reports by adding 2 additional environment variables to the Zabbix-server. In my case, both of these containers are in the same custom user network, therefore I will use the container name zabbix-web-service in the ZBX_WEBSERVICEURL option.

ZBX_STARTREPORTWRITERS=5
ZBX_WEBSERVICEURL=http://zabbix-web-service:10053/report

Start the Zabbix-web service, specify also these 2 parameters (you can skip those if defaults are used). You can also allow any incoming connections by setting ZBX_ALLOWEDIP=0.0.0.0/0. We discourage this, however.

ZBX_ALLOWEDIP=zabbix-server-mysql
ZBX_LISTENPORT=10053

Before testing scheduled reports, make sure you have enabled and configured the email media type.

Fig. 23. Configured and enabled media type

It is also encouraged to test it and check that you have received the test email.

Fig. 24. Successful media type test response

Fig. 25. Received test response on the selected media type.

Next, configure the user media where the scheduled report will be sent.

Fig. 26.Media type defined for the user

Last, but not least, set the frontend URL in the section Administration/General/Other section. In my case, I set the container name of the frontend and specify the port.

for Apache: http://<server_ip_or_name>/zabbix
for Nginx: http://<server_ip_or_name>

Fig. 27. Configured frontend address for the Frontend URL option

Next, create a scheduled report based on the dashboard of your choice.

Fig. 28.Configuring scheduled report

Check that you have received the test report in your mail.

Fig. 29.Successful scheduled report test.

Fig. 30. Received scheduled report test in the email

Now you know how to set up scheduled reports!

Docker container monitoring

You can also monitor Docker containers with a containerized Zabbix instance*

* Disclaimer: If docker service is not running, Zabbix monitoring will also not function and you will not receive notifications and alerts.

You can also monitor your docker instance with the Zabbix agent 2, however, you will be required to install Zabbix-agent 2 on the host either as a package or build it from the source.

You will also need to give user zabbix access to the docker.sock file. Just add user zabbix to group docker:

# usermod -aG docker zabbix

Otherwise, you will get an error message in items:

Cannot fetch data: Get "http://1.28/info": dial unix /var/run/docker.sock: connect: permission denied.

Go back to the frontend and create a Host for monitoring the docker containers:

Link template: Docker by Zabbix agent 2
Add host to host group
Specify host address or dns name, set the correct connect to option, and specify the agent port (if a default port is used, then set 10050).

Fig. 31. Configuring the host for monitoring the docker container

Now, if some issue happens to other containers, Zabbix will monitor them. But to be notified of an issue, don’t forget to enable and configure the media, user media, media templates, and trigger actions, so that you receive alerts.

Fig. 32.Latest data for the docker host

Thank you for reading – I hope you’ve found this article helpful and informative!

The post Deploying Zabbix Components with Docker and Docker Compose appeared first on Zabbix Blog.

Validate Your Lambda Runtime with CloudFormation Lambda Hooks

2025-04-02 Matteo Luigi Restelli

Post Syndicated from Matteo Luigi Restelli original https://aws.amazon.com/blogs/devops/validate-your-lambda-runtime-with-cloudformation-lambda-hooks/

Introduction

This post demonstrates how to leverage AWS CloudFormation Lambda Hooks to enforce compliance rules at provisioning time, enabling you to evaluate and validate Lambda function configurations against custom policies before deployment. Often these policies impact the way a software should be built, restricting language versions and runtimes. A great example is applying those policies on AWS Lambda, a serverless compute service for running code without having to provision or manage servers. While AWS Lambda already manages the deprecation of runtimes, preventing you from deploying unsupported runtimes, organizations may need to provide and enforce their specific compliance rules not directly linked to the deprecation of a specific language version.

Introducing Lambda Hooks

AWS CloudFormation Lambda Hooks are a powerful feature that allows developers to evaluate CloudFormation and AWS Cloud Control API operations against custom code implemented as Lambda functions. This capability enables proactive inspection of resource configurations before provisioning, enhancing security, compliance, and operational efficiency.

Lambda Hooks provide a mechanism to intercept and evaluate various CloudFormation operations, including resource operations, stack operations, and change set operations (they can also be used with Cloud Control API, but in this post we’re focusing on CloudFormation). By activating a Lambda Hook, CloudFormation creates an entry in your account’s registry as a private Hook, allowing you to configure it for specific AWS accounts and regions. When configuring Lambda Hooks, you can specify one or more Lambda functions to be invoked during the evaluation process. These functions can be in the same AWS account and Region as the Hook, or in another Account you own, provided proper permissions are set up. The evaluation process occurs at specific points in the CloudFormation Stack lifecycle. For instance, during stack creation, update, or deletion, the configured Lambda functions are invoked to assess the proposed changes against your defined compliance rules. Based on the evaluation results, the hook can either block the operation or issue a warning, allowing the operation to proceed.

Lambda Hooks evaluate resources before they are provisioned through CloudFormation, providing a pre-emptive layer of governance. This means that non-compliant resources are caught and prevented from being deployed, rather than requiring retroactive fixes. By leveraging Lambda Hooks, organizations can automate and standardize their compliance checks across all AWS accounts and regions. This centralized approach to policy enforcement ensures consistency and reduces the overhead of managing compliance manually.

Solution Overview

The following sections demonstrate a practical use case for AWS CloudFormation Lambda Hooks, focusing on enforcing compliance rules on AWS Lambda runtimes.

Meet AnyCompany, a forward-thinking enterprise with a robust set of compliance rules governing their software development practices. Among these rules is a strict policy on the use of specific AWS Lambda runtimes.

As they continue to embrace serverless architecture, AnyCompany faces a challenge: how to prevent the deployment of Lambda functions that use non-compliant runtimes. Given their commitment to AWS CloudFormation for deploying Lambda functions, AnyCompany is keen to leverage the power of AWS CloudFormation Lambda Hooks.

We’ll explore the setup process, demonstrate the hook in action, and discuss the broader implications for maintaining compliance in a dynamic cloud environment.

Architecture

The following architecture highlights the implementation of the Lambda Hook. In this implementation, we are using AWS CloudFormation Lambda Hooks to intercept the deployment of Lambda Functions and perform the compliance checks on these resources. The Lambda Hook will interact with an AWS Lambda Function, which will perform the compliance checks. Finally, we’re using AWS Systems Manager Parameter Store to store the Configuration Parameter which contains the list of permitted Lambda Runtimes.

Figure 1: Architecture of the Solution

A Developer (or a CI/CD pipeline) deploys a CloudFormation stack containing Lambda functions.
CloudFormation invokes the respective Lambda Hook, which is configured to intercept operations on AWS Lambda Resources. We are setting this hook to “FAIL” deployment in case checks are not successful.
The Lambda Hook checks if the runtime of the Lambda is admitted or violates Company’s compliance. To do this, it checks if the runtime is present on a pre-configured list of admitted runtimes saved as Parameter in AWS Systems Manager Parameter Store. Keep in mind that we’re using SSM Parameter Store to store the configuration for this specific example, but other alternatives may be viable as well (Amazon DynamoDB, AWS Secrets Manager, or AppConfig lambda-function-settings-check Preventive Rule)
The Lambda Hook, after checking runtime compliance, replies:
- With a failure, if the Lambda runtime is not compliant
- With a success, if the Lambda runtime is compliant
Depending on the response of the Lambda Hook, the deployment may or not take place.

Repository Structure

You can find all the code for this solution at this link. Here’s the repository structure:

.
├── README.md
├── deploy.sh
├── cleanup.sh
├── hook-lambda
│ ├── index.ts
│ ├── package.json
│ ├── services
│ │ └── parameter-store.ts
│ └── tsconfig.json
├── sample
│ ├── deploy_sample.sh
│ ├── cleanup_sample.sh
│ └── lambda_template.yml
└── template.yml

hook-lambda: directory containing all the code related to the CloudFormation Lambda Hook (Validation Lambda Function, and the CloudFormation template for the Solution)
sample: directory containing the code of the sample used to test the CloudFormation Lambda Hook
deploy.sh: utility script to deploy the Solution via AWS CLI
cleanup.sh: utility script to clean up the AWS CloudFormation Hook infrastructure via the AWS CLI
template.yml: AWS CloudFormation Template containing all the AWS Resources involved in the Solution

Prerequisites

You must have the following prerequisites for this solution:

An AWS account or sign up to create and activate one.
The following software installed on your development machine:
Install the AWS Command Line Interface (AWS CLI) and configure it to point to your AWS account.
Install Node.js and use a package manager such as npm.
Appropriate AWS credentials for interacting with resources in your AWS account.

Walkthrough

Creating the AWS Lambda Validation Function – Lambda Code

The CloudFormation Lambda Hook interacts with a specific Lambda (referred to as Validation Lambda throughout the rest of this post), which gets invoked during CloudFormation CREATE and UPDATE STACK operations involving Lambda Functions. The goal is to check if these Lambda functions have runtimes that comply with AnyCompany’s rules.

Below is the detailed description of the steps that the Validation Lambda function handler follows (the code is written in Typescript).

First, the Validation Lambda retrieves an environment variable containing the SSM Parameter Store parameter name which contains the compliant runtimes list. Additionally, safety checks ensure that only Lambda Resources are considered and that their Runtime property is defined.

Note that both safety checks could be skipped, since the Hook should already be configured to interact only with Lambda Resources and the Lambda’s Runtime property is always required. However, they remain in place to demonstrate how to retrieve this information from the Lambda Hook event in your handler.

const parameterName = process.env.PERMITTED_RUNTIMES_PARAM;
if (!parameterName) {
	throw new Error('Permitted Runtimes Parameter is not set');
}

const resourceProperties = event.requestData.targetModel.resourceProperties;
// Check if this is a Lambda function resource
if (event.requestData.targetType !== 'AWS::Lambda::Function') {
console.log("Resource is not a Lambda function, skipping");
	return {
		hookStatus: 'SUCCESS',
		message: 'Not a Lambda function resource, skipping validation',
		clientRequestToken: event.clientRequestToken
	}
}

// Check runtime version compliance
const runtime = resourceProperties.Runtime;
if (!runtime) {
	console.log("Runtime not defined, failing");
	return {
		hookStatus: 'FAILURE',
		errorCode: 'NonCompliant',
		message: 'Runtime is required for Lambda functions',
		clientRequestToken: event.clientRequestToken
	}
}

Then the Validation Lambda retrieves the value of the Configuration Parameter from SSM Parameter Store through a utility class called ParameterStoreService. For this post, consider that the value inside that Configuration Parameter is a list of strings, where each string contains one of the possible Lambda runtime values that you can find here (e.g. nodejs22.x,nodejs20.x,python3.11,python3.10,java17,java11,dotnet6). After retrieving the value, the Validation Lambda checks if the runtime of the Lambda Resource complies with the configured admitted runtimes. If the runtime is not compliant, you’ll receive a properly formatted response with FAILURE as hookStatus, otherwise the response will contain a SUCCESS hookStatus.

// Retrieve configuration from Parameter Store
const compliantRuntimes = await parameterStoreService.getParameterFromStore(parameterName);

// Check if Lambda runtime is permitted or not
if (!compliantRuntimes.includes(runtime)) {
console.log("Runtime " + runtime + " not compliant ");
	return {
		hookStatus: 'FAILURE',
		errorCode: 'NonCompliant',
		message: `Runtime ${runtime} is not compliant. Please use one of: ${compliantRuntimes.join(', ')}`,
		clientRequestToken: event.clientRequestToken
	}
}

return {
	hookStatus: 'SUCCESS',
	message: 'Runtime version compliance check passed',
	clientRequestToken: event.clientRequestToken
}

For more information about the possible response values of CloudFormation Lambda Hooks Lambda, have a look at this link.

Creating the validation Lambda – Lambda CloudFormation definition

The Validation Lambda function will be deployed via CloudFormation, in the same Stack with the CloudFormation Lambda Hook definition and the AWS Systems Manager Parameter Store Parameter. Here’s the fragment of the CloudFormation Template containing its definition:

# Lambda Function
ValidationFunction:
	Type: AWS::Lambda::Function
	Properties:
		Handler: index.handler
		Role: !GetAtt LambdaExecutionRole.Arn
		Code:
			S3Bucket: !Ref DeploymentBucket
			S3Key: hook-lambda.zip
		Runtime: nodejs22.x
		Timeout: 60
		MemorySize: 128
		Environment:
			Variables:
				PERMITTED_RUNTIMES_PARAM: !Ref ParameterStoreParamName

You’ll need to associate an IAM Role with proper permissions to access the AWS Systems Manager Parameter Store Parameter:

# Lambda Function Role
LambdaExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service: lambda.amazonaws.com
            Action: sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
		
# IAM Policy to access Parameter Store
ParameterStoreAccessPolicy:
    Type: AWS::IAM::RolePolicy
    Properties:
      RoleName: !Ref LambdaExecutionRole
      PolicyName: ParameterStoreAccess
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Action:
              - ssm:GetParameter
            Resource: !Sub arn:aws:ssm:${AWS::Region}:${AWS::AccountId}:parameter${ParameterStoreParamName}

Creating the CloudFormation Lambda Hook

At this point, you only need to author a proper CloudFormation Lambda Hook. The Hook requires:

To be activated during the CREATE and UPDATE CloudFormation operations,
To consider only AWS::Lambda::Function CloudFormation resources
To act during Pre Provisioning of CloudFormation templates
To target Stack and Resource Operations
Target the already defined Lambda Validation function

Here’s the definition in the CloudFormation template:

# Lambda Hook
ValidationHook:
    Type: AWS::CloudFormation::LambdaHook
    Properties:
      Alias: Private::Lambda::LambdaResourcesComplianceValidationHook
      LambdaFunction: !GetAtt ValidationFunction.Arn
      ExecutionRole: !GetAtt HookExecutionRole.Arn
      FailureMode: FAIL
      HookStatus: ENABLED
      TargetFilters:
        Actions:
          - CREATE
          - UPDATE
        InvocationPoints:
          - PRE_PROVISION
        TargetNames:
          - AWS::Lambda::Function
      TargetOperations:
        - RESOURCE
        - STACK

Please note that the above template contains a reference to an IAM Role because the Hook requires proper permissions to call the target (Lambda Function). Here’s the IAM Role definition:

# Hook Execution Role
HookExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service: hooks.cloudformation.amazonaws.com
            Action: sts:AssumeRole

# IAM Policy for Lambda Invocation
LambdaInvokePolicy:
    Type: AWS::IAM::RolePolicy
    Properties:
      RoleName: !Ref HookExecutionRole
      PolicyName: LambdaInvokePolicy
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Action:
              - lambda:InvokeFunction
            Resource: !GetAtt ValidationFunction.Arn

Configuring the compliant runtimes – Using Systems Manager Parameter Store

AWS Systems Manager Parameter Store is a secure, hierarchical storage service for configuration data management and secrets management, allowing users to store and retrieve data such as configurations, database strings etc. as parameter values.

In this specific example, we’ll leverage Parameter Store to store our permitted Lambda runtimes configuration. This configuration value is a StringList parameter, containing a comma-separated list of permitted runtimes. Here’s the fragment of the CloudFormation template that defines the Parameter:

# Parameter Store Parameter
ConfigParameter:
    Type: AWS::SSM::Parameter
    Properties:
      Name: !Ref ParameterStoreParamName
      Type: StringList
      Value: !Ref ParameterStoreDefaultValue
      Description: "Configuration for Lambda Hook"

Please note the usage of CloudFormation parameters for the ‘Name’ and ‘Value’ properties, allowing for dynamic input when deploying the CloudFormation template.

Deploying the Solution

To deploy the solution you can leverage the script deploy.sh in the root folder of the repository. This script will perform the following actions:

Compile and build the Validation Lambda Function
Create an Amazon S3 Bucket to store the CloudFormation Template
Upload the CloudFormation template and Lambda code to the S3 Bucket
Deploy the CloudFormation template

Testing the Lambda Hook

To test the CloudFormation Lambda Hook, deploy a simple testing CloudFormation template containing a Hello World Lambda function. First, test the Lambda configured with a permitted Lambda runtime, then modify the template to configure the Lambda with a non-compliant runtime.

Here’s the initial definition of the testing CloudFormation Template:

# Lambda Function
HelloWorldFunction:
    Type: AWS::Lambda::Function
    Properties:
      FunctionName: hello-world-function
      Runtime: nodejs22.x
      Handler: index.handler
      Role: !GetAtt LambdaExecutionRole.Arn
      Code:
        ZipFile: |
          exports.handler = async (event, context) => {
              console.log('Hello World!');
              const response = {
                  statusCode: 200,
                  body: JSON.stringify('Hello World!')
              };
              return response;
          };
      Timeout: 30
      MemorySize: 128

Please note that the Runtime value is nodejs22.x, which is currently in the list of permitted runtimes. The expectation is that the deployment of this function will succeed.

Deploy this template via the AWS CLI:

aws cloudformation deploy \
--template-file ./lambda_template.yml \
--capabilities CAPABILITY_IAM \
--stack-name lambda-sample

Check the CloudFormation Console:

Figure 2: CloudFormation Console showing successful Stack deployment

As expected, the deployment was successful. You can also see that the CloudFormation Lambda Hook has been invoked by taking a look at the CloudWatch Logs:

Figure 3: Validation Lambda Function Logs with successful validation

Now modify the original sample Template in order to set a Lambda Runtime which is not inside the list of permitted runtimes:

# Lambda Function
HelloWorldFunction:
    Type: AWS::Lambda::Function
    Properties:
      FunctionName: hello-world-function
      Runtime: nodejs18.x
      Handler: index.handler
      Role: !GetAtt LambdaExecutionRole.Arn
      Code:
        ZipFile: |
          exports.handler = async (event, context) => {
              console.log('Hello World!');
              const response = {
                  statusCode: 200,
                  body: JSON.stringify('Hello World!')
              };
              return response;
          };
      Timeout: 30
      MemorySize: 128

Deploy this template via AWS CLI with the same command used before and check the CloudFormation Console:

Figure 4: CloudFormation Console showing failed Stack deployment due to Hook intervention

As expected, the deployment was not successful. The CloudFormation Lambda Hook has been invoked, and since the Lambda Runtime was not present in the permitted runtimes list, the deployment failed.

You can also see that the hook failed In the CloudWatch Logs:

Figure 5: Validation Lambda Function Logs with validation error

Cleaning up

To clean up the resources related to the sample, you can run the script cleanup_sample.sh inside the sample folder. This script will delete the sample’s CloudFormation Template through the AWS CLI.

To cleanup the resources related to the solution described above and based on AWS CloudFormation Lambda Hook, you can leverage the script cleanup.sh in the root folder of the repository. This script will perform the following actions:

Delete the CloudFormation Stack
Empty the S3 Bucket used for the deployment of the Stack
Delete the S3 Bucket

Conclusion

In this post, you explored the implementation of CloudFormation Hooks to enforce runtime compliance in Lambda functions across your AWS infrastructure. By leveraging the Lambda hook’s capabilities, you learned how to create a preventative control that validates Lambda runtime configurations before deployment.

By activating the Lambda hook and implementing a custom Lambda function validator, you established an automated mechanism to ensure that only compliant runtimes are used within your organization’s Lambda functions during CloudFormation stack creation and updates. The solution’s integration with common development tools like AWS CLI, AWS SAM, CI/CD pipelines, and AWS CDK makes it straightforward to implement these controls within existing workflows, eliminating the need for manual runtime checks or post-deployment remediation.

The validation approach demonstrated in this post extends beyond Lambda runtimes and can be adapted to different AWS Resources supported by CloudFormation, allowing you to enforce policies on different infrastructure components offered by AWS.

About the author

Installing and Configuring Zabbix Server and Agent on Windows

2025-04-01 Zabbix LatAm

Post Syndicated from Zabbix LatAm original https://blog.zabbix.com/installing-and-configuring-zabbix-server-and-agent-on-windows/29945/

Monitoring a Windows server helps verify and keep track of reboots, disk space, memory, CPU, communication loss, and high bandwidth consumption within the server – in fact, anything unusual that may require attention. In this post, we’ll see how to install and configure the Zabbix server and Zabbix agent on Windows, highlighting the key points that will keep your system running smoothly.

Check Zabbix server version

First, check which version of Zabbix server you’re using. This can be verified from the frontend in Reports > System information. In this example, we use version 7.0.9.

Before you begin, head over to your Windows server and verify the name and type of architecture.

This is critical to selecting the right agent during deployment.

Download the Zabbix agent

From the official Zabbix website, download the corresponding agent, taking into account the operating system (in this case, Windows), the hardware architecture (64 bits), the version of Zabbix server (ensuring compatibility with the version used), encryption (using OpenSSL as an encryption method), and the installation format (selecting the MSI file).

Select the current version of the release and download.

Install the Zabbix agent

Start installing the Zabbix Agent on the Windows server.

Accept the terms and conditions.

Check the components to be installed. You’ll need at least 8.70 MB of disk space.
The default installation path is C:\Program Files\Zabbix Agent\.

By default, the installer detects the name of the server. Enter the IP of your Zabbix Server. You can also use pre-shared keys.

Start the installation and wait for it to finish.

Configure host in Zabbix Server

To set up a host on Zabbix Server, go to the Zabbix frontend and go to Data collection > Hosts.

Then, click Create host (located in the top right) and configure the following details:

• The hostname (DESKTOP-D75R1IG)
• An identifying display name (such as ‘Windows Server’)
• The template (select ‘Windows by Zabbix Agent’)
• The group (assigns the server to an appropriate group)
• The interface (choose the agent monitoring option and enter the IP of the server)

Monitoring and visualization

Once the host is configured, you will start receiving data from the server in Zabbix, including:

• Overall performance: CPU, memory, and disk status.

• Windows services and detailed host information.

• Bandwidth consumption

Conclusion

Zabbix provides an ideal template for a productive environments on Windows, making it a key tool for global monitoring of your servers.

In addition, you can extend and adapt the monitoring according to your needs, such as monitoring logs, ports, or specific events, while also checking for login failures or other critical issues in your systems.

The post Installing and Configuring Zabbix Server and Agent on Windows appeared first on Zabbix Blog.

Monitoring Pure Storage FlashArray with Zabbix

2025-03-25 Aleksandr Iantsen

Post Syndicated from Aleksandr Iantsen original https://blog.zabbix.com/monitoring-pure-storage-flasharray-with-zabbix/29752/

Monitoring data storage systems is the key to keeping modern IT systems running smoothly. With the rapid growth of data and the need for instant access, using high-performance solutions like Pure Storage FlashArray is not just an advantage – it’s a necessity. However, even the most advanced systems require careful oversight regarding their performance and health. Good monitoring helps find problems early and makes it possible to use resources more efficiently. In this article, we will explore how to set up monitoring for the Pure Storage FlashArray storage system with Zabbix using our new templates.

Pure Storage FlashArray offers two API versions: REST API 1.X and REST API 2.X. To ensure compatibility and comprehensive coverage for the maximum number of devices, two templates have been developed for these API versions. This allows users to effectively monitor their Pure Storage FlashArray storage systems regardless of which API version they are utilizing, making sure that they can take full advantage of the monitoring capabilities and performance metrics provided by each version. By accommodating both API versions, organizations can achieve a more flexible and comprehensive monitoring setup tailored to meet their specific infrastructure needs.

Table of Contents

Preparing Pure Storage FlashArray for monitoring with Zabbix

In all of these examples, the Purity for FlashArray (Purity//FA) graphical user interface (GUI) will be used, so keep in mind that some of the UI elements or navigation menus can potentially change in the future.

User creation

First of all, you need to set up a user in GUI that Zabbix will use to access the REST API and gather data. To do so, navigate to 'Settings' -> 'Users and Policies' -> 'Users' from the left-side menu. On this page, pay attention to the ‘Users’ block. In the upper right corner of this block, you will see three dots. Click on them to open a context menu. In this menu, select the 'Create User...' option. Here, create a new user by filling in the fields.

API Key creation

Unlike Pure Storage FlashArray v2 by HTTP, Pure Storage FlashArray v1 by HTTP supports authentication using a username and password instead of a token. This feature is left for backward compatibility with older versions of devices and firmware. However, it is strongly recommended to use token authentication if there are no technical limitations.

If you do plan to use username and password authentication in the Pure Storage FlashArray v1 by HTTP template, you can skip this step and move on to the next one.

Once you have created the user, the next step is to generate an API token. To do this, find the newly created user in the 'Users' block on the 'Settings' -> 'Users and Policies' page. On the right side of the user’s entry, locate the three dots and click on them to open the menu. From this menu, select 'Create API Token...'. Follow the prompts to generate the API token, which Zabbix will use to authenticate requests. The 'Expires In' field can be left empty.

After clicking the Create button, the GUI will show you details about the API key. Save this information somewhere safe for now, as we will need to use this data later in Zabbix. After saving, you can close this pop-up.

Preparing Zabbix

Create a host

Open your Zabbix web interface, then navigate to the ‘Configuration' -> 'Hosts‘ page and create a new host. In this step, you need to specify a host name of your choice, so choose one of the Pure Storage FlashArray v1 by HTTP or Pure Storage FlashArray v2 by HTTP templates and assign the host to a group. The choice of template depends on the version of the Pure Storage FlashArray RESTful API that is supported by your devices.

Before clicking the Add button, you need to configure macros. Open the Macros tab and choose both Inherited and host macros. You’ll find a lot of macros there, but only a few of them need to be changed to start using the template. Let’s take a look at these macros.

Macro list in the Pure Storage FlashArray v1 by HTTP template:

Macro	Default value	Description
`{$PURE.FLASHARRAY.API.URL}`	–	Web interface URL.
`{$PURE.FLASHARRAY.API.TOKEN}`	–	API token.
`{$PURE.FLASHARRAY.API.USERNAME}`	–	Web interface username.
`{$PURE.FLASHARRAY.API.PASSWORD}`	–	Web interface password.
`{$PURE.FLASHARRAY.API.VERSION}`	1.19	API version.

For the Pure Storage FlashArray v1 by HTTP template, it is mandatory to specify the {$PURE.FLASHARRAY.API.URL} macro, as well as either the {$PURE.FLASHARRAY.API.TOKEN} or {$PURE.FLASHARRAY.API.USERNAME} and {$PURE.FLASHARRAY.API.PASSWORD}. It is highly recommended to use a token for authentication.

Macro list in the Pure Storage FlashArray v2 by HTTP template:

Macro	Default value	Description
`{$PURE.FLASHARRAY.API.URL}`	–	Web interface URL.
`{$PURE.FLASHARRAY.API.TOKEN}`	–	API token.
`{$PURE.FLASHARRAY.API.VERSION}`	2.36	API version.

For the Pure Storage FlashArray v2 by HTTP template, it is mandatory to specify just the {$PURE.FLASHARRAY.API.URL} and {$PURE.FLASHARRAY.API.TOKEN} macros to start using the template.

You can change the value for the {$PURE.FLASHARRAY.API.VERSION} macro if your device does not support this version of the API.

After specifying at least the mandatory macro values, your Macros tab should look something like this:

After clicking the Add button, this host will be added to Zabbix.

Data collection

After following the above steps, you should notice the newly created triggers and items after a short time if the macro values are correct.

In case there are any problems with the template’s data collection, you will find errors in the last history data of items with a name ending with item errors. Also, the corresponding triggers should be fired if there are any problems with the collection of any data.

After that, you should see newly discovered items in the Items view (for example).

On top of that, each host will have its own dashboard created automatically that will provide you with a good overview of resource utilization.

Use macros for low-level discovery filtering

In official Zabbix templates, you might find macros that end with MATCHES and NOT_MATCHES. These are used for low-level discovery rules (LLDs), to help you filter resources that should or should not be discovered. These values use regular expressions. Therefore, you can use wildcard symbols for pattern matching.

Usage of these macros can be found in the Filters tab, under discovery rules.

The typical default value for MATCHES is .* and for NOT_MATCHES – CHANGE_IF_NEEDED. This means that any kind of value will be discovered if it is not equal to CHANGE_IF_NEEDED. For example, in Network interface discovery, filters are used to check the interface name:

Macro {$PURE.FLASHARRAY.NETIF.LLD.FILTER.NAME.MATCHES} has a value of .*;
Macro {$PURE.FLASHARRAY.NETIF.LLD.FILTER.NAME.NOT_MATCHES} has a value of CHANGE_IF_NEEDED.

You can set the value of macro {$PURE.FLASHARRAY.NETIF.LLD.FILTER.NAME.NOT_MATCHES} to filevip, which will cause an interface named filevip to not be discovered.

Now that you have an idea how these filters work, you can adjust them based on your requirements.

HTTP proxy usage

If needed, you can specify an HTTP proxy for the template to use by changing the value of the{$PURE.FLASHARRAY.HTTP_PROXY} user macro. Every request will use this proxy.

Afterword

To wrap things up, setting up monitoring for Pure Storage FlashArray devices in Zabbix is an important step that guarantees the smooth operation of your infrastructure. I hope that our new templates will help you manage and monitor your devices more effectively.

This short article has been created to provide you with the necessary knowledge and tools to set up a monitoring system that meets your specific needs. By enabling efficient monitoring, you will be better equipped to respond to changes in system performance and maintain optimal operation. I believe this material will be valuable in helping you achieve these goals!

The post Monitoring Pure Storage FlashArray with Zabbix appeared first on Zabbix Blog.

The First Steps Toward Monitoring with Zabbix and SNMP

2025-03-11 Zabbix LatAm

Post Syndicated from Zabbix LatAm original https://blog.zabbix.com/the-first-steps-toward-monitoring-with-zabbix-and-snmp/29784/

In this article, we’ll explore how to use Zabbix to monitor a MikroTik device via SNMP, using specific templates that allow you to visualize the status of interfaces and their performance. Read on to understand how to use network monitoring to ensure the correct operation and performance of devices in an infrastructure employing the SNMP protocol.

Table of Contents

Verifying SNMP communication

Before you begin, make sure that SNMP communication is configured correctly on your MikroTik device. Also, set up an appropriate SNMP community for your equipment.

Create a host in Zabbix

Once SNMP is configured, go to Data Collection > Hosts > Create Host.

Here you will need to enter the basic details of the device, such as the name, IP, and the group it belongs to. If you are working with multiple MikroTik devices, organize the hosts into groups according to their characteristics.

Apply a template

Zabbix offers a wide variety of default templates that fit different device models. By selecting the appropriate template for your MikroTik device, you will be able to view all its resources efficiently.

Configure SNMP macros

In the Macros section, specify the SNMP community you previously configured on your MikroTik.

Then, click “Update” to save the changes. This configuration will allow Zabbix to access the device data.

If you are monitoring multiple devices using the same SNMP community, it is best to configure a global macro in the path Administration > Macros.

This will allow you to efficiently manage a network of devices without having to configure them individually.

Visualization and monitoring

After completing the above steps, you will be able to start viewing device information directly in Zabbix, including:

Overall device performance:

Connected interfaces:

Items

Zabbix can capture a new interface automatically at defined intervals. This makes it easy to monitor a new interface without the need to include it manually, thanks to the Network Interfaces Discovery functionality.

To analyze the status, we can go to Data Collection > Hosts, find our MikroTik device, and select Items.

In this section, we can observe interface 2 of our client, which appears as a dependent item. This means that there is a master item that collects data through MIBs, which are network information databases. These items in the description section provide much more detailed and technical information about their functionality.

Configuration of specific items for interfaces

If we want to create a specific item, we must access Data Collection > Hosts > Create Item.

We must also assign a name that identifies SNMP Agent, specify the key that identifies the parameter to be monitored, analyze the corresponding MIB to capture the OID, and define the metrics capture interval according to our monitoring needs.

To validate the OID using snmpwalk, it returns the information of the MIB IF-MIB::ifOperStatus.2, which represents the interface status.

Configuration of custom triggers

To configure a trigger to alert us about the status of the interface, we go to Data Collection > Hosts > Triggers > Create Trigger.

Then, we assign a descriptive name to the trigger (either manually or by using macros), define the event that will trigger the alert, set the appropriate severity, and create a logical expression that determines the status.

State 2 → The interface is down.

Status 1 → The interface is operational.

To correctly interpret SNMP values in Zabbix, we go to Data Collection > Templates > MikroTik RB4011iGS+RM by SNMP > Value Mapping.

From here, we can observe the values returned by SNMP and configure our triggers based on them.

Finally, we can test our configuration in Monitoring > Problems, where we can see the triggers running.

Advantages of using SNMP with Zabbix

Using Zabbix as a monitoring tool not only facilitates network management, but also allows you to monitor third-party applications that use the SNMP protocol.

Its flexibility, together with the wide range of templates and configurations, make it the best choice for optimizing resources and ensuring stable performance in your infrastructure.

The post The First Steps Toward Monitoring with Zabbix and SNMP appeared first on Zabbix Blog.

Building a Monitoring Dashboard: Which Metrics to Track?

2025-03-04 Michael Kammer

Post Syndicated from Michael Kammer original https://blog.zabbix.com/building-a-monitoring-dashboard-which-metrics-to-track/29777/

A well-designed monitoring dashboard is the key to helping users process, interact with, and analyze data. Done right, it allows key decision-makers to track metrics and gain insights in an organized, easy-to-read format, while giving technical teams complete visibility into IT performance at a single glance. Done wrong, it creates information overload, with too much of everything – too many graphs, colors, widgets, and other sources of information, making it at best deceptive and at worst completely useless.

Obviously, there’s no dashboard big enough to display every possible metric for every possible stakeholder, which is why the key to making a well-organized, informative dashboard that doesn’t confuse the viewer is knowing which metrics to track. By sticking to the absolute “must haves,” you’ll make sure that users can find mission-critical information first. But how should you choose which metrics to track? We’ve put our hard-won dashboard expertise to work and identified four key metric groups that no dashboard should be considered complete without.

Global metrics

System uptime and availability. Availability is one of the most important metrics you can use to determine your network’s performance, because it’s a metric that everyone can see the effects of immediately. For a business, it’s critical when it comes to making sure that the services provided to users are consistently available.

Overall resource utilization (CPU, memory, disk storage, etc.). Think of tracking resource utilization like keeping tabs on your phone’s battery life. You need to track CPU, memory, disk storage, and network usage to keep everything running smoothly. Keeping an eye on those metrics will help you fix small issues before they turn into gigantic problems.

Top critical issues or alerts. Speaking of problems, they can and will happen – and when they do, you’ll naturally want to know about them as soon as possible. An alert can be as simple as a notification of a system update, or it can draw attention to an unusual spike in errors. It could also call attention to a major emergency that demands immediate attention. Either way, no effective dashboard is complete without them.

SLA compliance status. If you’re running a business, monitoring SLA compliance status lets you see service availability and performance, which in turn guarantee customer satisfaction. It allows for quick detection of issues, making proactive management and resolution possible before customers feel any impact.

Infrastructure metrics

Server performance (CPU, RAM, disk I/O). Tracking the response time, central processing unit (CPU) utilization, memory consumption, and network bandwidth of a server helps guarantee a functional user experience. It involves keeping an eye on CPU and RAM utilization, disk I/O (input and output operations involving a physical disk), plus a variety of other sub-metrics.

Application health. Monitoring application health involves collecting, analyzing, and interpreting data about an application’s performance, availability, and behavior. It’s mission-critical because it can help you detect and troubleshoot problems, optimize resource utilization, and provide the application’s users with the quality experience they expect.

Storage usage and trends. Keeping track of storage usage on your dashboard gives you a real-time view of storage metrics as well as predictive analytics (useful for capacity planning) and proactive issue detection, across on-premise and cloud storage environments. Like so many other monitored metrics, its purpose is to maintain optimal storage performance while preventing potential issues before they impact any business operations.

Database performance metrics. Basically, database monitoring is how you measure what you want to improve. It’s what you do before you start performance tuning. Keeping track of your database on your dashboard makes this possible by collecting performance metrics, so that you’re always aware of whether your database can fully support your applications and respond quickly to queries.

Network metrics

Bandwidth utilization and traffic patterns. Bandwidth refers to the maximum data transmission rate on a network at a particular time. Having this metric on your dashboard will let you easily track the amount of bandwidth your network is using and make you immediately aware if you run over the bandwidth threshold.

Latency and packet loss. Latency, or network delay, is a network performance metric that measures the amount of time it takes to transfer data from one destination to another. Consistent delays or unusual spikes in delay time usually mean that you have a major network performance issue. Tracking latency and packet loss on your dashboard will let you know if data transfers are taking too long, while also helping you make sure that any lost data packets get to their destinations.

Interface status and error rates. A network interface can be either networking hardware or a software interface. Monitoring them on your dashboard lets you see each and every network device, and tracking their performance is important when it comes time to identify the root causes of poor performance and network bottlenecks.

Firewall and VPN tunnel status. Monitoring the status of Firewalls and VPN tunnels is important because (among other things) it keeps you aware of whether your VPN tunnel interface is up and available for passing traffic, and whether the destination IP address being monitored is reachable. At the same time, you’ll also have access to real-time information about how your firewall is working, which will keep you aware of any security holes or incorrect settings before they become major problems.

Security metrics

Unauthorized access attempts. Unauthorized access is a big risk to businesses, jeopardizing sensitive data and disrupting operations. You can track attempts by unauthorized users to gain access to any website, server, device, or app by monitoring user activity on your dashboard. This data can also be labeled and sorted so that you can easily interpret it at a glance.

Endpoint security status (AV, patching). Endpoints are basically any devices that connect to networks, including laptops, mobile phones, and IoT devices. The more of them you have, the greater your chances of data loss and cyber threat entry. Monitoring the critical junctures of endpoints on your dashboard will help you identify and prevent threats while making sure that you have quick response measures in place to protect your data and systems.

Compliance and audit logs. Compliance and audit logs are there to make sure errors are noticed and fixed, keep you compliant with regulatory requirements, improve business security, and detect fraud. Monitor them on your dashboard, and you’ll have real-time visibility into your compliance posture as well as immediate alerts when a potential violation is detected.

Active security alerts or anomalies. Continuously keeping an eye on your systems and network lets you detect threats (anything from malware to abnormal activities and unauthorized access) before they escalate and cause real damage. In turn, this helps you maintain user trust, avoid downtime, and comply with data security regulations.

These metrics should give any dashboard a solid foundation that can be easily customized to meet specific business or operational goals.

The Zabbix Advantage

One of Zabbix’s most important features has always been our easily customizable dashboards, which allow users to see and analyze even the most complex monitoring data at a single glance. When it’s time to keep tabs on the essential metrics we identified above, Zabbix dashboards allow anyone (or any infrastructure team) to efficiently monitor network performance, manage resource usage, and guarantee device/application availability.

Zabbix’s graphing and visualization features make it easy to see historical trends and make comparisons. You can choose whatever visualization format is best for a particular set of data, including line graphs, bar charts, pie charts, gauges, and more. Not only that, Zabbix dashboard widgets can communicate with each other, serve as data sources for other widgets, and dynamically update the information they display based on the data source.

To learn more about the flexibility of Zabbix dashboards and see how they can help you track just about any metric imaginable, contact us.

The post Building a Monitoring Dashboard: Which Metrics to Track? appeared first on Zabbix Blog.

How to restrict Amazon S3 bucket access to a specific IAM role

2025-02-14 Chris Craig

Post Syndicated from Chris Craig original https://aws.amazon.com/blogs/security/how-to-restrict-amazon-s3-bucket-access-to-a-specific-iam-role/

February 14, 2025: This post was updated with the recommendation to restrict S3 bucket access to an IAM role by using the aws:PrincipalArn condition key instead of the aws:userid condition key.

April 2, 2021: In the section “Granting cross-account bucket access to a specific IAM role,” we updated the second policy to fix an error.

July 11, 2016: This post was first published.

Customers often ask how to limit access to an Amazon Simple Storage Service (Amazon S3) bucket to only a specific AWS Identity and Access Management (IAM) user or role. A popular approach has been to use the Principal element to list the users or roles who need access to the bucket. However, the Principal element needs the exact values of the user ARN, role ARN, or assumed-role ARN. It does not support using a wildcard (*) to include all role sessions, nor does it allow you to use policy variables.

In this blog post, we show how to restrict S3 bucket access to a specific IAM role or user within an account by using the Conditions element. Even if another user in the same account has an Admin policy or a policy with s3:*, they will be denied access if they are not explicitly listed in the Conditions element. You can use this approach, for example, to limit access to a bucket with sensitive content or additional security requirements.

Solution overview

The solution in this post uses a bucket policy to restrict access to an S3 bucket, even if an entity has access to the full API of S3 through an attached identity-based policy. The following diagram illustrates how this works for accessing an S3 bucket within the same account as your IAM user or IAM role. We recommend that you use IAM roles, and only use IAM users for use cases that aren’t supported by federated users.

Figure 1: Diagram illustrating how to access an S3 bucket within the same account as your IAM user or IAM role

The workflow in Figure 1 is as follows:

The IAM user’s policy and the IAM role’s identity-based policy grant access to “s3:*”.
The S3 bucket policy associated with Bucket B restricts access to only the IAM role. This means that only the IAM role is able to access its content.
Both the IAM user and the IAM role can access other S3 buckets (for example, Bucket A) in the account. The IAM role is able to access both buckets, but the user can access only the S3 buckets without the bucket policy attached to them. Even though both the role and the user have full “s3:*” permissions, the bucket policy negates access to the bucket for anyone that has not assumed the role.

The main difference in the cross-account approach is that every bucket must have a bucket policy attached to allow access to the IAM role from the other account. The following diagram illustrates how this works in a cross-account deployment scenario.

Figure 2: Diagram illustrating how to access an S3 bucket in a different account than your IAM role

The workflow in Figure 2 is as follows:

The IAM role’s identity-based policy and the IAM users’ policy in the bucket account both grant access to “s3:*”
Bucket policy B denies access to all IAM users and roles except the role specified, and the policy defines what the role is allowed to do with the bucket.
Bucket policy A allows access to the IAM role from the other account.
The IAM user and IAM role can both access Bucket A because the IAM user is in the same account and there is an explicit Allow in bucket policy A for the role. The role can access both buckets because the Deny in bucket policy B is only for principals other than the IAM role.

Using the `aws:PrincipalArn` condition

You can use different types of condition keys to compare details about the principal making the request with the principal properties that you specify in the policy. We recommend that you use the aws:PrincipalArn key. The aws:PrincipalArn key compares the Amazon Resource Name (ARN) of the principal that made the request with the ARN that you specify in the policy.

You could also use the aws:userid policy variable to uniquely identify a user or role in their explicit Deny statements. There is added complexity with using aws:userid to find the value because you have to perform an API call using valid credentials. When working with IAM roles this activity has additional complexity because you are required to get the AssumedRoleUser information, which will not only include the unique role ID, but also the role-session-name that was provided while assuming the role. For example, the aws:userid for an AssumedRoleUser will be as follows:

aws:userid – AROADBQP57FF2AEXAMPLE:role-session-name

It becomes inconvenient to manage and track these IDs when you have a large list of users and roles to be included in the policy.

To mitigate these challenges, we recommend that you use the aws:PrincipalArn condition key. For IAM roles, the request context returns the ARN of the role, not the ARN of the user that assumed the role. AWS recommends that you specify the ARN for resources in policies instead of unique IDs and that you perform IAM policy audits on a periodic basis. Let’s look at how to use the condition key in an IAM policy.

Granting same-account bucket access to a specific role

When accessing a bucket from within the same account, in most cases it is not necessary to use a bucket policy because the policy defines access that is already granted by the user’s direct IAM policy. S3 bucket policies are usually used for cross-account access, but you can also use them to restrict access through an explicit Deny. The Deny would be applied to all principals whether they were in the same account as the bucket or within a different account.

In this case, you use the IAM user or role ARN with the aws:PrincipalArn condition key in a StringNotEquals or StringNotLike condition with a wildcard string. In addition, you use the aws:PrincipalARN key to compare the ARN of the principal that made the request with the ARN that you specify in the policy. Using a conditional logic element allows for the use of a wildcard string to allow for any role session name to be accepted.

Once you have the ARN of the role to which you want to allow access, you need to block the access of other users from within the same account as the bucket. An example policy to block access to the bucket and its objects for users that are not using the IAM role credentials would look like the following.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::amzn-s3-demo-bucket",
        "arn:aws:s3:::amzn-s3-demo-bucket/*"
      ],
      "Condition": {
        "StringNotEquals": {
          "aws:PrincipalArn": [
            "arn:aws:iam::111122223333:role/<ROLE-NAME>"
          ]
        }
      }
    }
  ]
}

Use this same policy for IAM users as shown below.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::amzn-s3-demo-bucket",
        "arn:aws:s3:::amzn-s3-demo-bucket/*"
      ],
      "Condition": {
        "StringNotEquals": {
          "aws:PrincipalARN": [
            "arn:aws:iam::111122223333:role/<ROLE-NAME>”,
            “arn:aws:iam::111122223333:user/<USER-NAME>"
          ]
        }
      }
    }
  ]
}

Granting cross-account bucket access to a specific IAM role

When granting cross-account bucket access to an IAM user or role, you must define what the IAM user or role is allowed to do with the granted access. Learn more about the permissions needed to allow an IAM entity to access a bucket via the CLI/API and the console in Writing IAM Policies: How to Grant Access to an Amazon S3 Bucket. Using the information found in this blog post, an example bucket policy would look like the following.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::111122223333:role/<ROLE-NAME>"
            },
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::amzn-s3-demo-bucket"
        },
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::111122223333:role/<ROLE-NAME>"
            },
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:DeleteObject"
            ],
            "Resource": "arn:aws:s3:::amzn-s3-demo-bucket/*"
        },
        {
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::amzn-s3-demo-bucket",
                "arn:aws:s3:::amzn-s3-demo-bucket/*"
            ],
            "Condition": {
                "StringNotEquals": {
                    "aws:PrincipalARN": [
                        "arn:aws:iam::111122223333:role/<ROLE-NAME>"
                    ]
                }
            }
        }
    ]
}

To grant access to an IAM user in another account, you need to add the ARN for the IAM user to the aws:PrincipalArn condition as outlined in the previous section of this blog post. In addition to the aws:PrincipalArn condition, you would also need to add the IAM user’s full ARN to the Principal element of these policies. An example policy is shown below.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": [
                {
                    "AWS": [
                        "arn:aws:iam::444455556666:role/<ROLE-NAME>”,
                        “arn:aws:iam::444455556666:user/<USER-NAME>"
                    ]
                }
            ],
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::amzn-s3-demo-bucket"
        },
        {
            "Effect": "Allow",
            "Principal": [
                {
                    "AWS": [
                        "arn:aws:iam::444455556666:role/<ROLE-NAME>”,
                        “arn:aws:iam::444455556666:user/<USER-NAME>"
                    ]
                }
            ],
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:DeleteObject"
            ],
            "Resource": "arn:aws:s3:::amzn-s3-demo-bucket/*"
        },
        {
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::amzn-s3-demo-bucket",
                "arn:aws:s3:::amzn-s3-demo-bucket/*"
            ],
            "Condition": {
                "StringNotEquals": {
                    "aws:PrincipalARN": [
                        "arn:aws:iam::444455556666:role/<ROLE-NAME>”,
                        “arn:aws:iam::444455556666:user/<USER-NAME>"
                    ]
                }
            }
        }
    ]
}

In addition to including role permissions in the bucket policy, you need to define these permissions in the IAM user’s or role’s user policy. The permissions are added to a customer managed policy and attached to the role or user in the IAM console, with the following example policy document.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:ListAllMyBuckets",
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::amzn-s3-demo-bucket"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": "arn:aws:s3:::amzn-s3-demo-bucket/*"
    }
  ]
}

By following the guidance in this post, you restrict S3 bucket access to a specific IAM role or user in same-account and cross-account scenarios, even if the user has an Admin policy or a policy with “s3:*”. There are many applications of this logic in which requirements will vary across use cases. We recommend to employ the principle of least privilege wherever possible, and to grant only the minimum permissions that are required to perform necessary tasks.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Identity and Access Management re:Post or contact AWS Support.

Creating a Personal Assistant in Zabbix with Artificial Intelligence

2025-02-05 Cesar Caceres

Post Syndicated from Cesar Caceres original https://blog.zabbix.com/creating-a-personal-assistant-in-zabbix-with-artificial-intelligence/29596/

Zabbix is dedicated to monitoring IT infrastructures based on predetermined thresholds, such as servers, networks, and applications. Incorporating artificial intelligence (AI) into Zabbix as a complement allows a user to mitigate alerts based on these predetermined thresholds, offering possible causes and solutions to problems. This can help a user resolve incidents more efficiently.

In this article, we will explain how to integrate Zabbix and Google’s AI tool Gemini by using the API provided as well as a custom widget alternative.

First steps towards integration

You can find the repository in GitHub based on the Google Gemini model. You’ll need to create an account in Google AI Studio to obtain the required API.

Script configuration in Zabbix

From Zabbix version 7.0, access:

“Alerts” > “Scripts” > “Create Script.”

For this functionality, we designated the name as “Possible cause and solution.” Next, we can configure the parameters with the trigger event and the API generated in AI Studio. We then copy and get the script from the repository mentioned in the «Script» field, as in the following image:

Application in the problem panel

After configuration, we access the alerts panel and select a specific alert. We click on “AI Assistant” and access the functionality that was previously named as “Possible cause and solution.”

The following images present an example of an agent installed on a notebook.

Possible cause:

Possible solution:

The AI will be able to provide a precise solution for each problem presented, allowing us to progressively optimize the predetermined thresholds.

Creating accurate personalized dashboards for the user is essential. With this in mind, we propose the creation of an AI-based widget called “What are you working on?” (¿Qué harías tu? in Spanish), which analyzes the current state of the problem presented in Zabbix.

This concept integrates all the functionalities present in the widget (including Summary, Perspectives, Diagnosis, Comparison, and Forecast), since the used prompt can indicate whether it is necessary to make adjustments to the strategic plan or predict future trends based on the panel data built.

To exemplify how the “What are you working?” widget works, let’s consider the analysis of disk usage on our Zabbix Server.

The creation of personalized widgets from the official Zabbix page.

Once we have knowledge for the project, on the backend of our Zabbix Server we locate the route:

/usr/share/zabbix/widgets/

Then, we create a carpet called “insights” and copy the following repository. It is necessary to place the Gemini API in the file «assets/js/class.widget.php.js» in the field “YOUR_API_KEY.”

On the frontend, we go to “Administration” > “General” > “Modules.”

In the upper right corner, we click on “Scan Directory.” We have our widget to use:

After performing the scan, it is necessary to enable the widget, as it is disabled by default.

The importance of using AI in Zabbix

Let’s imagine a scenario with 100 monitored servers. Performance thresholds, Windows services, or other specific services can generate up to 50 weekly alerts. With the help of AI, it’s possible to reduce this number to a bare minimum, thanks to the weekly collection of possible causes and solutions.

This ground-level approach allows users to solve problems faster, but also improves overall health by minimizing necessary adjustments to the Zabbix server.

Implementing AI locally

Using a dedicated server with open source AI models like HuggingFace, it’s possible to implement the AI locally and create a database collecting the possible causes and solutions of the events.

The AI will learn from repetitive events, offering more accurate answers in the future. The analysis of possible trends can be based on the generated alerts. In this way, we can optimize our alerts and put artificial intelligence to work understanding and solving our problems.

Conclusion

The model we use is project-oriented. We are constantly evolving artificial intelligence, and we must use the model we know best. language is distinct due to the orientation of the prompts used for the answers and the learning we can provide, either by making requests to specific artificial intelligence platforms or by using it locally.

The post Creating a Personal Assistant in Zabbix with Artificial Intelligence appeared first on Zabbix Blog.

Monitoring Sensor Data with Zabbix and Modbus Protocol

2025-01-23 Nyein Chan Zaw

Post Syndicated from Nyein Chan Zaw original https://blog.zabbix.com/monitoring-sensor-data-with-zabbix-and-modbus-protocol/29471/

This week’s blog entry comes to us from Nyein Chan Zaw, who is based in Bangkok, Thailand and works as an Infrastructure Specialist for Green Will Solution. Read on to see how he uses his integrating a Modbus protocol with Zabbix to monitor data from temperature, humidity, and smoke sensors — and display their metrics on a Zabbix dashboard.

Step 1: Collecting Sensor Data via Modbus Protocol

This snapshot shows how all three sensors are synchronized with the Modbus protocol, confirming that the communication is operational.

In the initial setup, the temperature, humidity, and smoke sensors transmit their data to the Modbus protocol. This data synchronization can be visualized using Modbus polling software, where the values from each sensor are displayed in real-time.

Step 2: Configuring Modbus Files on Zabbix Agent

This snapshot demonstrates the configuration of three MB files corresponding to the three sensors.

To enable Zabbix to communicate with Modbus, the Modbus configuration (MB) files must be set up in the Zabbix Agent configuration file on the Zabbix server. Each sensor requires an individual MB configuration entry, specifying the Modbus parameters such as function code, register address, and data type.

Step 3: Creating a Host for Modbus Protocol in Zabbix

Next, a Zabbix host must be created to represent the Modbus protocol device.

This snapshot highlights the host creation process with the associated IP address and configuration details.

During this process, assign the Modbus protocol’s IP address as the host’s interface. Configure the interface to communicate with the Zabbix server using the Zabbix agent.

Step 4: Configuring Items for Each Sensor

Each sensor requires an item in Zabbix to capture its data.

This snapshot shows how items are configured for each sensor.

For every item, specify the Name for identification (e.g., Temperature Sensor). Define the Key, which includes the Modbus protocol function and register settings, to ensure accurate data retrieval.

Step 5: Viewing and Utilizing Sensor Data in Zabbix

This snapshot displays the Zabbix dashboard, showcasing data from all three sensors.

Once the host and items are configured, Zabbix starts collecting data from the Modbus protocol. This data is displayed in the Zabbix interface, where metrics for temperature, humidity, and smoke are updated in real-time. Additionally, a custom dashboard can be created to visualize all three sensors’ data at a glance, providing actionable insights for monitoring and decision-making.

Conclusion

Integrating Modbus with Zabbix streamlines the monitoring of sensor data, making it easy to collect, visualize, and act upon critical metrics. This process demonstrates Zabbix’s flexibility and scalability in managing industrial protocols and data sources, ensuring robust monitoring for diverse applications.

If you’re looking to implement similar solutions or need help integrating Modbus with Zabbix, feel free to reach out in the comments below!

The post Monitoring Sensor Data with Zabbix and Modbus Protocol appeared first on Zabbix Blog.

File Integrity Monitoring with Zabbix

2024-12-12 Paulo R. Deolindo Jr.

Post Syndicated from Paulo R. Deolindo Jr. original https://blog.zabbix.com/file-integrity-monitoring-with-zabbix/29460/

We have often seen Zabbix used as a simple tool for monitoring network assets as well as Information and Communication Technology (ICT) infrastructure. While this concept is not incorrect, it is equally important to understand that with the advancement of Zabbix versions, more and more functionalities have been made available for other types of monitoring, enabling advanced data analysis and stunning visualizations through new and modern widgets in the frontend layer.

In this short blog post, we will explore some of the existing yet under-discussed features of Zabbix that contribute to the maturity of the cybersecurity discipline within organizations — a topic that is becoming increasingly critical in the corporate environment.

Table of Contents

FIM – File Integrity Monitoring

FIM is a very common concept among information security tools, specifically in tools like SIEM/XDR (Security Information Event Management/Extended Detection and Response). The name is quite suggestive of its usability, but while some tools highlight this feature as one of their main functionalities, it is also available for those who use Zabbix – just not explicitly labeled under this name.
Here, we will approach FIM as a concept rather than just a functionality. This is because we aim to achieve a result, not merely have a menu with a name to claim compliance while using our tool. In fact, the outcome needs to be more important than mere “marketing.”

What should we expect from FIM?

Imagine that your servers have certain directories and/or files so critical that you cannot afford to neglect monitoring them for changes, insertions, or deletions. Additionally, these files may have owners and properties that must not be altered – otherwise, the systems that depend on them might lose the ability to read or execute their functions. This, at a minimum, is what we expect from FIM as a functionality.
To illustrate this a bit further, consider a database service like MariaDB:

# ls -lR /etc/mysql/
/etc/mysql/:
total 24
drwxr-xr-x 2 root root 4096 Jun 25 18:40 conf.d
-rwxr-xr-x 1 root root 1740 Nov 30 2023 debian-start
-rw------- 1 root root 544 Jun 25 18:43 debian.cnf
-rw-r--r-- 1 root root 1126 Nov 30 2023 mariadb.cnf
drwxr-xr-x 2 root root 4096 Sep 30 16:36 mariadb.conf.d
lrwxrwxrwx 1 root root 24 Oct 20 2020 my.cnf -> /etc/alternatives/my.cnf
-rw-r--r-- 1 root root 839 Oct 20 2020 my.cnf.fallback

/etc/mysql/conf.d:
total 8
-rw-r--r-- 1 root root 8 Oct 20 2020 mysql.cnf
-rw-r--r-- 1 root root 55 Oct 20 2020 mysqldump.cnf

/etc/mysql/mariadb.conf.d:
total 40
-rw-r--r-- 1 root root 575 Nov 30 2023 50-client.cnf
-rw-r--r-- 1 root root 231 Nov 30 2023 50-mysql-clients.cnf
-rw-r--r-- 1 root root 927 Nov 30 2023 50-mysqld_safe.cnf
-rw-r--r-- 1 root root 3795 Sep 30 16:36 50-server.cnf
-rw-r--r-- 1 root root 570 Nov 30 2023 60-galera.cnf
-rw-r--r-- 1 root root 76 Nov 8 2023 provider_bzip2.cnf
-rw-r--r-- 1 root root 72 Nov 8 2023 provider_lz4.cnf
-rw-r--r-- 1 root root 74 Nov 8 2023 provider_lzma.cnf
-rw-r--r-- 1 root root 72 Nov 8 2023 provider_lzo.cnf
-rw-r--r-- 1 root root 78 Nov 8 2023 provider_snappy.cnf

All the files, directories, and subdirectories listed above have already been configured, and the system (whatever it may be) is functioning perfectly. However, if someone suddenly decides to alter a configuration in the file /etc/mysql/mariadb.conf.d/50-server.cnf, this could be disastrous for the service. Regardless, the important thing to do is to monitor this scope and notify the relevant stakeholders so that an appropriate analysis can be conducted.

Zabbix can help with that. Let’s see how.

Zabbix and File Integrity Monitoring functions

Consider that the Zabbix agent is installed on the server to be monitored:

vfs.dir.count[/etc/mysql]

With this key, we can count the objects present within the /etc/mysql directory. Subsequently, we can create a trigger to be activated if there is any change related to the initial collection count, such as someone deleting or adding a file or directory in this location.

vfs.dir.size[/etc/mysql]

With this key, we can determine the total size in bytes used by the directories and configuration files. In the future, we can create a trigger that activates when this size changes, indicating the deletion or addition of a file.

vfs.file.exists[/etc/mysql/mariadb.conf.d/50-server.cnf]

Among several important files, we may have a greater interest in some configuration files, and we can validate their existence by creating a trigger that activates when such a file ceases to exist. This will clearly indicate that something important has disappeared.

In this case, the value “1” represents “OK” for the existence of the file.

vfs.file.cksum[/etc/mysql/mariadb.conf.d/50-server.cnf,sha256]

In addition to verifying the existence of the configuration file we consider important, we need to be informed if anything in it changes. This key handles that by generating a hash in a variety of possible formats, allowing a trigger to be activated in case of a hash change, which would reflect a file modification (unfortunately, we won’t know what exactly was altered).

vfs.file.regmatch[/etc/mysql/mariadb.conf.d/50-server.cnf,^max_connections\s+=\s+(\d+)]

We might have a specific parameter of interest – for example, the maximum number of connections allowed to the database. Monitoring this is important because if the configuration is set to the default value, it means that no “tuning” has been applied to the database. Alternatively, it could mean that someone simply deleted or commented out this line, causing it to be ignored by the system. Therefore, verifying whether the parameter exists and is properly configured is crucial.

In this case, the value “1” indicates that the regular expression was successfully found, meaning that the configuration or parameter we need to exist is indeed present.

vfs.file.regexp[/etc/mysql/mariadb.conf.d/50-server.cnf,^max_connections\s+=\s+(\d+),,,,\1]

Beyond verifying the existence and integrity of the file, it is also possible to determine what was changed within it. However, we would need to specify the configuration of interest using a regular expression. For example, considering that the maximum number of connections allowed by the database system is “x,” we can be alerted by a trigger if it changes to “y,” “z,” or any other value different from “x.” This setup allows us to monitor the parameter of interest with precision. This logic can be applied to any other parameter you consider important. Of course, there is another way to automate this process, but we will not cover that automation here.

In this case, the parameter defining the maximum number of connections is not only present, but we also know the exact number of connections. This way, we will have a history of the applied parameterization in case it is changed at any point.

vfs.file.owner[/etc/mysql/mariadb.conf.d/50-server.cnf]

vfs.file.owner[/etc/mysql/mariadb.conf.d/50-server.cnf,group]

The two keys above allow us to determine the owner of a file and (in the case of a Linux system) the owning group. We can also choose to monitor the user’s name or their UID in the system. Naturally, a trigger can be activated to alert us in case of an ownership change, indicating that someone might be “taking over” an important file in the system.

vfs.file.permissions[/etc/mysql/mariadb.conf.d/50-server.cnf]

The key above allows us to determine a file’s permissions—read, write, read and write, execution, or a special permission bit. Naturally, a trigger can be activated to alert us if there is any permission change in the file.

vfs.file.attr[/etc/mysql/mariadb.conf.d/50-server.cnf]

The key above does not exist by default. It was created with a UserParameter, which is a customization for verifying a command that, in this case, checks the attributes of a specific file. Consider the following command executed directly in your system’s terminal:

# lsattr /etc/mysql/mariadb.conf.d/50-server.cnf
--------------e------- /etc/mysql/mariadb.conf.d/50-server.cnf

What interests us are the attributes:

--------------e-------

If someone who invades the system modifies the attribute of a file (for example) using this command…

# chattr +A /etc/mysql/mariadb.conf.d/50-server.cnf
# lsattr /etc/mysql/mariadb.conf.d/50-server.cnf
-------A------e------- /etc/mysql/mariadb.conf.d/50-server.cnf

…it could mean that someone does not want the system to log when this file was accessed (refer to the chattr command manual). Additionally, any other attribute can be added or removed, which poses a risk to the system because these attributes can alter how files are accessed, stored on disk, and later read. Therefore, we can create a UserParameter as follows:

# cd /etc/zabbix/zabbix_agent2.d/
# echo "UserParameter=vfs.file.attr[*],lsattr \$1 | cut -d\" \" -f1" > attr.conf
# zabbix_agent2 -R userparameter_reload

Finally, we can test the reading of attributes directly from the terminal:

# zabbix_agent2 -t vfs.file.attr[/etc/mysql/mariadb.conf.d/50-server.cnf]
vfs.file.attr[/etc/mysql/mariadb.conf.d/50-server.cnf][s|-------A------e-------]

You can also try this now through the frontend.

When creating the item, don’t forget to create the trigger that should be activated in case there is a change in the attribute of a file, whatever it may be.

Paying attention to file access and modification times

To delve a bit deeper into the concept of FIM, we should ask ourselves if we are monitoring file access and modifications concerning their timestamps. In a way, if we have implemented everything proposed above, the answer is yes.

That said, there is an easier way to keep track of all the things we’ve discussed. It involves using this key:

vfs.dir.get[/etc/mysql]

When creating an item with this key, we will recursively obtain all its objects, such as subdirectories and files. The output format will be a JSON, which allows us to create LLD (Low-level Discovery) rules to automate FIM. Below is a small snippet of the monitoring output:

{
"basename": "mariadb.cnf",
"pathname": "/etc/mysql/mariadb.cnf",
"dirname": "/etc/mysql",
"type": "file",
"user": "root",
"group": "root",
"permissions": "0644",
"uid": 0,
"gid": 0,
"size": 1126,
"time": {
"access": "2024-11-30T23:01:01-0300",
"modify": "2023-11-30T01:42:37-0300",
"change": "2024-06-25T18:41:01-0300"
},
"timestamp": {
"access": 1733018461,
"modify": 1701319357,
"change": 1719351661
}
...

Considering that the output includes all objects from the main directory, this would be the most sensible approach to configure our FIM. However, it is necessary to create the LLD and prototypes. We will not cover this in detail in this article, but this is the path I recommend you follow.

Below is a “blueprint” for an LLD to create automated File Integrity Monitoring:

The “Master item”:

The “Dependent rule”:

The LLD Macro:

The item prototypes:

Below are the components of a trigger prototype (I created just one to symbolize a type of alert for file modification):

Name: Object: {#BASENAME} just changed

Event name: Object: {#BASENAME} just changed. Last hash: {ITEM.VALUE} The previous one: {?last(/MySQLDB/vfs.file.cksum["{#PATHNAME}",sha256],#2)} Object: {#BASENAME} just changed. Last hash: {ITEM.VALUE} The previous one: {?last(/MySQLDB/vfs.file.cksum["{#PATHNAME}",sha256],#2)}

Severity: Warning

Expression: last(/MySQLDB/vfs.file.cksum["{#PATHNAME}",sha256],#1)<>last(/MySQLDB/vfs.file.cksum["{#PATHNAME}",sha256],#2)

And then, some results:

Conclusion

The implementation of a robust File Integrity Monitoring system helps to ensure the security of IT infrastructure. Detecting unauthorized changes in critical files helps prevent attacks, identify security breaches, and ensure the integrity and availability of systems. With Zabbix, we have an effective solution to implement FIM, enabling process automation and the real-time visualization of changes. This monitoring not only reinforces protection against intrusions but also facilitates auditing and compliance with regulatory standards.

The main benefits of integrating File Integrity Monitoring with Zabbix include:

1. Early detection of changes in critical files, enabling quick responses.
2. Enhanced compliance with security regulations and internal policies.
3. Protection against malware and ransomware by identifying changes in essential files.
4. Ease of auditing with automated reports and modification histories.
5. Greater visibility and control over the integrity of data and systems in real time.
6. Operational efficiency through the automation of alerts and reports.
7. Improved proactive security, helping prevent attacks before they become critical.

By using Zabbix, organizations can strengthen their security posture and optimize risk management, ensuring that any unauthorized changes are detected and promptly corrected.

The post File Integrity Monitoring with Zabbix appeared first on Zabbix Blog.

An Introduction to Browser Monitoring

2024-12-03 Alexander Petrov-Gavrilov

Post Syndicated from Alexander Petrov-Gavrilov original https://blog.zabbix.com/an-introduction-to-browser-monitoring/29245/

Website and web application monitoring can vary from simple use cases to complex multi-step scenarios. To fully cover the scope of modern website monitoring requirements, Zabbix has introduced Browser item, a new item type that brings with it multiple accompanying improvements for simulating browser behavior and collecting website metrics.

Table of Contents

What is browser monitoring?

Browser monitoring allows users to monitor complex websites and web applications using an actual browser. It involves the constant tracking and analysis of the performance, reliability, and functionality of a website or web application from a real user perspective. This process ensures that key pages, features, and user navigation work as expected. By monitoring critical pages and flows specific to different businesses, companies can ensure optimal user experience, resolve potential or ongoing issues, and proactively address any potential problems.

Browser monitoring can be split into two main approaches:

Browser real user monitoring – Monitors how your web page or web application is performing, using real user data to analyze overall performance and user experience.
Browser synthetic monitoring – Analyzes application availability and performance, using scheduled testing to analyze website availability and emulate real user experience.

Since Zabbix is not a real person (yet) but is fully capable of emulating real user behavior on a website very precisely, we will focus on browser synthetic monitoring.

What business goals can we achieve with browser monitoring?

There are a multitude of goals that can be achieved, depending on what business we are running or expect to monitor, but some examples include:

Improving user experience

Browser monitoring helps ensure that users have a fast, smooth, and reliable experience on a website or web application. A positive user experience leads to higher user satisfaction and a greater likelihood of repeated visits or purchases.

Ensuring cross-browser and cross-device compatibility

Users access websites from a host of browsers and devices. Browser monitoring helps to detect compatibility issues that could affect certain users (e.g., JavaScript errors on specific browsers or layout shifts on mobile). By monitoring these scenarios, we can deliver a consistent experience across platforms, which is essential as multi-device usage continues to grow.

E-commerce checkout monitoring

Retailers can ensure a smooth checkout process by monitoring page load times, form interactions, and payment processing to confirm that users can easily complete purchases.

Form performance

Browser monitoring makes it easy to detect any issues preventing form completion, such as slow response times or broken validation. It also ensures a smooth, error-free experience to improve lead capture and gain more conversions.

Subscription renewal page monitoring

Subscription-based businesses rely on customers regularly renewing or upgrading their plans. Monitoring the subscription renewal page for load speed, usability, and any payment processing issues is essential, as issues on this page can directly the amount of renewals and lead to customer loss.

Supporting portal uptime

Many businesses provide a customer support portal where users can submit requests or use a knowledge database. Downtime or slow response times can lead to frustrated customers and an increased number of complaints.

How to set up browser monitoring

There are a lot of goals we can reach, but the question remains – how can we reach them with Zabbix? The answer is that we can use the already mentioned and newly introduced browser item.

Browser items gather information by running custom JavaScript code and fetching data via HTTP or HTTPS protocols. These items can mimic browser activities like clicking buttons, typing text, navigating across webpages, and performing other user interactions within websites or web applications.

Along with the script, users can specify optional parameters (name-value pairs) and set a timeout limit for the actions. But before we can actually use the item, we will need to configure Zabbix server or Zabbix proxy with a WebDriver, so that Zabbix can actually control browser trough scripts.

What is a WebDriver? A WebDriver controls a browser directly, mimicking user interactions through a local machine or on a remote server, enabling full browser automation. The term WebDriver includes both the language-specific bindings and the individual browser control implementations, often simply called WebDriver. WebDriver is designed to offer a straightforward and streamlined programming interface trough an object-oriented API which efficiently manages and drives browser actions.

In this guide, for instance, we’ll use a WebDriver with Chrome within a Docker container and make a script that includes actions like button clicks and text entry.

WebDriver installation

One of the simplest ways to install a WebDriver is to use containers. To install a chrome WebDriver on a local or remote machine, you can use Docker or any other preferred container engines:

# podman run --name webdriver -d \
-p 4444:4444 \
-p 7900:7900 \
--shm-size="2g" \
--restart=always -d docker.io/selenium/standalone-chrome:latest

Port 4444 will be the port on which the WebDriver will be listening and port 7900 will be used by NoVNC, which allows us to observe browser behavior in case a browser with a GUI is used.

Zabbix server/proxy configuration

After WebDriver is installed, we need to set up the communication between Zabbix and the driver. This can be done by editing the Zabbix server/proxy configuration file and updating the following parameters:

### Option: WebDriverURL
#       WebDriver interface HTTP[S] URL. For example http://localhost:4444 used with 
#       Selenium WebDriver standalone server.
#
# WebDriverURL=
WebDriverURL=http://localhost:4444

### Option: StartBrowserPollers
#       Number of pre-forked instances of browser item pollers.
#
# Range: 0-1000
# StartBrowserPollers=1
StartBrowserPollers=5

With the configuration parameters in place, we will now configure our Browser item to collect and monitor the list of upcoming Zabbix trainings from the training schedule page.

Creating a host

First, we need to navigate to the “Data collection” > “Hosts” section and create a host that represents our web page. This is more than anything – a logical representation. This means we don’t need any specific interfaces or additional configuration. The host in our example will look like this:

Creating a browser item

Since the data collection is done by items, we need to navigate to the “Items” section on the “Zabbix training schedule” host and create an item with the type “Browser.” It should look something like this:

Now comes the most important part – creating the script to monitor the schedule. Click on the “Script” field.

First, we will need to define what browser we will use, and any extra options we might want to specify, like screen resolution or whether the browser should run in headless mode or not. This can be done using the Browser object. The Browser object manages WebDriver sessions and initializes a session upon creation, then terminates it upon destruction. A single script can support up to four Browser objects.

var browser, result;
var  opts = Browser.chromeOptions();
opts.capabilities.alwaysMatch['goog:chromeOptions'].args = []
browser = new Browser(opts);
browser.setScreenSize(Number(1980), Number(1020));

In this snippet, we defined that we will use the Chrome browser with a GUI. As you can see, the screen size is set to the pretty common 1980x1020p.

Now we will need to define what the browser will be doing. This can be done by using such Browser object methods as navigate – to point to the correct URL of the web page or application and (for example) findElement/findElements to return some element of the web page.

findElement/findElements methods allow us to define strategies to locate an element and selectors to provide what to look for. Strategies and selectors can be of multiple kinds:
strategy – (string, CSS selector/link text/partial link text/tag name/Xpath)
selector – (string) Element selector using the specified location strategy

Let’s take a look at the next snippet:

try {
    browser.navigate("https://www.zabbix.com/");
    browser.collectPerfEntries("open page");

    el = browser.findElement("xpath", "//span[text()='Training']");
    if (el === null) {
     throw Error("cannot find training");
    }
    el.click();

    el = browser.findElement("link text", "Schedule");
    if (el === null) {
        throw Error("cannot find application form");
    }
    el.click();

In this snippet,

I am using a browser to navigate to the Zabbix page.
I collect a range of performance entries related to opening the page (download speed, response time, etc.).
I look for an element with the text “Training” using the XPath strategy, and the selector “Training.”
I click on it, which is a method to interact with elements.
In the next part, I use the strategy “link text” to find a link with the text selector “Schedule.”
I click on it

A visual description would look like this:

Browser interaction with the zabbix.com website

Now, let’s do some more clicking to filter out all other trainings and leave only trainings in Korean and Dutch:

    el = browser.findElement("link text", "English");
    if (el === null) {
        throw Error("cannot find application form");
    }
    el.click();

    el = browser.findElement("xpath", "//span[text()='English']");
    if (el === null) {
        throw Error("cannot find application form");
    }
    el.click();

    el = browser.findElement("xpath", "//span[text()='Korean']");
    if (el === null) {
        throw Error("cannot find application form");
    }
    el.click();

    el = browser.findElement("xpath", "//span[text()='Dutch']");
    if (el === null) {
        throw Error("cannot find password input field");
    }
    el.click();

    Zabbix.sleep(2000);

English is selected by default, so the script “unclicks” it. Then it selects Korean and Dutch and uses the sleep function to have some extra time for the page to load and make a screenshot of the currently opened page:

List of trainings with language filters applied on it

Now let’s get the list of dates so we can monitor which trainings we have left in 2024:

el = browser.findElements("xpath", "//*[contains(text(), ' 20')]");
var dates = [];
for (var n = 0; n < el.length; n++) { 
    dates.push(el[n].getText('2024')); 
}

// Remove entries that do not contain "2024"
dates = dates.filter(function(date) {
    return date.includes('2024');
});

dates = uniq(dates);

In this case we do a bit of a jump, and now search for all elements that contain text 20 (to include all years), but filter them out by year 2024 specifically, which later can be easily replaced with 2025. The end result contains all the upcoming training dates:

Items containing the upcoming training dates

The full host export with the script snippet can be found by following this link.

An additional example

But what if I want to fill in a form? Maybe to make a purchase, create an order, or just test a contact form? Good news – that’s an even simpler operation! Let’s take a look at this snippet:

// enter name
var el = browser.findElement("xpath", "//label[text()='First Name']/following::input");
if (el === null) {throw Error("cannot find name input field");}
el.sendKeys("Aleksandrs");

// enter last name
var el = browser.findElement("xpath", "//label[text()='Last name']/following::input");
if (el === null) {throw Error("cannot find name input field");}
el.sendKeys("Petrovs-Gavrilovs");

// enter cert number
var el = browser.findElement("xpath", "//label[text()='Certificate number']/following::input");
if (el === null) {throw Error("cannot find name input field");}
el.sendKeys("CT-2404-003");

// select version
var el = browser.findElement("css selector", "form#certificate_validation>fieldset>div:nth-of-type(5)>select");
if (el === null) {throw Error("cannot find name input field");}
el.sendKeys("7.0");

// check certificate
var el = browser.findElement("xpath", "//button[text()='Check Certificate']");
if (el === null) {throw Error("cannot find name input field");}
el.click();

This way, I can validate that my certificate is still valid!

As you can see, there are multiple ways to make a browser emulate user behavior and allow us to validate whether our pages and businesses are performing the way we expect them to! You can find even more examples in Zabbix documentation and Zabbix Certified Training, which I welcome you to attend!

The post An Introduction to Browser Monitoring appeared first on Zabbix Blog.

Proactively validate your AWS CloudFormation templates with AWS Lambda

2024-11-20 Kirankumar Chandrashekar

Post Syndicated from Kirankumar Chandrashekar original https://aws.amazon.com/blogs/devops/proactively-validate-your-aws-cloudformation-templates-with-aws-lambda/

AWS CloudFormation is a service that allows you to define, manage, and provision your AWS cloud infrastructure using code. To enhance this process and ensure your infrastructure meets your organization’s standards, AWS offers CloudFormation Hooks. These Hooks are extension points that allow you to invoke custom logic at specific points during CloudFormation stack operations, enabling you to perform validations, make modifications, or trigger additional processes. Among these, the Lambda hook is a powerful option provided by AWS. This managed hook allows you to use Lambda functions to validate your CloudFormation templates before deployment. By using a Lambda hook, you can invoke custom logic to check infrastructure configurations on create or update or delete CloudFormation resources or stacks or change sets, as well as create or update operations for AWS Cloud Control API (CCAPI) resources. This enables you to enforce defined policies for your infrastructure-as-code (IaC), preventing the deployment of non-compliant resources or emitting warnings for potential issues. In this blog post, you will explore how to use a Lambda hook to validate your CloudFormation templates before deployment, ensuring your infrastructure is compliant and secure from the start.

Introducing Lambda Hook

The Lambda hook is an AWS-provided managed hook with the type AWS::Hooks::LambdaHook. It simplifies the integration of custom logic into CloudFormation stacks. This powerful feature allows you to focus on building and testing your custom logic as a Lambda function, without the complexity of creating a hook from scratch.

By using the Lambda hook, you can activate a pre-built hook and deploy your custom logic into a Lambda function using familiar tools like AWS CLI or AWS Serverless Application Model (SAM) or AWS Cloud Development Kit (CDK). This approach reduces the number of components you need to manage in your workflow, allowing for more streamlined operations. The Lambda hook also offers flexible evaluation capabilities, enabling you to respond to specific template properties or configurations as needed.

One of the key advantages of the Lambda hook is the enhanced control it provides. You can benefit from features such as VPC integration, local logging, and granular resource management, all while leveraging the power of AWS Lambda functions. To get started with the Lambda hook, you’ll need to activate it in your AWS account. This activation process eliminates the need for authoring, testing, packaging, and deploying a custom hook using the AWS CloudFormation Command Line Interface (CFN-CLI), significantly simplifying your workflow.

Example Use Case: S3 Bucket Versioning Validation

This blog post demonstrates using the Lambda hook to validate S3 Bucket versioning before deployment. While focused on S3 buckets, this approach can be applied to other resource types, properties, stack, and change set operations.

By leveraging the Lambda hook, you’ll streamline custom logic integration into your CloudFormation stacks. The process involves:

Activating Lambda hook of type AWS::Hooks::LambdaHook in your account
Writing a Lambda function for validation
Providing the Lambda ARN as input to the hook

This example showcases how to enhance your infrastructure-as-code practices, ensuring compliant and secure deployments from the start.

Architecture

This section shows you how the Lambda hook and Lambda function work together to enhance your CloudFormation deployments.

Lambda hook and Lambda function

First, you need to create a Lambda function with the business logic to respond to the hook. Then, you need to create an IAM execution role with the necessary permissions to invoke the Lambda Function. Once you have the Lambda function and the IAM execution role, you can activate the AWS provided Lambda hook. Follow the steps in the documentation to activate a Lambda hook from the AWS console. Alternatively, you can activate it using the AWS Command Line Interface (AWS CLI) by using the activate-type and set-type-configuration commands. Lastly, you can also use AWS::CloudFormation::LambdaHook as a CloudFormation resource to activate and configure Lambda hook from a CloudFormation template. You can share this resource across your other accounts and regions using AWS CloudFormation StackSets by following this blog.

Lambda hook in action

The following diagram and explanation illustrate the step-by-step workflow of how Lambda hook integrates with your CloudFormation operations, providing a visual representation of the process from template creation to resource deployment or modification.

Lambda Hooks Architecture and its working in action

Diagram 1: Lambda hooks in action

The architecture diagram illustrates the step-by-step flow of how the Lambda hook is used during a CloudFormation stack operation.

Author a template: Author a CloudFormation template, including the necessary resources to configure.
Create the stack: The CloudFormation stack creation process has started, but the process of creating the defined resources in the template has not yet begun.
Request is received by CloudFormation service: When a resource creation, update, or deletion is requested, the CloudFormation service receives the request.
Invoke the Hook: The CloudFormation service then invokes the Lambda hook.
The hook invokes your the Lambda Function: The Lambda hook, in turn, triggers the execution of the Lambda function that was defined in the hook activation.
The Lambda function processes the request and responds back to the Hook: The Lambda function processes the request, performing validation, or additional tasks as required. The Lambda function then responds back to the Lambda hook.
The stack workflow progresses further in either continuing the resource creation/update/deletion with/without a warning or fails: Based on the Lambda function’s output, the Lambda hook either allows the stack operation to proceed with the resource operation (for example, creation of the resource), or deny the resource operation causing a rollback of the stack.

This workflow demonstrates how Lambda hook seamlessly integrates into the CloudFormation stack deployment process, allowing you to implement custom validations, enforce policies, and extend the capabilities of your infrastructure-as-code deployments through the power of Lambda functions. By leveraging the Lambda hook and the custom Lambda function, customers can extend the capabilities of their CloudFormation deployments, enabling advanced use cases such as resource validation, or additional task execution.

Sample Deployment

This section shows you how to enable the Lambda hook, which is of type AWS::Hooks::LambdaHook, and add the business logic in the Lambda function to validate the versioning configuration of an S3 bucket. The sample solution shown in this blog post demonstrates the hook triggering for the resource type AWS::S3::Bucket, and if you want to trigger this for every resource type, then you can use the Resource filter within Hook filters configuration that can take wildcard "AWS::*::*" as a value or multiple targets of resource types for example "AWS::S3::Bucket", "AWS::DynamoDB::Table", and you’ll also want to make sure that the Lambda Function has the logic to handle the additional resource type. You can also add additional Hook targets , for example to validate your STACK or CHANGE_SET.

In the example used in this blog post, you will configure the hook and activate on create and update operations operations. For more information about TargetFilters, see Hook configuration schema and for more information about Lambda hook see here. With these modifications, you need to consider two important points: First, you will need to handle the business logic to deal with different resource types in your Lambda function code. Second, additional pricing may apply based on your resource usage, for more details see the Lambda pricing page.

Creating the Lambda Function

You can create a Lambda function in several ways – on the AWS Console, using CloudFormation, using AWS CLI, or by directly invoking the API via SDK. In this section, we will cover creating a Lambda function with a few clicks on the AWS console. See Using Lambda with infrastructure as code (IaC) for deploying Lambda Function using SAM CLI, CDK or CloudFormation.

Create the Lambda function on the AWS console by following create a Lambda function with the console instructions and use the following sample Python code.

"""Example Lambda function called by AWS::Hooks::LambdaHook."""


import logging


HOOK_INVOCATION_POINTS = [
    "CREATE_PRE_PROVISION",
    "UPDATE_PRE_PROVISION",
    "DELETE_PRE_PROVISION",
]

TARGET_NAMES = [
    "AWS::S3::Bucket",
]

LOGGER = logging.getLogger()

LOGGER.setLevel("INFO")


def lambda_handler(event, context):
  """Define the entry point of the function."""
  try:
    request = event
    
    LOGGER.info(f"Request: {request}")

    invocation_point = request["actionInvocationPoint"]
    LOGGER.info(f"Invocation point: {invocation_point}")

    target_name = request["requestData"]["targetName"]
    LOGGER.info(f"Target name: {target_name}")
    
    clientRequestToken = request["clientRequestToken"]

    if (
      invocation_point not in HOOK_INVOCATION_POINTS
      or target_name not in TARGET_NAMES
    ):
      message = (
        f"Skipping {target_name} evaluation for {invocation_point}."
      )
      LOGGER.info(message)
      payload = {
        "clientRequestToken": clientRequestToken,
        "hookStatus": "SUCCESS",
        "errorCode": None,
        "message": message,
        "callbackContext": None,
        "callbackDelaySeconds": 0,
      }
      LOGGER.debug(payload)
      return payload

    target_model = request["requestData"]["targetModel"]

    resource_properties = (
      target_model.get("resourceProperties")
      if target_model and target_model.get("resourceProperties")
      else None
    )
    LOGGER.debug(f"Resource properties: {resource_properties}")

    versioning_configuration = (
      resource_properties.get("VersioningConfiguration")
      if resource_properties
      and resource_properties.get("VersioningConfiguration")
      else None
    )
    versioning_configuration_status = (
      versioning_configuration.get("Status")
      if versioning_configuration
      and versioning_configuration.get("Status")
      else None
    )
    if (
      not resource_properties
      or not versioning_configuration
      or not versioning_configuration_status
      or not versioning_configuration_status == "Enabled"
    ):
      message = "Versioning not set or not enabled for the S3 bucket."
      LOGGER.error(message)
      payload = {
        "clientRequestToken": clientRequestToken,
        "hookStatus": "FAILED",
        "errorCode": "NonCompliant",
        "message": message,
      }
    else:
      message = "Versioning is enabled for the S3 bucket."
      LOGGER.info(message)
      payload = {
        "clientRequestToken": clientRequestToken,
        "hookStatus": "SUCCESS",
        "errorCode": None,
        "message": message,
      }

    LOGGER.debug(payload)
    return payload
  except Exception as exception:
    message = str(exception)
    payload = {
      "clientRequestToken":  event["clientRequestToken"],
      "hookStatus": "FAILED",
      "errorCode": "InternalFailure",
      "message": message,
      "callbackContext": None,
      "callbackDelaySeconds": 0,
    }
    LOGGER.error(message)
    return payload

Example event sent to Lambda by the hook

{
  "clientRequestToken": "XXXXXXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",
  "awsAccountId": "111111111111",
  "stackId": "arn:aws:cloudformation:<AWS-Region>:111111111111:stack/example-stack/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "changeSetId": "None",
  "hookTypeName": "AWS::Hooks::LambdaHook",
  "hookTypeVersion": "00000001",
  "hookModel":
    {
      "LambdaFunction": "example-hook-function-name",
    },
  "actionInvocationPoint": "CREATE_PRE_PROVISION",
  "requestData":
    {
      "targetName": "AWS::S3::Bucket",
      "targetType": "AWS::S3::Bucket",
      "targetLogicalId": "Bucket",
      "callerCredentials": "None",
      "providerCredentials": "None",
      "providerLogGroupName": "None",
      "hookEncryptionKeyArn": "None",
      "hookEncryptionKeyRole": "None",
      "targetModel":
        {
          "resourceProperties":
            {
              "PublicAccessBlockConfiguration":
                { "RestrictPublicBuckets": true, "IgnorePublicAcls": true },
              "BucketName": "XXXXXXXXXXXXXXXXXXXXXXXXXXX",
              "VersioningConfiguration": { "Status": "Enabled" },
            },
          "previousResourceProperties": "None",
        },
    },
  "requestContext": { "invocation": 1, "callbackContext": "None" },
}

Explanation of the Lambda Function code

The Lambda Function code is designed to process the event received from the Lambda hook and validate the versioning configuration of the target S3 bucket resource. Here’s a detailed explanation of the code:

The function first extracts the relevant information from the event, including the invocation point and the target resource type.
It then checks if the current invocation point is in the configured HOOK_INVOCATION_POINTS list and if the target resource type is AWS::S3::Bucket. If not, the function returns a success response, skipping the validation for this particular invocation.

Note: this code that skips the validation is put here as a fallback logic in the event the user has not chosen to use TargetFilters. As this is a wildcard hook, without TargetFilters the hook will always be invoked for any AWS resource type described in the template, and since the hook targets preCreate, preUpdate, and preDelete by default, the hook will be invoked for these invocation points by default. To narrow the scope and reduce costs by avoiding to invoke the hook for all AWS resource type targets and invocation points, use TargetFilters.

Next, the function retrieves the resource properties from the event, specifically looking for the VersioningConfiguration property and its Status.
If the VersioningConfiguration property is not present or its Status is not set to Enabled, the function returns a failure response, indicating that the versioning is not enabled for the S3 bucket.
If the versioning is enabled, the function returns a success response.
The function also includes a fallback mechanism to return a failure response in case of any other exceptions. By evaluating this sample code, you can validate the versioning configuration of the S3 bucket during the CloudFormation stack creation and update processes, with your infrastructure-as-code policies.

Enabling Lambda Hook in your AWS Account/Region

Navigate to the AWS CloudFormation service on the AWS Console, then choose “Create Hook” → “with Lambda” from the main Hooks page:

Lambda Hooks Creation

Diagram 2: Create a Hook with Lambda console page

You will see the page explaining how the Lambda function work as a hook.

Lambda hooks provide lambda

Diagram 3: Provide a Lambda function to Hook Console page

Provide the Hooks details: the name, the Lambda function it should take, the type, and the mode. You can also create your execution role directly from the console by choosing “New execution role”.

Diagram 3: Provide a Lambda function to Hook Console page

You can review the Lambda hook and activate it from the next page.

Lambda Hooks details and configuration

Diagram 4: Review Lambda hook Console page

Test a sample

In this section, you will test the hook and the Lambda Function that you activated for a S3 bucket resource.

Create an S3 Bucket without versioning

AWSTemplateFormatVersion: "2010-09-09"

Description: This CloudFormation template provisions an S3 Bucket without versioning enabled

Resources:
  S3Bucket:
    DeletionPolicy: Delete
    Type: AWS::S3::Bucket
    Properties:
      BucketName: !Sub test-bucket-versioning-1-${AWS::Region}-${AWS::AccountId}

You will see the hook invoking Lambda function and the Lambda Function responding back with a failure message since the Versioning is not enabled.

When you create or update a stack with the template above, the Lambda hook will be invoked, and the Lambda Function will respond with a failure message since bucket versioning is not enabled. The Lambda Function code will extract the resourceProperties from the event, check the VersioningConfiguration property, and find that the Status is not set to Enabled. As a result, if you use the example template above where you describe the S3 bucket without versioning enabled, the Lambda Function will send a failure response back to the hook, causing the CloudFormation stack operation to fail as shown in the following screenshot.

Lambda Hooks Failure stack

Diagram 5: Lambda Hook failure Stack

Create an S3 Bucket with versioning enabled You can try creating an S3 Bucket with versioning enabled to see how Hooks assessment succeeded.

AWSTemplateFormatVersion: "2010-09-09"

Description: This CloudFormation template provisions an S3 Bucket with Versioning enabled

Resources:
  S3Bucket:
    DeletionPolicy: Delete
    Type: AWS::S3::Bucket
    Properties:
      BucketName: !Sub test-bucket-versioning-2-${AWS::Region}-${AWS::AccountId}
      VersioningConfiguration:
        Status: Enabled

In this case, you will see the hook invoking the Lambda function and getting a success message since the Versioning is enabled

When you create a stack with this CloudFormation template, the Lambda hook will be invoked, and the Lambda Function will respond with a success message since the versioning is enabled. The Lambda Function code will extract the resourceProperties from the event, check the VersioningConfiguration property, and find that the Status is set to Enabled. As a result, the Lambda Function will send a success response back to the hook, allowing the CloudFormation stack operation to proceed as shown in the following screenshot.

Diagram 6: Lambda Hook success Stack

By testing these two scenarios, you can verify that the Lambda hook and the associated Lambda Function are working as expected, enforcing the S3 bucket versioning policy during CloudFormation stack operations.

Clean up

To clean up, refer to the following documentation to delete CloudFormation Stacks: Deleting a stack on the AWS CloudFormation console or Deleting a stack using AWS CLI. Refer to documentation to deactivate the hook.

Conclusion

In this blog post, you explored the capabilities of CloudFormation Hooks and how they can be leveraged to extend the functionality of your infrastructure-as-code deployments. Specifically, you learned about the Lambda hook, a pre-built hook that simplifies the process of integrating custom logic into your CloudFormation stacks.

By activating the Lambda hook, and deploying a custom Lambda Function, you were able to validate the versioning configuration of an S3 bucket during the CloudFormation stack creation and update processes. This approach allows you to enforce infrastructure-as-code policies and ensure compliance at the point of deployment, rather than relying on post-deployment checks or indirect governance mechanisms. The ability to leverage familiar tools and workflows, such as the AWS CLI, AWS SAM, CI/CD pipelines, or the AWS CDK, makes it easier to incorporate custom logic into your CloudFormation deployments. This reduces the overhead and complexity associated with traditional hook orchestration and packaging, empowering you to streamline your infrastructure-as-code practices.

As you continue to build and deploy your cloud infrastructure, consider exploring the various CloudFormation Hooks available, for example, see aws-cloudformation/aws-cloudformation-samples and aws-cloudformation/community-registry-extensions GitHub repositories. The approach demonstrated in this blog post can be applied to other resource types supported by CloudFormation, allowing you to validate and enforce policies for a wide range of infrastructure components, from EC2 instances and VPCs to databases and application services.

About the Author

Kirankumar Chandrashekar is a Sr. Solutions Architect for Strategic Accounts at AWS. He focuses on leading customers in architecting DevOps, modernization using serverless, containers and container orchestration technologies like Docker, ECS, EKS to name a few. Kirankumar is passionate about DevOps, Infrastructure as Code, modernization and solving complex customer issues. He enjoys music, as well as cooking and traveling.

Stella Hie is a Sr. Product Manager Technical for AWS Infrastructure as Code. She focuses on proactive control and governance space, working on delivering the best experience for customers to use AWS solutions safely. Outside of work, she enjoys hiking, playing piano, and watching live shows.

Monitoring VMware vSphere with Zabbix

2024-11-20 Mateusz Romaniuk

Post Syndicated from Mateusz Romaniuk original https://blog.zabbix.com/monitoring-vmware-vsphere-with-zabbix/29193/

Zabbix is an open-source monitoring tool designed to oversee multiple IT infrastructure components, including networks, servers, virtual machines, and cloud services. It operates using both agent-based and agentless monitoring methods. Agents can be installed on monitored devices to collect performance data and report back to a centralized Zabbix server.

Zabbix provides comprehensive integration capabilities for monitoring VMware environments, including ESXi hypervisors, vCenter servers, and virtual machines (VMs). This integration allows administrators to effectively track performance metrics and resource usage across their VMware infrastructure.

In this post, I will show you how to set up Zabbix monitoring with a VMware vSphere infrastructure.

Table of Contents

Requirements:

Zabbix server
Access to the VMware vCenter Server

Step one: Create a Zabbix service user in the vCenter

First things first, let’s create a service user on the vCenter that will be used by the Zabbix server to collect data. To make life easier, in my lab setup the user [email protected] will have full Administrator privileges. Read-only permissions should be enough, however.

1. In the vSphere Client, choose Menu -> Administration -> Users and Groups. From the Users tab, select Domain vsphere.local, and click the ADD button to add a new user.

2. Type a username and password. Click ADD to create a new user.

3. Change the tab to Groups and select the Administrators group.

4. Find a new user zabbix, click on it and save. The user is added to the Administrators group.

5. From the Host and Clusters view, choose vCenter name and go to the Permissions tab. Click the Add button.

6. Choose a proper domain (vsphere.local), find the user zabbix, set the role to Administrator, and check Propagate to children. Click OK to give those permissions.

Step two: Make changes on the Zabbix server

Next, we need to edit zabbix_server.conf. In this file we need to enable the vmware collector process. It’s necessary to start VMware monitoring. FYI, I have installed Zabbix server in version 7.0.4.

1. Edit a configuration file zabbix_server.conf

vim /etc/zabbix/zabbix_server.conf

2. Find the StartVMwareCollectors parameter, delete “#” before it and change the value from 0 to at least 2. Save the file and exit.

Except for StartVMwareCollectors which is mandatory, it’s possible to enable and modify additional VMware parameters. You can find more details about them HERE.
VMwareCacheSize
VMwareFrequency
VMwarePerfFrequency
VMwareTimeout

3. Restart the zabbix-server service.

systemctl restart zabbix-server

Step three: Configure the VMware template on Zabbix

1.Log in to the Zabbix server via GUI – http://zabbix_server/zabbix. Go to the Hosts section under the Monitoring tab.

2. Create a new “Host.” Click Create Host in the right upper corner.

3. In the Host tab provide the following details:

Host name – type the name of the system that we want to monitor – here it is VMware Infrastructure.
Templates – type/find template name “VMware”, more info about VMware template you can find HERE.
Host groups – find/type “VMware(new)” host group.

At this point, go to the Macros tab.

4. In the Macros tab you need to provide 3 values/macros. These macros describes data that is needed to connect Zabbix to the VMware vCenter:

{$VMWARE.URL} – VMware service (vCenter or ESXi hypervisor) SDK URL (https://servername/sdk) that we want to connect.
{$VMWARE.USERNAME} – VMware service username created in the 1 section.
{$VMWARE.PASSWORD} – VMware service user password created in the 1 section.

Click the Add button.

5. A new Host was created and data collection is in progress.

6. Depending on the size of the infrastructure, data collection takes different amounts of time. Once configured, Zabbix will automatically discover VMs and begin collecting performance data. You can find an overview of the latest data in the Dashboard screen.

7. More specific and detailed data can be found in Latest data under the Monitoring tab.

In Host groups or Hosts choose the name of the item you are looking for (you can also click the “Select” button). Select the name of the ESXi host, the virtual machine, the vCenter name, the datastore, or all VMware information.

Zabbix can collect multiple metrics from VMware using its built-in templates. These metrics include:

– CPU usage
– Memory consumption
– Disk I/O statistics
– Network traffic
– Datastore capacity

In conclusion

Integrating Zabbix with VMware provides a robust solution for monitoring virtualized environments and enhancing visibility into system performance and resource utilization, while enabling timely alerts and responses to operational issues.

The post Monitoring VMware vSphere with Zabbix appeared first on Zabbix Blog.

Installing and checking Prometheus Podman exporter

Creating a template and template items

Creating a Discovery rule in template

Creating a template Discovery rule: item prototypes

Creating a template Discovery rule: trigger prototype

Summary

Key Takeaways

Getting Started with Amazon Q Developer

Understanding the Impact of Good Prompts

What Makes an Effective Prompt?

Common Patterns to Avoid

Short or Vague Requests:

Overly Broad Questions:

Proven Techniques for Better Results

Quick Reference Prompt Template

Best Practices for Daily Use

Take Action Now

Additional Resources

What’s Next?

Broadcast and listen capabilities

Navigator widgets

Dashboard-level host broadcast

Selecting non-existing items

Advanced graph widget use cases

Data sets

Trigger and problem display

Aggregation

Time shift

Missing data

Defining widget value thresholds

Widgets with threshold support

Dashboards for MSPs

Dashboard visibility

Host permissions

Restricting access to widgets

Dashboard ownership

What is PagerDuty?

How to integrate PagerDuty with Zabbix

Best practices for integrating Zabbix and PagerDuty

Conclusion

Introduction:

Prerequisites

Planning

Conclusion

Resources

Solution overview

Solution implementation—two approaches

Terms

Prerequisites

Approach 1: A separate subdomain

Step 1: Create and name a new web ACL

Step 2: Create a string match condition on Referer header

Step 3: Associate the new rule with the relevant CloudFront distribution

Test the referer check rule

Approach 2: All content under the same domain, with filtering by path

Step 1: Decide what to protect

Step 2: Create and name a new web ACL

Step 3: Create string match conditions on the referer

Step 4: Associate the new rules with the relevant CloudFront distribution

Test the rules

Conclusion

Pre-requisites

About the Application

Overview

Walkthrough

Setting up the transformation environment

Configuring Amazon Q command line tool and authenticate

Customizing transformations

Executing the transformation

Reviewing and applying the changes

Troubleshooting

Clean up

Call to Action

Conclusion

About the authors

Deploying multiple redundant Zabbix servers

Deploying multiple Zabbix proxies

Configuring Proxy groups

Adding more Java gateways

Upgrading docker proxies with SQLite3 database