Zabbix 2.2 Monitoring System Overview
Zabbix Tutorial Slides - https://monitor.rice.edu/tutorial/
Zabbix Webinar Videos - https://monitor.rice.edu/webinars/
Introduction
Zabbix is distributed network monitoring software that collects data by polling and trapping. From this data Zabbix produces reports and statistics, which are accessed through a web-based front end. What Zabbix shows for an item is the value stored for that item in the backend database. Data is gathered at configured intervals, not on demand.
What’s new?
- IE6 is no longer supported.
- Systems with multiple interfaces are supported.
- New item keys and deprecated item keys.
- New macros and deprecated macros.
- A new agent is required to use the new item keys and macros.
For more information, see: https://www.zabbix.com/documentation/2.2/manual
Architecture Overview
Zabbix Server - Central process and repository where all configuration, statistical, and operational data are stored. Performs polling and trapping of data, calculates triggers, and sends notifications to users.
Zabbix Agent - Deployed on monitored servers/devices to actively monitor local resources and applications and report the gathered data to the Zabbix server. The Zabbix Agent can perform passive and active checks.
A passive check is data requested from the Agent by the Server.
An active check is data sent by the Agent to the Server.
Host groups
Host groups are containers of hosts and templates.
User group permissions are assigned per host group.
It is recommended to use host group names that include the name of the owning group.
Examples:
Group Name: Middleware Development & Integration - MDI
Host Groups:
MDI - Cyrus Mail Servers
MDI - Netscaler Servers
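The naming convention in the example above can be sketched as a small helper. This is only an illustration: the function name host_group_name is hypothetical and is not part of Zabbix; it just prepends the owner's abbreviation the way the MDI examples do.

```python
def host_group_name(owner_abbrev: str, description: str) -> str:
    """Build a host group name prefixed with the owning group's abbreviation."""
    owner = owner_abbrev.strip()
    desc = description.strip()
    if not owner or not desc:
        raise ValueError("owner abbreviation and description are both required")
    return f"{owner} - {desc}"


print(host_group_name("MDI", "Cyrus Mail Servers"))  # MDI - Cyrus Mail Servers
```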
Hosts and Templates
Hosts and Templates are almost the same on the backend.
They both hold Application names, Items, Triggers, Graphs, Screens, and Discovery Rules.
Templates
Templates contain Application names, Items, Triggers, Graphs, Screens, and Discovery Rules that can be applied to one or more hosts.
This allows systems with similar functions to be linked.
Changing a template will propagate to all linked hosts.
Unlinking a host from a template only removes the link; the items will still exist on the host unless you choose unlink and clear.
Templates should be stored in the appropriate host group of the owner.
Templates should not be shared between hosts that belong to different owners.
You can nest templates to separate services, applications, etc.
It is recommended to use template names that include the group name of the owner.
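The difference between unlinking and "unlink and clear" described above can be pictured with a toy model. This is a sketch only: the Host class and its methods are hypothetical and do not reflect how Zabbix stores hosts internally; they only mimic the visible behavior.

```python
class Host:
    """Toy model of a host and the items it received from templates."""

    def __init__(self, name):
        self.name = name
        # item key -> name of the template it came from, or None if local
        self.items = {}

    def link(self, template_name, item_keys):
        """Linking copies the template's items onto the host."""
        for key in item_keys:
            self.items[key] = template_name

    def unlink(self, template_name):
        """Plain unlink: items stay on the host but no longer follow the template."""
        for key, source in self.items.items():
            if source == template_name:
                self.items[key] = None

    def unlink_and_clear(self, template_name):
        """Unlink and clear: items that came from the template are removed."""
        self.items = {k: s for k, s in self.items.items() if s != template_name}


host = Host("monitor.rice.edu")
host.link("MDI - Template Cyrus", ["agent.ping", "net.tcp.service[imap]"])
host.unlink("MDI - Template Cyrus")
print(len(host.items))  # 2, the items are still on the host
```

After a plain unlink the host keeps both items as local copies, which is why leftover items have to be cleaned up by hand if "unlink and clear" was not used.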
Hosts
Object to be monitored (servers, switches, routers, etc.)
Connected to by IP address or DNS name
Can be grouped in 'Host groups'
Can be linked to (multiple) 'Templates'
Can be accessed through 'Interfaces'
Monitoring Status can be changed from Monitored to Not Monitored
A fully qualified domain name is recommended as the host name.
Use a visible name only if required, to keep macros easy to use.
Items (Retrieve values)
Items gather data from a host.
Identified by a 'Key'
Can contain numerical or textual values.
An item is an individual metric that gathers one specific piece of data.
Item types are the kinds of checks Zabbix offers.
An item key is the function that the configured item type uses to gather the item's data.
Item keys must be unique within each host or template.
Item keys differ by item type (Zabbix Agent, SNMP, IPMI, SSH, Telnet, Simple Checks, Scripts, Calculated).
Contain timing settings such as update interval and history retention period.
Can and should be grouped in 'Applications'
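To make the idea of a key concrete: agent keys such as agent.ping or vfs.fs.size[/,free] consist of a name and optional parameters in square brackets. The sketch below splits a key into those parts and checks the uniqueness rule for one host. The helper names are hypothetical, and the parser is deliberately simplified (it assumes no quoted parameters and no nested brackets).

```python
from collections import Counter


def parse_item_key(item_key: str):
    """Split 'name[p1,p2]' into ('name', ['p1', 'p2']).

    Simplified: does not handle quoted parameters or nested brackets.
    """
    if "[" not in item_key:
        return item_key, []
    if not item_key.endswith("]"):
        raise ValueError(f"malformed item key: {item_key!r}")
    name, _, rest = item_key.partition("[")
    return name, rest[:-1].split(",")


def duplicate_keys(item_keys):
    """Return the keys that appear more than once on a single host or template."""
    return [key for key, count in Counter(item_keys).items() if count > 1]


print(parse_item_key("vfs.fs.size[/,free]"))  # ('vfs.fs.size', ['/', 'free'])
print(duplicate_keys(["agent.ping", "system.uptime", "agent.ping"]))  # ['agent.ping']
```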
Sample Zabbix Item Types:
- Zabbix Agents (active or passive)
- SNMP (v1, v2, v3)
- IPMI (limited)
- Simple Checks (icmp, port checks)
- Web Scenarios (cURL based [text])
- External Scripts (Server Side)
- Zabbix Trapper (zabbix_sender)
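As an illustration of the Zabbix Trapper type, the sketch below builds the kind of packet zabbix_sender sends to the server: a "ZBXD" header, a protocol byte, an 8-byte little-endian length, and a JSON body. This reflects my understanding of the sender protocol and should be checked against the documentation for your version. The function name and the item key app.queue.length are hypothetical, and the code only builds the bytes; it does not open a connection.

```python
import json
import struct


def build_sender_packet(host: str, key: str, value) -> bytes:
    """Build a zabbix_sender-style packet for one trapper item value."""
    payload = json.dumps({
        "request": "sender data",
        "data": [{"host": host, "key": key, "value": str(value)}],
    }).encode("utf-8")
    # "ZBXD" + protocol byte 0x01 + 8-byte little-endian payload length
    header = b"ZBXD\x01" + struct.pack("<Q", len(payload))
    return header + payload


packet = build_sender_packet("monitor.rice.edu", "app.queue.length", 42)
print(packet[:5])  # b'ZBXD\x01'
```

The host value must match the host name configured in Zabbix exactly, and the key must belong to an item of type Zabbix Trapper on that host.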
Triggers (Trigger by Item value)
Use logic to interpret item values
Can check against things like min, max, avg or last value
Textual recognition based on string or regex
Other values, such as dates and times, can also be used for comparison
Can become quite complex, but are very powerful
Outcome is always numerical
It is recommended to use a trigger description that describes the problem.
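A simple trigger expression in Zabbix 2.2 has the form {host:key.function(parameter)}operator constant. The helper below only assembles such a string so the parts are easy to see. The function name is hypothetical, and the host, key, and threshold are example values, not recommendations.

```python
def trigger_expression(host, key, function, parameter, operator, threshold):
    """Assemble a simple trigger expression: {host:key.function(parameter)}operator threshold."""
    return f"{{{host}:{key}.{function}({parameter})}}{operator}{threshold}"


# Average CPU load over the last 300 seconds is above 5
print(trigger_expression("monitor.rice.edu", "system.cpu.load[percpu,avg1]", "avg", 300, ">", 5))
# {monitor.rice.edu:system.cpu.load[percpu,avg1].avg(300)}>5
```

A matching trigger description for this expression might read "CPU load is too high on {HOST.NAME}", which states the problem rather than the check.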
Trigger Events (Created by Trigger activity)
Record trigger state changes and their duration
Are logged for future reference or audit
Can be acknowledged
Actions (Executed to do work after an Event occurs)
Work on the basis of conditions
Send out (repeated) notifications
Run scripts on hosts
Data Visualization
Things to Remember
Host name is the unique host name used by the agents and the server. (The configuration must match exactly.)
Use a fully qualified domain name for the host name (e.g. monitor.rice.edu).
Visible name is optional and not recommended unless necessary. It is used only in macros (e.g. {HOST.NAME}), lists, maps, etc.
Templates should be in the appropriate host group(s) of the owner. The naming of templates should include the name of the server type, function, application, or service. The template name should also show the owner name.
When using interfaces I strongly recommend using IP address over DNS name (unless it's a website or something whose address changes dynamically).
Do not set any interval under 60 seconds. Intervals should be configured properly. If the interval can be set longer, I recommend doing so. This is a shared environment; be polite.
There are few to no use cases for keeping the history of items longer than 1 day, so please set the history retention period to one day.
Use host group names that make it easy to identify the group that owns them, for example by prepending the user group name.
A passive check is data requested from the Agent by the Server.
An active check is data sent by the Agent to the Server.
Unlinking a host from a template only removes the link; the items will still exist on the host unless you choose unlink and clear.
Make sure your actions only affect the host group you are working with and the user/user group you are targeting.
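The advice above on update intervals can be made concrete with a little arithmetic: every item adds 1/interval new values per second to the shared server, and 86400/interval stored values per day of history. The sketch below is a rough estimate only; the function names and the item counts are made up for illustration.

```python
def new_values_per_second(items_by_interval: dict) -> float:
    """items_by_interval maps update interval (seconds) -> number of items."""
    return sum(count / interval for interval, count in items_by_interval.items())


def values_per_day(interval_seconds: int) -> int:
    """Number of values one item stores per day at the given update interval."""
    return 86400 // interval_seconds


# 300 items every 60 s plus 600 items every 300 s
print(new_values_per_second({60: 300, 300: 600}))  # 7.0
# The same items with the first group polled every 30 s instead
print(new_values_per_second({30: 300, 300: 600}))  # 12.0
print(values_per_day(60))  # 1440
```

Halving the interval of the first group raises the load from 7 to 12 new values per second, which is why longer intervals are preferred wherever they are acceptable.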