
InfluxDB by HTTP

Overview

This template is designed for the effortless deployment of InfluxDB monitoring by Zabbix via HTTP and doesn't require any external scripts.

Requirements

Zabbix version: 7.0 and higher.

Tested versions

This template has been tested on:

  • InfluxDB 2.0

Configuration

Zabbix should be configured according to the instructions in the Templates out of the box section.

Setup

This template works with self-hosted InfluxDB instances. Internal service metrics are collected from the InfluxDB /metrics endpoint. Organization discovery requires authorization with an API token. See the documentation: https://docs.influxdata.com/influxdb/v2.0/security/tokens/

Don't forget to change the macros {$INFLUXDB.URL} and {$INFLUXDB.API.TOKEN}. Also, see the Macros used section for a list of macros used to set trigger values. Note: some metrics may not be collected, depending on your InfluxDB instance version and configuration.
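
As a rough illustration of the setup above, the sketch below shows how the two macros might map onto HTTP requests. The `/health` and `/api/v2/orgs` paths and the `Token` authorization scheme come from the InfluxDB 2.x API; the helper name `buildRequests` and the shape of the object it returns are assumptions made for this example, and the template itself defines the exact URLs it uses.

```javascript
// Hypothetical helper: maps the template macros onto request definitions.
// baseUrl stands in for {$INFLUXDB.URL}, token for {$INFLUXDB.API.TOKEN}.
function buildRequests(baseUrl, token) {
  // Strip trailing slashes so paths can be appended safely.
  var root = baseUrl.replace(/\/+$/, '');
  return {
    metrics: { url: root + '/metrics', headers: {} },
    health: { url: root + '/health', headers: {} },
    // Only the organizations request carries the API token in this sketch.
    orgs: {
      url: root + '/api/v2/orgs',
      headers: { Authorization: 'Token ' + token }
    }
  };
}

var requests = buildRequests('http://localhost:8086/', 'EXAMPLE_TOKEN');
console.log(requests.orgs.url);
console.log(requests.orgs.headers.Authorization);
```

If organization discovery returns nothing, checking the token against the organizations endpoint by hand is a reasonable first step.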

Macros used

| Name | Description | Default |
|------|-------------|---------|
| {$INFLUXDB.URL} | InfluxDB instance URL | http://localhost:8086 |
| {$INFLUXDB.API.TOKEN} | InfluxDB API Authorization Token | |
| {$INFLUXDB.ORG_NAME.MATCHES} | Filter of discoverable organizations | .* |
| {$INFLUXDB.ORG_NAME.NOT_MATCHES} | Filter to exclude discovered organizations | CHANGE_IF_NEEDED |
| {$INFLUXDB.TASK.RUN.FAIL.MAX.WARN} | Maximum number of task run failures for the trigger expression. | 2 |
| {$INFLUXDB.REQ.FAIL.MAX.WARN} | Maximum number of query request failures for the trigger expression. | 2 |
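
The two organization filter macros are meant to be read together: an organization is kept when its name matches {$INFLUXDB.ORG_NAME.MATCHES} and does not match {$INFLUXDB.ORG_NAME.NOT_MATCHES}. The sketch below approximates that logic with JavaScript regular expressions. Zabbix evaluates these filters with PCRE, so edge cases may differ, and the function name `filterOrgs` and the organization names are made up for illustration.

```javascript
// Approximates the discovery filter with JavaScript regular expressions
// (Zabbix itself uses PCRE, so edge cases may differ).
function filterOrgs(names, matches, notMatches) {
  var include = new RegExp(matches);
  var exclude = new RegExp(notMatches);
  return names.filter(function (name) {
    return include.test(name) && !exclude.test(name);
  });
}

var names = ['prod', 'staging', 'test-org'];

// With the defaults, every organization passes unless its name
// contains the literal text CHANGE_IF_NEEDED.
console.log(filterOrgs(names, '.*', 'CHANGE_IF_NEEDED'));

// Overriding the second macro to exclude names starting with "test".
console.log(filterOrgs(names, '.*', '^test'));
```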

Items

Name Description Type Key and additional info
InfluxDB: Get instance metrics HTTP agent influx.get_metrics

Preprocessing

  • Check for not supported value

    Custom on fail: Discard value

  • Prometheus to JSON
InfluxDB: Instance status

Get the health of an instance.

HTTP agent influx.healthcheck

Preprocessing

  • Check for not supported value

    Custom on fail: Set value to: {"status":"fail"}]}

  • JavaScript: return JSON.parse(value).status == 'pass' ? 1: 0

  • Discard unchanged with heartbeat: 30m

InfluxDB: Boltdb reads, rate

Total number of boltdb reads per second.

Dependent item influxdb.boltdb_reads.rate

Preprocessing

  • JSON Path: $[?(@.name=="boltdb_reads_total")].value.first()

    Custom on fail: Discard value

  • Change per second
InfluxDB: Boltdb writes, rate

Total number of boltdb writes per second.

Dependent item influxdb.boltdb_writes.rate

Preprocessing

  • JSON Path: $[?(@.name=="boltdb_writes_total")].value.first()

    Custom on fail: Discard value

  • Change per second
InfluxDB: Buckets, total

Number of total buckets on the server.

Dependent item influxdb.buckets.total

Preprocessing

  • JSON Path: $[?(@.name=="influxdb_buckets_total")].value.first()

    Custom on fail: Discard value

  • Discard unchanged with heartbeat: 30m

InfluxDB: Dashboards, total

Number of total dashboards on the server.

Dependent item influxdb.dashboards.total

Preprocessing

  • JSON Path: $[?(@.name=="influxdb_dashboards_total")].value.first()

    Custom on fail: Discard value

  • Discard unchanged with heartbeat: 30m

InfluxDB: Organizations, total

Number of total organizations on the server.

Dependent item influxdb.organizations.total

Preprocessing

  • JSON Path: $[?(@.name=="influxdb_organizations_total")].value.first()

    Custom on fail: Discard value

  • Discard unchanged with heartbeat: 30m

InfluxDB: Scrapers, total

Number of total scrapers on the server.

Dependent item influxdb.scrapers.total

Preprocessing

  • JSON Path: $[?(@.name=="influxdb_scrapers_total")].value.first()

    Custom on fail: Discard value

  • Discard unchanged with heartbeat: 30m

InfluxDB: Telegraf plugins, total

Number of individual telegraf plugins configured.

Dependent item influxdb.telegraf_plugins.total

Preprocessing

  • JSON Path: $[?(@.name=="influxdb_telegraf_plugins_count")].value.sum()

    Custom on fail: Discard value

  • Discard unchanged with heartbeat: 30m

InfluxDB: Telegrafs, total

Number of total telegraf configurations on the server.

Dependent item influxdb.telegrafs.total

Preprocessing

  • JSON Path: $[?(@.name=="influxdb_telegrafs_total")].value.first()

    Custom on fail: Discard value

  • Discard unchanged with heartbeat: 30m

InfluxDB: Tokens, total

Number of total tokens on the server.

Dependent item influxdb.tokens.total

Preprocessing

  • JSON Path: $[?(@.name=="influxdb_tokens_total")].value.first()

    Custom on fail: Discard value

  • Discard unchanged with heartbeat: 30m

InfluxDB: Users, total

Number of total users on the server.

Dependent item influxdb.users.total

Preprocessing

  • JSON Path: $[?(@.name=="influxdb_users_total")].value.first()

    Custom on fail: Discard value

  • Discard unchanged with heartbeat: 30m

InfluxDB: Version

Version of the InfluxDB instance.

Dependent item influxdb.version

Preprocessing

  • JSON Path: $[?(@.name=="influxdb_info")].labels.version.first()

  • Discard unchanged with heartbeat: 3h

InfluxDB: Uptime

InfluxDB process uptime in seconds.

Dependent item influxdb.uptime

Preprocessing

  • JSON Path: $[?(@.name=="influxdb_uptime_seconds")].value.first()

InfluxDB: Workers currently running

Total number of workers currently running tasks.

Dependent item influxdb.task_executor_runs_active.total

Preprocessing

  • JSON Path: The text is too long. Please see the template.

    Custom on fail: Discard value

InfluxDB: Workers busy, pct

Percent of total available workers that are currently busy.

Dependent item influxdb.task_executor_workers_busy.pct

Preprocessing

  • JSON Path: $[?(@.name=="task_executor_workers_busy")].value.first()

    Custom on fail: Discard value

InfluxDB: Task runs failed, rate

Total number of failed runs across all tasks.

Dependent item influxdb.task_executor_complete.failed.rate

Preprocessing

  • JSON Path: The text is too long. Please see the template.

    Custom on fail: Discard value

  • Change per second
InfluxDB: Task runs successful, rate

Total number of runs successfully completed across all tasks.

Dependent item influxdb.task_executor_complete.successful.rate

Preprocessing

  • JSON Path: The text is too long. Please see the template.

    Custom on fail: Discard value

  • Change per second
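
Most of the dependent items above follow the same pattern: pick one metric out of the "Prometheus to JSON" output with a JSON Path filter, then optionally turn the counter into a rate. The sketch below mimics those two steps in plain JavaScript. The sample values are invented, the function names are hypothetical, and the row shape (`name`, `value`, `labels`) is an assumption about what the preprocessing step emits.

```javascript
// Sample of what "Prometheus to JSON" might emit for two metrics
// (values are illustrative, not taken from a real instance).
var metrics = [
  { name: 'boltdb_reads_total', value: '1500', labels: {} },
  { name: 'influxdb_buckets_total', value: '4', labels: {} }
];

// Mirrors $[?(@.name=="<metric>")].value.first():
// keep entries with the given name and return the first value.
function firstValue(data, metricName) {
  var hits = data.filter(function (m) { return m.name === metricName; });
  if (hits.length === 0) {
    // The template handles this case with "Custom on fail: Discard value".
    return null;
  }
  return Number(hits[0].value);
}

// Mirrors "Change per second": (value - previous value) / elapsed seconds.
function changePerSecond(prev, curr) {
  return (curr.value - prev.value) / (curr.clock - prev.clock);
}

var reads = firstValue(metrics, 'boltdb_reads_total');
var rate = changePerSecond({ value: 900, clock: 0 }, { value: reads, clock: 60 });
console.log(rate); // 10
```

Returning `null` for a missing metric stands in for the discard step, which is why an item may simply have no data on instances that do not expose a given metric.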

Triggers

Name Description Expression Severity Dependencies and additional info
InfluxDB: Health check was failed

The InfluxDB instance is not available or unhealthy.

last(/InfluxDB by HTTP/influx.healthcheck)=0 High
InfluxDB: Version has changed

InfluxDB version has changed. Acknowledge to close the problem manually.

last(/InfluxDB by HTTP/influxdb.version,#1)<>last(/InfluxDB by HTTP/influxdb.version,#2) and length(last(/InfluxDB by HTTP/influxdb.version))>0 Info Manual close: Yes
InfluxDB: has been restarted

Uptime is less than 10 minutes.

last(/InfluxDB by HTTP/influxdb.uptime)<10m Info Manual close: Yes
InfluxDB: Too many tasks failure runs

Number of failed runs completed across all tasks is too high.

min(/InfluxDB by HTTP/influxdb.task_executor_complete.failed.rate,5m)>{$INFLUXDB.TASK.RUN.FAIL.MAX.WARN} Warning
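
The last trigger uses `min(...,5m)` rather than `last(...)`, so it fires only when the failure rate stays above the threshold for the whole window. The sketch below imitates that evaluation on invented samples. The function names and the sample data are hypothetical, and the window boundary handling is a simplification of how Zabbix selects values.

```javascript
// Each sample is { clock: unix seconds, value: number }.
// Returns the smallest value seen in the window, or null if it is empty.
function minOverWindow(samples, now, windowSeconds) {
  var recent = samples.filter(function (s) {
    return s.clock > now - windowSeconds && s.clock <= now;
  });
  if (recent.length === 0) return null;
  return Math.min.apply(null, recent.map(function (s) { return s.value; }));
}

// Mirrors min(<item>,5m) > {$INFLUXDB.TASK.RUN.FAIL.MAX.WARN}.
function tooManyTaskFailures(samples, now, maxWarn) {
  var lowest = minOverWindow(samples, now, 300);
  return lowest !== null && lowest > maxWarn;
}

var now = 1000;
var sustained = [
  { clock: 760, value: 3 }, { clock: 820, value: 4 }, { clock: 880, value: 3 },
  { clock: 940, value: 5 }, { clock: 1000, value: 3 }
];
var spike = [
  { clock: 760, value: 0 }, { clock: 820, value: 9 }, { clock: 880, value: 0 },
  { clock: 940, value: 0 }, { clock: 1000, value: 0 }
];

console.log(tooManyTaskFailures(sustained, now, 2)); // true
console.log(tooManyTaskFailures(spike, now, 2));     // false
```

A single spike does not raise the problem, which keeps the warning from flapping on short bursts of failures.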

LLD rule Organizations discovery

Name Description Type Key and additional info
Organizations discovery

Discovery of organizations metrics.

HTTP agent influxdb.orgs.discovery

Preprocessing

  • JavaScript: The text is too long. Please see the template.

  • Discard unchanged with heartbeat: 1h
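
The discovery rule's JavaScript step is not reproduced in this README, but the general idea of such a step can be sketched: take the organizations list returned by the InfluxDB API and emit one low-level discovery row per organization. The code below is an illustration only, not the template's script. The function name `toDiscovery` and the sample data are invented, and only the `{#ORG_NAME}` macro referenced elsewhere in this document is emitted.

```javascript
// Stand-in for the preprocessing input: the raw body of an
// organizations API response (sample data, not from a real instance).
var value = JSON.stringify({
  orgs: [
    { id: '0000000000000001', name: 'example-org' },
    { id: '0000000000000002', name: 'another-org' }
  ]
});

// Convert the response into low-level discovery rows.
function toDiscovery(body) {
  // Fall back to an empty list if the response has no "orgs" field.
  var orgs = JSON.parse(body).orgs || [];
  return JSON.stringify(orgs.map(function (org) {
    return { '{#ORG_NAME}': org.name };
  }));
}

console.log(toDiscovery(value));
```

Each row then drives one set of item and trigger prototypes, with `{#ORG_NAME}` substituted into the item keys.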

Item prototypes for Organizations discovery

Name Description Type Key and additional info
InfluxDB: [{#ORG_NAME}] Query requests bytes, success

Count of bytes received with status 200 per second.

Dependent item influxdb.org.query_request_bytes.success.rate["{#ORG_NAME}"]

Preprocessing

  • JSON Path: The text is too long. Please see the template.

    Custom on fail: Discard value

  • Change per second
InfluxDB: [{#ORG_NAME}] Query requests bytes, failed

Count of bytes received with status not 200 per second.

Dependent item influxdb.org.query_request_bytes.failed.rate["{#ORG_NAME}"]

Preprocessing

  • JSON Path: The text is too long. Please see the template.

    Custom on fail: Discard value

  • Change per second
InfluxDB: [{#ORG_NAME}] Query requests, failed

Total number of query requests with status not 200 per second.

Dependent item influxdb.org.query_request.failed.rate["{#ORG_NAME}"]

Preprocessing

  • JSON Path: The text is too long. Please see the template.

    Custom on fail: Discard value

  • Change per second
InfluxDB: [{#ORG_NAME}] Query requests, success

Total number of query requests with status 200 per second.

Dependent item influxdb.org.query_request.success.rate["{#ORG_NAME}"]

Preprocessing

  • JSON Path: The text is too long. Please see the template.

    Custom on fail: Discard value

  • Change per second
InfluxDB: [{#ORG_NAME}] Query response bytes, success

Count of bytes returned with status 200 per second.

Dependent item influxdb.org.http_query_response_bytes.success.rate["{#ORG_NAME}"]

Preprocessing

  • JSON Path: The text is too long. Please see the template.

    Custom on fail: Discard value

  • Change per second
InfluxDB: [{#ORG_NAME}] Query response bytes, failed

Count of bytes returned with status not 200 per second.

Dependent item influxdb.org.http_query_response_bytes.failed.rate["{#ORG_NAME}"]

Preprocessing

  • JSON Path: The text is too long. Please see the template.

    Custom on fail: Discard value

  • Change per second
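
The item prototypes above split each counter into a "success" series (status 200) and a "failed" series (any other status) per organization. Their JSON Path expressions are not shown here, so the sketch below only illustrates the kind of label-based selection involved. The metric name `example_request_bytes` and the label names `org` and `status` are placeholders chosen for this example, not the names used by the template or by InfluxDB.

```javascript
// Illustrative "Prometheus to JSON" rows; the metric name and the
// label names (org, status) are placeholders, not the template's.
var rows = [
  { name: 'example_request_bytes', value: '2048', labels: { org: 'example-org', status: '200' } },
  { name: 'example_request_bytes', value: '512', labels: { org: 'example-org', status: '500' } },
  { name: 'example_request_bytes', value: '100', labels: { org: 'example-org', status: '404' } },
  { name: 'example_request_bytes', value: '4096', labels: { org: 'another-org', status: '200' } }
];

// Sum the counter for one organization, split by whether status is 200.
function sumBytes(data, org, success) {
  return data
    .filter(function (r) {
      return r.name === 'example_request_bytes' &&
        r.labels.org === org &&
        (r.labels.status === '200') === success;
    })
    .reduce(function (total, r) { return total + Number(r.value); }, 0);
}

console.log(sumBytes(rows, 'example-org', true));  // 2048
console.log(sumBytes(rows, 'example-org', false)); // 612
```

In the template, the selected counter is then passed through "Change per second" to produce the per-second rate.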

Trigger prototypes for Organizations discovery

Name Description Expression Severity Dependencies and additional info
InfluxDB: [{#ORG_NAME}]: Too many requests failures

Too many query requests failed.

min(/InfluxDB by HTTP/influxdb.org.query_request.failed.rate["{#ORG_NAME}"],5m)>{$INFLUXDB.REQ.FAIL.MAX.WARN} Warning

Feedback

Please report any issues with the template at https://support.zabbix.com

You can also provide feedback, discuss the template, or ask for help on the ZABBIX forums.