ClickHouse by HTTP

Overview

This template is designed for the effortless deployment of ClickHouse monitoring by Zabbix via HTTP and doesn't require any external scripts.

Requirements

Zabbix version: 7.0 and higher.

Tested versions

This template has been tested on:

  • ClickHouse 20.3+, 21.3+, 22.12+

Configuration

Zabbix should be configured according to the instructions in the Templates out of the box section.

Setup

Create a user to monitor the service:

Create the file /etc/clickhouse-server/users.d/zabbix.xml:

<yandex>
  <users>
    <zabbix>
      <password>zabbix_pass</password>
      <networks incl="networks" />
      <profile>web</profile>
      <quota>default</quota>
      <allow_databases>
        <database>test</database>
      </allow_databases>
    </zabbix>
  </users>
</yandex>

The login and password are also set in macros:

  • {$CLICKHOUSE.USER}
  • {$CLICKHOUSE.PASSWORD}

If you don't need authentication, remove the headers from the HTTP agent items.
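
To illustrate what the authentication headers do, here is a minimal sketch of how a request to the ClickHouse HTTP endpoint could be assembled. It is not the template's implementation: the function name `build_request` is hypothetical, and the sketch assumes the credentials travel in ClickHouse's `X-ClickHouse-User` and `X-ClickHouse-Key` headers. The values mirror the macro defaults listed in this README.

```python
from urllib.parse import urlencode


def build_request(scheme, host, port, user, password, query):
    """Return (url, headers) for a ClickHouse HTTP query.

    Assumption: credentials are sent in the X-ClickHouse-User and
    X-ClickHouse-Key headers; pass user=None to skip authentication.
    """
    url = f"{scheme}://{host}:{port}/?{urlencode({'query': query})}"
    headers = {}
    if user is not None:
        headers["X-ClickHouse-User"] = user
        headers["X-ClickHouse-Key"] = password
    return url, headers


# Defaults of {$CLICKHOUSE.SCHEME}, {$CLICKHOUSE.PORT}, {$CLICKHOUSE.USER}
# and {$CLICKHOUSE.PASSWORD}; "localhost" is a placeholder host.
url, headers = build_request(
    "http", "localhost", 8123, "zabbix", "zabbix_pass", "SELECT 1"
)
```

Calling the function with `user=None` returns an empty header dictionary, which corresponds to removing the headers from the items.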

Macros used

Name Description Default
{$CLICKHOUSE.USER} zabbix
{$CLICKHOUSE.PASSWORD} zabbix_pass
{$CLICKHOUSE.NETWORK.ERRORS.MAX.WARN}

Maximum number of network errors for trigger expression

5
{$CLICKHOUSE.PORT}

The port of ClickHouse HTTP endpoint

8123
{$CLICKHOUSE.SCHEME}

Request scheme which may be http or https

http
{$CLICKHOUSE.LLD.FILTER.DB.MATCHES}

Filter of discoverable databases

.*
{$CLICKHOUSE.LLD.FILTER.DB.NOT_MATCHES}

Filter to exclude discovered databases

CHANGE_IF_NEEDED
{$CLICKHOUSE.LLD.FILTER.DICT.MATCHES}

Filter of discoverable dictionaries

.*
{$CLICKHOUSE.LLD.FILTER.DICT.NOT_MATCHES}

Filter to exclude discovered dictionaries

CHANGE_IF_NEEDED
{$CLICKHOUSE.LLD.FILTER.TABLE.MATCHES}

Filter of discoverable tables

.*
{$CLICKHOUSE.LLD.FILTER.TABLE.NOT_MATCHES}

Filter to exclude discovered tables

CHANGE_IF_NEEDED
{$CLICKHOUSE.QUERY_TIME.MAX.WARN}

Maximum ClickHouse query time in seconds for trigger expression

600
{$CLICKHOUSE.QUEUE.SIZE.MAX.WARN}

Maximum size of the queue for operations waiting to be performed for trigger expression.

20
{$CLICKHOUSE.LOG_POSITION.DIFF.MAX.WARN}

Maximum diff between log_pointer and log_max_index.

30
{$CLICKHOUSE.REPLICA.MAX.WARN}

Replication lag across all tables for trigger expression.

600
{$CLICKHOUSE.DELAYED.FILES.DISTRIBUTED.COUNT.MAX.WARN}

Maximum size of distributed files queue to insert for trigger expression.

600
{$CLICKHOUSE.PARTS.PER.PARTITION.WARN}

Maximum number of parts per partition for trigger expression.

300
{$CLICKHOUSE.DELAYED.INSERTS.MAX.WARN}

Maximum number of delayed inserts for trigger expression.

0
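
The paired `MATCHES` / `NOT_MATCHES` macros in the table above act as include and exclude filters for discovery. The sketch below approximates that logic with Python's `re` module. It is an illustration only: the function name `passes_lld_filter` and the sample database names are hypothetical, and Zabbix's own regular expression engine may differ in details.

```python
import re


def passes_lld_filter(name, matches=r".*", not_matches=r"CHANGE_IF_NEEDED"):
    """Keep a discovered name if it matches `matches` and not `not_matches`.

    The defaults mirror the macro defaults: ".*" keeps everything, and
    "CHANGE_IF_NEEDED" is a placeholder that is unlikely to match any name.
    """
    included = re.search(matches, name) is not None
    excluded = re.search(not_matches, name) is not None
    return included and not excluded


databases = ["default", "system", "test"]
# Hypothetical override: exclude the "system" database from discovery.
kept = [db for db in databases if passes_lld_filter(db, not_matches=r"^system$")]
```

With the default values every discovered name passes, so the exclude macro only has an effect once it is changed.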

Items

Name Description Type Key and additional info
ClickHouse: Get system.events

Get information about the number of events that have occurred in the system.

HTTP agent clickhouse.system.events

Preprocessing

  • JSON Path: $.data
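
The `$.data` step keeps only the row array of the response. A rough Python equivalent is sketched below. It assumes the query is returned in ClickHouse's JSON output format, and the sample body and the function name `extract_data` are made up for illustration.

```python
import json

# Hypothetical response body in ClickHouse's JSON output format.
body = (
    '{"meta": [{"name": "event", "type": "String"},'
    ' {"name": "value", "type": "UInt64"}],'
    ' "data": [{"event": "Query", "value": "42"}],'
    ' "rows": 1}'
)


def extract_data(raw_body):
    """Equivalent of the JSONPath `$.data`: return only the row array."""
    return json.loads(raw_body)["data"]


rows = extract_data(body)
```

Dependent items then work on this row array instead of the full response.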

ClickHouse: Get system.metrics

Get metrics that can be calculated instantly or that have a current value. Format: JSONEachRow.

HTTP agent clickhouse.system.metrics

Preprocessing

  • JSON Path: $.data

ClickHouse: Get system.asynchronous_metrics

Get metrics that are calculated periodically in the background

HTTP agent clickhouse.system.asynchronous_metrics

Preprocessing

  • JSON Path: $.data

ClickHouse: Get system.settings

Get information about settings that are currently in use.

HTTP agent clickhouse.system.settings

Preprocessing

  • JSON Path: $.data

  • Discard unchanged with heartbeat: 1h
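
The "Discard unchanged with heartbeat" step stores a value only when it changes or when the heartbeat period has passed since the last stored value. The class below is a simplified sketch of that behaviour under those assumptions. The class name and the use of plain second counters are hypothetical and do not reflect Zabbix internals.

```python
class DiscardUnchangedWithHeartbeat:
    """Simplified model of the throttling preprocessing step."""

    def __init__(self, heartbeat_seconds):
        self.heartbeat = heartbeat_seconds
        self.last_value = None
        self.last_kept_at = None

    def process(self, value, now):
        """Return the value if it should be stored, otherwise None."""
        unchanged = self.last_kept_at is not None and value == self.last_value
        if unchanged and now - self.last_kept_at < self.heartbeat:
            return None  # discarded: same value, heartbeat not yet reached
        self.last_value = value
        self.last_kept_at = now
        return value


step = DiscardUnchangedWithHeartbeat(heartbeat_seconds=3600)  # 1h
results = [
    step.process("settings-v1", 0),     # first value is always stored
    step.process("settings-v1", 60),    # unchanged, discarded
    step.process("settings-v2", 120),   # changed, stored
    step.process("settings-v2", 3720),  # unchanged, but 1h has passed
]
```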

ClickHouse: Longest currently running query time

Get longest running query.

HTTP agent clickhouse.process.elapsed
ClickHouse: Check port availability Simple check net.tcp.service[{$CLICKHOUSE.SCHEME},"{HOST.CONN}","{$CLICKHOUSE.PORT}"]

Preprocessing

  • Discard unchanged with heartbeat: 10m

ClickHouse: Ping HTTP agent clickhouse.ping

Preprocessing

  • Regular expression: Ok\. 1

    Custom on fail: Set value to: 0

  • Discard unchanged with heartbeat: 10m
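
The regular expression step for the ping item turns the textual reply into a numeric status. A minimal sketch of that mapping follows, assuming the endpoint replies with `Ok.` when the server is healthy. The function name `ping_to_status` is hypothetical.

```python
import re


def ping_to_status(body):
    """Map a ping reply to 1 (pattern `Ok\\.` found) or 0 (custom on fail)."""
    return 1 if re.search(r"Ok\.", body) else 0


up = ping_to_status("Ok.\n")
down = ping_to_status("")
```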

ClickHouse: Version

Version of the server

HTTP agent clickhouse.version

Preprocessing

  • Discard unchanged with heartbeat: 1d

ClickHouse: Revision

Revision of the server.

Dependent item clickhouse.revision

Preprocessing

  • JSON Path: $[?(@.metric == "Revision")].value.first()
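
Most dependent items in this table use the same JSONPath pattern: filter the rows by name, take `value`, and keep the first match. The sketch below shows an equivalent lookup in plain Python. The function name `first_value` and the sample rows are hypothetical.

```python
def first_value(rows, key, name, default=None):
    """Rough equivalent of `$[?(@.<key> == "<name>")].value.first()`.

    `default` stands in for "Custom on fail: Set value to", which some
    items configure for the case where no row matches.
    """
    for row in rows:
        if row.get(key) == name:
            return row["value"]
    return default


# Made-up sample of rows as they might look after the `$.data` step.
rows = [
    {"metric": "Uptime", "value": "86400"},
    {"metric": "Revision", "value": "54468"},
]
revision = first_value(rows, "metric", "Revision")
missing = first_value(rows, "metric", "MySQLConnection", default=0)
```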

ClickHouse: Uptime

Number of seconds since ClickHouse server start

Dependent item clickhouse.uptime

Preprocessing

  • JSON Path: $[?(@.metric == "Uptime")].value.first()

ClickHouse: New queries per second

Number of queries to be interpreted and potentially executed. Does not include queries that failed to parse or were rejected due to AST size limits, quota limits or limits on the number of simultaneously running queries. May include internal queries initiated by ClickHouse itself. Does not count subqueries.

Dependent item clickhouse.query.rate

Preprocessing

  • JSON Path: $[?(@.event == "Query")].value.first()

    Custom on fail: Set value to: 0

  • Change per second
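
The "Change per second" step converts a growing counter into a rate. A minimal sketch of the calculation is shown below, assuming the usual formula (value - previous value) / (time - previous time) and that a decreasing counter yields no value. The function name and the sample numbers are hypothetical.

```python
def change_per_second(prev_value, prev_time, value, time):
    """Return the per-second rate between two counter samples.

    Returns None when the counter went backwards (for example after a
    server restart), in which case no value is stored for that sample.
    """
    if value < prev_value:
        return None
    return (value - prev_value) / (time - prev_time)


# Counter grew from 100 to 160 queries over 60 seconds.
rate = change_per_second(100, 0, 160, 60)
```
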
ClickHouse: New SELECT queries per second

Number of SELECT queries to be interpreted and potentially executed. Does not include queries that failed to parse or were rejected due to AST size limits, quota limits or limits on the number of simultaneously running queries. May include internal queries initiated by ClickHouse itself. Does not count subqueries.

Dependent item clickhouse.select_query.rate

Preprocessing

  • JSON Path: $[?(@.event == "SelectQuery")].value.first()

    Custom on fail: Set value to: 0

  • Change per second
ClickHouse: New INSERT queries per second

Number of INSERT queries to be interpreted and potentially executed. Does not include queries that failed to parse or were rejected due to AST size limits, quota limits or limits on the number of simultaneously running queries. May include internal queries initiated by ClickHouse itself. Does not count subqueries.

Dependent item clickhouse.insert_query.rate

Preprocessing

  • JSON Path: $[?(@.event == "InsertQuery")].value.first()

    Custom on fail: Set value to: 0

  • Change per second
ClickHouse: Delayed insert queries

Number of INSERT queries that are throttled due to high number of active data parts for partition in a MergeTree table.

Dependent item clickhouse.insert.delay

Preprocessing

  • JSON Path: $[?(@.metric == "DelayedInserts")].value.first()

ClickHouse: Current running queries

Number of executing queries

Dependent item clickhouse.query.current

Preprocessing

  • JSON Path: $[?(@.metric == "Query")].value.first()

ClickHouse: Current running merges

Number of executing background merges

Dependent item clickhouse.merge.current

Preprocessing

  • JSON Path: $[?(@.metric == "Merge")].value.first()

ClickHouse: Inserted bytes per second

The number of uncompressed bytes inserted in all tables.

Dependent item clickhouse.inserted_bytes.rate

Preprocessing

  • JSON Path: $[?(@.event == "InsertedBytes")].value.first()

    Custom on fail: Set value to: 0

  • Change per second
ClickHouse: Read bytes per second

Number of bytes (the number of bytes before decompression) read from compressed sources (files, network).

Dependent item clickhouse.read_bytes.rate

Preprocessing

  • JSON Path: $[?(@.event == "ReadCompressedBytes")].value.first()

    Custom on fail: Set value to: 0

  • Change per second
ClickHouse: Inserted rows per second

The number of rows inserted in all tables.

Dependent item clickhouse.inserted_rows.rate

Preprocessing

  • JSON Path: $[?(@.event == "InsertedRows")].value.first()

    Custom on fail: Set value to: 0

  • Change per second
ClickHouse: Merged rows per second

Rows read for background merges.

Dependent item clickhouse.merge_rows.rate

Preprocessing

  • JSON Path: $[?(@.event == "MergedRows")].value.first()

    Custom on fail: Set value to: 0

  • Change per second
ClickHouse: Uncompressed bytes merged per second

Uncompressed bytes that were read for background merges

Dependent item clickhouse.merge_bytes.rate

Preprocessing

  • JSON Path: $[?(@.event == "MergedUncompressedBytes")].value.first()

    Custom on fail: Set value to: 0

  • Change per second
ClickHouse: Max count of parts per partition across all tables

The ClickHouse MergeTree table engine splits each INSERT query into partitions (PARTITION BY expression) and adds one or more parts per INSERT inside each partition; a background merge process then runs.

Dependent item clickhouse.max.part.count.for.partition

Preprocessing

  • JSON Path: $[?(@.metric == "MaxPartCountForPartition")].value.first()

ClickHouse: Current TCP connections

Number of connections to TCP server (clients with native interface).

Dependent item clickhouse.connections.tcp

Preprocessing

  • JSON Path: $[?(@.metric == "TCPConnection")].value.first()

ClickHouse: Current HTTP connections

Number of connections to HTTP server.

Dependent item clickhouse.connections.http

Preprocessing

  • JSON Path: $[?(@.metric == "HTTPConnection")].value.first()

ClickHouse: Current distribute connections

Number of connections to remote servers sending data that was INSERTed into Distributed tables.

Dependent item clickhouse.connections.distribute

Preprocessing

  • JSON Path: $[?(@.metric == "DistributedSend")].value.first()

ClickHouse: Current MySQL connections

Number of connections to MySQL server.

Dependent item clickhouse.connections.mysql

Preprocessing

  • JSON Path: $[?(@.metric == "MySQLConnection")].value.first()

    Custom on fail: Set value to: 0

ClickHouse: Current Interserver connections

Number of connections from other replicas to fetch parts.

Dependent item clickhouse.connections.interserver

Preprocessing

  • JSON Path: $[?(@.metric == "InterserverConnection")].value.first()

ClickHouse: Network errors per second

Network errors (timeouts and connection failures) during query execution, background pool tasks and DNS cache update.

Dependent item clickhouse.network.error.rate

Preprocessing

  • JSON Path: $[?(@.event == "NetworkErrors")].value.first()

    Custom on fail: Set value to: 0

  • Change per second
ClickHouse: ZooKeeper sessions

Number of sessions (connections) to ZooKeeper. Should be no more than one.

Dependent item clickhouse.zookeeper.session

Preprocessing

  • JSON Path: $[?(@.metric == "ZooKeeperSession")].value.first()

ClickHouse: ZooKeeper watches

Number of watches (e.g., event subscriptions) in ZooKeeper.

Dependent item clickhouse.zookeeper.watch

Preprocessing

  • JSON Path: $[?(@.metric == "ZooKeeperWatch")].value.first()

ClickHouse: ZooKeeper requests

Number of requests to ZooKeeper in progress.

Dependent item clickhouse.zookeeper.request

Preprocessing

  • JSON Path: $[?(@.metric == "ZooKeeperRequest")].value.first()

ClickHouse: ZooKeeper wait time

Time spent in waiting for ZooKeeper operations.

Dependent item clickhouse.zookeeper.wait.time

Preprocessing

  • JSON Path: $[?(@.event == "ZooKeeperWaitMicroseconds")].value.first()

    Custom on fail: Set value to: 0

  • Custom multiplier: 1.0E-6

  • Change per second
ClickHouse: ZooKeeper exceptions per second

Count of ZooKeeper exceptions that do not belong to user/hardware exceptions.

Dependent item clickhouse.zookeeper.exceptions.rate

Preprocessing

  • JSON Path: $[?(@.event == "ZooKeeperOtherExceptions")].value.first()

    Custom on fail: Set value to: 0

  • Change per second
ClickHouse: ZooKeeper hardware exceptions per second

Count of ZooKeeper exceptions caused by session moved/expired, connection loss, marshalling error, operation timed out and invalid zhandle state.

Dependent item clickhouse.zookeeper.hw_exceptions.rate

Preprocessing

  • JSON Path: $[?(@.event == "ZooKeeperHardwareExceptions")].value.first()

    Custom on fail: Set value to: 0

  • Change per second
ClickHouse: ZooKeeper user exceptions per second

Count of ZooKeeper exceptions caused by no znodes, bad version, node exists, node empty and no children for ephemeral.

Dependent item clickhouse.zookeeper.user_exceptions.rate

Preprocessing

  • JSON Path: $[?(@.event == "ZooKeeperUserExceptions")].value.first()

    Custom on fail: Set value to: 0

  • Change per second
ClickHouse: Read syscalls in fly

Number of read (read, pread, io_getevents, etc.) syscalls in flight

Dependent item clickhouse.read

Preprocessing

  • JSON Path: $[?(@.metric == "Read")].value.first()

ClickHouse: Write syscalls in fly

Number of write (write, pwrite, io_getevents, etc.) syscalls in flight

Dependent item clickhouse.write

Preprocessing

  • JSON Path: $[?(@.metric == "Write")].value.first()

ClickHouse: Allocated bytes

Total number of bytes allocated by the application.

Dependent item clickhouse.jemalloc.allocated

Preprocessing

  • JSON Path: $[?(@.metric == "jemalloc.allocated")].value.first()

ClickHouse: Resident memory

Maximum number of bytes in physically resident data pages mapped by the allocator, comprising all pages dedicated to allocator metadata, pages backing active allocations, and unused dirty pages.

Dependent item clickhouse.jemalloc.resident

Preprocessing

  • JSON Path: $[?(@.metric == "jemalloc.resident")].value.first()

ClickHouse: Mapped memory

Total number of bytes in active extents mapped by the allocator.

Dependent item clickhouse.jemalloc.mapped

Preprocessing

  • JSON Path: $[?(@.metric == "jemalloc.mapped")].value.first()

ClickHouse: Memory used for queries

Total amount of memory (bytes) allocated in currently executing queries.

Dependent item clickhouse.memory.tracking

Preprocessing

  • JSON Path: $[?(@.metric == "MemoryTracking")].value.first()

ClickHouse: Memory used for background merges

Total amount of memory (bytes) allocated in background processing pool (that is dedicated for background merges, mutations and fetches).

Note that this value may include a drift when the memory was allocated in a context of background processing pool and freed in other context or vice-versa. This happens naturally due to caches for tables indexes and doesn't indicate memory leaks.

Dependent item clickhouse.memory.tracking.background

Preprocessing

  • JSON Path: The text is too long. Please see the template.

    Custom on fail: Set value to: 0

ClickHouse: Memory used for background moves

Total amount of memory (bytes) allocated in background processing pool (that is dedicated for background moves). Note that this value may include a drift when the memory was allocated in a context of background processing pool and freed in other context or vice-versa.

This happens naturally due to caches for tables indexes and doesn't indicate memory leaks.

Dependent item clickhouse.memory.tracking.background.moves

Preprocessing

  • JSON Path: The text is too long. Please see the template.

    Custom on fail: Set value to: 0

ClickHouse: Memory used for background schedule pool

Total amount of memory (bytes) allocated in background schedule pool (that is dedicated for bookkeeping tasks of Replicated tables).

Dependent item clickhouse.memory.tracking.schedule.pool

Preprocessing

  • JSON Path: The text is too long. Please see the template.

    Custom on fail: Set value to: 0

ClickHouse: Memory used for merges

Total amount of memory (bytes) allocated for background merges. Included in MemoryTrackingInBackgroundProcessingPool. Note that this value may include a drift when the memory was allocated in a context of background processing pool and freed in other context or vice-versa.

This happens naturally due to caches for tables indexes and doesn't indicate memory leaks.

Dependent item clickhouse.memory.tracking.merges

Preprocessing

  • JSON Path: $[?(@.metric == "MemoryTrackingForMerges")].value.first()

    Custom on fail: Set value to: 0

ClickHouse: Current distributed files to insert

Number of pending files to process for asynchronous insertion into Distributed tables. Number of files for every shard is summed.

Dependent item clickhouse.distributed.files

Preprocessing

  • JSON Path: $[?(@.metric == "DistributedFilesToInsert")].value.first()

ClickHouse: Distributed connection fail with retry per second

Connection retries in replicated DB connection pool

Dependent item clickhouse.distributed.files.retry.rate

Preprocessing

  • JSON Path: The text is too long. Please see the template.

    Custom on fail: Set value to: 0

  • Change per second
ClickHouse: Distributed connection fail with retry per second

Connection failures after all retries in replicated DB connection pool

Dependent item clickhouse.distributed.files.fail.rate

Preprocessing

  • JSON Path: The text is too long. Please see the template.

    Custom on fail: Set value to: 0

  • Change per second
ClickHouse: Replication lag across all tables

Maximum replica queue delay relative to current time

Dependent item clickhouse.replicas.max.absolute.delay

Preprocessing

  • JSON Path: $[?(@.metric == "ReplicasMaxAbsoluteDelay")].value.first()

ClickHouse: Total replication tasks in queue

Number of replication tasks in queue

Dependent item clickhouse.replicas.sum.queue.size

Preprocessing

  • JSON Path: $[?(@.metric == "ReplicasSumQueueSize")].value.first()

ClickHouse: Total number read-only Replicas

Number of Replicated tables that are currently in readonly state due to re-initialization after ZooKeeper session loss or due to startup without ZooKeeper configured.

Dependent item clickhouse.replicas.readonly.total

Preprocessing

  • JSON Path: $[?(@.metric == "ReadonlyReplica")].value.first()

ClickHouse: Get replicas info

Get information about replicas.

HTTP agent clickhouse.replicas

Preprocessing

  • JSON Path: $.data

ClickHouse: Get databases info

Get information about databases.

HTTP agent clickhouse.databases

Preprocessing

  • JSON Path: $.data

ClickHouse: Get tables info

Get information about tables.

HTTP agent clickhouse.tables

Preprocessing

  • JSON Path: $.data

ClickHouse: Get dictionaries info

Get information about dictionaries.

HTTP agent clickhouse.dictionaries

Preprocessing

  • JSON Path: $.data

Triggers

Name Description Expression Severity Dependencies and additional info
ClickHouse: Configuration has been changed

ClickHouse configuration has been changed. Acknowledge to close the problem manually.

last(/ClickHouse by HTTP/clickhouse.system.settings,#1)<>last(/ClickHouse by HTTP/clickhouse.system.settings,#2) and length(last(/ClickHouse by HTTP/clickhouse.system.settings))>0 Info Manual close: Yes
ClickHouse: There are queries running is long last(/ClickHouse by HTTP/clickhouse.process.elapsed)>{$CLICKHOUSE.QUERY_TIME.MAX.WARN} Average Manual close: Yes
ClickHouse: Port {$CLICKHOUSE.PORT} is unavailable last(/ClickHouse by HTTP/net.tcp.service[{$CLICKHOUSE.SCHEME},"{HOST.CONN}","{$CLICKHOUSE.PORT}"])=0 Average Manual close: Yes
ClickHouse: Service is down last(/ClickHouse by HTTP/clickhouse.ping)=0 or last(/ClickHouse by HTTP/net.tcp.service[{$CLICKHOUSE.SCHEME},"{HOST.CONN}","{$CLICKHOUSE.PORT}"]) = 0 Average Manual close: Yes
Depends on:
  • ClickHouse: Port {$CLICKHOUSE.PORT} is unavailable
ClickHouse: Version has changed

The ClickHouse version has changed. Acknowledge to close the problem manually.

last(/ClickHouse by HTTP/clickhouse.version,#1)<>last(/ClickHouse by HTTP/clickhouse.version,#2) and length(last(/ClickHouse by HTTP/clickhouse.version))>0 Info Manual close: Yes
ClickHouse: Host has been restarted

The host uptime is less than 10 minutes.

last(/ClickHouse by HTTP/clickhouse.uptime)<10m Info Manual close: Yes
ClickHouse: Failed to fetch info data

Zabbix has not received any data for items for the last 30 minutes.

nodata(/ClickHouse by HTTP/clickhouse.uptime,30m)=1 Warning Manual close: Yes
Depends on:
  • ClickHouse: Service is down
ClickHouse: Too many throttled insert queries

ClickHouse has INSERT queries that are throttled due to a high number of active data parts per partition in a MergeTree table. Decrease the INSERT frequency.

min(/ClickHouse by HTTP/clickhouse.insert.delay,5m)>{$CLICKHOUSE.DELAYED.INSERTS.MAX.WARN} Warning Manual close: Yes
ClickHouse: Too many MergeTree parts

Decrease the frequency of INSERT queries.
The ClickHouse MergeTree table engine splits each INSERT query into partitions (PARTITION BY expression)
and adds one or more parts per INSERT inside each partition,
after which a background merge process runs. When a partition contains too many unmerged parts,
SELECT query performance can degrade significantly, so ClickHouse tries to delay the insert or abort it.

min(/ClickHouse by HTTP/clickhouse.max.part.count.for.partition,5m)>{$CLICKHOUSE.PARTS.PER.PARTITION.WARN} * 0.9 Warning Manual close: Yes
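
The expression above fires only when every sample in the last 5 minutes exceeds 90% of the macro value. The sketch below restates that condition in Python, using the default of 300 from the macros table. The function name and the sample values are hypothetical.

```python
def too_many_parts(samples_5m, parts_per_partition_warn=300):
    """Restates `min(..., 5m) > {$CLICKHOUSE.PARTS.PER.PARTITION.WARN} * 0.9`."""
    return min(samples_5m) > parts_per_partition_warn * 0.9


# With the default macro value the effective threshold is 270 parts.
sustained = too_many_parts([280, 275, 290])  # all samples above 270
spike = too_many_parts([280, 260, 290])      # one sample dipped below 270
```

Using `min` over the window means a short spike does not raise the problem; the part count has to stay high for the whole period.
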
ClickHouse: Too many network errors

Number of errors (timeouts and connection failures) during query execution, background pool tasks and DNS cache update is too high.

min(/ClickHouse by HTTP/clickhouse.network.error.rate,5m)>{$CLICKHOUSE.NETWORK.ERRORS.MAX.WARN} Warning
ClickHouse: Too many ZooKeeper sessions opened

Number of sessions (connections) to ZooKeeper.
Should be no more than one, because using more than one connection to ZooKeeper may lead to bugs due to lack of linearizability (stale reads) that ZooKeeper consistency model allows.

min(/ClickHouse by HTTP/clickhouse.zookeeper.session,5m)>1 Warning
ClickHouse: Too many distributed files to insert

Check the ClickHouse servers and the <remote_servers> section in config.xml (https://clickhouse.tech/docs/en/operations/table_engines/distributed/).

min(/ClickHouse by HTTP/clickhouse.distributed.files,5m)>{$CLICKHOUSE.DELAYED.FILES.DISTRIBUTED.COUNT.MAX.WARN} Warning Manual close: Yes
ClickHouse: Replication lag is too high

When a replica lags too far behind, it can be skipped by Distributed SELECT queries without errors,
and query results will be wrong.

min(/ClickHouse by HTTP/clickhouse.replicas.max.absolute.delay,5m)>{$CLICKHOUSE.REPLICA.MAX.WARN} Warning Manual close: Yes

LLD rule Tables

Name Description Type Key and additional info
Tables

Info about tables

Dependent item clickhouse.tables.discovery

Item prototypes for Tables

Name Description Type Key and additional info
ClickHouse: {#DB}.{#TABLE}: Get table info

The item gets information about {#TABLE} table of {#DB} database.

Dependent item clickhouse.table.info_raw["{#DB}.{#TABLE}"]

Preprocessing

  • JSON Path: $[?(@.database == "{#DB}" && @.table == "{#TABLE}")].first()

    Custom on fail: Discard value
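
The "Get table info" prototype selects the single row that belongs to the discovered database and table, and stores nothing when no row matches. A rough Python equivalent is sketched below. The function name `find_table` and the sample rows are hypothetical.

```python
def find_table(rows, db, table):
    """Rough equivalent of
    `$[?(@.database == "{#DB}" && @.table == "{#TABLE}")].first()`.

    Returns None when nothing matches, standing in for
    "Custom on fail: Discard value".
    """
    for row in rows:
        if row.get("database") == db and row.get("table") == table:
            return row
    return None


# Made-up rows with the fields used by the Bytes, Parts and Rows items.
rows = [
    {"database": "test", "table": "events", "bytes": 1024, "parts": 3, "rows": 100},
    {"database": "test", "table": "users", "bytes": 2048, "parts": 1, "rows": 10},
]
info = find_table(rows, "test", "users")
```

The dependent prototypes then read single fields such as `bytes`, `parts`, and `rows` from the selected row.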

ClickHouse: {#DB}.{#TABLE}: Bytes

Table size in bytes. Database: {#DB}, table: {#TABLE}

Dependent item clickhouse.table.bytes["{#DB}.{#TABLE}"]

Preprocessing

  • JSON Path: $.bytes

ClickHouse: {#DB}.{#TABLE}: Parts

Number of parts of the table. Database: {#DB}, table: {#TABLE}

Dependent item clickhouse.table.parts["{#DB}.{#TABLE}"]

Preprocessing

  • JSON Path: $.parts

ClickHouse: {#DB}.{#TABLE}: Rows

Number of rows in the table. Database: {#DB}, table: {#TABLE}

Dependent item clickhouse.table.rows["{#DB}.{#TABLE}"]

Preprocessing

  • JSON Path: $.rows

LLD rule Replicas

Name Description Type Key and additional info
Replicas

Info about replicas

Dependent item clickhouse.replicas.discovery

Item prototypes for Replicas

Name Description Type Key and additional info
ClickHouse: {#DB}.{#TABLE}: Get replicas info

The item gets information about replicas of {#TABLE} table of {#DB} database.

Dependent item clickhouse.replica.info_raw["{#DB}.{#TABLE}"]

Preprocessing

  • JSON Path: $[?(@.database == "{#DB}" && @.table == "{#TABLE}")].first()

    Custom on fail: Discard value

ClickHouse: {#DB}.{#TABLE}: Replica readonly

Whether the replica is in read-only mode.

This mode is turned on if the config doesn't have sections with ZooKeeper, if an unknown error occurred when re-initializing sessions in ZooKeeper, and during session re-initialization in ZooKeeper.

Dependent item clickhouse.replica.is_readonly["{#DB}.{#TABLE}"]

Preprocessing

  • JSON Path: $.is_readonly

ClickHouse: {#DB}.{#TABLE}: Replica session expired

True if the ZooKeeper session expired

Dependent item clickhouse.replica.is_session_expired["{#DB}.{#TABLE}"]

Preprocessing

  • JSON Path: $.is_session_expired

ClickHouse: {#DB}.{#TABLE}: Replica future parts

Number of data parts that will appear as the result of INSERTs or merges that haven't been done yet.

Dependent item clickhouse.replica.future_parts["{#DB}.{#TABLE}"]

Preprocessing

  • JSON Path: $.future_parts

ClickHouse: {#DB}.{#TABLE}: Replica parts to check

Number of data parts in the queue for verification. A part is put in the verification queue if there is suspicion that it might be damaged.

Dependent item clickhouse.replica.parts_to_check["{#DB}.{#TABLE}"]

Preprocessing

  • JSON Path: $.parts_to_check

ClickHouse: {#DB}.{#TABLE}: Replica queue size

Size of the queue for operations waiting to be performed.

Dependent item clickhouse.replica.queue_size["{#DB}.{#TABLE}"]

Preprocessing

  • JSON Path: $.queue_size

ClickHouse: {#DB}.{#TABLE}: Replica queue inserts size

Number of inserts of blocks of data that need to be made.

Dependent item clickhouse.replica.inserts_in_queue["{#DB}.{#TABLE}"]

Preprocessing

  • JSON Path: $.inserts_in_queue

ClickHouse: {#DB}.{#TABLE}: Replica queue merges size

Number of merges waiting to be made.

Dependent item clickhouse.replica.merges_in_queue["{#DB}.{#TABLE}"]

Preprocessing

  • JSON Path: $.merges_in_queue

ClickHouse: {#DB}.{#TABLE}: Replica log max index

Maximum entry number in the log of general activity. (Has a non-zero value only when there is an active session with ZooKeeper.)

Dependent item clickhouse.replica.log_max_index["{#DB}.{#TABLE}"]

Preprocessing

  • JSON Path: $.log_max_index

ClickHouse: {#DB}.{#TABLE}: Replica log pointer

Maximum entry number in the log of general activity that the replica copied to its execution queue, plus one. (Has a non-zero value only when there is an active session with ZooKeeper.)

Dependent item clickhouse.replica.log_pointer["{#DB}.{#TABLE}"]

Preprocessing

  • JSON Path: $.log_pointer

ClickHouse: {#DB}.{#TABLE}: Total replicas

Total number of known replicas of this table. (Has a non-zero value only when there is an active session with ZooKeeper.)

Dependent item clickhouse.replica.total_replicas["{#DB}.{#TABLE}"]

Preprocessing

  • JSON Path: $.total_replicas

ClickHouse: {#DB}.{#TABLE}: Active replicas

Number of replicas of this table that have a session in ZooKeeper (i.e., the number of functioning replicas). (Has a non-zero value only when there is an active session with ZooKeeper.)

Dependent item clickhouse.replica.active_replicas["{#DB}.{#TABLE}"]

Preprocessing

  • JSON Path: $.active_replicas

ClickHouse: {#DB}.{#TABLE}: Replica lag

Difference between log_max_index and log_pointer

Dependent item clickhouse.replica.lag["{#DB}.{#TABLE}"]

Preprocessing

  • JSON Path: $.replica_lag

Trigger prototypes for Replicas

Name Description Expression Severity Dependencies and additional info
ClickHouse: {#DB}.{#TABLE} Replica is readonly

This mode is turned on if the config doesn't have sections with ZooKeeper, if an unknown error occurred when re-initializing sessions in ZooKeeper, and during session re-initialization in ZooKeeper.

min(/ClickHouse by HTTP/clickhouse.replica.is_readonly["{#DB}.{#TABLE}"],5m)=1 Warning
ClickHouse: {#DB}.{#TABLE} Replica session is expired

This mode is turned on if the config doesn't have sections with ZooKeeper, if an unknown error occurred when re-initializing sessions in ZooKeeper, and during session re-initialization in ZooKeeper.

min(/ClickHouse by HTTP/clickhouse.replica.is_session_expired["{#DB}.{#TABLE}"],5m)=1 Warning
ClickHouse: {#DB}.{#TABLE}: Too many operations in queue min(/ClickHouse by HTTP/clickhouse.replica.queue_size["{#DB}.{#TABLE}"],5m)>{$CLICKHOUSE.QUEUE.SIZE.MAX.WARN:"{#TABLE}"} Warning
ClickHouse: {#DB}.{#TABLE}: Number of active replicas less than number of total replicas max(/ClickHouse by HTTP/clickhouse.replica.active_replicas["{#DB}.{#TABLE}"],5m) < last(/ClickHouse by HTTP/clickhouse.replica.total_replicas["{#DB}.{#TABLE}"]) Warning
ClickHouse: {#DB}.{#TABLE}: Difference between log_max_index and log_pointer is too high min(/ClickHouse by HTTP/clickhouse.replica.lag["{#DB}.{#TABLE}"],5m) > {$CLICKHOUSE.LOG_POSITION.DIFF.MAX.WARN} Warning
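
The last trigger prototype above compares the replica lag with `{$CLICKHOUSE.LOG_POSITION.DIFF.MAX.WARN}`. The sketch below restates the lag definition and the trigger condition in Python, using the default of 30 from the macros table. Both function names and the sample numbers are hypothetical.

```python
def replica_lag(log_max_index, log_pointer):
    """Difference between log_max_index and log_pointer."""
    return log_max_index - log_pointer


def lag_too_high(lag_samples_5m, max_warn=30):
    """Restates `min(..., 5m) > {$CLICKHOUSE.LOG_POSITION.DIFF.MAX.WARN}`."""
    return min(lag_samples_5m) > max_warn


lag = replica_lag(log_max_index=120, log_pointer=80)
sustained = lag_too_high([40, 35, 50])  # every sample above 30
recovered = lag_too_high([40, 10, 50])  # lag dropped to 10 at one point
```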

LLD rule Dictionaries

Name Description Type Key and additional info
Dictionaries

Info about dictionaries

Dependent item clickhouse.dictionaries.discovery

Item prototypes for Dictionaries

Name Description Type Key and additional info
ClickHouse: Dictionary {#NAME}: Get dictionary info

The item gets information about {#NAME} dictionary.

Dependent item clickhouse.dictionary.info_raw["{#NAME}"]

Preprocessing

  • JSON Path: $[?(@.name == "{#NAME}")].first()

    Custom on fail: Discard value

ClickHouse: Dictionary {#NAME}: Bytes allocated

The amount of RAM the dictionary uses.

Dependent item clickhouse.dictionary.bytes_allocated["{#NAME}"]

Preprocessing

  • JSON Path: $.bytes_allocated

ClickHouse: Dictionary {#NAME}: Element count

Number of items stored in the dictionary.

Dependent item clickhouse.dictionary.element_count["{#NAME}"]

Preprocessing

  • JSON Path: $.element_count

ClickHouse: Dictionary {#NAME}: Load factor

The percentage filled in the dictionary (for a hashed dictionary, the percentage filled in the hash table).

Dependent item clickhouse.dictionary.load_factor["{#NAME}"]

Preprocessing

  • JSON Path: $.load_factor

  • Custom multiplier: 100

LLD rule Databases

Name Description Type Key and additional info
Databases

Info about databases

Dependent item clickhouse.db.discovery

Item prototypes for Databases

Name Description Type Key and additional info
ClickHouse: {#DB}: Get DB info

The item gets information about {#DB} database.

Dependent item clickhouse.db.info_raw["{#DB}"]

Preprocessing

  • JSON Path: $[?(@.database == "{#DB}")].first()

    Custom on fail: Discard value

ClickHouse: {#DB}: Bytes

Database size in bytes.

Dependent item clickhouse.db.bytes["{#DB}"]

Preprocessing

  • JSON Path: $.bytes

ClickHouse: {#DB}: Tables

Number of tables in {#DB} database.

Dependent item clickhouse.db.tables["{#DB}"]

Preprocessing

  • JSON Path: $.tables

Feedback

Please report any issues with the template at https://support.zabbix.com

You can also provide feedback, discuss the template, or ask for help at ZABBIX forums