# HashiCorp Vault by HTTP

## Overview

This template monitors HashiCorp Vault with Zabbix and works without any external scripts.
Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.

The template `Vault by HTTP` collects metrics with the HTTP agent from the `/sys/metrics` API endpoint.
See https://www.vaultproject.io/api-docs/system/metrics.

## Requirements

Zabbix version: 7.0 and higher.

## Tested versions

This template has been tested on:
- Vault 1.6

## Configuration

> Zabbix should be configured according to the instructions in the [Templates out of the box](https://www.zabbix.com/documentation/7.0/manual/config/templates_out_of_the_box) section.

## Setup

> See [Zabbix template operation](https://www.zabbix.com/documentation/7.0/manual/config/templates_out_of_the_box/http) for basic instructions.

Configure the Vault API. See [Vault Configuration](https://www.vaultproject.io/docs/configuration).

Create a Vault service token and assign it to the macro `{$VAULT.TOKEN}`.

### Macros used

|Name|Description|Default|
|----|-----------|-------|
|{$VAULT.API.PORT}|Vault port.|`8200`|
|{$VAULT.API.SCHEME}|Vault API scheme.|`http`|
|{$VAULT.HOST}|Vault host name.||
||Maximum percentage of used file descriptors for trigger expression.|`90`|
|{$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN}|Maximum number of Vault leadership setup failures.|`5`|
|{$VAULT.LEADERSHIP.LOSSES.MAX.WARN}|Maximum number of Vault leadership losses.|`5`|
|{$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN}|Maximum number of Vault leadership step downs.|`5`|
|{$VAULT.LLD.FILTER.STORAGE.MATCHES}|Filter of discoverable storage backends.|`.+`|
|{$VAULT.TOKEN}|Vault auth token.||
|{$VAULT.TOKEN.ACCESSORS}|Vault accessors separated by spaces for monitoring token expiration time.||
|{$VAULT.TOKEN.TTL.MIN.CRIT}|Token TTL critical threshold.|`3d`|
|{$VAULT.TOKEN.TTL.MIN.WARN}|Token TTL warning threshold.|`7d`|

### Items

|Name|Description|Type|Key and additional info|
|----|-----------|----|-----------------------|
|Vault: Get health||HTTP agent|vault.get_health**Preprocessing**
Check for not supported value
⛔️Custom on fail: Set value to: `{"healthcheck": 0}`
**Preprocessing**
Check for not supported value
⛔️Custom on fail: Discard value
**Preprocessing**
Check for not supported value
⛔️Custom on fail: Discard value
**Preprocessing**
Check for error in JSON: `$.errors`
⛔️Custom on fail: Discard value
Get information about tokens via their accessors. Accessors are defined in the macro "{$VAULT.TOKEN.ACCESSORS}".
|Script|vault.get_tokens|
|Vault: Check WAL discovery||Dependent item|vault.check_wal_discovery**Preprocessing**
Prometheus to JSON: `{__name__=~"^vault_wal_(?:.+)$"}`
⛔️Custom on fail: Discard value
JavaScript: `The text is too long. Please see the template.`
Discard unchanged with heartbeat: `15m`
**Preprocessing**
Prometheus to JSON: `{__name__=~"^replication_(?:.+)$"}`
⛔️Custom on fail: Discard value
JavaScript: `The text is too long. Please see the template.`
Discard unchanged with heartbeat: `15m`
**Preprocessing**
Prometheus to JSON: `{__name__=~"^vault_(?:.+)_(?:get|put|list|delete)_count$"}`
⛔️Custom on fail: Discard value
JavaScript: `The text is too long. Please see the template.`
Discard unchanged with heartbeat: `15m`
**Preprocessing**
Prometheus to JSON: `{__name__=~"^vault_rollback_attempt_(?:.+?)_count$"}`
⛔️Custom on fail: Discard value
JavaScript: `The text is too long. Please see the template.`
Discard unchanged with heartbeat: `15m`
Initialization status.
|Dependent item|vault.health.initialized**Preprocessing**
JSON Path: `$.initialized`
⛔️Custom on fail: Discard value
Discard unchanged with heartbeat: `1h`
Seal status.
|Dependent item|vault.health.sealed**Preprocessing**
JSON Path: `$.sealed`
⛔️Custom on fail: Discard value
Discard unchanged with heartbeat: `1h`
Standby status.
|Dependent item|vault.health.standby**Preprocessing**
JSON Path: `$.standby`
⛔️Custom on fail: Discard value
Discard unchanged with heartbeat: `1h`
Performance standby status.
|Dependent item|vault.health.performance_standby**Preprocessing**
JSON Path: `$.performance_standby`
⛔️Custom on fail: Discard value
Discard unchanged with heartbeat: `1h`
Performance replication mode
https://www.vaultproject.io/docs/enterprise/replication
|Dependent item|vault.health.replication_performance_mode**Preprocessing**
JSON Path: `$.replication_performance_mode`
⛔️Custom on fail: Discard value
Discard unchanged with heartbeat: `1h`
Disaster recovery replication mode
https://www.vaultproject.io/docs/enterprise/replication
|Dependent item|vault.health.replication_dr_mode**Preprocessing**
JSON Path: `$.replication_dr_mode`
⛔️Custom on fail: Discard value
Discard unchanged with heartbeat: `1h`
Server version.
|Dependent item|vault.health.version**Preprocessing**
JSON Path: `$.version`
⛔️Custom on fail: Discard value
Discard unchanged with heartbeat: `1h`
Vault healthcheck.
|Dependent item|vault.health.check**Preprocessing**
JSON Path: `$.healthcheck`
⛔️Custom on fail: Set value to: `1`
Discard unchanged with heartbeat: `1h`
HA enabled status.
|Dependent item|vault.leader.ha_enabled**Preprocessing**
JSON Path: `$.ha_enabled`
Discard unchanged with heartbeat: `1h`
Leader status.
|Dependent item|vault.leader.is_self**Preprocessing**
JSON Path: `$.is_self`
Discard unchanged with heartbeat: `1h`
Get metrics error.
|Dependent item|vault.get_metrics.error**Preprocessing**
JSON Path: `$.errors[0]`
⛔️Custom on fail: Set value to: ``
Discard unchanged with heartbeat: `1h`
Total user and system CPU time spent in seconds.
|Dependent item|vault.metrics.process.cpu.seconds.total**Preprocessing**
Prometheus pattern: `VALUE(process_cpu_seconds_total)`
⛔️Custom on fail: Discard value
Maximum number of open file descriptors.
|Dependent item|vault.metrics.process.max.fds**Preprocessing**
Prometheus pattern: `VALUE(process_max_fds)`
⛔️Custom on fail: Discard value
Discard unchanged with heartbeat: `1h`
Number of open file descriptors.
|Dependent item|vault.metrics.process.open.fds**Preprocessing**
Prometheus pattern: `VALUE(process_open_fds)`
⛔️Custom on fail: Discard value
Resident memory size in bytes.
|Dependent item|vault.metrics.process.resident_memory.bytes**Preprocessing**
Prometheus pattern: `VALUE(process_resident_memory_bytes)`
⛔️Custom on fail: Discard value
Server uptime.
|Dependent item|vault.metrics.process.uptime**Preprocessing**
Prometheus pattern: `VALUE(process_start_time_seconds)`
⛔️Custom on fail: Discard value
JavaScript: `The text is too long. Please see the template.`
Virtual memory size in bytes.
|Dependent item|vault.metrics.process.virtual_memory.bytes**Preprocessing**
Prometheus pattern: `VALUE(process_virtual_memory_bytes)`
⛔️Custom on fail: Discard value
Maximum amount of virtual memory available in bytes.
|Dependent item|vault.metrics.process.virtual_memory.max.bytes**Preprocessing**
Prometheus pattern: `VALUE(process_virtual_memory_max_bytes)`
⛔️Custom on fail: Discard value
Discard unchanged with heartbeat: `1h`
Number of all audit log requests across all audit log devices.
|Dependent item|vault.metrics.audit.log.request.rate**Preprocessing**
Prometheus pattern: `VALUE(vault_audit_log_request_count)`
⛔️Custom on fail: Discard value
Number of audit log request failures.
|Dependent item|vault.metrics.audit.log.request.failure.rate**Preprocessing**
Prometheus pattern: `VALUE(vault_audit_log_request_failure)`
⛔️Custom on fail: Discard value
Number of audit log responses across all audit log devices.
|Dependent item|vault.metrics.audit.log.response.rate**Preprocessing**
Prometheus pattern: `VALUE(vault_audit_log_response_count)`
⛔️Custom on fail: Discard value
Number of audit log response failures.
|Dependent item|vault.metrics.audit.log.response.failure.rate**Preprocessing**
Prometheus pattern: `VALUE(vault_audit_log_response_failure)`
⛔️Custom on fail: Discard value
Number of DELETE operations at the barrier.
|Dependent item|vault.metrics.barrier.delete.rate**Preprocessing**
Prometheus pattern: `VALUE(vault_barrier_delete_count)`
⛔️Custom on fail: Discard value
Number of GET operations at the barrier.
|Dependent item|vault.metrics.vault.barrier.get.rate**Preprocessing**
Prometheus pattern: `VALUE(vault_barrier_get_count)`
⛔️Custom on fail: Discard value
Number of LIST operations at the barrier.
|Dependent item|vault.metrics.barrier.list.rate**Preprocessing**
Prometheus pattern: `VALUE(vault_barrier_list_count)`
⛔️Custom on fail: Discard value
Number of PUT operations at the barrier.
|Dependent item|vault.metrics.barrier.put.rate**Preprocessing**
Prometheus pattern: `VALUE(vault_barrier_put_count)`
⛔️Custom on fail: Discard value
Number of times a value was retrieved from the LRU cache.
|Dependent item|vault.metrics.cache.hit.rate**Preprocessing**
Prometheus pattern: `VALUE(vault_cache_hit)`
⛔️Custom on fail: Discard value
Number of times a value was not in the LRU cache. This results in a read from the configured storage.
|Dependent item|vault.metrics.cache.miss.rate**Preprocessing**
Prometheus pattern: `VALUE(vault_cache_miss)`
⛔️Custom on fail: Discard value
Number of times a value was written to the LRU cache.
|Dependent item|vault.metrics.cache.write.rate**Preprocessing**
Prometheus pattern: `VALUE(vault_cache_write)`
⛔️Custom on fail: Discard value
Number of token checks handled by Vault core.
|Dependent item|vault.metrics.core.check.token.rate**Preprocessing**
Prometheus pattern: `VALUE(vault_core_check_token_count)`
⛔️Custom on fail: Discard value
Number of ACL and corresponding token entry fetches handled by Vault core.
|Dependent item|vault.metrics.core.fetch.acl_and_token**Preprocessing**
Prometheus pattern: `VALUE(vault_core_fetch_acl_and_token_count)`
⛔️Custom on fail: Discard value
Number of requests handled by Vault core.
|Dependent item|vault.metrics.core.handle.request**Preprocessing**
Prometheus pattern: `VALUE(vault_core_handle_request_count)`
⛔️Custom on fail: Discard value
Cluster leadership setup failures which have occurred in a highly available Vault cluster.
|Dependent item|vault.metrics.core.leadership.setup_failed**Preprocessing**
Prometheus to JSON: `vault_core_leadership_setup_failed`
JSON Path: `The text is too long. Please see the template.`
⛔️Custom on fail: Set value to: `0`
Cluster leadership losses which have occurred in a highly available Vault cluster.
|Dependent item|vault.metrics.core.leadership_lost**Preprocessing**
Prometheus to JSON: `vault_core_leadership_lost_count`
JSON Path: `$[?(@.name=="vault_core_leadership_lost_count")].value.sum()`
⛔️Custom on fail: Set value to: `0`
Duration of time taken by post-unseal operations handled by Vault core.
|Dependent item|vault.metrics.core.post_unseal**Preprocessing**
Prometheus pattern: `VALUE(vault_core_post_unseal_count)`
⛔️Custom on fail: Discard value
Duration of time taken by pre-seal operations.
|Dependent item|vault.metrics.core.pre_seal**Preprocessing**
Prometheus pattern: `VALUE(vault_core_pre_seal_count)`
⛔️Custom on fail: Discard value
Duration of time taken by requested seal operations.
|Dependent item|vault.metrics.core.seal_with_request**Preprocessing**
Prometheus pattern: `VALUE(vault_core_seal_with_request_count)`
⛔️Custom on fail: Discard value
Duration of time taken by seal operations.
|Dependent item|vault.metrics.core.seal**Preprocessing**
Prometheus pattern: `VALUE(vault_core_seal_count)`
⛔️Custom on fail: Discard value
Duration of time taken by internal seal operations.
|Dependent item|vault.metrics.core.seal_internal**Preprocessing**
Prometheus pattern: `VALUE(vault_core_seal_internal_count)`
⛔️Custom on fail: Discard value
Cluster leadership step down.
|Dependent item|vault.metrics.core.step_down**Preprocessing**
Prometheus to JSON: `vault_core_step_down_count`
JSON Path: `$[?(@.name=="vault_core_step_down_count")].value.sum()`
⛔️Custom on fail: Set value to: `0`
Duration of time taken by unseal operations.
|Dependent item|vault.metrics.core.unseal**Preprocessing**
Prometheus pattern: `VALUE(vault_core_unseal_count)`
⛔️Custom on fail: Discard value
Time taken to fetch lease times.
|Dependent item|vault.metrics.expire.fetch.lease.times**Preprocessing**
Prometheus pattern: `VALUE(vault_expire_fetch_lease_times_count)`
⛔️Custom on fail: Discard value
Time taken to fetch lease times by token.
|Dependent item|vault.metrics.expire.fetch.lease.times.by_token**Preprocessing**
Prometheus pattern: `VALUE(vault_expire_fetch_lease_times_by_token_count)`
⛔️Custom on fail: Discard value
Number of all leases which are eligible for eventual expiry.
|Dependent item|vault.metrics.expire.num_leases**Preprocessing**
Prometheus pattern: `VALUE(vault_expire_num_leases)`
⛔️Custom on fail: Discard value
Time taken to revoke a token.
|Dependent item|vault.metrics.expire.revoke**Preprocessing**
Prometheus pattern: `VALUE(vault_expire_revoke_count)`
⛔️Custom on fail: Discard value
Time taken to forcibly revoke a token.
|Dependent item|vault.metrics.expire.revoke.force**Preprocessing**
Prometheus pattern: `VALUE(vault_expire_revoke_force_count)`
⛔️Custom on fail: Discard value
Time taken to revoke tokens on a prefix.
|Dependent item|vault.metrics.expire.revoke.prefix**Preprocessing**
Prometheus pattern: `VALUE(vault_expire_revoke_prefix_count)`
⛔️Custom on fail: Discard value
Time taken to revoke all secrets issued with a given token.
|Dependent item|vault.metrics.expire.revoke.by_token**Preprocessing**
Prometheus pattern: `VALUE(vault_expire_revoke_by_token_count)`
⛔️Custom on fail: Discard value
Time taken to renew a lease.
|Dependent item|vault.metrics.expire.renew**Preprocessing**
Prometheus pattern: `VALUE(vault_expire_renew_count)`
⛔️Custom on fail: Discard value
Time taken to renew a token which does not need to invoke a logical backend.
|Dependent item|vault.metrics.expire.renew_token**Preprocessing**
Prometheus pattern: `VALUE(vault_expire_renew_token_count)`
⛔️Custom on fail: Discard value
Time taken for register operations.
|Dependent item|vault.metrics.expire.register**Preprocessing**
Prometheus pattern: `VALUE(vault_expire_register_count)`
⛔️Custom on fail: Discard value
Time taken for register authentication operations which create lease entries without lease ID.
|Dependent item|vault.metrics.expire.register.auth**Preprocessing**
Prometheus pattern: `VALUE(vault_expire_register_auth_count)`
⛔️Custom on fail: Discard value
Number of operations to get a policy.
|Dependent item|vault.metrics.policy.get_policy.rate**Preprocessing**
Prometheus pattern: `VALUE(vault_policy_get_policy_count)`
⛔️Custom on fail: Discard value
Number of operations to list policies.
|Dependent item|vault.metrics.policy.list_policies.rate**Preprocessing**
Prometheus pattern: `VALUE(vault_policy_list_policies_count)`
⛔️Custom on fail: Discard value
Number of operations to delete a policy.
|Dependent item|vault.metrics.policy.delete_policy.rate**Preprocessing**
Prometheus pattern: `VALUE(vault_policy_delete_policy_count)`
⛔️Custom on fail: Discard value
Number of operations to set a policy.
|Dependent item|vault.metrics.policy.set_policy.rate**Preprocessing**
Prometheus pattern: `VALUE(vault_policy_set_policy_count)`
⛔️Custom on fail: Discard value
The time taken to create a token.
|Dependent item|vault.metrics.token.create**Preprocessing**
Prometheus pattern: `VALUE(vault_token_create_count)`
⛔️Custom on fail: Discard value
The time taken to create a token accessor.
|Dependent item|vault.metrics.token.createAccessor**Preprocessing**
Prometheus pattern: `VALUE(vault_token_createAccessor_count)`
⛔️Custom on fail: Discard value
Number of token lookups.
|Dependent item|vault.metrics.token.lookup.rate**Preprocessing**
Prometheus pattern: `VALUE(vault_token_lookup_count)`
⛔️Custom on fail: Discard value
The time taken to revoke a token.
|Dependent item|vault.metrics.token.revoke**Preprocessing**
Prometheus pattern: `VALUE(vault_token_revoke_count)`
⛔️Custom on fail: Discard value
Time taken to revoke a token tree.
|Dependent item|vault.metrics.token.revoke.tree**Preprocessing**
Prometheus pattern: `VALUE(vault_token_revoke_tree_count)`
⛔️Custom on fail: Discard value
Time taken to store an updated token entry without writing to the secondary index.
|Dependent item|vault.metrics.token.store**Preprocessing**
Prometheus pattern: `VALUE(vault_token_store_count)`
⛔️Custom on fail: Discard value
Number of bytes allocated by the Vault process. This could burst from time to time, but should return to a steady state value.
|Dependent item|vault.metrics.runtime.alloc.bytes**Preprocessing**
Prometheus pattern: `VALUE(vault_runtime_alloc_bytes)`
⛔️Custom on fail: Discard value
Number of freed objects.
|Dependent item|vault.metrics.runtime.free.count**Preprocessing**
Prometheus pattern: `VALUE(vault_runtime_free_count)`
⛔️Custom on fail: Discard value
Number of objects on the heap. This is a good general memory pressure indicator worth establishing a baseline and thresholds for alerting.
|Dependent item|vault.metrics.runtime.heap.objects**Preprocessing**
Prometheus pattern: `VALUE(vault_runtime_heap_objects)`
⛔️Custom on fail: Discard value
Cumulative count of allocated heap objects.
|Dependent item|vault.metrics.runtime.malloc.count**Preprocessing**
Prometheus pattern: `VALUE(vault_runtime_malloc_count)`
⛔️Custom on fail: Discard value
Number of goroutines. This serves as a general system load indicator worth establishing a baseline and thresholds for alerting.
|Dependent item|vault.metrics.runtime.num_goroutines**Preprocessing**
Prometheus pattern: `VALUE(vault_runtime_num_goroutines)`
⛔️Custom on fail: Discard value
Number of bytes allocated to Vault. This includes what is being used by Vault's heap and what has been reclaimed but not given back to the operating system.
|Dependent item|vault.metrics.runtime.sys.bytes**Preprocessing**
Prometheus pattern: `VALUE(vault_runtime_sys_bytes)`
⛔️Custom on fail: Discard value
The total garbage collector pause time since Vault was last started.
|Dependent item|vault.metrics.total.gc.pause**Preprocessing**
Prometheus pattern: `VALUE(vault_runtime_total_gc_pause_ns)`
⛔️Custom on fail: Discard value
Custom multiplier: `1e-09`
Total number of garbage collection runs since Vault was last started.
|Dependent item|vault.metrics.runtime.total.gc.runs**Preprocessing**
Prometheus pattern: `VALUE(vault_runtime_total_gc_runs)`
⛔️Custom on fail: Discard value
Total number of service tokens available for use; counts all un-expired and un-revoked tokens in Vault's token store. This measurement is performed every 10 minutes.
|Dependent item|vault.metrics.token**Preprocessing**
Prometheus to JSON: `vault_token_count`
JSON Path: `$[?(@.name=="vault_token_count")].value.sum()`
⛔️Custom on fail: Set value to: `0`
Total number of service tokens that were created by an auth method.
|Dependent item|vault.metrics.token.by_auth**Preprocessing**
Prometheus to JSON: `vault_token_count_by_auth`
JSON Path: `$[?(@.name=="vault_token_count_by_auth")].value.sum()`
⛔️Custom on fail: Set value to: `0`
Total number of service tokens that have a policy attached.
|Dependent item|vault.metrics.token.by_policy**Preprocessing**
Prometheus to JSON: `vault_token_count_by_policy`
JSON Path: `$[?(@.name=="vault_token_count_by_policy")].value.sum()`
⛔️Custom on fail: Set value to: `0`
Number of service tokens, grouped by the TTL range they were assigned at creation.
|Dependent item|vault.metrics.token.by_ttl**Preprocessing**
Prometheus to JSON: `vault_token_count_by_ttl`
JSON Path: `$[?(@.name=="vault_token_count_by_ttl")].value.sum()`
⛔️Custom on fail: Set value to: `0`
Number of service or batch tokens created.
|Dependent item|vault.metrics.token.creation.rate**Preprocessing**
Prometheus to JSON: `vault_token_creation`
JSON Path: `$[?(@.name=="vault_token_creation")].value.sum()`
⛔️Custom on fail: Set value to: `0`
Number of entries in each key-value secret engine.
|Dependent item|vault.metrics.secret.kv.count**Preprocessing**
Prometheus to JSON: `vault_secret_kv_count`
JSON Path: `$[?(@.name=="vault_secret_kv_count")].value.sum()`
⛔️Custom on fail: Set value to: `0`
Counts the number of leases created by secret engines.
|Dependent item|vault.metrics.secret.lease.creation.rate**Preprocessing**
Prometheus to JSON: `vault_secret_lease_creation`
JSON Path: `$[?(@.name=="vault_secret_lease_creation")].value.sum()`
⛔️Custom on fail: Set value to: `0`
https://www.vaultproject.io/docs/concepts/seal
|`last(/HashiCorp Vault by HTTP/vault.health.sealed)=1`|Average||
|Vault: Version has changed|Vault version has changed. Acknowledge to close the problem manually.|`last(/HashiCorp Vault by HTTP/vault.health.version,#1)<>last(/HashiCorp Vault by HTTP/vault.health.version,#2) and length(last(/HashiCorp Vault by HTTP/vault.health.version))>0`|Info|**Manual close**: Yes|
|Vault: Vault server is not responding||`last(/HashiCorp Vault by HTTP/vault.health.check)=0`|High||
|Vault: Failed to get metrics||`length(last(/HashiCorp Vault by HTTP/vault.get_metrics.error))>0`|Warning|**Depends on**:
Uptime is less than 10 minutes.|`last(/HashiCorp Vault by HTTP/vault.metrics.process.uptime)<10m`|Info|**Manual close**: Yes|
|Vault: High frequency of leadership setup failures|There have been more than {$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN} Vault leadership setup failures in the past 1h.|`(max(/HashiCorp Vault by HTTP/vault.metrics.core.leadership.setup_failed,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.leadership.setup_failed,1h))>{$VAULT.LEADERSHIP.SETUP.FAILED.MAX.WARN}`|Average||
|Vault: High frequency of leadership losses|There have been more than {$VAULT.LEADERSHIP.LOSSES.MAX.WARN} Vault leadership losses in the past 1h.|`(max(/HashiCorp Vault by HTTP/vault.metrics.core.leadership_lost,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.leadership_lost,1h))>{$VAULT.LEADERSHIP.LOSSES.MAX.WARN}`|Average||
|Vault: High frequency of leadership step downs|There have been more than {$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN} Vault leadership step downs in the past 1h.|`(max(/HashiCorp Vault by HTTP/vault.metrics.core.step_down,1h)-min(/HashiCorp Vault by HTTP/vault.metrics.core.step_down,1h))>{$VAULT.LEADERSHIP.STEPDOWNS.MAX.WARN}`|Average||

### LLD rule Storage metrics discovery

|Name|Description|Type|Key and additional info|
|----|-----------|----|-----------------------|
|Storage metrics discovery|Storage backend metrics discovery.
|Dependent item|vault.storage.discovery|

### Item prototypes for Storage metrics discovery

|Name|Description|Type|Key and additional info|
|----|-----------|----|-----------------------|
|Vault: Storage [{#STORAGE}] {#OPERATION} ops, rate|Number of {#OPERATION} operations against the {#STORAGE} storage backend.
|Dependent item|vault.metrics.storage.rate[{#STORAGE}, {#OPERATION}]**Preprocessing**
Prometheus pattern: `VALUE({#PATTERN_C})`
⛔️Custom on fail: Discard value
Mountpoint metrics discovery.
|Dependent item|vault.mountpoint.discovery|

### Item prototypes for Mountpoint metrics discovery

|Name|Description|Type|Key and additional info|
|----|-----------|----|-----------------------|
|Vault: Rollback attempt [{#MOUNTPOINT}] ops, rate|Number of operations to perform a rollback operation on the given mount point.
|Dependent item|vault.metrics.rollback.attempt.rate[{#MOUNTPOINT}]**Preprocessing**
Prometheus pattern: `VALUE({#PATTERN_C})`
⛔️Custom on fail: Discard value
Number of operations to dispatch a rollback operation to a backend, and for that backend to process it. Rollback operations are automatically scheduled to clean up partial errors.
|Dependent item|vault.metrics.route.rollback.rate[{#MOUNTPOINT}]**Preprocessing**
Prometheus pattern: `VALUE({#PATTERN_C})`
⛔️Custom on fail: Discard value
Discovery for WAL metrics.
|Dependent item|vault.wal.discovery|

### Item prototypes for WAL metrics discovery

|Name|Description|Type|Key and additional info|
|----|-----------|----|-----------------------|
|Vault: Delete WALs, count{#SINGLETON}|Time taken to delete a Write Ahead Log (WAL).
|Dependent item|vault.metrics.wal.deletewals[{#SINGLETON}]**Preprocessing**
Prometheus pattern: `VALUE(vault_wal_deletewals_count)`
⛔️Custom on fail: Discard value
Number of Write Ahead Logs (WAL) deleted during each garbage collection run.
|Dependent item|vault.metrics.wal.gc.deleted[{#SINGLETON}]**Preprocessing**
Prometheus pattern: `VALUE(vault_wal_gc_deleted)`
⛔️Custom on fail: Discard value
Total number of Write Ahead Logs (WAL) on disk.
|Dependent item|vault.metrics.wal.gc.total[{#SINGLETON}]**Preprocessing**
Prometheus pattern: `VALUE(vault_wal_gc_total)`
⛔️Custom on fail: Discard value
Time taken to load a Write Ahead Log (WAL).
|Dependent item|vault.metrics.wal.loadWAL[{#SINGLETON}]**Preprocessing**
Prometheus pattern: `VALUE(vault_wal_loadWAL_count)`
⛔️Custom on fail: Discard value
Time taken to persist a Write Ahead Log (WAL).
|Dependent item|vault.metrics.wal.persistwals[{#SINGLETON}]**Preprocessing**
Prometheus pattern: `VALUE(vault_wal_persistwals_count)`
⛔️Custom on fail: Discard value
Time taken to flush a ready Write Ahead Log (WAL) to storage.
|Dependent item|vault.metrics.wal.flushready[{#SINGLETON}]**Preprocessing**
Prometheus pattern: `VALUE(vault_wal_flushready_count)`
⛔️Custom on fail: Discard value
Discovery for replication metrics.
|Dependent item|vault.replication.discovery|

### Item prototypes for Replication metrics discovery

|Name|Description|Type|Key and additional info|
|----|-----------|----|-----------------------|
|Vault: Stream WAL missing guard, count{#SINGLETON}|Number of incidences where the starting Merkle Tree index used to begin streaming WAL entries is not matched/found.
|Dependent item|vault.metrics.logshipper.streamWALs.missing_guard[{#SINGLETON}]**Preprocessing**
Prometheus pattern: `VALUE(logshipper_streamWALs_missing_guard)`
⛔️Custom on fail: Discard value
Number of incidences where the starting Merkle Tree index used to begin streaming WAL entries is matched/found.
|Dependent item|vault.metrics.logshipper.streamWALs.guard_found[{#SINGLETON}]**Preprocessing**
Prometheus pattern: `VALUE(logshipper_streamWALs_guard_found)`
⛔️Custom on fail: Discard value
The last committed index in the Merkle Tree.
|Dependent item|vault.metrics.replication.merkle.commit_index[{#SINGLETON}]**Preprocessing**
Prometheus pattern: `VALUE(replication_merkle_commit_index)`
⛔️Custom on fail: Discard value
The index of the last WAL.
|Dependent item|vault.metrics.replication.wal.last_wal[{#SINGLETON}]**Preprocessing**
Prometheus pattern: `VALUE(replication_wal_last_wal)`
⛔️Custom on fail: Discard value
The index of the last DR WAL.
|Dependent item|vault.metrics.replication.wal.last_dr_wal[{#SINGLETON}]**Preprocessing**
Prometheus pattern: `VALUE(replication_wal_last_dr_wal)`
⛔️Custom on fail: Discard value
The index of the last Performance WAL.
|Dependent item|vault.metrics.replication.wal.last_performance_wal[{#SINGLETON}]**Preprocessing**
Prometheus pattern: `VALUE(replication_wal_last_performance_wal)`
⛔️Custom on fail: Discard value
The index of the last remote WAL.
|Dependent item|vault.metrics.replication.fsm.last_remote_wal[{#SINGLETON}]**Preprocessing**
Prometheus pattern: `VALUE(replication_fsm_last_remote_wal)`
⛔️Custom on fail: Discard value
Tokens metrics discovery.
|Dependent item|vault.tokens.discovery|

### Item prototypes for Token metrics discovery

|Name|Description|Type|Key and additional info|
|----|-----------|----|-----------------------|
|Vault: Token [{#TOKEN_NAME}] error|Token lookup error text.
|Dependent item|vault.token_via_accessor.error["{#ACCESSOR}"]**Preprocessing**
JSON Path: `$.[?(@.accessor == "{#ACCESSOR}")].error.first()`
Discard unchanged with heartbeat: `1h`
Whether the token has a TTL.
|Dependent item|vault.token_via_accessor.has_ttl["{#ACCESSOR}"]**Preprocessing**
JSON Path: `$.[?(@.accessor == "{#ACCESSOR}")].has_ttl.first()`
Discard unchanged with heartbeat: `1h`
The TTL period of the token.
|Dependent item|vault.token_via_accessor.ttl["{#ACCESSOR}"]**Preprocessing**
JSON Path: `$.[?(@.accessor == "{#ACCESSOR}")].ttl.first()`
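
The token item prototypes above each select one field from a JSON array by matching on the accessor. As a rough illustration of what a JSONPath such as `$.[?(@.accessor == "{#ACCESSOR}")].ttl.first()` picks out, here is a minimal Python sketch. The field names come from the JSONPath expressions; the sample values, the helper names `first_field` and `ttl_severity`, and the way the TTL is compared against the `3d`/`7d` macro defaults are assumptions made for illustration, not the template's actual script or trigger logic.

```python
import json

# Hypothetical payload: field names (accessor, error, has_ttl, ttl) are taken
# from the JSONPath expressions above; the values are invented.
SAMPLE = json.loads("""
[
  {"accessor": "acc-1", "error": "", "has_ttl": 1, "ttl": 259200},
  {"accessor": "acc-2", "error": "lookup failed", "has_ttl": 0, "ttl": -1}
]
""")

SECONDS_PER_DAY = 86400


def first_field(tokens, accessor, field):
    """Return `field` from the first entry whose accessor matches, else None.

    Approximates $.[?(@.accessor == "<accessor>")].<field>.first().
    """
    for entry in tokens:
        if entry.get("accessor") == accessor and field in entry:
            return entry[field]
    return None


def ttl_severity(ttl_seconds, warn_days=7, crit_days=3):
    """Classify a TTL against thresholds shaped like the 7d/3d macro defaults.

    Assumption: a TTL strictly below a threshold raises that severity.
    """
    if ttl_seconds < crit_days * SECONDS_PER_DAY:
        return "crit"
    if ttl_seconds < warn_days * SECONDS_PER_DAY:
        return "warn"
    return "ok"


if __name__ == "__main__":
    ttl = first_field(SAMPLE, "acc-1", "ttl")
    print(ttl, ttl_severity(ttl))
```

With the sample data, the entry for `acc-1` has a TTL of exactly three days, so this sketch classifies it as a warning rather than critical.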