What takes disk space

In today’s class, let’s talk about where the disk space goes: which items and host objects consume the most disk space.

The post will cover things like:
Biggest tables in the database
Biggest data coming into the instance right now
Biggest data inside one partition of a DB table
Hosts and items which consume the most disk space

Biggest tables

In general, the leading tables are:

history
history_uint
history_str
history_text
history_log
events

‘history_uint’ stores integers, while ‘history’ stores decimal numbers.
‘history_str’, ‘history_text’, and ‘history_log’ store textual data.
The ‘events’ table stores problem events, internal events, agent auto-registration events, and discovery events.
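
Which history table an item writes to is determined by the item’s value type. As a quick orientation, here is a small sketch that counts the items per value type (the value_type-to-table mapping in the comment follows the Zabbix schema):

SELECT value_type, COUNT(*) AS items
FROM items
GROUP BY value_type
ORDER BY value_type;

-- value_type: 0 = history (float), 1 = history_str, 2 = history_log,
--             3 = history_uint, 4 = history_text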

Have a look yourself at which tables take the most space in your database. On MySQL:

SELECT table_name,
       table_rows,
       data_length,
       index_length,
       round(((data_length + index_length) / 1024 / 1024 / 1024),2) "Size in GB"
FROM information_schema.tables
WHERE table_schema = "zabbix"
ORDER BY round(((data_length + index_length) / 1024 / 1024 / 1024),2) DESC
LIMIT 8;

On PostgreSQL:

SELECT *, pg_size_pretty(total_bytes) AS total,
       pg_size_pretty(index_bytes) AS "index",
       pg_size_pretty(toast_bytes) AS toast,
       pg_size_pretty(table_bytes) AS "table"
FROM (SELECT *, total_bytes - index_bytes - coalesce(toast_bytes, 0) AS table_bytes
      FROM (SELECT c.oid,
                   nspname AS table_schema,
                   relname AS table_name,
                   c.reltuples AS row_estimate,
                   pg_total_relation_size(c.oid) AS total_bytes,
                   pg_indexes_size(c.oid) AS index_bytes,
                   pg_total_relation_size(reltoastrelid) AS toast_bytes
            FROM pg_class c
            LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
            WHERE relkind = 'r') a) a
ORDER BY total_bytes DESC
LIMIT 8;

Detect big data coming into the instance right now

Analyze the ‘history_log’ table for the last 30 minutes. On MySQL:

SELECT hosts.host,items.itemid,items.key_,
COUNT(history_log.itemid)  AS 'count', AVG(LENGTH(history_log.value)) AS 'avg size',
(COUNT(history_log.itemid) * AVG(LENGTH(history_log.value))) AS 'Count x AVG'
FROM history_log 
JOIN items ON (items.itemid=history_log.itemid)
JOIN hosts ON (hosts.hostid=items.hostid)
WHERE clock > UNIX_TIMESTAMP(NOW() - INTERVAL 30 MINUTE)
GROUP BY hosts.host,history_log.itemid
ORDER BY 6 DESC
LIMIT 1\G

With PostgreSQL:

SELECT hosts.host,history_log.itemid,items.key_,
COUNT(history_log.itemid) AS "count", AVG(LENGTH(history_log.value))::NUMERIC(10,2) AS "avg size",
(COUNT(history_log.itemid) * AVG(LENGTH(history_log.value)))::NUMERIC(10,2) AS "Count x AVG"
FROM history_log 
JOIN items ON (items.itemid=history_log.itemid)
JOIN hosts ON (hosts.hostid=items.hostid)
WHERE clock > EXTRACT(epoch FROM NOW()-INTERVAL '30 MINUTE')
GROUP BY hosts.host,history_log.itemid,items.key_
ORDER BY 6 DESC
LIMIT 5
\gx

Re-run the same query, but replace ‘history_log’ (in all places) with ‘history_text’ or ‘history_str’.
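
For example, with ‘history_text’ the MySQL version becomes (same query, only the table name changes):

SELECT hosts.host,items.itemid,items.key_,
COUNT(history_text.itemid) AS 'count', AVG(LENGTH(history_text.value)) AS 'avg size',
(COUNT(history_text.itemid) * AVG(LENGTH(history_text.value))) AS 'Count x AVG'
FROM history_text
JOIN items ON (items.itemid=history_text.itemid)
JOIN hosts ON (hosts.hostid=items.hostid)
WHERE clock > UNIX_TIMESTAMP(NOW() - INTERVAL 30 MINUTE)
GROUP BY hosts.host,history_text.itemid
ORDER BY 6 DESC
LIMIT 1\G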

Which hosts consume the most space

This is a very heavy query, so we go back one day and analyze only 6 minutes of that data:

SELECT ho.hostid, ho.name, count(*) AS records, 
(count(*)* (SELECT AVG_ROW_LENGTH FROM information_schema.tables 
WHERE TABLE_NAME = 'history_text' and TABLE_SCHEMA = 'zabbix')/1024/1024) AS 'Total size average (Mb)', 
sum(length(history_text.value))/1024/1024 + sum(length(history_text.clock))/1024/1024 + sum(length(history_text.ns))/1024/1024 + sum(length(history_text.itemid))/1024/1024 AS 'history_text Column Size (Mb)'
FROM history_text
LEFT OUTER JOIN items i on history_text.itemid = i.itemid 
LEFT OUTER JOIN hosts ho on i.hostid = ho.hostid 
WHERE ho.status IN (0,1)
AND clock > UNIX_TIMESTAMP(now() - INTERVAL 1 DAY - INTERVAL 6 MINUTE)
AND clock < UNIX_TIMESTAMP(now() - INTERVAL 1 DAY)
GROUP BY ho.hostid
ORDER BY 4 DESC
LIMIT 5\G

If the “6-minute query” completes in a reasonable time frame, try “INTERVAL 60 MINUTE”.
If “INTERVAL 60 MINUTE” works well, try “INTERVAL 600 MINUTE”.
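
For instance, for the 60-minute window only the clock conditions in the WHERE clause change:

WHERE ho.status IN (0,1)
AND clock > UNIX_TIMESTAMP(now() - INTERVAL 1 DAY - INTERVAL 60 MINUTE)
AND clock < UNIX_TIMESTAMP(now() - INTERVAL 1 DAY)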

Analyze at partition level (MySQL)

On MySQL, if database table partitioning is enabled, we can list the biggest partitions directly on the filesystem, inside the MySQL data directory:

ls -lh history_log#*

It will print something like:

-rw-r-----. 1 mysql mysql  44M Jan 24 20:23 history_log#p#p2021_02w.ibd
-rw-r-----. 1 mysql mysql  24M Jan 24 21:20 history_log#p#p2021_03w.ibd
-rw-r-----. 1 mysql mysql 128K Jan 11 00:59 history_log#p#p2021_04w.ibd
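
If you are not sure where these table files live, ask MySQL for its data directory first (a minimal sketch; the ‘zabbix’ subdirectory is named after the schema, and /var/lib/mysql is just the common default):

# ask MySQL where its data files are stored
mysql -e "SELECT @@datadir;"
# the table files sit in a subdirectory named after the schema
cd /var/lib/mysql/zabbix
ls -lh history_log#*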

From the ‘ls’ output above, we can take the partition name ‘p2021_02w’ and use it in a query:

SELECT ho.hostid, ho.name, count(*) AS records, 
(count(*)* (SELECT AVG_ROW_LENGTH FROM information_schema.tables 
WHERE TABLE_NAME = 'history_log' and TABLE_SCHEMA = 'zabbix')/1024/1024) AS 'Total size average (Mb)', 
sum(length(history_log.value))/1024/1024 + 
sum(length(history_log.clock))/1024/1024 +
sum(length(history_log.ns))/1024/1024 + 
sum(length(history_log.itemid))/1024/1024 AS 'history_log Column Size (Mb)'
FROM history_log PARTITION (p2021_02w)
LEFT OUTER JOIN items i on history_log.itemid = i.itemid 
LEFT OUTER JOIN hosts ho on i.hostid = ho.hostid 
WHERE ho.status IN (0,1)
GROUP BY ho.hostid
ORDER BY 4 DESC
LIMIT 10;

You can reproduce a similar scenario by listing:

ls -lh history_text#*
ls -lh history_str#*

Free up disk space (MySQL)

Deleting a host in the GUI will not free up disk space on MySQL. It only marks rows as free so that new data can be inserted in their place. If you want to really free up disk space, rebuild the partition. First, list all partition names:

SHOW CREATE TABLE history\G
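
Alternatively, the partition names together with their approximate row counts can be listed from ‘information_schema’ (a minimal sketch):

SELECT PARTITION_NAME, TABLE_ROWS
FROM information_schema.partitions
WHERE TABLE_SCHEMA = 'zabbix'
  AND TABLE_NAME = 'history';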

To rebuild a partition:

ALTER TABLE history REBUILD PARTITION p202101160000;

Free up disk space (PostgreSQL)

On PostgreSQL, the autovacuum process is responsible for reclaiming the space held by dead rows. To check whether a vacuum has run lately, run:

SELECT schemaname, relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_all_tables
WHERE n_dead_tup > 0
ORDER BY n_dead_tup DESC;

In the output, look at ‘n_dead_tup’: it is the number of dead tuples, i.e. rows that were deleted or updated but whose space has not been reclaimed yet.
If the last autovacuum has not occurred in the last 10 days, that is bad, and we have to apply different settings. We can increase the vacuum priority by setting:

vacuum_cost_page_miss = 10
vacuum_cost_page_dirty = 20
autovacuum_vacuum_threshold = 50
autovacuum_vacuum_scale_factor = 0.01
autovacuum_vacuum_cost_delay = 20ms
autovacuum_vacuum_cost_limit = 3000
autovacuum_max_workers = 6
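
These parameters belong in postgresql.conf. As a minimal sketch, a single setting can also be applied without editing the file (ALTER SYSTEM persists it to postgresql.auto.conf, and pg_reload_conf() makes the autovacuum settings take effect):

ALTER SYSTEM SET autovacuum_vacuum_scale_factor = 0.01;
SELECT pg_reload_conf();

-- a one-off manual cleanup of a single table is also possible:
VACUUM history_text;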

Alright. That is it for today.