Today I want to show a problem we discovered on one of our Zabbix servers, involving the Housekeeper process.
Housekeeper (Zabbix documentation)
The Housekeeper is a periodical process, executed by Zabbix server. The process removes outdated information and information deleted by user.
Most of us know the two parameters in zabbix_server.conf that limit the behavior of this process:
HousekeepingFrequency
How often Zabbix will perform the housekeeping procedure (in hours).
MaxHousekeeperDelete
No more than 'MaxHousekeeperDelete' rows will be deleted per one task in one housekeeping cycle.
Most of the time nobody pays much attention to these settings, but today we hit a major slowdown on our Zabbix server, and the problem came from the Housekeeper.
Today we came to understand the logic behind the Housekeeper process much better; I will try to explain it below.
A few days ago we removed 3 item prototypes from a template that was linked to 60 hosts. On each host, every prototype had expanded to about 100 real items, so roughly 300 real items per host.
So how many orphaned items do we have? 3 (item prototypes) * 60 (hosts) * 100 (expanded items per prototype) = 18000!
But the really strange behaviour is in MaxHousekeeperDelete. If we set MaxHousekeeperDelete=500, Zabbix will try to remove 500 history values per orphaned item.
So what happened?
At the start, the Housekeeper process will try to remove 18000 × 500 = 9,000,000 history values!
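The arithmetic above can be reproduced with a few lines of Python. This is only a sketch using the figures from this post. The variable names are made up for illustration, and the result is an upper bound that assumes every orphaned item really has at least 500 history values to delete.

```python
# Figures taken from the post; the variable names are illustrative only.
item_prototypes = 3            # item prototypes removed from the template
hosts = 60                     # hosts linked to that template
items_per_prototype = 100      # approx. real items created per prototype on each host
max_housekeeper_delete = 500   # MaxHousekeeperDelete in zabbix_server.conf

# Every discovered item on every host becomes an orphan to clean up.
orphaned_items = item_prototypes * hosts * items_per_prototype

# Assumption: the limit applies per orphaned item, as observed in the post.
max_rows_first_cycle = orphaned_items * max_housekeeper_delete

print(f"orphaned items: {orphaned_items}")
print(f"upper bound on history values in one cycle: {max_rows_first_cycle}")
```

Changing max_housekeeper_delete in the sketch shows how the limit scales the amount of work: it multiplies with the number of orphaned items, it does not cap the total for the whole cycle.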
For example, if we look at the Zabbix server log:
————–
housekeeper [deleted 68 hist/trends, 4522000 items, 0 events, 0 sessions, 0 alarms, 0 audit items in 2649.273207 sec, idle 1 hour(s)]
————–
The “4522000 items” figure is what a single Housekeeper run deleted for the orphaned items, and that run took 2649 seconds (about 44 minutes).
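To keep an eye on this over time, the counters can be pulled out of the log line with a small script. The sketch below is not part of Zabbix. The regular expression is written against the exact line shown above, and the function name is hypothetical; other Zabbix versions may format this line differently.

```python
import re

# The log line quoted in this post.
LINE = ("housekeeper [deleted 68 hist/trends, 4522000 items, 0 events, "
        "0 sessions, 0 alarms, 0 audit items in 2649.273207 sec, idle 1 hour(s)]")

# Assumption: the pattern matches the format shown above, nothing more.
PATTERN = re.compile(
    r"housekeeper \[deleted (?P<hist_trends>\d+) hist/trends, "
    r"(?P<items>\d+) items, .* in (?P<seconds>\d+(?:\.\d+)?) sec"
)

def parse_housekeeper_line(line):
    """Return the counters from one housekeeper log line, or None if it does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return {
        "hist_trends": int(match["hist_trends"]),
        "items": int(match["items"]),
        "seconds": float(match["seconds"]),
    }

stats = parse_housekeeper_line(LINE)
rows_per_second = stats["items"] / stats["seconds"]
print(stats)
print(f"about {rows_per_second:.0f} rows per second")
```

A run that reports millions of "items" and takes thousands of seconds, like the one above, is the signal to look for.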
To discuss this strange logic we have opened an official support ticket with Zabbix