Adding zpool monitoring support to Telegraf

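Telegraf still has no built-in support for monitoring zpool health, which is hard to justify. As a workaround, use inputs.exec with a Python script to collect the data periodically, write it to InfluxDB, and display it in Grafana.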

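#!/usr/bin/env python3
"""Print `zpool list` output as JSON for Telegraf's inputs.exec plugin."""
import json
from subprocess import check_output

# Column order of `zpool list -Hp` on the ZFS version used here.
columns = ["NAME", "SIZE", "ALLOC", "FREE", "EXPANDSZ", "FRAG", "CAP", "DEDUP", "HEALTH", "ALTROOT"]
# Map each health state to a number so it can be stored as a numeric field.
health = {'ONLINE': 0, 'DEGRADED': 11, 'OFFLINE': 21, 'UNAVAIL': 22, 'FAULTED': 23, 'REMOVED': 24}

# -H: no header, tab-separated columns; -p: exact numbers instead of human-readable ones.
stdout = check_output(["/sbin/zpool", "list", "-Hp"], encoding='UTF-8').splitlines()
# One dict per pool; blank lines are skipped.
parsed_stdout = [dict(zip(columns, line.split('\t'))) for line in stdout if line]

for pool in parsed_stdout:
    for item in pool:
        if item in ["SIZE", "ALLOC", "FREE", "FRAG", "CAP"]:
            pool[item] = int(pool[item])
        elif item == "DEDUP":
            pool[item] = float(pool[item])
        elif item == "HEALTH":
            # Raises KeyError for a state that is not listed in `health`.
            pool[item] = health[pool[item]]

print(json.dumps(parsed_stdout))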
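
The parsing step can also be pulled into a function that takes the raw text, so it can be tried without a ZFS host. The following is only a sketch under assumptions, not part of the original script: `parse_zpool_list`, the pool name `tank` and the sample line are made up for illustration, and dropping `-` placeholders and mapping an unlisted health state to -1 are choices made here.

```python
import json

COLUMNS = ["NAME", "SIZE", "ALLOC", "FREE", "EXPANDSZ", "FRAG", "CAP", "DEDUP", "HEALTH", "ALTROOT"]
HEALTH = {"ONLINE": 0, "DEGRADED": 11, "OFFLINE": 21, "UNAVAIL": 22, "FAULTED": 23, "REMOVED": 24}
INT_COLUMNS = ("SIZE", "ALLOC", "FREE", "FRAG", "CAP")

# Hypothetical sample of one `zpool list -Hp` line (tab-separated).
SAMPLE = "tank\t23914377904128\t16203183869952\t7711194034176\t-\t6\t67\t1.00\tONLINE\t-\n"


def parse_zpool_list(text):
    """Turn tab-separated `zpool list -Hp` text into a list of dicts."""
    pools = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        pool = dict(zip(COLUMNS, line.split("\t")))
        for key in INT_COLUMNS:
            if pool.get(key, "-") == "-":
                # Assumption: "-" means "no value", so the field is omitted.
                pool.pop(key, None)
            else:
                pool[key] = int(pool[key])
        pool["DEDUP"] = float(pool["DEDUP"])
        # Assumption: -1 marks a state that is missing from HEALTH.
        pool["HEALTH"] = HEALTH.get(pool["HEALTH"], -1)
        pools.append(pool)
    return pools


pools = parse_zpool_list(SAMPLE)
print(json.dumps(pools))
```

Compared with the script, this variant does not abort on an unexpected health state or on a numeric column that holds `-`, at the cost of introducing a sentinel value that dashboards need to know about.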

Add the following to telegraf.conf:

[[inputs.exec]]
  interval = "1h"
  commands = ["python3 /etc/telegraf/telegraf.d/zpool.py"]
  timeout = "10s"
  data_format = "json"
  name_suffix = "_zpool"

Test the configuration:

telegraf -test -input-filter exec
2019-05-09T05:47:40Z I! Starting Telegraf 1.10.3
2019-05-09T05:47:40Z I! Using config file: /etc/telegraf/telegraf.conf
> exec_zpool,host=r730xd.xbits.net ALLOC=16203183869952,CAP=67,DEDUP=1,FRAG=6,FREE=7711194034176,HEALTH=0,SIZE=23914377904128 1557380861000000000
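
The test output contains only the numeric columns: by default Telegraf's JSON data format ignores string values, so NAME, EXPANDSZ and ALTROOT are dropped. The sketch below mimics that filtering in Python to show which keys survive. It is an illustration only: `numeric_fields` and the record (including the pool name `tank`) are hypothetical, and this is not Telegraf's own code.

```python
import json

# Hypothetical record, shaped like one element of the script's JSON output.
record = json.loads(
    '{"NAME": "tank", "SIZE": 23914377904128, "ALLOC": 16203183869952,'
    ' "FREE": 7711194034176, "EXPANDSZ": "-", "FRAG": 6, "CAP": 67,'
    ' "DEDUP": 1.0, "HEALTH": 0, "ALTROOT": "-"}'
)


def numeric_fields(obj):
    """Keep only numeric values, dropping strings (and booleans)."""
    return {
        key: value
        for key, value in obj.items()
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    }


fields = numeric_fields(record)
print(sorted(fields))
```

On a host with more than one pool, the points would then differ only in their field values. Telegraf's JSON format has a `tag_keys` option that could turn NAME into a tag (for example `tag_keys = ["NAME"]` in the inputs.exec section); that setting is not part of the configuration shown here.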

Restart telegraf for the change to take effect:

systemctl restart telegraf

Query InfluxDB to check that data is arriving:

> select * from exec_zpool
name: exec_zpool
time                ALLOC          CAP DEDUP FRAG FREE          HEALTH SIZE           host
----                -----          --- ----- ---- ----          ------ ----           ----
1557380520000000000 16203202363392 67  1     6    7711175540736 0      23914377904128 r730xd.xbits.net
1557380580000000000 16203273867264 67  1     6    7711104036864 0      23914377904128 r730xd.xbits.net
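
The HEALTH column holds the numeric code assigned by the script, so anything that reads the data back needs the reverse mapping. A minimal sketch, assuming the same code table as the script; `describe_health` and the "UNKNOWN" fallback are hypothetical names chosen for illustration:

```python
# Same mapping as in the collection script.
HEALTH = {"ONLINE": 0, "DEGRADED": 11, "OFFLINE": 21, "UNAVAIL": 22, "FAULTED": 23, "REMOVED": 24}
CODE_TO_STATE = {code: state for state, code in HEALTH.items()}


def describe_health(code):
    """Return the zpool state name for a stored HEALTH value."""
    # int() accepts values that come back as floats, such as 11.0.
    return CODE_TO_STATE.get(int(code), "UNKNOWN")


print(describe_health(0))
print(describe_health(11.0))
```

Any non-zero value means the pool is not ONLINE, which makes `HEALTH > 0` a simple alert condition in Grafana.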
  • Last modified: 2019/08/13 10:47
  • by mrco