View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000329 | My infrastructure | General | public | 2025-10-05 12:58 | 2025-10-09 12:02 |
Reporter | dvl | Assigned To | dvl | ||
Priority | normal | Severity | minor | Reproducibility | have not tried |
Status | assigned | Resolution | open | ||
Summary | 0000329: Add Nagios check for copy jobs | ||||
Description | Let's make sure jobs are copied to bacula-sd-03 from bacula-sd-04.
Here's a possible starting point:
bacula=# select jobid, job, endtime, priorjobid, realstarttime, realstarttime from job where priorjobid != 0 order by endtime desc limit 600;
jobid | job | endtime | priorjobid | realstarttime | realstarttime
--------+---------------------------------------------------------------------+---------------------+------------+---------------------+---------------------
379831 | BackupCatalog.2025-10-04_21.57.17_16 | 2025-09-07 07:42:45 | 378962 | 2025-10-04 22:17:44 | 2025-10-04 22:17:44
379819 | zuul_jail_snapshots.2025-10-04_21.57.17_04 | 2025-09-07 06:52:10 | 378961 | 2025-10-04 22:02:09 | 2025-10-04 22:02:09
379803 | r730-03_jail_snapshots.2025-10-04_21.57.15_48 | 2025-09-07 03:38:33 | 378950 | 2025-10-04 21:59:37 | 2025-10-04 21:59:37
379834 | r730-01_snapshots.2025-10-04_21.57.17_19 | 2025-09-07 03:37:10 | 378947 | 2025-10-04 22:25:25 | 2025-10-04 22:25:25
379783 | unifi.2025-10-04_21.57.15_28 | 2025-09-07 03:30:07 | 378948 | 2025-10-04 21:57:49 | 2025-10-04 21:57:49
379825 | r730-01_jail_snapshots.2025-10-04_21.57.17_10 | 2025-09-07 03:29:13 | 378946 | 2025-10-04 22:06:55 | 2025-10-04 22:06:55
379787 | tallboy_Papers_Jail.2025-10-04_21.57.15_32 | 2025-09-07 03:27:26 | 378956 | 2025-10-04 21:57:58 | 2025-10-04 21:57:58
379811 | x8dtu_jail_snapshots.2025-10-04_21.57.16_56 | 2025-09-07 03:27:20 | 378959 | 2025-10-04 22:00:39 | 2025-10-04 22:00:39
379821 | dbclone_databases.2025-10-04_21.57.17_06 | 2025-09-07 03:26:51 | 378938 | 2025-10-04 22:02:22 | 2025-10-04 22:02:22
379801 | tallboy_jail_snapshots.2025-10-04_21.57.15_46 | 2025-09-07 03:22:27 | 378955 | 2025-10-04 21:59:23 | 2025-10-04 21:59:23
379815 | r720-02_jail_snapshots.2025-10-04_21.57.17_00 | 2025-09-07 03:21:13 | 378944 | 2025-10-04 22:01:21 | 2025-10-04 22:01:21
379769 | zuul_basic.2025-10-04_21.57.15_14 | 2025-09-07 03:13:50 | 378960 | 2025-10-04 21:57:35 | 2025-10-04 21:57:35
379779 | tallboy_home.2025-10-04_21.57.15_24 | 2025-09-07 03:13:01 | 378954 | 2025-10-04 21:57:42 | 2025-10-04 21:57:42
379753 | x8dtu_basic.2025-10-04_21.57.14_58 | 2025-09-07 03:09:24 | 378958 | 2025-10-04 21:57:25 | 2025-10-04 21:57:25
379795 | mydev_home_dir.2025-10-04_21.57.15_40 | 2025-09-07 03:09:24 | 378942 | 2025-10-04 21:58:39 | 2025-10-04 21:58:39
379726 | tallboy_Papers_Jail_PostgreSQL_Configuration.2025-10-04_21.52.31_29 | 2025-09-07 03:08:20 | 378957 | 2025-10-04 21:52:32 | 2025-10-04 21:52:32
379741 | fileserver_basic.2025-10-04_21.57.14_46 | 2025-09-07 03:07:51 | 378939 | 2025-10-04 21:57:19 | 2025-10-04 21:57:19
379799 | svn_everything.2025-10-04_21.57.15_44 | 2025-09-07 03:07:32 | 378952 | 2025-10-04 21:59:22 | 2025-10-04 21:59:22
379749 | tallboy_basic.2025-10-04_21.57.14_54 | 2025-09-07 03:07:03 | 378953 | 2025-10-04 21:57:23 | 2025-10-04 21:57:23
379775 | r720-02_basic.2025-10-04_21.57.15_20 | 2025-09-07 03:05:52 | 378943 | 2025-10-04 21:57:41 | 2025-10-04 21:57:41
379735 | mydev_basic.2025-10-04_21.57.14_40 | 2025-09-07 03:05:35 | 378941 | 2025-10-04 21:57:17 | 2025-10-04 21:57:17
379743 | r730-03_basic.2025-10-04_21.57.14_48 | 2025-09-07 03:05:33 | 378949 | 2025-10-04 21:57:20 | 2025-10-04 21:57:20
379763 | r730-01_basic.2025-10-04_21.57.15_08 | 2025-09-07 03:05:21 | 378945 | 2025-10-04 21:57:32 | 2025-10-04 21:57:32
379765 | gw01_basic.2025-10-04_21.57.15_10 | 2025-09-07 03:05:13 | 378940 | 2025-10-04 21:57:32 | 2025-10-04 21:57:32
379791 | repo-svn-snapshots.2025-10-04_21.57.15_36 | 2025-09-07 03:05:07 | 378935 | 2025-10-04 21:58:14 | 2025-10-04 21:58:14
379739 | svn_basic.2025-10-04_21.57.14_44 | 2025-09-07 03:05:05 | 378951 | 2025-10-04 21:57:18 | 2025-10-04 21:57:18
379757 | repo-git-snapshots.2025-10-04_21.57.15_02 | 2025-09-07 03:04:20 | 378936 | 2025-10-04 21:57:28 | 2025-10-04 21:57:28
379729 | ansible.2025-10-04_21.57.13_34 | 2025-09-07 03:04:05 | 378937 | 2025-10-04 21:57:15 | 2025-10-04 21:57:15
379829 | BackupCatalog.2025-10-04_21.57.17_14 | 2025-08-03 07:35:22 | 377980 | 2025-10-04 22:12:43 | 2025-10-04 22:12:43
379817 | zuul_jail_snapshots.2025-10-04_21.57.17_02 | 2025-08-03 06:45:14 | 377979 | 2025-10-04 22:01:41 | 2025-10-04 22:01:41
379805 | r730-03_jail_snapshots.2025-10-04_21.57.16_50 | 2025-08-03 03:37:45 | 377968 | 2025-10-04 21:59:53 | 2025-10-04 21:59:53
379833 | r730-01_snapshots.2025-10-04_21.57.17_18 | 2025-08-03 03:36:08 | 377965 | 2025-10-04 22:20:33 | 2025-10-04 22:20:33
379781 | unifi.2025-10-04_21.57.15_26 | 2025-08-03 03:30:38 | 377966 | 2025-10-04 21:57:46 | 2025-10-04 21:57:46
379827 | r730-01_jail_snapshots.2025-10-04_21.57.17_12 | 2025-08-03 03:29:54 | 377964 | 2025-10-04 22:09:18 | 2025-10-04 22:09:18
379809 | x8dtu_jail_snapshots.2025-10-04_21.57.16_54 | 2025-08-03 03:27:21 | 377977 | 2025-10-04 22:00:23 | 2025-10-04 22:00:23
379785 | tallboy_Papers_Jail.2025-10-04_21.57.15_30 | 2025-08-03 03:26:26 | 377974 | 2025-10-04 21:57:50 | 2025-10-04 21:57:50
379823 | dbclone_databases.2025-10-04_21.57.17_08 | 2025-08-03 03:25:05 | 377956 | 2025-10-04 22:06:09 | 2025-10-04 22:06:09
379797 | tallboy_jail_snapshots.2025-10-04_21.57.15_42 | 2025-08-03 03:20:31 | 377973 | 2025-10-04 21:58:55 | 2025-10-04 21:58:55
379813 | r720-02_jail_snapshots.2025-10-04_21.57.16_58 | 2025-08-03 03:19:22 | 377962 | 2025-10-04 22:01:10 | 2025-10-04 22:01:10
379771 | zuul_basic.2025-10-04_21.57.15_16 | 2025-08-03 03:14:31 | 377978 | 2025-10-04 21:57:36 | 2025-10-04 21:57:36
379777 | tallboy_home.2025-10-04_21.57.15_22 | 2025-08-03 03:12:37 | 377972 | 2025-10-04 21:57:41 | 2025-10-04 21:57:41
379755 | x8dtu_basic.2025-10-04_21.57.15_00 | 2025-08-03 03:10:24 | 377976 | 2025-10-04 21:57:27 | 2025-10-04 21:57:27
379793 | mydev_home_dir.2025-10-04_21.57.15_38 | 2025-08-03 03:09:09 | 377960 | 2025-10-04 21:58:21 | 2025-10-04 21:58:21
379724 | tallboy_Papers_Jail_PostgreSQL_Configuration.2025-10-04_21.44.39_26 | 2025-08-03 03:08:47 | 377975 | 2025-10-04 21:44:42 | 2025-10-04 21:44:42
379807 | svn_everything.2025-10-04_21.57.16_52 | 2025-08-03 03:08:05 | 377970 | 2025-10-04 22:00:10 | 2025-10-04 22:00:10
379761 | fileserver_basic.2025-10-04_21.57.15_06 | 2025-08-03 03:07:25 | 377957 | 2025-10-04 21:57:30 | 2025-10-04 21:57:30
379751 | tallboy_basic.2025-10-04_21.57.14_56 | 2025-08-03 03:06:46 | 377971 | 2025-10-04 21:57:24 | 2025-10-04 21:57:24
379789 | repo-svn-snapshots.2025-10-04_21.57.15_34 | 2025-08-03 03:05:49 | 377953 | 2025-10-04 21:58:01 | 2025-10-04 21:58:01
379773 | r720-02_basic.2025-10-04_21.57.15_18 | 2025-08-03 03:05:43 | 377961 | 2025-10-04 21:57:36 | 2025-10-04 21:57:36
379747 | r730-01_basic.2025-10-04_21.57.14_52 | 2025-08-03 03:05:35 | 377963 | 2025-10-04 21:57:21 | 2025-10-04 21:57:21
379745 | r730-03_basic.2025-10-04_21.57.14_50 | 2025-08-03 03:05:21 | 377967 | 2025-10-04 21:57:21 | 2025-10-04 21:57:21
379767 | gw01_basic.2025-10-04_21.57.15_12 | 2025-08-03 03:05:18 | 377958 | 2025-10-04 21:57:33 | 2025-10-04 21:57:33
379733 | mydev_basic.2025-10-04_21.57.14_38 | 2025-08-03 03:05:07 | 377959 | 2025-10-04 21:57:15 | 2025-10-04 21:57:15
379737 | svn_basic.2025-10-04_21.57.14_42 | 2025-08-03 03:05:04 | 377969 | 2025-10-04 21:57:18 | 2025-10-04 21:57:18
379759 | repo-git-snapshots.2025-10-04_21.57.15_04 | 2025-08-03 03:04:33 | 377954 | 2025-10-04 21:57:28 | 2025-10-04 21:57:28
379731 | ansible.2025-10-04_21.57.14_36 | 2025-08-03 03:04:15 | 377955 | 2025-10-04 21:57:15 | 2025-10-04 21:57:15
359441 | r730-03_basic_testing.2023-09-15_21.27.12_47 | 2023-09-15 12:57:19 | 359391 | |
359417 | r730-03_basic_testing.2023-09-15_19.35.23_01 | 2023-09-15 12:57:19 | 359391 | |
359399 | r730-03_basic_testing.2023-09-15_13.11.50_24 | 2023-09-15 12:57:19 | 359391 | |
359397 | r730-03_basic_testing.2023-09-15_13.09.52_22 | 2023-09-15 12:57:19 | 359391 | |
359527 | r730-03_basic_testing.2023-09-18_20.56.35_38 | 2023-09-15 12:57:19 | 359391 | |
359431 | r730-03_basic_testing.2023-09-15_20.43.51_29 | 2023-09-15 12:57:19 | 359391 | |
359393 | r730-03_basic_testing.2023-09-15_13.02.16_17 | 2023-09-15 12:57:19 | 359391 | |
359404 | r730-03_basic_testing.2023-09-15_13.48.37_37 | 2023-09-15 12:57:19 | 359391 | |
359427 | r730-03_basic_testing.2023-09-15_20.19.25_20 | 2023-09-15 12:57:19 | 359391 | |
--More--(byte 10350) | ||||
Steps To Reproduce | ideas:
type = C
level = F
jobstatus = T
endtime within the past 70 days
poolid = 35 (FullFile-03)
jobbytes > 0
for each job
lastreadstorageid ?
writestorageid ?
count of this should always be > number of jobs expected. It will be double the expected value for a few days each month. That's OK | ||||
Additional Information | What about also checking that the newest volume in /jails/bacula-sd-03/usr/local/bacula/volumes/FullFile-03 is never more than 37 days old? | ||||
Tags | No tags attached. | ||||
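The volume-age idea under Additional Information could be sketched roughly as follows. This is only a sketch under assumptions: check_newest_volume is a hypothetical function name, and it treats a volume file's modification time as a proxy for its age, which may or may not match how the storage daemon touches its volumes. The directory and the 37-day limit are the ones mentioned in the issue.

```shell
#!/bin/sh
# Sketch only: report whether any file in a volume directory was modified
# recently enough. check_newest_volume is a hypothetical helper name.

check_newest_volume() {
    dir=$1
    max_days=$2

    if [ ! -d "$dir" ]; then
        echo "UNKNOWN: $dir is not a directory"
        return 3
    fi

    # find -mtime -N matches files whose age in whole days is less than N,
    # so N = max_days + 1 accepts anything up to max_days old.
    recent=$(find "$dir" -type f -mtime "-$((max_days + 1))" | head -n 1)

    if [ -z "$recent" ]; then
        echo "CRITICAL: no volume in $dir modified in the last $max_days days"
        return 2
    fi

    echo "OK: $dir has a volume modified in the last $max_days days"
    return 0
}

# Possible use in a check script (path taken from the issue):
#   check_newest_volume /jails/bacula-sd-03/usr/local/bacula/volumes/FullFile-03 37
#   exit $?
```

The return values follow the usual Nagios plugin convention (0 OK, 2 CRITICAL, 3 UNKNOWN), so the function's status can be passed straight to exit.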
|
Stuff added to Ansible:
File svn-commit.tmp saved
Sending host_vars/webserver.int.unixathome.org
Adding roles/nrpe/files/nagios-custom/check_bacula_copy_jobs_bacula_sd_03
Adding roles/nrpe/templates/nrpe-sets/check_bacula_jobs.j2
Transmitting file data ...done
Committing transaction...
Committed revision 2947. |
|
re:
[19:33 webserver dvl /usr/local/etc/nrpe.d] % cat check_bacula_jobs.cfg
command[check_bacula_copy_jobs_bacula_sd_03]=/usr/local/libexec/nagios-custom/check_bacula_copy_jobs_bacula_sd_03
[19:33 webserver dvl /usr/local/etc/nrpe.d] % cat /usr/local/libexec/nagios-custom/check_bacula_copy_jobs_bacula_sd_03
#!/bin/sh

SQL="select count(*) from job where priorjobid != 0 and type = 'C' and level = 'F' and jobstatus = 'T' and poolid = (select poolid from pool where name = 'FullFile-03') and jobbytes > 0 and endtime < (CURRENT_DATE - interval '40 days') and priorjob != ''"

DBHOST="pg03.int.unixathome.org"
DBNAME="bacula"
USER="nagios"

COUNT=$(psql --no-align --host $DBHOST --dbname=$DBNAME --quiet --tuples-only --username=$USER <<EOF
$SQL
EOF
)

if [ ${COUNT} -lt 28 ]; then
  echo CRITICAL, there are $COUNT copy jobs
fi

echo there are $COUNT copy jobs
exit 0 |
|
That date check needs a > not a <. Fixed. |
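Separately from the date comparison, the script as posted prints CRITICAL when the count is below 28 but then falls through to the final echo and exit 0, so Nagios would still record the service as OK. A minimal sketch of the threshold step using the standard plugin return codes (0 OK, 2 CRITICAL, 3 UNKNOWN) might look like this. copy_job_status is a hypothetical helper name, not part of the deployed check, and the UNKNOWN branch for an empty or non-numeric count is an added assumption about how a failed psql call should be reported.

```shell
#!/bin/sh
# Sketch only: map a copy-job count onto Nagios plugin exit codes.
# copy_job_status is a hypothetical helper, not the deployed script.

copy_job_status() {
    count=$1
    minimum=$2

    # An empty or non-numeric count most likely means the query failed.
    case "$count" in
        ''|*[!0-9]*)
            echo "UNKNOWN: could not read copy job count"
            return 3
            ;;
    esac

    if [ "$count" -lt "$minimum" ]; then
        echo "CRITICAL: there are $count copy jobs, expected at least $minimum"
        return 2
    fi

    echo "OK: there are $count copy jobs"
    return 0
}

# In the check itself, after COUNT has been filled in by psql:
#   copy_job_status "$COUNT" 28
#   exit $?
```

Returning from the function on the first matching condition keeps the CRITICAL message from being followed by the OK line.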
|
Seems right. I'll know more after the new set of full backups on the first Sunday of November. |
Date Modified | Username | Field | Change |
---|---|---|---|
2025-10-05 12:58 | dvl | New Issue | |
2025-10-05 12:58 | dvl | Status | new => assigned |
2025-10-05 12:58 | dvl | Assigned To | => dvl |
2025-10-05 19:33 | dvl | Note Added: 0000440 | |
2025-10-05 19:33 | dvl | Note Added: 0000441 | |
2025-10-07 18:13 | dvl | Note Added: 0000442 | |
2025-10-09 12:02 | dvl | Note Added: 0000443 |