[01:05:28] Operations, DNS, Release-Engineering-Team, Traffic, and 2 others: Move Foundation Wiki to new URL when new Wikimedia Foundation website launches - https://phabricator.wikimedia.org/T188776#4021427 (Krenair) The instructions for handling such a change, if a developer decides to accept this request...
[01:09:34] Operations, DNS, Release-Engineering-Team, Traffic, and 2 others: Move Foundation Wiki to new URL when new Wikimedia Foundation website launches - https://phabricator.wikimedia.org/T188776#4021440 (Krenair) @bawolff, is that site approved to sit under wikimedia.org, seeing as it has wgRawHtml?
[01:24:29] Operations, DNS, Release-Engineering-Team, Traffic, and 2 others: Move Foundation Wiki to new URL when new Wikimedia Foundation website launches - https://phabricator.wikimedia.org/T188776#4021466 (Varnent) The new scope for that site will result in far fewer pages (essentially just documentation...
[01:25:40] PROBLEM - IPv6 ping to ulsfo on ripe-atlas-ulsfo IPv6 is CRITICAL: CRITICAL - failed 21 probes of 293 (alerts on 19) - https://atlas.ripe.net/measurements/1791309/#!map
[01:30:35] Operations, DNS, Release-Engineering-Team, Traffic, and 2 others: Move Foundation Wiki to new URL when new Wikimedia Foundation website launches - https://phabricator.wikimedia.org/T188776#4021476 (Krenair) Okay but the existing site presumably has to continue to live somewhere, and the special c...
[01:30:40] RECOVERY - IPv6 ping to ulsfo on ripe-atlas-ulsfo IPv6 is OK: OK - failed 8 probes of 293 (alerts on 19) - https://atlas.ripe.net/measurements/1791309/#!map
[01:43:15] Operations, DNS, Release-Engineering-Team, Traffic, and 2 others: Move Foundation Wiki to new URL when new Wikimedia Foundation website launches - https://phabricator.wikimedia.org/T188776#4021506 (Krenair) (Some of) the stuff I wondered across with a quick grep that doesn't look to be covered by...
[03:27:11] PROBLEM - MariaDB Slave Lag: s1 on dbstore1002 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 856.45 seconds
[03:32:17] Operations, DNS, Release-Engineering-Team, Traffic, and 2 others: Move Foundation Wiki to new URL when new Wikimedia Foundation website launches - https://phabricator.wikimedia.org/T188776#4021611 (Bawolff) >>! In T188776#4021440, @Krenair wrote: > @bawolff, is that site approved to sit under wik...
[03:33:31] PROBLEM - puppet last run on mw2234 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/usr/share/GeoIP/GeoIPCity.dat.gz]
[03:45:36] Operations, DNS, Release-Engineering-Team, Traffic, and 2 others: Move Foundation Wiki to new URL when new Wikimedia Foundation website launches - https://phabricator.wikimedia.org/T188776#4021634 (Varnent) > That said, I would certainly prefer it wasnt a raw html wiki. At first glance it sounds...
[04:03:31] RECOVERY - puppet last run on mw2234 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures
[04:03:49] (PS1) Gergő Tisza: Enable loginOnly mode for local auth provider on group 0 [mediawiki-config] - https://gerrit.wikimedia.org/r/416331 (https://phabricator.wikimedia.org/T57420)
[04:12:30] RECOVERY - MariaDB Slave Lag: s1 on dbstore1002 is OK: OK slave_sql_lag Replication lag: 169.60 seconds
[04:17:50] PROBLEM - Disk space on elastic1027 is CRITICAL: DISK CRITICAL - free space: /srv 61737 MB (12% inode=99%)
[04:34:51] RECOVERY - Disk space on elastic1027 is OK: DISK OK
[05:46:59] Operations, Analytics, Traffic: Update documentation for "https" field in X-Analytics - https://phabricator.wikimedia.org/T188807#4021737 (Tbayer) After some [[https://www.troyhunt.com/understanding-http-strict-transport/|further]] [[https://www.seroundtable.com/googlebot-hsts-redirects-301-307-21405...
[05:49:01] Operations, Analytics, Traffic: Update documentation for "https" field in X-Analytics - https://phabricator.wikimedia.org/T188807#4021738 (Tbayer)
[06:19:08] (PS1) Krinkle: scap prep: Scap-ify the creation of beta's StartProfiler.php [mediawiki-config] - https://gerrit.wikimedia.org/r/416334 (https://phabricator.wikimedia.org/T180766)
[06:19:18] no_justification: Untested^ but review welcome :)
[06:28:56] Krinkle: well 1028 on a Saturday... 🍻
[06:33:08] no_justification: sure sure, not now, some other time :)
[06:33:09] o/
[06:40:22] Operations, Analytics, Traffic: Update documentation for "https" field in X-Analytics - https://phabricator.wikimedia.org/T188807#4020348 (Nuria) Please see: https://github.com/wikimedia/puppet/blob/production/modules/varnish/templates/analytics.inc.vcl.erb#L224 https://github.com/wikimedia/puppet/b...
[08:34:50] PROBLEM - IPv6 ping to eqiad on ripe-atlas-eqiad IPv6 is CRITICAL: CRITICAL - failed 31 probes of 293 (alerts on 19) - https://atlas.ripe.net/measurements/1790947/#!map
[08:39:50] RECOVERY - IPv6 ping to eqiad on ripe-atlas-eqiad IPv6 is OK: OK - failed 12 probes of 293 (alerts on 19) - https://atlas.ripe.net/measurements/1790947/#!map
[08:55:33] Operations, Analytics, Traffic: Update documentation for "https" field in X-Analytics - https://phabricator.wikimedia.org/T188807#4021816 (Tbayer) Update: I checked again how HTTP requests are logged, this time with curl (as a client without HSTS preloading) instead of Chrome: `$ curl -v 'http://de...
[09:35:28] (PS1) Jayprakash12345: Enable translate extension in bdwikimedia [mediawiki-config] - https://gerrit.wikimedia.org/r/416338 (https://phabricator.wikimedia.org/T188853)
[10:02:40] (PS2) Jayprakash12345: Enable translate extension in bdwikimedia [mediawiki-config] - https://gerrit.wikimedia.org/r/416338 (https://phabricator.wikimedia.org/T188853)
[10:18:20] (CR) Marostegui: [C: 1] Revert "labsdb: Depool labsdb1010 in preparation for its recovery" [puppet] - https://gerrit.wikimedia.org/r/415923 (owner: Jcrespo)
[10:26:44] (Abandoned) Sau226: Disable main page deletion on enwiktionary [mediawiki-config] - https://gerrit.wikimedia.org/r/414509 (https://phabricator.wikimedia.org/T184959) (owner: Sau226)
[12:21:25] (CR) MarcoAurelio: [C: -1] Enable translate extension in bdwikimedia (2 comments) [mediawiki-config] - https://gerrit.wikimedia.org/r/416338 (https://phabricator.wikimedia.org/T188853) (owner: Jayprakash12345)
[12:21:52] (CR) MarcoAurelio: [C: -1] "To deployer: please run before "mwscript extensions/WikimediaMaintenance/createExtensionTables.php --wiki=bdwikimedia translate" otherwise" [mediawiki-config] - https://gerrit.wikimedia.org/r/416338 (https://phabricator.wikimedia.org/T188853) (owner: Jayprakash12345)
[12:32:56] (Draft1) MarcoAurelio: mediawiki: add cronjob to purge expired temporary userrights [puppet] - https://gerrit.wikimedia.org/r/416344 (https://phabricator.wikimedia.org/T176754)
[12:32:57] (PS2) MarcoAurelio: mediawiki: add cronjob to purge expired temporary userrights [puppet] - https://gerrit.wikimedia.org/r/416344 (https://phabricator.wikimedia.org/T176754)
[12:33:33] (CR) jerkins-bot: [V: -1] mediawiki: add cronjob to purge expired temporary userrights [puppet] - https://gerrit.wikimedia.org/r/416344 (https://phabricator.wikimedia.org/T176754) (owner: MarcoAurelio)
[12:33:43] (PS3) MarcoAurelio: [DO NOT MERGE] mediawiki: add cronjob to purge expired temporary userrights [puppet] - https://gerrit.wikimedia.org/r/416344 (https://phabricator.wikimedia.org/T176754)
[12:34:14] (CR) jerkins-bot: [V: -1] [DO NOT MERGE] mediawiki: add cronjob to purge expired temporary userrights [puppet] - https://gerrit.wikimedia.org/r/416344 (https://phabricator.wikimedia.org/T176754) (owner: MarcoAurelio)
[12:35:02] (CR) MarcoAurelio: "Apparently puppet don't like tabs." [puppet] - https://gerrit.wikimedia.org/r/416344 (https://phabricator.wikimedia.org/T176754) (owner: MarcoAurelio)
[12:35:14] (PS3) Jayprakash12345: Enable translate extension in bdwikimedia [mediawiki-config] - https://gerrit.wikimedia.org/r/416338 (https://phabricator.wikimedia.org/T188853)
[12:36:40] (CR) EddieGP: [C: -1] "There's already https://gerrit.wikimedia.org/r/c/382631/" [puppet] - https://gerrit.wikimedia.org/r/416344 (https://phabricator.wikimedia.org/T176754) (owner: MarcoAurelio)
[12:37:32] (Abandoned) MarcoAurelio: [DO NOT MERGE] mediawiki: add cronjob to purge expired temporary userrights [puppet] - https://gerrit.wikimedia.org/r/416344 (https://phabricator.wikimedia.org/T176754) (owner: MarcoAurelio)
[12:40:54] (CR) MarcoAurelio: "Needs rebasing but LGTM. Left some comments though." (2 comments) [puppet] - https://gerrit.wikimedia.org/r/382631 (https://phabricator.wikimedia.org/T176754) (owner: EddieGP)
[12:42:18] (CR) MarcoAurelio: [C: 1] "Thanks." [mediawiki-config] - https://gerrit.wikimedia.org/r/416224 (https://phabricator.wikimedia.org/T188633) (owner: Jayprakash12345)
[12:44:32] (CR) MarcoAurelio: [C: 1] Enable translate extension in bdwikimedia [mediawiki-config] - https://gerrit.wikimedia.org/r/416338 (https://phabricator.wikimedia.org/T188853) (owner: Jayprakash12345)
[12:46:10] PROBLEM - puppet last run on stat1004 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 4 minutes ago with 1 failures. Failed resources (up to 3 shown): Exec[cdh::hadoop::directory /user/spark]
[12:49:31] PROBLEM - Disk space on stat1004 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake.
[12:50:20] PROBLEM - MD RAID on stat1004 is CRITICAL: NRPE: Call to popen() failed
[12:50:21] ACKNOWLEDGEMENT - MD RAID on stat1004 is CRITICAL: NRPE: Call to popen() failed nagiosadmin RAID handler auto-ack: https://phabricator.wikimedia.org/T188861
[12:50:26] Operations, ops-eqiad: Degraded RAID on stat1004 - https://phabricator.wikimedia.org/T188861#4022007 (ops-monitoring-bot)
[12:50:30] PROBLEM - configured eth on stat1004 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake.
[12:51:11] PROBLEM - DPKG on stat1004 is CRITICAL: NRPE: Unable to read output
[12:52:20] RECOVERY - MD RAID on stat1004 is OK: OK: Active: 8, Working: 8, Failed: 0, Spare: 0
[12:53:30] PROBLEM - Disk space on stat1004 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake.
[12:54:50] PROBLEM - dhclient process on stat1004 is CRITICAL: NRPE: Unable to read output
[12:54:50] PROBLEM - Check systemd state on stat1004 is CRITICAL: NRPE: Call to popen() failed
[12:55:20] PROBLEM - MD RAID on stat1004 is CRITICAL: NRPE: Unable to read output
[12:55:21] ACKNOWLEDGEMENT - MD RAID on stat1004 is CRITICAL: NRPE: Unable to read output nagiosadmin RAID handler auto-ack: https://phabricator.wikimedia.org/T188863
[12:55:27] Operations, ops-eqiad: Degraded RAID on stat1004 - https://phabricator.wikimedia.org/T188863#4022030 (ops-monitoring-bot)
[12:56:50] PROBLEM - Check systemd state on stat1004 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake.
[12:58:30] PROBLEM - SSH on stat1004 is CRITICAL: Server answer
[12:58:40] PROBLEM - Check the NTP synchronisation status of timesyncd on stat1004 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake.
[13:00:30] RECOVERY - SSH on stat1004 is OK: SSH OK - OpenSSH_6.7p1 Debian-5+deb8u4 (protocol 2.0)
[13:01:10] PROBLEM - puppet last run on stat1004 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake.
[13:01:55] (Draft1) MarcoAurelio: Disable abusefilter from collecting private data on Beta [mediawiki-config] - https://gerrit.wikimedia.org/r/416346 (https://phabricator.wikimedia.org/T188862)
[13:02:01] (PS2) MarcoAurelio: Disable abusefilter from collecting private data on Beta [mediawiki-config] - https://gerrit.wikimedia.org/r/416346 (https://phabricator.wikimedia.org/T188862)
[13:03:30] PROBLEM - SSH on stat1004 is CRITICAL: Server answer
[13:08:17] Amir1: you around?
[13:17:21] PROBLEM - IPMI Sensor Status on stat1004 is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake.
[13:50:39] (CR) EddieGP: Add cron job for expired userrights maintenance script (2 comments) [puppet] - https://gerrit.wikimedia.org/r/382631 (https://phabricator.wikimedia.org/T176754) (owner: EddieGP)
[15:01:58] (PS3) Jcrespo: mariadb-backups: Change backup format to YYYY-MM-dd--HH-mm-SS [puppet] - https://gerrit.wikimedia.org/r/415608 (https://phabricator.wikimedia.org/T184696)
[15:02:00] (PS1) Jcrespo: mariadb-backups: Allow backup consolidation and recovery [puppet] - https://gerrit.wikimedia.org/r/416353 (https://phabricator.wikimedia.org/T184696)
[15:02:53] (CR) jerkins-bot: [V: -1] mariadb-backups: Allow backup consolidation and recovery [puppet] - https://gerrit.wikimedia.org/r/416353 (https://phabricator.wikimedia.org/T184696) (owner: Jcrespo)
[15:45:45] Operations, Analytics, Traffic: Update documentation for "https" field in X-Analytics - https://phabricator.wikimedia.org/T188807#4022159 (BBlack) Note that successful non-HTTPS requests evading our standard HTTPS redirect code are still possible under some circumstances. The circumstances are: 1)...
[15:58:22] mmmm I am trying to log in as root via mgmt to stat1004, it freezes..
[15:59:08] !log powercycle stat1004 - available via mgmt, root login freezes while trying
[15:59:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[16:00:21] PROBLEM - Host stat1004 is DOWN: PING CRITICAL - Packet loss = 100%
[16:00:30] RECOVERY - dhclient process on stat1004 is OK: PROCS OK: 0 processes with command name dhclient
[16:00:31] RECOVERY - Check systemd state on stat1004 is OK: OK - running: The system is fully operational
[16:00:40] RECOVERY - Host stat1004 is UP: PING OK - Packet loss = 0%, RTA = 0.23 ms
[16:01:00] RECOVERY - DPKG on stat1004 is OK: All packages OK
[16:01:11] RECOVERY - MD RAID on stat1004 is OK: OK: Active: 8, Working: 8, Failed: 0, Spare: 0
[16:01:11] RECOVERY - SSH on stat1004 is OK: SSH OK - OpenSSH_6.7p1 Debian-5+deb8u4 (protocol 2.0)
[16:01:20] RECOVERY - configured eth on stat1004 is OK: OK - interfaces up
[16:01:21] RECOVERY - Disk space on stat1004 is OK: DISK OK
[16:02:27] Operations, ops-eqiad: Degraded RAID on stat1004 - https://phabricator.wikimedia.org/T188863#4022030 (elukey) Powercycled, now seems good: ``` elukey@stat1004:~$ cat /proc/mdstat Personalities : [raid10] md1 : active raid10 sda3[0] sdd3[3] sdc3[2] sdb3[1] 7716112384 blocks super 1.2 512K chunks 2...
[16:03:04] everything seems good now, I'll keep it monitored
[16:06:11] RECOVERY - puppet last run on stat1004 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures
[16:17:21] RECOVERY - IPMI Sensor Status on stat1004 is OK: Sensor Type(s) Temperature, Power_Supply Status: OK
[16:28:40] RECOVERY - Check the NTP synchronisation status of timesyncd on stat1004 is OK: OK: synced at Sun 2018-03-04 16:28:37 UTC.
[17:14:45] (Abandoned) Legoktm: docker: Add deb-src for apt.wm.o in jessie and stretch images [puppet] - https://gerrit.wikimedia.org/r/387984 (https://phabricator.wikimedia.org/T179354) (owner: Legoktm)
[18:05:12] !log T12345 Ran mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=wikidatawiki --logwiki=metawiki 'Erik Fastman' 'Glorious Engine'
[18:05:22] oops
[18:05:26] !log T188721 Ran mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=wikidatawiki --logwiki=metawiki 'Erik Fastman' 'Glorious Engine'
[18:05:29] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:05:29] T12345: Create "annotation" namespace on Hebrew Wikisource - https://phabricator.wikimedia.org/T12345
[18:05:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:05:44] T188721: Global rename of Erik_Fastman to Glorious_Engine stuck "in progress" since 28th February on wikidatawiki - https://phabricator.wikimedia.org/T188721
[18:17:22] Operations, ops-eqiad: Degraded RAID on stat1004 - https://phabricator.wikimedia.org/T188863#4022030 (Addshore) Duplicate of T188861 ? has slightly different error message
[20:16:38] !log T188721 ran mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=wikidatawiki --ignorestatus --logwiki=metawiki 'Erik Fastman' 'Glorious Engine'
[20:16:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:16:55] T188721: Global rename of Erik_Fastman to Glorious_Engine stuck "in progress" since 28th February on wikidatawiki - https://phabricator.wikimedia.org/T188721
[20:21:11] PROBLEM - Check systemd state on gerrit2001 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[20:21:30] PROBLEM - gerrit process on gerrit2001 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java .*-jar /var/lib/gerrit2/review_site/bin/gerrit.war daemon -d /var/lib/gerrit2/review_site
[20:24:16] hmm, is that doing that because it notifies every so often?
[20:31:20] RECOVERY - Check systemd state on gerrit2001 is OK: OK - running: The system is fully operational
[20:31:40] RECOVERY - gerrit process on gerrit2001 is OK: PROCS OK: 1 process with regex args ^/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java .*-jar /var/lib/gerrit2/review_site/bin/gerrit.war daemon -d /var/lib/gerrit2/review_site
[20:33:33] huh, it can't connect to the db, so not sure how it recovered.
[20:40:21] PROBLEM - Check systemd state on gerrit2001 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[20:40:41] PROBLEM - gerrit process on gerrit2001 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java .*-jar /var/lib/gerrit2/review_site/bin/gerrit.war daemon -d /var/lib/gerrit2/review_site
[20:46:53] that will retry at the next puppet run.
[21:01:31] RECOVERY - Check systemd state on gerrit2001 is OK: OK - running: The system is fully operational
[21:01:51] RECOVERY - gerrit process on gerrit2001 is OK: PROCS OK: 1 process with regex args ^/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java .*-jar /var/lib/gerrit2/review_site/bin/gerrit.war daemon -d /var/lib/gerrit2/review_site
[21:10:00] PROBLEM - gerrit process on gerrit2001 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java .*-jar /var/lib/gerrit2/review_site/bin/gerrit.war daemon -d /var/lib/gerrit2/review_site
[21:10:40] PROBLEM - Check systemd state on gerrit2001 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[21:12:56] no_justification mutante ^^
[21:16:37] (CR) Framawiki: [C: 1] Disable Flow extension on Commons [mediawiki-config] - https://gerrit.wikimedia.org/r/408073 (https://phabricator.wikimedia.org/T186463) (owner: Zoranzoki21)
[21:31:50] RECOVERY - Check systemd state on gerrit2001 is OK: OK - running: The system is fully operational
[21:32:10] RECOVERY - gerrit process on gerrit2001 is OK: PROCS OK: 1 process with regex args ^/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java .*-jar /var/lib/gerrit2/review_site/bin/gerrit.war daemon -d /var/lib/gerrit2/review_site
[21:39:51] PROBLEM - Check systemd state on gerrit2001 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[21:40:10] PROBLEM - gerrit process on gerrit2001 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java .*-jar /var/lib/gerrit2/review_site/bin/gerrit.war daemon -d /var/lib/gerrit2/review_site
[22:01:20] RECOVERY - gerrit process on gerrit2001 is OK: PROCS OK: 1 process with regex args ^/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java .*-jar /var/lib/gerrit2/review_site/bin/gerrit.war daemon -d /var/lib/gerrit2/review_site
[22:02:00] RECOVERY - Check systemd state on gerrit2001 is OK: OK - running: The system is fully operational
[22:10:10] PROBLEM - Check systemd state on gerrit2001 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[22:10:30] PROBLEM - gerrit process on gerrit2001 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java .*-jar /var/lib/gerrit2/review_site/bin/gerrit.war daemon -d /var/lib/gerrit2/review_site
[22:24:20] (Abandoned) Huji: Expand the access to 2FA on fawiki [mediawiki-config] - https://gerrit.wikimedia.org/r/392224 (https://phabricator.wikimedia.org/T180648) (owner: Huji)
[22:31:11] RECOVERY - Check systemd state on gerrit2001 is OK: OK - running: The system is fully operational
[22:31:31] RECOVERY - gerrit process on gerrit2001 is OK: PROCS OK: 1 process with regex args ^/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java .*-jar /var/lib/gerrit2/review_site/bin/gerrit.war daemon -d /var/lib/gerrit2/review_site
[22:40:20] PROBLEM - Check systemd state on gerrit2001 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[22:40:40] PROBLEM - gerrit process on gerrit2001 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java .*-jar /var/lib/gerrit2/review_site/bin/gerrit.war daemon -d /var/lib/gerrit2/review_site
[23:01:30] RECOVERY - Check systemd state on gerrit2001 is OK: OK - running: The system is fully operational
[23:01:50] RECOVERY - gerrit process on gerrit2001 is OK: PROCS OK: 1 process with regex args ^/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java .*-jar /var/lib/gerrit2/review_site/bin/gerrit.war daemon -d /var/lib/gerrit2/review_site
[23:10:30] PROBLEM - Check systemd state on gerrit2001 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[23:10:51] PROBLEM - gerrit process on gerrit2001 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java .*-jar /var/lib/gerrit2/review_site/bin/gerrit.war daemon -d /var/lib/gerrit2/review_site
[23:31:01] RECOVERY - gerrit process on gerrit2001 is OK: PROCS OK: 1 process with regex args ^/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java .*-jar /var/lib/gerrit2/review_site/bin/gerrit.war daemon -d /var/lib/gerrit2/review_site
[23:31:41] RECOVERY - Check systemd state on gerrit2001 is OK: OK - running: The system is fully operational
[23:39:41] PROBLEM - Check systemd state on gerrit2001 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[23:40:10] PROBLEM - gerrit process on gerrit2001 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java .*-jar /var/lib/gerrit2/review_site/bin/gerrit.war daemon -d /var/lib/gerrit2/review_site