[01:24:27] Operations, Traffic, Patch-For-Review, Performance-Team (Radar): Better handling for one-hit-wonder objects - https://phabricator.wikimedia.org/T144187 (Krinkle)
[01:55:25] PROBLEM - HHVM jobrunner on mw1335 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 473 bytes in 0.001 second response time
[01:56:35] RECOVERY - HHVM jobrunner on mw1335 is OK: HTTP OK: HTTP/1.1 200 OK - 206 bytes in 0.002 second response time
[02:23:51] PROBLEM - Check systemd state on labstore1004 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[02:23:59] PROBLEM - Ensure mysql credential creation for tools users is running on labstore1004 is CRITICAL: CRITICAL - Expecting active but unit maintain-dbusers is failed
[02:27:17] RECOVERY - Check systemd state on labstore1004 is OK: OK - running: The system is fully operational
[02:27:23] RECOVERY - Ensure mysql credential creation for tools users is running on labstore1004 is OK: OK - maintain-dbusers is active
[02:36:29] (PS1) Legoktm: extdist: Switch to Python 3 [puppet] - https://gerrit.wikimedia.org/r/475579 (https://phabricator.wikimedia.org/T210312)
[02:37:25] (PS2) Legoktm: extdist: Switch to Python 3 [puppet] - https://gerrit.wikimedia.org/r/475579 (https://phabricator.wikimedia.org/T210312)
[02:50:35] (CR) Krinkle: [C: 1] RunSingleJob: Check that JobExecutor has been loaded [mediawiki-config] - https://gerrit.wikimedia.org/r/474885 (https://phabricator.wikimedia.org/T208922) (owner: Mobrovac)
[03:40:15] PROBLEM - MariaDB Slave Lag: s1 on dbstore1002 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 931.18 seconds
[03:48:19] RECOVERY - MariaDB Slave Lag: s1 on dbstore1002 is OK: OK slave_sql_lag Replication lag: 212.96 seconds
[07:25:59] (PS1) Giuseppe Lavagetto: mediawiki::web::prod_sites: add handler to www.wikimedia.org [puppet] - https://gerrit.wikimedia.org/r/475581
[07:26:04] <_joe_> elukey: ^^
[07:26:30] <_joe_> elukey: I'm sending it to php7, bold move
[07:28:16] (PS2) Giuseppe Lavagetto: mediawiki::web::prod_sites: add handler to www.wikimedia.org [puppet] - https://gerrit.wikimedia.org/r/475581
[07:29:17] (PS3) Giuseppe Lavagetto: mediawiki::web::prod_sites: add handler to www.wikimedia.org [puppet] - https://gerrit.wikimedia.org/r/475581
[07:30:02] :D
[07:30:35] (CR) Elukey: [C: 1] mediawiki::web::prod_sites: add handler to www.wikimedia.org [puppet] - https://gerrit.wikimedia.org/r/475581 (owner: Giuseppe Lavagetto)
[07:30:43] (CR) Giuseppe Lavagetto: "https://puppet-compiler.wmflabs.org/compiler1002/13694/ shows this is a functional noop" [puppet] - https://gerrit.wikimedia.org/r/475499 (owner: Giuseppe Lavagetto)
[07:30:57] (CR) Giuseppe Lavagetto: [C: 2] mediawiki::web::prod_sites: add handler to www.wikimedia.org [puppet] - https://gerrit.wikimedia.org/r/475581 (owner: Giuseppe Lavagetto)
[09:17:59] (CR) Volans: "Minor nitpicks inline" (3 comments) [puppet] - https://gerrit.wikimedia.org/r/475579 (https://phabricator.wikimedia.org/T210312) (owner: Legoktm)
[09:41:11] (CR) Volans: "A couple of small fixes needed, looks good otherwise." (5 comments) [software/spicerack] - https://gerrit.wikimedia.org/r/468558 (https://phabricator.wikimedia.org/T207918) (owner: Mathew.onipe)
[09:58:04] (CR) Volans: [C: -1] "Argument parsing still needs some fixes, see inline." (3 comments) [cookbooks] - https://gerrit.wikimedia.org/r/467964 (https://phabricator.wikimedia.org/T207919) (owner: Mathew.onipe)
[12:09:33] (CR) Seb35: extdist: Switch to Python 3 (1 comment) [puppet] - https://gerrit.wikimedia.org/r/475579 (https://phabricator.wikimedia.org/T210312) (owner: Legoktm)
[13:31:41] PROBLEM - Ensure mysql credential creation for tools users is running on labstore1004 is CRITICAL: CRITICAL - Expecting active but unit maintain-dbusers is failed
[13:32:15] PROBLEM - Check systemd state on labstore1004 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[13:57:01] RECOVERY - Ensure mysql credential creation for tools users is running on labstore1004 is OK: OK - maintain-dbusers is active
[13:57:33] RECOVERY - Check systemd state on labstore1004 is OK: OK - running: The system is fully operational
[14:32:10] (CR) Volans: extdist: Switch to Python 3 (1 comment) [puppet] - https://gerrit.wikimedia.org/r/475579 (https://phabricator.wikimedia.org/T210312) (owner: Legoktm)
[14:58:09] PROBLEM - Check systemd state on labstore1004 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[14:58:49] PROBLEM - Ensure mysql credential creation for tools users is running on labstore1004 is CRITICAL: CRITICAL - Expecting active but unit maintain-dbusers is failed
[15:27:23] RECOVERY - Ensure mysql credential creation for tools users is running on labstore1004 is OK: OK - maintain-dbusers is active
[15:27:49] RECOVERY - Check systemd state on labstore1004 is OK: OK - running: The system is fully operational
[17:53:17] PROBLEM - Ensure mysql credential creation for tools users is running on labstore1004 is CRITICAL: CRITICAL - Expecting active but unit maintain-dbusers is failed
[17:53:41] PROBLEM - Check systemd state on labstore1004 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[17:56:45] RECOVERY - Ensure mysql credential creation for tools users is running on labstore1004 is OK: OK - maintain-dbusers is active
[18:00:15] PROBLEM - Ensure mysql credential creation for tools users is running on labstore1004 is CRITICAL: CRITICAL - Expecting active but unit maintain-dbusers is failed
[18:26:53] RECOVERY - Check systemd state on labstore1004 is OK: OK - running: The system is fully operational
[18:27:39] RECOVERY - Ensure mysql credential creation for tools users is running on labstore1004 is OK: OK - maintain-dbusers is active
[18:59:47] PROBLEM - Ensure mysql credential creation for tools users is running on labstore1004 is CRITICAL: CRITICAL - Expecting active but unit maintain-dbusers is failed
[19:00:07] PROBLEM - Check systemd state on labstore1004 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[19:27:33] RECOVERY - Check systemd state on labstore1004 is OK: OK - running: The system is fully operational
[19:28:21] RECOVERY - Ensure mysql credential creation for tools users is running on labstore1004 is OK: OK - maintain-dbusers is active
[19:53:19] PROBLEM - HHVM jobrunner on mw1299 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 473 bytes in 0.001 second response time
[19:54:27] RECOVERY - HHVM jobrunner on mw1299 is OK: HTTP OK: HTTP/1.1 200 OK - 206 bytes in 0.006 second response time
[20:04:01] PROBLEM - Ensure mysql credential creation for tools users is running on labstore1004 is CRITICAL: CRITICAL - Expecting active but unit maintain-dbusers is failed
[20:04:19] PROBLEM - Check systemd state on labstore1004 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[20:27:17] RECOVERY - Check systemd state on labstore1004 is OK: OK - running: The system is fully operational
[20:28:07] RECOVERY - Ensure mysql credential creation for tools users is running on labstore1004 is OK: OK - maintain-dbusers is active
[21:22:41] PROBLEM - puppet last run on mwdebug1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[21:53:31] RECOVERY - puppet last run on mwdebug1001 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures
[22:32:01] PROBLEM - Ensure mysql credential creation for tools users is running on labstore1004 is CRITICAL: CRITICAL - Expecting active but unit maintain-dbusers is failed
[22:32:15] PROBLEM - Check systemd state on labstore1004 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[22:33:11] PROBLEM - nova instance creation test on cloudcontrol1003 is CRITICAL: PROCS CRITICAL: 0 processes with command name python, args nova-fullstack
[22:34:05] PROBLEM - Check systemd state on cloudcontrol1003 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[22:37:35] RECOVERY - Check systemd state on cloudcontrol1003 is OK: OK - running: The system is fully operational
[22:37:51] RECOVERY - nova instance creation test on cloudcontrol1003 is OK: PROCS OK: 1 process with command name python, args nova-fullstack
[22:57:25] RECOVERY - Ensure mysql credential creation for tools users is running on labstore1004 is OK: OK - maintain-dbusers is active
[22:57:37] RECOVERY - Check systemd state on labstore1004 is OK: OK - running: The system is fully operational
[23:09:02] (PS1) Bstorm: nova: Reduce compute workers for eqiad main [puppet] - https://gerrit.wikimedia.org/r/475603 (https://phabricator.wikimedia.org/T202889)
[23:13:16] (CR) Arturo Borrero Gonzalez: [C: 2] "Compiler is happy https://integration.wikimedia.org/ci/view/operations/job/operations-puppet-catalog-compiler/13698/console" [puppet] - https://gerrit.wikimedia.org/r/475603 (https://phabricator.wikimedia.org/T202889) (owner: Bstorm)
[23:39:49] PROBLEM - Ensure mysql credential creation for tools users is running on labstore1004 is CRITICAL: CRITICAL - Expecting active but unit maintain-dbusers is failed
[23:40:01] PROBLEM - Check systemd state on labstore1004 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[23:57:03] (PS1) Arturo Borrero Gonzalez: neutron: reduce number of api_wokers [puppet] - https://gerrit.wikimedia.org/r/475607 (https://phabricator.wikimedia.org/T202889)
[23:58:13] RECOVERY - Ensure mysql credential creation for tools users is running on labstore1004 is OK: OK - maintain-dbusers is active
[23:58:23] RECOVERY - Check systemd state on labstore1004 is OK: OK - running: The system is fully operational
[23:58:38] (CR) Arturo Borrero Gonzalez: "https://integration.wikimedia.org/ci/view/operations/job/operations-puppet-catalog-compiler/13699/console" [puppet] - https://gerrit.wikimedia.org/r/475607 (https://phabricator.wikimedia.org/T202889) (owner: Arturo Borrero Gonzalez)
[23:58:53] (CR) Arturo Borrero Gonzalez: [C: 2] neutron: reduce number of api_wokers [puppet] - https://gerrit.wikimedia.org/r/475607 (https://phabricator.wikimedia.org/T202889) (owner: Arturo Borrero Gonzalez)