[00:00:47] PROBLEM - puppet last run on labsdb1007 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[00:27:47] RECOVERY - puppet last run on labsdb1007 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures
[01:10:17] RECOVERY - Esams HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0]
[01:16:47] PROBLEM - puppet last run on lvs1004 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[01:40:41] PROBLEM - puppet last run on mw1214 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[01:45:41] RECOVERY - puppet last run on lvs1004 is OK: OK: Puppet is currently enabled, last run 10 seconds ago with 0 failures
[01:51:51] PROBLEM - puppet last run on analytics1044 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[02:09:41] RECOVERY - puppet last run on mw1214 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures
[02:17:01] !log l10nupdate@tin scap sync-l10n completed (1.29.0-wmf.3) (duration: 06m 03s)
[02:17:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:20:51] RECOVERY - puppet last run on analytics1044 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures
[02:21:19] !log l10nupdate@tin ResourceLoader cache refresh completed at Sun Nov 20 02:21:19 UTC 2016 (duration 4m 18s)
[02:21:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[04:24:30] PROBLEM - puppet last run on maerlant is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[04:53:30] RECOVERY - puppet last run on maerlant is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures
[04:59:30] PROBLEM - puppet last run on hydrogen is CRITICAL: CRITICAL: Catalog fetch fail.
Either compilation failed or puppetmaster has issues
[05:27:30] RECOVERY - puppet last run on hydrogen is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures
[05:40:38] (PS1) Dereckson: Configure Babel for fr.wikibooks and fr.wikiversity [mediawiki-config] - https://gerrit.wikimedia.org/r/322493 (https://phabricator.wikimedia.org/T146213)
[06:01:15] Operations, Domains, Phabricator, Traffic: short URL for phabricator - https://phabricator.wikimedia.org/T151094#2808719 (mmodell) >>! In T151094#2807840, @Dzahn wrote: > ^ this is a working URL that gets you to https://meta.wikimedia.org/wiki/Special:UrlShortener why it does not link here is an...
[06:11:33] Operations, Domains, Phabricator, Traffic: short URL for phabricator - https://phabricator.wikimedia.org/T151094#2807477 (jeremyb) >>! In T151094#2808719, @mmodell wrote: > https:://w.wiki does not work for me. Firefox decides to instead take me to a google search for w.wiki... odd can you give...
[06:33:00] PROBLEM - tools homepage -admin tool- on tools.wmflabs.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Not Available - 531 bytes in 0.021 second response time
[06:41:20] PROBLEM - puppet last run on elastic1032 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. Failed resources (up to 3 shown): Package[git]
[06:44:30] PROBLEM - puppet last run on labtestweb2001 is CRITICAL: CRITICAL: Puppet has 2 failures. Last run 2 minutes ago with 2 failures. Failed resources (up to 3 shown): Package[ack-grep],Package[acct]
[06:53:00] PROBLEM - puppet last run on snapshot1005 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures.
Failed resources (up to 3 shown): Package[vim]
[07:08:20] RECOVERY - puppet last run on elastic1032 is OK: OK: Puppet is currently enabled, last run 31 seconds ago with 0 failures
[07:12:30] RECOVERY - puppet last run on labtestweb2001 is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures
[07:14:00] RECOVERY - tools homepage -admin tool- on tools.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 3670 bytes in 0.039 second response time
[07:21:00] RECOVERY - puppet last run on snapshot1005 is OK: OK: Puppet is currently enabled, last run 29 seconds ago with 0 failures
[07:58:09] (PS2) ArielGlenn: start moving adds/changes methods out to incr_dumps module [dumps] - https://gerrit.wikimedia.org/r/322491 (https://phabricator.wikimedia.org/T133547)
[08:28:30] PROBLEM - puppet last run on cp3048 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[08:49:30] PROBLEM - puppet last run on db1016 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[08:56:30] RECOVERY - puppet last run on cp3048 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures
[09:15:20] PROBLEM - Esams HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0]
[09:17:30] RECOVERY - puppet last run on db1016 is OK: OK: Puppet is currently enabled, last run 51 seconds ago with 0 failures
[09:17:40] PROBLEM - Text HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 10.00% of data above the critical threshold [1000.0]
[09:25:40] RECOVERY - Text HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0]
[09:55:20] PROBLEM - puppet last run on ms-be1019 is CRITICAL: CRITICAL: Catalog fetch fail.
Either compilation failed or puppetmaster has issues
[10:17:27] PROBLEM - Redis status tcp_6479 on rdb2006 is CRITICAL: CRITICAL ERROR - Redis Library - can not ping 10.192.48.44 on port 6479
[10:17:57] PROBLEM - puppet last run on mw1253 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[10:18:27] RECOVERY - Redis status tcp_6479 on rdb2006 is OK: OK: REDIS 2.8.17 on 10.192.48.44:6479 has 1 databases (db0) with 3473687 keys, up 20 days 1 hours - replication_delay is 0
[10:23:17] RECOVERY - puppet last run on ms-be1019 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures
[10:30:43] PROBLEM - mobileapps endpoints health on scb1003 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:32:43] RECOVERY - mobileapps endpoints health on scb1003 is OK: All endpoints are healthy
[10:46:03] RECOVERY - puppet last run on mw1253 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures
[11:22:39] why does https://www.wikipedia.org/ still show, that dewiki has only 1993000+ articles? We reached 2 mio yesterday
[11:34:12] Sagan: has anyone updated the portal and deployed it?
[11:34:39] afaik they are static and not automatically calculated
[11:34:52] -.-
[11:36:01] Niharika: you about?
[11:36:23] PROBLEM - dhclient process on thumbor1002 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[11:37:13] RECOVERY - dhclient process on thumbor1002 is OK: PROCS OK: 0 processes with command name dhclient
[11:59:16] Steinsplitter: Kinda. What's up?
[12:04:33] PROBLEM - puppet last run on wtp1003 is CRITICAL: CRITICAL: Catalog fetch fail.
Either compilation failed or puppetmaster has issues
[12:32:33] RECOVERY - puppet last run on wtp1003 is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures
[12:48:38] Operations: Prometheus cronspam - https://phabricator.wikimedia.org/T151149#2808933 (elukey)
[12:48:52] Operations: Prometheus cronspam - https://phabricator.wikimedia.org/T151149#2808947 (elukey)
[12:57:35] PROBLEM - puppet last run on cp3048 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[13:27:35] RECOVERY - puppet last run on cp3048 is OK: OK: Puppet is currently enabled, last run 51 seconds ago with 0 failures
[15:23:02] Operations, Domains, Phabricator, Traffic: short URL for phabricator - https://phabricator.wikimedia.org/T151094#2809033 (Dzahn) Oops, yea, that was my bad. https://w.wiki
[15:32:53] Operations: eqiad: 1 hardware access request for labs on real hardware (mwoffliner) - https://phabricator.wikimedia.org/T117095#2809034 (Kelson) @chasemp @Andrew You are awesome! It looks like you just have fixed a many years old HW bootleneck problem! Thank you very much!
[15:52:13] Operations, hardware-requests: Analytics AQS cluster expansion - https://phabricator.wikimedia.org/T149920#2809066 (elukey)
[15:57:27] PROBLEM - puppet last run on cp3008 is CRITICAL: CRITICAL: Catalog fetch fail.
Either compilation failed or puppetmaster has issues
[16:02:29] (CR) Elukey: "This solution is not completely correct since requesting the /index.php page triggers a HTTP 200, rather than a 404, but it is needed to a" [puppet] - https://gerrit.wikimedia.org/r/322268 (https://phabricator.wikimedia.org/T137176) (owner: Elukey)
[16:26:27] RECOVERY - puppet last run on cp3008 is OK: OK: Puppet is currently enabled, last run 1 second ago with 0 failures
[16:43:27] PROBLEM - Redis status tcp_6479 on rdb2006 is CRITICAL: CRITICAL: replication_delay is 614 600 - REDIS 2.8.17 on 10.192.48.44:6479 has 1 databases (db0) with 3509244 keys, up 20 days 8 hours - replication_delay is 614
[17:09:27] RECOVERY - Redis status tcp_6479 on rdb2006 is OK: OK: REDIS 2.8.17 on 10.192.48.44:6479 has 1 databases (db0) with 3494263 keys, up 20 days 8 hours - replication_delay is 0
[17:33:27] PROBLEM - puppet last run on analytics1042 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:46:47] PROBLEM - puppet last run on labvirt1010 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:47:43] whats going on with tools-labs webservice copyvios isnt up and its always up
[18:02:27] RECOVERY - puppet last run on analytics1042 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures
[18:16:47] RECOVERY - puppet last run on labvirt1010 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures
[18:50:28] PROBLEM - Redis status tcp_6479 on rdb2006 is CRITICAL: CRITICAL ERROR - Redis Library - can not ping 10.192.48.44 on port 6479
[18:51:27] RECOVERY - Redis status tcp_6479 on rdb2006 is OK: OK: REDIS 2.8.17 on 10.192.48.44:6479 has 1 databases (db0) with 3500654 keys, up 20 days 10 hours - replication_delay is 0
[18:56:48] PROBLEM - puppet last run on mw1220 is CRITICAL: CRITICAL: Catalog fetch fail.
Either compilation failed or puppetmaster has issues
[19:16:37] (PS4) Zppix: Force a 404 on each HTTP request landing to a non configured domain [puppet] - https://gerrit.wikimedia.org/r/322268 (https://phabricator.wikimedia.org/T137176) (owner: Elukey)
[19:23:47] RECOVERY - puppet last run on mw1220 is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures
[19:38:17] PROBLEM - puppet last run on mw1268 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[20:06:25] RECOVERY - puppet last run on mw1268 is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures
[20:12:05] (PS1) ArielGlenn: MiscDir becomes MiscDumpDir. naming is hard, etc. [dumps] - https://gerrit.wikimedia.org/r/322509
[20:12:33] (CR) jenkins-bot: [V: -1] MiscDir becomes MiscDumpDir. naming is hard, etc. [dumps] - https://gerrit.wikimedia.org/r/322509 (owner: ArielGlenn)
[20:15:57] (PS2) ArielGlenn: MiscDir becomes MiscDumpDir. naming is hard, etc.
[dumps] - https://gerrit.wikimedia.org/r/322509
[20:17:45] (PS1) ArielGlenn: move more incremental-related methods out to incr_dumps module [dumps] - https://gerrit.wikimedia.org/r/322510 (https://phabricator.wikimedia.org/T133547)
[20:18:13] (CR) jenkins-bot: [V: -1] move more incremental-related methods out to incr_dumps module [dumps] - https://gerrit.wikimedia.org/r/322510 (https://phabricator.wikimedia.org/T133547) (owner: ArielGlenn)
[20:18:15] PROBLEM - Esams HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0]
[20:20:57] (PS2) ArielGlenn: move more incremental-related methods out to incr_dumps module [dumps] - https://gerrit.wikimedia.org/r/322510 (https://phabricator.wikimedia.org/T133547)
[20:22:06] (PS1) ArielGlenn: move methods that dump things into the IncrDump class in incr_dump [dumps] - https://gerrit.wikimedia.org/r/322511 (https://phabricator.wikimedia.org/T133547)
[20:22:33] (CR) jenkins-bot: [V: -1] move methods that dump things into the IncrDump class in incr_dump [dumps] - https://gerrit.wikimedia.org/r/322511 (https://phabricator.wikimedia.org/T133547) (owner: ArielGlenn)
[20:25:04] (PS2) ArielGlenn: move methods that dump things into the IncrDump class in incr_dump [dumps] - https://gerrit.wikimedia.org/r/322511 (https://phabricator.wikimedia.org/T133547)
[20:26:16] (PS1) ArielGlenn: add run method to the IncrDump class to be used by the generate wrapper [dumps] - https://gerrit.wikimedia.org/r/322512 (https://phabricator.wikimedia.org/T133547)
[20:26:51] I'm trying to do these now because I'm in crappy enough shape that I may not get to them tomorrow
[20:27:00] and rather have them in there just incase
[20:27:31] jouncebot next
[20:27:32] In 17 hour(s) and 32 minute(s): European Mid-day SWAT(Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20161121T1400)
[20:28:09] (PS1) ArielGlenn: a bit of pylint:
order of imports, var initialization type whines [dumps] - https://gerrit.wikimedia.org/r/322513
[20:29:24] (PS1) ArielGlenn: move options specific to adds/changes into args dict [dumps] - https://gerrit.wikimedia.org/r/322514 (https://phabricator.wikimedia.org/T133547)
[20:30:56] (PS2) ArielGlenn: move options specific to adds/changes into args dict [dumps] - https://gerrit.wikimedia.org/r/322514 (https://phabricator.wikimedia.org/T133547)
[20:32:00] (PS1) ArielGlenn: Change last few config options from 'incr' to 'misc' [dumps] - https://gerrit.wikimedia.org/r/322515 (https://phabricator.wikimedia.org/T133547)
[20:33:04] done, sorry for the spam
[20:33:17] i doint think that is spam.
[20:34:56] paladox its only spam if grrrit-wm crashes
[20:35:00] @ apergos
[20:35:06] LOL
[20:35:22] PROBLEM - Esams HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0]
[20:35:30] that or labs explodes again
[20:45:22] PROBLEM - Esams HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0]
[20:47:22] PROBLEM - Esams HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0]
[21:10:58] Operations, MediaWiki-General-or-Unknown, Traffic: Failure to save recent changes - https://phabricator.wikimedia.org/T150503#2809467 (Marshallsumter) I've made several efforts to well above about 6kB and all have worked!!! Another awesome job by the phabricator volunteers!!!
[21:18:22] PROBLEM - Esams HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0]
[21:20:22] PROBLEM - Esams HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0]
[21:23:22] PROBLEM - Esams HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0]
[21:35:22] PROBLEM - Esams HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0]
[21:58:32] PROBLEM - puppet last run on einsteinium is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[22:04:19] Puppet, Beta-Cluster-Infrastructure, Tracking: Deployment-prep hosts with puppet errors (tracking) - https://phabricator.wikimedia.org/T132259#2809567 (Krenair)
[22:06:37] Puppet, Beta-Cluster-Infrastructure, Goal: Remove all ::beta roles in puppet - https://phabricator.wikimedia.org/T86644#973295 (Krenair) I've been doing some of this recently: https://gerrit.wikimedia.org/r/#/c/322403/ - role::beta::uploadservice https://gerrit.wikimedia.org/r/#/c/322404/ - role::bet...
[22:27:32] RECOVERY - puppet last run on einsteinium is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures
[22:42:02] !log testing
[22:42:21] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[22:42:23] !Log testting
[22:44:15] ^^
[22:44:19] Reedy
[22:44:57] What do you want me to do about it?
[23:02:42] PROBLEM - puppet last run on cp1060 is CRITICAL: CRITICAL: Catalog fetch fail.
Either compilation failed or puppetmaster has issues
[23:06:06] (PS1) Alex Monk: beta: Move beta-specific VHosts into their own apache config file [puppet] - https://gerrit.wikimedia.org/r/322601 (https://phabricator.wikimedia.org/T1256)
[23:06:09] (PS1) Alex Monk: Move production apache config files to templates [puppet] - https://gerrit.wikimedia.org/r/322602 (https://phabricator.wikimedia.org/T1256)
[23:06:11] (PS1) Alex Monk: Use production apache config on beta [puppet] - https://gerrit.wikimedia.org/r/322603 (https://phabricator.wikimedia.org/T1256)
[23:06:13] (PS1) Alex Monk: Get rid of old beta_sites class now just containing a load of ensure => absent [puppet] - https://gerrit.wikimedia.org/r/322604 (https://phabricator.wikimedia.org/T1256)
[23:07:42] (CR) jenkins-bot: [V: -1] Move production apache config files to templates [puppet] - https://gerrit.wikimedia.org/r/322602 (https://phabricator.wikimedia.org/T1256) (owner: Alex Monk)
[23:07:48] (CR) jenkins-bot: [V: -1] Use production apache config on beta [puppet] - https://gerrit.wikimedia.org/r/322603 (https://phabricator.wikimedia.org/T1256) (owner: Alex Monk)
[23:08:14] (CR) jenkins-bot: [V: -1] Get rid of old beta_sites class now just containing a load of ensure => absent [puppet] - https://gerrit.wikimedia.org/r/322604 (https://phabricator.wikimedia.org/T1256) (owner: Alex Monk)
[23:10:41] (PS2) Alex Monk: Move production apache config files to templates [puppet] - https://gerrit.wikimedia.org/r/322602 (https://phabricator.wikimedia.org/T1256)
[23:10:43] (PS2) Alex Monk: Use production apache config on beta [puppet] - https://gerrit.wikimedia.org/r/322603 (https://phabricator.wikimedia.org/T1256)
[23:10:45] (PS2) Alex Monk: Get rid of old beta_sites class now just containing a load of ensure => absent [puppet] - https://gerrit.wikimedia.org/r/322604 (https://phabricator.wikimedia.org/T1256)
[23:15:22] PROBLEM - Esams HTTP 5xx
reqs/min on graphite1001 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0]
[23:16:42] (PS3) Alex Monk: Move some production apache config files to templates [puppet] - https://gerrit.wikimedia.org/r/322602 (https://phabricator.wikimedia.org/T1256)
[23:16:44] (PS3) Alex Monk: Use production apache config on beta [puppet] - https://gerrit.wikimedia.org/r/322603 (https://phabricator.wikimedia.org/T1256)
[23:16:46] (PS3) Alex Monk: Get rid of old beta_sites class now just containing a load of ensure => absent [puppet] - https://gerrit.wikimedia.org/r/322604 (https://phabricator.wikimedia.org/T1256)
[23:27:42] (PS4) Alex Monk: Move some production apache config files to templates [puppet] - https://gerrit.wikimedia.org/r/322602 (https://phabricator.wikimedia.org/T1256)
[23:27:44] (PS4) Alex Monk: Use production apache config on beta [puppet] - https://gerrit.wikimedia.org/r/322603 (https://phabricator.wikimedia.org/T1256)
[23:27:46] (PS4) Alex Monk: Get rid of old beta_sites class now just containing a load of ensure => absent [puppet] - https://gerrit.wikimedia.org/r/322604 (https://phabricator.wikimedia.org/T1256)
[23:31:36] RECOVERY - puppet last run on cp1060 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures
[23:34:16] PROBLEM - Esams HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [1000.0]
[23:39:16] PROBLEM - Esams HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [1000.0]
[23:46:46] PROBLEM - puppet last run on mw1241 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[23:53:16] RECOVERY - Esams HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0]