[00:06:08] !log lists: creating new list wikimedia-nys (Noongar language) (T159499)
[00:06:13] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[00:06:13] T159499: Create a Noongar (nys) Wikimedia community mailing list - https://phabricator.wikimedia.org/T159499
[00:11:08] Operations, Wikimedia-Mailing-lists: Reach out to Google about @yahoo.com emails not reaching gmail inboxes (when sent to mailing lists) - https://phabricator.wikimedia.org/T146841#2672637 (Dzahn) @Paladox you recently said that yahoo unblocked us and you are receiving mail again?
[00:13:27] PROBLEM - puppet last run on db1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[00:22:25] Operations, Wikimedia-Mailing-lists: Reach out to Google about @yahoo.com emails not reaching gmail inboxes (when sent to mailing lists) - https://phabricator.wikimedia.org/T146841#3111584 (Paladox) Yep but this task is about emails not reaching gmail users from a yahoo email address if you use the maili...
[00:22:28] RECOVERY - puppet last run on wdqs1002 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures
[00:30:44] Operations, Analytics, DBA: Prep to decommission old dbstore hosts (db1046, db1047) - https://phabricator.wikimedia.org/T156844#3111604 (whym) Thanks for the ping. To be honest, I don't remember precisely what tables I had there (it was a long time ago), but I believe mines are safe to delete at this...
[00:41:27] RECOVERY - puppet last run on db1001 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures
[01:46:17] Operations, Performance-Team, Wikidata, Wikimedia-Site-requests, Performance: Increase $wgExpensiveParserFunctionLimit on nowiki - https://phabricator.wikimedia.org/T160685#3111684 (Krinkle) >>! In T160685#3109425, @jeblad wrote: > Btw, the code at Wikidata can be found at [[ https://www.wiki...
[02:30:37] !log l10nupdate@tin scap sync-l10n completed (1.29.0-wmf.16) (duration: 11m 29s)
[02:30:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:36:00] !log l10nupdate@tin ResourceLoader cache refresh completed at Sat Mar 18 02:36:00 UTC 2017 (duration 5m 23s)
[02:36:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:49:30] (PS1) Catrope: Add rcenhancedfilters to BetaFeatures whitelist [mediawiki-config] - https://gerrit.wikimedia.org/r/343435
[02:49:32] (PS1) Catrope: Enable RCFilters beta feature on test wikis and mw.org [mediawiki-config] - https://gerrit.wikimedia.org/r/343436
[02:49:34] (PS1) Catrope: Enable RCFilters beta feature on plwiki and ptwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/343437
[02:49:36] (PS1) Catrope: Enable RCFilters beta feature on fawiki, nlwiki, ruwiki, trwiki, cswiki and Wikidata [mediawiki-config] - https://gerrit.wikimedia.org/r/343438
[02:49:38] (PS1) Catrope: Enable RCFilters beta feature on enwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/343439
[02:50:15] (CR) Catrope: [C: -2] "Not until March 28th" [mediawiki-config] - https://gerrit.wikimedia.org/r/343437 (owner: Catrope)
[02:50:26] (CR) Catrope: [C: -2] "Not until it's ready" [mediawiki-config] - https://gerrit.wikimedia.org/r/343438 (owner: Catrope)
[02:54:12] (CR) Catrope: [C: -2] "Not until it's ready" [mediawiki-config] - https://gerrit.wikimedia.org/r/343439 (owner: Catrope)
[02:54:12] (CR) Jforrester: [C: 1] "Good to go; this is Beta Features product owner sign-off." [mediawiki-config] - https://gerrit.wikimedia.org/r/343435 (owner: Catrope)
[03:05:37] PROBLEM - puppet last run on ms-be1023 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[03:34:37] RECOVERY - puppet last run on ms-be1023 is OK: OK: Puppet is currently enabled, last run 8 seconds ago with 0 failures
[04:41:37] PROBLEM - puppet last run on es1013 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[05:10:37] RECOVERY - puppet last run on es1013 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures
[06:30:09] Operations, Office-IT, User-Urbanecm: Request for email address seniori@wikimedia.org - https://phabricator.wikimedia.org/T160400#3111899 (Urbanecm) >>! In T160400#3111002, @eross wrote: > Thank you @Dzahn ! > > @Urbanecm I was able to create the account; I sent a couple of test emails including my...
[06:36:37] PROBLEM - puppet last run on labstore1003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[07:04:37] RECOVERY - puppet last run on labstore1003 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures
[07:12:47] PROBLEM - puppet last run on wtp1024 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[07:40:47] RECOVERY - puppet last run on wtp1024 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures
[08:19:37] PROBLEM - puppet last run on mw1206 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[08:47:37] RECOVERY - puppet last run on mw1206 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures
[09:23:37] Operations, Performance-Team, Reading-Infrastructure-Team, Reading-Web-Backlog, and 4 others: Deploy TemplateStyles to WMF production - https://phabricator.wikimedia.org/T133410#2231032 (Nirmos) T138420 needs to be solved before this happens. TemplateStyles is built on the assumption that there a...
[09:51:57] PROBLEM - puppet last run on labvirt1007 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[10:08:17] PROBLEM - puppet last run on cp3038 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[10:16:57] PROBLEM - puppet last run on copper is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[10:20:57] RECOVERY - puppet last run on labvirt1007 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures
[10:36:17] RECOVERY - puppet last run on cp3038 is OK: OK: Puppet is currently enabled, last run 0 seconds ago with 0 failures
[10:43:57] RECOVERY - puppet last run on copper is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures
[10:51:47] PROBLEM - puppet last run on mw1274 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[11:00:37] PROBLEM - puppet last run on db1039 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[11:19:47] RECOVERY - puppet last run on mw1274 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures
[11:28:37] RECOVERY - puppet last run on db1039 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures
[12:02:18] Operations, Performance-Team, Wikidata, Wikimedia-Site-requests, Performance: Increase $wgExpensiveParserFunctionLimit on nowiki - https://phabricator.wikimedia.org/T160685#3107696 (Lydia_Pintscher) Yes before it is increased please see if other methods like getting the latest version of the...
[12:14:47] PROBLEM - puppet last run on mc1028 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[12:16:44] (CR) Giuseppe Lavagetto: Add stages to manage maintenance (3 comments) [switchdc] - https://gerrit.wikimedia.org/r/342806 (https://phabricator.wikimedia.org/T160178) (owner: Giuseppe Lavagetto)
[12:18:47] (CR) Giuseppe Lavagetto: Add stages to manage maintenance (2 comments) [switchdc] - https://gerrit.wikimedia.org/r/342806 (https://phabricator.wikimedia.org/T160178) (owner: Giuseppe Lavagetto)
[12:21:37] PROBLEM - Hadoop NodeManager on analytics1028 is CRITICAL: PROCS CRITICAL: 0 processes with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager
[12:22:35] expired downtime --^
[12:22:36] fixing
[12:42:47] RECOVERY - puppet last run on mc1028 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures
[13:00:47] PROBLEM - puppet last run on francium is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[13:13:47] PROBLEM - puppet last run on mc1007 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[13:23:43] Operations, Performance-Team, Reading-Infrastructure-Team, Reading-Web-Backlog, and 4 others: Deploy TemplateStyles to WMF production - https://phabricator.wikimedia.org/T133410#3112106 (Nirmos) I have guarded svwiki against this now: https://sv.wikipedia.org/wiki/MediaWiki:Gadget-TemplateStylesG...
[13:28:47] RECOVERY - puppet last run on francium is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures
[13:41:47] RECOVERY - puppet last run on mc1007 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures
[14:39:17] (PS10) Giuseppe Lavagetto: Add stages to manage maintenance [switchdc] - https://gerrit.wikimedia.org/r/342806 (https://phabricator.wikimedia.org/T160178)
[14:39:19] (PS3) Giuseppe Lavagetto: Switch old tasks to use remote.Remote [switchdc] - https://gerrit.wikimedia.org/r/343319
[14:57:04] (PS11) BBlack: [POC] DNS zones to puppet repo [puppet] - https://gerrit.wikimedia.org/r/342887
[15:08:47] PROBLEM - puppet last run on mc1015 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[15:36:48] RECOVERY - puppet last run on mc1015 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures
[15:37:17] PROBLEM - puppet last run on cp4008 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[16:06:17] RECOVERY - puppet last run on cp4008 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures
[16:10:42] (PS12) BBlack: [POC] DNS zones to puppet repo [puppet] - https://gerrit.wikimedia.org/r/342887
[16:18:28] (PS1) Andrew Bogott: nfs-exportd: Properly collect IPs for volume exports. [puppet] - https://gerrit.wikimedia.org/r/343447 (https://phabricator.wikimedia.org/T160818)
[16:18:30] (PS1) Andrew Bogott: nfs-exportd: flake8 fixes [puppet] - https://gerrit.wikimedia.org/r/343448
[17:02:48] PROBLEM - puppet last run on mw1252 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:03:28] (PS13) BBlack: [POC] DNS zones to puppet repo [puppet] - https://gerrit.wikimedia.org/r/342887
[17:06:16] (PS14) BBlack: [POC] DNS zones to puppet repo [puppet] - https://gerrit.wikimedia.org/r/342887
[17:10:47] PROBLEM - puppet last run on dbproxy1004 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[17:30:48] RECOVERY - puppet last run on mw1252 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures
[17:31:12] (PS15) BBlack: [POC] DNS zones to puppet repo [puppet] - https://gerrit.wikimedia.org/r/342887
[17:49:48] PROBLEM - puppet last run on mw1279 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[18:03:47] PROBLEM - puppet last run on kafka1003 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[18:05:47] PROBLEM - Host db1094 is DOWN: PING CRITICAL - Packet loss = 100%
[18:09:47] RECOVERY - puppet last run on dbproxy1004 is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 failures
[18:16:14] cannot connect to db1094
[18:17:47] RECOVERY - puppet last run on mw1279 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures
[18:19:06] seems freezed on serial console
[18:20:34] !log powercycling db1094
[18:20:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:22:07] Operations, ops-eqiad, DBA: db1094 crash - https://phabricator.wikimedia.org/T160832#3112255 (jcrespo)
[18:23:57] RECOVERY - Host db1094 is UP: PING OK - Packet loss = 0%, RTA = 0.24 ms
[18:29:17] (PS1) Jcrespo: mariadb: Depool db1094 after crash [mediawiki-config] - https://gerrit.wikimedia.org/r/343451 (https://phabricator.wikimedia.org/T160832)
[18:31:47] RECOVERY - puppet last run on kafka1003 is OK: OK: Puppet is currently enabled, last run 33 seconds ago with 0 failures
[18:34:13] (PS2) Jcrespo: mariadb: Depool db1094 after crash [mediawiki-config] - https://gerrit.wikimedia.org/r/343451 (https://phabricator.wikimedia.org/T160832)
[18:36:00] (CR) Jcrespo: [V: 2 C: 2] mariadb: Depool db1094 after crash [mediawiki-config] - https://gerrit.wikimedia.org/r/343451 (https://phabricator.wikimedia.org/T160832) (owner: Jcrespo)
[18:37:19] (CR) jenkins-bot: mariadb: Depool db1094 after crash [mediawiki-config] - https://gerrit.wikimedia.org/r/343451 (https://phabricator.wikimedia.org/T160832) (owner: Jcrespo)
[18:38:30] !log jynus@tin Synchronized wmf-config/db-eqiad.php: Depool db1094 after crash (duration: 01m 02s)
[18:38:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:15:55] Operations, ops-eqiad, DBA, Patch-For-Review: db1094 crash - https://phabricator.wikimedia.org/T160832#3112255 (Marostegui) Did it page?? I never got any page or was it frozen but mysql was still up but unresponsive?
[19:23:05] (PS2) Andrew Bogott: nfs-exportd: Properly collect IPs for volume exports. [puppet] - https://gerrit.wikimedia.org/r/343447 (https://phabricator.wikimedia.org/T160818)
[19:23:07] (PS2) Andrew Bogott: nfs-exportd: flake8 fixes [puppet] - https://gerrit.wikimedia.org/r/343448
[19:37:46] (PS1) Andrew Bogott: nfs-exportd: Handle missing projects [puppet] - https://gerrit.wikimedia.org/r/343453 (https://phabricator.wikimedia.org/T160818)
[19:41:55] Operations, ops-eqiad, DBA, Patch-For-Review: db1094 crash - https://phabricator.wikimedia.org/T160832#3112339 (jcrespo) It didn't page. db1094 web interface is impossible to access- while I can access with no problem to db1093 and db1095. :-/
[19:42:26] (PS1) Rush: nfs-exportd: remove sugarcrm project from all exports [puppet] - https://gerrit.wikimedia.org/r/343454
[19:43:52] !log test on labstore1004 nfs-exportd candidate /root/nfs-exportd-candidate.py --observer-pass xxxxxx --interval 0 --config-path /etc/nfs-mounts.yaml --exports-d-path /root/fake_export/ --debug
[19:43:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:44:09] (CR) Andrew Bogott: [C: 1] "I'd like to keep sugarcrm in as a test case for 343453 and merge after we're sure it's handled properly." [puppet] - https://gerrit.wikimedia.org/r/343454 (owner: Rush)
[19:45:09] (CR) Rush: [C: 2] nfs-exportd: remove sugarcrm project from all exports [puppet] - https://gerrit.wikimedia.org/r/343454 (owner: Rush)
[19:50:37] Operations, ops-eqiad, DBA: db1094 crash - https://phabricator.wikimedia.org/T160832#3112340 (jcrespo)
[19:50:47] PROBLEM - puppet last run on mw1238 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[19:56:36] Operations, ops-eqiad, DBA: db1094 crash - https://phabricator.wikimedia.org/T160832#3112346 (Marostegui) Apart from contacting HP we might as well upgrade the firmware if possible anyways: http://h20564.www2.hpe.com/hpsc/doc/public/display?docId=emr_na-c05187655&sp4ts.oid=7252820
[20:06:31] (CR) Rush: [C: 1] nfs-exportd: Properly collect IPs for volume exports. [puppet] - https://gerrit.wikimedia.org/r/343447 (https://phabricator.wikimedia.org/T160818) (owner: Andrew Bogott)
[20:06:35] (PS3) Rush: nfs-exportd: Properly collect IPs for volume exports. [puppet] - https://gerrit.wikimedia.org/r/343447 (https://phabricator.wikimedia.org/T160818) (owner: Andrew Bogott)
[20:08:50] (CR) Andrew Bogott: [C: 2] nfs-exportd: Properly collect IPs for volume exports. [puppet] - https://gerrit.wikimedia.org/r/343447 (https://phabricator.wikimedia.org/T160818) (owner: Andrew Bogott)
[20:08:59] (CR) Andrew Bogott: [C: 2] nfs-exportd: flake8 fixes [puppet] - https://gerrit.wikimedia.org/r/343448 (owner: Andrew Bogott)
[20:09:07] (PS3) Andrew Bogott: nfs-exportd: flake8 fixes [puppet] - https://gerrit.wikimedia.org/r/343448
[20:09:44] (CR) Andrew Bogott: [C: 2] nfs-exportd: Handle missing projects [puppet] - https://gerrit.wikimedia.org/r/343453 (https://phabricator.wikimedia.org/T160818) (owner: Andrew Bogott)
[20:10:01] (PS2) Andrew Bogott: nfs-exportd: Handle missing projects [puppet] - https://gerrit.wikimedia.org/r/343453 (https://phabricator.wikimedia.org/T160818)
[20:16:43] !log labstore1005 service nfs-exportd restart
[20:17:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:18:47] RECOVERY - puppet last run on mw1238 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures
[20:19:57] PROBLEM - puppet last run on es1016 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[20:20:40] Operations, Labs: Add monitoring for nfs-exportd on active labstore specifically - https://phabricator.wikimedia.org/T160838#3112370 (chasemp)
[20:23:20] (PS1) Andrew Bogott: nfs-exportd: Refresh service if script or .yaml changes. [puppet] - https://gerrit.wikimedia.org/r/343459
[20:48:57] RECOVERY - puppet last run on es1016 is OK: OK: Puppet is currently enabled, last run 58 seconds ago with 0 failures
[21:22:57] PROBLEM - puppet last run on analytics1036 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[21:23:58] PROBLEM - puppet last run on mc1016 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[21:50:57] RECOVERY - puppet last run on analytics1036 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures
[21:51:57] RECOVERY - puppet last run on mc1016 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures
[22:16:57] PROBLEM - puppet last run on ms-fe1001 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[22:40:27] PROBLEM - puppet last run on cp3043 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[22:43:57] RECOVERY - puppet last run on ms-fe1001 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures
[23:08:27] RECOVERY - puppet last run on cp3043 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures
[23:15:04] bblack, have you ever measured netmapper's performance? How many lookups can it do per second?
[23:18:07] PROBLEM - puppet last run on db1034 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[23:47:07] RECOVERY - puppet last run on db1034 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures