[00:00:07] twentyafterfour: Respected human, time to deploy Phabricator update (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20160811T0000). Please do the needful.
[00:01:52] OK, with the excitement over, I'm off.
[00:02:05] 06Operations, 10ops-codfw: mw2086 was down - https://phabricator.wikimedia.org/T142661#2542466 (10Dzahn) once/if fixed https://gerrit.wikimedia.org/r/#/c/304139/ should be reverted please
[00:02:30] RECOVERY - nodepoold running on labnodepool1001 is OK: PROCS OK: 1 process with UID = 113 (nodepool), regex args ^/usr/bin/python /usr/bin/nodepoold -d
[00:02:56] James_F|Away: take care
[00:04:00] 06Operations, 10ops-codfw: mw2086 was down - https://phabricator.wikimedia.org/T142661#2542482 (10Papaul) @Dzahn I was working on this with Moritz to troubleshoot IPMI issue he was having so he had to take the system offline. He planned on re imaging the node tomorrow.
[00:04:06] 06Operations, 10ops-codfw: mw2086 was down - https://phabricator.wikimedia.org/T142661#2542483 (10Dzahn)
[00:04:50] PROBLEM - Unmerged changes on repository mediawiki_config on mira is CRITICAL: There is one unmerged change in mediawiki_config (dir /srv/mediawiki-staging/, ref HEAD..readonly/master).
[00:05:15] what is that ^?
[00:05:19] or, who is that ^^
[00:05:46] 06Operations, 10MediaWiki-Database, 06Performance-Team: periodic spike of MW exceptions "DB connection was already closed or the connection dropped." - https://phabricator.wikimedia.org/T142079#2542509 (10aaron) p:05Triage>03Normal
[00:06:13] Did " Enabling thank-you-edit on beta for testing." get deployed/
[00:06:44] looks like it
[00:07:25] greg-g: it's missing the reinstate commit
[00:07:56] scap sync-wikiversions doesn't sync git stuff around?
[00:08:48] BTW, beers all around, guys. :)
[00:08:51] I think it has been said that, no it doesn't do the co-master sync properly
[00:09:59] Is there a bug filed? :)
[00:10:12] Reedy: :/ it doesn't and I see why. I'll JFDI and make a fix
[00:10:14] * Reedy sync-file's wikiversions.json
[00:10:20] <3
[00:10:40] RECOVERY - Unmerged changes on repository mediawiki_config on mira is OK: No changes to merge.
[00:10:42] !log reedy@tin Synchronized wikiversions.json: noop for mira (duration: 00m 49s)
[00:10:47] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[00:11:51] Reedy: James_F|Away who's on writing the report?
[00:12:08] I can do it, but it'll have to wait for morning
[00:12:13] It's after 1am already
[00:12:30] Don't mind fixing up the mess(es) I've caused. Screw writing reports at this time of night ;)
[00:12:41] 1am here too :)
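The "Unmerged changes" alert above names both the clone and the ref it compares, so the gap on the co-master can be inspected by hand before re-syncing. A minimal sketch, assuming shell access to mira and the staging layout quoted in the alert:

    # list commits on the readonly remote that the local branch has not merged yet
    cd /srv/mediawiki-staging
    git log --oneline HEAD..readonly/master

An empty listing matches the "No changes to merge" recovery above; Reedy's sync-file of wikiversions.json is what cleared it here.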
[00:15:50] That looks like a spam bot
[00:15:59] What does?
[00:16:00] since it was blocked in #mediawiki
[00:16:24] * Platonides sets ban on *!*@95.141.36.119
[00:16:25] * Platonides has kicked _jem_ from #mediawiki (_jem_)
[00:18:25] lol, kline
[00:19:20] PROBLEM - mediawiki-installation DSH group on mw2086 is CRITICAL: Host mw2086 is not in mediawiki-installation dsh group
[00:19:30] ^^ that cant be good
[00:19:33] nice
[00:19:45] well no, that is what was requested
[00:19:55] take it out of groups because it was down
[00:20:13] yup, so just ack
[00:21:00] paladox: it's the trol
[00:21:10] oh
[00:21:14] yep
[00:21:17] he was impersonating jem
[00:21:24] oh
[00:21:42] ACKNOWLEDGEMENT - mediawiki-installation DSH group on mw2086 is CRITICAL: Host mw2086 is not in mediawiki-installation dsh group daniel_zahn https://phabricator.wikimedia.org/T142661
[00:22:13] actually, my kickban was done automatically by a script :)
[00:22:39] :)
[00:23:09] * paladox is going to watch tv 01:22am bst here
[00:23:26] Platonides: what was the ban for anyway :P
[00:23:32] flooding
[00:23:39] ah
[00:24:04] Zppix: all those are proxies/compromised hosts
[00:24:13] all those?
[00:24:33] he enters in the channels and starts flooding / shouting insults
[00:24:38] how do you detect that they are proxies?
[00:24:56] run them against a database of know proxies i assume
[00:25:09] just like how en wiki has procsee bot or whatever
[00:25:14] I don't know if they are proxies actually
[00:25:25] but he connects from all over the world
[00:25:39] probably vpns then Platonides
[00:25:40] thus... that doesn't seem his home dsl :)
[00:25:59] he could have fibre
[00:26:03] or fttc
[00:26:10] or dial-up
[00:26:15] lol dial-up
[00:26:17] lol
[00:26:28] that would be the slowess thing
[00:26:37] he seems to be really quick.
[00:26:51] The noise would suck too :P
[00:27:00] yeh
[00:27:10] Let me just blast some alien sounding noises into ur brain
[00:27:17] lol
[00:27:33] Brb bathroom
[00:31:45] LOL
[00:33:40] http://knowyourmeme.com/memes/good-luck-im-behind-7-proxies
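Checking an address like the one banned above against a database of known proxies usually means a DNSBL lookup: reverse the octets and query the blacklist zone. A sketch, assuming the DroneBL zone as the example list; other blacklists work the same way:

    # 95.141.36.119 reversed -> 119.36.141.95; an NXDOMAIN answer means "not listed"
    host 119.36.141.95.dnsbl.dronebl.org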
[00:35:04] (03CR) 10Dzahn: [C: 032] Stop icinga git remote update [puppet] - 10https://gerrit.wikimedia.org/r/303955 (https://phabricator.wikimedia.org/T127093) (owner: 10Thcipriani)
[00:37:37] lol mutante
[00:39:37] !log aaron@tin Synchronized php-1.28.0-wmf.13/maintenance/purgeChangedFiles.php: 4fe4f803541c4b21c5a118eab426d78d6c0c607b (duration: 00m 50s)
[00:39:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[00:40:37] 06Operations, 10Ops-Access-Requests: Access for platonides to chanops - https://phabricator.wikimedia.org/T142668#2542616 (10Platonides)
[00:41:38] 06Operations, 10Wikimedia-Mailing-lists: cleanup mailman archives - introduce apache rewrites - https://phabricator.wikimedia.org/T109609#2542631 (10Dzahn) I did this for wikidata-l -> wikidata with https://gerrit.wikimedia.org/r/#/c/304055/ It was separately requested in T136798
[00:41:51] !log aaron@tin Synchronized php-1.28.0-wmf.14/maintenance/purgeChangedFiles.php: 91baa668219d8a77b1e300c257e969b04b4b1e3d (duration: 00m 47s)
[00:41:55] greg-g: who knew sourdough grilled cheese and tomato would be so good
[00:41:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[00:42:00] * AaronSchulz is an addict
[00:46:59] AaronSchulz: https://themelt.com/menu -> The Grilled Cheese -> Egg-In-A-Hole -> additions Tomato
[00:47:41] * AaronSchulz is just doing it at home
[00:47:53] I mean, it's not super hard ;)
[00:47:56] bd808: Is a self._before_cluster_sync() call not needed before sync_wikiversions?
[00:48:18] man, this purge script is slow...I wonder if it's swift
[00:48:47] probably the latency of codfw
[00:49:20] LocalFile api makes it hard to batch smartly
[00:55:30] (03CR) 10Krinkle: [C: 031] rcstream: remove internal TLS listener [puppet] - 10https://gerrit.wikimedia.org/r/304023 (https://phabricator.wikimedia.org/T134871) (owner: 10BBlack)
[00:56:29] How do u change ur quit msg anyway?
[00:57:00] Zppix: you can do /quit message
[00:57:16] some clients allow configuring a default
[00:58:45] welp that worked...
[01:00:16] isn't that a bit long?
[01:00:51] (03PS1) 10Dzahn: wikimedia.org: repeat hostname on each line for multi records [dns] - 10https://gerrit.wikimedia.org/r/304155
[01:00:56] does it actually save tho
[01:01:07] Platonides: i cant help i do some many things with WMF :P
[01:01:11] MVD, ENWPSV, ENWPRC, PCR (tm)
[01:01:55] Platonides: If i had a biz card for what i did on WMF it would probably be the length of a normal piece of paper lol
[01:03:00] whats the difference between postmerge and post on Zuul
[01:03:08] need badges for that stuff on phab :)
[01:05:30] (03CR) 10Dzahn: [C: 031] rcstream: remove internal TLS listener [puppet] - 10https://gerrit.wikimedia.org/r/304023 (https://phabricator.wikimedia.org/T134871) (owner: 10BBlack)
[01:06:47] (03CR) 10Dzahn: "ok, fair. then i'm abandoning. (all i did manually was "git pull origin" fwiw)" [puppet] - 10https://gerrit.wikimedia.org/r/303719 (owner: 10Dzahn)
[01:07:01] (03Abandoned) 10Dzahn: dbtree: ensure puppet is deploying changes [puppet] - 10https://gerrit.wikimedia.org/r/303719 (owner: 10Dzahn)
[01:15:23] PROBLEM - MediaWiki exceptions and fatals per minute on graphite1001 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [50.0]
[01:16:20] ^^ sounds anything but good
[01:17:20] RECOVERY - MediaWiki exceptions and fatals per minute on graphite1001 is OK: OK: Less than 1.00% above the threshold [25.0]
[01:20:41] looks like commons purging pass is done...the thumbnail I was watching looks normal now
[01:21:14] AaronSchulz: i think jenkins is prob on fire atm
[01:22:00] 06Operations, 10Ops-Access-Requests: Access for platonides to chanops - https://phabricator.wikimedia.org/T142668#2542616 (10Paladox) +1
[01:22:30] (03PS2) 10Alex Monk: [WIP] dnsrecursor: Rewrite code setting up lua hooks [puppet] - 10https://gerrit.wikimedia.org/r/304146 (https://phabricator.wikimedia.org/T139438)
[02:01:13] (03PS1) 10Dzahn: planet: more maintenance, http->https, rm broken urls etc [puppet] - 10https://gerrit.wikimedia.org/r/304159
[02:02:13] (03PS2) 10Dzahn: planet: more maintenance, http->https, rm broken urls etc [puppet] - 10https://gerrit.wikimedia.org/r/304159 (https://phabricator.wikimedia.org/T141480)
[02:06:28] (03CR) 10Dzahn: "@Niharika, do you have a new blog that replaces the one on roon.io? it seems that whole site redirects to ghost.org now" [puppet] - 10https://gerrit.wikimedia.org/r/304159 (https://phabricator.wikimedia.org/T141480) (owner: 10Dzahn)
[02:08:44] mutante: did you use a tool to detect equivalence between http and https versions?
[02:09:59] n
[02:11:56] (03CR) 10Dereckson: [C: 031] "Removal seems indeed to match dead blogs, https transition looks good to me." [puppet] - 10https://gerrit.wikimedia.org/r/304159 (https://phabricator.wikimedia.org/T141480) (owner: 10Dzahn)
[02:12:08] (03PS1) 10Yuvipanda: prometheus: Use 4sp indent rather than 2sp [puppet] - 10https://gerrit.wikimedia.org/r/304160
[02:12:10] (03PS1) 10Yuvipanda: tools: Add ssh checks to prometheus [puppet] - 10https://gerrit.wikimedia.org/r/304161
[02:13:11] (03CR) 10Niharika29: "HI @Dzahn, yeah ghost.org took over soon.io and shut it down. I don't have another blog, you can remove that record." [puppet] - 10https://gerrit.wikimedia.org/r/304159 (https://phabricator.wikimedia.org/T141480) (owner: 10Dzahn)
[02:18:07] (03PS2) 10Yuvipanda: prometheus: Use 4sp indent rather than 2sp [puppet] - 10https://gerrit.wikimedia.org/r/304160
[02:18:40] PROBLEM - puppet last run on elastic2017 is CRITICAL: CRITICAL: Puppet has 1 failures
[02:19:28] (03PS2) 10Yuvipanda: tools: Add ssh checks to prometheus [puppet] - 10https://gerrit.wikimedia.org/r/304161
[02:27:38] !log mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.13) (duration: 11m 12s)
[02:27:45] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[02:32:13] (03CR) 10jenkins-bot: [V: 04-1] [WIP] dnsrecursor: Rewrite code setting up lua hooks [puppet] - 10https://gerrit.wikimedia.org/r/304146 (https://phabricator.wikimedia.org/T139438) (owner: 10Alex Monk)
[02:38:52] (03PS3) 10Alex Monk: [WIP] dnsrecursor: Rewrite code setting up lua hooks [puppet] - 10https://gerrit.wikimedia.org/r/304146 (https://phabricator.wikimedia.org/T139438)
[02:41:53] (03CR) 10Yuvipanda: [C: 032 V: 032] prometheus: Use 4sp indent rather than 2sp [puppet] - 10https://gerrit.wikimedia.org/r/304160 (owner: 10Yuvipanda)
[02:43:26] (03PS3) 10Yuvipanda: tools: Add ssh checks to prometheus [puppet] - 10https://gerrit.wikimedia.org/r/304161
[02:43:28] (03PS1) 10Yuvipanda: prometheus: Restore node exporter prefix requirements [puppet] - 10https://gerrit.wikimedia.org/r/304164
[02:45:52] RECOVERY - puppet last run on elastic2017 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[02:49:33] !log mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.14) (duration: 10m 24s)
[02:49:39] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[02:56:14] (03PS2) 10Yuvipanda: prometheus: Restore node exporter prefix requirements [puppet] - 10https://gerrit.wikimedia.org/r/304164
[02:56:16] (03PS4) 10Yuvipanda: tools: Add ssh checks to prometheus [puppet] - 10https://gerrit.wikimedia.org/r/304161
[02:56:28] !log l10nupdate@tin ResourceLoader cache refresh completed at Thu Aug 11 02:56:27 UTC 2016 (duration 6m 54s)
[02:56:32] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[03:14:22] (03CR) 10Yuvipanda: [C: 032] tools: Add ssh checks to prometheus [puppet] - 10https://gerrit.wikimedia.org/r/304161 (owner: 10Yuvipanda)
[03:14:28] (03CR) 10Yuvipanda: [C: 032] prometheus: Restore node exporter prefix requirements [puppet] - 10https://gerrit.wikimedia.org/r/304164 (owner: 10Yuvipanda)
[03:32:13] (03PS1) 10Yuvipanda: prometheus: Don't require jessie for node exporter [puppet] - 10https://gerrit.wikimedia.org/r/304167
[03:32:35] (03CR) 10Yuvipanda: [C: 032 V: 032] prometheus: Don't require jessie for node exporter [puppet] - 10https://gerrit.wikimedia.org/r/304167 (owner: 10Yuvipanda)
[03:34:10] PROBLEM - MariaDB Slave Lag: s1 on dbstore1002 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 896.03 seconds
[03:39:50] RECOVERY - MariaDB Slave Lag: s1 on dbstore1002 is OK: OK slave_sql_lag Replication lag: 245.54 seconds
[03:59:00] rat
[03:59:31] ?
[04:10:30] PROBLEM - Make sure enwiki dumps are not empty on checker.tools.wmflabs.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 SERVICE UNAVAILABLE - string OK not found on http://checker.tools.wmflabs.org:80/dumps - 288 bytes in 0.011 second response time
[05:02:47] AaronSchulz: as long as it's not from The Melt, it probably is :)
[05:07:07] hehee
[05:10:45] That wasn't creepy at all...
[05:10:52] The laugh anyway...
[05:15:20] mutante: I said that before I saw your follow-up :)
[05:16:34] Dereckson: the tool i used was "bash, curl, grep, sed .. bla " .. please see https://phabricator.wikimedia.org/P3817
[05:16:40] greg-g: hehee, even better:)
[05:18:24] bbl
[05:23:14] (03PS1) 10Yuvipanda: labs: Allow ssh access from per-project prometheus hosts [puppet] - 10https://gerrit.wikimedia.org/r/304169
[05:29:39] (03PS3) 10Dzahn: planet: more maintenance, http->https, rm broken urls etc [puppet] - 10https://gerrit.wikimedia.org/r/304159 (https://phabricator.wikimedia.org/T141480)
[05:31:21] (03CR) 10Dzahn: [C: 032] planet: more maintenance, http->https, rm broken urls etc [puppet] - 10https://gerrit.wikimedia.org/r/304159 (https://phabricator.wikimedia.org/T141480) (owner: 10Dzahn)
[05:31:41] (03PS4) 10Dzahn: planet: more maintenance, http->https, rm broken urls etc [puppet] - 10https://gerrit.wikimedia.org/r/304159 (https://phabricator.wikimedia.org/T141480)
[05:34:03] (03PS3) 10Dzahn: Stop icinga git remote update [puppet] - 10https://gerrit.wikimedia.org/r/303955 (https://phabricator.wikimedia.org/T127093) (owner: 10Thcipriani)
[05:34:47] (03CR) 10Yuvipanda: "Did the repo get created?" [puppet] - 10https://gerrit.wikimedia.org/r/302119 (https://phabricator.wikimedia.org/T141636) (owner: 10Addshore)
[05:36:17] (03CR) 10Dzahn: "thanks, merged and submitted now. jenkins is fast again. yay" [puppet] - 10https://gerrit.wikimedia.org/r/303955 (https://phabricator.wikimedia.org/T127093) (owner: 10Thcipriani)
[05:37:40] 06Operations, 10Deployment-Systems: error on tin:/srv/mediawiki-staging: insufficient permission for adding an object to repository database .git/objects - https://phabricator.wikimedia.org/T127093#2542840 (10Dzahn) this should mean that the "root owned files on staging"-issue is over with now... :)
[05:40:05] 06Operations, 10Deployment-Systems: error on tin:/srv/mediawiki-staging: insufficient permission for adding an object to repository database .git/objects - https://phabricator.wikimedia.org/T127093#2542845 (10greg) p:05Triage>03Normal
[05:40:17] 06Operations, 10Deployment-Systems: error on tin:/srv/mediawiki-staging: insufficient permission for adding an object to repository database .git/objects - https://phabricator.wikimedia.org/T127093#2032127 (10greg) a:03thcipriani
[05:44:13] 06Operations, 10Deployment-Systems: error on tin:/srv/mediawiki-staging: insufficient permission for adding an object to repository database .git/objects - https://phabricator.wikimedia.org/T127093#2542861 (10Dzahn) ``` [tin:/srv/mediawiki-staging] $ find . -uid 0 ./.git/refs/remotes/readonly/master ./.git/obj...
[06:01:21] PROBLEM - Improperly owned -0:0- files in /srv/mediawiki-staging on tin is CRITICAL: Improperly owned (0:0) files in /srv/mediawiki-staging
[06:01:54] eh?
[06:02:01] but daniel just fixed you!
[06:02:51] PROBLEM - Improperly owned -0:0- files in /srv/mediawiki-staging on mira is CRITICAL: Improperly owned (0:0) files in /srv/mediawiki-staging
[06:06:24] gjg@tin:/srv/mediawiki-staging$ find . -uid 0
[06:06:25] gjg@tin:/srv/mediawiki-staging$
[06:06:45] gjg@mira:/srv/mediawiki-staging$ find . -uid 0
[06:06:46] gjg@mira:/srv/mediawiki-staging$
[06:07:57] 06Operations, 10Deployment-Systems: error on tin:/srv/mediawiki-staging: insufficient permission for adding an object to repository database .git/objects - https://phabricator.wikimedia.org/T127093#2542920 (10greg) Just to spite you, I think: ``` 06:01 < icinga-wm> PROBLEM - Improperly owned -0:0- files in /...
[06:09:52] hilarious
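The "Improperly owned -0:0-" check above flags files owned by uid/gid 0 under the staging tree, while find . -uid 0 only covers the owner half of that. A sketch of a slightly wider sweep, assuming the same staging path as above:

    # anything under the staging tree owned by root, whether by user or by group
    find /srv/mediawiki-staging -uid 0 -o -gid 0

A group-owned stray would evade the uid-only search that came back empty on tin and mira above, which is one plausible reason the check could still alert.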
[06:38:20] (03PS1) 10Dzahn: wmnet: host names on each line, fix indentation, misc cleanup [dns] - 10https://gerrit.wikimedia.org/r/304171
[06:48:13] (03PS2) 10Muehlenhoff: etcd: Use DOMAIN_NETWORKS [puppet] - 10https://gerrit.wikimedia.org/r/299116
[06:51:43] (03PS2) 10Yuvipanda: labs: Allow ssh access from per-project prometheus hosts [puppet] - 10https://gerrit.wikimedia.org/r/304169
[06:52:22] (03CR) 10Yuvipanda: [C: 032 V: 032] "Puppet compiler says Noop" [puppet] - 10https://gerrit.wikimedia.org/r/304169 (owner: 10Yuvipanda)
[06:54:04] (03CR) 10Jcrespo: "We do not pull, we rebase; the issue may have been with how things had been merged (from a different tree)." [puppet] - 10https://gerrit.wikimedia.org/r/303719 (owner: 10Dzahn)
[07:13:37] (03PS1) 10ArielGlenn: updated pylint means new pylint cleanup [dumps] - 10https://gerrit.wikimedia.org/r/304179
[07:25:28] (03CR) 10Muehlenhoff: "Thanks, looks good. A patch on the kernel side is expected to land in 4.4.18." [puppet] - 10https://gerrit.wikimedia.org/r/304050 (owner: 10BBlack)
[07:38:38] !log increaseing traffic weight to 30 for mw1261 (current: 5)
[07:38:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[07:39:03] (03PS1) 10Muehlenhoff: Update to 4.4.17 [debs/linux44] - 10https://gerrit.wikimedia.org/r/304183
[07:59:08] (03CR) 10Muehlenhoff: [C: 032] Update to 4.4.17 [debs/linux44] - 10https://gerrit.wikimedia.org/r/304183 (owner: 10Muehlenhoff)
[08:10:47] 06Operations: potassium - 'puppetmaster.test.eqiad.wmnet' did not match server certificate - https://phabricator.wikimedia.org/T141839#2543084 (10akosiaris) 05Open>03Resolved a:03akosiaris Seems like precise machines did not really like the test vhost we had setup on palladium for the upgrade to 3.8. Anyw...
[08:11:10] RECOVERY - puppet last run on potassium is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[08:13:08] yuvipanda: the repo has been created and the code is in :)
[08:18:02] !log restarting kafka on the kafka analytics cluster for jvm upgrades
[08:18:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[08:23:49] (03PS1) 10Alexandros Kosiaris: ferm: Remove ALL_NETWORK_SUBNETS [puppet] - 10https://gerrit.wikimedia.org/r/304185
[08:28:40] 07Puppet, 10Continuous-Integration-Config, 07Jenkins: There is no sane way to get arcanist's conduit tokens onto nodepool CI slaves - https://phabricator.wikimedia.org/T140417#2543103 (10mmodell)
[08:30:22] (03CR) 10Muehlenhoff: [C: 031] ferm: Remove ALL_NETWORK_SUBNETS [puppet] - 10https://gerrit.wikimedia.org/r/304185 (owner: 10Alexandros Kosiaris)
[08:31:52] (03PS4) 10Jcrespo: Remove labsdb::manager [puppet] - 10https://gerrit.wikimedia.org/r/302427
[08:40:00] 06Operations, 10ops-codfw: mw2086 was down - https://phabricator.wikimedia.org/T142661#2543131 (10Peachey88)
[08:50:21] !log uploaded apache2 2.4.10-10+deb8u6+wmf2 to reprepro
[08:50:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[08:51:59] (03CR) 10ArielGlenn: [C: 032] updated pylint means new pylint cleanup [dumps] - 10https://gerrit.wikimedia.org/r/304179 (owner: 10ArielGlenn)
[08:54:06] (03PS1) 10ArielGlenn: remove wikiqueries script, replaced by onallwikis which does more [dumps] - 10https://gerrit.wikimedia.org/r/304187
[08:56:00] PROBLEM - statsv process on hafnium is CRITICAL: PROCS CRITICAL: 0 processes with command name python, args statsv
[08:57:00] (03CR) 10Alexandros Kosiaris: [C: 032] ferm: Remove ALL_NETWORK_SUBNETS [puppet] - 10https://gerrit.wikimedia.org/r/304185 (owner: 10Alexandros Kosiaris)
[08:57:30] (03CR) 10Alexandros Kosiaris: "pcc happy at https://puppet-compiler.wmflabs.org/3664/carbon.wikimedia.org/" [puppet] - 10https://gerrit.wikimedia.org/r/304185 (owner: 10Alexandros Kosiaris)
[08:57:59] 06Operations, 06Commons, 10Wikimedia-SVG-rendering, 07User-notice: SVG files larger than 10 MB cannot be thumbnailed - https://phabricator.wikimedia.org/T111815#1616960 (10Johan) Since this has been tagged for inclusion in Tech News: When will this fix go into production? Do you know?
[08:58:54] (03PS2) 10ArielGlenn: remove wikiqueries script, replaced by onallwikis which does more [dumps] - 10https://gerrit.wikimedia.org/r/304187
[09:01:00] (03PS5) 10Jcrespo: Remove labsdb::manager [puppet] - 10https://gerrit.wikimedia.org/r/302427
[09:01:11] !log upgrading httpd to 2.4.10-10+deb8u6+wmf2 on mw126[234]
[09:01:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[09:01:16] 06Operations, 06Commons, 10Wikimedia-SVG-rendering, 07User-notice: SVG files larger than 10 MB cannot be thumbnailed - https://phabricator.wikimedia.org/T111815#2543171 (10MoritzMuehlenhoff) @Johan This still needs more tests before it can be enabled in production, I'll update this Phab task when that has...
[09:01:52] (03CR) 10Gehel: [C: 04-1] "Thanks for the cleanup! I'd like to check with Erik that all needed functionalities have been correctly migrated to relforge before actual" [puppet] - 10https://gerrit.wikimedia.org/r/304112 (https://phabricator.wikimedia.org/T142581) (owner: 10Dzahn)
[09:02:00] RECOVERY - statsv process on hafnium is OK: PROCS OK: 13 processes with command name python, args statsv
[09:02:54] (03CR) 10ArielGlenn: [C: 032] remove wikiqueries script, replaced by onallwikis which does more [dumps] - 10https://gerrit.wikimedia.org/r/304187 (owner: 10ArielGlenn)
[09:03:37] 06Operations, 06Commons, 10Wikimedia-SVG-rendering, 07User-notice: SVG files larger than 10 MB cannot be thumbnailed - https://phabricator.wikimedia.org/T111815#2543185 (10Johan) OK, thanks. (:
[09:04:07] (03PS6) 10Giuseppe Lavagetto: puppetmaster: add role for puppetdb [puppet] - 10https://gerrit.wikimedia.org/r/303801 (https://phabricator.wikimedia.org/T142363)
[09:04:53] (03CR) 10Jcrespo: [C: 032] Remove labsdb::manager [puppet] - 10https://gerrit.wikimedia.org/r/302427 (owner: 10Jcrespo)
[09:05:59] (03CR) 10Muehlenhoff: [C: 032] etcd: Use DOMAIN_NETWORKS [puppet] - 10https://gerrit.wikimedia.org/r/299116 (owner: 10Muehlenhoff)
[09:06:03] (03PS3) 10Muehlenhoff: etcd: Use DOMAIN_NETWORKS [puppet] - 10https://gerrit.wikimedia.org/r/299116
[09:06:08] (03CR) 10Muehlenhoff: [V: 032] etcd: Use DOMAIN_NETWORKS [puppet] - 10https://gerrit.wikimedia.org/r/299116 (owner: 10Muehlenhoff)
[09:06:20] I did a puppet merge and got an error
[09:06:42] codfw: Removing node mw2086.codfw.wmnet from cluster imagescaler/apache2
[09:06:59] (03PS1) 10Ema: Install package varnish-modules on v4 hosts [puppet] - 10https://gerrit.wikimedia.org/r/304189 (https://phabricator.wikimedia.org/T122881)
[09:07:12] moritzm, could that be related to your change?
[09:07:35] which change, the etcd one I just merged?
[09:07:44] yes
[09:07:45] don't think so
[09:08:08] Running conftool-sync on /etc/conftool/data
[09:08:41] Daniel merged a patch to drop it from dsh yesterday
[09:08:46] ERROR:conftool:delete_node Backend error while deleting node: Backend error: The request requires user authentication : Insufficient credentials
[09:08:53] <_joe_> sudo -i puppet-merge
[09:09:21] oh, requires root uid?
[09:09:26] <_joe_> yes
[09:09:26] (03PS1) 10Alexandros Kosiaris: Move labtest hiera in regex.yaml [puppet] - 10https://gerrit.wikimedia.org/r/304190
[09:09:55] now things are probably broken
[09:10:16] trigger was not executed
[09:12:23] but why would that happen in the first place if it is an older patch?
[09:12:51] PROBLEM - puppet last run on tin is CRITICAL: CRITICAL: Puppet has 1 failures
[09:13:29] not sure, akosiaris, did you also see that error when you merged the ALL_NETWORK_SUBNETS patch?
[09:13:30] PROBLEM - statsv process on hafnium is CRITICAL: PROCS CRITICAL: 0 processes with command name python, args statsv
[09:15:34] this is probably due to my kafka restarts
[09:15:48] checking
[09:16:08] (03PS2) 10Muehlenhoff: openldap_labs: Limit to production networks and labs networks [puppet] - 10https://gerrit.wikimedia.org/r/303181
[09:17:27] Aug 11 08:21:29 hafnium python[2175]: 2016-08-11 08:21:29,736 Unable to connect to broker kafka1012.eqiad.wmnet:9092. Continuing.
[09:17:30] Aug 11 08:21:29 hafnium python[2175]: 2016-08-11 08:21:29,736 [Errno 111] Connection refused
[09:17:57] !log restarting statsv on hafnium (blocked due to analytics kafka brokers restarts)
[09:18:02] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[09:18:07] so right now mw2086 is pooled
[09:18:22] PROBLEM - MediaWiki exceptions and fatals per minute on graphite1001 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [50.0]
[09:19:11] RECOVERY - statsv process on hafnium is OK: PROCS OK: 13 processes with command name python, args statsv
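The journal lines above show statsv retrying a broker that was down for the JVM upgrades, which is why the process check went critical. A sketch for confirming and bouncing a stuck consumer, assuming statsv runs as a systemd unit of that name on hafnium (the unit name is an assumption):

    # look for repeated broker connection errors since the kafka restarts
    sudo journalctl -u statsv --since "2 hours ago" | tail -n 20
    # then bounce it, as in the !log entry above
    sudo systemctl restart statsv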
[09:19:26] (03CR) 10DCausse: "I think we have config in wmf-config concerning nobelium." [puppet] - 10https://gerrit.wikimedia.org/r/304112 (https://phabricator.wikimedia.org/T142581) (owner: 10Dzahn)
[09:20:14] it is not like a huge issue, but I want to know how to follow up- moritzm you say that mutante may know an appropiate state for that server?
[09:22:09] jynus: https://phabricator.wikimedia.org/T142661
[09:22:13] RECOVERY - MediaWiki exceptions and fatals per minute on graphite1001 is OK: OK: Less than 1.00% above the threshold [25.0]
[09:22:17] jynus: the server is perfectly fine, Papaul did some hardware check which required the server to be powered down, it was only down for a that diagnosis window, but it should be kept depooled since these systems are reimaged to jessie
[09:22:43] see SAL from 2016-08-09 12:26 moritzm: depooling image scalers mw2086-mw2089 for reimaging with jessie
[09:22:54] mortiz, I am worried about the pooling state
[09:23:08] because my merge failed to update that
[09:23:17] check me check on palladium
[09:23:20] if it only failed to do a noop
[09:23:29] and the server should be pooled
[09:23:30] that is ok
[09:23:48] (03PS7) 10Giuseppe Lavagetto: puppetmaster: add role for puppetdb [puppet] - 10https://gerrit.wikimedia.org/r/303801 (https://phabricator.wikimedia.org/T142363)
[09:24:22] from the ticket I infer it should be not
[09:24:48] but maybe just another merge will fix it automatically
[09:25:01] moritzm: that's from conftool. I got .etcdrc -> /root/.etcdrc in my homedir
[09:26:43] jynus: it was depooled before, so I'm not sure what happened for T142661, but it's still depooled, so all well. it will be reimaged shortly anyway. Papaul had to run a hardware check to check the IPMI error which prevented it from reimaging in the first place
[09:26:43] T142661: mw2086 was down - https://phabricator.wikimedia.org/T142661
[09:27:11] it is, I swear I saw it pooled on palladium
[09:27:13] ?
[09:27:51] it's depooled right now:
[09:27:53] jmm@palladium:~$ sudo -i confctl --find --action get mw2086.codfw.wmnet
[09:27:55] {"mw2086.codfw.wmnet": {"pooled": "no", "weight": 10}, "tags": "dc=codfw,cluster=imagescaler,service=apache2"}
[09:28:14] you are completely right
[09:28:34] sorry for the confusion
[09:28:47] ok, I'll just close T142661 once it's reimaged and re-added to dsh
[09:28:47] T142661: mw2086 was down - https://phabricator.wikimedia.org/T142661
[09:28:58] did I say how difficult is to read json lines :-)
[09:29:52] I am sorry again, but I prefer to make noise and be wrong than be right about a problem and not comment it aloud
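As noted above, confctl prints one JSON object per match, which is hard to eyeball. Piping it through jq makes the pooled flag jump out; a sketch, assuming jq is installed alongside confctl:

    sudo -i confctl --find --action get mw2086.codfw.wmnet | jq .
    # or pull out just the flag:
    sudo -i confctl --find --action get mw2086.codfw.wmnet | jq '."mw2086.codfw.wmnet".pooled'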
[09:32:08] (03PS1) 10Giuseppe Lavagetto: adding puppetdb credentials stub [labs/private] - 10https://gerrit.wikimedia.org/r/304191
[09:32:36] (03CR) 10Giuseppe Lavagetto: [C: 032 V: 032] adding puppetdb credentials stub [labs/private] - 10https://gerrit.wikimedia.org/r/304191 (owner: 10Giuseppe Lavagetto)
[09:34:01] PROBLEM - MediaWiki exceptions and fatals per minute on graphite1001 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [50.0]
[09:34:06] jynus: sure, thanks for pointing it out
[09:34:27] !log removing leftovers of unmaintained skrillex tool from mira and tin
[09:34:32] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[09:35:50] (03PS8) 10Giuseppe Lavagetto: puppetmaster: add role for puppetdb [puppet] - 10https://gerrit.wikimedia.org/r/303801 (https://phabricator.wikimedia.org/T142363)
[09:36:00] RECOVERY - MediaWiki exceptions and fatals per minute on graphite1001 is OK: OK: Less than 1.00% above the threshold [25.0]
[09:36:12] moritzm, no issues with the ticket itself
[09:36:42] it is in no way in a hurry
[09:38:11] RECOVERY - puppet last run on tin is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures
[09:43:40] PROBLEM - MediaWiki exceptions and fatals per minute on graphite1001 is CRITICAL: CRITICAL: 80.00% of data above the critical threshold [50.0]
[09:45:31] RECOVERY - MediaWiki exceptions and fatals per minute on graphite1001 is OK: OK: Less than 1.00% above the threshold [25.0]
[09:48:08] (03PS1) 10Muehlenhoff: role::dataset::common: Limit to production networks [puppet] - 10https://gerrit.wikimedia.org/r/304192
[09:50:22] jobs are failing in groups of 150K failures
[09:53:12] I do not see DB issues, like lag or aborted connections, but failures is from db load balancer saying "connection was unexpectedly closed"
[09:53:26] (03CR) 10ArielGlenn: [C: 031] role::dataset::common: Limit to production networks [puppet] - 10https://gerrit.wikimedia.org/r/304192 (owner: 10Muehlenhoff)
[09:58:11] it is coming from a single job execution server, so it must be a single job or a problem with the server
[09:59:11] PROBLEM - MediaWiki exceptions and fatals per minute on graphite1001 is CRITICAL: CRITICAL: 80.00% of data above the critical threshold [50.0]
[09:59:18] Indeed, it is url:/rpc/RunJobs.php?wiki=commonswiki&type=ChangeNotification&maxtime=60&maxmem=300M
[10:01:08] (03PS2) 10Muehlenhoff: role::dataset::common: Limit to production networks [puppet] - 10https://gerrit.wikimedia.org/r/304192
[10:01:12] RECOVERY - MediaWiki exceptions and fatals per minute on graphite1001 is OK: OK: Less than 1.00% above the threshold [25.0]
[10:03:25] it will happen again in 10 minutes?
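The bursts above were narrowed down to one job type by reading the aggregated exception logs; the same slicing can be done with grep. A sketch, assuming the logs are aggregated under /a/mw-log on fluorine and carry the url: field quoted above (both the path and the line format are assumptions):

    # how often the load-balancer error is being logged
    grep -c 'connection was unexpectedly closed' /a/mw-log/exception.log
    # and which request URLs it clusters on
    grep 'connection was unexpectedly closed' /a/mw-log/exception.log | grep -o 'url:[^ ]*' | sort | uniq -c | sort -rn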
[10:07:04] (03PS4) 10DCausse: mwgrep: fails gracefully when an invalid regex is provided [puppet] - 10https://gerrit.wikimedia.org/r/302892 (https://phabricator.wikimedia.org/T141996)
[10:07:26] (03CR) 10DCausse: mwgrep: fails gracefully when an invalid regex is provided (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/302892 (https://phabricator.wikimedia.org/T141996) (owner: 10DCausse)
[10:07:48] (03CR) 10ArielGlenn: [C: 031] "better ;-)" [puppet] - 10https://gerrit.wikimedia.org/r/304192 (owner: 10Muehlenhoff)
[10:08:47] (03PS1) 10Joal: Update camus job to use new check_jar [puppet] - 10https://gerrit.wikimedia.org/r/304195
[10:09:23] (03CR) 10Joal: [C: 04-1] "Deploy needs to be synchronized with analytics refinery deploy." [puppet] - 10https://gerrit.wikimedia.org/r/304195 (owner: 10Joal)
[10:15:31] PROBLEM - statsv process on hafnium is CRITICAL: PROCS CRITICAL: 0 processes with command name python, args statsv
[10:18:00] RECOVERY - Make sure enwiki dumps are not empty on checker.tools.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 166 bytes in 0.012 second response time
[10:19:53] (03CR) 10Alexandros Kosiaris: "Chase, Andrew, I think this addresses the issues created in 9bb29c12. What do you think ?" [puppet] - 10https://gerrit.wikimedia.org/r/304190 (owner: 10Alexandros Kosiaris)
[10:22:28] (03CR) 10DCausse: [C: 031] "I encountered the same issue" [puppet] - 10https://gerrit.wikimedia.org/r/298636 (owner: 10EBernhardson)
[10:24:30] PROBLEM - MediaWiki exceptions and fatals per minute on graphite1001 is CRITICAL: CRITICAL: 80.00% of data above the critical threshold [50.0]
[10:25:52] !log kafka restarted on kafka200[12] for jvm upgrades
[10:25:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[10:26:22] RECOVERY - MediaWiki exceptions and fatals per minute on graphite1001 is OK: OK: Less than 1.00% above the threshold [25.0]
[10:26:24] !log restarting statsv on hafnium (process stuck after kafka brokers restart)
[10:26:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[10:27:02] RECOVERY - statsv process on hafnium is OK: PROCS OK: 13 processes with command name python, args statsv
[10:46:50] PROBLEM - Host palladium is DOWN: PING CRITICAL - Packet loss = 100%
[10:48:20] PROBLEM - puppet last run on cp3040 is CRITICAL: CRITICAL: puppet fail
[10:48:30] PROBLEM - puppet last run on ms-fe1001 is CRITICAL: CRITICAL: puppet fail
[10:48:40] PROBLEM - puppet last run on db1050 is CRITICAL: CRITICAL: puppet fail
[10:48:41] PROBLEM - puppet last run on copper is CRITICAL: CRITICAL: puppet fail
[10:49:20] PROBLEM - puppet last run on mw1300 is CRITICAL: CRITICAL: puppet fail
[10:49:22] PROBLEM - puppet last run on mw1298 is CRITICAL: CRITICAL: puppet fail
[10:49:22] PROBLEM - puppet last run on ruthenium is CRITICAL: CRITICAL: puppet fail
[10:49:30] PROBLEM - puppet last run on ms-be2004 is CRITICAL: CRITICAL: puppet fail
[10:49:40] PROBLEM - puppet last run on cp3042 is CRITICAL: CRITICAL: puppet fail
[10:49:41] PROBLEM - puppet last run on cp2016 is CRITICAL: CRITICAL: puppet fail
[10:49:41] PROBLEM - puppet last run on cp2026 is CRITICAL: CRITICAL: puppet fail
[10:49:50] PROBLEM - puppet last run on mw1294 is CRITICAL: CRITICAL: puppet fail
[10:50:01] PROBLEM - puppet last run on lvs1005 is CRITICAL: CRITICAL: puppet fail
[10:50:11] PROBLEM - puppet last run on lvs1004 is CRITICAL: CRITICAL: puppet fail
[10:50:20] PROBLEM - puppet last run on cp3047 is CRITICAL: CRITICAL: puppet fail
[10:50:20] PROBLEM - puppet last run on cp3010 is CRITICAL: CRITICAL: puppet fail
[10:50:21] PROBLEM - puppet last run on mw1285 is CRITICAL: CRITICAL: puppet fail
[10:50:31] PROBLEM - puppet last run on cp2011 is CRITICAL: CRITICAL: puppet fail
[10:50:40] PROBLEM - puppet last run on ms-fe1002 is CRITICAL: CRITICAL: puppet fail
[10:50:50] PROBLEM - puppet last run on cp2003 is CRITICAL: CRITICAL: puppet fail
[10:50:50] PROBLEM - puppet last run on ms-fe2003 is CRITICAL: CRITICAL: puppet fail
[10:51:01] PROBLEM - puppet last run on cp2010 is CRITICAL: CRITICAL: puppet fail
[10:51:10] PROBLEM - puppet last run on ms-be1015 is CRITICAL: CRITICAL: puppet fail
[10:51:12] PROBLEM - puppet last run on ms-be2009 is CRITICAL: CRITICAL: puppet fail
[10:51:22] PROBLEM - puppet last run on cp4015 is CRITICAL: CRITICAL: puppet fail
[10:51:22] PROBLEM - puppet last run on cp3006 is CRITICAL: CRITICAL: puppet fail
[10:51:22] PROBLEM - puppet last run on db1021 is CRITICAL: CRITICAL: puppet fail
[10:51:30] PROBLEM - puppet last run on cp1058 is CRITICAL: CRITICAL: puppet fail
[10:51:31] PROBLEM - puppet last run on cp1066 is CRITICAL: CRITICAL: puppet fail
[10:51:41] PROBLEM - puppet last run on cp3030 is CRITICAL: CRITICAL: puppet fail
[10:51:42] PROBLEM - puppet last run on mc1012 is CRITICAL: CRITICAL: puppet fail
[10:51:51] PROBLEM - puppet last run on ms-be2027 is CRITICAL: CRITICAL: puppet fail
[10:51:51] palladium is in d-i?
[10:51:51] PROBLEM - puppet last run on cp4019 is CRITICAL: CRITICAL: puppet fail
[10:52:02] PROBLEM - puppet last run on lvs1007 is CRITICAL: CRITICAL: Puppet has 30 failures
[10:52:11] PROBLEM - puppet last run on db1022 is CRITICAL: CRITICAL: puppet fail
[10:52:11] PROBLEM - puppet last run on db1040 is CRITICAL: CRITICAL: puppet fail
[10:52:11] PROBLEM - puppet last run on cp4014 is CRITICAL: CRITICAL: puppet fail
[10:52:12] PROBLEM - puppet last run on ms-be3002 is CRITICAL: CRITICAL: puppet fail
[10:52:20] PROBLEM - puppet last run on mw1304 is CRITICAL: CRITICAL: puppet fail
[10:52:20] PROBLEM - puppet last run on potassium is CRITICAL: CRITICAL: Puppet has 39 failures
[10:52:21] PROBLEM - puppet last run on lvs1011 is CRITICAL: CRITICAL: puppet fail
[10:52:31] PROBLEM - puppet last run on cp2014 is CRITICAL: CRITICAL: puppet fail
[10:52:32] PROBLEM - puppet last run on cp1071 is CRITICAL: CRITICAL: puppet fail
[10:52:32] PROBLEM - puppet last run on mw1211 is CRITICAL: CRITICAL: puppet fail
[10:52:32] PROBLEM - puppet last run on cp2022 is CRITICAL: CRITICAL: puppet fail
[10:52:41] PROBLEM - puppet last run on mw1213 is CRITICAL: CRITICAL: puppet fail
[10:52:50] PROBLEM - puppet last run on cp1099 is CRITICAL: CRITICAL: puppet fail
[10:53:10] PROBLEM - puppet last run on mw1296 is CRITICAL: CRITICAL: puppet fail
[10:53:10] PROBLEM - puppet last run on db1034 is CRITICAL: CRITICAL: puppet fail
[10:53:21] PROBLEM - puppet last run on ms-be2008 is CRITICAL: CRITICAL: puppet fail
[10:53:31] PROBLEM - puppet last run on ms-be2010 is CRITICAL: CRITICAL: puppet fail
[10:53:41] PROBLEM - puppet last run on ms-be3001 is CRITICAL: CRITICAL: puppet fail
[10:53:50] PROBLEM - puppet last run on mc2014 is CRITICAL: CRITICAL: puppet fail
[10:54:00] PROBLEM - puppet last run on cp1073 is CRITICAL: CRITICAL: puppet fail
[10:54:01] PROBLEM - puppet last run on cp4005 is CRITICAL: CRITICAL: puppet fail
[10:54:01] PROBLEM - puppet last run on cp3009 is CRITICAL: CRITICAL: puppet fail
[10:54:01] PROBLEM - puppet last run on db1024 is CRITICAL: CRITICAL: puppet fail
[10:54:11] PROBLEM - puppet last run on es1014 is CRITICAL: CRITICAL: puppet fail
[10:54:11] PROBLEM - puppet last run on cp4017 is CRITICAL: CRITICAL: puppet fail
[10:54:12] PROBLEM - puppet last run on ms-fe3002 is CRITICAL: CRITICAL: puppet fail
[10:54:32] PROBLEM - puppet last run on ms-fe2004 is CRITICAL: CRITICAL: Puppet has 31 failures
[10:54:32] PROBLEM - puppet last run on db2039 is CRITICAL: CRITICAL: Puppet has 2 failures
[10:54:41] PROBLEM - puppet last run on lvs1002 is CRITICAL: CRITICAL: Puppet has 30 failures
[10:54:42] PROBLEM - puppet last run on cp2019 is CRITICAL: CRITICAL: puppet fail
[10:54:46] (03PS4) 10Jcrespo: Decommision pc100[123] 2/2 [puppet] - 10https://gerrit.wikimedia.org/r/274076 (https://phabricator.wikimedia.org/T124962)
[10:54:50] PROBLEM - puppet last run on cp2015 is CRITICAL: CRITICAL: puppet fail
[10:54:51] PROBLEM - puppet last run on ms-be1006 is CRITICAL: CRITICAL: Puppet has 30 failures
[10:54:51] PROBLEM - puppet last run on cp1065 is CRITICAL: CRITICAL: puppet fail
[10:55:00] PROBLEM - puppet last run on ms-be2014 is CRITICAL: CRITICAL: puppet fail
[10:55:02] PROBLEM - puppet last run on cp1062 is CRITICAL: CRITICAL: puppet fail
[10:55:10] PROBLEM - puppet last run on ms-be1009 is CRITICAL: CRITICAL: puppet fail
[10:55:10] PROBLEM - puppet last run on ms-fe1003 is CRITICAL: CRITICAL: puppet fail
[10:55:11] PROBLEM - puppet last run on mw1295 is CRITICAL: CRITICAL: puppet fail
[10:55:11] PROBLEM - puppet last run on labsdb1005 is CRITICAL: CRITICAL: puppet fail
[10:55:21] PROBLEM - puppet last run on mc1008 is CRITICAL: CRITICAL: puppet fail
[10:55:30] PROBLEM - puppet last run on lvs1001 is CRITICAL: CRITICAL: puppet fail
[10:55:30] PROBLEM - puppet last run on ms-be1017 is CRITICAL: CRITICAL: puppet fail
[10:55:32] PROBLEM - puppet last run on mw1218 is CRITICAL: CRITICAL: puppet fail
[10:55:41] PROBLEM - puppet last run on ms-be2013 is CRITICAL: CRITICAL: puppet fail
[10:55:51] PROBLEM - puppet last run on mw1241 is CRITICAL: CRITICAL: puppet fail
[10:56:00] PROBLEM - puppet last run on cp3031 is CRITICAL: CRITICAL: puppet fail
[10:56:01] PROBLEM - puppet last run on ms-be1024 is CRITICAL: CRITICAL: puppet fail
[10:56:12] PROBLEM - puppet last run on wtp1013 is CRITICAL: CRITICAL: puppet fail
[10:56:12] PROBLEM - puppet last run on ms-be1007 is CRITICAL: CRITICAL: puppet fail
[10:56:13] PROBLEM - puppet last run on cp3046 is CRITICAL: CRITICAL: puppet fail
[10:56:13] PROBLEM - puppet last run on cp3045 is CRITICAL: CRITICAL: puppet fail
[10:56:20] PROBLEM - puppet last run on mw1205 is CRITICAL: CRITICAL: puppet fail
[10:56:31] PROBLEM - puppet last run on mw2083 is CRITICAL: CRITICAL: puppet fail
[10:56:31] PROBLEM - puppet last run on netmon1001 is CRITICAL: CRITICAL: puppet fail
[10:56:32] PROBLEM - puppet last run on mw2082 is CRITICAL: CRITICAL: puppet fail
[10:56:40] PROBLEM - puppet last run on db1076 is CRITICAL: CRITICAL: puppet fail
[10:56:41] PROBLEM - puppet last run on uranium is CRITICAL: CRITICAL: puppet fail
[10:56:41] PROBLEM - puppet last run on mw1189 is CRITICAL: CRITICAL: puppet fail
[10:56:41] PROBLEM - puppet last run on cp2017 is CRITICAL: CRITICAL: puppet fail
[10:56:50] PROBLEM - puppet last run on mw2163 is CRITICAL: CRITICAL: puppet fail
[10:56:50] PROBLEM - puppet last run on analytics1035 is CRITICAL: CRITICAL: puppet fail
[10:56:51] PROBLEM - puppet last run on ms-be2016 is CRITICAL: CRITICAL: puppet fail
[10:56:51] PROBLEM - puppet last run on mc1016 is CRITICAL: CRITICAL: puppet fail
[10:57:00] PROBLEM - puppet last run on mw2117 is CRITICAL: CRITICAL: puppet fail
[10:57:01] PROBLEM - puppet last run on ms-be2017 is CRITICAL: CRITICAL: puppet fail
[10:57:02] PROBLEM - puppet last run on mw1297 is CRITICAL: CRITICAL: puppet fail
[10:57:10] PROBLEM - puppet last run on dbproxy1006 is CRITICAL: CRITICAL: puppet fail
[10:57:11] PROBLEM - puppet last run on mw2093 is CRITICAL: CRITICAL: puppet fail
[10:57:20] PROBLEM - puppet last run on mw2196 is CRITICAL: CRITICAL: puppet fail
[10:57:20] PROBLEM - puppet last run on mw1166 is CRITICAL: CRITICAL: puppet fail
[10:57:21] PROBLEM - puppet last run on mw1173 is CRITICAL: CRITICAL: puppet fail
[10:57:21] PROBLEM - puppet last run on logstash1005 is CRITICAL: CRITICAL: puppet fail
[10:57:21] PROBLEM - puppet last run on mc1011 is CRITICAL: CRITICAL: puppet fail
[10:57:30] PROBLEM - puppet last run on mc2004 is CRITICAL: CRITICAL: puppet fail
[10:57:31] PROBLEM - puppet last run on db1066 is CRITICAL: CRITICAL: puppet fail
[10:57:31] PROBLEM - puppet last run on elastic1018 is CRITICAL: CRITICAL: puppet fail
[10:57:40] PROBLEM - puppet last run on mw1204 is CRITICAL: CRITICAL: puppet fail
[10:57:40] PROBLEM - puppet last run on ms-fe3001 is CRITICAL: CRITICAL: puppet fail
[10:57:40] PROBLEM - puppet last run on ms-be1004 is CRITICAL: CRITICAL: puppet fail
[10:57:41] PROBLEM - puppet last run on mw2143 is CRITICAL: CRITICAL: puppet fail
[10:57:41] PROBLEM - puppet last run on db2045 is CRITICAL: CRITICAL: puppet fail
[10:57:41] PROBLEM - puppet last run on mw2087 is CRITICAL: CRITICAL: puppet fail
[10:57:41] PROBLEM - puppet last run on analytics1003 is CRITICAL: CRITICAL: puppet fail
[10:57:41] PROBLEM - puppet last run on achernar is CRITICAL: CRITICAL: puppet fail
[10:57:42] PROBLEM - puppet last run on labcontrol1001 is CRITICAL: CRITICAL: puppet fail
[10:57:42] PROBLEM - puppet last run on ms-be2006 is CRITICAL: CRITICAL: Puppet has 40 failures
[10:57:51] PROBLEM - puppet last run on ms-be1022 is CRITICAL: CRITICAL: puppet fail
[10:57:51] PROBLEM - puppet last run on ms-be1003 is CRITICAL: CRITICAL: puppet fail
[10:58:01] PROBLEM - puppet last run on cp3039 is CRITICAL: CRITICAL: puppet fail
[10:58:01] PROBLEM - puppet last run on mc1018 is CRITICAL: CRITICAL: puppet fail
[10:58:01] PROBLEM - puppet last run on mw1219 is CRITICAL: CRITICAL: puppet fail
[10:58:21] PROBLEM - puppet last run on mw1220 is CRITICAL: CRITICAL: puppet fail
[10:58:21] PROBLEM - puppet last run on elastic2004 is CRITICAL: CRITICAL: puppet fail
[10:58:30] PROBLEM - puppet last run on mw1215 is CRITICAL: CRITICAL: puppet fail
[10:58:30] PROBLEM - puppet last run on mw1260 is CRITICAL: CRITICAL: puppet fail
[10:58:30] PROBLEM - puppet last run on mw2134 is CRITICAL: CRITICAL: puppet fail
[10:58:30] PROBLEM - puppet last run on mw2176 is CRITICAL: CRITICAL: puppet fail
[10:58:31] PROBLEM - puppet last run on labvirt1010 is CRITICAL: CRITICAL: puppet fail
[10:58:31] PROBLEM - puppet last run on mw2092 is CRITICAL: CRITICAL: puppet fail
[10:58:31] PROBLEM - puppet last run on mw2123 is CRITICAL: CRITICAL: puppet fail
[10:58:31] PROBLEM - puppet last run on mw2084 is CRITICAL: CRITICAL: puppet fail
[10:58:31] PROBLEM - puppet last run on mw2114 is CRITICAL: CRITICAL: puppet fail
[10:58:32] PROBLEM - puppet last run on mw1235 is CRITICAL: CRITICAL: puppet fail
[10:58:32] PROBLEM - puppet last run on db2054 is CRITICAL: CRITICAL: puppet fail
[10:58:33] PROBLEM - puppet last run on mw2110 is CRITICAL: CRITICAL: puppet fail
[10:58:50] PROBLEM - puppet last run on mw2184 is CRITICAL: CRITICAL: puppet fail
[10:58:51] PROBLEM - puppet last run on mw2101 is CRITICAL: CRITICAL: puppet fail
[10:58:51] PROBLEM - puppet last run on analytics1030 is CRITICAL: CRITICAL: puppet fail
[10:58:51] PROBLEM - puppet last run on mw2127 is CRITICAL: CRITICAL: puppet fail
[10:58:51] PROBLEM - puppet last run on mw2212 is CRITICAL: CRITICAL: puppet fail
[10:58:51] PROBLEM - puppet last run on mw2067 is CRITICAL: CRITICAL: puppet fail
[10:58:52] PROBLEM - puppet last run on elastic2008 is CRITICAL: CRITICAL: puppet fail
[10:58:52] PROBLEM - puppet last run on mw2079 is CRITICAL: CRITICAL: puppet fail
[10:59:00] PROBLEM - puppet last run on mw2131 is CRITICAL: CRITICAL: puppet fail
[10:59:01] PROBLEM - puppet last run on ms-be1002 is CRITICAL: CRITICAL: puppet fail
[10:59:01] PROBLEM - puppet last run on mw2070 is CRITICAL: CRITICAL: puppet fail
[10:59:01] PROBLEM - puppet last run on elastic1027 is CRITICAL: CRITICAL: puppet fail
[10:59:01] PROBLEM - puppet last run on mw2085 is CRITICAL: CRITICAL: puppet fail
[10:59:10] PROBLEM - puppet last run on labvirt1009 is CRITICAL: CRITICAL: puppet fail
[10:59:11] PROBLEM - puppet last run on mw2130 is CRITICAL: CRITICAL: puppet fail
[10:59:12] PROBLEM - puppet last run on db1051 is CRITICAL: CRITICAL: puppet fail
[10:59:12] PROBLEM - puppet last run on cp1053 is CRITICAL: CRITICAL: puppet fail
[10:59:20] PROBLEM - puppet last run on mw2142 is CRITICAL: CRITICAL: puppet fail
[10:59:20] PROBLEM - puppet last run on mw2096 is CRITICAL: CRITICAL: puppet fail
[10:59:21] PROBLEM - puppet last run on analytics1031 is CRITICAL: CRITICAL: puppet fail
[10:59:21] PROBLEM - puppet last run on ms-be2020 is CRITICAL: CRITICAL: puppet fail
[10:59:21] PROBLEM - puppet last run on mw1208 is CRITICAL: CRITICAL: puppet fail
[10:59:22] PROBLEM - puppet last run on ms-be2025 is CRITICAL: CRITICAL: puppet fail
[10:59:30] PROBLEM - puppet last run on ms-be1019 is CRITICAL: CRITICAL: puppet fail
[10:59:30] PROBLEM - puppet last run on mc2001 is CRITICAL: CRITICAL: puppet fail
[10:59:31] PROBLEM - puppet last run on elastic1040 is CRITICAL: CRITICAL: puppet fail
[10:59:40] PROBLEM - puppet last run on mw1194 is CRITICAL: CRITICAL: puppet fail
[10:59:41] PROBLEM - puppet last run on db1009 is CRITICAL: CRITICAL: puppet fail
[10:59:41] PROBLEM - puppet last run on mw2168 is CRITICAL: CRITICAL: puppet fail
[10:59:41] PROBLEM - puppet last run on mw1206 is CRITICAL: CRITICAL: puppet fail
[10:59:41] PROBLEM - puppet last run on mw2090 is CRITICAL: CRITICAL: puppet fail
[10:59:41] PROBLEM - puppet last run on mw1284 is CRITICAL: CRITICAL: puppet fail
[10:59:42] PROBLEM - puppet last run on mw2062 is CRITICAL: CRITICAL: puppet fail
[10:59:42] PROBLEM - puppet last run on mc1002 is CRITICAL: CRITICAL: puppet fail
[10:59:42] PROBLEM - puppet last run on ms-be2026 is CRITICAL: CRITICAL: puppet fail
[10:59:42] PROBLEM - puppet last run on wtp1010 is CRITICAL: CRITICAL: puppet fail
[10:59:51] PROBLEM - puppet last run on elastic1041 is CRITICAL: CRITICAL: puppet fail
[10:59:51] PROBLEM - puppet last run on cp4009 is CRITICAL: CRITICAL: puppet fail
[11:00:01] PROBLEM - puppet last run on mc1017 is CRITICAL: CRITICAL: puppet fail
[11:00:01] PROBLEM - puppet last run on lvs1003 is CRITICAL: CRITICAL: puppet fail
[11:00:10] PROBLEM - puppet last run on ms-fe1004 is CRITICAL: CRITICAL: puppet fail
[11:00:12] PROBLEM - puppet last run on elastic1045 is CRITICAL: CRITICAL: puppet fail
[11:00:21] PROBLEM - puppet last run on labvirt1012 is CRITICAL: CRITICAL: puppet fail
[11:00:21] PROBLEM - puppet last run on db2038 is CRITICAL: CRITICAL: puppet fail
[11:00:21] PROBLEM - puppet last run on cp1068 is CRITICAL: CRITICAL: puppet fail
[11:00:27] (03PS5) 10Jcrespo: Decommision pc100[123] 2/2 [puppet] - 10https://gerrit.wikimedia.org/r/274076 (https://phabricator.wikimedia.org/T124962)
[11:00:30] PROBLEM - puppet last run on mw2182 is CRITICAL: CRITICAL: puppet fail
[11:00:31] PROBLEM - puppet last run on db1026 is CRITICAL: CRITICAL: puppet fail
[11:00:31] PROBLEM - puppet last run on cp2013 is CRITICAL: CRITICAL: puppet fail
[11:00:31] PROBLEM - puppet last run on db1028 is CRITICAL: CRITICAL: puppet fail
[11:00:32] PROBLEM - puppet last run on mw1305 is CRITICAL: CRITICAL: puppet fail
[11:00:40] PROBLEM - puppet last run on mw1238 is CRITICAL: CRITICAL: puppet fail
[11:00:40] PROBLEM - puppet last run on mw1179 is CRITICAL: CRITICAL: puppet fail
[11:00:40] PROBLEM - puppet last run on kafka1002 is CRITICAL: CRITICAL: puppet fail
[11:00:40] PROBLEM - puppet last run on db1084 is CRITICAL: CRITICAL: puppet fail
[11:00:41] RECOVERY - Host palladium is UP: PING OK - Packet loss = 0%, RTA = 1.33 ms
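palladium hosted the puppetmaster at this point (the T141839 comment above mentions the puppetmaster test vhost on it and the upgrade to 3.8), so every agent whose scheduled run fell inside the outage window reported "puppet fail", and each alert clears on the node's next successful run. A sketch for checking a single node by hand, assuming a Puppet 3 agent layout (the state path varies by version):

    # force a run against the master and watch it succeed or fail
    sudo puppet agent --test
    # the last-run summary that "puppet last run" style checks typically consume
    cat /var/lib/puppet/state/last_run_summary.yaml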
[11:00:50] PROBLEM - puppet last run on mw2174 is CRITICAL: CRITICAL: puppet fail
[11:00:50] PROBLEM - puppet last run on cp1054 is CRITICAL: CRITICAL: puppet fail
[11:00:50] PROBLEM - puppet last run on sca2002 is CRITICAL: CRITICAL: puppet fail
[11:00:51] PROBLEM - puppet last run on mw1230 is CRITICAL: CRITICAL: puppet fail
[11:00:51] PROBLEM - puppet last run on mw2111 is CRITICAL: CRITICAL: puppet fail
[11:00:51] PROBLEM - puppet last run on analytics1051 is CRITICAL: CRITICAL: puppet fail
[11:00:52] PROBLEM - puppet last run on mw2172 is CRITICAL: CRITICAL: puppet fail
[11:00:52] PROBLEM - puppet last run on maps2002 is CRITICAL: CRITICAL: puppet fail
[11:00:52] PROBLEM - puppet last run on mw2152 is CRITICAL: CRITICAL: puppet fail
[11:01:00] PROBLEM - puppet last run on ocg1002 is CRITICAL: CRITICAL: puppet fail
[11:01:00] PROBLEM - puppet last run on mira is CRITICAL: CRITICAL: puppet fail
[11:01:00] PROBLEM - puppet last run on ganeti2004 is CRITICAL: CRITICAL: puppet fail
[11:01:00] PROBLEM - puppet last run on mw2247 is CRITICAL: CRITICAL: puppet fail
[11:01:01] PROBLEM - puppet last run on cp2001 is CRITICAL: CRITICAL: puppet fail
[11:01:01] PROBLEM - puppet last run on cp2002 is CRITICAL: CRITICAL: puppet fail
[11:01:01] PROBLEM - puppet last run on mw1169 is CRITICAL: CRITICAL: puppet fail
[11:01:10] PROBLEM - puppet last run on db1045 is CRITICAL: CRITICAL: puppet fail
[11:01:11] PROBLEM - puppet last run on db1018 is CRITICAL: CRITICAL: puppet fail
[11:01:11] PROBLEM - puppet last run on mw2203 is CRITICAL: CRITICAL: puppet fail
[11:01:11] PROBLEM - puppet last run on db2029 is CRITICAL: CRITICAL: puppet fail
[11:01:12] PROBLEM - puppet last run on analytics1056 is CRITICAL: CRITICAL: puppet fail
[11:01:12] PROBLEM - puppet last run on mw2200 is CRITICAL: CRITICAL: puppet fail
[11:01:12] PROBLEM - puppet last run on elastic2013 is CRITICAL: CRITICAL: puppet fail
[11:01:20] PROBLEM - puppet last run on mc2007 is CRITICAL: CRITICAL: puppet fail
[11:01:20] PROBLEM - puppet last run on mw2133 is CRITICAL: CRITICAL: puppet fail
[11:01:20] PROBLEM - puppet last run on mw2098 is CRITICAL: CRITICAL: puppet fail
[11:01:20] PROBLEM - puppet last run on cp1064 is CRITICAL: CRITICAL: puppet fail
[11:01:21] PROBLEM - puppet last run on ms-be1010 is CRITICAL: CRITICAL: puppet fail
[11:01:21] PROBLEM - puppet last run on cp3018 is CRITICAL: CRITICAL: puppet fail
[11:01:22] PROBLEM - puppet last run on ganeti1002 is CRITICAL: CRITICAL: puppet fail
[11:01:22] PROBLEM - puppet last run on db2008 is CRITICAL: CRITICAL: puppet fail
[11:01:22] PROBLEM - puppet last run on mc2005 is CRITICAL: CRITICAL: puppet fail
[11:01:22] PROBLEM - puppet last run on ms-be2022 is CRITICAL: CRITICAL: puppet fail
[11:01:30] PROBLEM - puppet last run on restbase2002 is CRITICAL: CRITICAL: puppet fail
[11:01:30] PROBLEM - puppet last run on mw1180 is CRITICAL: CRITICAL: puppet fail
[11:01:40] PROBLEM - puppet last run on wtp1012 is CRITICAL: CRITICAL: puppet fail
[11:01:40] PROBLEM - puppet last run on cp3036 is CRITICAL: CRITICAL: puppet fail
[11:01:40] PROBLEM - puppet last run on cp3008 is CRITICAL: CRITICAL: puppet fail
[11:01:40] PROBLEM - puppet last run on wasat is CRITICAL: CRITICAL: puppet fail
[11:01:40] PROBLEM - puppet last run on db2041 is CRITICAL: CRITICAL: puppet fail
[11:01:41] PROBLEM - puppet last run on mw2150 is CRITICAL: CRITICAL: puppet fail
[11:01:41] PROBLEM - puppet last run on mw2094 is CRITICAL: CRITICAL: puppet fail
[11:01:41] PROBLEM - puppet last run on db1043 is CRITICAL: CRITICAL: puppet fail
[11:01:41] PROBLEM - puppet last run on mc2015 is CRITICAL: CRITICAL: puppet fail
[11:01:42] PROBLEM - puppet last run on rdb2005 is CRITICAL: CRITICAL: puppet fail
[11:01:42] PROBLEM - puppet last run on labvirt1005 is CRITICAL: CRITICAL: puppet fail
[11:01:50] PROBLEM - puppet last run on wtp2001 is CRITICAL: CRITICAL: puppet fail
[11:01:50] PROBLEM - puppet last run on analytics1002 is CRITICAL: CRITICAL: puppet fail
[11:01:50] PROBLEM - IPv6 ping to codfw on ripe-atlas-codfw is CRITICAL: CRITICAL - failed 20 probes of 237 (alerts on 19) - https://atlas.ripe.net/measurements/1791212/#!map
[11:01:51] PROBLEM - puppet last run on restbase1015 is CRITICAL: CRITICAL: puppet fail
[11:01:51] PROBLEM - puppet last run on cp4016 is CRITICAL: CRITICAL: puppet fail
[11:02:00] PROBLEM - puppet last run on cp3048 is CRITICAL: CRITICAL: puppet fail
[11:02:00] PROBLEM - puppet last run on cp3007 is CRITICAL: CRITICAL: puppet fail
[11:02:00] PROBLEM - puppet last run on bast3001 is CRITICAL: CRITICAL: puppet fail
[11:02:00] PROBLEM - puppet last run on db1046 is CRITICAL: CRITICAL: puppet fail
[11:02:11] PROBLEM - puppet last run on restbase1012 is CRITICAL: CRITICAL: puppet fail
[11:02:11] PROBLEM - puppet last run on auth1001 is CRITICAL: CRITICAL: puppet fail
[11:02:11] PROBLEM - puppet last run on wtp1001 is CRITICAL: CRITICAL: puppet fail
[11:02:11] PROBLEM - puppet last run on cp4010 is CRITICAL: CRITICAL: puppet fail
[11:02:12] PROBLEM - puppet last run on lvs3003 is CRITICAL: CRITICAL: puppet fail
[11:02:12] PROBLEM - puppet last run on stat1002 is CRITICAL: CRITICAL: puppet fail
[11:02:22] PROBLEM - puppet last run on mw1181 is CRITICAL: CRITICAL: puppet fail
[11:02:30] PROBLEM - puppet last run on analytics1029 is CRITICAL: CRITICAL: puppet fail
[11:02:31] PROBLEM - puppet last run on analytics1032 is CRITICAL: CRITICAL: puppet fail
[11:02:31] PROBLEM - puppet last run on db2001 is CRITICAL: CRITICAL: puppet fail
[11:02:31] PROBLEM - puppet last run on db2037 is CRITICAL: CRITICAL: puppet fail
[11:02:31] PROBLEM - puppet last run on lvs2004 is CRITICAL: CRITICAL: puppet fail
[11:02:31] PROBLEM - puppet last run on ms-be2015 is CRITICAL: CRITICAL: puppet fail
[11:02:40] PROBLEM - puppet last run on labsdb1001 is CRITICAL: CRITICAL: puppet fail
[11:02:40] PROBLEM - puppet last run on analytics1036 is CRITICAL: CRITICAL: puppet fail
[11:02:40] PROBLEM - puppet last run on neon is CRITICAL: CRITICAL: puppet fail
[11:02:41] PROBLEM - puppet last run on db1055 is CRITICAL: CRITICAL: puppet fail
[11:02:41] PROBLEM - puppet last run on elastic1035 is CRITICAL: CRITICAL: puppet fail
[11:02:42] PROBLEM - puppet last run on logstash1002 is CRITICAL: CRITICAL: puppet fail
[11:02:50] PROBLEM - puppet last run on ms-be1014 is CRITICAL: CRITICAL: puppet fail
[11:02:50] PROBLEM - puppet last run on ms-be2011 is CRITICAL: CRITICAL: puppet fail
[11:02:50] PROBLEM - puppet last run on mw2202 is CRITICAL: CRITICAL: puppet fail
[11:02:50] PROBLEM - puppet last run on db2004 is CRITICAL: CRITICAL: puppet fail
[11:02:51] PROBLEM - puppet last run on db1079 is CRITICAL: CRITICAL: puppet fail
[11:02:51] PROBLEM - puppet last run on mc1014 is CRITICAL: CRITICAL: puppet fail
[11:02:51] PROBLEM - puppet last run on db2023 is CRITICAL: CRITICAL: puppet fail
[11:03:00] PROBLEM - puppet last run on ms-be2021 is CRITICAL: CRITICAL: puppet fail
[11:03:00] PROBLEM - puppet last run on mw1267 is CRITICAL: CRITICAL: puppet fail
[11:03:00] PROBLEM - puppet last run on mw2107 is CRITICAL: CRITICAL: puppet fail
[11:03:00] PROBLEM - puppet last run on kafka2001 is CRITICAL: CRITICAL: puppet fail
[11:03:01] PROBLEM - puppet last run on wtp1014 is CRITICAL: CRITICAL: puppet fail
[11:03:01] PROBLEM - puppet last run on db1010 is CRITICAL: CRITICAL: puppet fail
[11:03:01] PROBLEM - puppet last run on wtp2016 is CRITICAL: CRITICAL: puppet fail
[11:03:01] PROBLEM - puppet last run on lvs1009 is CRITICAL: CRITICAL: puppet fail
[11:03:02] PROBLEM - puppet last run on db1068 is CRITICAL: CRITICAL: puppet fail
[11:03:10] PROBLEM - puppet last run on labvirt1007 is CRITICAL: CRITICAL: puppet fail
[11:03:10] PROBLEM - puppet last run on ms-be1005 is CRITICAL: CRITICAL: puppet fail
[11:03:10] PROBLEM - puppet last run on snapshot1005 is CRITICAL: CRITICAL: puppet fail
[11:03:11] PROBLEM - puppet last run on analytics1044 is CRITICAL: CRITICAL: puppet fail
[11:03:11] PROBLEM - puppet last run on dbproxy1001 is CRITICAL: CRITICAL: puppet fail
[11:03:11] PROBLEM - puppet last run on mc2013 is CRITICAL: CRITICAL: puppet fail
[11:03:11] PROBLEM - puppet last run on db1039 is CRITICAL: CRITICAL: puppet fail
[11:03:20] PROBLEM - puppet last run on mw1198 is CRITICAL: CRITICAL: puppet fail
[11:03:20] PROBLEM - puppet last run on analytics1052 is CRITICAL: CRITICAL: puppet fail
[11:03:20] PROBLEM - puppet last run on graphite2002 is CRITICAL: CRITICAL: puppet fail
[11:03:20] PROBLEM - puppet last run on db2047 is CRITICAL: CRITICAL: puppet fail
[11:03:20] PROBLEM - puppet last run on mw2106 is CRITICAL: CRITICAL: puppet fail
[11:03:21] PROBLEM - puppet last run on mw1302 is CRITICAL: CRITICAL: puppet fail
[11:03:21] PROBLEM - puppet last run on mw1245 is CRITICAL: CRITICAL: puppet fail
[11:03:21] PROBLEM - puppet last run on mw1240 is CRITICAL: CRITICAL: puppet fail
[11:03:21] PROBLEM - puppet last run on mw2233 is CRITICAL: CRITICAL: puppet fail
[11:03:22] PROBLEM - puppet last run on mw1288 is CRITICAL: CRITICAL: puppet fail
[11:03:22] PROBLEM - puppet last run on mc2016 is CRITICAL: CRITICAL: puppet fail
[11:03:26] (03CR) 10Jcrespo: [C: 031] "I've checked and the account number issue is still present on very old servers that will be hopefully retired soon." [puppet] - 10https://gerrit.wikimedia.org/r/274076 (https://phabricator.wikimedia.org/T124962) (owner: 10Jcrespo)
[11:03:30] PROBLEM - puppet last run on db2065 is CRITICAL: CRITICAL: puppet fail
[11:03:30] PROBLEM - puppet last run on mw1165 is CRITICAL: CRITICAL: puppet fail
[11:03:30] PROBLEM - puppet last run on mw2249 is CRITICAL: CRITICAL: puppet fail
[11:03:31] PROBLEM - puppet last run on mw1201 is CRITICAL: CRITICAL: Puppet has 5 failures
[11:03:32] PROBLEM - puppet last run on ms-be1008 is CRITICAL: CRITICAL: puppet fail
[11:03:40] PROBLEM - puppet last run on iridium is CRITICAL: CRITICAL: puppet fail
[11:03:40] PROBLEM - puppet last run on cp3003 is CRITICAL: CRITICAL: puppet fail
[11:03:41] PROBLEM - puppet last run on mw1303 is CRITICAL: CRITICAL: puppet fail
[11:03:41] PROBLEM - puppet last run on etherpad1001 is CRITICAL: CRITICAL: puppet fail
[11:03:41] PROBLEM - puppet last run on mx1001 is CRITICAL: CRITICAL: puppet fail
[11:03:41] PROBLEM - puppet last run on oresrdb1001 is CRITICAL: CRITICAL: puppet fail
[11:03:41] PROBLEM - puppet last run on einsteinium is CRITICAL: CRITICAL: puppet fail
[11:03:41] PROBLEM - puppet last run on chromium is CRITICAL: CRITICAL: puppet fail
[11:03:42] PROBLEM - puppet last run on wtp2010 is CRITICAL: CRITICAL: puppet fail
[11:03:42] PROBLEM - puppet last run on install2001 is CRITICAL: CRITICAL: puppet fail
[11:03:43] should be recovering soon
[11:03:43] PROBLEM - puppet last run on analytics1040 is CRITICAL: CRITICAL: puppet fail
[11:03:43] PROBLEM - puppet last run on conf2001 is CRITICAL: CRITICAL: puppet fail
[11:03:44] PROBLEM - puppet last run on sinistra is CRITICAL: CRITICAL: puppet fail
[11:03:44] PROBLEM - puppet last run on mw1202 is CRITICAL: CRITICAL: puppet fail
[11:04:00] PROBLEM - puppet last run on db1049 is CRITICAL: CRITICAL: puppet fail
[11:04:00] PROBLEM - puppet last run on pc1006 is CRITICAL: CRITICAL: puppet fail
[11:04:00] PROBLEM - puppet last run on osmium is CRITICAL: CRITICAL: puppet fail
[11:04:00] PROBLEM - puppet last run on dubnium is CRITICAL: CRITICAL: puppet fail
[11:04:00] PROBLEM - puppet last run on rhenium is CRITICAL: CRITICAL: puppet fail
[11:04:01] PROBLEM - puppet last run on analytics1053 is CRITICAL: CRITICAL: puppet fail
[11:04:01] PROBLEM - puppet last run on mw1178 is CRITICAL: CRITICAL: puppet fail
[11:04:02] PROBLEM - puppet last run on elastic1046 is CRITICAL: CRITICAL: puppet fail
[11:04:02] PROBLEM - puppet last run on bast1001 is CRITICAL: CRITICAL: puppet fail
[11:04:03] PROBLEM - puppet last run on mw1246 is CRITICAL: CRITICAL: puppet fail
[11:04:11] PROBLEM - puppet last run on db1074 is CRITICAL: CRITICAL: puppet fail
[11:04:11] PROBLEM - puppet last run on pc1004 is CRITICAL: CRITICAL: puppet fail
[11:04:11] PROBLEM - puppet last run on mendelevium is CRITICAL: CRITICAL: puppet fail
[11:04:11] PROBLEM - puppet last run on terbium is CRITICAL: CRITICAL: puppet fail
[11:04:11] PROBLEM - puppet last run on fluorine is CRITICAL: CRITICAL: Puppet has 43 failures
[11:04:11] PROBLEM - puppet last run on mw1174 is CRITICAL: CRITICAL: puppet fail
[11:04:11] PROBLEM - puppet last run on mw1252 is CRITICAL: CRITICAL: puppet fail
[11:04:12] PROBLEM - puppet last run on dbstore1002 is CRITICAL: CRITICAL: puppet fail
[11:04:12] PROBLEM - puppet last run on lvs4002 is CRITICAL: CRITICAL: puppet fail
[11:04:13] PROBLEM - puppet last run on cp4018 is CRITICAL: CRITICAL: puppet fail
[11:04:13] PROBLEM - puppet last run on mw1183 is CRITICAL: CRITICAL: puppet fail
[11:04:14] PROBLEM - puppet last
run on db1094 is CRITICAL: CRITICAL: puppet fail [11:04:30] PROBLEM - puppet last run on mw2146 is CRITICAL: CRITICAL: puppet fail [11:04:30] PROBLEM - puppet last run on ocg1001 is CRITICAL: CRITICAL: puppet fail [11:04:30] PROBLEM - puppet last run on db2034 is CRITICAL: CRITICAL: puppet fail [11:04:30] PROBLEM - puppet last run on mw2207 is CRITICAL: CRITICAL: puppet fail [11:04:31] PROBLEM - puppet last run on lvs2003 is CRITICAL: CRITICAL: puppet fail [11:04:31] PROBLEM - puppet last run on scb1001 is CRITICAL: CRITICAL: puppet fail [11:04:31] PROBLEM - puppet last run on eventlog2001 is CRITICAL: CRITICAL: puppet fail [11:04:31] PROBLEM - puppet last run on mw1289 is CRITICAL: CRITICAL: puppet fail [11:04:31] PROBLEM - puppet last run on aqs1002 is CRITICAL: CRITICAL: puppet fail [11:04:32] PROBLEM - puppet last run on db2050 is CRITICAL: CRITICAL: puppet fail [11:04:32] PROBLEM - puppet last run on mw2144 is CRITICAL: CRITICAL: puppet fail [11:04:33] PROBLEM - puppet last run on mw2186 is CRITICAL: CRITICAL: puppet fail [11:04:44] PROBLEM - puppet last run on snapshot1006 is CRITICAL: CRITICAL: puppet fail [11:04:44] PROBLEM - puppet last run on wtp1023 is CRITICAL: CRITICAL: puppet fail [11:04:45] PROBLEM - puppet last run on bromine is CRITICAL: CRITICAL: puppet fail [11:04:45] PROBLEM - puppet last run on wtp1022 is CRITICAL: CRITICAL: puppet fail [11:04:46] PROBLEM - puppet last run on mw1192 is CRITICAL: CRITICAL: puppet fail [11:04:46] PROBLEM - puppet last run on mw1257 is CRITICAL: CRITICAL: puppet fail [11:04:47] PROBLEM - puppet last run on mw1203 is CRITICAL: CRITICAL: puppet fail [11:04:47] PROBLEM - puppet last run on rdb1001 is CRITICAL: CRITICAL: puppet fail [11:04:48] PROBLEM - puppet last run on db1016 is CRITICAL: CRITICAL: puppet fail [11:04:48] PROBLEM - puppet last run on db2051 is CRITICAL: CRITICAL: puppet fail [11:04:49] PROBLEM - puppet last run on lvs2002 is CRITICAL: CRITICAL: puppet fail [11:04:49] PROBLEM - puppet last run on analytics1041 is CRITICAL: CRITICAL: puppet fail [11:04:50] PROBLEM - puppet last run on dbproxy1003 is CRITICAL: CRITICAL: puppet fail [11:04:50] PROBLEM - puppet last run on lvs2006 is CRITICAL: CRITICAL: puppet fail [11:04:51] PROBLEM - puppet last run on restbase1014 is CRITICAL: CRITICAL: puppet fail [11:04:51] PROBLEM - puppet last run on es2002 is CRITICAL: CRITICAL: puppet fail [11:06:11] PROBLEM - puppet last run on relforge1002 is CRITICAL: CRITICAL: Puppet has 6 failures [11:06:12] PROBLEM - puppet last run on lvs4001 is CRITICAL: CRITICAL: puppet fail [11:06:13] PROBLEM - puppet last run on mw1185 is CRITICAL: CRITICAL: puppet fail [11:06:20] PROBLEM - MediaWiki exceptions and fatals per minute on graphite1001 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [50.0] [11:06:30] PROBLEM - puppet last run on mw2091 is CRITICAL: CRITICAL: puppet fail [11:06:40] PROBLEM - puppet last run on analytics1048 is CRITICAL: CRITICAL: Puppet has 8 failures [11:06:51] PROBLEM - puppet last run on mw2080 is CRITICAL: CRITICAL: Puppet has 6 failures [11:07:11] PROBLEM - puppet last run on mw1254 is CRITICAL: CRITICAL: Puppet has 7 failures [11:07:12] PROBLEM - puppet last run on mw2075 is CRITICAL: CRITICAL: Puppet has 8 failures [11:07:27] so puppet is going well then [11:07:42] RECOVERY - IPv6 ping to codfw on ripe-atlas-codfw is OK: OK - failed 15 probes of 237 (alerts on 19) - https://atlas.ripe.net/measurements/1791212/#!map [11:09:42] RECOVERY - puppet last run on restbase1015 is OK: OK: Puppet is currently 
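For context on the "MediaWiki exceptions and fatals per minute" alerts above: that style of check fires when more than a given share of recent datapoints sits above a threshold, which is what "60.00% of data above the critical threshold [50.0]" is reporting. A minimal Python sketch of the percent-over-threshold logic; the function name and sample data are assumptions for illustration, not Wikimedia's actual check_graphite plugin:

    # Share of non-null datapoints above a threshold, as the alert text reports it.
    def percent_over(datapoints, threshold):
        values = [v for v in datapoints if v is not None]
        if not values:
            return 0.0
        return 100.0 * sum(1 for v in values if v > threshold) / len(values)

    # Ten one-minute samples of exceptions+fatals per minute (made-up numbers).
    samples = [12, 55, 61, 70, None, 58, 49, 90, 66, 21]
    pct = percent_over(samples, threshold=50.0)
    status = "CRITICAL" if pct > 50.0 else "OK"
    print(f"{status}: {pct:.2f}% of data above the critical threshold [50.0]")

With these samples the check prints "CRITICAL: 66.67% of data above the critical threshold [50.0]", and recovers once enough fresh datapoints fall back under the threshold, matching the PROBLEM/RECOVERY pairs in the log.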
[... icinga-wm flood: the alerting hosts recover one by one between 11:13:21 and 11:27:11, each reporting "RECOVERY - puppet last run on <host> is OK: OK: Puppet is currently enabled, last run <N> seconds/minutes ago with 0 failures" ...]
[11:14:12] RECOVERY - MediaWiki exceptions and fatals per minute on graphite1001 is OK: OK: Less than 1.00% above the threshold [25.0]
[11:20:03] PROBLEM - MediaWiki exceptions and fatals per minute on graphite1001 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [50.0]
[11:24:11] RECOVERY - MediaWiki exceptions and fatals per minute on graphite1001 is OK: OK: Less than 1.00% above the threshold [25.0]
[11:26:34] heh...
[11:27:11] RECOVERY - puppet last run on db1015 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:27:11] RECOVERY - puppet last run on mc2007 is OK: OK: Puppet is currently enabled, last run 43 seconds ago with 0 failures [11:27:11] RECOVERY - puppet last run on ms-be1010 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures [11:27:11] RECOVERY - puppet last run on mw2129 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:27:12] RECOVERY - puppet last run on db1056 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:27:21] RECOVERY - puppet last run on db2064 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:27:21] RECOVERY - puppet last run on db2056 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures [11:27:21] RECOVERY - puppet last run on ms-be2022 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures [11:27:22] RECOVERY - puppet last run on gallium is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures [11:27:22] RECOVERY - puppet last run on db2060 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:27:22] RECOVERY - puppet last run on mw2228 is OK: OK: Puppet is currently enabled, last run 25 seconds ago with 0 failures [11:27:30] RECOVERY - puppet last run on mw2250 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:27:30] RECOVERY - puppet last run on es2013 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:27:30] RECOVERY - puppet last run on mw1170 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:27:30] RECOVERY - puppet last run on wtp2015 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:27:30] RECOVERY - puppet last run on wtp2004 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures [11:27:31] RECOVERY - puppet last run on analytics1047 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:27:31] RECOVERY - puppet last run on wtp1012 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures [11:27:32] RECOVERY - puppet last run on db1089 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [11:27:32] RECOVERY - puppet last run on mw1284 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:27:33] RECOVERY - puppet last run on chromium is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:27:33] RECOVERY - puppet last run on db1043 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures [11:27:34] RECOVERY - puppet last run on mw2095 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures [11:27:40] RECOVERY - puppet last run on einsteinium is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:27:40] RECOVERY - puppet last run on cp3036 is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures [11:27:40] RECOVERY - puppet last run on cp3008 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:27:41] RECOVERY - puppet last run on mw2158 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:27:41] RECOVERY - puppet last run on mw2126 is OK: OK: Puppet is currently enabled, last run 56 
seconds ago with 0 failures [11:27:41] RECOVERY - puppet last run on hydrogen is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [11:27:41] RECOVERY - puppet last run on mc2015 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:27:41] RECOVERY - puppet last run on wtp1005 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [11:28:11] RECOVERY - puppet last run on cp4010 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:28:20] RECOVERY - puppet last run on rdb1005 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:28:21] RECOVERY - puppet last run on mw2166 is OK: OK: Puppet is currently enabled, last run 1 second ago with 0 failures [11:28:21] RECOVERY - puppet last run on mw2146 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures [11:28:21] RECOVERY - puppet last run on mw2207 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:28:30] RECOVERY - puppet last run on mw1289 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures [11:28:30] RECOVERY - puppet last run on restbase1007 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:28:30] RECOVERY - puppet last run on aqs1002 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:28:30] RECOVERY - puppet last run on ms-be2015 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures [11:28:40] RECOVERY - puppet last run on mw1199 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:28:40] RECOVERY - puppet last run on snapshot1006 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:28:41] RECOVERY - puppet last run on dbproxy1003 is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures [11:28:41] RECOVERY - puppet last run on ms-be1014 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:28:42] RECOVERY - puppet last run on lvs2002 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:28:50] RECOVERY - puppet last run on lvs2006 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [11:28:50] RECOVERY - puppet last run on ms-be2011 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:28:51] RECOVERY - puppet last run on graphite2001 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures [11:28:51] RECOVERY - puppet last run on mc1014 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:29:00] RECOVERY - puppet last run on ms-be2021 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:29:01] RECOVERY - puppet last run on wtp1014 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures [11:29:01] RECOVERY - puppet last run on lvs1009 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [11:29:02] RECOVERY - puppet last run on elastic1039 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [11:29:02] RECOVERY - puppet last run on elastic1038 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures [11:29:10] RECOVERY - puppet last run on snapshot1001 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:29:10] RECOVERY - puppet last run on 
conf1001 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:29:10] RECOVERY - puppet last run on labsdb1007 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures [11:29:11] RECOVERY - puppet last run on ms-be1005 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [11:29:11] RECOVERY - puppet last run on db1060 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:29:12] RECOVERY - puppet last run on stat1003 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:29:20] RECOVERY - puppet last run on db1039 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures [11:29:20] RECOVERY - puppet last run on mc2013 is OK: OK: Puppet is currently enabled, last run 5 seconds ago with 0 failures [11:29:21] RECOVERY - puppet last run on db2016 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures [11:29:21] RECOVERY - puppet last run on db2068 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:29:21] RECOVERY - puppet last run on restbase2008 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:29:21] RECOVERY - puppet last run on cp1064 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [11:29:21] RECOVERY - puppet last run on mw1302 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:29:22] RECOVERY - puppet last run on francium is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [11:29:30] RECOVERY - puppet last run on mw1259 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures [11:29:30] RECOVERY - puppet last run on mw1288 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures [11:29:30] RECOVERY - puppet last run on cp1048 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [11:29:30] RECOVERY - puppet last run on ununpentium is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:29:30] RECOVERY - puppet last run on mw2061 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures [11:29:31] RECOVERY - puppet last run on mc2006 is OK: OK: Puppet is currently enabled, last run 4 seconds ago with 0 failures [11:29:31] RECOVERY - puppet last run on db2033 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures [11:29:32] RECOVERY - puppet last run on suhail is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:29:41] RECOVERY - puppet last run on mw1195 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:29:41] RECOVERY - puppet last run on ms-be1008 is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures [11:29:41] RECOVERY - puppet last run on silver is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:29:50] RECOVERY - puppet last run on mw2138 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:29:50] RECOVERY - puppet last run on cp3003 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures [11:29:51] RECOVERY - puppet last run on mw1202 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures [11:29:51] RECOVERY - puppet last run on mw1184 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:29:51] 
RECOVERY - puppet last run on wtp2003 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures [11:29:51] RECOVERY - puppet last run on alsafi is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:29:52] RECOVERY - puppet last run on mw1282 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures [11:29:52] RECOVERY - puppet last run on kafka1012 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:30:01] RECOVERY - puppet last run on cp4013 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:30:01] RECOVERY - puppet last run on bast4001 is OK: OK: Puppet is currently enabled, last run 30 seconds ago with 0 failures [11:30:01] RECOVERY - puppet last run on rhenium is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:30:01] RECOVERY - puppet last run on osmium is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures [11:30:11] RECOVERY - puppet last run on mw1246 is OK: OK: Puppet is currently enabled, last run 42 seconds ago with 0 failures [11:30:20] RECOVERY - puppet last run on db1094 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures [11:30:21] RECOVERY - puppet last run on cp4018 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [11:30:21] RECOVERY - puppet last run on cp3032 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:30:30] RECOVERY - puppet last run on praseodymium is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [11:30:42] RECOVERY - puppet last run on analytics1026 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:31:00] RECOVERY - puppet last run on mw1287 is OK: OK: Puppet is currently enabled, last run 7 seconds ago with 0 failures [11:31:00] RECOVERY - puppet last run on mw1191 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [11:31:00] RECOVERY - puppet last run on mw1286 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures [11:31:01] RECOVERY - puppet last run on eeden is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures [11:31:12] RECOVERY - puppet last run on mw1264 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures [11:31:21] RECOVERY - puppet last run on mw1252 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures [11:31:30] RECOVERY - puppet last run on mw1183 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:31:32] PROBLEM - MediaWiki exceptions and fatals per minute on graphite1001 is CRITICAL: CRITICAL: 80.00% of data above the critical threshold [50.0] [11:31:41] RECOVERY - puppet last run on maps-test2004 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:31:42] RECOVERY - puppet last run on mw1257 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:31:52] RECOVERY - puppet last run on analytics1050 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:31:53] RECOVERY - puppet last run on meitnerium is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:32:00] RECOVERY - puppet last run on wtp1023 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:32:01] RECOVERY - puppet last run on analytics1039 is OK: OK: Puppet 
is currently enabled, last run 51 seconds ago with 0 failures [11:32:01] RECOVERY - puppet last run on mw1192 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:32:10] RECOVERY - puppet last run on labservices1002 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures [11:32:10] RECOVERY - puppet last run on neon is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:32:10] RECOVERY - puppet last run on mw2214 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:32:11] RECOVERY - puppet last run on es2015 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [11:32:11] RECOVERY - puppet last run on ms-be1020 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:32:21] RECOVERY - puppet last run on es2019 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [11:32:21] RECOVERY - puppet last run on mc2008 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [11:32:21] RECOVERY - puppet last run on mw1244 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:32:31] RECOVERY - puppet last run on db2035 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [11:32:31] RECOVERY - puppet last run on db2051 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:32:32] RECOVERY - puppet last run on elastic2001 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [11:32:40] 06Operations, 06Services, 06Services-next, 05Security, 15User-mobrovac: Productize the Electron PDF render service & create a REST API end point - https://phabricator.wikimedia.org/T142226#2543531 (10Lea_WMDE) @ssastry The problem on WMDE side is that from autumn on I'm going to be on a leave for 5 month... 
[11:32:41] RECOVERY - puppet last run on mw2125 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:32:41] RECOVERY - puppet last run on mw2191 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [11:33:00] RECOVERY - puppet last run on notebook1002 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [11:33:00] RECOVERY - puppet last run on mw1248 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures [11:33:11] RECOVERY - puppet last run on kafka1020 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [11:33:11] RECOVERY - puppet last run on mw1216 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [11:33:12] RECOVERY - puppet last run on sca2001 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [11:33:21] RECOVERY - puppet last run on db2066 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [11:33:21] RECOVERY - puppet last run on mw2088 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [11:33:21] RECOVERY - puppet last run on rdb1001 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [11:33:41] RECOVERY - puppet last run on db2049 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures [11:33:41] RECOVERY - puppet last run on mw1258 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures [11:33:41] RECOVERY - puppet last run on mw2177 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures [11:33:41] RECOVERY - puppet last run on mw2235 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [11:33:41] RECOVERY - puppet last run on analytics1037 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [11:33:50] RECOVERY - puppet last run on elastic1028 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [11:33:50] RECOVERY - puppet last run on mw1163 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [11:33:51] RECOVERY - puppet last run on mw1212 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [11:33:51] RECOVERY - puppet last run on mw1303 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures [11:33:51] RECOVERY - puppet last run on mw2072 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [11:33:52] RECOVERY - puppet last run on wtp1022 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures [11:34:01] RECOVERY - puppet last run on analytics1034 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [11:34:01] RECOVERY - puppet last run on db1041 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures [11:34:01] RECOVERY - puppet last run on hafnium is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [11:34:10] RECOVERY - puppet last run on mw2232 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures [11:34:11] RECOVERY - puppet last run on restbase2003 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [11:37:47] (03PS1) 10Dereckson: Disable local upload on bat-smg.wikipedia [mediawiki-config] - 10https://gerrit.wikimedia.org/r/304202 (https://phabricator.wikimedia.org/T142632) [11:40:43] RECOVERY - MediaWiki 
exceptions and fatals per minute on graphite1001 is OK: OK: Less than 1.00% above the threshold [25.0] [11:54:15] (03CR) 10Gehel: [C: 031] mwgrep: fails gracefully when an invalid regex is provided [puppet] - 10https://gerrit.wikimedia.org/r/302892 (https://phabricator.wikimedia.org/T141996) (owner: 10DCausse) [11:55:24] (03PS6) 10Alexandros Kosiaris: puppetmaster: Split extra_auth_rules from is_labs_master [puppet] - 10https://gerrit.wikimedia.org/r/303757 (owner: 10Alex Monk) [11:56:38] 06Operations, 05Gitblit-Deprecate: Gitblit links not redirecting to the correct moved resource unless .git is part of repo name in url - https://phabricator.wikimedia.org/T139027#2543632 (10Danny_B) [11:59:55] (03PS7) 10Alexandros Kosiaris: puppetmaster: Split extra_auth_rules from is_labs_master [puppet] - 10https://gerrit.wikimedia.org/r/303757 (owner: 10Alex Monk) [12:00:31] (03PS1) 10Jcrespo: Remove unused accounts from unneeded functionalities with large uid [puppet] - 10https://gerrit.wikimedia.org/r/304203 [12:00:40] (03CR) 10Gehel: "Just created T142705 to keep track of the wmf-config cleanup." [puppet] - 10https://gerrit.wikimedia.org/r/304112 (https://phabricator.wikimedia.org/T142581) (owner: 10Dzahn) [12:00:59] (03CR) 10jenkins-bot: [V: 04-1] puppetmaster: Split extra_auth_rules from is_labs_master [puppet] - 10https://gerrit.wikimedia.org/r/303757 (owner: 10Alex Monk) [12:02:30] 06Operations, 10Mail: not being able to send emails via Special:EmailUser (Yahoo / GMail) - https://phabricator.wikimedia.org/T137337#2543674 (10Aklapper) [12:04:42] 06Operations, 10Ops-Access-Requests: Access for platonides to chanops - https://phabricator.wikimedia.org/T142668#2542616 (10Danny_B) Side note: I actually don't think it is necessary to track requests for IRC flags in Phabricator. Such requests were never tracked anywhere, and happened always pretty... [12:04:58] (03PS3) 10Gehel: WIP - elasticsearch - cleanup roles [puppet] - 10https://gerrit.wikimedia.org/r/304067 [12:05:15] 06Operations, 10DBA, 13Patch-For-Review: Decomission old coredb machines (<=db1050) - https://phabricator.wikimedia.org/T134476#2543680 (10jcrespo) [12:05:54] (03CR) 10Muehlenhoff: "@Jaime, no objection from my pov" [puppet] - 10https://gerrit.wikimedia.org/r/274076 (https://phabricator.wikimedia.org/T124962) (owner: 10Jcrespo) [12:07:48] (03PS2) 10Jcrespo: Remove unused accounts from unneeded functionalities with large uid [puppet] - 10https://gerrit.wikimedia.org/r/304203 [12:08:41] PROBLEM - puppet last run on oxygen is CRITICAL: CRITICAL: Puppet has 1 failures [12:12:33] 06Operations, 07Tracking: reduce amount of remaining Ubuntu 12.04 (precise) systems - https://phabricator.wikimedia.org/T123525#2543709 (10jcrespo) I would like to bring your attention to https://gerrit.wikimedia.org/r/304203 This is not directly related to this goal, but with the amount of old server reimage...
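The account cleanup jcrespo is pointing at here (Gerrit 304203) targets leftover system users created by roles that no longer exist. As a hedged illustration of the audit behind such a patch, and not the puppet change itself, this is one way to spot those accounts on a host; the uid cutoff is an arbitrary example:

    # list accounts with suspiciously large uids, e.g. leftovers from removed roles
    awk -F: '$3 >= 10000 && $1 != "nobody" {print $3, $1}' /etc/passwd | sort -n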
[12:14:15] (03CR) 10Jcrespo: [C: 032] Decommision pc100[123] 2/2 [puppet] - 10https://gerrit.wikimedia.org/r/274076 (https://phabricator.wikimedia.org/T124962) (owner: 10Jcrespo) [12:15:01] (03PS8) 10Alexandros Kosiaris: puppetmaster: Split extra_auth_rules from is_labs_master [puppet] - 10https://gerrit.wikimedia.org/r/303757 (owner: 10Alex Monk) [12:15:06] (03PS1) 10Gehel: CirrusSearch - drop references to nobelium and labsearch [mediawiki-config] - 10https://gerrit.wikimedia.org/r/304204 (https://phabricator.wikimedia.org/T142705) [12:15:47] (03CR) 10jenkins-bot: [V: 04-1] CirrusSearch - drop references to nobelium and labsearch [mediawiki-config] - 10https://gerrit.wikimedia.org/r/304204 (https://phabricator.wikimedia.org/T142705) (owner: 10Gehel) [12:16:09] (03CR) 10Gehel: "This was a mechanical cleanup. I don't completely understand what I'm doing here. In depth review from David and/or Erik is required." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/304204 (https://phabricator.wikimedia.org/T142705) (owner: 10Gehel) [12:18:42] (03PS2) 10Gehel: CirrusSearch - drop references to nobelium and labsearch [mediawiki-config] - 10https://gerrit.wikimedia.org/r/304204 (https://phabricator.wikimedia.org/T142705) [12:19:51] 06Operations, 10Ops-Access-Requests: Access for platonides to chanops - https://phabricator.wikimedia.org/T142668#2542616 (10akosiaris) FWIW, I've added platonides to @wikimedia-operations chanops. I got no access in the rest of the channels though [12:20:16] 06Operations, 10Ops-Access-Requests: Access for platonides to chanops - https://phabricator.wikimedia.org/T142668#2543726 (10akosiaris) p:05Triage>03Low [12:22:21] PROBLEM - MediaWiki exceptions and fatals per minute on graphite1001 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [50.0] [12:24:20] RECOVERY - MediaWiki exceptions and fatals per minute on graphite1001 is OK: OK: Less than 1.00% above the threshold [25.0] [12:25:52] (03PS4) 10Gehel: WIP - elasticsearch - cleanup roles [puppet] - 10https://gerrit.wikimedia.org/r/304067 [12:25:57] 06Operations, 10Phabricator-Bot-Requests: Creation of bot for Operations - https://phabricator.wikimedia.org/T142362#2543732 (10Aklapper) @volans: Could you drop the info on https://www.mediawiki.org/wiki/Phabricator/Bots#Acquiring_a_bot ? * Name * Purpose * An email address (which can be invalid but... [12:29:18] (03CR) 10Jcrespo: "I have added all potential people that may have code dependent on this to warn them of the intended deletion." [puppet] - 10https://gerrit.wikimedia.org/r/301076 (owner: 10Jcrespo) [12:29:38] gehel: a WIP might be in order if your patch is not ready for general review :) [12:30:15] 06Operations: Install puppetDB at WMF - https://phabricator.wikimedia.org/T139476#2543738 (10akosiaris) [12:30:17] 06Operations, 10vm-requests, 05Puppet-infrastructure-modernization: eqiad+codfw: 1 VM request for puppetDB - https://phabricator.wikimedia.org/T142365#2543737 (10akosiaris) 05Open>03Resolved [12:30:52] Nemo_bis: yep, that's the case. Should I do something more than having a "WIP" in the title? [12:32:59] 06Operations, 10vm-requests: EQIAD: (1) VM request for url-downloader - https://phabricator.wikimedia.org/T134496#2543748 (10akosiaris) Moving this forward. Choosing the name `aluminium` as we are running out of elements in eqiad. It's been free since 3 weeks ago with rODNScdb5d67edc2.
[12:34:24] RECOVERY - puppet last run on oxygen is OK: OK: Puppet is currently enabled, last run 55 seconds ago with 0 failures [12:34:56] PROBLEM - dhclient process on mw2086 is CRITICAL: Timeout while attempting connection [12:35:04] PROBLEM - Apache HTTP on mw2086 is CRITICAL: Connection timed out [12:35:14] PROBLEM - mediawiki-installation DSH group on mw2086 is CRITICAL: Host mw2086 is not in mediawiki-installation dsh group [12:35:44] (03PS9) 10Alexandros Kosiaris: puppetmaster: Split extra_auth_rules from is_labs_master [puppet] - 10https://gerrit.wikimedia.org/r/303757 (owner: 10Alex Monk) [12:35:44] PROBLEM - nutcracker port on mw2086 is CRITICAL: Timeout while attempting connection [12:36:04] PROBLEM - nutcracker process on mw2086 is CRITICAL: Timeout while attempting connection [12:36:14] PROBLEM - puppet last run on mw2086 is CRITICAL: Timeout while attempting connection [12:36:14] (03CR) 10Alexandros Kosiaris: [C: 032 V: 032] "After missing the obvious typo (<%- instead of <%=) in the ERB file for quite a while, PCC is happy, merging" [puppet] - 10https://gerrit.wikimedia.org/r/303757 (owner: 10Alex Monk) [12:36:24] PROBLEM - salt-minion processes on mw2086 is CRITICAL: Timeout while attempting connection [12:38:33] mw2086 is wip, it's being reimaged to jessie [12:40:35] PROBLEM - MediaWiki exceptions and fatals per minute on graphite1001 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [50.0] [12:42:14] (03CR) 10Jcrespo: "It seems analytics uses mysql_wmf::mylvmbackup, we should move that to the mariadb module." [puppet] - 10https://gerrit.wikimedia.org/r/301076 (owner: 10Jcrespo) [12:43:16] (03PS2) 10Jcrespo: Delete coredb_mysql module [puppet] - 10https://gerrit.wikimedia.org/r/301076 [12:44:49] (03PS3) 10Jcrespo: Delete coredb_mysql module and dependent roles and modules [puppet] - 10https://gerrit.wikimedia.org/r/301076 [12:48:34] RECOVERY - MediaWiki exceptions and fatals per minute on graphite1001 is OK: OK: Less than 1.00% above the threshold [25.0] [12:48:36] RECOVERY - Apache HTTP on mw2086 is OK: HTTP OK: HTTP/1.1 200 OK - 10975 bytes in 0.074 second response time [12:50:34] RECOVERY - dhclient process on mw2086 is OK: PROCS OK: 0 processes with command name dhclient [12:55:01] (03PS2) 10Muehlenhoff: Inline firejail profile no longer shipped in firejail 0.9.40 [puppet] - 10https://gerrit.wikimedia.org/r/303553 (https://phabricator.wikimedia.org/T121756) [12:55:04] (03PS5) 10Gehel: elasticsearch - cleanup roles [puppet] - 10https://gerrit.wikimedia.org/r/304067 [12:55:07] (03CR) 10DCausse: [C: 031] "lgtm" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/304204 (https://phabricator.wikimedia.org/T142705) (owner: 10Gehel) [12:59:04] 06Operations, 10Analytics-Cluster: Migrate titanium to jessie (archiva.wikimedia.org upgrade) - https://phabricator.wikimedia.org/T123725#2543774 (10MoritzMuehlenhoff) a:05MoritzMuehlenhoff>03None [12:59:32] (03PS1) 10Alexandros Kosiaris: Point eqiad url-downloader to codfw [dns] - 10https://gerrit.wikimedia.org/r/304210 (https://phabricator.wikimedia.org/T134496) [12:59:34] (03PS1) 10Alexandros Kosiaris: Introduce aluminium.wikimedia.org [dns] - 10https://gerrit.wikimedia.org/r/304211 (https://phabricator.wikimedia.org/T134496) [12:59:36] (03PS1) 10Alexandros Kosiaris: Revert "Point eqiad url-downloader to codfw" [dns] - 10https://gerrit.wikimedia.org/r/304212 [13:02:50] !log restarting cassandra on aqs100[123] for jvm upgrades [13:02:55] (03PS2) 10Alexandros Kosiaris: Revert "Point eqiad url-downloader to codfw" 
[dns] - 10https://gerrit.wikimedia.org/r/304212 (https://phabricator.wikimedia.org/T134496) [13:02:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [13:03:52] (03CR) 10Gehel: "It seems that vagrant::lxc is already used by role::labs::mediawiki_vagrant and role::labs::vagrant_lxc. I wonder if they are already used" [puppet] - 10https://gerrit.wikimedia.org/r/298636 (owner: 10EBernhardson) [13:04:58] (03PS2) 10BBlack: rcstream: remove internal TLS listener [puppet] - 10https://gerrit.wikimedia.org/r/304023 (https://phabricator.wikimedia.org/T134871) [13:06:05] (03CR) 10BBlack: [C: 032] rcstream: remove internal TLS listener [puppet] - 10https://gerrit.wikimedia.org/r/304023 (https://phabricator.wikimedia.org/T134871) (owner: 10BBlack) [13:09:31] (03PS5) 10Alexandros Kosiaris: puppetmaster: Kill is_labs_master [puppet] - 10https://gerrit.wikimedia.org/r/303761 (owner: 10Alex Monk) [13:09:48] (03PS2) 10Gehel: vagrant-lxc requires ruby build dependencies [puppet] - 10https://gerrit.wikimedia.org/r/298636 (owner: 10EBernhardson) [13:09:54] (03CR) 10Ottomata: [C: 031] "To be merged after refinery changes are deployed." [puppet] - 10https://gerrit.wikimedia.org/r/304195 (owner: 10Joal) [13:11:38] (03CR) 10BBlack: [C: 031] "I'm guessing deployment will be slightly tricky? Maybe disable puppet on all the caches to (manually) remove libvmod-header and add varni" [puppet] - 10https://gerrit.wikimedia.org/r/304189 (https://phabricator.wikimedia.org/T122881) (owner: 10Ema) [13:12:44] (03PS1) 10BBlack: ciphersuite: remove last non-AEAD AES256 [puppet] - 10https://gerrit.wikimedia.org/r/304213 [13:12:46] (03PS1) 10BBlack: ciphersuite: update commentary [puppet] - 10https://gerrit.wikimedia.org/r/304214 [13:13:48] (03CR) 10Gehel: [C: 032] vagrant-lxc requires ruby build dependencies [puppet] - 10https://gerrit.wikimedia.org/r/298636 (owner: 10EBernhardson) [13:17:17] (03PS2) 10BBlack: ciphersuite: remove last non-AEAD AES256 [puppet] - 10https://gerrit.wikimedia.org/r/304213 [13:18:36] (03CR) 10Alexandros Kosiaris: [C: 032] puppetmaster: Kill is_labs_master [puppet] - 10https://gerrit.wikimedia.org/r/303761 (owner: 10Alex Monk) [13:18:41] (03PS6) 10Alexandros Kosiaris: puppetmaster: Kill is_labs_master [puppet] - 10https://gerrit.wikimedia.org/r/303761 (owner: 10Alex Monk) [13:18:43] (03CR) 10Alexandros Kosiaris: [V: 032] puppetmaster: Kill is_labs_master [puppet] - 10https://gerrit.wikimedia.org/r/303761 (owner: 10Alex Monk) [13:19:10] (03PS3) 10BBlack: ciphersuite: remove last non-AEAD AES256 [puppet] - 10https://gerrit.wikimedia.org/r/304213 [13:19:16] (03CR) 10BBlack: [C: 032 V: 032] ciphersuite: remove last non-AEAD AES256 [puppet] - 10https://gerrit.wikimedia.org/r/304213 (owner: 10BBlack) [13:19:23] (03PS2) 10BBlack: ciphersuite: update commentary [puppet] - 10https://gerrit.wikimedia.org/r/304214 [13:19:25] (03PS5) 10Gehel: Maps: Added list of cassandra servers [puppet] - 10https://gerrit.wikimedia.org/r/304019 (https://phabricator.wikimedia.org/T138092) (owner: 10Yurik) [13:19:27] (03CR) 10BBlack: [C: 032 V: 032] ciphersuite: update commentary [puppet] - 10https://gerrit.wikimedia.org/r/304214 (owner: 10BBlack) [13:23:07] (03CR) 10Gehel: [C: 032] Maps: Added list of cassandra servers [puppet] - 10https://gerrit.wikimedia.org/r/304019 (https://phabricator.wikimedia.org/T138092) (owner: 10Yurik) [13:23:21] (03PS6) 10Gehel: Maps: Added list of cassandra servers [puppet] - 10https://gerrit.wikimedia.org/r/304019 (https://phabricator.wikimedia.org/T138092) (owner: 
10Yurik) [13:28:35] (03PS9) 10Giuseppe Lavagetto: puppetmaster: add role for puppetdb [puppet] - 10https://gerrit.wikimedia.org/r/303801 (https://phabricator.wikimedia.org/T142363) [13:28:53] (03PS2) 10Gehel: LVS configuration for maps cluster in eqiad [puppet] - 10https://gerrit.wikimedia.org/r/303559 (https://phabricator.wikimedia.org/T142393) [13:29:01] (03CR) 10BBlack: [C: 031] ipsec_allow: Restrict to domain networks [puppet] - 10https://gerrit.wikimedia.org/r/303837 (owner: 10Muehlenhoff) [13:29:56] (03PS1) 10Alexandros Kosiaris: Create account for chelsyx [puppet] - 10https://gerrit.wikimedia.org/r/304220 (https://phabricator.wikimedia.org/T142648) [13:30:16] (03PS1) 10Gehel: Maps LVS - activate icinga check [puppet] - 10https://gerrit.wikimedia.org/r/304221 (https://phabricator.wikimedia.org/T142393) [13:30:56] (03CR) 10Alexandros Kosiaris: [C: 031] LVS configuration for maps cluster in eqiad [puppet] - 10https://gerrit.wikimedia.org/r/303559 (https://phabricator.wikimedia.org/T142393) (owner: 10Gehel) [13:31:02] (03CR) 10jenkins-bot: [V: 04-1] Create account for chelsyx [puppet] - 10https://gerrit.wikimedia.org/r/304220 (https://phabricator.wikimedia.org/T142648) (owner: 10Alexandros Kosiaris) [13:31:14] (03PS1) 10Thcipriani: Fix bad owner check [puppet] - 10https://gerrit.wikimedia.org/r/304223 (https://phabricator.wikimedia.org/T127093) [13:31:50] 06Operations, 10Ops-Access-Requests, 13Patch-For-Review: Requesting access to stat1003, stat1002 and fluorine for chelsyx - https://phabricator.wikimedia.org/T142648#2542048 (10akosiaris) Change uploaded, manager approval done, into the 3 day waiting phase per the process now. [13:32:47] (03CR) 10Giuseppe Lavagetto: "https://puppet-compiler.wmflabs.org/3689/ seems to do the right thing." [puppet] - 10https://gerrit.wikimedia.org/r/303801 (https://phabricator.wikimedia.org/T142363) (owner: 10Giuseppe Lavagetto) [13:33:10] PROBLEM - MediaWiki exceptions and fatals per minute on graphite1001 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [50.0] [13:33:22] (03CR) 10BBlack: [C: 031] LVS configuration for maps cluster in eqiad [puppet] - 10https://gerrit.wikimedia.org/r/303559 (https://phabricator.wikimedia.org/T142393) (owner: 10Gehel) [13:34:01] (03PS2) 10Ema: Install package varnish-modules on v4 hosts [puppet] - 10https://gerrit.wikimedia.org/r/304189 (https://phabricator.wikimedia.org/T122881) [13:34:48] (03PS2) 10BBlack: lvs: switch port 80 loadbalancing to sh for cache_ services [puppet] - 10https://gerrit.wikimedia.org/r/297418 (https://phabricator.wikimedia.org/T108827) (owner: 10Ema) [13:35:10] RECOVERY - MediaWiki exceptions and fatals per minute on graphite1001 is OK: OK: Less than 1.00% above the threshold [25.0] [13:42:21] RECOVERY - salt-minion processes on mw2086 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/salt-minion [13:46:34] (03PS3) 10Muehlenhoff: Inline firejail profile no longer shipped in firejail 0.9.40 [puppet] - 10https://gerrit.wikimedia.org/r/303553 (https://phabricator.wikimedia.org/T121756) [13:47:01] PROBLEM - MediaWiki exceptions and fatals per minute on graphite1001 is CRITICAL: CRITICAL: 100.00% of data above the critical threshold [50.0] [13:47:46] O_O [13:48:28] (03PS2) 10Gehel: Maps - categorizing new eqiad slaves [puppet] - 10https://gerrit.wikimedia.org/r/304025 (https://phabricator.wikimedia.org/T138092) [13:49:08] (03CR) 10Jcrespo: "Abandoned in favor of 301076." 
[puppet] - 10https://gerrit.wikimedia.org/r/295628 (owner: 10Jcrespo) [13:49:24] (03Abandoned) 10Jcrespo: [WIP] Delete deprecated modules coredb_mysql and mysql_wmf [puppet] - 10https://gerrit.wikimedia.org/r/295628 (owner: 10Jcrespo) [13:49:33] 06Operations, 10ops-eqiad: investigate ores1002 - not in racktables but shows up on switch - https://phabricator.wikimedia.org/T142621#2543905 (10Cmjohnson) Verified server and removed ores1002 from switch. The server is wmf4723 and is in the spare list. [13:50:14] (03CR) 10Gehel: [C: 032] Maps - categorizing new eqiad slaves [puppet] - 10https://gerrit.wikimedia.org/r/304025 (https://phabricator.wikimedia.org/T138092) (owner: 10Gehel) [13:53:32] !log restarting nodepool to test stats collection [13:53:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [13:54:56] (03PS3) 10Muehlenhoff: role::dataset::common: Limit to production networks [puppet] - 10https://gerrit.wikimedia.org/r/304192 [13:56:43] (03CR) 10Muehlenhoff: [C: 032] role::dataset::common: Limit to production networks [puppet] - 10https://gerrit.wikimedia.org/r/304192 (owner: 10Muehlenhoff) [13:56:53] (03CR) 10Mobrovac: [C: 031] "PCC OK - https://puppet-compiler.wmflabs.org/3690/" [puppet] - 10https://gerrit.wikimedia.org/r/304125 (https://phabricator.wikimedia.org/T142360) (owner: 10Ladsgroup) [13:59:01] RECOVERY - MediaWiki exceptions and fatals per minute on graphite1001 is OK: OK: Less than 1.00% above the threshold [25.0] [14:02:12] (03PS3) 10Jcrespo: Retire db2008 and db2009 as x1 nodes [puppet] - 10https://gerrit.wikimedia.org/r/286172 (https://phabricator.wikimedia.org/T125827) [14:02:23] (03PS3) 10Giuseppe Lavagetto: changeprop: use one request for multiple models [puppet] - 10https://gerrit.wikimedia.org/r/304125 (https://phabricator.wikimedia.org/T142360) (owner: 10Ladsgroup) [14:02:41] (03CR) 10Alexandros Kosiaris: "looks generally ok. Minor comment inline" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/303801 (https://phabricator.wikimedia.org/T142363) (owner: 10Giuseppe Lavagetto) [14:03:08] (03PS1) 10ArielGlenn: more pylint: no more filter, one more staticmethod decorator [dumps] - 10https://gerrit.wikimedia.org/r/304229 [14:04:32] 06Operations, 10Ops-Access-Requests, 13Patch-For-Review: Requesting access to stat1003, stat1002 and fluorine for chelsyx - https://phabricator.wikimedia.org/T142648#2544008 (10akosiaris) p:05Triage>03Normal [14:06:54] (03CR) 10Giuseppe Lavagetto: [C: 032] changeprop: use one request for multiple models [puppet] - 10https://gerrit.wikimedia.org/r/304125 (https://phabricator.wikimedia.org/T142360) (owner: 10Ladsgroup) [14:07:04] (03CR) 10Volans: [C: 031] "LGTM" [puppet] - 10https://gerrit.wikimedia.org/r/286172 (https://phabricator.wikimedia.org/T125827) (owner: 10Jcrespo) [14:07:11] <_joe_> mobrovac: let's go ^^ [14:07:20] kk [14:07:25] _joe_: merged already? [14:07:38] 06Operations, 10Phabricator-Bot-Requests: Creation of bot for Operations - https://phabricator.wikimedia.org/T142362#2544011 (10Volans) @Aklapper >>! In T142362#2543732, @Aklapper wrote: > @volans: Could you drop the info on https://www.mediawiki.org/wiki/Phabricator/Bots#Acquiring_a_bot ? > * Name `ops-m...
[14:08:12] ah it seems you're running puppet there already [14:08:58] (03CR) 10Muehlenhoff: [C: 032 V: 032] Inline firejail profile no longer shipped in firejail 0.9.40 [puppet] - 10https://gerrit.wikimedia.org/r/303553 (https://phabricator.wikimedia.org/T121756) (owner: 10Muehlenhoff) [14:09:04] (03PS4) 10Muehlenhoff: Inline firejail profile no longer shipped in firejail 0.9.40 [puppet] - 10https://gerrit.wikimedia.org/r/303553 (https://phabricator.wikimedia.org/T121756) [14:09:11] (03CR) 10Muehlenhoff: [V: 032] Inline firejail profile no longer shipped in firejail 0.9.40 [puppet] - 10https://gerrit.wikimedia.org/r/303553 (https://phabricator.wikimedia.org/T121756) (owner: 10Muehlenhoff) [14:10:19] 06Operations, 10Analytics-Cluster, 10Graphoid, 06Services, and 2 others: Graphoid access logs are missing from Hadoop - https://phabricator.wikimedia.org/T99372#2544017 (10akosiaris) parsoidcache has been deprecated and graphoid is now exposed via the text cluster. That solves the problem and graphoid logs... [14:10:41] 06Operations, 10Analytics-Cluster, 10Graphoid, 06Services, and 2 others: Graphoid access logs are missing from Hadoop - https://phabricator.wikimedia.org/T99372#2544018 (10akosiaris) 05Open>03Resolved a:03akosiaris [14:10:47] (03CR) 10Jcrespo: [C: 032] Retire db2008 and db2009 as x1 nodes [puppet] - 10https://gerrit.wikimedia.org/r/286172 (https://phabricator.wikimedia.org/T125827) (owner: 10Jcrespo) [14:10:52] <_joe_> mobrovac: puppet ran everywhere [14:10:56] kk thnx _joe_ [14:10:58] will restart [14:11:06] (03PS4) 10Jcrespo: Retire db2008 and db2009 as x1 nodes [puppet] - 10https://gerrit.wikimedia.org/r/286172 (https://phabricator.wikimedia.org/T125827) [14:11:45] 06Operations, 06Performance-Team, 06Reading-Web-Backlog, 10Traffic, 13Patch-For-Review: Vary mobile HTML by connection speed - https://phabricator.wikimedia.org/T119798#2544023 (10akosiaris) [14:15:44] 06Operations, 10Mail: Emails dropping from Greenhouse to Alan - https://phabricator.wikimedia.org/T142427#2544035 (10akosiaris) p:05Triage>03Low @bbogaert Can you provide a little bit more info so I can dig up the logs? Sender/recipient addresses and an (approximate) time would be lovely [14:16:04] 06Operations, 10Analytics, 10Traffic: Correct cache_status field on webrequest dataset - https://phabricator.wikimedia.org/T142410#2544038 (10akosiaris) p:05Triage>03Normal [14:16:45] 06Operations, 10Analytics, 10Traffic: Correct cache_status field on webrequest dataset - https://phabricator.wikimedia.org/T142410#2544040 (10BBlack) 05Open>03Resolved a:03BBlack I think we're done here, assuming the data looks sane on the analytics end. 
[14:18:24] (03CR) 10ArielGlenn: [C: 032] more pylint: no more filter, one more staticmethod decorator [dumps] - 10https://gerrit.wikimedia.org/r/304229 (owner: 10ArielGlenn) [14:20:03] !log disabling puppet on db2001-2009; preparing for decommission [14:20:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [14:23:16] 06Operations, 10hardware-requests: EQIAD: (2) hardware access request for PUPPET - https://phabricator.wikimedia.org/T142218#2544057 (10akosiaris) p:05Triage>03Normal [14:23:30] 06Operations, 10fundraising-tech-ops, 07Security-General: use granularity (g=) restrictions for wikimedia.org fundraising DKIM records - https://phabricator.wikimedia.org/T142205#2544058 (10akosiaris) p:05Triage>03Normal [14:23:55] 06Operations, 10Analytics-Cluster: Migrate titanium to jessie (archiva.wikimedia.org upgrade) - https://phabricator.wikimedia.org/T123725#2544059 (10akosiaris) p:05Triage>03Normal [14:26:07] (03PS1) 10Giuseppe Lavagetto: nutcracker: do not validate nutcracker config before creation [puppet] - 10https://gerrit.wikimedia.org/r/304234 [14:26:13] <_joe_> moritzm: ^^ [14:28:10] Amir1: the new rules for ores are live in prod now [14:28:18] for change-prop [14:28:28] mobrovac: thanks :) [14:30:18] (03PS3) 10Ema: Install package varnish-modules on v4 hosts [puppet] - 10https://gerrit.wikimedia.org/r/304189 (https://phabricator.wikimedia.org/T122881) [14:30:32] (03CR) 10Ema: [C: 032 V: 032] Install package varnish-modules on v4 hosts [puppet] - 10https://gerrit.wikimedia.org/r/304189 (https://phabricator.wikimedia.org/T122881) (owner: 10Ema) [14:31:35] 06Operations, 06Community-Tech, 10MediaWiki-CrossWikiWatchlist, 10hardware-requests, 07Crosswiki: Acquire new hardware for hosting cross-wiki watchlist database - https://phabricator.wikimedia.org/T142538#2544068 (10akosiaris) p:05Triage>03Normal [14:31:53] 06Operations, 10DBA, 07Availability: Setup automatic failover for misc database servers - https://phabricator.wikimedia.org/T141547#2544070 (10akosiaris) p:05Triage>03Normal [14:33:42] 06Operations, 10Wikimedia-Logstash: Add monitoring for detecting when logstash services are down - https://phabricator.wikimedia.org/T141783#2544071 (10akosiaris) p:05Triage>03Normal [14:34:01] (03CR) 10Muehlenhoff: [C: 031] "Thanks, that should fix it" [puppet] - 10https://gerrit.wikimedia.org/r/304234 (owner: 10Giuseppe Lavagetto) [14:34:48] PROBLEM - tilerator on maps1004 is CRITICAL: Connection refused [14:35:09] PROBLEM - tileratorui on maps1004 is CRITICAL: Connection refused [14:36:13] ^sorry, that's me, not fast enough to disable alerts after initial config [14:36:30] (03CR) 10Giuseppe Lavagetto: [C: 032] nutcracker: do not validate nutcracker config before creation [puppet] - 10https://gerrit.wikimedia.org/r/304234 (owner: 10Giuseppe Lavagetto) [14:36:55] (03PS2) 10Giuseppe Lavagetto: nutcracker: do not validate nutcracker config before creation [puppet] - 10https://gerrit.wikimedia.org/r/304234 [14:36:59] PROBLEM - cassandra CQL 10.64.48.154:9042 on maps1004 is CRITICAL: Connection refused [14:37:05] (03CR) 10Giuseppe Lavagetto: [V: 032] nutcracker: do not validate nutcracker config before creation [puppet] - 10https://gerrit.wikimedia.org/r/304234 (owner: 10Giuseppe Lavagetto) [14:37:18] PROBLEM - cassandra service on maps1004 is CRITICAL: CRITICAL - Expecting active but unit cassandra is inactive [14:37:51] damn... I'm not authorized to disable icinga notification for cassandra on maps1004 [14:38:07] I was allowed to disable other notifications...
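The nutcracker patch merged above (Gerrit 304234) addresses a bootstrap ordering problem: if puppet validates the nutcracker config before the file has ever been written, the first run on a fresh host fails. The validation itself is just nutcracker's built-in syntax check; a minimal sketch of the same check run by hand, assuming the Debian default config path:

    # exits non-zero on syntax errors, so it is a safe gate for config swaps --
    # but only once there is an existing file to check
    nutcracker --test-conf --conf-file /etc/nutcracker/nutcracker.yml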
[14:38:33] <_joe_> moritzm: did it work? [14:38:45] puppet run is ongoing, will let you know [14:39:31] all good now, thanks! [14:39:36] <_joe_> cool [14:40:09] RECOVERY - nutcracker port on mw2086 is OK: TCP OK - 0.000 second response time on 127.0.0.1 port 11212 [14:40:38] RECOVERY - nutcracker process on mw2086 is OK: PROCS OK: 1 process with UID = 110 (nutcracker), command name nutcracker [14:40:59] RECOVERY - puppet last run on mw2086 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [14:43:36] gehel: i take it then that you are on/aware of it then... [14:43:47] maps1004 that is [14:44:18] urandom: yes. Initial configuration in progress, nothing to worry about [14:44:27] gehel: kk! [14:44:59] urandom: and I'm fighting with icinga to disable those alerts... it randomly tells me I'm unauthorized [14:45:29] and suddenly Icinga lets me silence those checks... [14:45:30] 06Operations, 10MediaWiki-API, 10Traffic: Evaluate the feasibility of cache invalidation for the action API - https://phabricator.wikimedia.org/T122867#2544097 (10ema) [14:45:33] 06Operations, 10Traffic, 13Patch-For-Review: Install XKey vmod - https://phabricator.wikimedia.org/T122881#2544095 (10ema) 05Open>03Resolved [14:45:36] 06Operations, 10Traffic: Content purges are unreliable - https://phabricator.wikimedia.org/T133821#2544096 (10ema) [14:46:54] (03PS1) 10Muehlenhoff: Readd mw2086 to dsh, all fine [puppet] - 10https://gerrit.wikimedia.org/r/304235 [14:48:24] gehel: perhaps you're actually only randomly authorized [14:48:45] probabilistic alert disabling [14:49:17] * gehel blames urandom for all randomness ever... [14:49:35] Once you eliminate the impossible, whatever remains, no matter how improbable, must be the truth. [14:52:23] (03CR) 10Muehlenhoff: [C: 032] Readd mw2086 to dsh, all fine [puppet] - 10https://gerrit.wikimedia.org/r/304235 (owner: 10Muehlenhoff) [14:53:41] 06Operations, 10ops-codfw: mw2086 was down - https://phabricator.wikimedia.org/T142661#2544119 (10MoritzMuehlenhoff) 05Open>03Resolved a:03MoritzMuehlenhoff Reimaged to jessie and re-added to dsh. [14:54:29] PROBLEM - MediaWiki exceptions and fatals per minute on graphite1001 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [50.0] [14:54:55] 06Operations, 06WMF-Legal, 06WMF-NDA-Requests: ZhouZ needs access to WMF-NDA group - https://phabricator.wikimedia.org/T98722#2544134 (10Qgil) What about a task / email by their manager? It should be simple to verify the manager's username/email. [14:59:32] 06Operations, 10ops-eqiad: Rack/setup sodium (carbon/mirror server replacement) - https://phabricator.wikimedia.org/T139171#2544145 (10Cmjohnson) [15:00:04] anomie, ostriches, thcipriani, hashar, twentyafterfour, and aude: Dear anthropoid, the time has come. Please deploy Morning SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20160811T1500). [15:00:27] nothing to SWAT [15:04:16] 06Operations, 06Community-Tech, 10MediaWiki-CrossWikiWatchlist, 10hardware-requests, 07Crosswiki: Acquire new hardware for hosting cross-wiki watchlist database - https://phabricator.wikimedia.org/T142538#2544157 (10jcrespo) We should block this request on #1, having a prototype running somewhere, then #...
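gehel's intermittent "not authorized" above is a web-UI ACL problem; Icinga also accepts the same actions through its external command pipe, which bypasses the CGI permission check entirely (shell access to the monitoring host and the Debian default command-file path assumed). A sketch of silencing the maps1004 cassandra check that way:

    # format: SCHEDULE_SVC_DOWNTIME;host;service;start;end;fixed;trigger_id;duration;author;comment
    now=$(date +%s)
    printf '[%s] SCHEDULE_SVC_DOWNTIME;maps1004;cassandra service;%s;%s;1;0;3600;gehel;initial config\n' \
        "$now" "$now" "$((now + 3600))" > /var/lib/icinga/rw/icinga.cmd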
[15:04:38] RECOVERY - MediaWiki exceptions and fatals per minute on graphite1001 is OK: OK: Less than 1.00% above the threshold [25.0] [15:10:13] 06Operations, 10ops-eqiad: Rack/setup sodium (carbon/mirror server replacement) - https://phabricator.wikimedia.org/T139171#2544186 (10Cmjohnson) new cables ordered today. [15:11:07] 06Operations, 06Discovery, 06Maps: Icinga is randomly loosing connectivity to maps1002 - https://phabricator.wikimedia.org/T138782#2544189 (10Cmjohnson) [15:11:09] 06Operations, 10ops-eqiad: investigate ores1002 - not in racktables but shows up on switch - https://phabricator.wikimedia.org/T142621#2544187 (10Cmjohnson) 05Open>03Resolved Not sure how it was missed but disks have been wiped. [15:12:21] @elukey is it okay if I swap out the disk slot 8 an1045? [15:14:13] 06Operations, 10ops-eqiad, 10netops: cr2-eqiad temperature alerts ("system warm") - https://phabricator.wikimedia.org/T141898#2544208 (10Cmjohnson) 05Open>03Resolved I adjusted the blanking panels and the fan speed has lowered Fans Top Rear Fan OK Spinning at intermediate-spee... [15:14:41] cmjohnson1: sure! Let me stop hadoop daemons on it first ok [15:14:43] ? [15:15:02] 06Operations, 10hardware-requests: decommission dickson - https://phabricator.wikimedia.org/T120752#2544210 (10Cmjohnson) disks are wiped [15:16:16] 06Operations, 10Traffic: Support TLS chacha20-poly1305 AEAD ciphers - https://phabricator.wikimedia.org/T131908#2544211 (10BBlack) Recording some related thoughts on performance and security: = Server Perf: This is openssl benchmarks, using our 1.0.2+cloudflare package, on our latest-gen cp hardware (this is... [15:17:19] cmjohnson1: you are free to go [15:17:36] great...doing so now [15:17:52] 06Operations, 10ops-codfw, 10DBA: Decom db2001-db2009 - https://phabricator.wikimedia.org/T125827#2544212 (10jcrespo) [15:18:27] 06Operations, 10ops-codfw, 10DBA: Decom db2001-db2009 - https://phabricator.wikimedia.org/T125827#2544216 (10jcrespo) [15:21:32] 06Operations, 10ops-codfw, 10DBA: Decom db2001-db2009 - https://phabricator.wikimedia.org/T125827#2544219 (10jcrespo) @Papaul ,@Robh These servers are ready to go; icinga/puppet/salt-wiped. DNS and tftboot are still active. I set on you the decision of its final destination. [15:22:08] @elukey physical disk replaced [15:25:21] 06Operations, 10ops-codfw, 10DBA: Decom db2001-db2009 - https://phabricator.wikimedia.org/T125827#2544236 (10RobH) a:03mark I'm re-assigning this to @mark for his approval to decommission db2001-db2009. All 9 of these systems had their warranties expire on 2014-11-10. These are old Dell PowerEdge R510 sy... [15:25:41] 06Operations, 10ops-codfw, 10DBA, 10hardware-requests: Decom db2001-db2009 - https://phabricator.wikimedia.org/T125827#2544238 (10RobH) [15:28:00] 06Operations, 10ops-eqiad: Megaraid controller reset due to (what seemsa) a faulty disk on analytics1045 - https://phabricator.wikimedia.org/T141761#2544241 (10Cmjohnson) Replaced disk with on-site spare and ordered a new one Congratulations: Work Order SR934470571 was successfully submitted. [15:30:48] 06Operations, 10Ops-Access-Requests: Access for platonides to chanops - https://phabricator.wikimedia.org/T142668#2544251 (10RobH) 05Open>03Resolved a:03RobH >>! In T142668#2543724, @akosiaris wrote: > FWIW, I've added platonides to @wikimedia-operations chanops. I got no access in the rest of the chann...
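BBlack's T131908 comment above refers to openssl's built-in microbenchmark. To reproduce the flavour of that AES-GCM vs ChaCha20-Poly1305 comparison on a host whose openssl build actually includes the ChaCha ciphers (an assumption; stock 1.0.2 needs the patched package he mentions):

    # per-cipher throughput at various block sizes
    openssl speed -evp aes-256-gcm
    openssl speed -evp chacha20-poly1305
    # and which of the contenders a given build would offer, in order (example cipher string)
    openssl ciphers -V 'ECDHE+CHACHA20:ECDHE+AESGCM'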
[15:32:26] 06Operations, 10ops-eqiad, 06DC-Ops, 10hardware-requests: Decommission pc1001-1003 - https://phabricator.wikimedia.org/T124962#2544261 (10RobH) [15:32:52] 06Operations, 10ops-eqiad: Megaraid controller reset due to (what seemsa) a faulty disk on analytics1045 - https://phabricator.wikimedia.org/T141761#2544262 (10Cmjohnson) 05Open>03Resolved [15:38:05] RECOVERY - mediawiki-installation DSH group on mw2086 is OK: OK [15:40:09] 06Operations, 10hardware-requests: EQIAD: (2) hardware access request for PUPPET - https://phabricator.wikimedia.org/T142218#2544296 (10RobH) [15:42:07] 06Operations, 10ops-eqiad: Decommission strontium - https://phabricator.wikimedia.org/T142722#2544299 (10Cmjohnson) [15:42:34] 06Operations, 10ops-eqiad: Investigate strontium disk issues on 2016-08-05 - https://phabricator.wikimedia.org/T142187#2544315 (10Cmjohnson) [15:42:36] 06Operations, 10ops-eqiad: Decommission strontium - https://phabricator.wikimedia.org/T142722#2544314 (10Cmjohnson) [15:44:06] 06Operations, 10Ops-Access-Requests: Access for platonides to chanops - https://phabricator.wikimedia.org/T142668#2544317 (10Danny_B) >>! In T142668#2544251, @RobH wrote: >>>! In T142668#2543677, @Danny_B wrote: >> Side note: I actually don't think it is necessary to track requests for IRC flags requests in Ph... [15:46:08] 06Operations, 10Ops-Access-Requests: Access for platonides to chanops - https://phabricator.wikimedia.org/T142668#2544323 (10RobH) We are no longer a 5 person operations team, and not filing tasks eliminates oversight. Just my viewpoint. [15:46:35] PROBLEM - MediaWiki exceptions and fatals per minute on graphite1001 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [50.0] [15:47:26] (03PS1) 10Ladsgroup: ores: increase number of workers to 32 / node [puppet] - 10https://gerrit.wikimedia.org/r/304245 (https://phabricator.wikimedia.org/T142361) [15:47:28] (03CR) 10Dzahn: [C: 04-1] "i see a decom ticket at https://phabricator.wikimedia.org/T142722 abandon ?" [dns] - 10https://gerrit.wikimedia.org/r/302757 (owner: 10Dzahn) [15:47:44] (03PS10) 10Giuseppe Lavagetto: puppetmaster: add role for puppetdb [puppet] - 10https://gerrit.wikimedia.org/r/303801 (https://phabricator.wikimedia.org/T142363) [15:51:47] !log disabling puppet on lvs hosts to test https://gerrit.wikimedia.org/r/#/c/297418/ [15:51:53] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [15:52:25] RECOVERY - MediaWiki exceptions and fatals per minute on graphite1001 is OK: OK: Less than 1.00% above the threshold [25.0] [15:52:52] 06Operations, 10Ops-Access-Requests: Access for platonides to chanops - https://phabricator.wikimedia.org/T142668#2542616 (10Dzahn) >>! In T142668#2544317, @Danny_B wrote: > It was always a habit that someone asked on the channel and any of present `+f` people either acted directly Acting directly on access... [15:57:33] cmjohnson1: I don't see any trace of it in the logs, plus megacli shows me the media errors.. Should I do something specific? 
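The change ema is testing here with puppet disabled on the load balancers (Gerrit 297418, merged just below) moves port-80 balancing for the cache clusters to IPVS's source-hash scheduler, so a given client IP keeps hitting the same cache backend. What that amounts to at the ipvsadm level, as a sketch with TEST-NET addresses standing in for the real VIP and backend:

    # virtual service on port 80, scheduled by source hashing (-s sh)
    ipvsadm -A -t 192.0.2.10:80 -s sh
    # one example backend in direct-routing mode (-g)
    ipvsadm -a -t 192.0.2.10:80 -r 192.0.2.101:80 -g -w 10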
[15:57:34] (03CR) 10Ema: [C: 032] lvs: switch port 80 loadbalancing to sh for cache_ services [puppet] - 10https://gerrit.wikimedia.org/r/297418 (https://phabricator.wikimedia.org/T108827) (owner: 10Ema) [15:57:44] (03PS3) 10Ema: lvs: switch port 80 loadbalancing to sh for cache_ services [puppet] - 10https://gerrit.wikimedia.org/r/297418 (https://phabricator.wikimedia.org/T108827) [15:57:48] (03CR) 10Ema: [V: 032] lvs: switch port 80 loadbalancing to sh for cache_ services [puppet] - 10https://gerrit.wikimedia.org/r/297418 (https://phabricator.wikimedia.org/T108827) (owner: 10Ema) [15:58:00] I mean, I can still see all the partitions filled up with data [15:58:15] I expected either one empty or something in the dmesg telling me to partition [15:58:16] PROBLEM - MediaWiki exceptions and fatals per minute on graphite1001 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [50.0] [16:00:04] godog, moritzm, and _joe_: Dear anthropoid, the time has come. Please deploy Puppet SWAT(Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20160811T1600). [16:00:05] tgr, bd808, and Pchelolo: A patch you scheduled for Puppet SWAT(Max 8 patches) is about to be deployed. Please be available during the process. [16:00:26] RECOVERY - MediaWiki exceptions and fatals per minute on graphite1001 is OK: OK: Less than 1.00% above the threshold [25.0] [16:00:50] elukey yeah, you have to add the disk back to your raid...i just did that with megacli -CfgEachDskRaid0 WB RA Direct CachedBadBBU -a0 [16:01:59] taking notes :) [16:02:26] ah now it is better [16:03:05] puppet swat! [16:03:16] yeah, that should just work, if not there is usually preserved cache that needs to be cleared or a foreign cfg [16:04:48] I can see the new disk popping up in dmesg [16:05:00] so probably I'd need only to add the partition [16:05:01] 06Operations, 10Traffic: Support TLS chacha20-poly1305 AEAD ciphers - https://phabricator.wikimedia.org/T131908#2544363 (10faidon) >>! In T131908#2544211, @BBlack wrote: > So for now, I recommend that we keep letting client preference make the call on this and see how things play out over the long run. Yeah,... [16:05:40] (03CR) 10Addshore: "Yes!" [puppet] - 10https://gerrit.wikimedia.org/r/302119 (https://phabricator.wikimedia.org/T141636) (owner: 10Addshore) [16:07:08] 06Operations, 06Release-Engineering-Team, 05Gitblit-Deprecate: Clones from git.wikimedia.org are not redirected - https://phabricator.wikimedia.org/T139206#2422729 (10jayvdb) I just ran into this, via a sysadmin who was previously using git branches to keep his wiki extensions stable. Has there been a #sysad... [16:08:16] (03PS2) 10Dzahn: Fix bad owner check [puppet] - 10https://gerrit.wikimedia.org/r/304223 (https://phabricator.wikimedia.org/T127093) (owner: 10Thcipriani) [16:10:43] (03CR) 10Dzahn: [C: 032] Fix bad owner check [puppet] - 10https://gerrit.wikimedia.org/r/304223 (https://phabricator.wikimedia.org/T127093) (owner: 10Thcipriani) [16:16:08] cmjohnson1: all done, thanks! [16:16:58] (03PS1) 10Subramanya Sastry: Bump heap limits for Parsoid from 600 mb -> 800 mb [puppet] - 10https://gerrit.wikimedia.org/r/304253 [16:18:08] (03CR) 10Subramanya Sastry: "https://gerrit.wikimedia.org/r/#/c/304251/ is the parsoid side patch for bumping other resource limits."
[puppet] - 10https://gerrit.wikimedia.org/r/304253 (owner: 10Subramanya Sastry) [16:18:59] (03CR) 10Dzahn: [C: 032] ores: increase number of workers to 32 / node [puppet] - 10https://gerrit.wikimedia.org/r/304245 (https://phabricator.wikimedia.org/T142361) (owner: 10Ladsgroup) [16:19:08] (03PS2) 10Dzahn: ores: increase number of workers to 32 / node [puppet] - 10https://gerrit.wikimedia.org/r/304245 (https://phabricator.wikimedia.org/T142361) (owner: 10Ladsgroup) [16:19:32] 06Operations, 06Community-Tech, 10MediaWiki-CrossWikiWatchlist, 10hardware-requests, 07Crosswiki: Acquire new hardware for hosting cross-wiki watchlist database - https://phabricator.wikimedia.org/T142538#2544406 (10RobH) Also the request doesn't note which kind of DB config. We've recently ordered 1U S... [16:20:36] RECOVERY - Improperly owned -0:0- files in /srv/mediawiki-staging on mira is OK: Files ownership is ok. [16:20:38] 06Operations, 06Discovery, 06Maps: Icinga is randomly loosing connectivity to maps1002 - https://phabricator.wikimedia.org/T138782#2544407 (10Gehel) Thanks @Cmjohnson ! I'll check again alert history to make sure no new issue are seen. And I'll close this issue if all seems OK. [16:21:15] (03CR) 10Dzahn: "thanks! 09:26 < icinga-wm> RECOVERY - Improperly owned -0:0- files in /srv/mediawiki-staging on mira is OK: Files ownership is ok." [puppet] - 10https://gerrit.wikimedia.org/r/304223 (https://phabricator.wikimedia.org/T127093) (owner: 10Thcipriani) [16:21:39] 06Operations, 10Deployment-Systems, 13Patch-For-Review: error on tin:/srv/mediawiki-staging: insufficient permission for adding an object to repository database .git/objects - https://phabricator.wikimedia.org/T127093#2544408 (10Dzahn) 09:26 < icinga-wm> RECOVERY - Improperly owned -0:0- files in /srv/mediaw... [16:25:14] thanks mutante :) [16:25:57] (03CR) 10Ladsgroup: "recheck" [puppet] - 10https://gerrit.wikimedia.org/r/304245 (https://phabricator.wikimedia.org/T142361) (owner: 10Ladsgroup) [16:26:22] why it's not merging :D [16:26:55] Amir1: because it needed Verified. it is now [16:28:28] yeah [16:28:29] thanks! [16:34:01] (03CR) 10Mobrovac: [C: 031] "LGTM" [puppet] - 10https://gerrit.wikimedia.org/r/304253 (owner: 10Subramanya Sastry) [16:35:03] subbu: it's puppet-swat time now, perhaps get ^^ in? [16:35:38] oh wait, moritzm, _joe_: is puppetswat happening? [16:36:10] !log applying https://gerrit.wikimedia.org/r/#/c/297418/ to esams load balancers [16:36:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [16:36:52] mobrovac, sure, wfm. [16:37:06] subbu: just add it to https://wikitech.wikimedia.org/wiki/Deployments#Thursday.2C.C2.A0August.C2.A011 [16:37:28] but it looks like there's no opsens around for puppetswat [16:37:30] hm [16:38:16] RECOVERY - Improperly owned -0:0- files in /srv/mediawiki-staging on tin is OK: Files ownership is ok. [16:40:06] 06Operations: mw2086 & mw2087 do not respond to IPMI commands - https://phabricator.wikimedia.org/T142726#2544453 (10RobH) [16:40:56] icinga-wm: thank you [16:41:02] lol [16:41:14] mobrovac: it's been a long confusing road on that one :) [16:41:22] :) [16:43:10] 06Operations, 10Deployment-Systems, 13Patch-For-Review: error on tin:/srv/mediawiki-staging: insufficient permission for adding an object to repository database .git/objects - https://phabricator.wikimedia.org/T127093#2544477 (10Dzahn) and also fixed on tin: 09:44 < icinga-wm> RECOVERY - Improperly owned -0... 
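For context on the recovering check above: "Improperly owned -0:0- files" means files owned by uid/gid 0 (root) under /srv/mediawiki-staging, which breaks later deploys by the deployment user. A sketch of the kind of test the check performs (the exact icinga plugin may differ):

    # list anything under the staging tree owned by root (uid 0 or gid 0)
    find /srv/mediawiki-staging \( -uid 0 -o -gid 0 \) -ls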
[16:43:17] 06Operations, 10Deployment-Systems, 13Patch-For-Review: error on tin:/srv/mediawiki-staging: insufficient permission for adding an object to repository database .git/objects - https://phabricator.wikimedia.org/T127093#2544478 (10Dzahn) 05Open>03Resolved [16:43:31] 06Operations, 10Deployment-Systems: error on tin:/srv/mediawiki-staging: insufficient permission for adding an object to repository database .git/objects - https://phabricator.wikimedia.org/T127093#2032127 (10Dzahn) [16:43:38] 06Operations: mw2086 & mw2087 do not respond to IPMI commands - https://phabricator.wikimedia.org/T142726#2544480 (10RobH) Additionally, I wasn't aware this reimage script used IPMI. We had to disable the IPMI on the ilom interfaces across the Dell fleet over a year ago due to some security flaw. (My memory ma... [16:45:58] mobrovac, added it in any case. [16:46:11] kk [16:46:38] otherwise, we'll have to wait for monday [16:48:52] well, with 10 mins left, i'd say no puppetswat today [16:50:06] 06Operations, 06Discovery, 06Discovery-Search-Backlog, 10Elasticsearch: Reclaim nobelium - https://phabricator.wikimedia.org/T142581#2544507 (10debt) p:05Triage>03Normal [16:50:13] (03PS2) 10Joal: Update camus job to use new check_jar [puppet] - 10https://gerrit.wikimedia.org/r/304195 (https://phabricator.wikimedia.org/T142717) [16:50:23] mobrovac: I don't have time today (the owner of puppet swat just gets copied around weekly) [16:50:23] 06Operations, 06Discovery, 10Elasticsearch, 03Discovery-Search-Sprint: Reclaim nobelium - https://phabricator.wikimedia.org/T142581#2540089 (10debt) [16:51:10] mobrovac, sounds good. [16:52:09] moritzm: do y'all (ops) determine who's on point for it each week during your team meeting? would be good to update the deploy calendar if you do :) [16:55:13] if you mean ops clinic duty from the /title, we don't plan it very far in advance [16:55:27] (e.g. 1-2 weeks out) [16:55:50] s/title/topic/ :) [16:56:05] oh puppetswat is separate I think [16:56:49] oh [16:56:57] seems it was resolved [16:57:31] bblack: yeah, that (puppetswat) [16:57:33] puppetswat used to be organised weekly like clinic duty, but at some point the process changed. I've made an agenda topic for the offsite, the current process is working poorly [16:57:46] moritzm: thanks :) [16:58:05] is the new process meant to be clinic==puppetswat, or? [16:58:16] thanks akosiaris [16:58:23] that might be smart (clinic == puppetswat) [16:58:30] if the timing works for them [16:58:39] but I'll leave it to you all [16:58:46] not my baby :) [16:58:59] no, I think it was meant to be a collective thing where everyone chimes in, but for practical purposes it was always Joe, myself and Filippo who ended up there :-) [16:59:34] but I'm not sure, I was out the week when the new process was discussed [17:00:05] yurik, gwicke, cscott, arlolra, subbu, halfak, and Amir1: Respected human, time to deploy Services – Graphoid / Parsoid / OCG / Citoid / ORES (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20160811T1700). Please do the needful.
[17:00:13] !log applying https://gerrit.wikimedia.org/r/#/c/297418/ to ulsfo load balancers [17:00:18] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [17:00:20] no deployment for ores today [17:00:32] we have a simple restart though [17:02:21] !log restarting celery-ores-worker service in scb[12]00[12] T142361 [17:02:25] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [17:02:57] T142361: Increase web and worker processes in production - https://phabricator.wikimedia.org/T142361 [17:03:26] 06Operations, 10Ops-Access-Requests: Access for platonides to chanops - https://phabricator.wikimedia.org/T142668#2544583 (10Danny_B) OK, let me put it this way: It was //never// a rule nor even a habit for the entire existence of Wikimedia channels. If you guys want to set up new practices, feel free to do so... [17:05:20] restarts are done now [17:09:37] Platonides: why'd you ban ottomata ? [17:09:39] 06Operations, 10Ops-Access-Requests: Access for platonides to chanops - https://phabricator.wikimedia.org/T142668#2544606 (10Platonides) The irc is separate from the wiki. As I understand, RobH wants that, //for #wikimedia-operations// the requests are tracked on phabricator. That's up to #wikimedia-operation... [17:09:51] sorry? [17:09:58] 16:55 ChanServ sets +o Platonides 16:55 Platonides sets +b *!*@172.56.10.238 16:55 <<< ottomata1 [ottomata1] kicked by Platonides [17:10:11] kickban [17:10:12] was it the real ottomata? [17:10:20] why wasn't it? [17:10:25] and yes, he is saying he can't join [17:10:29] omg [17:10:37] I'm chatting with him right now [17:10:55] trying to see if it was that he is on a shared IP like IRCCloud [17:11:00] so why was that IP banned? [17:11:08] a real wikimedia user logging in without a cloak? :) [17:11:19] mata = "kill" [17:11:25] so it is in the script blacklist [17:11:31] ottomata nick is whitelisted [17:11:37] (03PS1) 10Thcipriani: Labs: Shinken alert for beta error rate [puppet] - 10https://gerrit.wikimedia.org/r/304263 (https://phabricator.wikimedia.org/T141785) [17:11:39] and the cloak would have exempted him as well [17:11:47] some irc troll? [17:11:53] ahh [17:12:15] 06Operations, 10MediaWiki-Database, 06Performance-Team, 05MW-1.28-release-notes, 05WMF-deploy-2016-08-16_(1.28.0-wmf.15): periodic spike of MW exceptions "DB connection was already closed or the connection dropped." - https://phabricator.wikimedia.org/T142079#2544628 (10aaron) 05Open>03Resolved Closi... [17:12:18] the troll is currently using similar nicks [17:12:33] so I wouldn't blindly assume that ottomata1 is the real one [17:12:52] but it seems I owe him an apology [17:13:15] i told him it was fixed [17:13:15] thx for fixing [17:13:47] that was quick abusing the @, lol [17:14:05] ok now everyone stop talking shit about ottomata he rejoined ;D [17:14:12] har ahr [17:14:13] :) [17:14:52] sorry ottomata [17:15:10] bblack: how much vcl would it be to activate the cookie added in https://gerrit.wikimedia.org/r/#/c/260797/1/includes/MediaWiki.php (basically to bypass cache the same as session/token cookies)?
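As an annotation on AaronSchulz's question: the amount of VCL involved is small. A sketch with a hypothetical cookie name standing in for whatever the linked core change actually sets, modeled on how session/token cookies already force a cache pass:

    sub vcl_recv {
        # "BypassCache" is a made-up name; the real cookie would come from the core patch
        if (req.http.Cookie ~ "(^|; *)BypassCache=") {
            return (pass);
        }
    }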
[17:15:11] 17:08 <@Platonides> ottomata nick is whitelisted [17:15:17] 17:09 <@Platonides> and the cloak would have exempted him as well [17:15:30] you came here in disguise :) [17:16:39] !log applying https://gerrit.wikimedia.org/r/#/c/297418/ to codfw load balancers [17:16:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [17:17:53] (03CR) 10Thcipriani: "Setup graphite-labs.wikimedia.org as a host to mirror what is done in production in the monitoring::graphite definition. Couldn't find tha" [puppet] - 10https://gerrit.wikimedia.org/r/304263 (https://phabricator.wikimedia.org/T141785) (owner: 10Thcipriani) [17:20:06] (03PS1) 10Catrope: Re-enable thank-you-edit notifications [mediawiki-config] - 10https://gerrit.wikimedia.org/r/304267 (https://phabricator.wikimedia.org/T128249) [17:26:01] wow, good writeup on TLS, bblack [17:26:28] I didn't know that djb opinion btw [17:27:06] Platonides: my internet was being really flaky and kept reconnecting me [17:27:10] so i got a new nick [17:27:24] and then I banned you xD [17:30:27] (03PS1) 10Catrope: Re-enable the Echo footer notice [mediawiki-config] - 10https://gerrit.wikimedia.org/r/304271 (https://phabricator.wikimedia.org/T141414) [17:33:50] (03CR) 10Luke081515: [C: 031] Re-enable the Echo footer notice [mediawiki-config] - 10https://gerrit.wikimedia.org/r/304271 (https://phabricator.wikimedia.org/T141414) (owner: 10Catrope) [17:37:11] !log applying https://gerrit.wikimedia.org/r/#/c/297418/ to eqiad load balancers [17:37:14] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [17:38:45] 06Operations, 10Ops-Access-Requests: Access for platonides to chanops - https://phabricator.wikimedia.org/T142668#2544741 (10Danny_B) >>! In T142668#2544606, @Platonides wrote: > The irc is separate from the wiki. Of course (and similarly separate from Phabricator). And has its own habits, where requesting the... [17:48:59] !log switched LVS schedulers for text, upload, maps and misc port 80 to source hash scheduling T108827 [17:49:00] T108827: Investigate TCP Fast Open for tlsproxy - https://phabricator.wikimedia.org/T108827 [17:49:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [17:53:41] 06Operations, 10Datasets-General-or-Unknown: investigate rsync between dcs with encryption - https://phabricator.wikimedia.org/T123560#2544772 (10ArielGlenn) p:05Triage>03Normal [18:07:39] twentyafterfour: a CentralNotice update is on today's train... (merged into our wmf_deploy branch yesterday, then automatically submodule-updated...). Could you tell me more or less the procedure for testing? For SWAT I know it goes to mw1017 first these days... [18:09:29] * AndyRussG reads some doc....... [18:13:02] Hmmm looks like not... [18:16:59] 06Operations, 10Ops-Access-Requests: Access for platonides to chanops - https://phabricator.wikimedia.org/T142668#2544901 (10Dzahn) What's the problem here really? Everybody was cool with adding platonides and get got added within 24 hours or something and it's resolved. [18:18:54] !log restarted hhvm on mw1017 to catch up with hhvm-wikidiff2 upgrade [18:18:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [18:19:03] 06Operations, 06WMF-Legal, 06WMF-NDA-Requests: ZhouZ needs access to WMF-NDA group - https://phabricator.wikimedia.org/T98722#2544919 (10ZhouZ) > What about a task / email by their manager? It should be simple to verify the manager's username/email.
Yep that could work (I assume this means there's something... [18:23:54] (03PS1) 10ArielGlenn: check for empty output files, not just truncated ones [dumps] - 10https://gerrit.wikimedia.org/r/304281 [18:27:53] (03PS2) 10ArielGlenn: check for empty output files, not just truncated ones [dumps] - 10https://gerrit.wikimedia.org/r/304281 [18:30:22] (03CR) 10Arlolra: [C: 031] Bump heap limits for Parsoid from 600 mb -> 800 mb [puppet] - 10https://gerrit.wikimedia.org/r/304253 (owner: 10Subramanya Sastry) [18:38:29] * robh is cleaning up the flags list and moving the bans from it to the ban list, ignore the scroll [18:39:39] (03CR) 10ArielGlenn: [C: 032] check for empty output files, not just truncated ones [dumps] - 10https://gerrit.wikimedia.org/r/304281 (owner: 10ArielGlenn) [18:57:59] (03CR) 10Alex Monk: "I don't think we ever did that with labmon, but... should graphite-labs really be something monitored for deployment-prep?" [puppet] - 10https://gerrit.wikimedia.org/r/304263 (https://phabricator.wikimedia.org/T141785) (owner: 10Thcipriani) [19:00:04] twentyafterfour: Dear anthropoid, the time has come. Please deploy MediaWiki train (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20160811T1900). [19:02:10] (03PS1) 10Merlijn van Deen: toollabs: re-enable redis collector [puppet] - 10https://gerrit.wikimedia.org/r/304295 (https://phabricator.wikimedia.org/T142735) [19:03:06] (03PS2) 10Merlijn van Deen: toollabs: re-enable redis collector [puppet] - 10https://gerrit.wikimedia.org/r/304295 (https://phabricator.wikimedia.org/T142735) [19:04:31] (03CR) 10Thcipriani: "> should graphite-labs really be something monitored for deployment-prep?" [puppet] - 10https://gerrit.wikimedia.org/r/304263 (https://phabricator.wikimedia.org/T141785) (owner: 10Thcipriani) [19:05:58] (03PS3) 10Yuvipanda: toollabs: re-enable redis collector [puppet] - 10https://gerrit.wikimedia.org/r/304295 (https://phabricator.wikimedia.org/T142735) (owner: 10Merlijn van Deen) [19:06:03] (03CR) 10Yuvipanda: [C: 032 V: 032] toollabs: re-enable redis collector [puppet] - 10https://gerrit.wikimedia.org/r/304295 (https://phabricator.wikimedia.org/T142735) (owner: 10Merlijn van Deen) [19:12:52] 06Operations, 10MediaWiki-Cache, 06Performance-Team, 10Traffic, and 2 others: Cached outdated revisions served to logged-out users - https://phabricator.wikimedia.org/T141687#2545091 (10Krinkle) [19:15:43] 06Operations, 06Performance-Team: nf_conntrack: table full errors on Eqiad Job Runners - https://phabricator.wikimedia.org/T130364#2545111 (10Krinkle) [19:16:16] twentyafterfour: I guess I should just watch log messages here to know the train status? [19:16:33] AndyRussG: I am deploying the train now [19:16:49] starting scap in a couple of minutes [19:17:08] Ah OK :) [19:18:13] (03PS1) 10Merlijn van Deen: toollabs: require redis::client::python instead of redis::instance [puppet] - 10https://gerrit.wikimedia.org/r/304298 [19:19:43] to respond to your earlier question, it follows essentially the same process as SWAT [19:20:04] Ah OK [19:20:16] So is it on mw1017? [19:20:37] Sorry if this is a silly question... [19:21:07] twentyafterfour: ^ [19:21:11] I'm actually not sure now that I think about it. It does automated testing on canaries but maybe not 1017 [19:21:34] Hmm interesting! [19:21:44] AndyRussG: if your change is already merged then it should be on group1 wikis, can you test there? [19:21:58] K [19:22:17] or did the change get merged but not deployed? [19:22:43] can you link me to the patch in gerrit? 
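On the testing question above: pinning a single request to a debug app server is done with the X-Wikimedia-Debug request header rather than by browsing normally. A sketch, assuming the backend-selection form of the header that mw1017/mw1099 honored at the time:

    # fetch Special:Version from the debug backend instead of the normal pool
    curl -s -H 'X-Wikimedia-Debug: backend=mw1099.eqiad.wmnet' \
        https://test.wikipedia.org/wiki/Special:Version | grep -i -m1 CentralNotice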
[19:23:24] twentyafterfour: The timestamp on the core submodule update is Wed Aug 10 19:03:17 ... one sec [19:23:36] So it wouldn't have gone out with yesterday's train or the day before [19:23:52] I was assuming today's train would put it out everywhere... [19:24:08] with the train I don't sync the branch I just update the version number so changes like this have to go through swat to go live [19:24:27] twentyafterfour: ahh OK hmmm [19:24:34] Yeah git commit ba31abe048d6f10a0efef6336a6f77e63d290c19 in core [19:24:45] I can deploy the change separately from the train, so you don't need to wait for a swat window [19:25:25] * twentyafterfour can't find it in gerrit [19:25:50] uh sure! that'd be fine if it's OK on the deploy policy side [19:26:21] We +2'd a merge of CentralNotice master into CentralNotice's wmf_deploy branch [19:26:32] Then I think something automatically updated core [19:26:49] Working out the best procedure for this is something we're working on [19:30:27] twentyafterfour: yeah I don't see anything in Gerrit for the core merge. It's in history, here's the github link: https://github.com/wikimedia/mediawiki/commit/ba31abe048d6f10a0efef6336a6f77e63d290c19 [19:32:10] So submodule updates would go out on the Tuesday train, but not on Wednesday or Thursday? [19:32:54] I thought once upon a time we could just let things ride the train if they were merged in core (as submodule updates) but maybe those times are long gone.... [19:33:06] AndyRussG: So the thing is that when a change merges to the branch, it doesn't automatically update the code deployed on production. If it doesn't go through swat it's easy to miss it and it'll just sit there undeployed [19:33:25] if they merge before tuesday then it does "ride the train" [19:33:31] (03CR) 10Rush: [V: 032] toollabs: require redis::client::python instead of redis::instance [puppet] - 10https://gerrit.wikimedia.org/r/304298 (owner: 10Merlijn van Deen) [19:33:32] 06Operations, 10Traffic: Support TLS chacha20-poly1305 AEAD ciphers - https://phabricator.wikimedia.org/T131908#2545144 (10BBlack) > It's also interesting to make some observations about the AES 128-vs-256 debate while we're here... The above was mostly a side-note, and was drawn from memory of reading over... [19:33:32] otherwise no [19:33:46] OK gotcha... Thx for the clarification...... [19:34:15] (03CR) 10Rush: [C: 032] toollabs: require redis::client::python instead of redis::instance [puppet] - 10https://gerrit.wikimedia.org/r/304298 (owner: 10Merlijn van Deen) [19:34:56] Yeah I guess that's the way it's been for a while, I was just looking at things sideways because I haven't "trained" things since the faster deploy cadence [19:37:37] twentyafterfour: regarding this specific update, it's also fine to put it on the evening SWAT. I can imagine that could be more appropriate... Don't see any patches in there right now in fact [19:38:03] Now-ish is also great though :) [19:38:15] (I mean, during this window... no rush at all!!) [19:41:19] AndyRussG: I'm going to sync it now [19:43:44] AndyRussG: once this sync completes, please test on group1 before I sync to group2 [19:44:21] !log twentyafterfour@tin Synchronized php-1.28.0-wmf.14/extensions/CentralNotice: deploy https://gerrit.wikimedia.org/r/#/c/304130/ (duration: 00m 58s) [19:44:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [19:44:55] AndyRussG: done, please test and let me know if it's good to go with group2 [19:46:16] twentyafterfour: K one sec!
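The out-of-band deploy logged above boils down to roughly this on the deploy host. A sketch: the sync command and message match the SAL entry, while the git steps are an assumption about how the staging copy gets brought up to date:

    # on tin: update the branch checkout and its CentralNotice submodule pointer
    cd /srv/mediawiki-staging
    git -C php-1.28.0-wmf.14 pull
    git -C php-1.28.0-wmf.14 submodule update extensions/CentralNotice
    # push just that directory to the cluster; the sync is !logged automatically
    sync-dir php-1.28.0-wmf.14/extensions/CentralNotice 'deploy https://gerrit.wikimedia.org/r/#/c/304130/'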
So it's on groups 0 and 1 now? [19:46:21] yes [19:46:48] K checking [19:49:19] twentyafterfour: basic smoke testing seems fine! [19:49:45] i.e., banners showing and the sites are up ;) [19:50:11] (This was tested quite a lot on the beta cluster, so, not a surprise... ;) ) [19:50:39] Mmm lemme check logstash [19:53:18] Hmm new interface..... [19:55:51] twentyafterfour: I don't see any CentralNotice errors anywhere but I'm not sure I'm looking in the right place [19:56:05] But as far as I can tell, it looks great :) [19:56:16] Are you tailing some logs somewhere? [19:56:57] twentyafterfour, still deploying? [19:59:54] yurik: yes I am about to deploy to group2 now [20:00:07] AndyRussG: I watch fatalmonitor in logstash [20:00:12] gehel, i will deploy kartotherian once twentyafterfour is done. twentyafterfour please ping me :) [20:00:15] twentyafterfour: if you don't see anything untoward like a spike in errors or error responses pls go ahead [20:00:47] yurik: ok [20:01:04] gehel, what's the status with 1002? I cannot connect to it [20:01:42] gehel: i get a weird "password: " [20:01:44] yurik: not yet installed, will do it just after dinner [20:02:08] (03PS1) 1020after4: group2 wikis to 1.28.0-wmf.14 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/304315 [20:02:22] (03CR) 1020after4: [C: 032] group2 wikis to 1.28.0-wmf.14 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/304315 (owner: 1020after4) [20:02:47] (03Merged) 10jenkins-bot: group2 wikis to 1.28.0-wmf.14 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/304315 (owner: 1020after4) [20:04:32] !log twentyafterfour@tin rebuilt wikiversions.php and synchronized wikiversions files: group2 wikis to 1.28.0-wmf.14 [20:04:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [20:05:02] train :) [20:06:40] !log tools restart rabbitmq-server on labcontrol1001 [20:06:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [20:06:56] AndyRussG: I see "Banner::save: Automatic transaction with writes in progress (from DatabaseBase::query (DatabaseBase::selectRow)), performing implicit commit!" - is that something to worry about? [20:08:33] twentyafterfour: hmmm, when is that from? 
[20:08:38] !log restart nova-compute on labvirt1010 [20:08:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [20:09:59] twentyafterfour: one of the patches we just deployed actually is expected to remove DB related warnings [20:10:09] 54a9b5a7844e63e901bd1f7f85f4bdb67bab7d09 [20:10:23] Convert callers to startAtomic/endAtomic [20:10:24] This avoids warnings about committing implicit transactions prematurely [20:11:01] ^ from the commit message [20:11:06] AaronSchulz: ^ [20:11:59] AndyRussG: maybe the error was from prior to the sync [20:12:23] twentyafterfour: yeah that would be good to hear if it were [20:16:21] twentyafterfour: on fatalmonitor I see this: [20:16:23] https://logstash.wikimedia.org/app/kibana#/dashboard/Fatal-Monitor?_g=(refreshInterval:(display:Off,pause:!f,value:0),time:(from:now-1h,mode:quick,to:now))&_a=(filters:!((%27$state%27:(store:appState),bool:(must:!((terms:(level:!(NOTICE,INFO,WARNING))),(term:(type:mediawiki)))),meta:(alias:!n,disabled:!f,index:%27logstash-*%27,key:bool,negate:!t,value:%27%7B%22must%22:%5B%7B%22terms%22:%7B%22lev [20:16:25] el%22:%5B%22NOTICE%22,%22INFO%22,%22WARNING%22%5D%7D%7D,%7B%22term%22:%7B%22type%22:%22mediawiki%22%7D%7D%5D%7D%27)),(%27$state%27:(store:appState),meta:(alias:!n,disabled:!f,index:%27logstash-*%27,key:message,negate:!t,value:SlowTimer),query:(match:(message:(query:SlowTimer,type:phrase)))),(%27$state%27:(store:appState),meta:(alias:!n,disabled:!f,index:%27logstash-*%27,key:message,negate:!t,val [20:16:27] ue:%27Invalid%20host%20name%27),query:(match:(message:(query:%27Invalid%20host%20name%27,type:phrase))))),options:(darkTheme:!f),panels:!((col:1,id:Top-20-Hosts,panelIndex:2,row:3,size_x:9,size_y:2,type:visualization),(col:1,columns:!(type,level,wiki,host,message),id:Default-Events-List,panelIndex:3,row:10,size_x:12,size_y:23,sort:!(%27@timestamp%27,desc),type:search),(col:1,id:Fatal-Events-Over [20:16:30] -Time,panelIndex:4,row:1,size_x:12,size_y:2,type:visualization),(col:1,id:Trending-Messages,panelIndex:5,row:5,size_x:12,size_y:5,type:visualization),(col:10,id:MediaWiki-Versions,panelIndex:6,row:3,size_x:3,size_y:2,type:visualization),(col:1,id:All-Events,panelIndex:7,row:33,size_x:3,size_y:2,type:visualization),(col:4,id:All-Events,panelIndex:8,row:33,size_x:3,size_y:2,type:visualization)),qu [20:16:31] ery:(query_string:(analyze_wildcard:!t,query:%27(type:mediawiki%20AND%20(channel:exception%20OR%20channel:wfLogDBError))%20OR%20type:hhvm%20AND%20message:CentralNotice%27)),title:%27Fatal%20Monitor%27,uiState:(P-2:(spy:(mode:(fill:!f,name:!n)),vis:(legendOpen:!f)),P-4:(spy:(mode:(fill:!f,name:!n)),vis:(colors:(exception:%23C15C17,hhvm:%23BF1B00))),P-6:(spy:(mode:(fill:!f,name:!n)),vis:(legendOpe [20:16:33] n:!t)))) [20:16:35] Aaaarg [20:16:45] Fatal error: Uncaught exception 'Exception' with message '/srv/mediawiki/php-1.28.0-wmf.14/extensions/CentralNotice/extension.json does not exist!' 
in /srv/mediawiki/php-1.28.0-wmf.14/includes/registration/ExtensionRegistry.php:106 Stack trace: #0 /srv/me [20:17:09] Around 19:44 UTC [20:17:29] (03PS10) 10BryanDavis: [WIP] Provision Striker via scap3 [puppet] - 10https://gerrit.wikimedia.org/r/301505 (https://phabricator.wikimedia.org/T141014) [20:17:32] What we just deployed is indeed extension registration for CN [20:17:49] (03PS9) 10Alex Monk: [WIP/POC/POS] Add python version of maintain-replicas script [software] - 10https://gerrit.wikimedia.org/r/295607 (https://phabricator.wikimedia.org/T138450) [20:17:53] Maybe it's due to lack of sync of files during the exact deploy? [20:17:56] maybe sync timing? [20:18:00] heh [20:18:06] heh, those URLs [20:18:18] Aaarg yeah sorry about that [20:18:30] surprised that didn't trigger a flood alert in -ops [20:18:30] AndyRussG: Fatal error: Uncaught exception 'Exception' with message '/srv/mediawiki/php-1.28.0-wmf.14/extensions/CentralNotice/extension.json does not exist!' in /srv/mediawiki/php-1.28.0-wmf.14/includes/registration/ExtensionRegistry.php:106 [20:18:49] yeah [20:18:53] sorry I see you already found it [20:20:19] Hmm yeah finally figured out the search box is where u filter.... [20:20:34] The DB one is just a warning, no? [20:21:11] yeah [20:21:16] and it seems to be from before the sync [20:21:25] the fatal exception only occurred once [20:21:27] I think we are good [20:21:34] Yeah me too! [20:22:30] cool [20:24:35] (03PS10) 10Alex Monk: [WIP/POC/POS] Add python version of maintain-replicas script [software] - 10https://gerrit.wikimedia.org/r/295607 (https://phabricator.wikimedia.org/T138450) [20:25:31] twentyafterfour: yeah all looks great... I'll keep watching fatal monitor, then.... thx so much for ur help!!!! :D [20:26:35] AndyRussG: you're welcome [20:26:44] :) [20:33:36] twentyafterfour: out of curiosity, how on logstash did u find the CN DB-related error? [20:39:33] (03CR) 10Madhuvishy: WIP labstore nfs: nfs client mount manager (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/304070 (https://phabricator.wikimedia.org/T140483) (owner: 10Rush) [20:40:31] (03PS11) 10BryanDavis: [WIP] Provision Striker via scap3 [puppet] - 10https://gerrit.wikimedia.org/r/301505 (https://phabricator.wikimedia.org/T141014) [20:40:47] 06Operations, 06Community-Tech, 10MediaWiki-CrossWikiWatchlist, 10hardware-requests, 07Crosswiki: Acquire new hardware for hosting cross-wiki watchlist database - https://phabricator.wikimedia.org/T142538#2545360 (10kaldari) [20:44:50] 06Operations, 06Community-Tech, 10MediaWiki-CrossWikiWatchlist, 10hardware-requests, 07Crosswiki: Acquire new hardware for hosting cross-wiki watchlist database - https://phabricator.wikimedia.org/T142538#2545374 (10kaldari) I added the requested fields to the task description. Unfortunately, most of tha...
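For anyone who wants the gist of that Fatal-Monitor dashboard link without picking apart the URL: URL-decoded, its query_string filter is just

    (type:mediawiki AND (channel:exception OR channel:wfLogDBError)) OR type:hhvm AND message:CentralNotice

with NOTICE/INFO/WARNING log levels and messages matching SlowTimer or "Invalid host name" negated.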
[20:44:54] 06Operations, 06Commons, 06Multimedia, 10media-storage, 15User-Josve05a: Specific revision of a file triggers 404 (not found) - https://phabricator.wikimedia.org/T124101#2545375 (10Josve05a) [20:45:51] 06Operations, 06Commons, 06Multimedia, 10media-storage, 15User-Josve05a: Specific revision of a files triggers 404 (not found) - https://phabricator.wikimedia.org/T124101#2545380 (10Josve05a) [20:46:12] 06Operations, 06Commons, 06Multimedia, 10media-storage, 15User-Josve05a: Specific revisions of multiple files triggers 404 (not found) - https://phabricator.wikimedia.org/T124101#1945898 (10Josve05a) [20:46:22] (03PS1) 10Yuvipanda: Don't attempt to set root user password [labs/private] - 10https://gerrit.wikimedia.org/r/304321 [20:46:29] (03PS1) 10Yuvipanda: Revert "Don't install root password if has_admin = false" [puppet] - 10https://gerrit.wikimedia.org/r/304322 [20:46:34] (03PS2) 10Yuvipanda: Revert "Don't install root password if has_admin = false" [puppet] - 10https://gerrit.wikimedia.org/r/304322 [20:46:51] (03PS2) 10Yuvipanda: Don't attempt to set root user password [labs/private] - 10https://gerrit.wikimedia.org/r/304321 [20:46:56] 06Operations, 10Traffic: Support TLS chacha20-poly1305 AEAD ciphers - https://phabricator.wikimedia.org/T131908#2182588 (10Platonides) Don't forget about that timing-resistant AES implementation by djb. I thought people had been using timing-resistant AES implementations for years (assuming they used a modern... [20:48:13] (03CR) 10Yuvipanda: [C: 032 V: 032] Revert "Don't install root password if has_admin = false" [puppet] - 10https://gerrit.wikimedia.org/r/304322 (owner: 10Yuvipanda) [20:49:46] 06Operations, 10Traffic: Support TLS chacha20-poly1305 AEAD ciphers - https://phabricator.wikimedia.org/T131908#2545401 (10BBlack) >>! In T131908#2545388, @Platonides wrote: > Don't forget about that timing-resistant AES implementation by djb. > I thought people had been using timing-resistant AES implementati... [20:54:54] AndyRussG: by accident [20:57:18] Mmm do you know what u did? [20:57:45] (03PS11) 10Alex Monk: [WIP/POC/POS] Add python version of maintain-replicas script [software] - 10https://gerrit.wikimedia.org/r/295607 (https://phabricator.wikimedia.org/T138450) [20:58:20] (03PS1) 10Gehel: Maps - categorize maps1002 in site.pp [puppet] - 10https://gerrit.wikimedia.org/r/304324 (https://phabricator.wikimedia.org/T138092) [21:14:17] (03PS2) 10Rush: WIP labstore nfs: nfs client mount manager [puppet] - 10https://gerrit.wikimedia.org/r/304070 (https://phabricator.wikimedia.org/T140483) [21:22:56] does anyone know who maintains accounts on grafana? [21:23:10] https://wikitech.wikimedia.org/wiki/Grafana.wikimedia.org does not have any info [21:23:19] hmm... yuvipanda, maybe you know that? ^ [21:23:49] hello yurik [21:23:52] what are you trying to do? [21:23:57] godog is ultimately responsible but he's on vacation [21:24:08] yuvipanda, hi, i want to add debt to it [21:24:12] do you need just grafana or grafana-admin [21:24:15] admin [21:24:18] role/manifests/grafana/production.pp: ldap_editor_description => 'nda/ops/wmf/grafana-admin', [21:24:35] you need to be in one of these LDAP groups [21:24:38] afaik [21:25:07] it's restricted-edit because of vandalism concerns [21:25:10] iirc [21:25:29] sure, but i think we can trust debt, she is our PM after all....
and we can always fire her :) [21:25:33] so you just have to go to grafana-admin and do basic auth using ldap credentials, with your accout being in one of those groups [21:25:36] she should be in one of those groups [21:25:39] * aude thinks already [21:25:52] should be in wmf + nda [21:26:08] Krenair, so you think she is already in it? [21:26:13] i do [21:26:27] yurik ah debt the person :D should just add the appropriate LDAP group, as Krenair said [21:26:43] yurik, I didn't check. did she try logging in to grafana-admin? [21:27:04] I did try... [21:27:17] using my debt and using my WMF user name [21:27:18] though don't see her in https://github.com/wikimedia/operations-puppet/blob/production/modules/admin/data/data.yaml [21:27:37] i don't know if all wmf people need to be there [21:28:32] the wmf ldap group grants too much stuff [in before Krenair ;)] [21:28:41] @aude not sure...I see various other PM's in there... [21:28:55] lydia, for example, has analytics access [21:29:10] (is the reason i think she is there, and you should also have that, at least) [21:29:16] debt: you'll want it at some point to get access to hadoop and event logging data [21:29:17] (03PS1) 10Thcipriani: Add the fatalmonitor query to logstash_checker [puppet] - 10https://gerrit.wikimedia.org/r/304327 [21:29:35] but shell !== wmf ldap [21:29:49] yeah, i see [21:30:06] debt: so you'll need a wikitech account (used for gerrit and other things) [21:30:09] wmf ldap must be an office it thing [21:30:15] or something [21:30:17] lack of onboarding workflow [21:30:21] and then you'll need to be granted into the wmf ldap group [21:30:26] (03PS1) 10Tjones: Enable Language ID for Russian, Japanese, Portuguese Wikipedias [mediawiki-config] - 10https://gerrit.wikimedia.org/r/304328 (https://phabricator.wikimedia.org/T142413) [21:30:45] wmf ldap also gives +2 on pretty much all MW related repos which is the too much access part [21:30:57] o_O [21:32:04] it's treated like "staff NDA" in some spots (like the grafana/kibana ones) [21:32:52] First, pretend you're requesting ldap/wmf access or approving/+1ing someone else who is. [21:32:59] Now think of all the things that you *know* ldap/wmf grants access to. [21:33:00] I have a staff NDA for Phabricator [21:33:03] Now read https://wikitech.wikimedia.org/wiki/LDAP_Groups [21:33:34] I don't have a wikitech account...let me work on that [21:34:00] (these are general, not specifically for you deb) [21:37:03] alrighty, thanks @Krenair [21:37:29] aude, oh, yeah [21:37:33] So there's this separate LDAP within WMF [21:37:34] corp LDAP [21:38:16] It has no link to wikimedia production/labs LDAP [21:41:26] 06Operations, 06Community-Tech, 10MediaWiki-CrossWikiWatchlist, 10hardware-requests, 07Crosswiki: Acquire new hardware for hosting cross-wiki watchlist database - https://phabricator.wikimedia.org/T142538#2545726 (10kaldari) [21:42:55] (03PS12) 10Alex Monk: [WIP/POC/POS] Add python version of maintain-replicas script [software] - 10https://gerrit.wikimedia.org/r/295607 (https://phabricator.wikimedia.org/T138450) [21:44:49] 06Operations, 06Community-Tech, 10wikidiff2, 13Patch-For-Review: Deploy new version of wikidiff2 package - https://phabricator.wikimedia.org/T140443#2545748 (10MaxSem) So far so good: * I see no crashes in logs * The bug is fixed * I ran a diff of 500 pages, looks good: https://people.wikimedia.org/~maxsem... [21:55:21] twentyafterfour, all done? [21:55:39] yurik: yes [21:55:55] awesome! i will push some maps stuff out. CC: gehel [21:57:20] yurik: back. 
But sleep time for me... [21:57:33] gehel, no worries, i should be able to do it by myself :) [21:58:34] yurik: kool! I'll see that tomorrow... [22:16:13] bd808 - I have a wikitech account now [22:17:35] debt: sweet. I think the next step is "ask ostriches to add you to the wmf ldap group" [22:18:13] Or Reedy! :p [22:18:21] ooh! [22:18:30] pretty please Reedy or ostriches :) [22:18:44] where's ldap stuff living now? [22:18:45] (03PS1) 10Yuvipanda: PHP: Add php5-apcu [docker-images/toollabs-images] - 10https://gerrit.wikimedia.org/r/304406 [22:19:36] Hehe, all done [22:19:37] Reedy, terbium [22:19:39] debt: ^^ [22:20:32] yay - let me see if I can log into grafana [22:21:02] ostriches which script did you use? :) [22:21:12] ldaplist because I'm a creature of habit [22:21:13] lol [22:21:20] Errrr, modify-ldap-group [22:21:21] :P [22:21:25] Then ldaplist to confirm [22:21:38] ostriches isn't modify-ldap-group just an ldapvi alias now? [22:21:45] I have no idea but it worked! [22:21:55] `modify-ldap-group --addmembers=debt wmf` [22:21:55] hmm :( logging in gets me this error: Failed to create user specified in auth proxy header [22:21:58] heh [22:21:59] ostriches can you tell me the commandline you used? [22:22:02] ah I see [22:22:03] ok [22:22:05] thanks [22:22:52] ostriches ^ [22:23:21] That sounds like grafana's fault. [22:24:11] debt: try doing something else the wmf grants you access to [22:26:07] p858snake um...suggestion? [22:26:29] Try logstash [22:26:34] Or icinga [22:28:12] (03PS1) 10RobH: reclaim WMF4724 to spares [puppet] - 10https://gerrit.wikimedia.org/r/304407 [22:28:26] PROBLEM - Host wmf4724 is DOWN: PING CRITICAL - Packet loss = 100% [22:28:45] yeah yeah i know icinga, you'll forget it exists in 30 seconds... [22:28:47] ostriches https://logstash.wikimedia.org/ doesn't let me log in [22:29:00] Hmmmm [22:29:15] I super duper promise you're in the group! [22:29:28] Or at least that's what ldap claims to me! [22:29:29] am I in as debt or debt wmf? [22:29:41] I did "debt" [22:29:53] that's what I created my wikitech account with as well "debt" [22:30:18] it'll need the 'cn' instead of 'uid' [22:30:28] (03CR) 10RobH: [C: 032] reclaim WMF4724 to spares [puppet] - 10https://gerrit.wikimedia.org/r/304407 (owner: 10RobH) [22:30:30] robh: :P [22:30:32] Which is "Debt" not "debt" [22:30:40] Yay case sensitive! [22:30:43] oi [22:30:47] Zppix: ? [22:30:53] I was able to log in here: https://icinga.wikimedia.org/icinga/ using 'debt' [22:30:53] krenair@tools-bastion-03:~/operations-software/maintain-replicas ((f2f36dd...))$ ldapsearch -x cn=wmf | grep debt [22:30:53] member: uid=debt,ou=people,dc=wikimedia,dc=org [22:30:53] krenair@tools-bastion-03:~/operations-software/maintain-replicas ((f2f36dd...))$ ldapsearch -x uid=debt | grep cn: [22:30:53] cn: Debt [22:31:08] you said icinga-wm you'll forget it exists in 30 sec [22:31:52] yeah, i just killed a host, but it didnt page so no big deal icinga alerted. [22:31:53] Yay! I'm in logstash now as Debt [22:33:39] and in grafana as an editor [22:34:03] 06Operations, 10hardware-requests: reclaim WMF4724 to spares - https://phabricator.wikimedia.org/T142412#2545934 (10RobH) [22:35:43] (03PS1) 10RobH: reclaim WMF4724 to spares [dns] - 10https://gerrit.wikimedia.org/r/304408 [22:36:39] (03CR) 10RobH: [C: 032] reclaim WMF4724 to spares [dns] - 10https://gerrit.wikimedia.org/r/304408 (owner: 10RobH) [22:36:54] thanks everyone! :) [22:37:36] debt: it takes a village :) [22:37:52] bd808 indeed! 
:) [22:37:56] 06Operations, 10ops-eqiad, 10hardware-requests: reclaim WMF4724 to spares - https://phabricator.wikimedia.org/T142412#2546003 (10RobH) a:05RobH>03Cmjohnson Assigning to @Cmjohnson for disk wipe, and then please add back to the spares tracking sheet. Thanks! [22:38:00] Speaking of.... bd808 see the weird test failure I was complaining about in #-core [22:38:02] <3 [22:56:05] PROBLEM - MediaWiki exceptions and fatals per minute on graphite1001 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [50.0] [22:58:05] RECOVERY - MediaWiki exceptions and fatals per minute on graphite1001 is OK: OK: Less than 1.00% above the threshold [25.0] [23:00:04] RoanKattouw, ostriches, MaxSem, awight, and Dereckson: Dear anthropoid, the time has come. Please deploy Evening SWAT (Max 8 patches) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20160811T2300). [23:00:04] James_F, RoanKattouw, SMalyshev, and mafk: A patch you scheduled for Evening SWAT (Max 8 patches) is about to be deployed. Please be available during the process. [23:00:19] * SMalyshev is here [23:00:22] Please do the needful [23:00:31] * mafk is here [23:00:38] * James_F waves. [23:01:38] I'm here [23:02:37] thcipriani: wanna SWAT ;) [23:02:57] Hi. RoanKattouw: you swat or do you wish I do it? [23:03:01] I can do it if everyone else is busy but I do have other things to do too [23:03:07] Dereckson: So if you're up for it, go right ahead [23:03:39] Ok. [23:06:21] MatmaRex: you're around? [23:06:36] (03PS2) 10Dereckson: Re-enable thank-you-edit notifications [mediawiki-config] - 10https://gerrit.wikimedia.org/r/304267 (https://phabricator.wikimedia.org/T128249) (owner: 10Catrope) [23:07:30] (03CR) 10Dereckson: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/304267 (https://phabricator.wikimedia.org/T128249) (owner: 10Catrope) [23:07:42] Dereckson: sure [23:07:56] (03Merged) 10jenkins-bot: Re-enable thank-you-edit notifications [mediawiki-config] - 10https://gerrit.wikimedia.org/r/304267 (https://phabricator.wikimedia.org/T128249) (owner: 10Catrope) [23:08:18] RoanKattouw: Re-enable thank-you-edit notifications live on mw1099 [23:08:31] Thanks Dereckson, lemme think if I can test this [23:08:42] I'll just create a sock account there and make 1 edit I guess [23:09:11] or make an edit on a wiki you didn't edit on? [23:10:26] (03PS2) 10Dereckson: Re-enable the Echo footer notice [mediawiki-config] - 10https://gerrit.wikimedia.org/r/304271 (https://phabricator.wikimedia.org/T141414) (owner: 10Catrope) [23:11:58] (03CR) 10Dereckson: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/304271 (https://phabricator.wikimedia.org/T141414) (owner: 10Catrope) [23:12:24] (03Merged) 10jenkins-bot: Re-enable the Echo footer notice [mediawiki-config] - 10https://gerrit.wikimedia.org/r/304271 (https://phabricator.wikimedia.org/T141414) (owner: 10Catrope) [23:12:45] ...yes that would be simpler [23:12:49] (03PS2) 10Smalyshev: UrlShortener: Whitelist *.wikidata.org [mediawiki-config] - 10https://gerrit.wikimedia.org/r/302851 (https://phabricator.wikimedia.org/T142055) (owner: 10Legoktm) [23:12:55] It's not working for a new account just yet [23:12:57] Wondering why [23:13:43] I'm having a really strange issue involving a git clone from gerrit ostriches [23:13:59] oh?
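An aside for anyone hitting the same thing: when a smart-HTTP clone stalls and dies like the one Krenair pastes next, git's trace variables usually show whether the client or the server gave up. A sketch, using the repo under discussion and the /r/p/ URL form ostriches prefers:

    # verbose transport tracing for a stalling clone
    GIT_TRACE=1 GIT_CURL_VERBOSE=1 \
        git clone https://gerrit.wikimedia.org/r/p/operations/mediawiki-config.git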
[23:14:26] ostriches, https://phabricator.wikimedia.org/P3820 [23:14:47] RoanKattouw: footer live on mw1099 [23:15:00] Krenair: Use /r/p/ first of all, pet peeve and it bugs me :p [23:15:02] Looking tho [23:15:19] * Krenair points at gerrit [23:15:28] think it generated that style of url [23:15:43] 06Operations, 10Ops-Access-Requests, 06Research-and-Data, 10Research-collaborations, 10Research-management: Request access to data for WDQS research - https://phabricator.wikimedia.org/T142780#2546200 (10leila) [23:16:43] MatmaRex: you self-merged the revert here: https://gerrit.wikimedia.org/r/#/c/304411/ [23:16:57] Krenair: "java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 30000/30000 ms" maybe [23:17:12] 30s timeout [23:17:21] Fun times on a slow connection or a shitty filesystem [23:17:51] Dereckson: Footer notice looks good [23:17:54] krenair@tools-bastion-03:~/operations-software/maintain-replicas ((f2f36dd...))$ hostname -f [23:17:54] tools-bastion-03.tools.eqiad.wmflabs [23:18:02] Still trying to test the thank-you-edit thing, doesn't seem to be working, but should be safe to deploy [23:18:03] Slow connection from wmflabs to gerrit? I doubt it :) [23:18:05] I'll continue debugging it [23:18:45] shitty filesystem? this FS comes from NFS on labstore.svc.eqiad.wmnet, so I can believe that [23:18:48] RoanKattouw: and the footer one? [23:18:56] The footer one is working [23:19:16] Except that the Dutch translators put in instead of :| but that's not a bug, just a mistranslation [23:19:58] Krenair: I think we saw this reported before on labs. [23:20:02] And filesystem was most likely [23:20:10] Okay [23:20:16] Dereckson: Also I will need another patch, unfortunately :( https://gerrit.wikimedia.org/r/#/c/304415/ [23:20:22] yuvipanda, ^ think this issue is NFS? [23:20:43] RoanKattouw: I don't know if bypassing translatewiki is a good idea, but if you do so, that could be fixed by l10nupdate at 3h [23:20:50] Krenair: sshd has a timeout of 864000. I don't have a gerrit-side max timeout for httpd, just at the apache level which is set to 720. [23:20:53] Yeah I will fix it in the repo [23:20:58] And then fix it in TWN too [23:21:47] Krenair: We could tweak maxWait possibly. [23:21:47] Dereckson: yeah [23:21:47] Thanks for the tip [23:21:57] Maybe I should just run it elsewhere in labs, not using NFS [23:22:11] Maximum amount of time a client will wait for an available thread to handle a project clone, fetch or push request over the smart HTTP transport. [23:22:16] Default is 5m [23:22:33] shouldn't really have to make prod more patient just for NFS [23:22:47] !log dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Re-enable Echo footer notice (T141414) (duration: 00m 54s) [23:22:48] T141414: Invite users to try the new Notifications page (using the Notifications panel) - https://phabricator.wikimedia.org/T141414 [23:22:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [23:23:17] 235MB?! [23:23:18] wtf git. [23:23:33] footer works on fr. [23:23:40] is that the current size of mediawiki-config ostriches? [23:23:48] Yeah [23:23:52] heh [23:24:07] eh, gc only got it down to 230 [23:24:15] jgit gc has gotten worse in 2.12 imho :\ [23:24:33] !log dereckson@tin Synchronized wmf-config/CommonSettings.php: Re-enable thank-you-edit notifications (T128249) (duration: 00m 58s) [23:24:34] T128249: Multiple "You made your edit!"
notifications - https://phabricator.wikimedia.org/T128249 [23:24:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [23:25:06] PROBLEM - HHVM jobrunner on mw1162 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:25:15] so it lost like 2.1% of the size? probably not very useful [23:25:18] (03PS3) 10Dereckson: UrlShortener: Whitelist *.wikidata.org [mediawiki-config] - 10https://gerrit.wikimedia.org/r/302851 (https://phabricator.wikimedia.org/T142055) (owner: 10Legoktm) [23:25:32] my local copy is 278M [23:25:35] Actual git repack can get it down to 99M [23:25:44] Try again [23:25:46] (the clone) [23:25:52] (03CR) 10Dereckson: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/302851 (https://phabricator.wikimedia.org/T142055) (owner: 10Legoktm) [23:26:19] doing [23:26:21] (03Merged) 10jenkins-bot: UrlShortener: Whitelist *.wikidata.org [mediawiki-config] - 10https://gerrit.wikimedia.org/r/302851 (https://phabricator.wikimedia.org/T142055) (owner: 10Legoktm) [23:26:31] ostriches: Krenair: mine was 197M, 142M after git gc [23:26:44] git gc by itself isn't smart enough :) [23:26:47] try with --aggressive [23:27:01] Or be even more aggressive with something like `git repack -adf --depth=250 --window=250` [23:27:26] SMalyshev: UrlShortener: Whitelist *.wikidata.org live on mw1099 [23:27:40] Dereckson: great, thanks [23:28:17] live on labs too [23:28:35] Yay, thank-you-edit is working [23:28:44] (03PS2) 10Dereckson: Remove 'gather-hidelist' from CommonSettings.php [mediawiki-config] - 10https://gerrit.wikimedia.org/r/303803 (owner: 10MarcoAurelio) [23:29:00] Dereckson: If I add https://gerrit.wikimedia.org/r/#/c/304415/ to the wiki page could you deploy it too? [23:29:10] Krenair: I got a timeout on the same repo again [23:29:12] Assuming it failed? [23:29:20] (got in the log, that is) [23:29:20] not yet [23:29:24] remote: Compressing objects: 99% (512541/512542) [23:29:28] it then moved on to receiving objects [23:29:31] RoanKattouw: yup [23:29:35] isn't it supposed to get to 100% before moving to the next step? [23:29:35] Thanks [23:29:55] woah [23:30:04] Eh, git sometimes does funky things and counts to 99% :p [23:30:04] so it was going really slowly, just a few KiB/s [23:30:15] then it sped up massively and broke again [23:30:31] Man git is weird [23:30:33] error: RPC failed; result=56, HTTP code = 200iB | 196.00 KiB/s [23:30:33] fatal: The remote end hung up unexpectedly03 MiB | 2.40 MiB/s [23:30:33] fatal: early EOF [23:30:33] fatal: index-pack failed [23:31:23] James_F: did you test https://gerrit.wikimedia.org/r/#/c/304268/ thoroughly? [23:31:43] SMalyshev: does your config change work on mw1099? [23:31:58] Dereckson: Personally? No, the experts did. [23:32:03] Dereckson: Why? [23:33:07] Dereckson: hmm I don't think I can check on actual wiki since it's disabled: https://en.wikipedia.org/wiki/Special:UrlShortener [23:33:31] also on https://test.wikipedia.org/wiki/Special:UrlShortener [23:34:18] Krenair: File a bug. I'm not sure what's going on here yet but it's weird. [23:34:27] I want to blame a crappy underlying filesystem, but that's still odd.... [23:34:54] SMalyshev: ack [23:35:02] I started inserting newlines to stop it overwriting itself [23:35:05] Receiving objects: 72% (38945/53835), 21.82 MiB | 826.00 KiB/s [23:35:05] error: RPC failed; result=56, HTTP code = 200MiB | 1.14 MiB/s [23:35:16] hi [23:35:17] Ahhhhhhhhh [23:35:21] Gerrit lies! [23:35:26] Default for maxWait *is* 30 seconds.
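The eventual fix (the puppet change merged just below) amounts to making the stanza explicit in gerrit.config, which uses git-config syntax. A sketch of the equivalent hand edit, with $GERRIT_SITE standing in for the site path:

    # writes "[httpd] maxWait = 5 min"; gerrit needs a restart to pick it up
    git config --file "$GERRIT_SITE/etc/gerrit.config" httpd.maxWait '5 min'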
[23:35:29] Let's raise it [23:35:41] krenair you can try on /tmp, which is larger [23:36:09] and has 3.4G available, ok [23:36:13] SMalyshev: by the way, the change fixes another issue it seems: https://phabricator.wikimedia.org/T142055#2546244 [23:36:23] yep [23:36:24] Receiving objects: 100% (53835/53835), 40.95 MiB | 7.98 MiB/s, done. [23:36:26] ostriches: Krenair: 103M with aggressive [23:36:28] much much faster [23:36:39] try on tools-dev? [23:36:48] * Krenair is not used to the filesystem being the limiting factor on download speed [23:36:59] nfs is throttled fairly heavily. [23:37:10] I think jetty7/gerrit.xml overrides this.maxWait = MINUTES.toMillis(getTimeUnit(cfg, "httpd", null, "maxwait", 5, MINUTES)); actually [23:37:17] So the 5 MINUTES hardcode doesn't work [23:37:40] !log dereckson@tin Synchronized wmf-config/CommonSettings.php: UrlShortener: Whitelist *.wikidata.org (T142055) (duration: 00m 47s) [23:37:42] T142055: Add query.wikidata.org to shortening URL list for UrlShortener - https://phabricator.wikimedia.org/T142055 [23:37:42] Let's raise it to 5 minutes. [23:37:44] For realz. [23:37:45] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [23:39:02] yuvipanda, yeah can already tell that tools-dev is having similar issues [23:39:03] Patch incoming [23:39:16] (03PS1) 10Chad: Gerrit: Raise httpd.maxWait from default of 5mins to actually be 5mins [puppet] - 10https://gerrit.wikimedia.org/r/304418 [23:39:33] yuvipanda: Mind having a look-see? ^ [23:39:44] I think that's mostly what's causing Krenair's problem. [23:40:03] ostriches do you have ability to run puppet on the gerrit box? [23:40:09] I do! [23:40:15] ok [23:40:18] I just need the review + merge to puppetmasta [23:41:07] (03CR) 10Dereckson: [C: 032] "SWAT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/303803 (owner: 10MarcoAurelio) [23:41:22] (03CR) 10Yuvipanda: [C: 032] Gerrit: Raise httpd.maxWait from default of 5mins to actually be 5mins [puppet] - 10https://gerrit.wikimedia.org/r/304418 (owner: 10Chad) [23:41:27] ostriches, deploying? i will need to scap3 the maps service after you are done [23:41:35] merging, ostriches [23:41:35] (03Merged) 10jenkins-bot: Remove 'gather-hidelist' from CommonSettings.php [mediawiki-config] - 10https://gerrit.wikimedia.org/r/303803 (owner: 10MarcoAurelio) [23:41:45] yurik: No I'm not [23:41:56] ostriches merged [23:41:57] mafk: live on mw1099 [23:42:16] Dereckson: testing [23:42:25] yurik: I'll ping you when SWAT is done [23:42:31] Dereckson, thx! [23:42:37] yurik: will be probably 1:20 UTC [23:42:42] er 00:20 UTC [23:42:50] !log gerrit: running puppet to pick up config change, gerrit will do a quick restart [23:42:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [23:43:09] Dereckson, 40 min? ok [23:43:31] Dereckson: gather-hidelist is no longer present in Special:GlobalGroupPermissions, which is what was desired. [23:43:42] yurik: yes [23:43:44] mafk: ok [23:45:18] Krenair: Try again [23:45:46] Gerrit 503 [23:45:54] ah works again [23:46:12] ;-) [23:46:20] Had a quick restart to pick up a config change [23:46:59] Okay, config changes are done. [23:47:22] !log dereckson@tin Synchronized wmf-config/CommonSettings.php: Remove 'gather-hidelist' user right [[gerrit:303803]] (duration: 00m 48s) [23:47:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master [23:53:14] * MatmaRex lurks [23:53:25] Dereckson: Isn't it helpful how fast CI is? 
;-) [23:54:16] MatmaRex: live on mw1099 [23:54:27] RoanKattouw: live on mw1099 [23:54:39] Thanks, checking [23:54:52] is kaldari here? [23:54:56] James_F: https://integration.wikimedia.org/ci/job/mediawiki-extensions-php55/6457/console [23:55:00] Dereckson: thanks, looks good [23:55:41] Dereckson: hi!!!! [23:55:43] RoanKattouw: full swat for https://gerrit.wikimedia.org/r/#/c/304412/ ? [23:55:47] hi AndyRussG [23:55:47] yes you can [23:55:54] hehehehe [23:55:56] full scap sorry [23:56:13] Dereckson: Yup. [23:56:24] Dereckson: u SWATting? [23:56:28] Yes. [23:56:29] James_F: merged :) [23:56:30] Dereckson: Ask MatmaRex ;) [23:56:41] Dereckson: I think Roan's patch doesn't need a full scap (no i18n changes). [23:56:44] The CN patch that was meant for yesterday went out earlier today. A teensy error appeared on production following the deploy. [23:56:50] Dereckson: My patch works fine on mw1099 [23:56:55] Now a patch is available for a fix [23:56:59] no, mine just needs sync [23:57:05] AndyRussG: ok [23:57:10] Dereckson: cool beans thx! [23:57:19] MatmaRex: er... sync what? [23:57:28] To do a quasi-scap-but-not-scap you can always do sync-dir php-1.28.0-wmf.14/ :) [23:57:37] Dereckson: er, the files? [23:57:45] RoanKattouw: well, good idea [23:57:58] we don't have any need to rebuild l10n or other stuff here [23:58:23] MatmaRex: I tried to figure how to pick your files, but not the extensions/ directory at first [23:59:22] Dereckson: i am confused. do you need me to help with something?