[00:00:50] (CR) Dzahn: "@krenair this one also needs bastiononly additonally, right?" [puppet] - https://gerrit.wikimedia.org/r/267919 (https://phabricator.wikimedia.org/T125651) (owner: Gehel)
[00:01:27] (CR) Alex Monk: "Nope, user is being added to the deployment group. No bastiononly needed." [puppet] - https://gerrit.wikimedia.org/r/267919 (https://phabricator.wikimedia.org/T125651) (owner: Gehel)
[00:01:51] (CR) Dzahn: [C: 1] "thanks, then it's a +1" [puppet] - https://gerrit.wikimedia.org/r/267919 (https://phabricator.wikimedia.org/T125651) (owner: Gehel)
[00:01:57] (PS5) Dzahn: Adding user gehel (Guillaume Lederrey) to user list and to necessary groups [puppet] - https://gerrit.wikimedia.org/r/267919 (https://phabricator.wikimedia.org/T125651) (owner: Gehel)
[00:02:31] (PS15) Krinkle: [WIP] Implement /w/static.php [mediawiki-config] - https://gerrit.wikimedia.org/r/263566 (https://phabricator.wikimedia.org/T99096)
[00:02:39] (CR) Dzahn: "yea, because now they are inconsistent. before you knew each group was it's own thing.. yep" [puppet] - https://gerrit.wikimedia.org/r/267919 (https://phabricator.wikimedia.org/T125651) (owner: Gehel)
[00:02:41] (CR) Alex Monk: "(See the admin::groups section of hieradata/role/common/bastionhost/general.yaml - deployment, restricted, parsoid-admin, and ocg-render-a" [puppet] - https://gerrit.wikimedia.org/r/267919 (https://phabricator.wikimedia.org/T125651) (owner: Gehel)
[00:04:39] RECOVERY - puppet last run on mw1152 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures
[00:04:58] RECOVERY - puppet last run on mw2148 is OK: OK: Puppet is currently enabled, last run 51 seconds ago with 0 failures
[00:05:17] RECOVERY - puppet last run on strontium is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures
[00:05:28] RECOVERY - puppet last run on mw2154 is OK: OK: Puppet is currently enabled, last run 51 seconds ago with 0 failures
[00:06:28] RECOVERY - puppet last run on mw1071 is OK: OK: Puppet is currently enabled, last run 36 seconds ago with 0 failures
[00:06:37] RECOVERY - puppet last run on mw1193 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[00:07:02] (CR) Dzahn: "yep, thanks. this was basically my concern when we had that discussion about making one group automatically include other groups. Some are" [puppet] - https://gerrit.wikimedia.org/r/267919 (https://phabricator.wikimedia.org/T125651) (owner: Gehel)
[00:07:28] RECOVERY - puppet last run on mw2013 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[00:07:29] RECOVERY - puppet last run on mw2171 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[00:08:12] robh: https://gerrit.wikimedia.org/r/#/c/268706/ .. to solve the "spaces vs. tabs" thing in DHCP files for once
[00:08:40] is that making every dhcp file spaces?
[00:09:13] (CR) Alex Monk: "one of the fun implications of this is that if the user were to lose their deployment rights for whatever reason, they would no longer be " [puppet] - https://gerrit.wikimedia.org/r/267919 (https://phabricator.wikimedia.org/T125651) (owner: Gehel)
[00:09:15] seems the opposite of most dhcp config files but meh if its our standard its our standard.
[00:09:19] robh: yes
[00:10:03] i just want us to stop spending time on fixing those changes that mix both
[00:13:52] (PS16) Krinkle: [WIP] Implement /w/static.php [mediawiki-config] - https://gerrit.wikimedia.org/r/263566 (https://phabricator.wikimedia.org/T99096)
[00:14:33] (CR) Dzahn: "yes, that was my point to _not_ make this change in the past" [puppet] - https://gerrit.wikimedia.org/r/267919 (https://phabricator.wikimedia.org/T125651) (owner: Gehel)
[00:14:39] andrewbogott, are all of those OSM patches dependant on one of them?
[00:16:57] RECOVERY - puppet last run on cp2014 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures
[00:17:15] (CR) Dzahn: "if you want to bring that up for discussion again, please add Yuvipanda" [puppet] - https://gerrit.wikimedia.org/r/267919 (https://phabricator.wikimedia.org/T125651) (owner: Gehel)
[00:22:44] operations, DBA, Labs, Tool-Labs: Replicate wikimania2017wiki to labs - https://phabricator.wikimedia.org/T126096#2004612 (MaxSem) NEW
[00:22:58] operations, EventBus, Services, hardware-requests: 4 more Kafka brokers, 2 in eqiad and 2 codfw - https://phabricator.wikimedia.org/T124469#2004620 (RobH) a:RobH>Ottomata
[00:23:52] (CR) 20after4: [C: 1] [WIP] Implement /w/static.php [mediawiki-config] - https://gerrit.wikimedia.org/r/263566 (https://phabricator.wikimedia.org/T99096) (owner: Krinkle)
[00:24:31] operations: decom magnesium (was: Reinstall magnesium with jessie) - https://phabricator.wikimedia.org/T123713#2004628 (Dzahn)
[00:24:42] operations: decom magnesium (was: Reinstall magnesium with jessie) - https://phabricator.wikimedia.org/T123713#1936373 (Dzahn) Open>stalled
[00:24:44] operations, Tracking: reduce amount of remaining Ubuntu 12.04 (precise) systems - https://phabricator.wikimedia.org/T123525#2004632 (Dzahn)
[00:44:22] (CR) JGirault: "Been replaced by https://gerrit.wikimedia.org/r/#/c/268804/" [mediawiki-config] - https://gerrit.wikimedia.org/r/268713 (owner: JGirault)
[00:44:35] (CR) JGirault: [C: -1] Bump portals to master [mediawiki-config] - https://gerrit.wikimedia.org/r/268713 (owner: JGirault)
[00:50:46] operations, Citoid, Services: Package and test Zotero for Jessie - https://phabricator.wikimedia.org/T107302#2004814 (mobrovac) p:High>Low
[00:52:47] PROBLEM - puppet last run on ms-be2010 is CRITICAL: CRITICAL: puppet fail
[01:08:09] (PS1) Mobrovac: Zotero: Move logs to /srv/log [puppet] - https://gerrit.wikimedia.org/r/268847 (https://phabricator.wikimedia.org/T107900)
[01:12:48] (PS1) JGirault: Bump portals to master (color standardization, CSS sprites, updated stats) [mediawiki-config] - https://gerrit.wikimedia.org/r/268849
[01:13:00] SHOW GLOBAL STATUS like 'Uptime' -> 177
[01:13:34] SHOW PROCESSLIST query time #45 -> 24559
[01:15:02] (PS2) JGirault: Bump portals to master (color standardization, CSS sprites, updated stats) [mediawiki-config] - https://gerrit.wikimedia.org/r/268849 (https://phabricator.wikimedia.org/T124993)
[01:15:33] jynus, a query has been running longer than the server has been up...?
[01:15:41] yes
[01:15:44] wut
[01:15:49] it may be a feature
[01:15:59] (PS14) Dduvall: Puppet provider for scap3 [puppet] - https://gerrit.wikimedia.org/r/262742 (https://phabricator.wikimedia.org/T113072) (owner: Alexandros Kosiaris)
[01:16:02] this is mysql, right? not php?
[01:16:10] yes
[01:16:31] "For a slave SQL thread, this is the time in seconds between the last replicated event's timestamp and the slave machine's real time."
[01:16:49] (CR) Mobrovac: "Looking good - https://puppet-compiler.wmflabs.org/1689/sca1001.eqiad.wmnet/" [puppet] - https://gerrit.wikimedia.org/r/268847 (https://phabricator.wikimedia.org/T107900) (owner: Mobrovac)
[01:17:37] (CR) jenkins-bot: [V: -1] Puppet provider for scap3 [puppet] - https://gerrit.wikimedia.org/r/262742 (https://phabricator.wikimedia.org/T113072) (owner: Alexandros Kosiaris)
[01:17:38] this is the first time I noticed this
[01:18:15] but seems so wrong
[01:19:34] (PS15) Dduvall: Puppet provider for scap3 [puppet] - https://gerrit.wikimedia.org/r/262742 (https://phabricator.wikimedia.org/T113072) (owner: Alexandros Kosiaris)
[01:19:37] RECOVERY - puppet last run on ms-be2010 is OK: OK: Puppet is currently enabled, last run 24 seconds ago with 0 failures
[01:27:14] (PS1) Dzahn: exim: rewriting rule for maint-announce@ mail to phab [puppet] - https://gerrit.wikimedia.org/r/268851 (https://phabricator.wikimedia.org/T118176)
[01:27:17] (PS1) Mobrovac: Apertium: Move logs to /srv/log [puppet] - https://gerrit.wikimedia.org/r/268852 (https://phabricator.wikimedia.org/T107900)
[01:27:57] (PS2) Dzahn: exim: rewriting rule for maint-announce@ mail to phab [puppet] - https://gerrit.wikimedia.org/r/268851 (https://phabricator.wikimedia.org/T118176)
[01:29:43] (PS3) Dzahn: exim: rewriting rule for maint-announce@ mail to phab [puppet] - https://gerrit.wikimedia.org/r/268851 (https://phabricator.wikimedia.org/T118176)
[01:34:53] (CR) Dzahn: [C: 1] Apertium: Move logs to /srv/log [puppet] - https://gerrit.wikimedia.org/r/268852 (https://phabricator.wikimedia.org/T107900) (owner: Mobrovac)
[01:35:50] (CR) Dzahn: [C: 1] Zotero: Move logs to /srv/log [puppet] - https://gerrit.wikimedia.org/r/268847 (https://phabricator.wikimedia.org/T107900) (owner: Mobrovac)
[01:36:14] ..away too
[01:37:10] (CR) Mobrovac: "PCC is happy - https://puppet-compiler.wmflabs.org/1690/sca1001.eqiad.wmnet/" [puppet] - https://gerrit.wikimedia.org/r/268852 (https://phabricator.wikimedia.org/T107900) (owner: Mobrovac)
[01:42:46] operations, Wikimedia-Site-Requests: Rename cbk-zamwiki to cbkwiki - https://phabricator.wikimedia.org/T124657#2004933 (Liuxinyu970226) >>! In T124657#2001728, @seav wrote: > Sorry, there is no consensus for this renaming. And there have no clear concensus for T64717, T41968, T30443, and T25216, so are y...
[01:58:04] (PS1) Mobrovac: Apertium: Fix --log-path position in SystemD unit file [puppet] - https://gerrit.wikimedia.org/r/268856
[02:02:23] operations, Incident-Labs-NFS-20151216: Add step in start-nfs to ask operator to consider dropping some snapshots - https://phabricator.wikimedia.org/T121890#2004954 (yuvipanda) a:yuvipanda>None
[02:16:50] Hey folks. I'm getting a 503 on Special:OAuthManageConsumers on Meta. Known issue or shall I add a task in phab?
[02:17:14] known issue, will look into it soonish
[02:17:19] halfak, I think I already reported that
[02:17:33] Cool. Have a link handy?
[02:17:37] https://phabricator.wikimedia.org/T125939
[02:17:39] * halfak should subscribe
[02:17:41] Thanks!
[02:19:22] I think it's due to not having the mwoauthviewprivate permission
[02:23:55] halfak, I think there's actually 2 bugs you're seeing there
[02:24:04] the other one is linked from that
[02:24:53] operations, Traffic, Beta-Cluster-reproducible: PHP fatal errors causing Varnish to return 503 - "Junk after gzip data" - https://phabricator.wikimedia.org/T125938#2004990 (Krenair) Hey folks. I'm getting a 503 on Special:OAuthManageConsumers on Meta. Known issue or shall I add a task in pha...
[02:26:09] operations, Traffic, Beta-Cluster-reproducible: PHP fatal errors causing Varnish to return 503 - "Junk after gzip data" - https://phabricator.wikimedia.org/T125938#2004993 (Halfak) I'm getting a 503 on Meta. Here's the copy-paste details: ``` Request from 10.128.0.118 via cp1068 cp1068 ([10.64.0.105]...
[02:26:24] Thanks Krenair. I just posted the details from a 503 page load.
[02:26:38] I'll have to run away in a few minutes.
[02:26:46] Anything else I can help with before I go?
[02:27:30] tgr, ^
[02:27:58] no, it probably just got broken by https://gerrit.wikimedia.org/r/#/c/267733/
[02:30:30] Ops-Access-Requests, operations, Discovery-Search-Sprint, Patch-For-Review: Access for new Discovery OpsEng: Guillaume Lederrey - https://phabricator.wikimedia.org/T125651#2004994 (Krenair)
[02:32:55] (CR) 20after4: "Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Invalid parameter package_settings" [puppet] - https://gerrit.wikimedia.org/r/262742 (https://phabricator.wikimedia.org/T113072) (owner: Alexandros Kosiaris)
[02:34:14] (CR) 20after4: "No I'm wrong - ubuntu trusty doesn't have that feature, it's running 3.4.3" [puppet] - https://gerrit.wikimedia.org/r/262742 (https://phabricator.wikimedia.org/T113072) (owner: Alexandros Kosiaris)
[02:41:08] PROBLEM - puppet last run on mw2029 is CRITICAL: CRITICAL: puppet fail
[02:52:38] operations, Traffic, Beta-Cluster-reproducible: PHP fatal errors causing Varnish to return 503 - "Junk after gzip data" - https://phabricator.wikimedia.org/T125938#2005005 (Tgr) Full error message is `Fatal error: Call to undefined method Message::toJson() in /srv/mediawiki/php-1.27.0-wmf.12/extensions...
[03:09:29] RECOVERY - puppet last run on mw2029 is OK: OK: Puppet is currently enabled, last run 52 seconds ago with 0 failures
[03:17:11] operations, Traffic, Beta-Cluster-reproducible: PHP fatal errors causing Varnish to return 503 - "Junk after gzip data" - https://phabricator.wikimedia.org/T125938#2005011 (Tgr) @halfak if this is blocking you, you can probably get the same information by using `Special:OAuthListConsumers` instead.
[03:51:12] operations, Traffic, Beta-Cluster-reproducible: PHP fatal errors causing Varnish to return 503 - "Junk after gzip data" - https://phabricator.wikimedia.org/T125938#2005023 (Tgr) Should have read T125939 first, duh. ``` curl -vs --raw -H 'Accept-Encoding: gzip' -H 'Host:meta.wikimedia.beta.wmflabs.org'...
[05:05:49] operations, Traffic, Beta-Cluster-reproducible: PHP fatal errors causing Varnish to return 503 - "Junk after gzip data" - https://phabricator.wikimedia.org/T125938#2005038 (Krenair)
[05:22:46] operations, Traffic, Beta-Cluster-reproducible: PHP fatal errors causing Varnish to return 503 - "Junk after gzip data" - https://phabricator.wikimedia.org/T125938#2005044 (BBlack) >>! In T125938#2005023, @Tgr wrote: > HHVM can do its own gzip encryption, so presumably this is a bug with that. In gene...
[05:33:48] PROBLEM - Host cp2006 is DOWN: PING CRITICAL - Packet loss = 100%
[05:36:45] ^ looking
[05:40:39] RECOVERY - Host cp2006 is UP: PING WARNING - Packet loss = 44%, RTA = 38.16 ms
[05:43:37] !log rebooted cp2006 via racadm after crash - no crash data in logs...
[05:43:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[06:30:29] PROBLEM - puppet last run on neodymium is CRITICAL: CRITICAL: Puppet has 2 failures
[06:30:58] PROBLEM - puppet last run on db1045 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:08] PROBLEM - puppet last run on db2055 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:37] PROBLEM - puppet last run on mw2045 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:48] PROBLEM - puppet last run on mw1170 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:31:57] PROBLEM - puppet last run on mw1061 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:33:18] PROBLEM - puppet last run on mw2018 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:34:07] PROBLEM - puppet last run on mw2126 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:34:17] PROBLEM - puppet last run on mw1055 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:56:19] RECOVERY - puppet last run on mw1170 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures
[06:56:59] RECOVERY - puppet last run on neodymium is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures
[06:57:27] RECOVERY - puppet last run on db1045 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:57:39] RECOVERY - puppet last run on db2055 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:57:58] RECOVERY - puppet last run on mw2018 is OK: OK: Puppet is currently enabled, last run 55 seconds ago with 0 failures
[06:58:08] RECOVERY - puppet last run on mw2045 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures
[06:58:19] RECOVERY - puppet last run on mw1061 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:58:48] RECOVERY - puppet last run on mw2126 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:58:58] RECOVERY - puppet last run on mw1055 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures
[08:35:28] RECOVERY - Kafka Broker Under Replicated Partitions on kafka1014 is OK: OK: Less than 50.00% above the threshold [1.0]
[09:22:52] (CR) 20after4: [C: -1] Puppet provider for scap3 (1 comment) [puppet] - https://gerrit.wikimedia.org/r/262742 (https://phabricator.wikimedia.org/T113072) (owner: Alexandros Kosiaris)
[09:44:41] operations, Wikimedia-Video, Patch-For-Review: 1gb file upload limit is too restrictive for conference presentation videos - https://phabricator.wikimedia.org/T116514#2005156 (zhuyifei1999) Duplicate of {T76614}?
[09:45:36] operations, Wikimedia-Video, Patch-For-Review: 1gb file upload limit is too restrictive for conference presentation videos - https://phabricator.wikimedia.org/T116514#2005158 (zhuyifei1999)
[09:45:39] operations, Commons, MediaWiki-Uploading, Multimedia: Raise max upload limit above 1GB - https://phabricator.wikimedia.org/T76614#807206 (zhuyifei1999)
[09:53:30] operations, Commons, MediaWiki-Uploading, Multimedia, Wikimedia-Video: Raise max upload limit above 1GB - https://phabricator.wikimedia.org/T76614#2005161 (zhuyifei1999)
[10:05:48] RECOVERY - Kafka Broker Under Replicated Partitions on kafka1020 is OK: OK: Less than 50.00% above the threshold [1.0]
[10:10:17] PROBLEM - check_mysql on db1008 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 638
[10:20:17] RECOVERY - check_mysql on db1008 is OK: Uptime: 1536114 Threads: 3 Questions: 8967146 Slow queries: 10343 Opens: 3913 Flush tables: 2 Open tables: 417 Queries per second avg: 5.837 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0
[10:35:17] PROBLEM - check_mysql on db1008 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 746
[10:40:17] RECOVERY - check_mysql on db1008 is OK: Uptime: 1537315 Threads: 2 Questions: 8973664 Slow queries: 10362 Opens: 3914 Flush tables: 2 Open tables: 418 Queries per second avg: 5.837 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0
[10:56:44] operations, DBA, Patch-For-Review: Prepare db1018 and s2-slaves for s2 master failover - https://phabricator.wikimedia.org/T125215#2005172 (jcrespo) db1018 has been reinstalled with jessie/MariaDB 10.0.23 and it is ready for switchover. Only tasks pending is prepare/decide about the list of things to d...
[13:52:57] PROBLEM - puppet last run on mw2200 is CRITICAL: CRITICAL: puppet fail
[14:19:29] RECOVERY - puppet last run on mw2200 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures
[14:28:30] hi
[14:28:30] can I have the assistence of a tech guy please?
[14:28:30] I'd like to delete this page: https://commons.wikimedia.org/wiki/User:Steinsplitter/test
[14:28:30] (with 5,000+ revisions as it seems)
[14:28:30] is it okay to do so?
[14:33:35] hello?
[14:47:39] I decided to proceed.... please confirm everything went well
[14:48:12] unless the wiki is down, it probably did
[14:48:20] :)
[14:48:50] icinga-wm_ is also not reporting anything scary-looking
[14:49:05] good
[15:40:34] (PS1) Dereckson: Enable CategoryMembershipChanges on fr.wiktionary [mediawiki-config] - https://gerrit.wikimedia.org/r/268879 (https://phabricator.wikimedia.org/T126051)
[15:46:38] PROBLEM - puppet last run on wtp2005 is CRITICAL: CRITICAL: puppet fail
[15:48:35] (CR) Luke081515: Set nlwiki collation to uca-nl [mediawiki-config] - https://gerrit.wikimedia.org/r/268409 (https://phabricator.wikimedia.org/T125774) (owner: Merlijn van Deen)
[15:50:16] (CR) Luke081515: [C: 1] Enable CategoryMembershipChanges on fr.wiktionary [mediawiki-config] - https://gerrit.wikimedia.org/r/268879 (https://phabricator.wikimedia.org/T126051) (owner: Dereckson)
[16:08:08] PROBLEM - puppet last run on mw2104 is CRITICAL: CRITICAL: puppet fail
[16:14:58] RECOVERY - puppet last run on wtp2005 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures
[16:26:01] (CR) JanZerebecki: "Why would the edit rate itself be something that is critical?" [puppet] - https://gerrit.wikimedia.org/r/268662 (owner: Addshore)
[16:36:28] RECOVERY - puppet last run on mw2104 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[17:10:35] (CR) Hoo man: "That's a very good question. Unless we can actually find a value which is critical in a way that is not covered by anything else, we shoul" [puppet] - https://gerrit.wikimedia.org/r/268662 (owner: Addshore)
[17:43:53] (PS1) BBlack: SPDY support toggle, off for cp1008 canary [puppet] - https://gerrit.wikimedia.org/r/268892 (https://phabricator.wikimedia.org/T125979)
[17:43:55] (PS1) BBlack: disable SPDY for all cache_text [puppet] - https://gerrit.wikimedia.org/r/268893 (https://phabricator.wikimedia.org/T125979)
[17:59:26] operations, Traffic: Decrease max object TTL in varnishes - https://phabricator.wikimedia.org/T124954#2005469 (GWicke)
[18:41:06] operations, EventBus, Services, hardware-requests: 4 more Kafka brokers, 2 in eqiad and 2 codfw - https://phabricator.wikimedia.org/T124469#2005494 (GWicke) I don't necessarily want to hold up the procurement, but saw the meeting & decision mentioned here for the first time. I think it's worth bein...
[18:41:42] operations, Traffic, Beta-Cluster-reproducible: PHP fatal errors causing Varnish to return 503 - "Junk after gzip data" - https://phabricator.wikimedia.org/T125938#2005496 (bd808) >>! In T125938#2005005, @Tgr wrote: > Full error message is `Fatal error: Call to undefined method Message::toJson() in /sr...
[19:15:20] operations, EventBus, Services, hardware-requests: 4 more Kafka brokers, 2 in eqiad and 2 codfw - https://phabricator.wikimedia.org/T124469#2005538 (mobrovac) >>! In T124469#2004185, @GWicke wrote: > 1) Which advantages do you see in having separate clusters at this point, considering the relativel...
[20:52:46] (CR) Alex Monk: Define wgOpenStackManagerProject (1 comment) [puppet] - https://gerrit.wikimedia.org/r/267192 (https://phabricator.wikimedia.org/T115029) (owner: Andrew Bogott)
[21:13:28] PROBLEM - Kafka Broker Replica Max Lag on kafka1022 is CRITICAL: CRITICAL: 69.23% of data above the critical threshold [5000000.0]
[21:20:46] operations, EventBus, Services, hardware-requests: 4 more Kafka brokers, 2 in eqiad and 2 codfw - https://phabricator.wikimedia.org/T124469#2005649 (GWicke) > While performance-wise adding the EventBus streams to the existing Analytics cluster wouldn't be a problem We ruled this possibility out ra...
[21:31:18] RECOVERY - Kafka Broker Replica Max Lag on kafka1022 is OK: OK: Less than 50.00% above the threshold [1000000.0]
[22:16:20] (PS1) Ori.livneh: dotfiles(ori): add smarter cd [puppet] - https://gerrit.wikimedia.org/r/268920
[22:43:01] (PS1) Alex Monk: WIP: labs dnsrecursor: work on all projects, not just some arbitrary ones [puppet] - https://gerrit.wikimedia.org/r/268921
[22:44:11] (CR) jenkins-bot: [V: -1] WIP: labs dnsrecursor: work on all projects, not just some arbitrary ones [puppet] - https://gerrit.wikimedia.org/r/268921 (owner: Alex Monk)
[22:45:51] (PS2) Alex Monk: WIP: labs dnsrecursor IP aliasing: work on all projects, not just some arbitrary ones [puppet] - https://gerrit.wikimedia.org/r/268921
[23:16:47] operations, OTRS, Security, Patch-For-Review: Make OTRS sessions IP-address-agnostic - https://phabricator.wikimedia.org/T87217#2005757 (Platonides) Open>Resolved The session id is no longer leaked in the url, and the connfiguration has been changed (and deployed) not to verify the IP address.
[23:23:29] RECOVERY - Kafka Broker Under Replicated Partitions on kafka1018 is OK: OK: Less than 50.00% above the threshold [1.0]
[23:42:33] (PS3) Alex Monk: labs dnsrecursor IP aliasing: work on all projects, not just some arbitrary ones [puppet] - https://gerrit.wikimedia.org/r/268921
[23:44:17] (CR) jenkins-bot: [V: -1] labs dnsrecursor IP aliasing: work on all projects, not just some arbitrary ones [puppet] - https://gerrit.wikimedia.org/r/268921 (owner: Alex Monk)
[23:46:26] (PS4) Alex Monk: labs dnsrecursor IP aliasing: work on all projects, not just some arbitrary ones [puppet] - https://gerrit.wikimedia.org/r/268921
[23:58:32] (CR) Alex Monk: "So in labs I get "keystoneclient.apiclient.exceptions.NotFound: The resource could not be found. (HTTP 404)" when trying to get /tenants f" [puppet] - https://gerrit.wikimedia.org/r/268921 (owner: Alex Monk)