[00:00:04] Deploy window No deploys - SRE Summit (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20190613T0000) [00:24:48] (03PS1) 10Faidon Liambotis: dsa-check-hpssacli: import latest version from DSA [puppet] - 10https://gerrit.wikimedia.org/r/516724 [00:24:50] (03PS1) 10Faidon Liambotis: dsa-check-hpssacli: speed when checking many disks [puppet] - 10https://gerrit.wikimedia.org/r/516725 (https://phabricator.wikimedia.org/T210723) [00:24:52] (03PS1) 10Faidon Liambotis: dsa-check-hpssacli: make compatible with ssacli [puppet] - 10https://gerrit.wikimedia.org/r/516726 [00:41:24] 10Operations, 10observability, 10Patch-For-Review, 10User-fgiunchedi: Address recurrent service check time out for "HP RAID" on swift backend hosts - https://phabricator.wikimedia.org/T210723 (10faidon) So, the timeout patch above bumped the timeouts to 100s I think. On many hosts (e.g. ms-be1036, ms-be103... [00:43:58] (03PS2) 10Faidon Liambotis: dsa-check-hpssacli: refactor for speed/efficiency [puppet] - 10https://gerrit.wikimedia.org/r/516725 (https://phabricator.wikimedia.org/T210723) [00:44:00] (03PS2) 10Faidon Liambotis: dsa-check-hpssacli: make compatible with ssacli [puppet] - 10https://gerrit.wikimedia.org/r/516726 [00:45:28] !log setting the CPU governor to performance for ms-be1036 (a while ago) [00:45:32] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [00:47:00] (03PS1) 10Paladox: Merge remote-tracking branch 'upstream/v2.15.14' into wmf/stable-2.15 [software/gerrit] (wmf/stable-2.15) - 10https://gerrit.wikimedia.org/r/516727 [01:25:17] PROBLEM - puppet last run on lvs3002 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. Could be an interrupted request or a dependency cycle. [01:38:49] PROBLEM - puppet last run on bast5001 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. Could be an interrupted request or a dependency cycle. 
[01:52:27] RECOVERY - puppet last run on lvs3002 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures [02:11:27] RECOVERY - puppet last run on bast5001 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures [04:42:41] PROBLEM - Check systemd state on ms-be1042 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. [05:02:37] PROBLEM - Check systemd state on ms-be1028 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed. [05:11:19] RECOVERY - Check systemd state on ms-be1028 is OK: OK - running: The system is fully operational [05:16:05] RECOVERY - Check systemd state on ms-be1042 is OK: OK - running: The system is fully operational [06:30:35] PROBLEM - puppet last run on analytics1061 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 4 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/etc/R/update-library.R] [06:31:59] PROBLEM - puppet last run on sodium is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 6 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/etc/profile.d/mysql-ps1.sh] [06:34:19] PROBLEM - puppet last run on mw2285 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. Could be an interrupted request or a dependency cycle. [06:46:10] 10Operations, 10WMDE-QWERTY-Team, 10serviceops, 10wikidiff2, and 3 others: Deploy Wikidiff2 version 1.8.2 with the timeout issue fixed - https://phabricator.wikimedia.org/T223391 (10WMDE-Fisch) [06:50:48] (03CR) 10ArielGlenn: [C: 03+1] "This is good with the caveat that we still need a way to prevent unhappy disks from flapping." 
[puppet] - 10https://gerrit.wikimedia.org/r/516615 (https://phabricator.wikimedia.org/T225613) (owner: 10Filippo Giunchedi) [06:55:49] (03PS1) 10Marostegui: db-eqiad.php: More traffic db1077 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/516738 [06:57:47] RECOVERY - puppet last run on analytics1061 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [06:58:24] (03CR) 10Marostegui: [C: 03+1] "+1 and also labsdb1010 has caught up with replication :)" [puppet] - 10https://gerrit.wikimedia.org/r/516639 (owner: 10Jcrespo) [06:58:34] (03CR) 10Marostegui: [C: 03+2] db-eqiad.php: More traffic db1077 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/516738 (owner: 10Marostegui) [06:59:07] RECOVERY - puppet last run on sodium is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures [06:59:31] (03Merged) 10jenkins-bot: db-eqiad.php: More traffic db1077 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/516738 (owner: 10Marostegui) [06:59:50] (03CR) 10jenkins-bot: db-eqiad.php: More traffic db1077 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/516738 (owner: 10Marostegui) [07:00:35] !log marostegui@deploy1001 Synchronized wmf-config/db-eqiad.php: More traffic to db1077 after recovering from a crash (duration: 00m 50s) [07:00:40] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [07:01:27] RECOVERY - puppet last run on mw2285 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures [07:11:42] 10Operations, 10Operations-Software-Development, 10User-Joe, 10User-jijiki: Create cookbook to do `nodetool repair` across cassandra cluster - https://phabricator.wikimedia.org/T225694 (10Mathew.onipe) [07:11:55] 10Operations, 10Operations-Software-Development, 10User-Joe, 10User-jijiki: Create cookbook to do `nodetool repair` across cassandra cluster - https://phabricator.wikimedia.org/T225694 (10Mathew.onipe) p:05Triage→03Normal [07:19:11] 10Operations, 10Release Pipeline, 
10Release-Engineering-Team-TODO, 10Core Platform Team Backlog (Watching / External), and 2 others: Migrate production services to kubernetes using the pipeline - https://phabricator.wikimedia.org/T198901 (10mobrovac) [07:21:05] 10Operations, 10ops-codfw: ms-be2018 sdc unreadable sector - https://phabricator.wikimedia.org/T225630 (10ArielGlenn) p:05Triage→03Normal [07:21:54] 10Operations, 10MediaWiki-Releasing, 10Parsoid, 10Release-Engineering-Team: signatures were invalid: EXPKEYSIG 90E9F83F22250DD7 MediaWiki releases repository - https://phabricator.wikimedia.org/T225601 (10ArielGlenn) p:05Triage→03High [07:22:31] 10Operations, 10Wikimedia-Mailing-lists: gmail users being suspended from mediawiki-l due to excessive bounces - https://phabricator.wikimedia.org/T225553 (10ArielGlenn) p:05Triage→03Normal [07:23:32] 10Operations, 10Wikimedia-Mailing-lists: Verify that all mailman mailing lists have private_roster=2 - https://phabricator.wikimedia.org/T225269 (10ArielGlenn) p:05Triage→03Normal [07:23:47] 10Operations, 10ops-codfw: Degraded RAID on es2003 - https://phabricator.wikimedia.org/T225131 (10ArielGlenn) p:05Triage→03Normal [07:25:38] 10Operations, 10Traffic, 10HTTPS: en.wikipedia.com [sic] serves an invalid certificate - https://phabricator.wikimedia.org/T214253 (10ArielGlenn) [07:25:40] 10Operations: wikipedia.com has invalid certificate - https://phabricator.wikimedia.org/T225650 (10ArielGlenn) [07:27:33] 10Operations, 10serviceops, 10Service-deployment-requests, 10Services (watching): Internal deployment of open_nsfw-- image scoring service - https://phabricator.wikimedia.org/T225664 (10mobrovac) [07:32:18] 10Operations, 10serviceops, 10Service-deployment-requests, 10Services (watching): Internal deployment of open_nsfw-- image scoring service - https://phabricator.wikimedia.org/T225664 (10ArielGlenn) p:05Triage→03Normal [07:41:10] 10Operations, 10ops-codfw: Degraded RAID on es2003 - https://phabricator.wikimedia.org/T225131 (10Marostegui) [07:45:31] 
10Operations, 10Continuous-Integration-Infrastructure, 10Release-Engineering-Team-TODO, 10SRE-Access-Requests: Request: add awight to contint-docker - https://phabricator.wikimedia.org/T223262 (10awight) 05Open→03Declined I'm having second thoughts about this request, because I'm no longer see that I'l... [07:48:31] 10Operations, 10SRE-Access-Requests: Typo in workboard column name: "Confirmation" - https://phabricator.wikimedia.org/T225696 (10awight) [07:52:06] (03PS2) 10Awight: New configuration to pull from Wikidata [mediawiki-config] - 10https://gerrit.wikimedia.org/r/514715 (https://phabricator.wikimedia.org/T224007) [07:55:49] 10Operations, 10ops-codfw, 10Patch-For-Review: rack/setup/ codfw: ganeti2009 - ganeti201[0-8] - https://phabricator.wikimedia.org/T224603 (10ayounsi) >>! In T224603#5243200, @Papaul wrote: > @ayounsi I am planning on installing those new servers in row c and row D and I don't have the "interface-range ganeti... [08:05:04] (03PS3) 10Awight: New configuration to pull sitelinks from Wikidata [mediawiki-config] - 10https://gerrit.wikimedia.org/r/514715 (https://phabricator.wikimedia.org/T224007) [08:05:32] (03CR) 10Jcrespo: [C: 03+2] labsdb: Move labsdb1010 from analytics to web to ease the extra load [puppet] - 10https://gerrit.wikimedia.org/r/516639 (owner: 10Jcrespo) [08:09:24] !log reloading proxies for wikireplicas to rebalance load [08:09:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [08:36:51] (03CR) 10WMDE-Fisch: [C: 03+1] New configuration to pull sitelinks from Wikidata [mediawiki-config] - 10https://gerrit.wikimedia.org/r/514715 (https://phabricator.wikimedia.org/T224007) (owner: 10Awight) [08:36:53] (03PS7) 10Mathew.onipe: Add maps reboot cookbook [cookbooks] - 10https://gerrit.wikimedia.org/r/511819 (https://phabricator.wikimedia.org/T224072) [08:37:23] (03CR) 10Mathew.onipe: Add maps reboot cookbook (035 comments) [cookbooks] - 10https://gerrit.wikimedia.org/r/511819 
(https://phabricator.wikimedia.org/T224072) (owner: 10Mathew.onipe) [08:38:28] (03CR) 10jerkins-bot: [V: 04-1] Add maps reboot cookbook [cookbooks] - 10https://gerrit.wikimedia.org/r/511819 (https://phabricator.wikimedia.org/T224072) (owner: 10Mathew.onipe) [08:43:02] 10Operations, 10SRE-Access-Requests: Typo in workboard column name: "Confirmation" - https://phabricator.wikimedia.org/T225696 (10Aklapper) Thanks. Meh, I cannot edit that column because "Members of the project "acl*sre-team" can take this action."... [08:44:54] (03CR) 10Mathew.onipe: "recheck" [cookbooks] - 10https://gerrit.wikimedia.org/r/511819 (https://phabricator.wikimedia.org/T224072) (owner: 10Mathew.onipe) [08:46:21] (03CR) 10jerkins-bot: [V: 04-1] Add maps reboot cookbook [cookbooks] - 10https://gerrit.wikimedia.org/r/511819 (https://phabricator.wikimedia.org/T224072) (owner: 10Mathew.onipe) [08:50:19] 10Operations, 10MediaWiki-Releasing, 10Parsoid, 10Release-Engineering-Team: signatures were invalid: EXPKEYSIG 90E9F83F22250DD7 MediaWiki releases repository - https://phabricator.wikimedia.org/T225601 (10fgiunchedi) a:03fgiunchedi I'll be looking into renewing this key [08:50:39] (03PS1) 10Jcrespo: labsdb: Setup labsdb1010 as a web wikireplica [puppet] - 10https://gerrit.wikimedia.org/r/516749 [08:55:16] (03CR) 10Jcrespo: [C: 03+2] labsdb: Setup labsdb1010 as a web wikireplica [puppet] - 10https://gerrit.wikimedia.org/r/516749 (owner: 10Jcrespo) [08:55:18] (03PS1) 10DCausse: [cirrus] Use correct factory declaration for EntityFullTextQueryBuilder [mediawiki-config] - 10https://gerrit.wikimedia.org/r/516750 (https://phabricator.wikimedia.org/T216429) [08:56:11] (03CR) 10jerkins-bot: [V: 04-1] [cirrus] Use correct factory declaration for EntityFullTextQueryBuilder [mediawiki-config] - 10https://gerrit.wikimedia.org/r/516750 (https://phabricator.wikimedia.org/T216429) (owner: 10DCausse) [08:56:47] (03CR) 10Mathew.onipe: "pylint is failing to run causing build to fail." 
[cookbooks] - 10https://gerrit.wikimedia.org/r/511819 (https://phabricator.wikimedia.org/T224072) (owner: 10Mathew.onipe) [08:57:00] volans ^ [08:57:06] can you take a look pls [08:57:58] (03PS2) 10DCausse: [cirrus] Use correct factory declaration for EntityFullTextQueryBuilder [mediawiki-config] - 10https://gerrit.wikimedia.org/r/516750 (https://phabricator.wikimedia.org/T216429) [08:58:04] onimisionipe: sure, I'll try sometime today between sessions [08:58:15] Ok. thanks! [09:10:31] PROBLEM - mailman_queue_size on fermium is CRITICAL: CRITICAL: 1 mailman queue(s) above limits (thresholds: bounces: 25 in: 25 virgin: 25) https://wikitech.wikimedia.org/wiki/Mailman [09:11:12] (03PS1) 10Filippo Giunchedi: releases: update expired gpg key [puppet] - 10https://gerrit.wikimedia.org/r/516752 (https://phabricator.wikimedia.org/T225601) [09:11:51] (03CR) 10jerkins-bot: [V: 04-1] releases: update expired gpg key [puppet] - 10https://gerrit.wikimedia.org/r/516752 (https://phabricator.wikimedia.org/T225601) (owner: 10Filippo Giunchedi) [09:15:25] (03CR) 10Filippo Giunchedi: [C: 03+1] "Looks like CI is looking into the file itself to check for python shebang, but doesn't like binary files (rake's setup_python_extensions)." [puppet] - 10https://gerrit.wikimedia.org/r/516752 (https://phabricator.wikimedia.org/T225601) (owner: 10Filippo Giunchedi) [09:23:31] RECOVERY - mailman_queue_size on fermium is OK: OK: mailman queues are below the limits. https://wikitech.wikimedia.org/wiki/Mailman [09:28:16] (03PS10) 10CRusnov: profile::netbox: Reorganize for splitting front and back-end. 
[puppet] - 10https://gerrit.wikimedia.org/r/514395 (https://phabricator.wikimedia.org/T223291) [09:32:03] (03PS2) 10Filippo Giunchedi: releases: update expired gpg key [puppet] - 10https://gerrit.wikimedia.org/r/516752 (https://phabricator.wikimedia.org/T225601) [09:32:50] (03CR) 10jerkins-bot: [V: 04-1] releases: update expired gpg key [puppet] - 10https://gerrit.wikimedia.org/r/516752 (https://phabricator.wikimedia.org/T225601) (owner: 10Filippo Giunchedi) [09:33:15] (03CR) 10Filippo Giunchedi: [V: 03+2 C: 03+2] releases: update expired gpg key [puppet] - 10https://gerrit.wikimedia.org/r/516752 (https://phabricator.wikimedia.org/T225601) (owner: 10Filippo Giunchedi) [09:37:49] (03PS1) 10Matthias Mullie: Consistent beta wikidata urls, without www [mediawiki-config] - 10https://gerrit.wikimedia.org/r/516753 [09:42:36] (03CR) 10Reedy: [C: 04-1] "Tentative CR-1" (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/516753 (owner: 10Matthias Mullie) [09:42:59] 10Operations, 10MediaWiki-Releasing, 10Parsoid, 10Release-Engineering-Team, 10Patch-For-Review: signatures were invalid: EXPKEYSIG 90E9F83F22250DD7 MediaWiki releases repository - https://phabricator.wikimedia.org/T225601 (10fgiunchedi) Ok this should be done now, the new... [09:47:58] (03PS11) 10CRusnov: profile::netbox: Reorganize for splitting front and back-end. [puppet] - 10https://gerrit.wikimedia.org/r/514395 (https://phabricator.wikimedia.org/T223291) [09:48:40] 10Operations, 10MediaWiki-Releasing, 10Parsoid, 10Release-Engineering-Team, 10Patch-For-Review: signatures were invalid: EXPKEYSIG 90E9F83F22250DD7 MediaWiki releases repository - https://phabricator.wikimedia.org/T225601 (10fgiunchedi) Instructions at https://wikitech.wi... 
[10:02:47] 10Operations, 10Cloud-Services, 10wikitech.wikimedia.org: Investigate issues with wikitech-static.wikimedia.org - https://phabricator.wikimedia.org/T156570 (10ArielGlenn) 05Open→03Resolved a:03ArielGlenn The search listed as the second issue now works fine. The google result listed as the first issue... [10:05:03] 10Operations, 10ops-eqiad, 10DBA: rack/setup/install (4) dbproxy systems. - https://phabricator.wikimedia.org/T225704 (10RobH) p:05Triage→03Normal [10:05:15] 10Operations, 10ops-eqiad, 10DBA: rack/setup/install (4) dbproxy systems. - https://phabricator.wikimedia.org/T225704 (10RobH) [10:06:27] 10Operations, 10ops-eqiad, 10DBA: rack/setup/install (4) dbproxy systems. - https://phabricator.wikimedia.org/T225704 (10RobH) [10:06:30] 10Operations, 10wikitech.wikimedia.org: wikitech-static cert renewal seems to stop apache2 - https://phabricator.wikimedia.org/T214640 (10ArielGlenn) This has since been set to standalone, and new certs were generated. See T204840#5243222 for the context. Should this task remain open? [10:13:03] (03CR) 10Filippo Giunchedi: [C: 03+1] dsa-check-hpssacli: import latest version from DSA [puppet] - 10https://gerrit.wikimedia.org/r/516724 (owner: 10Faidon Liambotis) [10:14:42] (03CR) 10Filippo Giunchedi: [C: 03+1] dsa-check-hpssacli: make compatible with ssacli [puppet] - 10https://gerrit.wikimedia.org/r/516726 (owner: 10Faidon Liambotis) [10:18:16] 10Operations, 10ops-eqiad, 10DBA: eqiad: rack/setup/install (4) dbproxy systems. - https://phabricator.wikimedia.org/T225704 (10Marostegui) [10:19:03] 10Operations, 10ops-eqiad, 10DBA: eqiad: rack/setup/install (4) dbproxy systems. - https://phabricator.wikimedia.org/T225704 (10Marostegui) [10:26:19] 10Operations, 10serviceops, 10Core Platform Team Backlog (Watching / External), 10SCB, 10Services (watching): Upgrade python-service-checker across the fleet - https://phabricator.wikimedia.org/T225707 (10mobrovac) [10:29:57] (03CR) 10Filippo Giunchedi: "Very nice! 
LGTM from a perl-untrained eye. Another good target for testing I think would be WMCS boxes and DBs which have different raid c" (031 comment) [puppet] - 10https://gerrit.wikimedia.org/r/516725 (https://phabricator.wikimedia.org/T210723) (owner: 10Faidon Liambotis) [10:31:00] (03PS1) 10Marostegui: install_server: Allow installation of new dbproxy [puppet] - 10https://gerrit.wikimedia.org/r/516758 (https://phabricator.wikimedia.org/T225704) [10:36:16] 10Operations, 10ops-eqiad, 10DBA, 10Patch-For-Review: eqiad: rack/setup/install (4) dbproxy systems. - https://phabricator.wikimedia.org/T225704 (10Marostegui) I updated the task description, but for the record: Racking Proposal: Install one per row. If possible, avoid installing them in the same rack of... [10:41:54] 10Operations, 10Operations-Software-Development: Error while checking binary files for python shebang - https://phabricator.wikimedia.org/T225710 (10fgiunchedi) [10:43:04] (03PS1) 10Cparle: Add 'sms' and 'smn' langcodes to commons for use in captions [mediawiki-config] - 10https://gerrit.wikimedia.org/r/516760 (https://phabricator.wikimedia.org/T222309) [10:46:07] 10Operations, 10Traffic, 10Core Platform Team Backlog (Designing), 10MW-1.34-notes (1.34.0-wmf.6; 2019-05-21), and 6 others: Harmonise the identification of requests across our stack - https://phabricator.wikimedia.org/T201409 (10mobrovac) [10:47:46] 10Operations, 10Wikimedia-Mailing-lists: Wikimedia-au-members and wikimedia-au-announce password reset - https://phabricator.wikimedia.org/T225712 (10StevenCrossin) [10:53:58] (03CR) 10Matthias Mullie: [C: 03+1] "LGTM, will deploy next week" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/516760 (https://phabricator.wikimedia.org/T222309) (owner: 10Cparle) [10:58:29] PROBLEM - puppet last run on bast3002 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. Could be an interrupted request or a dependency cycle. 
[11:01:21] 10Operations, 10media-storage: CPU scaling governor on ms-be hosts - https://phabricator.wikimedia.org/T225713 (10fgiunchedi) [11:02:39] 10Operations, 10media-storage: CPU scaling governor on ms-be hosts - https://phabricator.wikimedia.org/T225713 (10fgiunchedi) [11:03:42] (03CR) 10Vgutierrez: [C: 04-1] "Looks good in general, I couldn't find a related commit allowing the load balancers to reach the configured ports in cloudelastic[1001-100" (035 comments) [puppet] - 10https://gerrit.wikimedia.org/r/512925 (https://phabricator.wikimedia.org/T224324) (owner: 10EBernhardson) [11:22:04] 10Operations, 10Wikimedia-Mailing-lists: Wikimedia-au-members and wikimedia-au-announce password reset - https://phabricator.wikimedia.org/T225712 (10StevenCrossin) Never mind this has been sorted out on our end [11:25:37] RECOVERY - puppet last run on bast3002 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures [11:29:26] 10Operations, 10Wikimedia-Mailing-lists: Wikimedia-au-members and wikimedia-au-announce password reset - https://phabricator.wikimedia.org/T225712 (10Aklapper) @StevenCrossin: If there is nothing to do, feel free to set the status of this report to "Declined" via the {nav name=Add Action... > Change Status} dr... 
[11:30:19] 10Operations, 10Wikimedia-Mailing-lists: Wikimedia-au-members and wikimedia-au-announce password reset - https://phabricator.wikimedia.org/T225712 (10StevenCrossin) 05Open→03Declined Closed as sorted [11:37:39] PROBLEM - HTTP availability for Nginx -SSL terminators- at ulsfo on icinga1001 is CRITICAL: cluster=cache_text site=ulsfo https://grafana.wikimedia.org/dashboard/db/frontend-traffic?panelId=4&fullscreen&refresh=1m&orgId=1 [11:38:03] PROBLEM - Text HTTP 5xx reqs/min on graphite1004 is CRITICAL: CRITICAL: 22.22% of data above the critical threshold [1000.0] https://grafana.wikimedia.org/dashboard/file/varnish-aggregate-client-status-codes?panelId=3&fullscreen&orgId=1&var-site=All&var-cache_type=text&var-status_type=5 [11:38:15] PROBLEM - Esams HTTP 5xx reqs/min on graphite1004 is CRITICAL: CRITICAL: 11.11% of data above the critical threshold [1000.0] https://grafana.wikimedia.org/dashboard/file/varnish-aggregate-client-status-codes?panelId=3&fullscreen&orgId=1&var-site=esams&var-cache_type=All&var-status_type=5 [11:38:27] PROBLEM - HTTP availability for Nginx -SSL terminators- at codfw on icinga1001 is CRITICAL: cluster=cache_text site=codfw https://grafana.wikimedia.org/dashboard/db/frontend-traffic?panelId=4&fullscreen&refresh=1m&orgId=1 [11:40:33] RECOVERY - HTTP availability for Nginx -SSL terminators- at ulsfo on icinga1001 is OK: All metrics within thresholds. https://grafana.wikimedia.org/dashboard/db/frontend-traffic?panelId=4&fullscreen&refresh=1m&orgId=1 [11:41:21] RECOVERY - HTTP availability for Nginx -SSL terminators- at codfw on icinga1001 is OK: All metrics within thresholds. 
https://grafana.wikimedia.org/dashboard/db/frontend-traffic?panelId=4&fullscreen&refresh=1m&orgId=1 [11:45:00] 10Operations, 10MediaWiki-Releasing, 10Parsoid, 10Release-Engineering-Team: signatures were invalid: EXPKEYSIG 90E9F83F22250DD7 MediaWiki releases repository - https://phabricator.wikimedia.org/T225601 (10Tkshamburg) Hi @fgiunchedi , thanks for creating the new key (now 10... [11:46:47] RECOVERY - Text HTTP 5xx reqs/min on graphite1004 is OK: OK: Less than 1.00% above the threshold [250.0] https://grafana.wikimedia.org/dashboard/file/varnish-aggregate-client-status-codes?panelId=3&fullscreen&orgId=1&var-site=All&var-cache_type=text&var-status_type=5 [11:46:59] RECOVERY - Esams HTTP 5xx reqs/min on graphite1004 is OK: OK: Less than 1.00% above the threshold [250.0] https://grafana.wikimedia.org/dashboard/file/varnish-aggregate-client-status-codes?panelId=3&fullscreen&orgId=1&var-site=esams&var-cache_type=All&var-status_type=5 [12:21:57] 10Operations, 10media-storage: CPU scaling governor on HP Gen9 hosts - https://phabricator.wikimedia.org/T225713 (10faidon) [12:22:57] 10Operations, 10MediaWiki-Releasing, 10Parsoid, 10Release-Engineering-Team: signatures were invalid: EXPKEYSIG 90E9F83F22250DD7 MediaWiki releases repository - https://phabricator.wikimedia.org/T225601 (10fgiunchedi) >>! In T225601#5256399, @Tkshamburg wrote: > Hi @fgiunche... [12:32:38] 10Operations, 10MediaWiki-Releasing, 10Parsoid, 10Release-Engineering-Team: signatures were invalid: EXPKEYSIG 90E9F83F22250DD7 MediaWiki releases repository - https://phabricator.wikimedia.org/T225601 (10Tkshamburg) Everything is fine now, "apt update" shows no errors now.... 
[12:54:45] 10Operations, 10DC-Ops, 10Traffic: poll power data for redeployment of esams/knams - https://phabricator.wikimedia.org/T225720 (10RobH) p:05Triage→03Normal [12:56:25] 10Operations, 10DC-Ops, 10Traffic: poll power data for redeployment of esams/knams - https://phabricator.wikimedia.org/T225720 (10RobH) ` 5 $> ssh cr2-esams.wikimedia.org --- JUNOS 13.3R8.7 built 2015-10-23 21:23:16 UTC {master} robh@re0.cr2-esams> show power ^ syntax error, expectin... [13:09:57] 10Operations, 10DC-Ops, 10Traffic: poll power data for redeployment of esams/knams - https://phabricator.wikimedia.org/T225720 (10RobH) [13:10:54] 10Operations, 10ops-eqiad, 10DBA, 10Patch-For-Review: eqiad: rack/setup/install (4) dbproxy systems. - https://phabricator.wikimedia.org/T225704 (10jcrespo) Wait, some of these will go to the cloud racks, that needs planing! [13:11:33] 10Operations, 10DC-Ops, 10Traffic: poll power data for redeployment of esams/knams - https://phabricator.wikimedia.org/T225720 (10RobH) My understanding is we won't be using any MX80s when this is all done, so I did not pull that info. I'm not sure of the peak usage hours for each site, or if there is a jui... [13:17:45] 10Operations, 10DC-Ops, 10Traffic: poll power data for redeployment of esams/knams - https://phabricator.wikimedia.org/T225720 (10ayounsi) I don't think Junos have that feature. You can find the peak time of a device using their "overall traffic" graph in LibreNMS (eg. https://librenms.wikimedia.org/device/... [13:18:17] 10Operations, 10MediaWiki-Releasing, 10Parsoid, 10Release-Engineering-Team: signatures were invalid: EXPKEYSIG 90E9F83F22250DD7 MediaWiki releases repository - https://phabricator.wikimedia.org/T225601 (10fgiunchedi) 05Open→03Resolved No problem @Tkshamburg ! Thanks for... 
[13:33:06] 10Operations, 10media-storage, 10observability: swift-drive-audit unmounting a drive doesn't produce any alerts or notifications - https://phabricator.wikimedia.org/T222362 (10fgiunchedi) [13:33:08] 10Operations, 10ops-codfw, 10media-storage, 10observability, 10User-fgiunchedi: ms-be2043 'sdd' throwing lots of errors - https://phabricator.wikimedia.org/T222654 (10fgiunchedi) 05Open→03Resolved All done, resolving. [13:49:54] 10Operations, 10Discovery, 10Discovery-Analysis, 10Product-Analytics, and 3 others: Setup a mirror for R language dependencies (CRAN) - https://phabricator.wikimedia.org/T170995 (10hashar) 05Open→03Declined maybe one day if we look again at R [13:51:52] (03CR) 10Bearloga: [C: 03+1] "@Gehel: Deb is okay with sunsetting the Portal stuff so we can proceed with this patch" [puppet] - 10https://gerrit.wikimedia.org/r/504577 (https://phabricator.wikimedia.org/T197138) (owner: 10Bearloga) [13:54:25] (03PS2) 10Matthias Mullie: Consistent beta wikidata urls, without www [mediawiki-config] - 10https://gerrit.wikimedia.org/r/516753 [13:55:38] * debt with a sentimental sigh, yes on portal patch [14:14:38] (03PS2) 10Gehel: profile::discovery_dashboards: remove Wikipedia Portal dashboard [puppet] - 10https://gerrit.wikimedia.org/r/504577 (https://phabricator.wikimedia.org/T197138) (owner: 10Bearloga) [14:15:49] (03CR) 10Gehel: [C: 03+2] profile::discovery_dashboards: remove Wikipedia Portal dashboard [puppet] - 10https://gerrit.wikimedia.org/r/504577 (https://phabricator.wikimedia.org/T197138) (owner: 10Bearloga) [14:18:39] PROBLEM - High CPU load on API appserver on mw1227 is CRITICAL: CRITICAL - load average: 48.17, 27.93, 19.47 [14:19:54] 10Operations, 10ops-eqiad, 10DBA, 10Patch-For-Review: eqiad: rack/setup/install (4) dbproxy systems. - https://phabricator.wikimedia.org/T225704 (10Marostegui) Very good point! 
[14:20:37] PROBLEM - High CPU load on API appserver on mw1222 is CRITICAL: CRITICAL - load average: 50.39, 27.41, 17.94 [14:21:11] PROBLEM - High CPU load on API appserver on mw1231 is CRITICAL: CRITICAL - load average: 66.85, 35.21, 21.32 [14:21:26] 10Operations, 10ops-eqiad, 10DBA, 10Patch-For-Review: eqiad: rack/setup/install (4) dbproxy systems. - https://phabricator.wikimedia.org/T225704 (10Marostegui) [14:22:49] PROBLEM - High CPU load on API appserver on mw1233 is CRITICAL: CRITICAL - load average: 87.18, 45.37, 27.49 [14:23:29] PROBLEM - High CPU load on API appserver on mw1232 is CRITICAL: CRITICAL - load average: 68.32, 36.81, 21.43 [14:24:31] PROBLEM - High CPU load on API appserver on mw1234 is CRITICAL: CRITICAL - load average: 52.33, 29.94, 19.13 [14:27:16] looks like api hosts, I'm taking a look at e.g. mw1222 [14:28:57] * apergos peeks in [14:29:50] load is going back down but I don't know what caused load on api [14:30:15] RECOVERY - High CPU load on API appserver on mw1234 is OK: OK - load average: 11.94, 23.34, 20.63 [14:30:41] RECOVERY - High CPU load on API appserver on mw1232 is OK: OK - load average: 13.22, 23.45, 22.29 [14:35:58] I was looking in logstash for mw1233 and didn't see anything that jumped out as to the number or type of requests really [14:38:41] RECOVERY - High CPU load on API appserver on mw1233 is OK: OK - load average: 8.66, 14.86, 23.39 [14:39:23] RECOVERY - High CPU load on API appserver on mw1222 is OK: OK - load average: 7.88, 14.92, 23.89 [14:41:47] RECOVERY - High CPU load on API appserver on mw1227 is OK: OK - load average: 6.91, 12.69, 23.16 [14:42:49] RECOVERY - High CPU load on API appserver on mw1231 is OK: OK - load average: 6.94, 12.44, 23.70 [15:23:37] PROBLEM - High CPU load on API appserver on mw1233 is CRITICAL: CRITICAL - load average: 63.70, 37.80, 23.12 [15:25:50] (03PS1) 10Filippo Giunchedi: profile: fix swift symlink for WMCS LVs [puppet] - 10https://gerrit.wikimedia.org/r/516791 [15:26:47] scribunto 
whines in hhvm on that box [15:28:01] \nFatal error: entire web request took longer than 200 seconds and timed out in /srv/mediawiki/php-1.34.0-wmf.8/extensions/Scribunto/includes/engines/LuaSandbox/Engine.php on line 282 [15:28:22] 10Operations, 10MediaWiki-Releasing, 10Parsoid: signatures were invalid: EXPKEYSIG 90E9F83F22250DD7 MediaWiki releases repository - https://phabricator.wikimedia.org/T225601 (10greg) [15:28:41] but load already dropping since then [15:28:52] back down to 21 now [15:30:51] RECOVERY - High CPU load on API appserver on mw1233 is OK: OK - load average: 19.31, 24.31, 22.12 [15:31:10] PROBLEM - LVS HTTP IPv4 on wdqs.svc.eqiad.wmnet is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/LVS%23Diagnosing_problems [15:31:42] uh oh [15:32:04] yeah [15:32:32] cpu climbing on wdqs https://grafana.wikimedia.org/d/000000607/cluster-overview?orgId=1&from=now-1h&to=now&var-datasource=eqiad%20prometheus%2Fops&var-cluster=wdqs&var-instance=All [15:32:55] gehel, onimisionipe? [15:33:15] onimisionipe: can you look? 
[15:33:34] yeah looks like cpu is pretty much jammed [15:35:06] for once, it does not seem related to edit load [15:35:39] yup, seeing requests being banned [15:36:54] !log restarting blazegraph on wdqs-internal / eqiad (just in case) [15:36:58] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [15:40:00] PROBLEM - LVS HTTP IPv4 on wdqs.svc.eqiad.wmnet is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/LVS%23Diagnosing_problems [15:42:48] RECOVERY - LVS HTTP IPv4 on wdqs.svc.eqiad.wmnet is OK: HTTP OK: HTTP/1.1 200 OK - 448 bytes in 0.098 second response time https://wikitech.wikimedia.org/wiki/LVS%23Diagnosing_problems [15:43:20] (03PS1) 10CDanis: varnish text FE: ban python-requests User-Agent on WDQS [puppet] - 10https://gerrit.wikimedia.org/r/516793 [15:45:15] (03CR) 10Filippo Giunchedi: [C: 03+1] varnish text FE: ban python-requests User-Agent on WDQS [puppet] - 10https://gerrit.wikimedia.org/r/516793 (owner: 10CDanis) [15:47:56] (03CR) 10Gehel: [C: 03+1] "+1 as a temporary measure" [puppet] - 10https://gerrit.wikimedia.org/r/516793 (owner: 10CDanis) [15:50:10] <_joe_> I'd prefer if we do that at the nginx level [15:50:10] (03PS1) 10CDanis: wdqs: ban disallowed User-Agent at nginx [puppet] - 10https://gerrit.wikimedia.org/r/516794 [15:50:15] <_joe_> ^^ [15:50:18] gehel: please look at ^ instead [15:50:40] (03CR) 10Giuseppe Lavagetto: [C: 03+2] wdqs: ban disallowed User-Agent at nginx [puppet] - 10https://gerrit.wikimedia.org/r/516794 (owner: 10CDanis) [15:50:42] (03Abandoned) 10CDanis: varnish text FE: ban python-requests User-Agent on WDQS [puppet] - 10https://gerrit.wikimedia.org/r/516793 (owner: 10CDanis) [15:56:35] wow [15:56:42] I'm late to the party [15:56:48] * onimisionipe is reading backlog [15:58:32] !log restart blazegraph on wdqs public cluster [15:58:36] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [16:02:18] !log restart blazegraph on wdqs public 
cluster completed
[16:02:22] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[16:22:51] PROBLEM - High CPU load on API appserver on mw1231 is CRITICAL: CRITICAL - load average: 59.18, 28.59, 18.47
[16:27:03] 200 seconds seems like a long time to allow for a web request
[16:27:25] hello!
[16:27:40] is there a cumin release planned soon? i would love to not have to patch cumin on install :)
[16:27:43] /srv/mediawiki/php-1.34.0-wmf.8/extensions/Scribunto/includes/engines/LuaSandbox/Engine.php whines again
[16:28:51] I don't know the release schedule, and the folks who could answer that aren't around right now
[16:30:25] volans: ^^
[16:30:49] anarcat: AFAIK volans is the best one to answer that
[16:31:19] yeah, that's what i figured as well
[16:31:35] I don't see anything right away in phabricator, though I imagine you already looked there
[16:34:55] https://doc.wikimedia.org/cumin/master/release.html seem to be due for one :-)
[16:36:35] i did not, actually - didn't know where to look
[16:36:42] yeah, i looked on pypi and it seemed we're overdue
[16:37:21] RECOVERY - High CPU load on API appserver on mw1231 is OK: OK - load average: 12.74, 18.22, 23.49
[16:38:37] https://phabricator.wikimedia.org/search/query/EVg.YagYZYue/#R or maybe (if you dare) https://phabricator.wikimedia.org/tag/operations-software-development/ but there's other stuff mixed in on the workboard
[16:40:04] * anarcat dares
[16:40:18] undare! undare! undare!
[16:40:20] ;)
[16:40:41] :-D :-D
[16:48:57] (PS1) Vgutierrez: varnish: Rate limit wdqs requests violating UA policy [puppet] - https://gerrit.wikimedia.org/r/516803
[16:58:04] (CR) Ema: [C: +1] varnish: Rate limit wdqs requests violating UA policy [puppet] - https://gerrit.wikimedia.org/r/516803 (owner: Vgutierrez)
[16:58:19] (CR) Vgutierrez: [C: +2] varnish: Rate limit wdqs requests violating UA policy [puppet] - https://gerrit.wikimedia.org/r/516803 (owner: Vgutierrez)
[17:34:48] !log T203254 set cpu scaling governor to performance on labstore1004 and labstore1005
[17:34:53] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:34:53] T203254: labstore1004 and labstore1005 high load issues following upgrades - https://phabricator.wikimedia.org/T203254
[17:44:16] (CR) EBernhardson: "Looking into how to allow the load balancers to reach the configured ports, it seems that is going to be our profile::elasticsearch::cirru" (1 comment) [puppet] - https://gerrit.wikimedia.org/r/512925 (https://phabricator.wikimedia.org/T224324) (owner: EBernhardson)
[17:44:53] !log fdans@deploy1001 Started deploy [analytics/refinery@67b34fe]: deploying refinery source 0.0.92 into refinery
[17:44:56] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[17:53:35] PROBLEM - SSH on proton1001 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/SSH/monitoring
[17:56:19] RECOVERY - SSH on proton1001 is OK: SSH OK - OpenSSH_7.4p1 Debian-10+deb9u3 (protocol 2.0) https://wikitech.wikimedia.org/wiki/SSH/monitoring
[17:57:51] PROBLEM - Request latencies on neon is CRITICAL: instance=10.64.0.40:6443 verb=PATCH https://grafana.wikimedia.org/dashboard/db/kubernetes-api
[17:58:33] PROBLEM - Disk space on notebook1003 is CRITICAL: DISK CRITICAL - free space: /srv 2031 MB (1% inode=86%): https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space
[18:01:38] !log fdans@deploy1001
Finished deploy [analytics/refinery@67b34fe]: deploying refinery source 0.0.92 into refinery (duration: 16m 45s)
[18:01:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:02:11] RECOVERY - Request latencies on neon is OK: All metrics within thresholds. https://grafana.wikimedia.org/dashboard/db/kubernetes-api
[18:05:47] RECOVERY - Disk space on notebook1003 is OK: DISK OK https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space
[18:10:24] !log fdans@deploy1001 Started deploy [analytics/refinery@67b34fe]: retrying deployment of analytics refinery
[18:10:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:10:43] !log fdans@deploy1001 Finished deploy [analytics/refinery@67b34fe]: retrying deployment of analytics refinery (duration: 00m 19s)
[18:10:47] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[18:35:45] (PS5) EBernhardson: LVS for cloudelastic [puppet] - https://gerrit.wikimedia.org/r/512925 (https://phabricator.wikimedia.org/T224324)
[19:06:53] PROBLEM - proton endpoints health on proton1001 is CRITICAL: connect to address 10.64.0.20 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Services/Monitoring/proton
[19:06:59] PROBLEM - dhclient process on proton1001 is CRITICAL: connect to address 10.64.0.20 port 5666: Connection refused
[19:06:59] PROBLEM - Check size of conntrack table on proton1001 is CRITICAL: connect to address 10.64.0.20 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack
[19:07:05] PROBLEM - Disk space on proton1001 is CRITICAL: connect to address 10.64.0.20 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space
[19:07:09] PROBLEM - Check systemd state on proton1001 is CRITICAL: connect to address 10.64.0.20 port 5666: Connection refused
[19:07:13] PROBLEM - configured eth on proton1001 is CRITICAL: connect to address 10.64.0.20 port 5666: Connection refused
[19:07:25] PROBLEM - DPKG on proton1001 is CRITICAL: connect to address 10.64.0.20 port 5666: Connection refused
[19:07:45] PROBLEM - Check whether ferm is active by checking the default input chain on proton1001 is CRITICAL: connect to address 10.64.0.20 port 5666: Connection refused https://wikitech.wikimedia.org/wiki/Monitoring/check_ferm
[19:08:33] PROBLEM - puppet last run on proton1001 is CRITICAL: connect to address 10.64.0.20 port 5666: Connection refused
[19:29:05] RECOVERY - DPKG on proton1001 is OK: All packages OK
[19:29:25] RECOVERY - Check whether ferm is active by checking the default input chain on proton1001 is OK: OK ferm input default policy is set https://wikitech.wikimedia.org/wiki/Monitoring/check_ferm
[19:30:03] RECOVERY - dhclient process on proton1001 is OK: PROCS OK: 0 processes with command name dhclient
[19:30:05] RECOVERY - Check size of conntrack table on proton1001 is OK: OK: nf_conntrack is 0 % full https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack
[19:30:11] RECOVERY - Disk space on proton1001 is OK: DISK OK https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space
[19:30:15] RECOVERY - Check systemd state on proton1001 is OK: OK - running: The system is fully operational
[19:30:17] RECOVERY - puppet last run on proton1001 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[19:30:19] RECOVERY - configured eth on proton1001 is OK: OK - interfaces up
[19:30:54] (PS1) Gehel: wdqs: limit number of messages from the same logger also for file logging. [puppet] - https://gerrit.wikimedia.org/r/516837
[19:59:49] PROBLEM - puppet last run on bast3002 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. Could be an interrupted request or a dependency cycle.
[20:00:24] Operations, MediaWiki-Releasing, Parsoid: debian signing keyid E84AFDD2 has expired - https://phabricator.wikimedia.org/T141400 (Kghbln)
[20:27:01] RECOVERY - puppet last run on bast3002 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[21:02:54] Operations, ops-codfw, netops: Setup new msw1-codfw - https://phabricator.wikimedia.org/T224250 (Papaul)
[21:08:04] I understand everybody is on SRE offsite, but maybe somebody can create next week for https://wikitech.wikimedia.org/wiki/Deployments ?
[21:13:32] releng will do that I believe
[21:14:40] hope so... usually it appears couple of days in advance, but now there's nothing
[21:14:43] * apergos off for real this time (midnight)
[21:34:38] (PS1) Smalyshev: Also ban empty user agents [puppet] - https://gerrit.wikimedia.org/r/516959
[22:18:24] (CR) Smalyshev: [C: +1] wdqs: limit number of messages from the same logger also for file logging. [puppet] - https://gerrit.wikimedia.org/r/516837 (owner: Gehel)
[22:18:56] (CR) Smalyshev: [C: +1] "Given that we're counting the events in metrics, repeated logging messages are not much useful." [puppet] - https://gerrit.wikimedia.org/r/516837 (owner: Gehel)
[22:26:30] (CR) Smalyshev: [cirrus] Use correct factory declaration for EntityFullTextQueryBuilder (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/516750 (https://phabricator.wikimedia.org/T216429) (owner: DCausse)
[22:29:01] (PS1) Bstorm: cloudstore: re-enable diamond collectors for monitoring [puppet] - https://gerrit.wikimedia.org/r/516967 (https://phabricator.wikimedia.org/T225265)
[22:40:43] SMalyshev: I'll get to it in a bit, been a bit crushed with things lately
[22:40:56] greg-g: thanks!
[22:48:23] SMalyshev: done: https://wikitech.wikimedia.org/wiki/Deployments#Week_of_June_17th
[22:58:00] (CR) Bstorm: [C: +2] cloudstore: re-enable diamond collectors for monitoring [puppet] - https://gerrit.wikimedia.org/r/516967 (https://phabricator.wikimedia.org/T225265) (owner: Bstorm)
[23:25:39] !log depooled wdqs1006 to let it catch up quicker
[23:25:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[23:31:06] (PS1) Tim Starling: Add a fatal error page to go with the proposed wmerrors feature [mediawiki-config] - https://gerrit.wikimedia.org/r/516975 (https://phabricator.wikimedia.org/T187147)
[23:32:02] (CR) jerkins-bot: [V: -1] Add a fatal error page to go with the proposed wmerrors feature [mediawiki-config] - https://gerrit.wikimedia.org/r/516975 (https://phabricator.wikimedia.org/T187147) (owner: Tim Starling)
[23:40:24] (PS2) Tim Starling: Add a fatal error page to go with the proposed wmerrors feature [mediawiki-config] - https://gerrit.wikimedia.org/r/516975 (https://phabricator.wikimedia.org/T187147)
[23:41:49] Operations, Release-Engineering-Team, Scap: Enable scap to roll back broken changes to MediaWiki - https://phabricator.wikimedia.org/T225207 (greg)
[23:51:46] (CR) Krinkle: "I've moved the hhvm equivalent to puppet recently, in prep for making it share the error-page.erb template.
Might make sense to do this fo" [mediawiki-config] - https://gerrit.wikimedia.org/r/516975 (https://phabricator.wikimedia.org/T187147) (owner: Tim Starling)
[23:54:17] (CR) Krinkle: Add a fatal error page to go with the proposed wmerrors feature (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/516975 (https://phabricator.wikimedia.org/T187147) (owner: Tim Starling)
[23:57:17] (CR) Smalyshev: [C: +1] [cirrus] Use correct factory declaration for EntityFullTextQueryBuilder (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/516750 (https://phabricator.wikimedia.org/T216429) (owner: DCausse)