[00:02:59] Operations, PHP 7.0 support: Audit and sync INI settings as needed between HHVM and PHP 7 - https://phabricator.wikimedia.org/T211488 (Krinkle) p:Triage>Normal
[00:08:51] RECOVERY - puppet last run on cloudnet1004 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures
[00:39:53] Operations, CirrusSearch, Discovery-Search: Find an alternative to HHVM curl connection pooling for PHP 7 - https://phabricator.wikimedia.org/T210717 (Krinkle)
[02:12:49] (PS21) Krinkle: [tests] Ensure only existing wikis are referenced from IS.php [mediawiki-config] - https://gerrit.wikimedia.org/r/467448 (https://phabricator.wikimedia.org/T115138) (owner: Urbanecm)
[02:14:00] (CR) jerkins-bot: [V: -1] [tests] Ensure only existing wikis are referenced from IS.php [mediawiki-config] - https://gerrit.wikimedia.org/r/467448 (https://phabricator.wikimedia.org/T115138) (owner: Urbanecm)
[02:14:36] (CR) Krinkle: [tests] Ensure only existing wikis are referenced from IS.php (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/467448 (https://phabricator.wikimedia.org/T115138) (owner: Urbanecm)
[02:16:43] PROBLEM - puppet last run on labtestservices2002 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[02:26:13] Operations, monitoring, Patch-For-Review: Provision >= 50% of statsd/Graphite-only metrics in Prometheus - https://phabricator.wikimedia.org/T205870 (Krinkle)
[02:29:36] Operations, monitoring, Patch-For-Review: Provision >= 50% of statsd/Graphite-only metrics in Prometheus - https://phabricator.wikimedia.org/T205870 (Krinkle)
[02:35:10] Operations, Performance-Team, monitoring, Patch-For-Review: Provision >= 50% of statsd/Graphite-only metrics in Prometheus - https://phabricator.wikimedia.org/T205870 (Krinkle) > [ ] performance (generated from where?)
@Peter When I looked at the data under `performance.*` on the Graphite serve...
[02:42:41] RECOVERY - puppet last run on labtestservices2002 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[02:49:45] PROBLEM - Apache HTTP on mw1281 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:50:49] RECOVERY - Apache HTTP on mw1281 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 616 bytes in 0.046 second response time
[03:00:05] Operations, ops-codfw, Core Platform Team, Services (doing), and 2 others: Reshape RESTBase Cassandra cluster for server refresh - https://phabricator.wikimedia.org/T210843 (Eevans)
[03:00:09] !log decommissioning cassandra-b, restbase2003 -- T210843
[03:00:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[03:00:13] T210843: Reshape RESTBase Cassandra cluster for server refresh - https://phabricator.wikimedia.org/T210843
[03:33:59] PROBLEM - MariaDB Slave Lag: s1 on dbstore1002 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 894.80 seconds
[04:17:11] Operations, Operations-Software-Development, Patch-For-Review: Develop and deploy at least three Netbox reports to assist with data correctness and consistency - https://phabricator.wikimedia.org/T205899 (crusnov) Note that Netbox does not allow one to create a second device with the same asset tag,...
[04:23:35] RECOVERY - MariaDB Slave Lag: s1 on dbstore1002 is OK: OK slave_sql_lag Replication lag: 241.49 seconds
[04:46:50] (PS1) CRusnov: Add "Coherence" check [software/netbox-reports] - https://gerrit.wikimedia.org/r/478458
[04:59:19] PROBLEM - puppet last run on wdqs1003 is CRITICAL: CRITICAL: Catalog fetch fail.
Either compilation failed or puppetmaster has issues
[05:21:59] PROBLEM - Memory correctable errors -EDAC- on thumbor1004 is CRITICAL: 4.001 ge 4 https://grafana.wikimedia.org/dashboard/db/host-overview?orgId=1var-server=thumbor1004var-datasource=eqiad%2520prometheus%252Fops
[05:30:29] RECOVERY - puppet last run on wdqs1003 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures
[06:29:29] PROBLEM - puppet last run on analytics1028 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 3 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/usr/local/bin/puppet-enabled]
[07:00:39] RECOVERY - puppet last run on analytics1028 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures
[07:29:51] PROBLEM - Check systemd state on ms-be2015 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[07:34:37] RECOVERY - Check systemd state on ms-be2015 is OK: OK - running: The system is fully operational
[07:52:18] (PS1) Zoranzoki21: Merge branch 'master' of ssh://gerrit.wikimedia.org:29418/operations/mediawiki-config [mediawiki-config] - https://gerrit.wikimedia.org/r/478462
[07:52:20] (PS1) Zoranzoki21: Disable unused Flow extension on de.wikiversity [mediawiki-config] - https://gerrit.wikimedia.org/r/478463 (https://phabricator.wikimedia.org/T207626)
[07:53:03] (PS1) Zoranzoki21: Disable unused Flow extension on ur.wikibooks [mediawiki-config] - https://gerrit.wikimedia.org/r/478464 (https://phabricator.wikimedia.org/T207627)
[07:53:56] (Abandoned) Zoranzoki21: Merge branch 'master' of ssh://gerrit.wikimedia.org:29418/operations/mediawiki-config [mediawiki-config] - https://gerrit.wikimedia.org/r/478462 (owner: Zoranzoki21)
[07:57:59] PROBLEM - Restbase edge esams on text-lb.esams.wikimedia.org is CRITICAL: /api/rest_v1/data/citation/{format}/{query} (Get citation for Darth Vader) timed out before a response was received
[07:59:03] RECOVERY - Restbase
edge esams on text-lb.esams.wikimedia.org is OK: All endpoints are healthy
[07:59:33] (PS1) Zoranzoki21: Remove FlaggedRevs for ptwikipedia [mediawiki-config] - https://gerrit.wikimedia.org/r/478465 (https://phabricator.wikimedia.org/T211433)
[08:12:14] (PS1) Zoranzoki21: Add http://idb.ub.uni-tuebingen.de/digitue to the wgCopyUploadsDomains [mediawiki-config] - https://gerrit.wikimedia.org/r/478466 (https://phabricator.wikimedia.org/T211466)
[08:27:19] (PS1) Gergő Tisza: Add some missing groups to the privileged list [mediawiki-config] - https://gerrit.wikimedia.org/r/478467
[10:05:15] Operations, Citoid, Regression, VisualEditor (Current work): Some regressions in production with Zotero translation-server in production at all - https://phabricator.wikimedia.org/T211114 (Mvolz)
[10:05:18] Operations, Citoid, Regression, VisualEditor (Current work): QIDs work locally but not in production with new translation-server - https://phabricator.wikimedia.org/T211148 (Mvolz) Open>Resolved
[10:06:16] Operations, Citoid, Regression, VisualEditor (Current work): QIDs work locally but not in production with new translation-server - https://phabricator.wikimedia.org/T211148 (Mvolz) Seems to be working now: https://en.wikipedia.org/api/rest_v1/data/citation/mediawiki/Q33415777 May have been timin...
[11:18:50] (PS3) Volans: validator: complete refactor of the validation [dns] - https://gerrit.wikimedia.org/r/478416 (https://phabricator.wikimedia.org/T182028)
[11:19:29] (CR) Volans: "Updated paste https://phabricator.wikimedia.org/P7896 with latest version output example." [dns] - https://gerrit.wikimedia.org/r/478416 (https://phabricator.wikimedia.org/T182028) (owner: Volans)
[11:29:35] (CR) Volans: "Nice! Minor comments inline."
(5 comments) [software/netbox-reports] - https://gerrit.wikimedia.org/r/478458 (owner: CRusnov)
[11:33:53] (PS1) Reedy: Re-enable EP namespaces [mediawiki-config] - https://gerrit.wikimedia.org/r/478473 (https://phabricator.wikimedia.org/T211494)
[11:55:17] !log decommissioning cassandra-c, restbase2003 -- T210843
[11:55:23] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[11:55:23] T210843: Reshape RESTBase Cassandra cluster for server refresh - https://phabricator.wikimedia.org/T210843
[12:43:37] godog: o/
[12:43:45] godog: thanks; beat me to it again :)
[12:45:29] Operations, ops-codfw, Core Platform Team, Services (doing), and 2 others: Reshape RESTBase Cassandra cluster for server refresh - https://phabricator.wikimedia.org/T210843 (Eevans)
[13:01:24] (PS1) Urbanecm: [typo] ruwikivnews instead of ruwikinews [mediawiki-config] - https://gerrit.wikimedia.org/r/478476
[13:02:31] (PS22) Urbanecm: [tests] Ensure only existing wikis are referenced from IS.php [mediawiki-config] - https://gerrit.wikimedia.org/r/467448 (https://phabricator.wikimedia.org/T115138)
[13:02:55] (CR) Urbanecm: "PS22: Rebased on https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/478476, so tests vote +2."
[mediawiki-config] - https://gerrit.wikimedia.org/r/467448 (https://phabricator.wikimedia.org/T115138) (owner: Urbanecm)
[13:05:22] (PS23) Urbanecm: [tests] Ensure only existing wikis are referenced from IS.php [mediawiki-config] - https://gerrit.wikimedia.org/r/467448 (https://phabricator.wikimedia.org/T115138)
[13:05:50] (CR) Urbanecm: [tests] Ensure only existing wikis are referenced from IS.php (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/467448 (https://phabricator.wikimedia.org/T115138) (owner: Urbanecm)
[13:06:10] (CR) jerkins-bot: [V: -1] [tests] Ensure only existing wikis are referenced from IS.php [mediawiki-config] - https://gerrit.wikimedia.org/r/467448 (https://phabricator.wikimedia.org/T115138) (owner: Urbanecm)
[13:08:21] (PS24) Urbanecm: [tests] Ensure only existing wikis are referenced from IS.php [mediawiki-config] - https://gerrit.wikimedia.org/r/467448 (https://phabricator.wikimedia.org/T115138)
[13:11:18] (PS2) Urbanecm: [typo] Use ruwikinews instead of ruwikivnews [mediawiki-config] - https://gerrit.wikimedia.org/r/478476
[13:14:02] (PS25) Urbanecm: [tests] Ensure only existing wikis are referenced from IS.php [mediawiki-config] - https://gerrit.wikimedia.org/r/467448 (https://phabricator.wikimedia.org/T115138)
[14:01:27] (CR) Urbanecm: [C: 1] "LGTM" [mediawiki-config] - https://gerrit.wikimedia.org/r/478466 (https://phabricator.wikimedia.org/T211466) (owner: Zoranzoki21)
[18:00:55] (CR) Krinkle: [C: 2] [tests] Ensure only existing wikis are referenced from IS.php (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/467448 (https://phabricator.wikimedia.org/T115138) (owner: Urbanecm)
[18:02:50] (CR) Krinkle: [C: 2] [typo] Use ruwikinews instead of ruwikivnews [mediawiki-config] - https://gerrit.wikimedia.org/r/478476 (owner: Urbanecm)
[18:04:00] (Merged) jenkins-bot: [typo] Use ruwikinews instead of ruwikivnews [mediawiki-config] -
https://gerrit.wikimedia.org/r/478476 (owner: Urbanecm)
[18:04:03] (Merged) jenkins-bot: [tests] Ensure only existing wikis are referenced from IS.php [mediawiki-config] - https://gerrit.wikimedia.org/r/467448 (https://phabricator.wikimedia.org/T115138) (owner: Urbanecm)
[18:09:10] (CR) jenkins-bot: [typo] Use ruwikinews instead of ruwikivnews [mediawiki-config] - https://gerrit.wikimedia.org/r/478476 (owner: Urbanecm)
[18:09:25] (CR) jenkins-bot: [tests] Ensure only existing wikis are referenced from IS.php [mediawiki-config] - https://gerrit.wikimedia.org/r/467448 (https://phabricator.wikimedia.org/T115138) (owner: Urbanecm)
[18:32:16] Operations, Security-Team, Security: Fetching ORES API from en.wikipedia.org blocked in debug mode - https://phabricator.wikimedia.org/T211511 (Krinkle)
[18:32:20] Operations, Security-Team, Security: Fetching ORES API from en.wikipedia.org blocked in debug mode - https://phabricator.wikimedia.org/T211511 (Krinkle)
[18:32:29] Operations, Security-Team: Fetching ORES API from en.wikipedia.org blocked in debug mode - https://phabricator.wikimedia.org/T211511 (Krinkle)
[18:37:10] Operations: "sql" command fails with "sh: 1: mysql: not found" on mwdebug1002 - https://phabricator.wikimedia.org/T211512 (Krinkle)
[18:39:49] !log krinkle@deploy1001 Synchronized wmf-config/InitialiseSettings.php: Icb9ad2f554e1 - Fix ruwikinews logo config (duration: 00m 57s)
[18:39:52] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[19:32:13] Operations: "sql" command fails with "sh: 1: mysql: not found" on mwdebug1002 - https://phabricator.wikimedia.org/T211512 (Marostegui) I don't know if it used to work on the old hosts. MySQL client isn't installed there - not sure if it used to be and "got lost" when migrating to stretch.
[20:15:35] Operations, ops-codfw, Core Platform Team, Services (doing), and 2 others: Reshape RESTBase Cassandra cluster for server refresh - https://phabricator.wikimedia.org/T210843 (Eevans)
[20:16:51] !log decommissioning cassandra-a, restbase2004 -- T210843
[20:16:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[20:16:55] T210843: Reshape RESTBase Cassandra cluster for server refresh - https://phabricator.wikimedia.org/T210843
[20:39:40] (PS1) Robingan7: Add logos. [mediawiki-config] - https://gerrit.wikimedia.org/r/478498
[20:39:42] (CR) Welcome, new contributor!: "Thank you for making your first contribution to Wikimedia! :) To learn how to get your code changes reviewed faster and more likely to get" [mediawiki-config] - https://gerrit.wikimedia.org/r/478498 (owner: Robingan7)
[21:18:35] PROBLEM - Citoid LVS eqiad on citoid.svc.eqiad.wmnet is CRITICAL: /api (Scrapes sample page) timed out before a response was received
[21:19:41] RECOVERY - Citoid LVS eqiad on citoid.svc.eqiad.wmnet is OK: All endpoints are healthy
[21:27:09] (CR) CRusnov: Add "Coherence" check (3 comments) [software/netbox-reports] - https://gerrit.wikimedia.org/r/478458 (owner: CRusnov)
[21:27:29] PROBLEM - Restbase edge esams on text-lb.esams.wikimedia.org is CRITICAL: /api/rest_v1/data/citation/{format}/{query} (Get citation for Darth Vader) timed out before a response was received
[21:27:33] (CR) CRusnov: Add "Coherence" check (1 comment) [software/netbox-reports] - https://gerrit.wikimedia.org/r/478458 (owner: CRusnov)
[21:28:13] PROBLEM - Citoid LVS eqiad on citoid.svc.eqiad.wmnet is CRITICAL: /api (Scrapes sample page) timed out before a response was received
[21:30:55] (PS2) Robingan7: Add logos.
[mediawiki-config] - https://gerrit.wikimedia.org/r/478498
[21:30:59] RECOVERY - Restbase edge esams on text-lb.esams.wikimedia.org is OK: All endpoints are healthy
[21:31:43] RECOVERY - Citoid LVS eqiad on citoid.svc.eqiad.wmnet is OK: All endpoints are healthy
[21:58:09] (PS3) Robingan7: Add logos. [mediawiki-config] - https://gerrit.wikimedia.org/r/478498
[22:01:15] (CR) Ebe123: [C: -1] "Should be frwikinews-2x.png, testwiki-1.5x.png, frwikinews-1.5x.png, testwiki-2x.png" [mediawiki-config] - https://gerrit.wikimedia.org/r/478498 (owner: Robingan7)
[22:14:36] (PS4) Robingan7: Add logos. [mediawiki-config] - https://gerrit.wikimedia.org/r/478498
[22:29:00] (PS5) Robingan7: Add logos. [mediawiki-config] - https://gerrit.wikimedia.org/r/478498
[22:36:05] (PS6) Robingan7: Add logos. [mediawiki-config] - https://gerrit.wikimedia.org/r/478498
[22:39:54] (CR) Krinkle: redirects.dat - split non-canonical to separate section (1 comment) [puppet] - https://gerrit.wikimedia.org/r/292785 (https://phabricator.wikimedia.org/T133548) (owner: BBlack)
[22:39:57] beta cluster is down? "Request from [my IP] via deployment-cache-text05 deployment-cache-text05, Varnish XID 64223614 Error: 503, Backend fetch failed at Sun, 09 Dec 2018 22:38:57 GMT"
[22:40:09] (CR) Krinkle: [C: -1] "potential mistake, at least inconsistent with docs."
[puppet] - https://gerrit.wikimedia.org/r/292785 (https://phabricator.wikimedia.org/T133548) (owner: BBlack)
[22:41:21] I get the same error at both https://en.wikipedia.beta.wmflabs.org and https://deployment.wikimedia.beta.wmflabs.org/
[22:42:31] kosta filed T211524
[22:42:32] T211524: Unable to access en/ko.wikipedia.beta.wmflabs.org - https://phabricator.wikimedia.org/T211524
[22:59:29] (PS1) Krinkle: tests: Assert LabsServices contains all prod keys [mediawiki-config] - https://gerrit.wikimedia.org/r/478569 (https://phabricator.wikimedia.org/T211526)
[23:00:15] (CR) jerkins-bot: [V: -1] tests: Assert LabsServices contains all prod keys [mediawiki-config] - https://gerrit.wikimedia.org/r/478569 (https://phabricator.wikimedia.org/T211526) (owner: Krinkle)
[23:03:28] (PS2) Krinkle: tests: Assert LabsServices contains all prod keys [mediawiki-config] - https://gerrit.wikimedia.org/r/478569 (https://phabricator.wikimedia.org/T211526)
[23:04:14] (CR) jerkins-bot: [V: -1] tests: Assert LabsServices contains all prod keys [mediawiki-config] - https://gerrit.wikimedia.org/r/478569 (https://phabricator.wikimedia.org/T211526) (owner: Krinkle)
[23:07:51] (PS2) CRusnov: Add "Coherence" check [software/netbox-reports] - https://gerrit.wikimedia.org/r/478458 (https://phabricator.wikimedia.org/T205899)
[23:08:59] (CR) CRusnov: "Fixed 'done'd below. Added null/blank serial check as a separate check."
(3 comments) [software/netbox-reports] - https://gerrit.wikimedia.org/r/478458 (https://phabricator.wikimedia.org/T205899) (owner: CRusnov)
[23:22:24] Operations, Beta-Cluster-Infrastructure: Unable to access en/ko.wikipedia.beta.wmflabs.org - https://phabricator.wikimedia.org/T211524 (Quiddity)
[23:22:38] Operations, Beta-Cluster-Infrastructure: Unable to access beta cluster - https://phabricator.wikimedia.org/T211524 (Quiddity)
[23:58:40] (PS1) Robingan7: Upload some new logos [mediawiki-config] - https://gerrit.wikimedia.org/r/478570