[00:06:15] PROBLEM - Filesystem available is greater than filesystem size on ms-be2041 is CRITICAL: cluster=swift device=/dev/sde1 fstype=xfs instance=ms-be2041:9100 job=node mountpoint=/srv/swift-storage/sde1 site=codfw https://grafana.wikimedia.org/dashboard/db/host-overview?orgId=1&var-server=ms-be2041&var-datasource=codfw%2520prometheus%252Fops
[00:06:50] (PS1) Acamicamacaraca: Enable the Visual Editor in the Wikipedia namespace on the Serbian Wikipedia (sr wiki) per task. [mediawiki-config] - https://gerrit.wikimedia.org/r/462190
[00:15:11] (PS1) Zoranzoki21: Enable VisualEditor in Project namespace on srwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/462191 (https://phabricator.wikimedia.org/T205206)
[03:30:45] PROBLEM - MariaDB Slave Lag: s1 on dbstore1002 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 876.83 seconds
[03:53:25] PROBLEM - MediaWiki memcached error rate on graphite1001 is CRITICAL: CRITICAL: 80.00% of data above the critical threshold [5000.0] https://grafana.wikimedia.org/dashboard/db/mediawiki-graphite-alerts?orgId=1&panelId=1&fullscreen
[04:00:04] RECOVERY - MediaWiki memcached error rate on graphite1001 is OK: OK: Less than 40.00% above the threshold [1000.0] https://grafana.wikimedia.org/dashboard/db/mediawiki-graphite-alerts?orgId=1&panelId=1&fullscreen
[04:05:44] RECOVERY - MariaDB Slave Lag: s1 on dbstore1002 is OK: OK slave_sql_lag Replication lag: 280.51 seconds
[04:08:45] PROBLEM - MediaWiki memcached error rate on graphite1001 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [5000.0] https://grafana.wikimedia.org/dashboard/db/mediawiki-graphite-alerts?orgId=1&panelId=1&fullscreen
[04:15:11] (PS1) Andrew Bogott: Horizon: remove region defs for some deleted projects [puppet] - https://gerrit.wikimedia.org/r/462194
[04:19:35] RECOVERY - MediaWiki memcached error rate on graphite1001 is OK: OK: Less than 40.00% above the threshold [1000.0] https://grafana.wikimedia.org/dashboard/db/mediawiki-graphite-alerts?orgId=1&panelId=1&fullscreen
[04:31:15] PROBLEM - Varnish traffic drop between 30min ago and now at eqiad on einsteinium is CRITICAL: 59.78 le 60 https://grafana.wikimedia.org/dashboard/db/varnish-http-requests?panelId=6&fullscreen&orgId=1
[04:32:24] RECOVERY - Varnish traffic drop between 30min ago and now at eqiad on einsteinium is OK: (C)60 le (W)70 le 78.1 https://grafana.wikimedia.org/dashboard/db/varnish-http-requests?panelId=6&fullscreen&orgId=1
[04:32:56] !log krinkle@deploy1001 Synchronized php-1.32.0-wmf.22/includes/user/User.php: T202149 - Ic0c25f66f23f (duration: 00m 53s)
[04:33:42] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[04:33:43] T202149: Exception thrown for failure to save settings appears ~ 1000 times/day - https://phabricator.wikimedia.org/T202149
[06:28:15] PROBLEM - puppet last run on mw1323 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. Failed resources (up to 3 shown): File[/etc/apache2/conf-available/00-defaults.conf]
[06:50:25] PROBLEM - MediaWiki memcached error rate on graphite1001 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [5000.0] https://grafana.wikimedia.org/dashboard/db/mediawiki-graphite-alerts?orgId=1&panelId=1&fullscreen
[06:54:54] RECOVERY - MediaWiki memcached error rate on graphite1001 is OK: OK: Less than 40.00% above the threshold [1000.0] https://grafana.wikimedia.org/dashboard/db/mediawiki-graphite-alerts?orgId=1&panelId=1&fullscreen
[06:58:45] RECOVERY - puppet last run on mw1323 is OK: OK: Puppet is currently enabled, last run 3 minutes ago with 0 failures
[07:00:15] PROBLEM - MediaWiki memcached error rate on graphite1001 is CRITICAL: CRITICAL: 60.00% of data above the critical threshold [5000.0] https://grafana.wikimedia.org/dashboard/db/mediawiki-graphite-alerts?orgId=1&panelId=1&fullscreen
[07:04:35] RECOVERY - MediaWiki memcached error rate on graphite1001 is OK: OK: Less than 40.00% above the threshold [1000.0] https://grafana.wikimedia.org/dashboard/db/mediawiki-graphite-alerts?orgId=1&panelId=1&fullscreen
[08:45:24] PROBLEM - MediaWiki memcached error rate on graphite1001 is CRITICAL: CRITICAL: 80.00% of data above the critical threshold [5000.0] https://grafana.wikimedia.org/dashboard/db/mediawiki-graphite-alerts?orgId=1&panelId=1&fullscreen
[08:56:15] RECOVERY - MediaWiki memcached error rate on graphite1001 is OK: OK: Less than 40.00% above the threshold [1000.0] https://grafana.wikimedia.org/dashboard/db/mediawiki-graphite-alerts?orgId=1&panelId=1&fullscreen
[09:03:15] (PS1) Framawiki: Add ns in wgNamespacesToBeSearchedDefault on frwiktionary [mediawiki-config] - https://gerrit.wikimedia.org/r/462271 (https://phabricator.wikimedia.org/T205198)
[09:11:34] PROBLEM - MediaWiki memcached error rate on graphite1001 is CRITICAL: CRITICAL: 40.00% of data above the critical threshold [5000.0] https://grafana.wikimedia.org/dashboard/db/mediawiki-graphite-alerts?orgId=1&panelId=1&fullscreen
[09:20:05] RECOVERY - MediaWiki memcached error rate on graphite1001 is OK: OK: Less than 40.00% above the threshold [1000.0] https://grafana.wikimedia.org/dashboard/db/mediawiki-graphite-alerts?orgId=1&panelId=1&fullscreen
[09:23:38] (CR) Volans: [C: 2] "recheck" [software/spicerack] - https://gerrit.wikimedia.org/r/460114 (https://phabricator.wikimedia.org/T199079) (owner: Volans)
[09:24:34] (PS1) Volans: Custom fields: add label field [software/netbox] - https://gerrit.wikimedia.org/r/462273 (https://phabricator.wikimedia.org/T199083)
[09:24:49] (Merged) jenkins-bot: mediawiki: improve siteinfo checks [software/spicerack] - https://gerrit.wikimedia.org/r/460114 (https://phabricator.wikimedia.org/T199079) (owner: Volans)
[09:26:40] (Abandoned) Acamicamacaraca: Enable the Visual Editor in the Wikipedia namespace on the Serbian Wikipedia (sr wiki) per task. [mediawiki-config] - https://gerrit.wikimedia.org/r/462190 (owner: Acamicamacaraca)
[09:27:33] (CR) Volans: "We've multiple devices for which the visible label in racktables doesn't match the device name. Adding a custom field to track them." [software/netbox] - https://gerrit.wikimedia.org/r/462273 (https://phabricator.wikimedia.org/T199083) (owner: Volans)
[09:30:06] (CR) Acamicamacaraca: [C: 1] "Good." [mediawiki-config] - https://gerrit.wikimedia.org/r/462191 (https://phabricator.wikimedia.org/T205206) (owner: Zoranzoki21)
[09:33:49] (PS1) Volans: Add cumin1001 IPs and PTRs [dns] - https://gerrit.wikimedia.org/r/462274 (https://phabricator.wikimedia.org/T201346)
[09:39:14] PROBLEM - Varnish traffic drop between 30min ago and now at eqiad on einsteinium is CRITICAL: 59.65 le 60 https://grafana.wikimedia.org/dashboard/db/varnish-http-requests?panelId=6&fullscreen&orgId=1
[09:42:24] PROBLEM - Varnish traffic drop between 30min ago and now at eqiad on einsteinium is CRITICAL: 57.6 le 60 https://grafana.wikimedia.org/dashboard/db/varnish-http-requests?panelId=6&fullscreen&orgId=1
[09:44:34] RECOVERY - Varnish traffic drop between 30min ago and now at eqiad on einsteinium is OK: (C)60 le (W)70 le 77.63 https://grafana.wikimedia.org/dashboard/db/varnish-http-requests?panelId=6&fullscreen&orgId=1
[09:53:44] (PS1) Volans: cumin: installation of cumin1001 [puppet] - https://gerrit.wikimedia.org/r/462278 (https://phabricator.wikimedia.org/T201346)
[09:59:11] (CR) Volans: "Compiler results for cumin2001 (as cumin1001 doesn't have facts and cannot be compiled yet):" [puppet] - https://gerrit.wikimedia.org/r/462278 (https://phabricator.wikimedia.org/T201346) (owner: Volans)
[10:36:39] (PS2) Volans: Tests: improve naming for SSH key file [software/cumin] - https://gerrit.wikimedia.org/r/458357
[10:36:58] (CR) Volans: "done" (1 comment) [software/cumin] - https://gerrit.wikimedia.org/r/458357 (owner: Volans)
[12:10:38] (CR) Faidon Liambotis: Custom fields: add label field (1 comment) [software/netbox] - https://gerrit.wikimedia.org/r/462273 (https://phabricator.wikimedia.org/T199083) (owner: Volans)
[12:44:25] PROBLEM - HHVM rendering on mw1285 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:45:25] RECOVERY - HHVM rendering on mw1285 is OK: HTTP OK: HTTP/1.1 200 OK - 75970 bytes in 0.181 second response time
[12:49:57] (PS2) Volans: Custom fields: add label field [software/netbox] - https://gerrit.wikimedia.org/r/462273 (https://phabricator.wikimedia.org/T199083)
[13:40:47] Operations, ops-codfw, DBA: db2042 (m3) master RAID battery failed - https://phabricator.wikimedia.org/T202051 (Marostegui)
[14:19:35] Operations, Beta-Cluster-Infrastructure, DNS, Traffic, and 3 others: Ferm/DNS library weirdness causing puppet errors on some deployment-prep instances - https://phabricator.wikimedia.org/T153468 (Krenair) It looks like my net-dns-users subscription got approved some time between July 25 and Aug...
[14:51:33] Operations, Beta-Cluster-Infrastructure, DNS, Traffic, and 3 others: Ferm/DNS library weirdness causing puppet errors on some deployment-prep instances - https://phabricator.wikimedia.org/T153468 (Krenair) And actually now that I've done this and investigated some more I'm confident enough to jus...
[15:28:02] Is anybody here? Could you invite me to #mediawiki_security? I can't join if not
[15:34:19] Operations, Beta-Cluster-Infrastructure, DNS, Traffic, and 3 others: Ferm/DNS library weirdness causing puppet errors on some deployment-prep instances - https://phabricator.wikimedia.org/T153468 (Krenair) Okay here's what I'm gonna send to upstream bug tracking: ```After a bit of investigation i...
[15:37:56] Operations, Beta-Cluster-Infrastructure, DNS, Traffic, and 3 others: Ferm/DNS library weirdness causing puppet errors on some deployment-prep instances - https://phabricator.wikimedia.org/T153468 (Krenair) Upstreamed, skipping ferm and going straight to Net::DNS: https://rt.cpan.org/Ticket/Displa...
[15:38:51] Operations, Beta-Cluster-Infrastructure, DNS, Traffic, and 3 others: Ferm's upstream Net::DNS Perl library bad handling of NOERROR responses without records causing puppet errors when we try to @resolve AAAA in labs - https://phabricator.wikimedia.org/T153468 (Krenair)
[16:12:32] https://commons.wikimedia.org/w/index.php?page=File%3AJasmyn_Maher_the_faget.jpg&title=Special%3ALog
[16:12:42] we have a major issue here ^
[16:12:58] a deleted file without a deletion log entry
[16:13:18] how do we know it got deleted?
[16:13:48] well, it is deleted, and we have an upload log
[16:14:04] probably removed by staff using a script?
[16:14:20] why would they do that?
[16:14:23] (CPROT, etc...)
[16:14:27] Depends what's in the image yannf
[16:14:41] Where is the upload log yannf?
[16:14:47] the most egregious paedophilic images get taken down, fully, afaik
[16:15:13] just speculating here, I'm not even logged in
[16:15:44] ah yes, it may be oversighted
[16:16:01] but shouldn't still have a delete log?
[16:16:18] if it were suppressed it wouldn't be in the deletion log
[16:16:22] it'd be in the suppression log
[16:16:38] which isn't a public log
[16:17:58] even for admins?
[16:18:12] yes
[16:18:18] ah ok
[16:18:29] I didn't know that
[16:18:54] ok, so nvm
[16:18:56] there are 4 people in the oversight group who may be able to find more
[16:19:04] I don't know if they'll tell you something with that name was suppressed
[16:19:28] I just thought it was a bug
[16:19:28] if it were the contents that were problematic rather than name, maybe, idk
[16:19:46] it's also possible as Hauskatze says that someone removed it on the server-side
[16:21:11] to me, it doesn't make sense to suppress the deletion log if the upload log is kept
[16:21:22] the name of the file is still there
[16:21:37] but well, it's a minor issue
[16:22:16] I don't see an upload log entry for it?
[16:22:45] https://commons.wikimedia.org/w/index.php?page=File%3AJasmyn_Maher_the_faget.jpg&title=Special%3ALog
[16:22:59] Yes I saw that link
[16:23:04] 'No matching items in log.'
[16:23:49] I can see that this user uploaded that file on 15 June 2016
[16:24:19] but I can't see who delete it, and who oversight it
[16:24:24] *d
[16:24:50] interesting
[16:25:18] you have admin rights on commons right?
[16:26:15] yes
[16:26:20] from https://commons.wikimedia.org/wiki/Special:Undelete/File:Jasmyn_Maher_the_faget.jpg
[16:26:42] I can see that the file was 3,840 × 5,760 (4,783,146 bytes)
[16:27:07] to me, all this is not logical
[16:27:28] You can see that it was deleted on Special:Undelete but can't see the file? I would expect such a case indicates suppression
[16:28:00] yes, but I can't see who delete it, and at which date, etc.
[16:28:18] yes which likely would mean suppression
[16:35:34] Operations, Beta-Cluster-Infrastructure, Wikidata, wikidata-tech-focus, and 3 others: Run mediawiki::maintenance scripts in Beta Cluster - https://phabricator.wikimedia.org/T125976 (MarcoAurelio) On the other hand, if purge_checkuser detects CheckUser is not installed it will just print that the...
[16:53:52] Operations, Beta-Cluster-Infrastructure, Wikidata, wikidata-tech-focus, and 3 others: Run mediawiki::maintenance scripts in Beta Cluster - https://phabricator.wikimedia.org/T125976 (Reedy) Make it do a file existence && run script
[16:59:38] Operations, Beta-Cluster-Infrastructure, Wikidata, wikidata-tech-focus, and 3 others: Run mediawiki::maintenance scripts in Beta Cluster - https://phabricator.wikimedia.org/T125976 (Krenair) We can try, but this is puppet.git, and we may just get a CR-2.
[17:20:33] (CR) Urbanecm: [C: 1] "If they want to welcome users created automatically, then it is fine." [mediawiki-config] - https://gerrit.wikimedia.org/r/462045 (https://phabricator.wikimedia.org/T204405) (owner: Jayprakash12345)
[17:21:42] (CR) Urbanecm: [C: 1] "LGTM" [mediawiki-config] - https://gerrit.wikimedia.org/r/462271 (https://phabricator.wikimedia.org/T205198) (owner: Framawiki)
[17:22:35] (PS2) Urbanecm: Add NS 110 to wgNamespacesToBeSearchedDefault on frwiktionary [mediawiki-config] - https://gerrit.wikimedia.org/r/462271 (https://phabricator.wikimedia.org/T205198) (owner: Framawiki)
[17:23:18] (CR) Urbanecm: [C: 1] Throttle for October 11 event [mediawiki-config] - https://gerrit.wikimedia.org/r/462014 (https://phabricator.wikimedia.org/T204829) (owner: Framawiki)
[17:24:47] (CR) Urbanecm: [C: -1] "This was already done globally. Please abandon the change." [mediawiki-config] - https://gerrit.wikimedia.org/r/461240 (owner: Gergő Tisza)
[17:25:41] (CR) Urbanecm: [C: 1] Use translated MetaNamespace for fy.wiktionary [mediawiki-config] - https://gerrit.wikimedia.org/r/455249 (https://phabricator.wikimedia.org/T202769) (owner: MarcoAurelio)
[17:27:34] (PS2) Urbanecm: Enable Translate on idwikimedia [mediawiki-config] - https://gerrit.wikimedia.org/r/460602 (https://phabricator.wikimedia.org/T204292)
[17:28:38] (CR) Urbanecm: [C: 1] "LGTM" [mediawiki-config] - https://gerrit.wikimedia.org/r/462011 (https://phabricator.wikimedia.org/T205055) (owner: Framawiki)
[17:56:22] (CR) Gergő Tisza: "wikitech ignores global rules." [mediawiki-config] - https://gerrit.wikimedia.org/r/461240 (owner: Gergő Tisza)
[18:58:19] (CR) Alex Monk: [C: 1] "looks correct to me. wikitech doesn't add to the defaults it overrides." [mediawiki-config] - https://gerrit.wikimedia.org/r/461240 (owner: Gergő Tisza)
[19:58:35] Operations, DBA, JADE, Patch-For-Review, and 2 others: Write our anticipated "phase two" schemas and submit for review - https://phabricator.wikimedia.org/T202596 (Ladsgroup) It's a little bit hard to understand the query in P7570 (for example do you mean page_title.judgment_page instead of judgm...
[21:31:55] Operations, Beta-Cluster-Infrastructure, DNS, Traffic, and 3 others: Ferm's upstream Net::DNS Perl library bad handling of NOERROR responses without records causing puppet errors when we try to @resolve AAAA in labs - https://phabricator.wikimedia.org/T153468 (Krenair) A comment on the Net::DNS t...
[21:37:10] (CR) Krinkle: [C: 2] profiler: Include MediaWiki post-send in XHGui profiles [mediawiki-config] - https://gerrit.wikimedia.org/r/462189 (owner: Krinkle)
[21:38:24] (Merged) jenkins-bot: profiler: Include MediaWiki post-send in XHGui profiles [mediawiki-config] - https://gerrit.wikimedia.org/r/462189 (owner: Krinkle)
[21:40:34] (CR) jenkins-bot: profiler: Include MediaWiki post-send in XHGui profiles [mediawiki-config] - https://gerrit.wikimedia.org/r/462189 (owner: Krinkle)
[21:41:44] !log krinkle@deploy1001 Synchronized wmf-config/profiler.php: (no justification provided) (duration: 00m 52s)
[21:41:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[21:42:04] Krinkle: zomg sunday deploys :)