[02:00:27] RECOVERY - Disk space on Hadoop worker on an-worker1087 is OK: DISK OK https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Administration
[02:02:19] RECOVERY - Disk space on Hadoop worker on an-worker1090 is OK: DISK OK https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Administration
[02:43:05] Can someone take a look at https://commons.wikimedia.org/wiki/File:John_Carmack_-_The_Dawn_of_Mobile_VR_-_Game_Developer_Conference_2015.jpg
[02:45:17] It throws "Wikibase\DataModel\Services\Lookup\EntityLookupException" when trying to display the file description page
[05:19:51] PROBLEM - Disk space on alsafi is CRITICAL: DISK CRITICAL - free space: / 340 MB (3% inode=89%): https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space https://grafana.wikimedia.org/dashboard/db/host-overview?var-server=alsafi&var-datasource=codfw+prometheus/ops
[05:41:23] (PS9) Jeena Huneidi: Add restbase chart (port from local-charts) [deployment-charts] - https://gerrit.wikimedia.org/r/517557 (https://phabricator.wikimedia.org/T224935)
[06:25:59] RECOVERY - Disk space on alsafi is OK: DISK OK https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space https://grafana.wikimedia.org/dashboard/db/host-overview?var-server=alsafi&var-datasource=codfw+prometheus/ops
[07:22:03] PROBLEM - HHVM rendering on mw1229 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Application_servers
[07:23:29] RECOVERY - HHVM rendering on mw1229 is OK: HTTP OK: HTTP/1.1 200 OK - 74493 bytes in 0.139 second response time https://wikitech.wikimedia.org/wiki/Application_servers
[08:18:37] (PS4) Volans: transports: add JunOS transport [software/homer] - https://gerrit.wikimedia.org/r/533558 (https://phabricator.wikimedia.org/T228388)
[08:18:39] (PS3) Volans: config: inject role and site to the configuration [software/homer] - https://gerrit.wikimedia.org/r/533568 (https://phabricator.wikimedia.org/T228388)
[08:18:41] (PS4) Volans: CLI: suppress ncclient noisy logger [software/homer] - https://gerrit.wikimedia.org/r/533570 (https://phabricator.wikimedia.org/T228388)
[08:18:43] (PS1) Volans: config: enforce positional vs. keyword args [software/homer] - https://gerrit.wikimedia.org/r/533623
[08:26:35] PROBLEM - Disk space on Hadoop worker on an-worker1078 is CRITICAL: DISK CRITICAL - free space: /var/lib/hadoop/data/e 16 GB (0% inode=99%): /var/lib/hadoop/data/m 30 GB (0% inode=99%): /var/lib/hadoop/data/l 27 GB (0% inode=99%): /var/lib/hadoop/data/g 26 GB (0% inode=99%): https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Administration
[08:34:25] RECOVERY - Disk space on Hadoop worker on an-worker1078 is OK: DISK OK https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Administration
[10:42:54] (PS1) DannyS712: Add autopatrolled user group to az.wikibooks [mediawiki-config] - https://gerrit.wikimedia.org/r/533625 (https://phabricator.wikimedia.org/T231493)
[10:45:09] (PS2) DannyS712: Add autopatrolled user group to az.wikibooks [mediawiki-config] - https://gerrit.wikimedia.org/r/533625 (https://phabricator.wikimedia.org/T231493)
[12:27:09] (PS1) MarcoAurelio: Add .gitreview [debs/prometheus-swagger-exporter] - https://gerrit.wikimedia.org/r/533630
[12:27:45] (CR) MarcoAurelio: [V: +2 C: +2] Add .gitreview [debs/prometheus-swagger-exporter] - https://gerrit.wikimedia.org/r/533630 (owner: MarcoAurelio)
[13:20:17] (PS2) Krinkle: Avoid localised url computation for P3P headers from CentralAuth [mediawiki-config] - https://gerrit.wikimedia.org/r/532268 (https://phabricator.wikimedia.org/T189966)
[13:20:51] * Krinkle testing patch on deploy1001/mwdwbug1002 (not for deploy)
[13:22:55] Krinkle, Thank you so much! \o/
[13:23:12] Everything OK now.
[13:23:28] I am talking about https://phabricator.wikimedia.org/T231061
[13:23:42] Tulsi: glad to hear it :)
[13:24:23] :D
[13:27:50] (CR) Krinkle: [C: +2] "before:" [mediawiki-config] - https://gerrit.wikimedia.org/r/532268 (https://phabricator.wikimedia.org/T189966) (owner: Krinkle)
[13:28:49] (Merged) jenkins-bot: Avoid localised url computation for P3P headers from CentralAuth [mediawiki-config] - https://gerrit.wikimedia.org/r/532268 (https://phabricator.wikimedia.org/T189966) (owner: Krinkle)
[13:29:06] (CR) jenkins-bot: Avoid localised url computation for P3P headers from CentralAuth [mediawiki-config] - https://gerrit.wikimedia.org/r/532268 (https://phabricator.wikimedia.org/T189966) (owner: Krinkle)
[13:33:16] !log krinkle@deploy1001 Synchronized wmf-config/CommonSettings.php: 88ba4f8f4d49 (duration: 00m 55s)
[13:33:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[15:26:16] Operations, Maps, Product-Infrastructure-Team-Backlog: Lake Huron missing due to apparent OSM vandalism - https://phabricator.wikimedia.org/T231691 (Mholloway)
[19:57:30] Operations, MediaWiki-Maintenance-scripts, serviceops: Stop forcing RUNNER=php for foreachwiki/foreachwikiindblist - https://phabricator.wikimedia.org/T230110 (Jdforrester-WMF) This is done as of https://gerrit.wikimedia.org/r/c/operations/puppet/+/425027 on 12 August, right?
[20:46:22] (CR) Krinkle: "Yeah, that's presumably also a blocker for pre-calculate locally. Unless we remove a lot of features and complexity, that's going to be fr" [mediawiki-config] - https://gerrit.wikimedia.org/r/533593 (https://phabricator.wikimedia.org/T223602) (owner: Jforrester)
[20:47:35] (CR) Krinkle: "(continuing on task)" [mediawiki-config] - https://gerrit.wikimedia.org/r/533593 (https://phabricator.wikimedia.org/T223602) (owner: Jforrester)
[22:11:02] PROBLEM - HHVM rendering on mw1316 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Application_servers
[22:12:30] RECOVERY - HHVM rendering on mw1316 is OK: HTTP OK: HTTP/1.1 200 OK - 74626 bytes in 1.614 second response time https://wikitech.wikimedia.org/wiki/Application_servers
[23:27:34] Operations, Phabricator, Traffic, Release-Engineering-Team (Development services), and 2 others: Prepare Phame to support heavy traffic for a Tech Department blog - https://phabricator.wikimedia.org/T226044 (Urbanecm) [bulk] Setting points to "", given it doesn't make any sense to have them as "0".
[23:27:54] PROBLEM - High average POST latency for mw requests on appserver in eqiad on icinga1001 is CRITICAL: cluster=appserver code=200 handler=proxy:unix:/run/php/fpm-www.sock https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=eqiad+prometheus/ops&var-cluster=appserver&var-method=POST
[23:29:30] RECOVERY - High average POST latency for mw requests on appserver in eqiad on icinga1001 is OK: All metrics within thresholds. https://wikitech.wikimedia.org/wiki/Monitoring/Missing_notes_link https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?panelId=9&fullscreen&orgId=1&from=now-3h&to=now&var-datasource=eqiad+prometheus/ops&var-cluster=appserver&var-method=POST