[00:59:12] twentyafterfour re: package is it jessie vs trusty maybe?
[01:50:35] Operations, Jupyter-Hub: notebook1001 shown as DOWN in icinga, due to firewall rules - https://phabricator.wikimedia.org/T138685#2407379 (Peachey88)
[02:28:19] !log mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.6) (duration: 10m 48s)
[02:28:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[02:46:29] !log mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.7) (duration: 08m 15s)
[02:46:38] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[02:52:49] !log l10nupdate@tin ResourceLoader cache refresh completed at Sun Jun 26 02:52:48 UTC 2016 (duration 6m 19s)
[02:52:54] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[03:17:39] Operations, Traffic, Community-Liaisons (Jul-Sep-2016): Help contact bot owners about the end of HTTP access to the API - https://phabricator.wikimedia.org/T136674#2407413 (Whatamidoing-WMF) Electron Bot is supposed to be fixed now.
[04:04:24] PROBLEM - puppet last run on mw1248 is CRITICAL: CRITICAL: puppet fail
[04:33:22] RECOVERY - puppet last run on mw1248 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:31:43] PROBLEM - puppet last run on mw1135 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:32:02] PROBLEM - puppet last run on cp3017 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:32:03] PROBLEM - puppet last run on kafka1002 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:32:12] PROBLEM - puppet last run on mw2158 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:32:33] PROBLEM - puppet last run on db2044 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:32:34] PROBLEM - puppet last run on cp4010 is CRITICAL: CRITICAL: Puppet has 2 failures
[06:32:34] PROBLEM - puppet last run on db2055 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:34:23] PROBLEM - puppet last run on mw1119 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:35:02] PROBLEM - puppet last run on aqs1002 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:35:33] PROBLEM - puppet last run on mw1170 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:51:14] PROBLEM - puppet last run on mw2196 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:56:23] RECOVERY - puppet last run on kafka1002 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures
[06:56:53] RECOVERY - puppet last run on db2044 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures
[06:56:54] RECOVERY - puppet last run on db2055 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures
[06:57:03] RECOVERY - puppet last run on aqs1002 is OK: OK: Puppet is currently enabled, last run 18 seconds ago with 0 failures
[06:57:33] RECOVERY - puppet last run on mw1170 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:58:14] RECOVERY - puppet last run on mw1135 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:58:34] RECOVERY - puppet last run on cp3017 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[06:58:43] RECOVERY - puppet last run on mw1119 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[06:58:44] RECOVERY - puppet last run on mw2158 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[06:59:13] RECOVERY - puppet last run on cp4010 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[07:16:31] RECOVERY - puppet last run on mw2196 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures
[07:20:11] PROBLEM - puppet last run on elastic1008 is CRITICAL: CRITICAL: Puppet has 1 failures
[07:46:42] RECOVERY - puppet last run on elastic1008 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[09:10:11] PROBLEM - check_mysql on fdb2001 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 2038
[09:15:11] RECOVERY - check_mysql on fdb2001 is OK: Uptime: 1076376 Threads: 2 Questions: 8963293 Slow queries: 6243 Opens: 853 Flush tables: 2 Open tables: 576 Queries per second avg: 8.327 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 737
[09:34:02] PROBLEM - puppet last run on mw2214 is CRITICAL: CRITICAL: puppet fail
[10:03:02] RECOVERY - puppet last run on mw2214 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[10:13:04] (PS1) Urbanecm: Increase move rate limit for extendedmovers to 16/60 [mediawiki-config] - https://gerrit.wikimedia.org/r/296077 (https://phabricator.wikimedia.org/T138703)
[10:15:20] (PS2) Urbanecm: Increase move rate limit for extendedmovers in enwiki to 16/60 [mediawiki-config] - https://gerrit.wikimedia.org/r/296077 (https://phabricator.wikimedia.org/T138703)
[11:05:25] (PS4) Dzahn: Rewrite rules for git.wikimedia.org [puppet] - https://gerrit.wikimedia.org/r/294867 (https://phabricator.wikimedia.org/T137224) (owner: 20after4)
[11:06:38] (CR) Dzahn: [C: 2] Rewrite rules for git.wikimedia.org [puppet] - https://gerrit.wikimedia.org/r/294867 (https://phabricator.wikimedia.org/T137224) (owner: 20after4)
[11:11:50] (PS2) Dzahn: diamond: move _lib classes to own files [puppet] - https://gerrit.wikimedia.org/r/295982
[11:13:54] (CR) Dzahn: [C: 2] diamond: move _lib classes to own files [puppet] - https://gerrit.wikimedia.org/r/295982 (owner: Dzahn)
[11:14:29] (CR) Dzahn: [C: 2] lint-ignore arrows in tests/server [puppet/kafka] - https://gerrit.wikimedia.org/r/295986 (owner: Dzahn)
[11:16:37] (PS1) Dzahn: kafka: bump submodule for lint fix [puppet] - https://gerrit.wikimedia.org/r/296083
[11:17:16] (CR) Dzahn: [C: 2] kafka: bump submodule for lint fix [puppet] - https://gerrit.wikimedia.org/r/296083 (owner: Dzahn)
[11:17:38] (PS2) Dzahn: openstack: move instancersync define to own file [puppet] - https://gerrit.wikimedia.org/r/295985
[11:19:39] (PS3) Dzahn: openstack: move instancersync define to own file [puppet] - https://gerrit.wikimedia.org/r/295985
[11:23:49] (PS3) Dzahn: Install arcanist from apt rather than git. [puppet] - https://gerrit.wikimedia.org/r/295975 (owner: 20after4)
[11:24:19] (PS4) Dzahn: openstack: move instancersync define to own file [puppet] - https://gerrit.wikimedia.org/r/295985
[11:25:02] (CR) Dzahn: [C: 2] "http://puppet-compiler.wmflabs.org/3201/" [puppet] - https://gerrit.wikimedia.org/r/295985 (owner: Dzahn)
[11:26:20] (Abandoned) Dzahn: switch git.wikimedia.org from misc to text cluster [dns] - https://gerrit.wikimedia.org/r/293747 (https://phabricator.wikimedia.org/T123718) (owner: Dzahn)
[11:55:09] (CR) Dzahn: "can you please confirm this works on iridium? it looks like i always get a redirect to twentyafterfour: are you around?
[12:00:21] twentyafterfour: i merged the git.wm.org rewrite rules to be on iridium (but of course not yet switched varnish)
[12:01:21] looks like they might not work and the logging config is wrong
[12:01:41] we need a test on a real phabricator instance apparently
[12:01:56] and it wasnt done on phab-0X
[12:08:52] (CR) Dzahn: "doesn't work. breaks old bugzilla.wm.org URLs, redirects everything to diffusion, apparently was NOT actually tested" [puppet] - https://gerrit.wikimedia.org/r/294867 (https://phabricator.wikimedia.org/T137224) (owner: 20after4)
[12:09:05] (PS1) Dzahn: Revert "Rewrite rules for git.wikimedia.org" [puppet] - https://gerrit.wikimedia.org/r/296085
[12:09:13] (PS1) Paladox: Revert "Rewrite rules for git.wikimedia.org" [puppet] - https://gerrit.wikimedia.org/r/296086
[12:09:24] (PS2) Paladox: Revert "Rewrite rules for git.wikimedia.org" [puppet] - https://gerrit.wikimedia.org/r/296086
[12:09:37] (PS2) Dzahn: Revert "Rewrite rules for git.wikimedia.org" [puppet] - https://gerrit.wikimedia.org/r/296085
[12:10:23] (CR) Dzahn: [C: 2] Revert "Rewrite rules for git.wikimedia.org" [puppet] - https://gerrit.wikimedia.org/r/296085 (owner: Dzahn)
[12:21:08] (Abandoned) Paladox: Revert "Rewrite rules for git.wikimedia.org" [puppet] - https://gerrit.wikimedia.org/r/296086 (owner: Paladox)
[12:21:13] Operations, Release-Engineering-Team, Developer-notice, Gitblit-Deprecate, and 2 others: Redirect Gitblit urls (git.wikimedia.org) -> Diffusion urls (phabricator.wikimedia.org/diffusion) - https://phabricator.wikimedia.org/T137224#2407860 (Dzahn) a:Dzahn>None I merged the change that puts...
[12:21:29] (PS1) Paladox: Rewrite rules for git.wikimedia.org [puppet] - https://gerrit.wikimedia.org/r/296138
[12:21:45] (PS2) Paladox: Rewrite rules for git.wikimedia.org [puppet] - https://gerrit.wikimedia.org/r/296138 (https://phabricator.wikimedia.org/T137224)
[12:22:19] Operations, Release-Engineering-Team, Developer-notice, Gitblit-Deprecate, and 2 others: Redirect Gitblit urls (git.wikimedia.org) -> Diffusion urls (phabricator.wikimedia.org/diffusion) - https://phabricator.wikimedia.org/T137224#2407865 (Dzahn) The announced date should be put on hold until we...
[12:37:57] Operations, Research-and-Data-Backlog, Research-management, Revision-Scoring-As-A-Service, and 3 others: [Epic] Deploy Revscoring/ORES service in Prod - https://phabricator.wikimedia.org/T106867#1480216 (He7d3r) Now we have https://ores.wmflabs.org/ and https://ores.wikimedia.org/ Was this task a...
[12:40:31] (PS1) Aude: Get descriptions from pageterms for Wikidata [mediawiki-config] - https://gerrit.wikimedia.org/r/296153 (https://phabricator.wikimedia.org/T138705)
[13:07:44] Operations, Research-and-Data-Backlog, Research-management, Revision-Scoring-As-A-Service, and 3 others: [Epic] Deploy Revscoring/ORES service in Prod - https://phabricator.wikimedia.org/T106867#2407914 (JanZerebecki) It was not about the first one.
[13:09:15] (PS1) KartikMistry: apertium-isl-eng: New upstream, rebuild for Jessie [debs/contenttranslation/apertium-isl-eng] - https://gerrit.wikimedia.org/r/296157 (https://phabricator.wikimedia.org/T107306)
[13:09:45] Operations, ContentTranslation-Deployments, ContentTranslation-cxserver, MediaWiki-extensions-ContentTranslation, and 4 others: Package and test apertium for Jessie - https://phabricator.wikimedia.org/T107306#2407934 (KartikMistry)
[13:15:25] (PS1) KartikMistry: apertium-id-ms: Rebuild for Jessie and cleanup [debs/contenttranslation/apertium-id-ms] - https://gerrit.wikimedia.org/r/296159 (https://phabricator.wikimedia.org/T107306)
[13:16:26] (PS1) Andrew Bogott: Lower RAM overcommit ratio from 1.5 to 1.3 [puppet] - https://gerrit.wikimedia.org/r/296160
[13:16:28] (PS1) Andrew Bogott: Temporarily remove labvirt1001 and labvirt1010 from scheduling pool [puppet] - https://gerrit.wikimedia.org/r/296161
[13:18:52] PROBLEM - mobileapps endpoints health on scb1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[13:18:54] (CR) Andrew Bogott: [C: 2] Lower RAM overcommit ratio from 1.5 to 1.3 [puppet] - https://gerrit.wikimedia.org/r/296160 (owner: Andrew Bogott)
[13:19:12] (CR) Andrew Bogott: [C: 2] Temporarily remove labvirt1001 and labvirt1010 from scheduling pool [puppet] - https://gerrit.wikimedia.org/r/296161 (owner: Andrew Bogott)
[13:21:02] RECOVERY - mobileapps endpoints health on scb1001 is OK: All endpoints are healthy
[13:27:13] (PS1) KartikMistry: apertium-pt-gl: Rebuild for Jessie, cleanup [debs/contenttranslation/apertium-pt-gl] - https://gerrit.wikimedia.org/r/296162 (https://phabricator.wikimedia.org/T107306)
[13:27:34] Operations, ContentTranslation-Deployments, ContentTranslation-cxserver, MediaWiki-extensions-ContentTranslation, and 4 others: Package and test apertium for Jessie - https://phabricator.wikimedia.org/T107306#2407943 (KartikMistry)
[13:28:23] PROBLEM - nova-compute process on labvirt1005 is CRITICAL: PROCS CRITICAL: 0 processes with regex args ^/usr/bin/python /usr/bin/nova-compute
[13:33:20] andrewboggot ^
[13:33:30] looking
[13:33:42] ok
[13:34:57] (PS1) KartikMistry: apertium-pt-ca: Rebuild for Jessie, cleanup. [debs/contenttranslation/apertium-pt-ca] - https://gerrit.wikimedia.org/r/296164 (https://phabricator.wikimedia.org/T107306)
[13:35:09] RECOVERY - nova-compute process on labvirt1005 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/nova-compute
[13:35:26] andrewbogott did you fix it or did it come back up?
[13:35:44] I restarted it
[13:35:52] ah ok
[13:35:55] is this all just the pressure?
[13:36:55] I don't know. I put in some nova.conf changes
[13:37:07] Maybe it's dying as puppet rolls them out, but I don't know why. Still looking
[13:38:07] other nodes don't seem to have a problem
[13:39:03] ok
[13:51:52] (PS1) Andrew Bogott: Increase RAM overprovision ratio a bit. [puppet] - https://gerrit.wikimedia.org/r/296165
[14:36:22] (PS2) Andrew Bogott: Increase RAM and disk overprovision ratio a bit. [puppet] - https://gerrit.wikimedia.org/r/296165
[14:38:08] (CR) Andrew Bogott: [C: 2] Increase RAM and disk overprovision ratio a bit. [puppet] - https://gerrit.wikimedia.org/r/296165 (owner: Andrew Bogott)
[14:52:49] PROBLEM - puppet last run on mw1131 is CRITICAL: CRITICAL: Puppet has 15 failures
[14:55:00] (PS1) Andrew Bogott: Add labvirt1011 to scheduling pool. [puppet] - https://gerrit.wikimedia.org/r/296168
[14:56:35] (CR) Andrew Bogott: [C: 2 V: 2] Add labvirt1011 to scheduling pool. [puppet] - https://gerrit.wikimedia.org/r/296168 (owner: Andrew Bogott)
[15:03:59] Request from 88.149.203.34 via cp3007 cp3007, Varnish XID 1062031423 Error: 503, Backend fetch failed at Sun, 26 Jun 2016 15:03:23 GMT
[15:04:05] I had just typed a long email on OTRS....
[15:04:58] It sent the email after I pressed try again...just notifying you that I got a 503...problems incoming perhaps? oh well...
[15:18:39] RECOVERY - puppet last run on mw1131 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[15:47:49] RECOVERY - Router interfaces on cr2-eqiad is OK: OK: host 208.80.154.197, interfaces up: 202, down: 0, dormant: 0, excluded: 0, unused: 0
[17:12:57] PROBLEM - puppet last run on mw2222 is CRITICAL: CRITICAL: puppet fail
[17:39:35] RECOVERY - puppet last run on mw2222 is OK: OK: Puppet is currently enabled, last run 29 seconds ago with 0 failures
[18:08:17] Operations, Wikimedia-SVG-rendering, Upstream: PNG thumbnail preview of SVG misses some text - https://phabricator.wikimedia.org/T123106#2408409 (Aklapper)
[18:08:37] Operations, Wikimedia-SVG-rendering, Upstream: PNG thumbnail preview of SVG misses some text - https://phabricator.wikimedia.org/T123106#1921435 (Aklapper)
[19:34:06] I was recently browsing the Zuul graphs and kept misreading "Gearman" as "German"... I need more sleep.
[20:24:39] Operations, Discovery, Maps, Maps-data, hardware-requests: 2 servers for maps-beta cluster - https://phabricator.wikimedia.org/T138600#2408545 (Yurik)
[20:24:42] Operations, Discovery, Maps, Maps-data, Maps-Sprint: Ensure Maps servers can be installed easily (automation + documentation) - https://phabricator.wikimedia.org/T138501#2408547 (Yurik)
[20:24:44] Operations, Discovery, Maps, Maps-data: Maps - enable Geoshapes on production - https://phabricator.wikimedia.org/T138525#2408546 (Yurik)
[20:24:56] Operations, Discovery, Maps, Maps-data, and 2 others: Configure new maps servers in eqiad - https://phabricator.wikimedia.org/T138092#2408553 (Yurik)
[20:24:59] Operations, Discovery, Maps, Maps-data: Improve automation around Maps servers - https://phabricator.wikimedia.org/T138017#2408554 (Yurik)
[20:25:02] Operations, Discovery, Maps, Maps-data, Epic: Epic: cultivating the Maps garden - https://phabricator.wikimedia.org/T137616#2408555 (Yurik)
[20:25:10] Operations, Discovery, Maps, Maps-data: Tune thread for osm2pgsql / postgres max connections for Maps - https://phabricator.wikimedia.org/T137229#2408558 (Yurik)
[20:25:18] Operations, Discovery, Maps, Maps-data, and 2 others: Configure monitoring / alerting of Postgresql / redis / ... cluster for maps - https://phabricator.wikimedia.org/T135647#2408561 (Yurik)
[21:02:02] Operations, ORES, Revision-Scoring-As-A-Service, Traffic, HTTPS: https://ores.wikimedia.org redirects me to HTTP when I don't include a trailing slash - https://phabricator.wikimedia.org/T138682#2408595 (Ladsgroup) This PR will fix it: https://github.com/wiki-ai/ores/pull/152