[00:08:14] (CR) MaxSem: [C: 1] mediawiki/hhvm: Move fatal-error.php to Puppet (1 comment) [puppet] - https://gerrit.wikimedia.org/r/379953 (https://phabricator.wikimedia.org/T113114) (owner: Krinkle)
[00:13:38] Operations, Continuous-Integration-Infrastructure, MediaWiki-Core-Tests, HHVM: HHVM 3.18.5+dfsg-1+wmf3 changes parse_url causing unit tests to fail - https://phabricator.wikimedia.org/T185024#3998309 (Krinkle) The Travis CI jobs for HHVM (3.18, 3.21, and 3.24) are all passing. This task can be c...
[00:15:01] (Restored) Chad: Stop forcing php5 in `mwscript` [puppet] - https://gerrit.wikimedia.org/r/358896 (https://phabricator.wikimedia.org/T146285) (owner: Chad)
[00:15:28] (CR) Krinkle: [C: 1] ":)" [puppet] - https://gerrit.wikimedia.org/r/358896 (https://phabricator.wikimedia.org/T146285) (owner: Chad)
[00:19:09] (PS2) Chad: Stop forcing php5 in `mwscript` [puppet] - https://gerrit.wikimedia.org/r/358896 (https://phabricator.wikimedia.org/T146285)
[00:27:52] Operations, Traffic, Zero, ZeroPortal: Cannot fetch Zero carriers/proxies JSON files from eqsin - https://phabricator.wikimedia.org/T188111#3998330 (Mholloway) +@Tgr for insights particularly around auth stuff.
[00:36:40] (PS1) Smalyshev: Add configuration for CirrusSearch to instantly index new Wikidata items [mediawiki-config] - https://gerrit.wikimedia.org/r/413899 (https://phabricator.wikimedia.org/T183053)
[00:46:01] (PS9) Chad: Move all dblists on noc to dblists/ directory, rather than individually [mediawiki-config] - https://gerrit.wikimedia.org/r/394199
[00:47:24] (CR) jerkins-bot: [V: -1] Move all dblists on noc to dblists/ directory, rather than individually [mediawiki-config] - https://gerrit.wikimedia.org/r/394199 (owner: Chad)
[00:49:39] (CR) Krinkle: [C: -1] "This would break the m.wikipedia.org and zero.wikipedia.org entry points because the (currently unreachable) Apache server config and the " [puppet] - https://gerrit.wikimedia.org/r/404158 (https://phabricator.wikimedia.org/T69015) (owner: Mholloway)
[00:49:58] (PS10) Chad: Move all dblists on noc to dblists/ directory, rather than individually [mediawiki-config] - https://gerrit.wikimedia.org/r/394199
[00:51:18] (CR) jerkins-bot: [V: -1] Move all dblists on noc to dblists/ directory, rather than individually [mediawiki-config] - https://gerrit.wikimedia.org/r/394199 (owner: Chad)
[00:52:58] (CR) Krinkle: Move all dblists on noc to dblists/ directory, rather than individually (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/394199 (owner: Chad)
[00:53:30] (PS11) Chad: Move all dblists on noc to dblists/ directory, rather than individually [mediawiki-config] - https://gerrit.wikimedia.org/r/394199
[00:54:12] (PS12) Chad: Move all dblists on noc to dblists/ directory, rather than individually [mediawiki-config] - https://gerrit.wikimedia.org/r/394199
[00:55:40] (CR) jerkins-bot: [V: -1] Move all dblists on noc to dblists/ directory, rather than individually [mediawiki-config] - https://gerrit.wikimedia.org/r/394199 (owner: Chad)
[00:58:41] (PS13) Chad: Move all dblists on noc to dblists/ directory, rather than individually [mediawiki-config] - https://gerrit.wikimedia.org/r/394199
[01:00:05] (CR) jerkins-bot: [V: -1] Move all dblists on noc to dblists/ directory, rather than individually [mediawiki-config] - https://gerrit.wikimedia.org/r/394199 (owner: Chad)
[01:02:23] (PS14) Chad: Move all dblists on noc to dblists/ directory, rather than individually [mediawiki-config] - https://gerrit.wikimedia.org/r/394199
[01:04:00] PROBLEM - puppet last run on mw1312 is CRITICAL: CRITICAL: Catalog fetch fail. Either compilation failed or puppetmaster has issues
[01:04:07] (CR) Chad: [C: 2] Move all dblists on noc to dblists/ directory, rather than individually [mediawiki-config] - https://gerrit.wikimedia.org/r/394199 (owner: Chad)
[01:05:30] (Merged) jenkins-bot: Move all dblists on noc to dblists/ directory, rather than individually [mediawiki-config] - https://gerrit.wikimedia.org/r/394199 (owner: Chad)
[01:07:14] !log demon@tin Synchronized tests/: no-op (duration: 00m 59s)
[01:07:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:08:30] !log demon@tin Synchronized docroot/noc/: dblists cleanup (duration: 00m 57s)
[01:08:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:08:48] (CR) jenkins-bot: Move all dblists on noc to dblists/ directory, rather than individually [mediawiki-config] - https://gerrit.wikimedia.org/r/394199 (owner: Chad)
[01:11:17] no_justification: ah, the raw view is broken as well
[01:11:23] not just highlight
[01:11:26] so yeah, need to preserve the prefix from index
[01:12:01] no_justification: also, there's at least one other place that needs to be fixed to not do basename()
[01:12:12] because the git and raw link from https://noc.wikimedia.org/conf/highlight.php?file=dblists/all-labs.dblist is also wrong
[01:13:10] !log added eqsin ipv6 range to botpasswords ip range restriction T188111
[01:13:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:13:25] T188111: Cannot fetch Zero carriers/proxies JSON files from eqsin - https://phabricator.wikimedia.org/T188111
[01:14:13] Working on it
[01:15:31] (CR) Krinkle: Move all dblists on noc to dblists/ directory, rather than individually (2 comments) [mediawiki-config] - https://gerrit.wikimedia.org/r/394199 (owner: Chad)
[01:15:33] (PS1) Chad: noc/index.php: Don't use basename on dblists files [mediawiki-config] - https://gerrit.wikimedia.org/r/413911
[01:15:35] (CR) Chad: [C: 2] noc/index.php: Don't use basename on dblists files [mediawiki-config] - https://gerrit.wikimedia.org/r/413911 (owner: Chad)
[01:15:36] no_justification: ^ just in case :)
[01:15:38] That'll fix the index.php part
[01:16:17] Yeah
[01:16:51] (Merged) jenkins-bot: noc/index.php: Don't use basename on dblists files [mediawiki-config] - https://gerrit.wikimedia.org/r/413911 (owner: Chad)
[01:18:36] (CR) jenkins-bot: noc/index.php: Don't use basename on dblists files [mediawiki-config] - https://gerrit.wikimedia.org/r/413911 (owner: Chad)
[01:18:39] !log demon@tin Synchronized docroot/noc/conf/index.php: fix dblist links from listing (duration: 00m 56s)
[01:18:43] (PS1) Chad: noc/highlight.php: Fix up some path detection for dblists files [mediawiki-config] - https://gerrit.wikimedia.org/r/413914
[01:18:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:19:00] Operations, Traffic: Enable Service in Asia Cache DC - https://phabricator.wikimedia.org/T156026#3998413 (Reedy)
[01:19:06] Operations, Traffic, Zero, ZeroPortal: Cannot fetch Zero carriers/proxies JSON files from eqsin - https://phabricator.wikimedia.org/T188111#3998410 (Reedy) Open>Resolved a:Reedy @bblack and I had come to the same solution around the same sort of time The `zerofetcher` user had a bot_...
[01:19:32] Or not.
[01:19:36] I hate this code.
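Editor's note: the basename() problem discussed at 01:11-01:12 can be reproduced in a few lines. What follows is a minimal sketch in Python, not the actual noc code (which is PHP and is not shown in this log). `posixpath.basename` stands in for PHP's `basename()`, and both helper names are hypothetical. It only illustrates why a link built from the basename loses the `dblists/` prefix, and one way a relative path could be kept while still rejecting traversal.

```python
import posixpath


def link_path_with_basename(file_param: str) -> str:
    """Hypothetical helper: mimics building a link from basename() only.

    The directory part is dropped, so 'dblists/all-labs.dblist'
    collapses to 'all-labs.dblist' and the link points at the wrong file.
    """
    return posixpath.basename(file_param)


def link_path_keeping_prefix(file_param: str) -> str:
    """Hypothetical helper: keep the relative directory prefix.

    The path is normalised first, and anything absolute or escaping
    upwards is rejected, so the prefix survives without allowing '../'.
    """
    clean = posixpath.normpath(file_param)
    if clean.startswith("/") or clean == ".." or clean.startswith("../"):
        raise ValueError("path must stay inside the served directory")
    return clean


if __name__ == "__main__":
    requested = "dblists/all-labs.dblist"
    print(link_path_with_basename(requested))   # prefix is lost
    print(link_path_keeping_prefix(requested))  # prefix is preserved
```

The traversal check is an assumption added for the sketch; the log does not say how the real scripts validate the `file` parameter.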
[01:28:24] (PS1) Chad: highlight.php: Swap Diffusion for Gitiles [mediawiki-config] - https://gerrit.wikimedia.org/r/413923
[01:29:00] RECOVERY - puppet last run on mw1312 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures
[01:31:23] no_justification: You'll want it unescaped in that case, I think
[01:31:34] Gitiles, like GitHub, uses file paths as normal url paths
[01:31:42] btw, this is testable locally
[01:31:45] I'm still trying to figure out why the links won't work
[01:31:48] I made it all use relative things within noc/
[01:31:54] so mount it anywhere under any document root and it works
[01:32:00] php -s, or some random apache you have
[01:32:19] locally I have it at localhost:8080/dev/wikimedia/operations/mediawiki-config/docroots/noc/
[01:32:24] and works :)
[01:34:54] `php -S 127.0.0.1 -t .`
[01:34:55] :)
[01:36:01] (CR) Krinkle: Point Mediawiki Monolog at Kafka jumbo in production (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/413796 (https://phabricator.wikimedia.org/T188136) (owner: Ottomata)
[01:36:28] Ok, local version works now
[01:36:32] Pushing final amending
[01:38:07] (CR) Chad: [C: 2] noc/highlight.php: Fix up some path detection for dblists files [mediawiki-config] - https://gerrit.wikimedia.org/r/413914 (owner: Chad)
[01:38:17] (CR) Chad: [C: 2] highlight.php: Swap Diffusion for Gitiles [mediawiki-config] - https://gerrit.wikimedia.org/r/413923 (owner: Chad)
[01:38:51] (PS1) Chad: Fix highlight.php links one last time [mediawiki-config] - https://gerrit.wikimedia.org/r/413933
[01:39:11] (CR) Chad: [C: 2] Fix highlight.php links one last time [mediawiki-config] - https://gerrit.wikimedia.org/r/413933 (owner: Chad)
[01:39:31] (Merged) jenkins-bot: noc/highlight.php: Fix up some path detection for dblists files [mediawiki-config] - https://gerrit.wikimedia.org/r/413914 (owner: Chad)
[01:39:33] (Merged) jenkins-bot: highlight.php: Swap Diffusion for Gitiles [mediawiki-config] - https://gerrit.wikimedia.org/r/413923 (owner: Chad)
[01:39:42] (CR) jenkins-bot: noc/highlight.php: Fix up some path detection for dblists files [mediawiki-config] - https://gerrit.wikimedia.org/r/413914 (owner: Chad)
[01:40:36] (Merged) jenkins-bot: Fix highlight.php links one last time [mediawiki-config] - https://gerrit.wikimedia.org/r/413933 (owner: Chad)
[01:42:08] !log demon@tin Synchronized docroot/noc/conf/highlight.php: one last time (duration: 00m 57s)
[01:42:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[01:44:01] (CR) Krinkle: [C: -1] "Given group0 is used in wmf-config at run-time (InitialiseSettings keys), this must be pre-computed to avoid dblist computation overhead a" [mediawiki-config] - https://gerrit.wikimedia.org/r/413784 (owner: Chad)
[01:44:30] (PS1) Chad: highlight.php: Don't use the escaped URL for the raw URL either [mediawiki-config] - https://gerrit.wikimedia.org/r/413939
[01:45:17] (CR) Krinkle: [C: -1] highlight.php: Don't use the escaped URL for the raw URL either (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/413939 (owner: Chad)
[01:45:19] (Abandoned) Chad: Shuffle group0 wikis a bit [mediawiki-config] - https://gerrit.wikimedia.org/r/413784 (owner: Chad)
[01:46:20] Ah yes, urlencode not htmlspecialchars
[01:46:40] (PS2) Chad: highlight.php: Don't use the escaped URL for the raw URL either [mediawiki-config] - https://gerrit.wikimedia.org/r/413939
[01:56:32] (PS1) Dzahn: microsites: profile to setup design.wikimedia.org [puppet] - https://gerrit.wikimedia.org/r/413952 (https://phabricator.wikimedia.org/T185282)
[01:58:39] (CR) Krinkle: "Gerrit's syntax highlighter found an error (I think)" (1 comment) [puppet] - https://gerrit.wikimedia.org/r/413952 (https://phabricator.wikimedia.org/T185282) (owner: Dzahn)
[01:58:52] no_justification: figuring out a better test btw, for dblist/computed.
[02:01:41] PROBLEM - Check systemd state on stat1005 is CRITICAL: Return code of 255 is out of bounds
[02:02:10] PROBLEM - Disk space on stat1005 is CRITICAL: Return code of 255 is out of bounds
[02:02:21] PROBLEM - MD RAID on stat1005 is CRITICAL: Return code of 255 is out of bounds
[02:02:30] PROBLEM - DPKG on stat1005 is CRITICAL: Return code of 255 is out of bounds
[02:02:31] PROBLEM - configured eth on stat1005 is CRITICAL: Return code of 255 is out of bounds
[02:02:40] PROBLEM - dhclient process on stat1005 is CRITICAL: Return code of 255 is out of bounds
[02:03:14] Krinkle that’s not invalid
[02:03:33] You fell down the same path I did with the highlighter
[02:03:38] paladox: is fine
[02:03:46] but is not, right?
[02:03:49] Oh wait
[02:03:52] it's not self-closing
[02:03:58] it's the first attribute for root
[02:04:06] Ha
[02:04:11] Thanks
[02:04:19] (CR) Krinkle: microsites: profile to setup design.wikimedia.org (1 comment) [puppet] - https://gerrit.wikimedia.org/r/413952 (https://phabricator.wikimedia.org/T185282) (owner: Dzahn)
[02:04:19] You're welcome :)
[02:04:41] PROBLEM - puppet last run on stat1005 is CRITICAL: Return code of 255 is out of bounds
[02:04:55] Krinkle: it’s due to the fact, erb can host any type of syntax
[02:05:05] So it highlights only erb syntax
[02:05:09] stat1005 is alive
[02:05:16] Not any other syntax that’s in the file
[02:05:22] but java busy busy
[02:05:28] (This is codemirror)
[02:07:10] PROBLEM - Check the NTP synchronisation status of timesyncd on stat1005 is CRITICAL: Return code of 255 is out of bounds
[02:08:17] milimetric: ^ looks like you are working on that, right
[02:08:53] Krinkle: pg view may help with that view. It uses a different library
[02:08:59] eh, maybe more another user, but yea, it's doing things
[02:10:50] Krinkle tbh. I hate computed DB lists anyway
[02:11:03] Also we have E_TOOMANYDBLISTS
[02:12:15] (PS1) Krinkle: tests: Add test to enforce dblists using expressions are pre-computed [mediawiki-config] - https://gerrit.wikimedia.org/r/413969
[02:13:36] gettingstarted-with-category-suggestions.dblist
[02:13:43] (CR) jerkins-bot: [V: -1] tests: Add test to enforce dblists using expressions are pre-computed [mediawiki-config] - https://gerrit.wikimedia.org/r/413969 (owner: Krinkle)
[02:14:48] (PS2) Krinkle: tests: Add test to enforce dblists using expressions are pre-computed [mediawiki-config] - https://gerrit.wikimedia.org/r/413969
[02:18:33] (PS2) Dzahn: microsites: profile to setup design.wikimedia.org [puppet] - https://gerrit.wikimedia.org/r/413952 (https://phabricator.wikimedia.org/T185282)
[02:21:14] (CR) Dzahn: [C: 2] "http://puppet-compiler.wmflabs.org/10133/bromine.eqiad.wmnet/" [puppet] - https://gerrit.wikimedia.org/r/413952 (https://phabricator.wikimedia.org/T185282) (owner: Dzahn)
[02:23:57] (PS1) Krinkle: Remove unused pp_stage1_raw dblist [mediawiki-config] - https://gerrit.wikimedia.org/r/413978
[02:25:20] RECOVERY - Disk space on stat1005 is OK: DISK OK
[02:25:26] (CR) jerkins-bot: [V: -1] Remove unused pp_stage1_raw dblist [mediawiki-config] - https://gerrit.wikimedia.org/r/413978 (owner: Krinkle)
[02:25:30] RECOVERY - MD RAID on stat1005 is OK: OK: Active: 8, Working: 8, Failed: 0, Spare: 0
[02:25:40] RECOVERY - DPKG on stat1005 is OK: All packages OK
[02:25:41] RECOVERY - configured eth on stat1005 is OK: OK - interfaces up
[02:25:41] RECOVERY - dhclient process on stat1005 is OK: PROCS OK: 0 processes with command name dhclient
[02:25:47] (PS2) Krinkle: Remove unused pp_stage1_raw dblist [mediawiki-config] - https://gerrit.wikimedia.org/r/413978
[02:25:51] RECOVERY - Check systemd state on stat1005 is OK: OK - running: The system is fully operational
[02:26:02] (CR) Krinkle: [C: 1] highlight.php: Don't use the escaped URL for the raw URL either [mediawiki-config] - https://gerrit.wikimedia.org/r/413939 (owner: Chad)
[02:29:41] RECOVERY - puppet last run on stat1005 is OK: OK: Puppet is currently enabled, last run 4 minutes ago with 0 failures
[02:30:41] (PS1) Dzahn: varnish: add misc director for design.wm.org -> bromine [puppet] - https://gerrit.wikimedia.org/r/413986 (https://phabricator.wikimedia.org/T185282)
[02:34:21] mutante: btw, is there a task for misc sites being multi-dc?
[02:35:48] Krinkle: i think no, but there should be. the timing is amazing:
[02:35:50] 21:21 i need a copy of bromine in codfw maybe
[02:35:55] 14 min :)
[02:36:10] Ha!, I didn't know that.
[02:36:14] maybe because you could say the whole content is just a git clone away ..
[02:36:32] but still it saves much time to not have to create a new VM
[02:36:37] mutante: Hehe, yeah, seems like puppet should be able to do it all on its own
[02:36:52] yes, it does. it's really just about even less downtime
[02:37:01] mutante: I was thinking about multi-dc because of the varnish misc director only setting eqiad, whereas 'noc' has a nearby entry with both eqiad/codfw set
[02:37:10] RECOVERY - Check the NTP synchronisation status of timesyncd on stat1005 is OK: OK: synced at Sat 2018-02-24 02:37:04 UTC.
[02:37:26] yes, i want to do it
[02:37:33] And also to distribute traffic nicely between DCs, and reducing latency for cache miss, in principle.
[02:37:36] :)
[02:37:47] yes, active-active
[02:37:50] they are static sites :)
[02:37:54] Yep
[02:38:00] I assume they're mostly Varnish hits
[02:38:04] but then so is MediaWiki
[02:38:29] hm, right
[02:39:44] Operations: create codfw-equivalent of bromine and make webserver_misc_static active/active in misc varnish - https://phabricator.wikimedia.org/T188163#3998458 (Dzahn)
[02:39:51] there.. let's go from there
[02:57:52] Operations, Traffic, Zero, ZeroPortal: Cannot fetch Zero carriers/proxies JSON files from eqsin - https://phabricator.wikimedia.org/T188111#3998468 (Tgr) >>! In T188111#3996786, @Mholloway wrote: > zerofetch.py is using the deprecated `action=login` login flow, including relying on fields that we...
[02:58:25] (PS1) Dzahn: design.wm.org: prepare for second dir for style guide (WIP) [puppet] - https://gerrit.wikimedia.org/r/414008 (https://phabricator.wikimedia.org/T185282)
[03:01:11] (PS1) MaxSem: beta: remove $wgFragmentMode, matches prod now [mediawiki-config] - https://gerrit.wikimedia.org/r/414009
[03:01:13] (PS1) MaxSem: beta: remove $wgSecureLogin [mediawiki-config] - https://gerrit.wikimedia.org/r/414010
[03:01:15] (PS1) MaxSem: beta: remove $wgStructuredChangeFiltersShowPreference [mediawiki-config] - https://gerrit.wikimedia.org/r/414011
[03:01:17] (PS1) MaxSem: beta: remove $wmgUseTimeless [mediawiki-config] - https://gerrit.wikimedia.org/r/414012
[03:01:19] (PS1) MaxSem: beta: remove $wmgUse3d [mediawiki-config] - https://gerrit.wikimedia.org/r/414013
[03:01:21] (PS1) MaxSem: Remove $wgUsejQueryThree [mediawiki-config] - https://gerrit.wikimedia.org/r/414014
[03:01:23] (PS1) MaxSem: Clean up $wgEchoPerUserBlacklist [mediawiki-config] - https://gerrit.wikimedia.org/r/414015
[03:01:25] (PS1) MaxSem: beta: remove $wmgMinervaNeue [mediawiki-config] - https://gerrit.wikimedia.org/r/414016
[03:01:27] (PS1) MaxSem: beta: remove $wgReadingListsCentralWiki [mediawiki-config] - https://gerrit.wikimedia.org/r/414017
[03:01:29] (PS1) MaxSem: beta: remove $wmgUseReadingLists [mediawiki-config] - https://gerrit.wikimedia.org/r/414018
[03:06:10] Operations, Traffic, Zero, ZeroPortal: Cannot fetch Zero carriers/proxies JSON files from eqsin - https://phabricator.wikimedia.org/T188111#3998473 (Mholloway) >nor is the 'token' response value included along with a NeedToken response. Is the part that threw me. A 'token' response value is stil...
[04:50:56] (CR) Krinkle: [C: 1] Remove $wgUsejQueryThree [mediawiki-config] - https://gerrit.wikimedia.org/r/414014 (owner: MaxSem)
[04:51:22] (CR) Krinkle: [C: 1] beta: remove $wgSecureLogin [mediawiki-config] - https://gerrit.wikimedia.org/r/414010 (owner: MaxSem)
[04:51:35] (CR) Krinkle: [C: 1] beta: remove $wmgUseTimeless [mediawiki-config] - https://gerrit.wikimedia.org/r/414012 (owner: MaxSem)
[04:52:06] MaxSem: Nice work
[04:58:24] (CR) Krinkle: [C: 1] Stop forcing php5 in `mwscript` [puppet] - https://gerrit.wikimedia.org/r/358896 (https://phabricator.wikimedia.org/T146285) (owner: Chad)
[05:04:00] RECOVERY - haproxy failover on dbproxy1005 is OK: OK check_failover servers up 2 down 0
[05:07:00] PROBLEM - haproxy failover on dbproxy1005 is CRITICAL: CRITICAL check_failover servers up 2 down 1
[06:11:48] !log Reload haproxy on dbproxy1005
[06:12:06] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[06:12:20] RECOVERY - haproxy failover on dbproxy1005 is OK: OK check_failover servers up 2 down 0
[07:39:30] PROBLEM - toolschecker: Make sure enwiki dumps are not empty on checker.tools.wmflabs.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 SERVICE UNAVAILABLE - string OK not found on http://checker.tools.wmflabs.org:80/dumps - 288 bytes in 0.010 second response time
[08:02:58] (CR) jenkins-bot: highlight.php: Swap Diffusion for Gitiles [mediawiki-config] - https://gerrit.wikimedia.org/r/413923 (owner: Chad)
[08:03:00] (CR) jenkins-bot: Fix highlight.php links one last time [mediawiki-config] - https://gerrit.wikimedia.org/r/413933 (owner: Chad)
[08:03:50] PROBLEM - Host chlorine is DOWN: PING CRITICAL - Packet loss = 100%
[08:04:00] PROBLEM - Host bohrium is DOWN: PING CRITICAL - Packet loss = 100%
[08:04:01] PROBLEM - Host install1002 is DOWN: PING CRITICAL - Packet loss = 100%
[08:04:01] PROBLEM - Host logstash1007 is DOWN: PING CRITICAL - Packet loss = 100%
[08:04:01] PROBLEM - Host dubnium is DOWN: PING CRITICAL - Packet loss = 100%
[08:04:20] PROBLEM - Host mwdebug1002 is DOWN: PING CRITICAL - Packet loss = 100%
[08:04:40] PROBLEM - Host rutherfordium is DOWN: PING CRITICAL - Packet loss = 100%
[08:04:40] PROBLEM - Host netmon1003 is DOWN: PING CRITICAL - Packet loss = 100%
[08:04:40] PROBLEM - Host planet1001 is DOWN: PING CRITICAL - Packet loss = 100%
[08:04:40] PROBLEM - Host releases1001 is DOWN: PING CRITICAL - Packet loss = 100%
[08:04:50] PROBLEM - Host webperf1001 is DOWN: PING CRITICAL - Packet loss = 100%
[08:05:10] PROBLEM - Host hassium is DOWN: PING CRITICAL - Packet loss = 100%
[08:05:30] PROBLEM - SSH on ganeti1006 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:10:00] PROBLEM - Misc HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [1000.0] https://grafana.wikimedia.org/dashboard/file/varnish-aggregate-client-status-codes.json?panelId=3&fullscreen&orgId=1&var-site=All&var-cache_type=misc&var-status_type=5
[08:10:30] RECOVERY - SSH on ganeti1006 is OK: SSH OK - OpenSSH_6.7p1 Debian-5+deb8u4 (protocol 2.0)
[08:10:40] RECOVERY - Host chlorine is UP: PING OK - Packet loss = 0%, RTA = 0.45 ms
[08:10:40] RECOVERY - Host rutherfordium is UP: PING OK - Packet loss = 0%, RTA = 0.47 ms
[08:10:47] Operations: Remove imagescaler cluster (aka 'rendering') - https://phabricator.wikimedia.org/T188062#3995055 (brion) There has been a suggestion to recycle some of these as video scalers (T188075) if they're not needed elsewhere. Will have a sustained need for cpu capacity for a while to transition from VP8...
[08:10:50] RECOVERY - Host mwdebug1002 is UP: PING OK - Packet loss = 0%, RTA = 0.45 ms
[08:11:00] RECOVERY - Host dubnium is UP: PING OK - Packet loss = 0%, RTA = 1.05 ms
[08:11:00] RECOVERY - Host logstash1007 is UP: PING OK - Packet loss = 0%, RTA = 0.87 ms
[08:11:10] RECOVERY - Host webperf1001 is UP: PING OK - Packet loss = 0%, RTA = 0.44 ms
[08:11:10] RECOVERY - Host hassium is UP: PING OK - Packet loss = 0%, RTA = 0.93 ms
[08:11:10] RECOVERY - Host releases1001 is UP: PING OK - Packet loss = 0%, RTA = 1.65 ms
[08:11:20] RECOVERY - Host netmon1003 is UP: PING OK - Packet loss = 0%, RTA = 0.85 ms
[08:11:20] RECOVERY - Host bohrium is UP: PING OK - Packet loss = 0%, RTA = 1.50 ms
[08:11:30] RECOVERY - Host planet1001 is UP: PING OK - Packet loss = 0%, RTA = 1.00 ms
[08:13:00] RECOVERY - Host install1002 is UP: PING OK - Packet loss = 0%, RTA = 0.48 ms
[08:15:55] (PS13) Elukey: [WIP] eventlogging: add systemd support [puppet] - https://gerrit.wikimedia.org/r/413362
[08:23:01] RECOVERY - Misc HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] https://grafana.wikimedia.org/dashboard/file/varnish-aggregate-client-status-codes.json?panelId=3&fullscreen&orgId=1&var-site=All&var-cache_type=misc&var-status_type=5
[09:34:02] (PS25) Zoranzoki21: Add namespaces to urwiktionary [mediawiki-config] - https://gerrit.wikimedia.org/r/407901 (https://phabricator.wikimedia.org/T186393)
[11:22:33] addshore, Antoine (hashar), Brad (anomie), Katie (aude), Max (MaxSem), Mukunda (twentyafterfour), Roan (RoanKattouw), Sébastien (Dereckson), Tyler (thcipriani), Niharika (Niharika), or Željko (zeljkof): Does someone have production shell access?
[14:34:46] Could I please have an op check logstash for T188171 please?
[14:34:46] T188171: Please unblock stuck global rename of SimonFoundationContinence to Drytime%$1600 - https://phabricator.wikimedia.org/T188171
[15:42:51] RECOVERY - toolschecker: Make sure enwiki dumps are not empty on checker.tools.wmflabs.org is OK: HTTP OK: HTTP/1.1 200 OK - 166 bytes in 0.012 second response time
[16:13:16] (PS1) BryanDavis: wikireplica_dns: Adjust dblist retrieval path [puppet] - https://gerrit.wikimedia.org/r/414107
[16:15:06] Operations, hardware-requests: Site: (2) hardware access request for videoscalers - https://phabricator.wikimedia.org/T188075#3998922 (brion) Rough plan is to get two new r430s with roughly the same config as the old image scalers, and also repurpose as many of the old ones as are available. If any ques...
[17:27:51] (PS1) Framawiki: Enable DynamicPageList extension on bdwikimedia [mediawiki-config] - https://gerrit.wikimedia.org/r/414109 (https://phabricator.wikimedia.org/T188109)
[17:40:03] (PS1) Framawiki: Enable rollback for editors at zh_classicalwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/414114 (https://phabricator.wikimedia.org/T188064)
[17:41:42] (CR) Framawiki: "I'm not sure if I need to add 'zh_classicalwiki' or '+zh_classicalwiki', any idea ?" [mediawiki-config] - https://gerrit.wikimedia.org/r/414114 (https://phabricator.wikimedia.org/T188064) (owner: Framawiki)
[17:47:10] Operations, Traffic, Zero, ZeroPortal: Cannot fetch Zero carriers/proxies JSON files from eqsin - https://phabricator.wikimedia.org/T188111#3999055 (Tgr) Yeah, that was meant in the sense //"nor is the 'token' response value included along with a NeedToken response removed"//. I guess there are t...
[17:54:04] (PS1) Framawiki: Enable responsive references by default on rowiki [mediawiki-config] - https://gerrit.wikimedia.org/r/414115 (https://phabricator.wikimedia.org/T187997)
[17:56:12] (CR) Zoranzoki21: [C: 1] Enable responsive references by default on rowiki [mediawiki-config] - https://gerrit.wikimedia.org/r/414115 (https://phabricator.wikimedia.org/T187997) (owner: Framawiki)
[17:56:50] (CR) Zoranzoki21: [C: 1] Enable DynamicPageList extension on bdwikimedia [mediawiki-config] - https://gerrit.wikimedia.org/r/414109 (https://phabricator.wikimedia.org/T188109) (owner: Framawiki)
[18:32:10] PROBLEM - HHVM jobrunner on mw1301 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Service Unavailable - 473 bytes in 0.001 second response time
[18:33:10] RECOVERY - HHVM jobrunner on mw1301 is OK: HTTP OK: HTTP/1.1 200 OK - 206 bytes in 0.002 second response time
[19:41:12] Operations, Pybal, Traffic: Some etcd connections not established at startup - https://phabricator.wikimedia.org/T188087#3999137 (Vgutierrez) I've been doing some tests regarding the timeouts, and my current theory is that the `timeout = 0` is there to enable the HTTP long polling. The timeout that a...
[20:22:01] (CR) MarcoAurelio: "I'd say to add '+' so it also inherits the defaults. Maybe @Urbanecm knows better." [mediawiki-config] - https://gerrit.wikimedia.org/r/414114 (https://phabricator.wikimedia.org/T188064) (owner: Framawiki)
[20:23:13] Urbanecm: ^^
[20:23:36] Hauskatze, what's happening?
[20:24:01] Urbanecm: could you please see that patch? There's a little question you may be able to answer.
[20:24:21] Hauskatze, sure, looking at it
[20:24:40] thanks
[20:28:04] Hauskatze, there's no need for +dbname, the applying code takes care not to override default settings
[20:28:20] so it's a remnant from the past?
[20:29:30] PROBLEM - HHVM jobrunner on mw1309 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:30:17] Hauskatze, yep. See lines 1267-1279 in CS.php
[20:30:21] RECOVERY - HHVM jobrunner on mw1309 is OK: HTTP OK: HTTP/1.1 200 OK - 206 bytes in 0.011 second response time
[20:32:28] Hauskatze: There's a check whether the wiki has something in $wgGroupPermissions; if so, it merges the content of groupOverrides(2) with $wgGroupPermissions.
[20:37:07] Hauskatze, the + is needed just for private, fishbowl or closed wikis, if we want to keep settings for private/fishbowl/closed wikis in force.
[20:37:21] I'll summarize it to the gerrit patch
[20:39:06] ty
[20:39:29] (CR) Urbanecm: [C: 1] "@Farmawiki: There's no need to add the + if the wiki is not a) private b) fishbowl c) closed AND you want to keep privileges settings set " [mediawiki-config] - https://gerrit.wikimedia.org/r/414114 (https://phabricator.wikimedia.org/T188064) (owner: Framawiki)
[20:39:54] yw
[20:52:50] PROBLEM - MegaRAID on db1068 is CRITICAL: CRITICAL: 1 failed LD(s) (Degraded)
[20:52:51] ACKNOWLEDGEMENT - MegaRAID on db1068 is CRITICAL: CRITICAL: 1 failed LD(s) (Degraded) nagiosadmin RAID handler auto-ack: https://phabricator.wikimedia.org/T188187
[20:52:57] Operations, ops-eqiad: Degraded RAID on db1068 - https://phabricator.wikimedia.org/T188187#3999213 (ops-monitoring-bot)
[21:02:00] PROBLEM - Router interfaces on cr1-eqord is CRITICAL: CRITICAL: host 208.80.154.198, interfaces up: 37, down: 1, dormant: 0, excluded: 0, unused: 0
[21:02:00] PROBLEM - Router interfaces on cr2-eqiad is CRITICAL: CRITICAL: host 208.80.154.197, interfaces up: 224, down: 1, dormant: 0, excluded: 0, unused: 0