[00:41:12] (PS1) AndyRussG: Make $wgMessageCacheType on beta cluster the same as on production [mediawiki-config] - https://gerrit.wikimedia.org/r/316204 (https://phabricator.wikimedia.org/T144952)
[00:41:36] (PS2) AndyRussG: Make $wgMessageCacheType on beta cluster the same as on production [mediawiki-config] - https://gerrit.wikimedia.org/r/316204 (https://phabricator.wikimedia.org/T144952)
[02:21:04] !log mwdeploy@tin scap sync-l10n completed (1.28.0-wmf.22) (duration: 06m 50s)
[02:21:10] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[02:26:45] !log l10nupdate@tin ResourceLoader cache refresh completed at Sun Oct 16 02:26:45 UTC 2016 (duration 5m 41s)
[02:26:51] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[04:21:10] Operations, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Fundraising Sprint Stirring The Pot, and 4 others: Banner not showing up on site - https://phabricator.wikimedia.org/T144952#2719901 (AndyRussG) @aaron thanks for merging the [[ https://gerrit.wikimedia.org/r/315712 | patch ]]!!...
[05:02:28] Operations, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Fundraising Sprint Stirring The Pot, and 4 others: Banner not showing up on site - https://phabricator.wikimedia.org/T144952#2719931 (AndyRussG) Maybe this is it? Or at least something? `$wgMaxMsgCacheEntrySize` is 1024. I'm su...
[06:37:41] Operations, OTRS: clean up non-working otrs email addresses - https://phabricator.wikimedia.org/T84044#2719956 (Peachey88)
[08:25:09] (PS1) Elukey: Allow the wmde LDAP group to access pivot.w.o [puppet] - https://gerrit.wikimedia.org/r/316217
[08:44:34] (CR) Peachey88: "Is there a Phabricator Task to attached to this changedset?" [puppet] - https://gerrit.wikimedia.org/r/316217 (owner: Elukey)
[08:48:44] (CR) Elukey: "Not going to merge now, it was only a proof of concept to show :)" [puppet] - https://gerrit.wikimedia.org/r/316217 (owner: Elukey)
[10:20:19] (CR) Alex Monk: "how sensitive is the data in that system? does it require ndas?" [puppet] - https://gerrit.wikimedia.org/r/316217 (owner: Elukey)
[10:36:38] <_joe_> !log restarting hhvm on mw120[0-8]
[10:36:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[11:15:22] (PS1) Giuseppe Lavagetto: pooler-loop: ignore unreachable/down pybals [puppet] - https://gerrit.wikimedia.org/r/316219
[11:19:10] (CR) Giuseppe Lavagetto: [C: 2] pooler-loop: ignore unreachable/down pybals [puppet] - https://gerrit.wikimedia.org/r/316219 (owner: Giuseppe Lavagetto)
[11:19:17] (PS2) Giuseppe Lavagetto: pooler-loop: ignore unreachable/down pybals [puppet] - https://gerrit.wikimedia.org/r/316219
[11:19:22] (CR) Giuseppe Lavagetto: [V: 2] pooler-loop: ignore unreachable/down pybals [puppet] - https://gerrit.wikimedia.org/r/316219 (owner: Giuseppe Lavagetto)
[14:05:57] !log mwscript resetUserEmail.php --wiki=fawiki Ebrambot
[14:06:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[14:09:36] Hi, at cswiki watchlist only changes from last 3 days are shown. Is it possible to change this number of days per wiki? Another similar idea, is it able to watchlist pages which the logged in user edited by default? I.e. check corresponding checkbox in preferences?
[14:09:57] I know both is technically possible. But will be request like this accepted?
[14:10:42] There is a discussion about this idea at cswiki Village pump so if this won't be accepted the discussion shouldn't continue.
[14:12:21] If theres concensus yes
[14:13:19] Urbanecm
[14:14:13] Okay, so I'll work on consensus and then fill a request. Thanks a lot!
[14:14:22] No problem
[16:16:50] (CR) Paladox: "recheck" [puppet] - https://gerrit.wikimedia.org/r/316228 (https://phabricator.wikimedia.org/T39602) (owner: Paladox)
[17:39:24] (CR) Legoktm: "Ping...?" [puppet] - https://gerrit.wikimedia.org/r/308904 (owner: Legoktm)
[19:20:44] (CR) Ejegg: [C: 1] Make $wgMessageCacheType on beta cluster the same as on production [mediawiki-config] - https://gerrit.wikimedia.org/r/316204 (https://phabricator.wikimedia.org/T144952) (owner: AndyRussG)
[20:09:54] Operations, Wikimedia-Site-requests, Patch-For-Review: Private wiki for Project Grants Committee - https://phabricator.wikimedia.org/T143138#2720586 (Ruslik0) As a committee member I am fine with projectcom.wikimedia.org. It seems the best choice among available names.
[20:35:12] anyone around?
[20:35:16] It's rather urgent
[20:35:59] It's probably someone loading up lots of requests on ores
[20:36:09] it starts to time out a lots
[20:36:15] https://grafana.wikimedia.org/dashboard/db/ores
[20:36:34] robh: you're the ops clinic duty
[20:38:36] akosiaris: ^
[20:38:48] You need an opsen?
[20:39:03] I need someone with +2 in puppet
[20:39:33] link to patch?
[20:39:44] I make it right now
[20:39:46] clinic duty doesn't mean he's about ;)
[20:40:23] I guess I don't know fully about it.
[20:42:45] (PS1) Ladsgroup: ores: Increase capacity [puppet] - https://gerrit.wikimedia.org/r/316271
[20:42:58] Reedy: ^
[20:43:18] Just seeing who I can find
[20:43:42] icinga is sending "ores is down" alarms to us
[20:43:48] practically it's down
[20:45:05] Amir1: are you sure that changing those values will help? Presumably there are finite hardware resources at some level
[20:45:14] someone is sending lots of requests or someone is making tons of edit in a wiki. If increasing capacity doesn't help. It's some sort of DDoS
[20:45:38] andrewbogott: https://ganglia.wikimedia.org/latest/graph.php?r=day&z=xlarge&h=scb1001.eqiad.wmnet&m=cpu_report&s=by+name&mc=2&g=mem_report&c=Service+Cluster+B+eqiad
[20:45:40] it's okay
[20:46:10] CPU is always okay, the bottleneck is memory which is fine for now
[20:48:15] graphs to see what's happening: https://grafana.wikimedia.org/dashboard/db/ores
[20:48:41] "PROBLEM - ORES worker production on ores.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds" in #wikimedia-ai
[20:49:07] http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-ai/20161016.txt
[20:49:24] (CR) Andrew Bogott: [C: 2] ores: Increase capacity [puppet] - https://gerrit.wikimedia.org/r/316271 (owner: Ladsgroup)
[20:52:51] Operations, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Fundraising Sprint Stirring The Pot, and 4 others: Banner not showing up on site - https://phabricator.wikimedia.org/T144952#2720621 (AndyRussG) Aaarg please scratch the [[ https://phabricator.wikimedia.org/T144952#2719931 | pre...
[20:54:17] Amir1: any better?
[20:54:31] andrewbogott: I'm restarting services
[20:54:36] we'll know soon
[20:58:56] !log ladsgroup@scb[12]00[12]: sudo service celery-ores-worker restart
[20:59:02] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[21:16:35] Amir1: so, same problem still?
[21:19:59] Operations, ops-eqiad, Labs: labvirt1005 RAID critical - https://phabricator.wikimedia.org/T148345#2720638 (Andrew)
[22:00:17] Operations, Performance-Team, Thumbor, Patch-For-Review: thumbor memory limits for main process and subprocesses - https://phabricator.wikimedia.org/T145623#2720668 (Gilles)
[23:02:06] Operations, Performance-Team, Thumbor, Patch-For-Review: thumbor memory limits for main process and subprocesses - https://phabricator.wikimedia.org/T145623#2720692 (Gilles) OK, now that I've implemented this in a different way, I've figured out why cgexec wasn't working. Firejail blocks access t...
[23:11:12] Operations, ops-eqiad, Labs: labvirt1005 RAID critical - https://phabricator.wikimedia.org/T148345#2720701 (Peachey88)
[23:11:14] Operations, ops-eqiad, Labs, Labs-Infrastructure: labvirt1005 - HP RAID controller issue (battery?) - https://phabricator.wikimedia.org/T148255#2720704 (Peachey88)