[00:04:27] (PS1) Dzahn: icinga: add notification type to SMS content [puppet] - https://gerrit.wikimedia.org/r/406535 (https://phabricator.wikimedia.org/T185862)
[00:05:36] Operations, monitoring, Patch-For-Review: icinga ACK shows as CRIT when delivered via SMS - https://phabricator.wikimedia.org/T185862#3927084 (Dzahn) p:Triage>Normal
[00:10:35] (PS2) Dzahn: icinga: add notification type to SMS content [puppet] - https://gerrit.wikimedia.org/r/406535 (https://phabricator.wikimedia.org/T185862)
[00:11:35] Operations, ops-codfw, DBA, Patch-For-Review: db2036 storage issues? (mysql crashed, installer issues) - https://phabricator.wikimedia.org/T185294#3927095 (Marostegui) stalled>Resolved a:jcrespo I am going to close this as resolved as the server got repooled back. If it happens again,...
[02:03:59] PROBLEM - MariaDB Slave Lag: s6 on db1102 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 316.62 seconds
[02:04:59] RECOVERY - MariaDB Slave Lag: s6 on db1102 is OK: OK slave_sql_lag Replication lag: 0.00 seconds
[02:21:09] PROBLEM - Check health of redis instance on 6380 on rdb2004 is CRITICAL: CRITICAL ERROR - Redis Library - can not ping 127.0.0.1 on port 6380
[02:22:10] RECOVERY - Check health of redis instance on 6380 on rdb2004 is OK: OK: REDIS 2.8.17 on 127.0.0.1:6380 has 1 databases (db0) with 7561486 keys, up 3 minutes 20 seconds
[02:23:49] !log l10nupdate@tin scap sync-l10n completed (1.31.0-wmf.17) (duration: 05m 47s)
[02:24:04] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[02:37:17] Operations, Ops-Access-Requests, Research, Research-collaborations, Patch-For-Review: Analytics cluster access request for ISI Foundation team - https://phabricator.wikimedia.org/T141634#3927173 (Nuria) @Simonjoylet: a formal collaboration request for a research project (and acceptance) is ne...
[02:55:00] Operations: Some Core availability Catchpoint tests might be more expensive than they need to be - https://phabricator.wikimedia.org/T162857#3927174 (faidon) p:Normal>High a:Volans
[03:21:00] PROBLEM - MariaDB Slave Lag: s1 on dbstore1002 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 637.14 seconds
[03:24:40] Operations, Gerrit, Release-Engineering-Team: Add prometheus exporter to Gerrit - https://phabricator.wikimedia.org/T184086#3927178 (demon) We'll do it separately.
[03:59:29] PROBLEM - Nginx local proxy to apache on mw2132 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[04:00:19] RECOVERY - Nginx local proxy to apache on mw2132 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 617 bytes in 0.198 second response time
[04:01:19] RECOVERY - MariaDB Slave Lag: s1 on dbstore1002 is OK: OK slave_sql_lag Replication lag: 288.06 seconds
[05:09:30] PROBLEM - Check Varnish expiry mailbox lag on cp4021 is CRITICAL: CRITICAL: expiry mailbox lag is 2135673
[05:34:26] test
[05:34:54] Does anyone have insight in how to reset a wikipedia password?
[05:37:25] bakonydraco: https://en.wikipedia.org/wiki/Special:PasswordReset
[05:39:57] problem is i never tied it to an email :(
[05:40:03] I assume I'm just SOL
[05:40:31] but it'd be super cool if there were a way to get through
[05:40:50] My wikipedia handle is the same as my IRC/Reddit and one character off my gmail
[05:43:22] bakonydraco: Exceptions are made if it's an account with substantial contributions.
[05:44:18] honestly not too substantial
[05:44:29] i had one page i edited each year for 3 years when the event happened
[05:45:18] looks like about 80 total edits: https://en.wikipedia.org/w/index.php?title=Special:Contributions&dir=prev&offset=20140108205308&contribs=user&target=Bakonydraco&namespace=&tagfilter=&start=&end=
[05:46:25] Ivy: who do I appeal to to make a determination on substance?
[05:47:44] It's a bit unusual for someone to forget his or her password, have made a bunch of substantial contributions, and never have set an e-mail address.
[05:49:07] https://wikitech.wikimedia.org/wiki/Password_reset
[05:51:08] also, silly question: is Wikipedia the same as Wikimedia?
[05:53:15] no, Wikimedia is the foundation that keeps the servers running, etc
[05:54:32] so would there be separate username/pws for wikimedia/wikipedia?
[05:55:10] bakonydraco: What are you considering "wikimedia"?
[05:55:23] meta.wikimedia.org will use the same password as commons.wikimedia.org or en.wikipedia.org.
[05:55:32] ah cool!
[05:55:45] User accounts and passwords have been fully unified across public Wikimedia wikis.
[05:57:37] ah wait a minute, I'm now logged in as Bakonydraco on wikimedia.org, but i still can't log in on wikipedia.org with the exact same username/pw
[05:58:31] What does "wikimedia.org" mean?
[05:59:17] uh, wikitech.wikimedia.org
[06:01:01] ah yeah there we go, I can log in to wikitech.wikimedia.org, but not commons.wikimedia.org, meta.wikimedia.org, or en.wikipedia.org
[06:03:11] ahh so wikitech is a private wikimedia wiki, hence it's not uniform
[06:04:25] Yeah, wikitech.wikimedia.org is a special case.
[06:04:28] anyway, Ivy , who can I petition to do https://wikitech.wikimedia.org/wiki/Password_reset ? Seems like it needs to be someone with access to a deployment host?
[06:04:37] But if you have a login there, you can use that as an identity verification mechanism.
[06:05:20] wonderful!
[06:05:54] You have to ask someone with shell access. Probably during SF business hours in here.
[06:06:32] File a ticket in Phabricator at https://phabricator.wikimedia.org and link to some edit on wikitech.wikimedia.org I guess?
[06:08:03] Ah thank you!
[06:26:19] okay Ivy filed a ticket, thanks for your help!
[06:42:49] PROBLEM - Check Varnish expiry mailbox lag on cp4022 is CRITICAL: CRITICAL: expiry mailbox lag is 2038754
[07:12:49] PROBLEM - Check Varnish expiry mailbox lag on cp4022 is CRITICAL: CRITICAL: expiry mailbox lag is 2021647
[07:24:00] (CR) Chad: "Someone either needs to push this through to production, or abandon it. The constant rebasing is getting really freaking annoying in my in" [mediawiki-config] - https://gerrit.wikimedia.org/r/392184 (https://phabricator.wikimedia.org/T45956) (owner: TerraCodes)
[07:29:29] PROBLEM - HHVM rendering on mw2134 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[07:30:19] RECOVERY - HHVM rendering on mw2134 is OK: HTTP OK: HTTP/1.1 200 OK - 78202 bytes in 0.277 second response time
[07:32:30] (CR) TerraCodes: [C: 1] "> Someone either needs to push this through to production, or abandon" [mediawiki-config] - https://gerrit.wikimedia.org/r/392184 (https://phabricator.wikimedia.org/T45956) (owner: TerraCodes)
[07:55:20] PROBLEM - HHVM rendering on mw2137 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[07:56:10] RECOVERY - HHVM rendering on mw2137 is OK: HTTP OK: HTTP/1.1 200 OK - 78296 bytes in 0.424 second response time
[08:37:12] Operations, Icinga, monitoring, Patch-For-Review: icinga ACK shows as CRIT when delivered via SMS - https://phabricator.wikimedia.org/T185862#3927398 (Peachey88)
[09:27:07] (PS6) MarcoAurelio: Remove upload rights on wikis where local uploads are disabled [mediawiki-config] - https://gerrit.wikimedia.org/r/405421 (https://phabricator.wikimedia.org/T143789)
[09:53:05] (CR) MarcoAurelio: "https://gerrit.wikimedia.org/r/#/c/406025/ is going out today as well, which would make this changeset redundant. What do you think?" [mediawiki-config] - https://gerrit.wikimedia.org/r/406012 (https://phabricator.wikimedia.org/T185597) (owner: Jayprakash12345)
[10:26:29] PROBLEM - puppet last run on cp3036 is CRITICAL: CRITICAL: Catalog fetch fail.
Either compilation failed or puppetmaster has issues
[10:56:19] RECOVERY - puppet last run on cp3036 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[11:00:04] jan_drewniak: How many deployers does it take to do Wikimedia Portals Update deploy? (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180129T1100).
[11:00:04] No GERRIT patches in the queue for this window AFAICS.
[11:31:19] PROBLEM - Varnish HTTP upload-frontend - port 3127 on cp3035 is CRITICAL: connect to address 10.20.0.170 and port 3127: Connection refused
[11:31:40] PROBLEM - Varnish HTTP upload-frontend - port 3121 on cp3035 is CRITICAL: connect to address 10.20.0.170 and port 3121: Connection refused
[11:31:49] PROBLEM - Varnish HTTP upload-frontend - port 80 on cp3035 is CRITICAL: connect to address 10.20.0.170 and port 80: Connection refused
[11:31:49] PROBLEM - Varnish HTTP upload-frontend - port 3120 on cp3035 is CRITICAL: connect to address 10.20.0.170 and port 3120: Connection refused
[11:31:49] PROBLEM - Varnish HTTP upload-frontend - port 3122 on cp3035 is CRITICAL: connect to address 10.20.0.170 and port 3122: Connection refused
[11:31:49] PROBLEM - Varnish HTTP upload-frontend - port 3125 on cp3035 is CRITICAL: connect to address 10.20.0.170 and port 3125: Connection refused
[11:31:59] PROBLEM - Check systemd state on cp3035 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[11:32:09] PROBLEM - Varnish HTTP upload-frontend - port 3126 on cp3035 is CRITICAL: connect to address 10.20.0.170 and port 3126: Connection refused
[11:32:09] PROBLEM - Varnish HTTP upload-frontend - port 3123 on cp3035 is CRITICAL: connect to address 10.20.0.170 and port 3123: Connection refused
[11:32:09] PROBLEM - Varnish HTTP upload-frontend - port 3124 on cp3035 is CRITICAL: connect to address 10.20.0.170 and port 3124: Connection refused
[11:37:49] PROBLEM - Varnish HTTP upload-frontend - port 80 on cp3039 is CRITICAL: connect to address 10.20.0.174 and port 80: Connection refused
[11:37:49] PROBLEM - Varnish HTTP upload-frontend - port 3127 on cp3039 is CRITICAL: connect to address 10.20.0.174 and port 3127: Connection refused
[11:37:59] PROBLEM - Check systemd state on cp3039 is CRITICAL: CRITICAL - degraded: The system is operational but one or more units failed.
[11:37:59] PROBLEM - Varnish HTTP upload-frontend - port 3123 on cp3039 is CRITICAL: connect to address 10.20.0.174 and port 3123: Connection refused
[11:38:00] PROBLEM - Varnish HTTP upload-frontend - port 3126 on cp3039 is CRITICAL: connect to address 10.20.0.174 and port 3126: Connection refused
[11:38:00] PROBLEM - Varnish HTTP upload-frontend - port 3122 on cp3039 is CRITICAL: connect to address 10.20.0.174 and port 3122: Connection refused
[11:38:09] PROBLEM - Varnish HTTP upload-frontend - port 3121 on cp3039 is CRITICAL: connect to address 10.20.0.174 and port 3121: Connection refused
[11:38:09] PROBLEM - Varnish HTTP upload-frontend - port 3125 on cp3039 is CRITICAL: connect to address 10.20.0.174 and port 3125: Connection refused
[11:38:09] PROBLEM - Varnish HTTP upload-frontend - port 3124 on cp3039 is CRITICAL: connect to address 10.20.0.174 and port 3124: Connection refused
[11:38:19] PROBLEM - Varnish HTTP upload-frontend - port 3120 on cp3039 is CRITICAL: connect to address 10.20.0.174 and port 3120: Connection refused
[11:39:19] RECOVERY - Varnish HTTP upload-frontend - port 3120
on cp3039 is OK: HTTP OK: HTTP/1.1 200 OK - 502 bytes in 0.168 second response time
[11:39:49] RECOVERY - Varnish HTTP upload-frontend - port 80 on cp3039 is OK: HTTP OK: HTTP/1.1 200 OK - 503 bytes in 0.168 second response time
[11:39:49] RECOVERY - Varnish HTTP upload-frontend - port 3127 on cp3039 is OK: HTTP OK: HTTP/1.1 200 OK - 503 bytes in 0.168 second response time
[11:39:59] RECOVERY - Check systemd state on cp3039 is OK: OK - running: The system is fully operational
[11:40:00] RECOVERY - Varnish HTTP upload-frontend - port 3123 on cp3039 is OK: HTTP OK: HTTP/1.1 200 OK - 504 bytes in 0.168 second response time
[11:40:00] RECOVERY - Varnish HTTP upload-frontend - port 3126 on cp3039 is OK: HTTP OK: HTTP/1.1 200 OK - 503 bytes in 0.168 second response time
[11:40:00] RECOVERY - Varnish HTTP upload-frontend - port 3122 on cp3039 is OK: HTTP OK: HTTP/1.1 200 OK - 503 bytes in 0.168 second response time
[11:40:09] RECOVERY - Varnish HTTP upload-frontend - port 3121 on cp3039 is OK: HTTP OK: HTTP/1.1 200 OK - 503 bytes in 0.168 second response time
[11:40:09] RECOVERY - Varnish HTTP upload-frontend - port 3125 on cp3039 is OK: HTTP OK: HTTP/1.1 200 OK - 503 bytes in 0.168 second response time
[11:40:09] RECOVERY - Varnish HTTP upload-frontend - port 3124 on cp3039 is OK: HTTP OK: HTTP/1.1 200 OK - 503 bytes in 0.168 second response time
[11:45:49] RECOVERY - Varnish HTTP upload-frontend - port 3121 on cp3035 is OK: HTTP OK: HTTP/1.1 200 OK - 503 bytes in 0.168 second response time
[11:45:49] RECOVERY - Varnish HTTP upload-frontend - port 80 on cp3035 is OK: HTTP OK: HTTP/1.1 200 OK - 503 bytes in 0.168 second response time
[11:45:49] RECOVERY - Varnish HTTP upload-frontend - port 3120 on cp3035 is OK: HTTP OK: HTTP/1.1 200 OK - 503 bytes in 0.168 second response time
[11:45:49] RECOVERY - Varnish HTTP upload-frontend - port 3122 on cp3035 is OK: HTTP OK: HTTP/1.1 200 OK - 503 bytes in 0.168 second response time
[11:45:49] RECOVERY - Varnish HTTP upload-frontend
- port 3125 on cp3035 is OK: HTTP OK: HTTP/1.1 200 OK - 503 bytes in 0.168 second response time
[11:46:00] RECOVERY - Check systemd state on cp3035 is OK: OK - running: The system is fully operational
[11:46:09] RECOVERY - Varnish HTTP upload-frontend - port 3126 on cp3035 is OK: HTTP OK: HTTP/1.1 200 OK - 503 bytes in 0.168 second response time
[11:46:09] RECOVERY - Varnish HTTP upload-frontend - port 3123 on cp3035 is OK: HTTP OK: HTTP/1.1 200 OK - 503 bytes in 0.168 second response time
[11:46:10] RECOVERY - Varnish HTTP upload-frontend - port 3124 on cp3035 is OK: HTTP OK: HTTP/1.1 200 OK - 503 bytes in 0.168 second response time
[11:46:19] RECOVERY - Varnish HTTP upload-frontend - port 3127 on cp3035 is OK: HTTP OK: HTTP/1.1 200 OK - 502 bytes in 0.168 second response time
[12:02:49] RECOVERY - Check Varnish expiry mailbox lag on cp4022 is OK: OK: expiry mailbox lag is 4
[12:09:05] (PS3) 星耀晨曦: Set Portal and Portal talk namespace alias of zhwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/406487 (https://phabricator.wikimedia.org/T184866)
[13:08:14] (PS1) Ladsgroup: Enable lua fine grained usage tracking in more wikis [mediawiki-config] - https://gerrit.wikimedia.org/r/406578 (https://phabricator.wikimedia.org/T185032)
[13:11:57] Operations, Ops-Access-Requests, Research, Research-collaborations, Patch-For-Review: Analytics cluster access request for ISI Foundation team - https://phabricator.wikimedia.org/T141634#3927818 (Simonjoylet) >>! In T141634#3927173, @Nuria wrote: > @Simonjoylet: a formal collaboration request...
[13:35:19] hi all
[13:50:35] Too, can you give me cloak? I request it already before few days
[13:50:43] *requested
[13:51:07] Zoranzoki21, you need to wait. Only group contacts (people designated to communicate with freenode staff) can add it to your account
[13:51:51] Urbanecm: Ok.
Thank you for reply
[13:56:04] (PS2) Addshore: Switch to extension.json for PropertySuggester [mediawiki-config] - https://gerrit.wikimedia.org/r/395486
[13:56:08] (PS2) Addshore: Switch to extension.json for WikibaseQuality extensions [mediawiki-config] - https://gerrit.wikimedia.org/r/395487
[13:56:12] (PS2) Addshore: Switch to extension.json for Wikidata.org [mediawiki-config] - https://gerrit.wikimedia.org/r/395488
[13:57:11] (PS2) Addshore: Update location of wikibase-rebuildTermSqlIndex script [puppet] - https://gerrit.wikimedia.org/r/395700
[13:58:21] (PS2) Ladsgroup: Enable lua fine grained usage tracking in more wikis [mediawiki-config] - https://gerrit.wikimedia.org/r/406578 (https://phabricator.wikimedia.org/T185032)
[14:00:05] addshore, hashar, anomie, no_justification, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: I seem to be stuck in Groundhog week. Sigh. Time for (yet another) European Mid-day SWAT(Max 8 patches) deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180129T1400).
[14:00:05] rxy, Jhs, Zoranzoki21, Urbanecm, and Amir1: A patch you scheduled for European Mid-day SWAT(Max 8 patches) is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker.
[14:00:11] hi
[14:00:31] hi
[14:01:43] o//
[14:02:00] my patch is not testable
[14:06:27] I can SWAT
[14:07:02] (CR) Ladsgroup: [C: 2] "SWAT" [mediawiki-config] - https://gerrit.wikimedia.org/r/406256 (https://phabricator.wikimedia.org/T185720) (owner: Rxy)
[14:07:30] I'm ready for test
[14:12:55] (Merged) jenkins-bot: Add 'rollbacker' group at arwikibooks [mediawiki-config] - https://gerrit.wikimedia.org/r/406256 (https://phabricator.wikimedia.org/T185720) (owner: Rxy)
[14:13:04] rxy: have you seen https://gerrit.wikimedia.org/r/#/c/406256/1/wmf-config/InitialiseSettings.php ?
[14:13:10] (CR) jenkins-bot: Add 'rollbacker' group at arwikibooks [mediawiki-config] - https://gerrit.wikimedia.org/r/406256 (https://phabricator.wikimedia.org/T185720) (owner: Rxy)
[14:13:50] Amir1: yes
[14:14:04] rxy: it seems fine to me
[14:15:27] mwdebug is so slow
[14:16:43] rxy: your patch is live in mwdebug
[14:16:50] mwdebug1002
[14:18:02] Can I next?
[14:18:12] wait
[14:18:19] ok
[14:18:20] I waiting
[14:18:58] Jhs: around?
[14:19:35] ugh
[14:19:45] commonsuploads doesn't apply my patch
[14:19:54] should be have '+' prefix
[14:19:54] Zoranzoki21: If Jhs doesn't answer by rxy is finished, I go with yours
[14:20:14] ok
[14:20:15] rxy: can you make a patch right now and I deploy it
[14:20:31] yes, I can
[14:20:41] Thanks
[14:21:21] (PS5) Zoranzoki21: Enable again wgNamespacesWithSubpages for huwikisource [mediawiki-config] - https://gerrit.wikimedia.org/r/406474 (https://phabricator.wikimedia.org/T185813)
[14:24:24] rxy: ?
[14:25:52] Amir1: Start with next patches
[14:26:06] Better is it
[14:26:12] yes, please skip me
[14:26:12] It's not possible without reverting his patch
[14:26:21] oh
[14:26:22] I guess I just revert yours
[14:26:39] (PS1) Ladsgroup: Revert "Add 'rollbacker' group at arwikibooks" [mediawiki-config] - https://gerrit.wikimedia.org/r/406588
[14:26:45] (CR) Ladsgroup: [C: 2] Revert "Add 'rollbacker' group at arwikibooks" [mediawiki-config] - https://gerrit.wikimedia.org/r/406588 (owner: Ladsgroup)
[14:27:51] Amir1, see you're SWATTing.
I'm here :)
[14:28:02] cool
[14:28:26] (Merged) jenkins-bot: Revert "Add 'rollbacker' group at arwikibooks" [mediawiki-config] - https://gerrit.wikimedia.org/r/406588 (owner: Ladsgroup)
[14:28:49] (CR) jenkins-bot: Revert "Add 'rollbacker' group at arwikibooks" [mediawiki-config] - https://gerrit.wikimedia.org/r/406588 (owner: Ladsgroup)
[14:29:11] (CR) Ladsgroup: [C: 2] "SWAT" [mediawiki-config] - https://gerrit.wikimedia.org/r/406474 (https://phabricator.wikimedia.org/T185813) (owner: Zoranzoki21)
[14:29:20] (PS6) Ladsgroup: Enable again wgNamespacesWithSubpages for huwikisource [mediawiki-config] - https://gerrit.wikimedia.org/r/406474 (https://phabricator.wikimedia.org/T185813) (owner: Zoranzoki21)
[14:29:34] (CR) Ladsgroup: [C: 2] Enable again wgNamespacesWithSubpages for huwikisource [mediawiki-config] - https://gerrit.wikimedia.org/r/406474 (https://phabricator.wikimedia.org/T185813) (owner: Zoranzoki21)
[14:30:06] Can I move https://gerrit.wikimedia.org/r/#/c/406025/ for this SWAT?
[14:30:51] Zoranzoki21: We have more than 8 patches already
[14:31:01] Amir1: Ok
[14:31:07] Amir1: For next will be
[14:31:23] Sure!
[14:32:25] (Merged) jenkins-bot: Enable again wgNamespacesWithSubpages for huwikisource [mediawiki-config] - https://gerrit.wikimedia.org/r/406474 (https://phabricator.wikimedia.org/T185813) (owner: Zoranzoki21)
[14:32:35] (CR) jenkins-bot: Enable again wgNamespacesWithSubpages for huwikisource [mediawiki-config] - https://gerrit.wikimedia.org/r/406474 (https://phabricator.wikimedia.org/T185813) (owner: Zoranzoki21)
[14:33:12] on which is beta server?
[14:33:32] Zoranzoki21: it is now
[14:33:39] please test and let me know
[14:33:52] where to test?
[14:34:06] on which mwdebug?
[14:34:09] mwdebu1002
[14:35:26] Amir1: no problems
[14:35:29] Amir1: ok is
[14:35:37] *without problems
[14:35:38] all is ok
[14:35:46] Amir1: should I file a new one? " !
[remote rejected] HEAD -> refs/publish/master/T185720 (change https://gerrit.wikimedia.org/r/406256 closed) "
[14:35:55] rxy: yup
[14:36:04] k, thx
[14:36:36] Zoranzoki21: cool, deploying
[14:37:01] (PS3) Ladsgroup: Enable lua fine grained usage tracking in more wikis [mediawiki-config] - https://gerrit.wikimedia.org/r/406578 (https://phabricator.wikimedia.org/T185032)
[14:37:20] (CR) Ladsgroup: [C: 2] "SWAT" [mediawiki-config] - https://gerrit.wikimedia.org/r/406578 (https://phabricator.wikimedia.org/T185032) (owner: Ladsgroup)
[14:37:23] !log ladsgroup@tin Synchronized wmf-config/InitialiseSettings.php: [[gerrit:406474|Enable again wgNamespacesWithSubpages for huwikisource (T185813)]] (duration: 00m 57s)
[14:37:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:37:38] T185813: Enable subpages in template namespace on huwikisource - https://phabricator.wikimedia.org/T185813
[14:37:48] Zoranzoki21: It's live everywhere, please test
[14:38:21] Amir1: All ok
[14:39:04] Amir1: Thank you very much
[14:39:20] (Merged) jenkins-bot: Enable lua fine grained usage tracking in more wikis [mediawiki-config] - https://gerrit.wikimedia.org/r/406578 (https://phabricator.wikimedia.org/T185032) (owner: Ladsgroup)
[14:39:23] Zoranzoki21: Thank you for deploying with #releng :P
[14:39:39] Amir1: Your welcome
[14:40:01] (CR) jenkins-bot: Enable lua fine grained usage tracking in more wikis [mediawiki-config] - https://gerrit.wikimedia.org/r/406578 (https://phabricator.wikimedia.org/T185032) (owner: Ladsgroup)
[14:40:03] * Zoranzoki21 is happy because is patch finally deployed and all work without problems
[14:41:18] (CR) Ladsgroup: [C: 2] "SWAT" [mediawiki-config] - https://gerrit.wikimedia.org/r/406263 (https://phabricator.wikimedia.org/T185347) (owner: Urbanecm)
[14:41:23] !log ladsgroup@tin Synchronized wmf-config/InitialiseSettings.php: [[gerrit:406578|Enable lua fine grained usage tracking in more wikis
(T185032)]] (duration: 00m 57s)
[14:41:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:41:37] T185032: Enable lua fine grained usage tracking - Late January 2018 batch - https://phabricator.wikimedia.org/T185032
[14:42:24] Urbanecm: you are next
[14:43:08] (Merged) jenkins-bot: Change logos of hi.wikiversity [mediawiki-config] - https://gerrit.wikimedia.org/r/406263 (https://phabricator.wikimedia.org/T185347) (owner: Urbanecm)
[14:43:24] (CR) jenkins-bot: Change logos of hi.wikiversity [mediawiki-config] - https://gerrit.wikimedia.org/r/406263 (https://phabricator.wikimedia.org/T185347) (owner: Urbanecm)
[14:44:05] Urbanecm: ^ your patch is in mwdebug1002
[14:44:17] please test and let me know
[14:44:53] Amir1, will test
[14:45:15] (PS1) Rxy: Add 'rollbacker' group at arwikibooks [mediawiki-config] - https://gerrit.wikimedia.org/r/406591 (https://phabricator.wikimedia.org/T185720)
[14:45:31] Amir1, works, please deploy
[14:47:34] Amir1: should I reschedule in another day?
[14:47:52] Urbanecm: deploying
[14:48:11] rxy: yes, please.
I'm not even sure if I can finish all of Urbanecm patches
[14:48:28] (PS2) Ladsgroup: Change wgMetaNamespace and wgSitename for hiwikiversity [mediawiki-config] - https://gerrit.wikimedia.org/r/406264 (https://phabricator.wikimedia.org/T185347) (owner: Urbanecm)
[14:48:40] (CR) Ladsgroup: [C: 2] "SWAT" [mediawiki-config] - https://gerrit.wikimedia.org/r/406264 (https://phabricator.wikimedia.org/T185347) (owner: Urbanecm)
[14:48:43] !log ladsgroup@tin Synchronized static/images/project-logos: [[gerrit:406263|Change logos of hi.wikiversity (T185347)]] (duration: 00m 56s)
[14:48:46] Amir1: ok, thanks and sorry for inconvenience
[14:48:55] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:48:56] T185347: Change Namespace and logo of hiwikiversity - https://phabricator.wikimedia.org/T185347
[14:49:02] no worries, it happens
[14:49:23] Urbanecm: live everywhere, please test, in the mean time the other patch is being merged
[14:50:19] (Merged) jenkins-bot: Change wgMetaNamespace and wgSitename for hiwikiversity [mediawiki-config] - https://gerrit.wikimedia.org/r/406264 (https://phabricator.wikimedia.org/T185347) (owner: Urbanecm)
[14:50:30] (CR) jenkins-bot: Change wgMetaNamespace and wgSitename for hiwikiversity [mediawiki-config] - https://gerrit.wikimedia.org/r/406264 (https://phabricator.wikimedia.org/T185347) (owner: Urbanecm)
[14:50:40] Amir1, it seems it is cached, so I'll re-test in a couple of hours
[14:50:58] ugh, some wikis should be have '+' prefix when contain 'commonsuploads' set
[14:51:11] Urbanecm: let me know
[14:51:29] Urbanecm: please test the patch in mwdebug
[14:51:48] the Change wgMetaNamespace and wgSitename for hiwikiversity
[14:52:54] Urbanecm: I have a meeting in 8 minutes, I can't do more than one more
[14:53:29] sorry, please reschedule for the morning SWAT (SF time)
[14:53:51] Amir1, morning swat is...late evening swat for me :D
[14:54:05] But prepare for your meeting, I'll use
some swat
[14:54:27] Same for me :D I can do them in the later SWAT if it's too late for you
[14:55:06] Amir1, I'll use another EU swat probably :)
[14:55:27] Urbanecm: Ping me if no one is around for tomorrow
[14:55:38] Urbanecm: is it okay in mwdebug?
[14:56:18] Amir1, seems to be. But a script is required or things will be broken. Do you have time for it?
[14:56:44] https://www.mediawiki.org/wiki/Manual:NamespaceDupes.php
[14:57:01] nah it's okay
[14:57:09] First I need to deploy it
[14:58:59] !log ladsgroup@tin Synchronized wmf-config/InitialiseSettings.php: [[gerrit:406264|Change wgMetaNamespace and wgSitename for hiwikiversity (T185347)]] (duration: 00m 54s)
[14:59:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:59:09] T185347: Change Namespace and logo of hiwikiversity - https://phabricator.wikimedia.org/T185347
[14:59:29] !log ladsgroup@terbium:~$ mwscript namespaceDupes.php --wiki=hiwikiversity (T185347)
[14:59:32] Okay, I'm done for now
[14:59:35] need to go
[14:59:35] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:59:37] see you later
[15:00:07] hmm .... 'bnwikisource', 'guwiki', 'ladwiki', 'newiki', 'pswiki', 'trwikiquote', 'wuuwiki' @'groupOverrides' those wikis should be have '+' prefix because included in 'commonsuploads' set
[15:00:24] Amir1, anyway, I think that --fix is needed.
[15:01:54] of course, wgEnableUploads variable are applied too. ('+' prefix @'groupOverrides')
[15:06:20] I confused. 'bnwikisource', 'guwiki', 'ladwiki', 'newiki', 'pswiki', 'trwikiquote', 'wuuwiki' those wikis can be use upload functio by autoconfirmed users...
[15:07:53] by miss configuration
[15:08:14] *wrong configuration
[15:13:00] Urbanecm: I do it now
[15:13:31] Urbanecm: done
[15:30:57] I file a task : https://phabricator.wikimedia.org/T185898 (related with soft disable uploads)
[15:34:43] (PS3) Zoranzoki21: Set wgNamespaceRobotPolicies on ptwiki's NS_USER to noindex [mediawiki-config] - https://gerrit.wikimedia.org/r/406476 (https://phabricator.wikimedia.org/T185660)
[15:39:37] (PS2) Rxy: Add 'rollbacker' group at arwikibooks [mediawiki-config] - https://gerrit.wikimedia.org/r/406591 (https://phabricator.wikimedia.org/T185720)
[15:46:48] (PS3) Lokal Profil: Drop the medlem user group and editallpages user right [mediawiki-config] - https://gerrit.wikimedia.org/r/404942 (https://phabricator.wikimedia.org/T184981)
[15:49:11] (CR) Lokal Profil: "I have now removed all users from the "medlem" group" [mediawiki-config] - https://gerrit.wikimedia.org/r/404942 (https://phabricator.wikimedia.org/T184981) (owner: Lokal Profil)
[16:22:28] (CR) Rxy: ">" (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/406256 (https://phabricator.wikimedia.org/T185720) (owner: Rxy)
[16:31:57] Operations, ORES, Patch-For-Review, Scoring-platform-team (Current): Reimage ores* hosts with Debian Stretch - https://phabricator.wikimedia.org/T171851#3928149 (Halfak)
[16:32:23] Operations, ORES, Patch-For-Review, Scoring-platform-team (Current): Reimage ores* hosts with Debian Stretch - https://phabricator.wikimedia.org/T171851#3478146 (Halfak) @akosiaris, is this done?
[16:33:17] Operations, ORES, Patch-For-Review, Scoring-platform-team (Current): Reimage ores* hosts with Debian Stretch - https://phabricator.wikimedia.org/T171851#3928163 (Halfak)
[16:51:55] (PS12) Andrew Bogott: openstack horizon: rough in manifests for source deploy of Horizon 'ocata' [puppet] - https://gerrit.wikimedia.org/r/405373 (https://phabricator.wikimedia.org/T168470)
[16:52:25] (CR) jerkins-bot: [V: -1] openstack horizon: rough in manifests for source deploy of Horizon 'ocata' [puppet] - https://gerrit.wikimedia.org/r/405373 (https://phabricator.wikimedia.org/T168470) (owner: Andrew Bogott)
[16:54:45] (PS13) Andrew Bogott: openstack horizon: rough in manifests for source deploy of Horizon 'ocata' [puppet] - https://gerrit.wikimedia.org/r/405373 (https://phabricator.wikimedia.org/T168470)
[16:55:18] (CR) jerkins-bot: [V: -1] openstack horizon: rough in manifests for source deploy of Horizon 'ocata' [puppet] - https://gerrit.wikimedia.org/r/405373 (https://phabricator.wikimedia.org/T168470) (owner: Andrew Bogott)
[16:56:59] (PS14) Andrew Bogott: openstack horizon: rough in manifests for source deploy of Horizon 'ocata' [puppet] - https://gerrit.wikimedia.org/r/405373 (https://phabricator.wikimedia.org/T168470)
[16:57:37] (CR) jerkins-bot: [V: -1] openstack horizon: rough in manifests for source deploy of Horizon 'ocata' [puppet] - https://gerrit.wikimedia.org/r/405373 (https://phabricator.wikimedia.org/T168470) (owner: Andrew Bogott)
[16:58:30] (CR) Andrew Bogott: [V: 2 C: 2] openstack horizon: rough in manifests for source deploy of Horizon 'ocata' [puppet] - https://gerrit.wikimedia.org/r/405373 (https://phabricator.wikimedia.org/T168470) (owner: Andrew Bogott)
[17:06:25] (PS1) Zoranzoki21: Add throttle rule for 1Lib1Ref event [mediawiki-config] - https://gerrit.wikimedia.org/r/406603 (https://phabricator.wikimedia.org/T185857)
[17:07:10] PROBLEM - puppet last run on tin is
CRITICAL: CRITICAL: Puppet has 1 failures. Last run 6 minutes ago with 1 failures. Failed resources (up to 3 shown): Scap_source[horizon/deploy]
[17:12:10] RECOVERY - puppet last run on tin is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures
[17:14:46] ./mediawiki/php-1.31.0-wmf.15/extensions/Wikidata/extensions/Wikibase/repo/maintenance/rebuildTermSqlIndex.php
[17:14:58] ^ extension of extension? normal?
[17:15:47] shouldnt it just have moved from extensions/Wikidata to extensions/Wikibase without that duplicate structure?
[17:17:33] (CR) Dzahn: [C: 2] "./mediawiki/php-1.31.0-wmf.17/extensions/Wikibase/repo/maintenance/rebuildTermSqlIndex.php" [puppet] - https://gerrit.wikimedia.org/r/395700 (owner: Addshore)
[17:17:43] (PS3) Dzahn: Update location of wikibase-rebuildTermSqlIndex script [puppet] - https://gerrit.wikimedia.org/r/395700 (owner: Addshore)
[17:18:13] thanks mutante !
[17:18:37] mutante: extensions/Wikidata is from the wikidata build which isnt being used as of a few months
[17:18:45] this was the 1 last reference to that patch in puppet
[17:19:00] i see it's just like that in "wmf15" and the right path you are using exists, no worries
[17:19:03] ok
[17:19:03] even though it is for an absented cron, it should probably be updated incase someone turns it back on again :)
[17:19:14] yes
[17:19:24] submitted :)
[17:21:33] thanks!
[17:24:19] PROBLEM - Long running screen/tmux on puppetmaster1001 is CRITICAL: CRIT: Long running SCREEN process. (PID: 1154, 1735580s 1728000s).
[17:26:52] godog: ^ that's yours, should we whitelist or was it idle
[17:28:50] PROBLEM - puppet last run on naos is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. Failed resources (up to 3 shown): Scap_source[horizon/deploy]
[17:31:34] ^ naos is NOT the active deploy server and should not be used!
[17:39:39] RECOVERY - Check Varnish expiry mailbox lag on cp4021 is OK: OK: expiry mailbox lag is 9 [17:41:30] fatal: destination path '/srv/deployment/horizon/deploy' already exists and is not an empty directory. [17:44:12] HI, can you tell me about patch https://gerrit.wikimedia.org/r/#/c/406603/ [17:44:16] Is all ok or no? [17:44:27] Because I want to add for next swat which coming fast [17:48:03] (03CR) 10Dzahn: "fatal: destination path '/srv/deployment/horizon/deploy' already exists and is not an empty directory." [puppet] - 10https://gerrit.wikimedia.org/r/405373 (https://phabricator.wikimedia.org/T168470) (owner: 10Andrew Bogott) [17:50:04] (03CR) 10Dzahn: "separate issue: please don't override the jenkins vote for "includes apache". we can replace that with the httpd module" [puppet] - 10https://gerrit.wikimedia.org/r/405373 (https://phabricator.wikimedia.org/T168470) (owner: 10Andrew Bogott) [17:51:36] 10Operations, 10Cloud-VPS, 10cloud-services-team: wikidumpparse is using 1.2TB of 5T available NFS misc storage - https://phabricator.wikimedia.org/T183970#3928471 (10madhuvishy) @notconfusing Is this service still active? Are there ongoing clean up jobs in place to delete files that are generated? I see tha... 
[17:52:27] (03CR) 10Rxy: [C: 04-1] Add throttle rule for 1Lib1Ref event (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406603 (https://phabricator.wikimedia.org/T185857) (owner: 10Zoranzoki21) [17:52:46] (03PS1) 10Marostegui: db-eqiad,db-codfw.php: Remove db1030 [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406613 (https://phabricator.wikimedia.org/T184397) [17:56:52] (03CR) 10Zoranzoki21: Add throttle rule for 1Lib1Ref event (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406603 (https://phabricator.wikimedia.org/T185857) (owner: 10Zoranzoki21) [17:58:50] RECOVERY - puppet last run on naos is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [18:00:04] gehel: Time to snap out of that daydream and deploy Wikidata Query Service weekly deploy. Get on with it. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180129T1800). [18:00:05] No GERRIT patches in the queue for this window AFAICS. [18:01:54] !log smalyshev@tin Started deploy [wdqs/wdqs@0a7126d]: updater and GUI deploy [18:02:00] (03PS1) 10Andrew Bogott: Revert "openstack horizon: rough in manifests for source deploy of Horizon 'ocata'" [puppet] - 10https://gerrit.wikimedia.org/r/406614 [18:02:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:02:26] (03CR) 10jerkins-bot: [V: 04-1] Revert "openstack horizon: rough in manifests for source deploy of Horizon 'ocata'" [puppet] - 10https://gerrit.wikimedia.org/r/406614 (owner: 10Andrew Bogott) [18:03:16] (03CR) 10Andrew Bogott: [V: 032 C: 032] Revert "openstack horizon: rough in manifests for source deploy of Horizon 'ocata'" [puppet] - 10https://gerrit.wikimedia.org/r/406614 (owner: 10Andrew Bogott) [18:04:32] !log smalyshev@tin Finished deploy [wdqs/wdqs@0a7126d]: updater and GUI deploy (duration: 02m 38s) [18:04:43] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [18:09:19] (03PS2) 10Rxy: Add throttle rule for 
1Lib1Ref event [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406603 (https://phabricator.wikimedia.org/T185857) (owner: 10Zoranzoki21) [18:09:44] (03CR) 10Zoranzoki21: "recheck" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406603 (https://phabricator.wikimedia.org/T185857) (owner: 10Zoranzoki21) [18:10:39] (03CR) 10Zoranzoki21: "> Patch Set 2: Published edit on patch set 1." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406603 (https://phabricator.wikimedia.org/T185857) (owner: 10Zoranzoki21) [18:14:52] (03CR) 10Zoranzoki21: [C: 031] Add 'rollbacker' group at arwikibooks [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406591 (https://phabricator.wikimedia.org/T185720) (owner: 10Rxy) [18:15:22] (03PS4) 10RobH: adding new shell user Ramsey Isler [puppet] - 10https://gerrit.wikimedia.org/r/405981 (https://phabricator.wikimedia.org/T185356) [18:15:42] (03CR) 10RobH: [C: 032] adding new shell user Ramsey Isler [puppet] - 10https://gerrit.wikimedia.org/r/405981 (https://phabricator.wikimedia.org/T185356) (owner: 10RobH) [18:16:25] 10Operations, 10Ops-Access-Requests, 10Patch-For-Review: Requesting access to bast1001, stat1005, stat1006 for risler - https://phabricator.wikimedia.org/T185356#3928543 (10RobH) a:03RobH [18:23:34] (03PS14) 10Zoranzoki21: Enable Extension:Newsletter on hewiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/381537 (https://phabricator.wikimedia.org/T177151) [18:28:12] (03PS2) 10Andrew Bogott: Revert "openstack horizon: rough in manifests for source deploy of Horizon 'ocata'" [puppet] - 10https://gerrit.wikimedia.org/r/406614 [18:28:38] (03CR) 10jerkins-bot: [V: 04-1] Revert "openstack horizon: rough in manifests for source deploy of Horizon 'ocata'" [puppet] - 10https://gerrit.wikimedia.org/r/406614 (owner: 10Andrew Bogott) [18:29:30] (03CR) 10Andrew Bogott: [V: 032 C: 032] Revert "openstack horizon: rough in manifests for source deploy of Horizon 'ocata'" [puppet] - 10https://gerrit.wikimedia.org/r/406614 
(owner: 10Andrew Bogott) [18:31:10] (03PS4) 10星耀晨曦: Set Portal and Portal talk namespace alias of zhwiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406487 (https://phabricator.wikimedia.org/T184866) [18:53:53] (03CR) 10Volans: "I'm ok with the change in general, but I have two concerns:" [puppet] - 10https://gerrit.wikimedia.org/r/406535 (https://phabricator.wikimedia.org/T185862) (owner: 10Dzahn) [18:56:34] (03PS2) 10Madhuvishy: labs: Only include nfsclient if *any* nfs mounts are enabled [puppet] - 10https://gerrit.wikimedia.org/r/333227 (owner: 10Yuvipanda) [18:58:02] (03PS1) 10Odder: Add favicon for right-to-left Wikibooks projects [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406624 (https://phabricator.wikimedia.org/T185919) [19:00:04] addshore, hashar, anomie, no_justification, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: #bothumor Q:Why did functions stop calling each other? A:They had arguments. Rise for Morning SWAT (Max 8 patches) . (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180129T1900). [19:00:05] tgr, Hauskatze, stephanebisson, Zoranzoki21, and rxy: A patch you scheduled for Morning SWAT (Max 8 patches) is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker. [19:00:12] o/ [19:00:13] hello [19:00:13] hi [19:00:42] o/ [19:00:59] hello [19:04:47] who's taking care of Morning SWAT? :-) [19:07:05] !log chasetest delete project [post horizon cleanup] [19:10:16] (03CR) 10Volans: "I understand the scaffolding and the code looks ok, but given that it depends on how the different installations will be handled, it would" [puppet] - 10https://gerrit.wikimedia.org/r/405808 (https://phabricator.wikimedia.org/T185501) (owner: 10Herron) [19:10:20] tgr: no SWATers around apparently [19:10:25] Hauskatze: Not RelEng, we're all off-site.
[19:10:40] (03PS3) 10Volans: wmf-auto-reimage: refactor for Cumin 2.0.0 API [puppet] - 10https://gerrit.wikimedia.org/r/406409 [19:10:43] no_justification: oh, no idea [19:10:43] oh, right, forgot about that [19:10:56] most other people are still traveling I suppose? [19:11:04] I'll do it. [19:11:07] (03CR) 10Volans: [C: 032] wmf-auto-reimage: refactor for Cumin 2.0.0 API [puppet] - 10https://gerrit.wikimedia.org/r/406409 (owner: 10Volans) [19:11:08] :D [19:11:21] (03PS6) 10Niharika29: Adding config for WikimediaEvents module for logging behaviour data [mediawiki-config] - 10https://gerrit.wikimedia.org/r/404910 (https://phabricator.wikimedia.org/T183869) (owner: 10Groovier1) [19:11:30] (03CR) 10Niharika29: [C: 032] "SWAT." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/404910 (https://phabricator.wikimedia.org/T183869) (owner: 10Groovier1) [19:13:20] (03Merged) 10jenkins-bot: Adding config for WikimediaEvents module for logging behaviour data [mediawiki-config] - 10https://gerrit.wikimedia.org/r/404910 (https://phabricator.wikimedia.org/T183869) (owner: 10Groovier1) [19:13:25] (03CR) 10jenkins-bot: Adding config for WikimediaEvents module for logging behaviour data [mediawiki-config] - 10https://gerrit.wikimedia.org/r/404910 (https://phabricator.wikimedia.org/T183869) (owner: 10Groovier1) [19:13:27] (03CR) 10Fdans: [C: 04-1] "Not to merge until Leila approves" [puppet] - 10https://gerrit.wikimedia.org/r/405727 (https://phabricator.wikimedia.org/T174386) (owner: 10Fdans) [19:13:46] tgr: Your patch is on mwdebug1002, please check. [19:14:03] (03PS2) 10Niharika29: Bureaucrats on WMF wikis to add and remove 'accountcreator' by default [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406025 (https://phabricator.wikimedia.org/T185417) (owner: 10MarcoAurelio) [19:14:18] meow [19:14:21] Niharika: it's a beta-only change [19:15:04] tgr: Okay, then I'll just sync it. It does update the non-beta file too. [19:15:10] (03CR) 10Niharika29: [C: 032] "SWAT." 
[mediawiki-config] - 10https://gerrit.wikimedia.org/r/406025 (https://phabricator.wikimedia.org/T185417) (owner: 10MarcoAurelio) [19:16:40] Niharika: yeah, should be a no-op though; also the patch that the configuration is for did not reach the train yet [19:17:00] (03CR) 10jenkins-bot: Bureaucrats on WMF wikis to add and remove 'accountcreator' by default [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406025 (https://phabricator.wikimedia.org/T185417) (owner: 10MarcoAurelio) [19:17:05] tgr: So safe to deploy? [19:17:10] yeah [19:17:36] Hauskatze: https://gerrit.wikimedia.org/r/#/c/406025/ is on mwdebug1002, if you can test. [19:17:43] Niharika: checking [19:18:20] Niharika: random check on a wiki shows the change's ok [19:18:33] I can see the permissions changes correctly applied for the bureaucrat user group [19:18:50] Hauskatze: Okay, deploying in a moment. [19:19:30] !log niharika29@tin Synchronized wmf-config/InitialiseSettings-labs.php: Adding config for WikimediaEvents module for logging behaviour data T183869 (duration: 00m 57s) [19:19:44] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:19:44] T183869: WikimediaEvents extension for data collection - https://phabricator.wikimedia.org/T183869 [19:21:05] thx Niharika! [19:21:14] You're welcome. 
:) [19:21:22] (03PS7) 10Niharika29: Remove upload rights on wikis where local uploads are disabled [mediawiki-config] - 10https://gerrit.wikimedia.org/r/405421 (https://phabricator.wikimedia.org/T143789) (owner: 10MarcoAurelio) [19:22:02] !log niharika29@tin Synchronized wmf-config/InitialiseSettings.php: Adding config for WikimediaEvents module for logging behavior data T183869 and beaureaucrats to add and remove accountcreator by default T185417 (duration: 00m 56s) [19:22:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:22:16] T185417: Bureaucrats on WMF wikis to add/remove 'accountcreator' by default - https://phabricator.wikimedia.org/T185417 [19:22:50] I am here [19:23:36] (03CR) 10Niharika29: [C: 032] "SWAT." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/405421 (https://phabricator.wikimedia.org/T143789) (owner: 10MarcoAurelio) [19:27:42] stephanebisson: Can both of your changes be applied simultaneously? Zuul is being slow. [19:27:58] Niharika: sure, no problem [19:29:17] (03Merged) 10jenkins-bot: Remove upload rights on wikis where local uploads are disabled [mediawiki-config] - 10https://gerrit.wikimedia.org/r/405421 (https://phabricator.wikimedia.org/T143789) (owner: 10MarcoAurelio) [19:29:30] (03CR) 10jenkins-bot: Remove upload rights on wikis where local uploads are disabled [mediawiki-config] - 10https://gerrit.wikimedia.org/r/405421 (https://phabricator.wikimedia.org/T143789) (owner: 10MarcoAurelio) [19:30:11] Hauskatze: https://gerrit.wikimedia.org/r/#/c/405421/7 is on mwdebug1002. Testable? [19:30:49] Niharika: guess so, let me check [19:32:16] Niharika: random check on wiki with uploads fully disabled returns upload link to commons as expected [19:32:30] and I don't see any breaks so I guess it's good :-) [19:33:24] Hauskatze: Alright then. [19:33:48] Niharika: hmm wait [19:34:02] Hauskatze: Hmm? [19:34:15] Anything broke? 
[19:34:19] not sure why the permissions things ain't applied correctly [19:34:33] it's on debug1002 right? [19:34:40] or am I checking the wrong spot? [19:34:42] Hauskatze: Yes. [19:34:47] It's there. [19:34:55] Hauskatze: https://phabricator.wikimedia.org/T185898 [19:35:41] Niharika: feel free to sync. I think this is because of commonsettings [19:35:54] once the dblist is fully picked it should work [19:36:06] Hauskatze: Okay. Fingers crossed. [19:36:23] Niharika: I'll monitor until the end of the swat window [19:37:22] !log niharika29@tin scap failed: average error rate on 9/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details) [19:37:33] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:37:48] (03PS1) 10Ayounsi: Add scs-c1-eqiad.mgmt.eqiad.wmnet to rancid [puppet] - 10https://gerrit.wikimedia.org/r/406630 [19:37:50] Uh oh. [19:38:00] https://www.irccloud.com/pastebin/wf3AlDkx/ [19:38:08] "9/11 canaries" doesn't sound good [19:38:30] I think I know what's up. [19:39:09] PROBLEM - Nginx local proxy to apache on mw1261 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1967 bytes in 0.022 second response time [19:39:10] PROBLEM - Nginx local proxy to apache on mw1263 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1967 bytes in 0.020 second response time [19:39:10] PROBLEM - Apache HTTP on mw1277 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1967 bytes in 0.018 second response time [19:39:10] PROBLEM - Nginx local proxy to apache on mw1262 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1967 bytes in 0.024 second response time [19:39:18] Niharika: I'm reverting the patch. 
It does not work as expected for some reason :| [19:39:19] PROBLEM - HHVM rendering on mw1277 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1967 bytes in 0.017 second response time [19:39:19] PROBLEM - HHVM rendering on mw1278 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1967 bytes in 0.014 second response time [19:39:19] PROBLEM - Nginx local proxy to apache on mw1264 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1967 bytes in 0.022 second response time [19:39:19] PROBLEM - HHVM rendering on mw1265 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1967 bytes in 0.017 second response time [19:39:19] PROBLEM - Nginx local proxy to apache on mw1276 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1967 bytes in 0.028 second response time [19:39:20] !log niharika29@tin Synchronized wmf-config/InitialiseSettings.php: Remove upload rights on wikis where local uploads are disabled T143789 (duration: 00m 56s) [19:39:20] PROBLEM - HHVM rendering on mw1263 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1967 bytes in 0.017 second response time [19:39:20] PROBLEM - HHVM rendering on mw1276 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1967 bytes in 0.018 second response time [19:39:21] PROBLEM - HHVM rendering on mw1279 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1967 bytes in 0.021 second response time [19:39:21] PROBLEM - HHVM rendering on mw1262 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1967 bytes in 0.049 second response time [19:39:22] PROBLEM - HHVM rendering on mw1261 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1967 bytes in 0.062 second response time [19:39:22] PROBLEM - HHVM rendering on mwdebug1001 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1973 bytes in 0.218 second response time [19:39:22] I hope that's not me.^ [19:39:23] PROBLEM - HHVM rendering on 
mw1264 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1967 bytes in 0.015 second response time [19:39:23] PROBLEM - Apache HTTP on mw1278 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1967 bytes in 0.018 second response time [19:39:24] PROBLEM - Nginx local proxy to apache on mwdebug1001 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1972 bytes in 0.018 second response time [19:39:29] oh great [19:39:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:39:33] T143789: On wikis where uploads are fully disabled remove upload rights on any group - https://phabricator.wikimedia.org/T143789 [19:39:39] PROBLEM - Apache HTTP on mw1265 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1967 bytes in 0.015 second response time [19:39:39] PROBLEM - Apache HTTP on mw1262 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1967 bytes in 0.016 second response time [19:39:39] PROBLEM - Nginx local proxy to apache on mw1265 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1967 bytes in 0.021 second response time [19:39:50] PROBLEM - Nginx local proxy to apache on mw1277 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1967 bytes in 0.023 second response time [19:39:59] PROBLEM - Apache HTTP on mwdebug1001 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1972 bytes in 0.012 second response time [19:39:59] PROBLEM - Apache HTTP on mw1263 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1967 bytes in 0.018 second response time [19:40:00] PROBLEM - Apache HTTP on mw1261 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1967 bytes in 0.040 second response time [19:40:00] PROBLEM - Apache HTTP on mw1264 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1967 bytes in 0.015 second response time [19:40:00] PROBLEM - Nginx local proxy to apache on mw1279 is CRITICAL: HTTP CRITICAL: 
HTTP/1.1 500 Internal Server Error - 1967 bytes in 0.027 second response time [19:40:24] !log niharika29@tin Synchronized wmf-config/CommonSettings.php: Remove upload rights on wikis where local uploads are disabled T143789 (duration: 00m 56s) [19:40:26] (03PS1) 10MarcoAurelio: Revert "Remove upload rights on wikis where local uploads are disabled" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406632 [19:41:05] wiki's down [19:41:08] via cp1055 cp1055, Varnish XID 438797998 [19:41:08] Error: 503, Backend fetch failed at Mon, 29 Jan 2018 19:40:49 GMT [19:41:10] PROBLEM - Nginx local proxy to apache on mw1238 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1967 bytes in 0.026 second response time [19:41:16] welp same [19:41:19] PROBLEM - HHVM rendering on mw1238 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1967 bytes in 0.023 second response time [19:41:19] PROBLEM - Apache HTTP on mw1238 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 1967 bytes in 0.022 second response time [19:41:21] confirmed. commons, dewp etc. [19:41:23] mutante: ping ^^ [19:41:28] svwiki confirmed [19:41:40] RECOVERY - Nginx local proxy to apache on mw1265 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 617 bytes in 0.040 second response time [19:41:40] oops [19:41:40] Oh no. [19:41:46] <_joe_> who's reverting? [19:41:48] * volans here [19:41:48] * foks pets Niharika [19:41:50] RECOVERY - Nginx local proxy to apache on mw1277 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 617 bytes in 0.047 second response time [19:41:50] Commons no work [19:41:50] Shit, I actually broke wikis. [19:41:53] <_joe_> Hauskatze: we know [19:41:55] Hi. 
[19:41:57] https://la.wikisource.org/w/index.php?title=De_Re_Rustica/Liber_III&action=submit [19:41:59] <_joe_> Niharika: revert :) [19:41:59] RECOVERY - Apache HTTP on mwdebug1001 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 621 bytes in 0.029 second response time [19:41:59] Failed [19:42:00] RECOVERY - Nginx local proxy to apache on mw1279 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 617 bytes in 0.083 second response time [19:42:02] <_joe_> ShakespeareFan02: known [19:42:04] ShakespeareFan02, aware [19:42:05] !log niharika29@tin Synchronized dblists/uploadsdisabled.dblist: https://gerrit.wikimedia.org/r/#/c/405421/ (duration: 00m 56s) [19:42:10] RECOVERY - Nginx local proxy to apache on mw1238 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 619 bytes in 1.105 second response time [19:42:10] RECOVERY - Apache HTTP on mw1277 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 616 bytes in 0.065 second response time [19:42:15] Alright, what to revert first? [19:42:19] It said to tell you - "Request from 88.97.96.89 via cp1055 cp1055, Varnish XID 557711415 [19:42:19] Error: 503, Backend fetch failed at Mon, 29 Jan 2018 19:41:33 GMT" [19:42:19] RECOVERY - HHVM rendering on mw1277 is OK: HTTP OK: HTTP/1.1 200 OK - 78382 bytes in 0.231 second response time [19:42:19] RECOVERY - Nginx local proxy to apache on mw1276 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 617 bytes in 0.050 second response time [19:42:19] RECOVERY - HHVM rendering on mw1279 is OK: HTTP OK: HTTP/1.1 200 OK - 78382 bytes in 0.163 second response time [19:42:19] RECOVERY - HHVM rendering on mw1276 is OK: HTTP OK: HTTP/1.1 200 OK - 78382 bytes in 0.162 second response time [19:42:19] RECOVERY - HHVM rendering on mw1278 is OK: HTTP OK: HTTP/1.1 200 OK - 78382 bytes in 0.195 second response time [19:42:20] RECOVERY - Apache HTTP on mw1238 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 616 bytes in 0.058 second response time [19:42:20] RECOVERY - HHVM rendering on mwdebug1001 is OK: HTTP OK: 
HTTP/1.1 200 OK - 78394 bytes in 1.713 second response time [19:42:21] RECOVERY - Apache HTTP on mw1278 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 616 bytes in 0.053 second response time [19:42:21] RECOVERY - HHVM rendering on mw1238 is OK: HTTP OK: HTTP/1.1 200 OK - 78384 bytes in 3.464 second response time [19:42:22] RECOVERY - Nginx local proxy to apache on mwdebug1001 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 622 bytes in 0.035 second response time [19:42:27] Revert is created already https://gerrit.wikimedia.org/r/#/c/406632/ [19:42:28] I don't think my patch could have caused all this [19:42:29] RECOVERY - HHVM rendering on mw1263 is OK: HTTP OK: HTTP/1.1 200 OK - 78380 bytes in 6.918 second response time [19:42:29] RECOVERY - HHVM rendering on mw1262 is OK: HTTP OK: HTTP/1.1 200 OK - 78380 bytes in 7.398 second response time [19:42:29] RECOVERY - Apache HTTP on mw1279 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 616 bytes in 0.034 second response time [19:42:30] RECOVERY - Apache HTTP on mw1276 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 616 bytes in 0.040 second response time [19:42:30] PROBLEM - Restbase edge codfw on text-lb.codfw.wikimedia.org is CRITICAL: /api/rest_v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) is CRITICAL: Test Retrieve aggregated feed content for April 29, 2016 responds with malformed body (AttributeError: NoneType object has no attribute get) [19:42:30] <_joe_> Niharika: seems it is recovering? [19:42:30] RECOVERY - Nginx local proxy to apache on mw1278 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 617 bytes in 0.059 second response time [19:42:31] back working [19:42:39] RECOVERY - Apache HTTP on mw1262 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 616 bytes in 0.021 second response time [19:42:42] RECOVERY - Apache HTTP on mw1265 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 616 bytes in 3.011 second response time [19:42:42] Yeah, back. 
[19:42:43] works for me now [19:42:47] At least you get that t-shirt now. :D [19:42:48] Work [19:42:48] (03CR) 10Ayounsi: [C: 032] Add scs-c1-eqiad.mgmt.eqiad.wmnet to rancid [puppet] - 10https://gerrit.wikimedia.org/r/406630 (owner: 10Ayounsi) [19:42:51] Thank you god [19:42:51] there was some puppet changes a couple of minutes ago [19:42:54] Don't panic [19:42:56] :) [19:42:57] lol [19:42:59] PROBLEM - Restbase edge ulsfo on text-lb.ulsfo.wikimedia.org is CRITICAL: /api/rest_v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) is CRITICAL: Test Retrieve aggregated feed content for April 29, 2016 responds with malformed body (AttributeError: NoneType object has no attribute get) [19:42:59] PROBLEM - Restbase edge eqiad on text-lb.eqiad.wikimedia.org is CRITICAL: /api/rest_v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) is CRITICAL: Test Retrieve aggregated feed content for April 29, 2016 responds with malformed body (AttributeError: NoneType object has no attribute get) [19:43:00] RECOVERY - Apache HTTP on mw1263 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 616 bytes in 0.024 second response time [19:43:00] RECOVERY - Apache HTTP on mw1261 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 616 bytes in 0.024 second response time [19:43:00] RECOVERY - Apache HTTP on mw1264 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 616 bytes in 0.025 second response time [19:43:02] foks: I am freaking out so bad right now. :P [19:43:05] This happens ocassionaly [19:43:09] Niharika, aww, it's okay! [19:43:10] RECOVERY - Nginx local proxy to apache on mw1261 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 617 bytes in 0.031 second response time [19:43:10] RECOVERY - Nginx local proxy to apache on mw1263 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 617 bytes in 0.034 second response time [19:43:13] <_joe_> Niharika: we're ok now :) [19:43:17] Dont panic. 
All coming ok [19:43:19] RECOVERY - Nginx local proxy to apache on mw1262 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 617 bytes in 0.040 second response time [19:43:19] RECOVERY - HHVM rendering on mw1265 is OK: HTTP OK: HTTP/1.1 200 OK - 78380 bytes in 0.091 second response time [19:43:19] RECOVERY - Nginx local proxy to apache on mw1264 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 617 bytes in 0.035 second response time [19:43:19] RECOVERY - HHVM rendering on mw1261 is OK: HTTP OK: HTTP/1.1 200 OK - 78381 bytes in 0.102 second response time [19:43:20] RECOVERY - HHVM rendering on mw1264 is OK: HTTP OK: HTTP/1.1 200 OK - 78380 bytes in 0.078 second response time [19:43:26] srwiki work now without problems [19:43:37] Niharka: You don't live in an Island State of the US? [19:43:37] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:43:39] XD [19:43:50] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:44:19] PROBLEM - Codfw HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [1000.0] https://grafana.wikimedia.org/dashboard/file/varnish-aggregate-client-status-codes.json?panelId=3&fullscreen&orgId=1&var-site=codfw&var-cache_type=All&var-status_type=5 [19:44:19] PROBLEM - Ulsfo HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 33.33% of data above the critical threshold [1000.0] https://grafana.wikimedia.org/dashboard/file/varnish-aggregate-client-status-codes.json?panelId=3&fullscreen&orgId=1&var-site=ulsfo&var-cache_type=All&var-status_type=5 [19:44:44] <_joe_> ShakespeareFan02: still having issues [19:45:00] PROBLEM - Esams HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 20.00% of data above the critical threshold [1000.0] https://grafana.wikimedia.org/dashboard/file/varnish-aggregate-client-status-codes.json?panelId=3&fullscreen&orgId=1&var-site=esams&var-cache_type=All&var-status_type=5 [19:45:09] <_joe_> ? 
[19:45:13] Okay I'll leave it a few miniutes [19:45:18] Niharika: can we https://gerrit.wikimedia.org/r/#/c/406632/ ? [19:45:24] (reverts the dblist patch) [19:45:29] PROBLEM - Text HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [1000.0] https://grafana.wikimedia.org/dashboard/file/varnish-aggregate-client-status-codes.json?panelId=3&fullscreen&orgId=1&var-site=All&var-cache_type=text&var-status_type=5 [19:45:32] <_joe_> ShakespeareFan02: that was a question, sorry [19:45:36] !log niharika29@tin Synchronized docroot/noc/conf/uploadsdisabled.dblist: https://gerrit.wikimedia.org/r/#/c/405421 (duration: 00m 55s) [19:45:40] PROBLEM - Restbase edge esams on text-lb.esams.wikimedia.org is CRITICAL: /api/rest_v1/feed/featured/{yyyy}/{mm}/{dd} (Retrieve aggregated feed content for April 29, 2016) is CRITICAL: Test Retrieve aggregated feed content for April 29, 2016 responds with malformed body (AttributeError: NoneType object has no attribute get) [19:45:46] Hauskatze: I don't think we need to do that. [19:45:48] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:45:49] PROBLEM - Eqiad HTTP 5xx reqs/min on graphite1001 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [1000.0] https://grafana.wikimedia.org/dashboard/file/varnish-aggregate-client-status-codes.json?panelId=3&fullscreen&orgId=1&var-site=eqiad&var-cache_type=All&var-status_type=5 [19:46:01] _joe_: I was making a joke re the Island State of US thing [19:46:01] Niharika: okay... [19:46:05] I think the problem was that I synced the CommonsSettings and InitializeSettings before the delist. [19:46:07] <_joe_> :) [19:46:08] NiharikaL Sorry... [19:46:16] dblist. * [19:46:22] Niharika: Sorry, I can't even type today [19:46:24] :( [19:46:28] I didn't realize it'd crash the whole thing. 
[19:46:45] Thank you god to is all ok [19:47:00] <_joe_> Niharika: our deployment model is so error prone you shouldn't really blame yourself :) [19:47:05] Niharika: All code breaks stuff [19:47:13] Heh. :) [19:47:16] (03CR) 10Zoranzoki21: [C: 04-1] "Now no needed. Work without problems.." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406632 (owner: 10MarcoAurelio) [19:47:23] https://gu.wikipedia.org/wiki/Special:Upload?uselang=en ehh... now enabled file upload ... [19:47:39] RECOVERY - Restbase edge codfw on text-lb.codfw.wikimedia.org is OK: All endpoints are healthy [19:47:40] _joe_: "Inadequate safeguards to ensure continuity of platform delivery" XD [19:47:49] RECOVERY - Restbase edge esams on text-lb.esams.wikimedia.org is OK: All endpoints are healthy [19:47:54] Zoranzoki21: please stop messing with my patch! [19:47:59] RECOVERY - Restbase edge ulsfo on text-lb.ulsfo.wikimedia.org is OK: All endpoints are healthy [19:48:00] RECOVERY - Restbase edge eqiad on text-lb.eqiad.wikimedia.org is OK: All endpoints are healthy [19:48:01] rxy: yes, we need to revert [19:48:08] Niharika: ^ [19:48:11] <_joe_> ShakespeareFan02: you can speak enterprise! [19:48:15] Hauskatze: OK. Sorry [19:48:19] it's all very weird, it worked on mwdebug [19:48:20] Hauskatze: We need to revert still? [19:48:24] _joe_: Not fleuently [19:48:27] <_joe_> Hauskatze: why do we need to revert? [19:48:32] Niharika: yes [19:48:34] stephanebisson: Your patches are on mwdebug1002. [19:48:46] Niharika: testing now [19:48:58] _joe_: because it's not working now and strangely uploads are enabled again on wikis suposed not to be able to do local upload [19:49:06] 10Operations, 10ORES, 10Patch-For-Review, 10Scoring-platform-team (Current): Reimage ores* hosts with Debian Stretch - https://phabricator.wikimedia.org/T171851#3928820 (10akosiaris) It's half done (codfw but not eqiad). I 've been stalling it on T182799 so that we don't get hosts in a non-working state al... 
[19:49:08] it all very weird [19:49:11] <_joe_> what's not working? [19:49:23] can we please revert and discuss later? [19:49:28] Hauskatze: Okay then. Gimme a moment. [19:49:28] I'm getting kinda tense [19:49:36] At least it's not 'aliens', North Korea or Russia XD [19:49:40] Relax. You can't be more tense than me! [19:49:42] Niharika: working as expected [19:49:51] stephanebisson: Okay, syncing it out now. [19:49:53] <_joe_> but yeah let's revert and discuss after [19:49:55] (Sorry... I try to use humor when things brak) [19:50:01] *break [19:50:16] Niharika: do we have time for one more, at the end maybe? https://gerrit.wikimedia.org/r/#/c/406631/ [19:50:18] (03Abandoned) 10Jayprakash12345: Allow bureaucrats to add/remove 'accountcreator' permission on Wikidata [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406012 (https://phabricator.wikimedia.org/T185597) (owner: 10Jayprakash12345) [19:50:21] Humor is for #wikimedia-tech [19:50:57] stephanebisson: Not really. :( Have to revert one right now and a couple others to go. [19:50:59] _joe_: for some reason the patch is not working as expected. 
Wikis in 'uploadsdisabled.dblist' should also have removed 'sysops' the upload-related rights from special:listgrouprights; not only it is not happening but local uploads are also enabled again :S [19:51:06] (03PS4) 10Zoranzoki21: Set wgNamespaceRobotPolicies on ptwiki's NS_USER to noindex [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406476 (https://phabricator.wikimedia.org/T185660) [19:51:37] <_joe_> Hauskatze: ok so it's a "feature not working" issue, I'll leave it to you people [19:51:45] !log niharika29@tin Synchronized php-1.31.0-wmf.17/extensions/Flow/: https://gerrit.wikimedia.org/r/#/c/406574/ and https://gerrit.wikimedia.org/r/#/c/406622/ (duration: 01m 13s) [19:51:57] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:52:12] _joe_: yes, that's it: I'll explain on the Task later, maybe I can find someone who can help me understand why. [19:52:27] (03CR) 10Niharika29: [C: 032] "SWAT." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406632 (owner: 10MarcoAurelio) [19:52:31] I'm guessing, maybe, it has something to do with duplicate entries in commonsuploads.dblist [19:52:41] guwiki, iawiki, mswiktionary -> those is in commonsuploads.dblist (soft disable upload) and uploadsdisabled.dblist (hard disable upload) [19:52:48] but I'll check later when we get back to normal [19:52:55] rxy: thanks for confirming [19:53:00] it's strange though [19:53:20] RECOVERY - Ulsfo HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] https://grafana.wikimedia.org/dashboard/file/varnish-aggregate-client-status-codes.json?panelId=3&fullscreen&orgId=1&var-site=ulsfo&var-cache_type=All&var-status_type=5 [19:53:24] Dear Jenkins be fast now please :) [19:53:28] 10Operations, 10ORES, 10Patch-For-Review, 10Scoring-platform-team (Current): Reimage ores* hosts with Debian Stretch - https://phabricator.wikimedia.org/T171851#3928840 (10Halfak) Sorry for the confusion. T182799 is done. Will resolve. 
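A minimal sketch of the check discussed above: finding wikis that appear in both commonsuploads.dblist and uploadsdisabled.dblist. It assumes a dblist is plain text with one database name per line and optional '#' comments. The function names and sample contents are hypothetical; only the wiki names guwiki, iawiki and mswiktionary come from the log.

```python
def read_dblist(text):
    """Parse dblist text: one database name per line, '#' starts a comment (assumed format)."""
    names = set()
    for line in text.splitlines():
        name = line.split("#", 1)[0].strip()
        if name:
            names.add(name)
    return names


def overlapping_wikis(list_a, list_b):
    """Return the sorted database names present in both dblists."""
    return sorted(read_dblist(list_a) & read_dblist(list_b))


# Hypothetical sample contents, not the real files.
commonsuploads = "guwiki\niawiki\nmswiktionary\nenwiki  # illustrative entry\n"
uploadsdisabled = "guwiki\niawiki\nmswiktionary\n"

print(overlapping_wikis(commonsuploads, uploadsdisabled))
```

Against the real files, the two strings would be read from the dblists directory instead of being defined inline.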
[19:54:00] (03Merged) 10jenkins-bot: Revert "Remove upload rights on wikis where local uploads are disabled" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406632 (owner: 10MarcoAurelio) [19:54:15] I assume the reason it was okay on mwdebug is that we staged both changes on tin, used pull to stage on mwdebug, but then synced separately. This is quite common. We tend to try to document the order of sync, but don't apply it to testing. So we should either enforce separate syncs as separate commits, or implement a way to "sync to mwdebug" [19:54:17] ^ for retrospect [19:54:20] RECOVERY - Codfw HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] https://grafana.wikimedia.org/dashboard/file/varnish-aggregate-client-status-codes.json?panelId=3&fullscreen&orgId=1&var-site=codfw&var-cache_type=All&var-status_type=5 [19:54:29] RECOVERY - Text HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] https://grafana.wikimedia.org/dashboard/file/varnish-aggregate-client-status-codes.json?panelId=3&fullscreen&orgId=1&var-site=All&var-cache_type=text&var-status_type=5 [19:54:38] and those wikis should be prepend '+' for dbname in 'groupOverrides' at InitialiseSettings.php ( https://phabricator.wikimedia.org/T185898 ) [19:55:32] Is there db maintaince going on? [19:55:50] RECOVERY - Eqiad HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] https://grafana.wikimedia.org/dashboard/file/varnish-aggregate-client-status-codes.json?panelId=3&fullscreen&orgId=1&var-site=eqiad&var-cache_type=All&var-status_type=5 [19:56:09] Krinkle: that's good to know [19:56:10] (03PS5) 10Zoranzoki21: Set wgNamespaceRobotPolicies on ptwiki's NS_USER to noindex [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406476 (https://phabricator.wikimedia.org/T185660) [19:56:48] What happening with beta-mediawiki-config-update-eqiad? 
[19:57:00] RECOVERY - Esams HTTP 5xx reqs/min on graphite1001 is OK: OK: Less than 1.00% above the threshold [250.0] https://grafana.wikimedia.org/dashboard/file/varnish-aggregate-client-status-codes.json?panelId=3&fullscreen&orgId=1&var-site=esams&var-cache_type=All&var-status_type=5 [19:57:09] (03CR) 10jenkins-bot: Revert "Remove upload rights on wikis where local uploads are disabled" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406632 (owner: 10MarcoAurelio) [19:57:18] !log niharika29@tin Synchronized wmf-config/InitialiseSettings.php: Revert upload rights patch https://gerrit.wikimedia.org/r/#/c/406632/ (duration: 00m 56s) [19:57:31] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:57:45] Ignore my last comment [19:57:46] Krinkle: Yeah. [19:58:00] PROBLEM - MariaDB Slave Lag: s1 on dbstore2002 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 422.57 seconds [19:58:31] !log niharika29@tin Synchronized wmf-config/CommonSettings.php: Revert upload rights patch https://gerrit.wikimedia.org/r/#/c/406632/ (duration: 00m 55s) [19:58:37] 10Operations, 10netops: review and fix scs config - https://phabricator.wikimedia.org/T185926#3928854 (10ayounsi) [19:58:41] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [19:58:52] Krinkle: Do I sync it out the normal way when a file is deleted? [19:58:57] Sync that file? [19:58:58] 10Operations, 10netops: review and fix scs config - https://phabricator.wikimedia.org/T185926#3928867 (10ayounsi) p:05Triage>03Normal [20:00:12] I'm gonna have to go a few minutes over. 
[20:00:15] jouncebot: next [20:00:15] In 0 hour(s) and 59 minute(s): Services – Parsoid / Citoid / Mobileapps / ORES / … (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180129T2100) [20:00:39] jouncebot: next [20:00:39] In 0 hour(s) and 59 minute(s): Services – Parsoid / Citoid / Mobileapps / ORES / … (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180129T2100) [20:00:43] next window is one hour later so I think it should be okay [20:00:54] Will be [20:01:09] Niharika: so the patch is now reverted right? [20:01:15] PROBLEM - MariaDB Slave Lag: s1 on db1073 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 406.12 seconds [20:01:16] PROBLEM - MariaDB Slave Lag: s1 on db2085 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 375.82 seconds [20:01:19] PROBLEM - MariaDB Slave Lag: s1 on db2069 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 413.92 seconds [20:01:19] PROBLEM - MariaDB Slave Lag: s1 on db2055 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 414.44 seconds [20:01:24] Hauskatze: Yes. [20:01:27] Ugh, what now. [20:01:32] Niharika: I breath alleviated [20:01:33] Not me, definitely. [20:01:35] PROBLEM - MariaDB Slave Lag: s1 on db1066 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 425.22 seconds [20:01:39] PROBLEM - MariaDB Slave Lag: s1 on db2070 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 308.42 seconds [20:01:50] PROBLEM - MariaDB Slave Lag: s1 on db2042 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 302.11 seconds [20:02:08] urgh [20:02:19] RECOVERY - MariaDB Slave Lag: s1 on db2085 is OK: OK slave_sql_lag Replication lag: 0.45 seconds [20:02:21] robh: You know what's up with this? 
[20:02:29] i do not, im pinging other opsen [20:02:29] RECOVERY - MariaDB Slave Lag: s1 on db2069 is OK: OK slave_sql_lag Replication lag: 42.58 seconds [20:02:29] RECOVERY - MariaDB Slave Lag: s1 on db2055 is OK: OK slave_sql_lag Replication lag: 36.39 seconds [20:02:38] 6 [10000ms] at runtime/ext_mysql: slow query: SELECT MASTER_GTID_WAIT('0-171970637-5470008541,180359172-180359172- [20:02:38] 49702203,171970637-171970637-1089605305', 10) [20:02:38] 4 [10000ms] at runtime/ext_mysql: slow query: SELECT MASTER_GTID_WAIT('0-171970637-5470008525,180359172-180359172- [20:02:38] 49702203,171970637-171970637-1089605060', 10) [20:02:39] RECOVERY - MariaDB Slave Lag: s1 on db2070 is OK: OK slave_sql_lag Replication lag: 0.00 seconds [20:02:49] is OK. [20:02:50] RECOVERY - MariaDB Slave Lag: s1 on db2042 is OK: OK slave_sql_lag Replication lag: 0.23 seconds [20:03:00] robh: Seems like its fine now? [20:03:10] PROBLEM - MediaWiki exceptions and fatals per minute on graphite1001 is CRITICAL: CRITICAL: 80.00% of data above the critical threshold [50.0] https://grafana.wikimedia.org/dashboard/db/mediawiki-graphite-alerts?orgId=1&panelId=2&fullscreen [20:03:14] longrunning query i suppose [20:03:28] Man. [20:03:36] what's going on? [20:03:40] Niharika: okay so I'm leaving momentarily. Given that the patch is reverted there should not be any further issues on my side. [20:04:13] Hauskatze: Yeah, okay. 
[20:04:27] paravoid: not certain, had db lag on slaves and then some have cleared but not all [20:04:45] RECOVERY - MariaDB Slave Lag: s1 on db1066 is OK: OK slave_sql_lag Replication lag: 0.41 seconds [20:04:56] * volans looking [20:05:25] RECOVERY - MariaDB Slave Lag: s1 on db1073 is OK: OK slave_sql_lag Replication lag: 1.99 seconds [20:05:32] (03PS3) 10RobH: adding Ramsey Isler to statistics-privatedata-users [puppet] - 10https://gerrit.wikimedia.org/r/405982 (https://phabricator.wikimedia.org/T185356) [20:06:33] (03CR) 10RobH: [C: 032] adding Ramsey Isler to statistics-privatedata-users [puppet] - 10https://gerrit.wikimedia.org/r/405982 (https://phabricator.wikimedia.org/T185356) (owner: 10RobH) [20:07:12] 10Operations, 10Ops-Access-Requests: Requesting access to bast1001, stat1005, stat1006 for risler - https://phabricator.wikimedia.org/T185356#3928906 (10RobH) a:05RobH>03None [20:07:30] Zoranzoki21 and rxy I think we'll have to defer your changes until Evening SWAT. Sorry. :( [20:07:43] Niharika: I can not be for evening swat [20:07:48] Niharika: Can you now deploy? [20:08:00] Zoranzoki21: Tomorrow then. We're out of time. [20:08:08] Niharika: ok, thanks [20:08:11] Niharika: :( [20:08:18] 10Operations, 10Ops-Access-Requests: Requesting access to bast1001, stat1005, stat1006 for risler - https://phabricator.wikimedia.org/T185356#3913933 (10RobH) 05Open>03Resolved a:03RobH @Ramsey-WMF: Your access is now live. It may take up to 30 minutes for affected hosts to receive the update. If you h... [20:08:31] Niharika: Nothing, I will move me [20:08:33] I'd like to freeze the SWAT entirely for this week honestly [20:09:04] a lot of people are travelling, going to be travelling or are in their team offsites [20:09:50] +1/-1? 
[20:09:56] I need this patch https://gerrit.wikimedia.org/r/#/c/406603/ merged tommorrow [20:10:35] yeah ok, that seems easy enough [20:10:56] I moved for tomorrow mid-day swat [20:10:57] paravoid: +1 to freeze the SWAT [20:11:03] -1 [20:11:03] -1 [20:11:07] uhm, what's with the 10.x space there? [20:11:20] RECOVERY - MediaWiki exceptions and fatals per minute on graphite1001 is OK: OK: Less than 70.00% above the threshold [25.0] https://grafana.wikimedia.org/dashboard/db/mediawiki-graphite-alerts?orgId=1&panelId=2&fullscreen [20:11:20] BIG -1 [20:12:13] Zppix, Zoranzoki21 there was a huge spike of writes on the s1 master, which caused lag on all the slaves. There is lots of people traveling at the moment, including myself [20:12:39] marostegui: im ok doing it for today but not while they are away [20:12:44] marostegui: I only need to https://gerrit.wikimedia.org/r/#/c/406603/ be merged tomorrow because is it throttle rule for 31th [20:13:16] I can not be for evening SWAT here [20:13:20] I am sleepy [20:13:33] Zoranzoki21: if they do evening swat i can deploy it [20:13:37] Zoranzoki21: Well, I cannot be here either, I will be on a plane :) [20:13:47] But this lag was user impacting [20:13:51] Zoranzoki21 : I do it. 
[20:13:58] !log niharika29@tin Synchronized dblists/: Sync removed dblist in https://gerrit.wikimedia.org/r/#/c/406632/ (duration: 00m 56s) [20:13:58] (03CR) 10Faidon Liambotis: [C: 04-1] Add throttle rule for 1Lib1Ref event (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406603 (https://phabricator.wikimedia.org/T185857) (owner: 10Zoranzoki21) [20:14:09] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:15:18] !log niharika29@tin Synchronized docroot/noc/conf/: Sync removed dblist in https://gerrit.wikimedia.org/r/#/c/406632/ (duration: 00m 56s) [20:15:27] (03PS3) 10Zoranzoki21: Add throttle rule for 1Lib1Ref event [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406603 (https://phabricator.wikimedia.org/T185857) [20:15:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:15:35] So https://gerrit.wikimedia.org/r/#/c/405421/ has now been completely reverted. [20:15:45] Zoranzoki21: all of it is [20:15:57] well most of it, I mean [20:16:22] <_joe_> Niharika: ack, thanks :) [20:16:39] (03CR) 10Zoranzoki21: Add throttle rule for 1Lib1Ref event (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406603 (https://phabricator.wikimedia.org/T185857) (owner: 10Zoranzoki21) [20:16:55] _joe_: I'm sorry for all the trouble! 
[20:17:02] (03PS4) 10Rxy: Add throttle rule for 1Lib1Ref event [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406603 (https://phabricator.wikimedia.org/T185857) (owner: 10Zoranzoki21) [20:17:06] Zoranzoki21: https://en.wikipedia.org/wiki/Private_network [20:17:12] (03CR) 10Rxy: ">" (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406603 (https://phabricator.wikimedia.org/T185857) (owner: 10Zoranzoki21) [20:17:46] Niharika: thanks [20:18:07] thanks Niharika [20:18:17] (03CR) 10Faidon Liambotis: [C: 04-1] Add throttle rule for 1Lib1Ref event (031 comment) [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406603 (https://phabricator.wikimedia.org/T185857) (owner: 10Zoranzoki21) [20:18:20] How's the lag looking now? Where's the dashboard for it? [20:18:49] Niharika: lag is recovered for now, let me paste the link [20:19:03] <_joe_> !log depooling mw1275 to take an APC dump [20:19:09] https://grafana.wikimedia.org/dashboard/db/mysql-replication-lag?orgId=1&from=now-1h&to=now [20:19:15] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:19:15] volans: Okay, phew. :) Thanks! [20:20:19] (03PS5) 10Rxy: Add throttle rule for 1Lib1Ref event [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406603 (https://phabricator.wikimedia.org/T185857) (owner: 10Zoranzoki21) [20:20:24] (03CR) 10Zoranzoki21: "I will fix all. 
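The 10.x space questioned in review of the throttle rule is private address space, as the linked Private_network article describes. A small sketch of flagging private ranges with Python's standard ipaddress module, assuming the rule carries a list of CIDR ranges. The function name and the list shown are illustrative, not the contents of the patch.

```python
import ipaddress


def private_ranges(ranges):
    """Return the CIDR ranges that fall inside private or otherwise reserved address space."""
    return [r for r in ranges if ipaddress.ip_network(r, strict=False).is_private]


# Hypothetical candidate ranges for a throttle exemption.
candidates = ["10.0.0.0/8", "8.8.8.0/24", "192.168.1.0/24"]

# Ranges reported here would not identify an event venue on the public internet.
print(private_ranges(candidates))
```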
Wait small" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406603 (https://phabricator.wikimedia.org/T185857) (owner: 10Zoranzoki21) [20:21:02] (03CR) 10Zoranzoki21: "WAIT" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406603 (https://phabricator.wikimedia.org/T185857) (owner: 10Zoranzoki21) [20:21:10] Niharika: looks like writes are back to normal values: https://grafana.wikimedia.org/dashboard/db/mysql?panelId=2&fullscreen&orgId=1&var-dc=eqiad%20prometheus%2Fops&var-server=db1052&var-port=9104&from=now-3h&to=now [20:21:35] Niharika: we still have some lags in the tens of seconds that are not normal [20:21:38] though [20:21:41] Niharika: delete can be done later as long as it isn't referenced, but I think for delete you need to sync the parent dir [20:21:47] but mostly recovered I would say, right marostegui ? [20:21:56] Krinkle: Yep, done now. Thanks! [20:22:01] yeah, I am still checking a few things, but looks good [20:22:37] labs still delayed, but that is kinda expected [20:23:55] (03PS6) 10Zoranzoki21: Add throttle rule for 1Lib1Ref event [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406603 (https://phabricator.wikimedia.org/T185857) [20:24:28] (03PS7) 10Zoranzoki21: Add throttle rule for 1Lib1Ref event [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406603 (https://phabricator.wikimedia.org/T185857) [20:24:53] Zoranzoki21: you reintroduced all the 10.x space... 
:) [20:25:21] (03PS8) 10Zoranzoki21: Add throttle rule for 1Lib1Ref event [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406603 (https://phabricator.wikimedia.org/T185857) [20:25:21] <_joe_> !log repooling mw1275, taking samples of apc metadata at intervals [20:25:29] fixeed [20:25:32] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [20:28:49] (03CR) 10Rxy: [C: 031] "LGTM" [mediawiki-config] - 10https://gerrit.wikimedia.org/r/406603 (https://phabricator.wikimedia.org/T185857) (owner: 10Zoranzoki21) [20:29:38] jouncebot:next [20:29:41] jouncebot: next [20:29:42] In 0 hour(s) and 30 minute(s): Services – Parsoid / Citoid / Mobileapps / ORES / … (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180129T2100) [20:30:27] Can you deploy this? https://gerrit.wikimedia.org/r/#/c/406524/ [20:30:36] its for integration [20:30:43] (03CR) 10Dzahn: ""FLAPPINGSTART", "FLAPPINGSTOP", "FLAPPINGDISABLED" could technically show up since all our contacts have "host_notification_options " [puppet] - 10https://gerrit.wikimedia.org/r/406535 (https://phabricator.wikimedia.org/T185862) (owner: 10Dzahn) [20:30:55] (03PS15) 10Zoranzoki21: Enable Extension:Newsletter on hewiki [mediawiki-config] - 10https://gerrit.wikimedia.org/r/381537 (https://phabricator.wikimedia.org/T177151) [20:31:21] (03PS3) 10Zoranzoki21: icinga: add notification type to SMS content [puppet] - 10https://gerrit.wikimedia.org/r/406535 (https://phabricator.wikimedia.org/T185862) (owner: 10Dzahn) [20:37:02] 10Operations, 10Gerrit, 10Patch-For-Review, 10Release-Engineering-Team (Kanban): Investigate seemingly random Gerrit slow-downs - https://phabricator.wikimedia.org/T148478#3929042 (10demon) 05Open>03Resolved [20:40:17] (03CR) 10Chad: "Tbh I never thought this was worth the effort anyway. Let's just abandon it and decline the task. Who cares?" 
[mediawiki-config] - 10https://gerrit.wikimedia.org/r/392184 (https://phabricator.wikimedia.org/T45956) (owner: 10TerraCodes) [20:43:20] sorry for the delay, I got language nerd-swiped by my boss [20:43:26] are things ok? [20:43:30] (now in office) [20:43:47] (03CR) 10Zoranzoki21: [C: 031] "I added for swat before. Zeljko no wanted to merge because you no reviewed this." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/392184 (https://phabricator.wikimedia.org/T45956) (owner: 10TerraCodes) [20:44:26] apergos: last status was "yes, apart from labs which is recovering slowly" [20:44:35] thanks mutante [20:44:46] the page caught me as I was crossing the street to come into the building [20:45:16] we still have some more writes than before, but not in an alarming write [20:45:19] *rate [20:45:25] ok [20:58:37] jouncebot: next [20:58:37] In 0 hour(s) and 1 minute(s): Services – Parsoid / Citoid / Mobileapps / ORES / … (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180129T2100) [20:59:10] jouncebot: next [20:59:11] In 0 hour(s) and 0 minute(s): Services – Parsoid / Citoid / Mobileapps / ORES / … (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180129T2100) [20:59:29] are we holding all deploys for this week? I see that we're doing no swats [20:59:47] Zoranzoki21 there's no swat this week [20:59:57] it's be cancelled the whole week. [20:59:57] paladox: QQQ [21:00:05] cscott, arlolra, subbu, bearND, halfak, and Amir1: #bothumor I ♥ Unicode. All rise for Services – Parsoid / Citoid / Mobileapps / ORES / … deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180129T2100). [21:00:05] No GERRIT patches in the queue for this window AFAICS. [21:00:24] not sure what qqq is? [21:00:28] What we will with this? https://gerrit.wikimedia.org/r/#/c/406603/ [21:00:49] QQQ is on my language your OMG [21:01:03] Up to releng + ops. [21:01:17] Also qqq means nothing to me.
[21:01:34] Patch https://gerrit.wikimedia.org/r/#/c/406603/ needs to be deployed [21:01:41] Have to be [21:02:32] Zoranzoki21 you will have to contact someone who decide weather's swat or which patches can go through. [21:02:55] There is no swat this week [21:03:12] bawolff: I know but I need https://gerrit.wikimedia.org/r/#/c/406603/ deployed [21:03:49] until 31th [21:03:51] January [21:03:53] Please [21:03:55] Zoranzoki21 yes but you need to contact greg g. [21:04:49] Zoranzoki21: Calm down everything will be okay [21:05:42] Zppix: I will contact greg [21:05:58] Zppix: And after merging, than will be everything okay [21:06:03] well, that's one heck of a range [21:06:14] bawolff: i agree [21:07:52] well, we've just made a swat window, it's been suddenly cancelled for the remainer of the week? [21:08:04] Yes [21:08:14] Hauskatze: swat is cancelled until next week [21:08:18] bawolff: now that you're here, got a couple of questions (easy ones) [21:08:30] sure [21:08:41] Zppix: thank God I swatted my patches an hour ago :) [21:08:46] Lol [21:08:58] 10Operations, 10TemplateStyles, 10Traffic, 10Wikimedia-Extension-setup, and 4 others: Deploy TemplateStyles to WMF production - https://phabricator.wikimedia.org/T133410#3929223 (10Tgr) [21:09:08] (03PS1) 10Dzahn: icinga: fix host names for ORES web node monitoring [puppet] - 10https://gerrit.wikimedia.org/r/406653 [21:09:11] bawolff: so there's a security patch for review for an extension and I was wondering if you'd like to take a look? [21:09:21] it's just a couple of lines [21:09:22] Link? [21:09:54] bawolff: https://phabricator.wikimedia.org/T185652 [21:11:07] (03PS2) 10Dzahn: icinga: fix host names for ORES web node monitoring [puppet] - 10https://gerrit.wikimedia.org/r/406653 (https://phabricator.wikimedia.org/T185929) [21:11:15] the other question is wrt T155725 - since the extension has changed since (some files deleted, etc.) 
I was wondering if we need to request another security review [21:11:16] T155725: Security review for StopForumSpam - https://phabricator.wikimedia.org/T155725 [21:11:28] 10Operations, 10Patch-For-Review, 10cloud-services-team (Kanban): Onboard bstorm to WMF - https://phabricator.wikimedia.org/T185493#3929231 (10chasemp) @Andrew when you have a chance can you do whatever `cloud admin` portion exists on wikitech please? [21:12:30] PROBLEM - HHVM rendering on mw1298 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:13:20] (03CR) 10Dzahn: [C: 032] icinga: fix host names for ORES web node monitoring [puppet] - 10https://gerrit.wikimedia.org/r/406653 (https://phabricator.wikimedia.org/T185929) (owner: 10Dzahn) [21:13:20] RECOVERY - HHVM rendering on mw1298 is OK: HTTP OK: HTTP/1.1 200 OK - 78384 bytes in 0.166 second response time [21:14:23] Hauskatze: commented on bug [21:14:33] bawolff: checking [21:14:53] !log arlolra@tin Started deploy [parsoid/deploy@cc1574b]: Updating Parsoid to 91854ff [21:15:03] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:15:22] jouncebot: next [21:15:22] In 0 hour(s) and 44 minute(s): Weekly Security deployment window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180129T2200) [21:15:30] bawolff: thanks :D Mainframe will surely fix that little issue too and we'd be good to go [21:15:42] Glad to help :) [21:15:49] oh, if he's quick about it maybe we can add that to that window? [21:16:05] not sure how to deploy patches not in gerrit [21:16:10] AutoProxyBlock is not Wikimedia deployed afaik [21:16:20] Its not [21:17:54] So nothing to deploy for us. So all that needs to be done is upload patch to gerrit, and then send an announcement email [21:18:36] * bawolff goes does his laundry [21:18:38] announcement email to whom? 
[21:19:24] to mediawiki-l list, probably [21:19:47] or mediawiki-announce, rather [21:23:15] !log arlolra@tin Finished deploy [parsoid/deploy@cc1574b]: Updating Parsoid to 91854ff (duration: 08m 23s) [21:23:26] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:27:41] (03CR) 10Chad: "I'm not going to review it--I think it's dumb." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/392184 (https://phabricator.wikimedia.org/T45956) (owner: 10TerraCodes) [21:31:37] 10Operations, 10Patch-For-Review, 10cloud-services-team (Kanban): Onboard bstorm to WMF - https://phabricator.wikimedia.org/T185493#3929298 (10Dzahn) @Bstorm feel free to ping me about the Icinga contact part, happy to do it together or show you where to do it self-service. you can pick your own phone number... [21:33:57] !log Updated Parsoid to 91854ff (T185643, T185346, T185385, T185267) [21:34:12] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log [21:34:12] T185267: Maximum call stack side exceeded in linter - https://phabricator.wikimedia.org/T185267 [21:34:12] T185385: Cannot read property 'extsrc' of null - https://phabricator.wikimedia.org/T185385 [21:34:12] T185346: Cannot read property '2' of undefined - https://phabricator.wikimedia.org/T185346 [21:34:13] T185643: Expecting : in parser function definiton - https://phabricator.wikimedia.org/T185643 [21:42:23] 10Operations, 10ops-codfw, 10DBA, 10Patch-For-Review: db2036 storage issues? (mysql crashed, installer issues) - https://phabricator.wikimedia.org/T185294#3929377 (10jcrespo) Also, I did a thorough compare.py on the core content of all open wikis and it didn't crash, unlike the last time. [21:44:51] in about 5 minutes I'm going to be heading out of the office, moving to my work location (relatives) for the rest of my stay. 
Probably back on in 60 to 90 minutes [21:47:21] (03CR) 10Dzahn: "Icinga is OK for the ORES check https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?search_string=ores-web" [puppet] - 10https://gerrit.wikimedia.org/r/406653 (https://phabricator.wikimedia.org/T185929) (owner: 10Dzahn) [21:52:11] RECOVERY - MariaDB Slave Lag: s1 on dbstore2002 is OK: OK slave_sql_lag Replication lag: 21.57 seconds [21:57:21] (03PS1) 10Dzahn: icinga: add new and improved SMS notification commands [puppet] - 10https://gerrit.wikimedia.org/r/406768 (https://phabricator.wikimedia.org/T185862) [21:59:49] (03PS1) 10Dzahn: icinga: retab notification_commands template [puppet] - 10https://gerrit.wikimedia.org/r/406769 [22:00:04] bawolff and Reedy: Dear deployers, time to do the Weekly Security deployment window deploy. Dont look at me like that. You signed up for it. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180129T2200). [22:00:04] No GERRIT patches in the queue for this window AFAICS. [22:02:00] (03PS2) 10Dzahn: icinga: add new and improved SMS notification commands [puppet] - 10https://gerrit.wikimedia.org/r/406768 (https://phabricator.wikimedia.org/T185862) [22:02:57] (03CR) 10Dzahn: [C: 032] "just adding a new command, not changing existing one in this step" [puppet] - 10https://gerrit.wikimedia.org/r/406768 (https://phabricator.wikimedia.org/T185862) (owner: 10Dzahn) [22:03:41] (03PS2) 10Dzahn: icinga: retab notification_commands template [puppet] - 10https://gerrit.wikimedia.org/r/406769 [22:04:13] 10Operations, 10DC-Ops: document all scs connections - https://phabricator.wikimedia.org/T175876#3929471 (10ayounsi) 05Open>03Resolved scs-c1-eqiad.mgmt.eqiad.wmnet added to rancid. 
[22:04:15] 10Operations, 10ops-eqiad, 10DC-Ops: scs-c1-eqiad unresponsive - https://phabricator.wikimedia.org/T175625#3929473 (10ayounsi) [22:04:48] 10Operations, 10ops-eqiad, 10DC-Ops: scs-c1-eqiad unresponsive - https://phabricator.wikimedia.org/T175625#3598199 (10ayounsi) [22:04:50] 10Operations, 10DC-Ops: document all scs connections - https://phabricator.wikimedia.org/T175876#3929478 (10ayounsi) 05Resolved>03Open a:05ayounsi>03None [22:08:06] (03CR) 10Dzahn: [C: 032] "retab only" [puppet] - 10https://gerrit.wikimedia.org/r/406769 (owner: 10Dzahn) [22:21:28] (03Abandoned) 10MarcoAurelio: Allow eswiki bureaucrats to add/remove 'accountcreator' [mediawiki-config] - 10https://gerrit.wikimedia.org/r/395775 (https://phabricator.wikimedia.org/T182201) (owner: 10MarcoAurelio) [22:24:53] (03CR) 10Zoranzoki21: [C: 04-1] "1aa09e2b39dd made this now the default behaviour on all WMF sites. There's no longer a need to change the local configuration for euwiki." [mediawiki-config] - 10https://gerrit.wikimedia.org/r/405771 (https://phabricator.wikimedia.org/T185531) (owner: 10Framawiki) [22:31:40] (03PS1) 10Dzahn: icinga: add temp contactgroup/host/service for testing SMS content [puppet] - 10https://gerrit.wikimedia.org/r/406772 (https://phabricator.wikimedia.org/T185862) [22:32:00] (03CR) 10jerkins-bot: [V: 04-1] icinga: add temp contactgroup/host/service for testing SMS content [puppet] - 10https://gerrit.wikimedia.org/r/406772 (https://phabricator.wikimedia.org/T185862) (owner: 10Dzahn) [22:37:59] (03PS2) 10Dzahn: icinga: add temp contactgroup/host/service for testing SMS content [puppet] - 10https://gerrit.wikimedia.org/r/406772 (https://phabricator.wikimedia.org/T185862) [22:38:29] (03CR) 10jerkins-bot: [V: 04-1] icinga: add temp contactgroup/host/service for testing SMS content [puppet] - 10https://gerrit.wikimedia.org/r/406772 (https://phabricator.wikimedia.org/T185862) (owner: 10Dzahn) [22:42:47] (03PS3) 10Dzahn: icinga: add temp contactgroup/host/service for testing 
SMS content [puppet] - 10https://gerrit.wikimedia.org/r/406772 (https://phabricator.wikimedia.org/T185862) [22:46:08] (03PS4) 10Dzahn: icinga: add temp contactgroup/host/service for testing SMS content [puppet] - 10https://gerrit.wikimedia.org/r/406772 (https://phabricator.wikimedia.org/T185862) [22:51:13] (03CR) 10Dzahn: [C: 032] icinga: add temp contactgroup/host/service for testing SMS content [puppet] - 10https://gerrit.wikimedia.org/r/406772 (https://phabricator.wikimedia.org/T185862) (owner: 10Dzahn) [22:59:14] 10Operations, 10Patch-For-Review, 10cloud-services-team (Kanban): Onboard bstorm to WMF - https://phabricator.wikimedia.org/T185493#3929668 (10Andrew) >>! In T185493#3929231, @chasemp wrote: > @Andrew when you have a chance can you do whatever `cloud admin` portion exists on wikitech please? Done! [23:24:57] (03PS2) 10Giuseppe Lavagetto: Refactor conftool.action, add the edit action [software/conftool] - 10https://gerrit.wikimedia.org/r/405303 [23:26:04] (03CR) 10jerkins-bot: [V: 04-1] Refactor conftool.action, add the edit action [software/conftool] - 10https://gerrit.wikimedia.org/r/405303 (owner: 10Giuseppe Lavagetto) [23:35:26] PROBLEM - Host foobar.wmflabs.org is DOWN: check_ping: Invalid hostname/address - foobar.wmflabs.org [23:35:37] lol [23:35:41] mutante ^^^ [23:35:53] hang on, that's planets [23:36:25] yes, i did that on purpose [23:36:54] though it didnt text me personally as it was supposed to yet [23:37:16] i want to test a change in the format of those SMS messages without touching existing things [23:37:26] RECOVERY - Host foobar.wmflabs.org is UP: PING OK - Packet loss = 0%, RTA = 0.98 ms [23:37:49] why it just recovered is another story.. 
[23:39:28] it's just planet because it was an existing web service just used for testing and clicking "add proxy" on horizon is so quick :) [23:39:44] heh [23:44:55] <_joe_> mutante: I was looking at your kill_ganglia thing [23:45:26] <_joe_> and I think there is one place where we still use ganglia_clusters, where we can't really stop doing it IMHO [23:45:48] <_joe_> and that is prometheus, where we need to know our logical division between clusters [23:45:56] PROBLEM - Host foobar.wmflabs.org is DOWN: check_ping: Invalid hostname/address - foobar.wmflabs.org [23:47:36] RECOVERY - Host foobar.wmflabs.org is UP: PING OK - Packet loss = 0%, RTA = 1.78 ms [23:48:40] _joe_: is it this one? https://gerrit.wikimedia.org/r/#/c/382930/ [23:48:47] oh, prometheus. wait [23:49:22] is it really still used there, afaict godog removed it already [23:49:47] grep'ed for ganglia_cluster through the repo [23:51:42] i see.. cluster_config uses it and prometheus uses cluster_config . .hrm [23:52:56] _joe_: we can just rename it but keep the cluster list in Hiera, just not call it "ganglia_clusters [23:53:37] 10Operations, 10Traffic: varnish 5.1.3 frontend child restarted - https://phabricator.wikimedia.org/T185968#3929788 (10ema) [23:53:52] (03PS1) 10Volans: Cumin: add custom backend in WMCS [puppet] - 10https://gerrit.wikimedia.org/r/406778 (https://phabricator.wikimedia.org/T185967) [23:53:54] (03PS1) 10Volans: NFS: add custom script to generate target hosts [puppet] - 10https://gerrit.wikimedia.org/r/406779 (https://phabricator.wikimedia.org/T185967) [23:54:22] (03CR) 10jerkins-bot: [V: 04-1] Cumin: add custom backend in WMCS [puppet] - 10https://gerrit.wikimedia.org/r/406778 (https://phabricator.wikimedia.org/T185967) (owner: 10Volans) [23:54:36] PROBLEM - Host foobar.wmflabs.org is DOWN: CRITICAL - Host not found (foobar.wmflabs.org) [23:54:46] RECOVERY - Host foobar.wmflabs.org is UP: PING OK - Packet loss = 0%, RTA = 0.54 ms [23:54:53] well, it doesnt page me and i dunno why 
its flapping. meh [23:55:11] i dont want to spam the channel either, just needed notifications on .. will turn them off again for a while [23:55:55] (03PS2) 10Volans: Cumin: add custom backend in WMCS [puppet] - 10https://gerrit.wikimedia.org/r/406778 (https://phabricator.wikimedia.org/T185967) [23:55:57] (03PS2) 10Volans: NFS: add custom script to generate target hosts [puppet] - 10https://gerrit.wikimedia.org/r/406779 (https://phabricator.wikimedia.org/T185967) [23:55:58] <_joe_> mutante: my idea exactly [23:56:38] 10Operations, 10Traffic: varnish 5.1.3 frontend child restarted - https://phabricator.wikimedia.org/T185968#3929809 (10ema) p:05Triage>03Normal [23:57:39] _joe_: turned off notifications for host and service. something fails to work as planned. the idea was that it should have only one notification method, send SMS to me, and not also IRC or something else [23:58:12] <_joe_> mutante: did puppet run on both the host and einsteinium? [23:58:26] <_joe_> also, did whatever you did reload icinga? [23:58:30] who broke beta :( [23:58:49] hmm, only some pages [23:58:57] https://en.wikipedia.beta.wmflabs.org/w/index.php?title=Bird&action=history displays an error for me [23:59:24] but other pages appear to work. so not critical, please ignore me. i'll file a bug [23:59:31] _joe_: i did check the generated configs on einsteinium. they were as i wanted them. but maybe the restart/reload, yea [23:59:44] will check in a min