[05:12:49] !log packagist-mirror fixed up apache config, and enabled systemd timer to mirror every 5 minutes
[05:12:51] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Packagist-mirror/SAL
[05:13:22] hey
[05:13:27] I created the instance exactly 7 months ago
[14:01:03] Technical Advice IRC meeting starting in 60 minutes in channel #wikimedia-tech, hosts: @chiborg & @milimetric - all questions welcome, more info: https://www.mediawiki.org/wiki/Technical_Advice_IRC_Meeting
[14:29:18] !log hound upgrade python3.4 on hound-app-01 and hound-puppet-02 T226480
[14:29:20] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Hound/SAL
[14:29:20] T226480: toolforge: puppet issue probably related to puppet-enc - https://phabricator.wikimedia.org/T226480
[14:30:53] jeh: more that were missed by the cumin run? *sigh*
[14:50:47] Technical Advice IRC meeting starting in 10 minutes in channel #wikimedia-tech, hosts: @chiborg & @milimetric - all questions welcome, more info: https://www.mediawiki.org/wiki/Technical_Advice_IRC_Meeting
[16:51:30] It looks like there's some data missing from the replica.
[16:52:06] https://en.wikipedia.org/w/index.php?title=Milicent_Patrick&diff=prev&oldid=903318225
[16:52:17] It's not showing up: https://quarry.wmflabs.org/query/9297
[16:52:29] along with other recent edits by the same user.
[16:54:45] Oh, I see... https://tools.wmflabs.org/replag/
[17:22:18] That's a lot of lag :-/
[17:28:13] I'm going to randomly guess that there are some big, slow queries running on the analytics host right now.
[17:34:02] !log tools.meta Init. SAL
[17:34:03] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.meta/SAL
[17:39:01] bd808: lag on s4 and s8 is ~40 hours. Those are wild queries, if that's the case
[17:39:20] https://tools.wmflabs.org/replag/
[17:40:15] my understanding is that the threads which do the replication are set at a lower level of priority than inbound queries, so one of the sources of lag is just a high rate of requests.
[17:41:06] like, a bot, for example?
[17:41:24] right. or 50 bots
[17:41:54] * hauskatze always uses the 'maxlag' param to help prevent replag
[17:43:44] right now the 3 servers are set up with 2 serving *.web and 1 doing *.analytics. A month ago it was 1 web & 2 analytics. It might be that we were wrong in thinking that the actor and comment table changes would stress the web pool more.
[17:51:09] labsdb1010 has a load avg of 30
[17:52:02] That's one of the two web replicas
[17:52:15] So actually, I think the problem is the web replicas are getting hammered by something
[17:52:52] labsdb1011, the analytics replica, has a load of 1
[17:53:00] It's barely working at all
[17:53:21] The other web replica has a load avg of 5
[17:54:46] at this second, s4 and s8 look to have the same lag on both clusters, so that may indicate load on the sanitarium server that feeds them
[17:54:49] It's all mysqld. That server is getting slammed
[17:55:11] That as well, but labsdb1010 is overloaded in general
[18:02:13] Apparently one is depooled (which has a load of 1) and work is going on. That's causing additional load on one server and generally likely affecting lag.
[18:02:22] Just to update those following this conversation
[19:01:17] bd808 I'm finding the fingerprint of cyberbot-exec-01 to be different from when I last logged in. Is this expected?
[19:04:20] andrewbogott ^
[19:04:41] When did you last log in?
[19:04:44] (Not expected)
[19:05:07] I don't remember. Maybe not in the last 2 months
[19:05:39] andrewbogott pm?
[19:17:21] andrewbogott, Cyberpower678: we have seen host fingerprints change sometimes when an instance has been migrated from the old region to eqiad1-r.
[19:18:09] bd808 thank you. I've been talking with andrewbogott to confirm the fingerprint I'm getting is authentic
[19:18:12] And it is
[19:19:04] in ~20 years of using ssh I've honestly never personally encountered a mitm attack. Fingerprint changes are almost always the result of reimaging and missing a documentation update. Not to say it could never happen, but it's not my first thought when I see the client notice
[19:20:01] bd808 I choose to be on the safe side.
[19:25:29] andrewbogott thanks for your help
[19:25:35] np
[20:14:04] !log tools.lexeme-forms deployed e74ff290cc (duplicates API bug fix) [actually deployed 2 hours ago, forgot to log]
[20:14:07] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.lexeme-forms/SAL
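
Note on the 'maxlag' param mentioned at 17:41:54: a minimal Python sketch of how a bot might honor it when calling the MediaWiki action API. The API accepts a `maxlag` request parameter and, when the servers are lagged beyond that value, answers with an error whose code is `maxlag` plus a `Retry-After` header. Everything else here is an assumption for illustration: the helper name `call_with_maxlag`, the `fetch` callable (a stand-in for whatever HTTP client the bot uses, returning a parsed JSON body and a headers dict), and the retry limits. Whether this helps with the replica lag discussed in the log is not established there; it only shows the back-off pattern.

```python
import time


def call_with_maxlag(fetch, params, maxlag=5, max_retries=3, sleep=time.sleep):
    """Call the API via `fetch`, backing off while the server reports lag.

    `fetch` is a hypothetical transport: it takes a dict of request
    parameters and returns (parsed_json_body, response_headers).
    """
    # Copy the caller's parameters and add maxlag without mutating the input.
    request = dict(params, maxlag=maxlag, format="json")

    for attempt in range(max_retries + 1):
        body, headers = fetch(request)

        # Any response that is not a maxlag error is handed back as-is.
        if body.get("error", {}).get("code") != "maxlag":
            return body

        if attempt == max_retries:
            break

        # Wait as long as the server asks; fall back to 5 seconds
        # (an assumed default) if the header is absent.
        sleep(int(headers.get("Retry-After", 5)))

    raise RuntimeError("server still lagged after %d retries" % max_retries)
```

A bot would pass its own HTTP function as `fetch`, for example a thin wrapper around its existing request code, so the retry logic stays independent of the client library.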