[10:52:50] serviceops, Operations, ops-codfw, User-jijiki: Degraded RAID on thumbor2002 - https://phabricator.wikimedia.org/T214813 (jijiki) Server will be re-imaged to stretch as part of upgrading Thumbor servers to stretch - T214597
[10:53:32] serviceops, Operations, Thumbor, Patch-For-Review, User-jijiki: Thumbor upgrade to stretch plan - https://phabricator.wikimedia.org/T214597 (jijiki)
[10:53:35] serviceops, Operations, ops-codfw, User-jijiki: Degraded RAID on thumbor2002 - https://phabricator.wikimedia.org/T214813 (jijiki)
[10:53:38] serviceops, Operations, Thumbor, Patch-For-Review, and 2 others: Upgrade Thumbor servers to Stretch - https://phabricator.wikimedia.org/T170817 (jijiki)
[10:53:44] serviceops, Operations, ops-codfw, User-jijiki: Degraded RAID on thumbor2002 - https://phabricator.wikimedia.org/T214813 (jijiki) Open→Resolved
[10:53:47] serviceops, Operations, Thumbor, Patch-For-Review, and 2 others: Upgrade Thumbor servers to Stretch - https://phabricator.wikimedia.org/T170817 (jijiki)
[11:30:53] serviceops, Operations, Thumbor, Patch-For-Review, User-jijiki: Thumbor upgrade to stretch plan - https://phabricator.wikimedia.org/T214597 (ops-monitoring-bot) Script wmf-auto-reimage was launched by jiji on cumin2001.codfw.wmnet for hosts: ` thumbor2002.codfw.wmnet ` The log can be found in...
[11:38:56] serviceops, Operations, Thumbor, Patch-For-Review, User-jijiki: Thumbor upgrade to stretch plan - https://phabricator.wikimedia.org/T214597 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['thumbor2002.codfw.wmnet'] ` Of which those **FAILED**: ` ['thumbor2002.codfw.wmnet'] `
[11:39:20] serviceops, Operations, Thumbor, Patch-For-Review, User-jijiki: Thumbor upgrade to stretch plan - https://phabricator.wikimedia.org/T214597 (ops-monitoring-bot) Script wmf-auto-reimage was launched by jiji on cumin2001.codfw.wmnet for hosts: ` thumbor2002.codfw.wmnet ` The log can be found in...
[11:39:23] serviceops, Operations, Thumbor, Patch-For-Review, User-jijiki: Thumbor upgrade to stretch plan - https://phabricator.wikimedia.org/T214597 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['thumbor2002.codfw.wmnet'] ` Of which those **FAILED**: ` ['thumbor2002.codfw.wmnet'] `
[11:41:20] serviceops, Operations, Thumbor, Patch-For-Review, User-jijiki: Thumbor upgrade to stretch plan - https://phabricator.wikimedia.org/T214597 (ops-monitoring-bot) Script wmf-auto-reimage was launched by jiji on cumin2001.codfw.wmnet for hosts: ` thumbor2002.codfw.wmnet ` The log can be found in...
[11:41:24] serviceops, Operations, Thumbor, Patch-For-Review, User-jijiki: Thumbor upgrade to stretch plan - https://phabricator.wikimedia.org/T214597 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['thumbor2002.codfw.wmnet'] ` Of which those **FAILED**: ` ['thumbor2002.codfw.wmnet'] `
[11:44:21] serviceops, Operations, Thumbor, Patch-For-Review, User-jijiki: Thumbor upgrade to stretch plan - https://phabricator.wikimedia.org/T214597 (ops-monitoring-bot) Script wmf-auto-reimage was launched by jiji on cumin2001.codfw.wmnet for hosts: ` thumbor2002.codfw.wmnet ` The log can be found in...
[12:15:03] serviceops, Operations, Thumbor, Patch-For-Review, User-jijiki: Thumbor upgrade to stretch plan - https://phabricator.wikimedia.org/T214597 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['thumbor2002.codfw.wmnet'] ` Of which those **FAILED**: ` ['thumbor2002.codfw.wmnet'] `
[12:15:27] serviceops, Operations, Thumbor, Patch-For-Review, User-jijiki: Thumbor upgrade to stretch plan - https://phabricator.wikimedia.org/T214597 (ops-monitoring-bot) Script wmf-auto-reimage was launched by jiji on cumin2001.codfw.wmnet for hosts: ` thumbor2002.codfw.wmnet ` The log can be found in...
[12:41:45] serviceops, Operations, Thumbor, Patch-For-Review, User-jijiki: Thumbor upgrade to stretch plan - https://phabricator.wikimedia.org/T214597 (Gilles) Let me know when the host is ready for testing
[12:51:31] serviceops, Operations, Thumbor, Patch-For-Review, User-jijiki: Thumbor upgrade to stretch plan - https://phabricator.wikimedia.org/T214597 (jijiki)
[12:51:35] serviceops, Operations, ops-codfw, User-jijiki: Degraded RAID on thumbor2002 - https://phabricator.wikimedia.org/T214813 (jijiki) Resolved→Open @papaul I am unable to reimage the server because PXE boot is failing. Server says: ` Broadcom UNDI PXE-2.1 v16.4.3 Copyright (C) 2000-2014 Bro...
[12:51:38] serviceops, Operations, Thumbor, Patch-For-Review, and 2 others: Upgrade Thumbor servers to Stretch - https://phabricator.wikimedia.org/T170817 (jijiki)
[12:53:25] serviceops, Operations, Thumbor, Patch-For-Review, User-jijiki: Thumbor upgrade to stretch plan - https://phabricator.wikimedia.org/T214597 (jijiki) @Gilles It looks like we have some issues with thumbor2002, we are investigating if we can continue the upgrade with other host.
[12:54:09] !log Depooling thumbor1004 to check if the rest of our hosts can handle the load without it - T214597
[13:15:44] serviceops, Operations, Thumbor, Patch-For-Review, User-jijiki: Thumbor upgrade to stretch plan - https://phabricator.wikimedia.org/T214597 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['thumbor2002.codfw.wmnet'] ` Of which those **FAILED**: ` ['thumbor2002.codfw.wmnet'] `
[13:20:12] <_joe_> jijiki: I would wait for papaul tomorrow, tbh
[13:20:51] we had a look with gilles
[13:21:05] after depooling thumbor1004
[13:21:13] not much changed on the rest of the clusted
[13:21:29] we feel that we can survive with 6 hosts
[13:22:20] server load and cpu is low
[13:23:20] https://grafana.wikimedia.org/d/000000607/cluster-overview?orgId=1&var-datasource=eqiad%20prometheus%2Fops&var-cluster=thumbor&var-instance=All&from=now-3h&to=now&refresh=5s
[13:23:42] <_joe_> ok, your call :)
[13:24:16] <_joe_> it was more "we're not in an emergency to upgrade" than worrying about load
[13:25:23] jijiki: i think the !log doesnt work in this channel, that message does not appear on the SAL
[13:25:26] >(
[13:25:29] :)
[13:25:37] fsero: yes wrong channel
[13:25:38] sigh
[13:25:58] tx for noticing
[13:27:50] I'll postpone it for later
[13:28:11] I will step out for a bit later
[14:21:04] <_joe_> so, for today's meeting, can someone create an etherpad with a non-changing url?
[14:41:59] check out the email _joe_
[14:42:32] <_joe_> thanks (although the url has the date :P)
[16:07:54] serviceops, Analytics, Operations, Research, and 4 others: Transferring data from Hadoop to production MySQL database - https://phabricator.wikimedia.org/T213566 (akosiaris) For the record, just saying pointing out that the question of a new VM versus mwmaint1002 is probably irrelevant here. We c...
[16:30:35] <_joe_> fsero: akosiaris MEETING
[16:31:00] * akosiaris joining
[17:20:36] serviceops, Analytics, Operations, Research, and 4 others: Transferring data from Hadoop to production MySQL database - https://phabricator.wikimedia.org/T213566 (Nuria) >Should we btw stall this on T213976? yes, we need to resolve first where/how are binarie/data files s going to be moved to the...
[17:51:04] serviceops, Operations, Thumbor, Patch-For-Review, User-jijiki: Thumbor upgrade to stretch plan - https://phabricator.wikimedia.org/T214597 (ops-monitoring-bot) Script wmf-auto-reimage was launched by jiji on cumin1001.eqiad.wmnet for hosts: ` ['thumbor1004.eqiad.wmnet'] ` The log can be foun...
[18:30:04] jijiki: fsero: 2019/freenode.#wikimedia-serviceops.0205.log:2019-02-05 17:47:04 bd808 Is there any interest in having stashbot in this channel? The main use here would probably be Phabricator mentions, but I could also setup some !log routing too if that would be useful
[18:30:08] :)
[18:31:38] bd808: folks do read the -operations channel to take a peak what has been done
[18:31:53] splitting that to two channels I feel it will be a bit confusing
[18:32:13] although I have by mistake ran !log here :p
[18:32:19] yeah. opinions on this vary by team :)
[18:41:12] and what you said, it is something that should be discussed at the sre meeting
[18:41:39] :(
[18:42:28] * bd808 awards volans geek points for the SLATFATF quit message
[18:42:57] lol
[18:44:08] bd808: btw spicerack has been deployed, so next runs should be logsmbot-compatible ;) thanks for the fix
[18:44:33] funny story, it was not noticed up until now, and I think we did already 2 datacenter switchover with that format :-P
[18:44:41] heh
[18:44:54] Most of us write to SAL more than we read it
[18:51:05] bd808: I had a look at the code btw
[18:51:30] I was thinking if it makes sense to add some colour on the word "logged"
[18:52:02] in a similar manner as wikibugs
[18:52:51] could do. I have my irc client setup to strip out a lot of color and formatting so I don't often think about such things
[19:01:25] LOL
[20:04:22] serviceops, Operations, Security: User ziraksima@gmail is receiving too many emails - https://phabricator.wikimedia.org/T216445 (jijiki)
[20:14:46] I think I prefer !log ging to be in the 'main' channel only
[20:15:40] wikibugs covers our phab mention needs, I feel
[20:15:50] thanks though bd808
[20:19:01] I wonder if there are any other bots that might be useful though
[20:38:13] this is your own discussion, I only want to offer a perspective if that helps- we started reporting #dba on -datbases because most people don't tak them with #operations
[20:39:00] so they are for the most part non-duplicate, and the rate was very low for us
[20:39:17] s/tak/tag/
[21:07:22] we have the phab tickets that are tagged with serviceops or whatever it is, reported in here by wikibugs; is this the sort of thing you mean jynus or something additional?
[21:07:38] and your perspective is always welcome
[21:24:56] serviceops, Operations, Thumbor, Patch-For-Review, User-jijiki: Thumbor upgrade to stretch plan - https://phabricator.wikimedia.org/T214597 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['thumbor1004.eqiad.wmnet'] ` Of which those **FAILED**: ` ['thumbor1004.eqiad.wmnet'] `