[00:16:13] !log tools.admin Running k8s-2020-migrate.sh with prefix=k [00:16:24] !log tools.kasper-data-translator Migrated to 2020 Kubernetes cluster [00:16:47] !log tools.kian Migrated to 2020 Kubernetes cluster [00:16:58] !log tools.knowledgegrapher Migrated to 2020 Kubernetes cluster [00:17:09] !log tools.kokolores Migrated to 2020 Kubernetes cluster [00:17:20] !log tools.krdbot Migrated to 2020 Kubernetes cluster [00:17:30] !log tools.krinkle-redirect Migrated to 2020 Kubernetes cluster [00:17:41] !log tools.ksamsok-rest Migrated to 2020 Kubernetes cluster [00:21:25] !log tools.admin Running k8s-2020-migrate.sh with prefix=l [00:21:39] !log tools.langviews Migrated to 2020 Kubernetes cluster [00:21:53] !log tools.langviews-test Migrated to 2020 Kubernetes cluster [00:22:03] !log tools.lestaty Migrated to 2020 Kubernetes cluster [00:22:12] !log tools.lexeme-forms Migrated to 2020 Kubernetes cluster [00:22:22] !log tools.lexeme-senses Migrated to 2020 Kubernetes cluster [00:22:36] !log tools.lingua-libre Migrated to 2020 Kubernetes cluster [00:22:46] !log tools.linkscount Migrated to 2020 Kubernetes cluster [00:23:08] !log tools.linksearch Migrated to 2020 Kubernetes cluster [00:23:17] !log tools.linkstranslator Migrated to 2020 Kubernetes cluster [00:23:30] !log tools.list Migrated to 2020 Kubernetes cluster [00:23:40] !log tools.lists Migrated to 2020 Kubernetes cluster [00:23:49] !log tools.locator Migrated to 2020 Kubernetes cluster [00:24:08] !log tools.locator-tool Migrated to 2020 Kubernetes cluster [00:24:16] !log tools.lolrrit-wm Migrated to 2020 Kubernetes cluster [00:24:37] !log tools.lp-tools Migrated to 2020 Kubernetes cluster [00:24:46] !log tools.lziad Migrated to 2020 Kubernetes cluster [00:26:27] !log tools.admin Running k8s-2020-migrate.sh with prefix=m [00:26:50] !log tools.machtsinn-dev Migrated to 2020 Kubernetes cluster [00:26:50] the "m" batch will be my stopping point for a while [00:27:12] !log tools.machtsinn Migrated to 2020 Kubernetes cluster [00:27:23] !log tools.maintgraph Migrated to 2020 Kubernetes cluster [00:27:32] !log tools.mapillary-commons Migrated to 2020 Kubernetes cluster [00:27:43] !log tools.maplayers-demo Migrated to 2020 Kubernetes cluster [00:27:54] !log tools.map-of-monuments Migrated to 2020 Kubernetes cluster [00:28:02] !log tools.massmailer Migrated to 2020 Kubernetes cluster [00:28:10] !log tools.massviews Migrated to 2020 Kubernetes cluster [00:28:19] !log tools.massviews-test Migrated to 2020 Kubernetes cluster [00:28:28] !log tools.mathbot Migrated to 2020 Kubernetes cluster [00:28:38] !log tools.mathqa Migrated to 2020 Kubernetes cluster [00:28:57] !log tools.matsubot Migrated to 2020 Kubernetes cluster [00:29:06] !log tools.matthewrbowker-dev Migrated to 2020 Kubernetes cluster [00:29:19] !log tools.matthewrbowker Migrated to 2020 Kubernetes cluster [00:29:41] !log tools.matthobot Migrated to 2020 Kubernetes cluster [00:29:58] !log tools.matvaretabellen Migrated to 2020 Kubernetes cluster [00:30:04] !log tools.mdann52bot Migrated to 2020 Kubernetes cluster [00:30:12] !log tools.media-reports Migrated to 2020 Kubernetes cluster [00:30:22] !log tools.mediawiki-mirror Migrated to 2020 Kubernetes cluster [00:30:33] !log tools.merge2pdf Migrated to 2020 Kubernetes cluster [00:30:45] !log tools.meta Migrated to 2020 Kubernetes cluster [00:31:05] !log tools.metricslibrary Migrated to 2020 Kubernetes cluster [00:31:16] !log tools.mirador Migrated to 2020 Kubernetes cluster [00:31:41] !log tools.mitmachen Migrated to 2020 Kubernetes cluster 
[00:31:50] !log tools.mmt Migrated to 2020 Kubernetes cluster [00:32:12] !log tools.montage Migrated to 2020 Kubernetes cluster [00:32:34] !log tools.monumental-glam Migrated to 2020 Kubernetes cluster [00:32:56] !log tools.monumental Migrated to 2020 Kubernetes cluster [00:33:05] !log tools.mormegil Migrated to 2020 Kubernetes cluster [00:33:27] !log tools.mortar Migrated to 2020 Kubernetes cluster [00:33:40] !log tools.most-readable-pages Migrated to 2020 Kubernetes cluster [00:34:02] !log tools.multicompare Migrated to 2020 Kubernetes cluster [00:34:11] !log tools.mu Migrated to 2020 Kubernetes cluster [00:34:29] !log tools.mw2sparql Migrated to 2020 Kubernetes cluster [00:34:39] !log tools.mwstew Migrated to 2020 Kubernetes cluster [00:34:49] !log tools.mwversion Migrated to 2020 Kubernetes cluster [00:35:02] !log tools.my-first-django-oauth-app Migrated to 2020 Kubernetes cluster [00:35:12] !log tools.my-threads Migrated to 2020 Kubernetes cluster [00:35:22] !log tools.mzmcbride Migrated to 2020 Kubernetes cluster [00:41:07] !log tools.addshore Migrated to 2020 Kubernetes cluster [00:44:10] !log tools.article-ideas-generator Migrated to 2020 Kubernetes cluster [00:45:58] !log tools.citer Migrated to 2020 Kubernetes cluster [00:47:03] !log tools.cite-web-helper Migrated to 2020 Kubernetes cluster [00:51:55] !log tools.cvrminer Migrated to 2020 Kubernetes cluster [00:53:03] !log tools.enet Migrated to 2020 Kubernetes cluster [00:54:31] !log tools.gerrit-newcomer-bot Migrated to 2020 Kubernetes cluster [00:55:40] !log tools.holidays-viewer Migrated to 2020 Kubernetes cluster [00:56:34] !log tools.indic-techcom Migrated to 2020 Kubernetes cluster [00:57:50] !log tools.inkpen Migrated to 2020 Kubernetes cluster [01:06:48] stashbot: can you talk now? [01:06:49] See https://wikitech.wikimedia.org/wiki/Tool:Stashbot for help. [02:28:12] did something break on the bot? 
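The batches above come from k8s-2020-migrate.sh being run once per leading letter of the tool name. The script itself never appears in this log, so the loop below is only a sketch of that per-prefix batching under common Toolforge assumptions (tool homes under /data/project); migrate_one_tool is a hypothetical stand-in for whatever the real script does to move a single webservice.

#!/bin/bash
# Sketch of a per-prefix migration batch; NOT the real k8s-2020-migrate.sh.
prefix="$1"

migrate_one_tool() {
    # Hypothetical stand-in for the real per-tool step (roughly: stop the
    # webservice on the legacy cluster, start it on the 2020 cluster).
    echo "would migrate: $1"
}

echo "Running migration batch with prefix=${prefix}"
for home in /data/project/"${prefix}"*; do
    [ -d "$home" ] || continue
    tool="$(basename "$home")"
    if migrate_one_tool "$tool"; then
        echo "!log tools.${tool} Migrated to 2020 Kubernetes cluster"
    else
        echo "skipped or failed: ${tool}" >&2
    fi
done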
[02:41:58] probably just being too noisy [14:19:51] !log wikispeech Deploy latest from Git master: 13787aa (T192683), fc44dd4, 15214e8 (T192683) [14:20:00] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Wikispeech/SAL [14:20:01] T192683: add user setting for disabling Wikispeech - https://phabricator.wikimedia.org/T192683 [14:26:33] !log wikispeech Deploy latest from Git master: 20d9123, d8447da (T206485), f1f1143, 07e603c, cbfb487 (T179229), 0a7a488, bcd8507, 3225fe2, ebbad49 (T234597), b831a16, 879bc31, 52931bc, 24aa5aa, 93c9f0b, 90df124 (T243384), 1dbe811, 8a3cc99 (T244345), 624274d (T167300, T243376) [14:26:52] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Wikispeech/SAL [14:26:52] T243376: Determine if an upgrade to manifest version 2 in extension.json is useful - https://phabricator.wikimedia.org/T243376 [14:26:53] T206485: Set root: true in eslintrc.json - https://phabricator.wikimedia.org/T206485 [14:26:54] T244345: Remove the Util class for PHP tests - https://phabricator.wikimedia.org/T244345 [14:26:54] T243384: Remove `require` from PHP tests - https://phabricator.wikimedia.org/T243384 [14:26:55] T167300: Sensible default values in extension.json (WikispeechServerUrl) - https://phabricator.wikimedia.org/T167300 [14:26:56] T179229: Decide whether we want the package-lock.json to commit or ignore - https://phabricator.wikimedia.org/T179229 [14:26:56] T234597: Phase out PHPUnit expected exception annotations from tests - https://phabricator.wikimedia.org/T234597 [15:44:31] * bd808 quiets stashbot before starting more 2020 Kubernetes migrations [15:46:08] !log tools.admin Running k8s-2020-migrate.sh with prefix=n [15:46:10] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.admin/SAL [15:46:18] !log tools.neechal Migrated to 2020 Kubernetes cluster [15:46:20] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.neechal/SAL [15:46:41] !log tools.niosh Migrated to 2020 Kubernetes cluster [15:46:51] !log tools.nli-wiki Migrated to 2020 Kubernetes cluster [15:47:04] !log tools.noclaims Migrated to 2020 Kubernetes cluster [15:47:11] note to self: needs both -v and +q to quiet the bot [15:47:20] !log tools.nominatim Migrated to 2020 Kubernetes cluster [15:47:42] !log tools.nordic-museum-depicts Migrated to 2020 Kubernetes cluster [15:47:58] !log tools.nppdash Migrated to 2020 Kubernetes cluster [15:51:40] !log tools.admin Running k8s-2020-migrate.sh with prefix=o [15:51:58] !log tools.oabot Migrated to 2020 Kubernetes cluster [15:52:08] !log tools.oabot-wd-game Migrated to 2020 Kubernetes cluster [15:52:18] !log tools.oauth-hello-world Migrated to 2020 Kubernetes cluster [15:52:40] !log tools.oauthtest Migrated to 2020 Kubernetes cluster [15:52:51] !log tools.ocrtoy Migrated to 2020 Kubernetes cluster [15:53:01] !log tools.octodata Migrated to 2020 Kubernetes cluster [15:53:09] !log tools.olympics Migrated to 2020 Kubernetes cluster [15:53:19] !log tools.onetools Migrated to 2020 Kubernetes cluster [15:53:28] !log tools.oojs-ui Migrated to 2020 Kubernetes cluster [15:53:50] !log tools.opendatasets Migrated to 2020 Kubernetes cluster [15:54:04] !log tools.openrefine-wikidata Migrated to 2020 Kubernetes cluster [15:54:16] !log tools.order-user-by-reg Migrated to 2020 Kubernetes cluster [15:54:26] !log tools.ores Migrated to 2020 Kubernetes cluster [15:54:35] !log tools.ores-support-checklist Migrated to 2020 Kubernetes cluster [15:54:45] !log tools.orphantalk Migrated to 2020 Kubernetes 
cluster [15:54:55] !log tools.osm-add-tags Migrated to 2020 Kubernetes cluster [15:55:58] !log tools.osm Migrated to 2020 Kubernetes cluster [15:56:08] !log tools.otrsreports Migrated to 2020 Kubernetes cluster [15:56:30] !log tools.outreachy-user-contribution-tool Migrated to 2020 Kubernetes cluster [15:56:44] !log tools.outreachy-user-ranking-tool Migrated to 2020 Kubernetes cluster [15:56:54] !log tools.outreachy-wikicv Migrated to 2020 Kubernetes cluster [16:05:10] !log tools.admin Running k8s-2020-migrate.sh with prefix=p [16:05:19] !log tools.pagecount Migrated to 2020 Kubernetes cluster [16:05:31] !log tools.pagecounts Migrated to 2020 Kubernetes cluster [16:05:44] !log tools.pagepile-visual-filter Migrated to 2020 Kubernetes cluster [16:06:01] * Lucas_WMDE wags finger at admin tool for breaking alphabetical order ;) [16:06:06] !log tools.pageviews Migrated to 2020 Kubernetes cluster [16:06:17] !log tools.para Migrated to 2020 Kubernetes cluster [16:06:30] !log tools.paste Migrated to 2020 Kubernetes cluster [16:06:30] * Lucas_WMDE is an idiot and did not notice the different message [16:06:39] !log tools.pathway-viewer Migrated to 2020 Kubernetes cluster [16:07:01] !log tools.paws Migrated to 2020 Kubernetes cluster [16:07:11] !log tools.paws-support Migrated to 2020 Kubernetes cluster [16:07:21] !log tools.peachy-docs Migrated to 2020 Kubernetes cluster [16:07:30] !log tools.periodibot Migrated to 2020 Kubernetes cluster [16:07:43] Lucas_WMDE: it all depends on the collator one uses :) In this case it is potentially filesystem inode order [16:07:52] !log tools.permission-denied-test Migrated to 2020 Kubernetes cluster [16:08:05] !log tools.persondata Migrated to 2020 Kubernetes cluster [16:08:15] !log tools.phabricator-bug-status Migrated to 2020 Kubernetes cluster [16:08:27] !log tools.piagetbot Migrated to 2020 Kubernetes cluster [16:08:49] !log tools.pirsquared Migrated to 2020 Kubernetes cluster [16:09:03] !log tools.plagiabot Migrated to 2020 Kubernetes cluster [16:09:25] !log tools.plantel2wiki Migrated to 2020 Kubernetes cluster [16:09:46] !log tools.portal-stats Migrated to 2020 Kubernetes cluster [16:10:08] !log tools.position-holder-history Migrated to 2020 Kubernetes cluster [16:10:35] !log tools.project-fa Migrated to 2020 Kubernetes cluster [16:10:57] !log tools.prompter Migrated to 2020 Kubernetes cluster [16:11:19] !log tools.proneval-gsoc17 Migrated to 2020 Kubernetes cluster [16:11:41] !log tools.proxies Migrated to 2020 Kubernetes cluster [16:12:03] !log tools.ptable Migrated to 2020 Kubernetes cluster [16:12:25] !log tools.ptools Migrated to 2020 Kubernetes cluster [16:12:47] !log tools.pub Migrated to 2020 Kubernetes cluster [16:13:09] !log tools.pyshexy Migrated to 2020 Kubernetes cluster [16:13:31] !log tools.pywikibot-testwiki Migrated to 2020 Kubernetes cluster [16:13:53] !log tools.pywikipedia Migrated to 2020 Kubernetes cluster [16:16:21] !log tools.admin Running k8s-2020-migrate.sh with prefix=q [16:16:42] !log tools.qrcode-generator Migrated to 2020 Kubernetes cluster [16:17:04] !log tools.quarrybot-enwiki Migrated to 2020 Kubernetes cluster [16:17:26] !log tools.query Migrated to 2020 Kubernetes cluster [16:17:49] !log tools.quickpreset-migrate Migrated to 2020 Kubernetes cluster [16:20:08] !log tools.admin Running k8s-2020-migrate.sh with prefix=r [16:20:31] !log tools.railways Migrated to 2020 Kubernetes cluster [16:20:55] !log tools.random-featured Migrated to 2020 Kubernetes cluster [16:21:18] !log tools.rangeblockfinder Migrated to 2020 Kubernetes 
cluster [16:21:45] !log tools.rang Migrated to 2020 Kubernetes cluster [16:22:09] !log tools.rank Migrated to 2020 Kubernetes cluster [16:22:32] !log tools.raun Migrated to 2020 Kubernetes cluster [16:22:55] !log tools.readmore Migrated to 2020 Kubernetes cluster [16:23:18] !log tools.recitation-bot Migrated to 2020 Kubernetes cluster [16:23:41] !log tools.redirectviews Migrated to 2020 Kubernetes cluster [16:24:04] !log tools.redirtest Migrated to 2020 Kubernetes cluster [16:24:29] !log tools.refill-api Migrated to 2020 Kubernetes cluster [16:24:51] !log tools.refill Migrated to 2020 Kubernetes cluster [16:25:15] !log tools.refswikipedia Migrated to 2020 Kubernetes cluster [16:25:38] !log tools.remarkup2wikitext Migrated to 2020 Kubernetes cluster [16:26:01] !log tools.replacer Migrated to 2020 Kubernetes cluster [16:26:28] !log tools.reviewtools Migrated to 2020 Kubernetes cluster [16:26:52] !log tools.rfastats Migrated to 2020 Kubernetes cluster [16:27:15] !log tools.ricordisamoa Migrated to 2020 Kubernetes cluster [16:27:38] !log tools.ri-diff-fixture-updater Migrated to 2020 Kubernetes cluster [16:28:00] !log tools.rightstool Migrated to 2020 Kubernetes cluster [16:28:23] !log tools.rm-stats Migrated to 2020 Kubernetes cluster [16:28:47] !log tools.rmstats Migrated to 2020 Kubernetes cluster [16:29:10] !log tools.robin Migrated to 2020 Kubernetes cluster [16:29:33] !log tools.roundtripping Migrated to 2020 Kubernetes cluster [16:29:56] !log tools.ruarbcom-js Migrated to 2020 Kubernetes cluster [16:30:19] !log tools.ruarbcom Migrated to 2020 Kubernetes cluster [16:30:42] !log tools.rxy Migrated to 2020 Kubernetes cluster [16:53:31] !log tools.admin Running k8s-2020-migrate.sh with prefix=s [16:53:55] !log tools.sammour Migrated to 2020 Kubernetes cluster [16:54:17] !log tools.sau226test Migrated to 2020 Kubernetes cluster [16:54:39] !log tools.scholia Migrated to 2020 Kubernetes cluster [16:55:00] !log tools.scribe Migrated to 2020 Kubernetes cluster [16:55:23] !log tools.sdbot Migrated to 2020 Kubernetes cluster [16:55:45] !log tools.section-links Migrated to 2020 Kubernetes cluster [16:56:06] !log tools.secwatch Migrated to 2020 Kubernetes cluster [16:56:30] !log tools.serviceawards Migrated to 2020 Kubernetes cluster [16:56:52] !log tools.sge-jobs Migrated to 2020 Kubernetes cluster [16:56:54] on the s's atm, guess I'll wait a bit to log on so it doesn't mess up something [16:57:14] !log tools.shexia Migrated to 2020 Kubernetes cluster [16:57:36] !log tools.shex-simple Migrated to 2020 Kubernetes cluster [16:57:58] !log tools.shextranslator Migrated to 2020 Kubernetes cluster [16:58:21] !log tools.shields Migrated to 2020 Kubernetes cluster [16:58:38] DSquirrelGM: Is your tool still using the legacy Kubernetes cluster? If not there should not be any kind of conflict [16:58:46] !log tools.shortnames Migrated to 2020 Kubernetes cluster [16:59:08] !log tools.shorturls Migrated to 2020 Kubernetes cluster [16:59:30] !log tools.sibu Migrated to 2020 Kubernetes cluster [16:59:31] how would I check? 
I don't have the web service running atm [16:59:52] !log tools.sibutest Migrated to 2020 Kubernetes cluster [17:00:14] !log tools.similarity Migrated to 2020 Kubernetes cluster [17:00:36] !log tools.simplewd Migrated to 2020 Kubernetes cluster [17:00:59] !log tools.sistercities Migrated to 2020 Kubernetes cluster [17:01:21] !log tools.siteviews Migrated to 2020 Kubernetes cluster [17:01:22] DSquirrelGM: if you don't have the webservice running then it is on no cluster at all :) [17:01:34] The migrations I am doing now are only for running webservices [17:01:44] !log tools.slow-parse Migrated to 2020 Kubernetes cluster [17:02:06] !log tools.smv-description-translations Migrated to 2020 Kubernetes cluster [17:02:28] !log tools.snapshots Migrated to 2020 Kubernetes cluster [17:02:50] !log tools.sonarqubebot Migrated to 2020 Kubernetes cluster [17:03:13] !log tools.sowhy Migrated to 2020 Kubernetes cluster [17:03:24] didn't know whether it involved the ssh console on the login host [17:03:35] !log tools.spacemedia Migrated to 2020 Kubernetes cluster [17:04:01] !log tools.spdx Migrated to 2020 Kubernetes cluster [17:04:22] DSquirrelGM: it does not have anything to do with ssh access. Only the `webservice` command or directly use of kubectl [17:04:23] !log tools.speed-patrolling Migrated to 2020 Kubernetes cluster [17:04:46] !log tools.speedpatrolling Migrated to 2020 Kubernetes cluster [17:05:08] !log tools.sphinxcapt-leaderboard Migrated to 2020 Kubernetes cluster [17:05:30] !log tools.spiarticleanalyzer Migrated to 2020 Kubernetes cluster [17:05:53] !log tools.sqid Migrated to 2020 Kubernetes cluster [17:06:15] !log tools.sql-optimizer Migrated to 2020 Kubernetes cluster [17:06:37] !log tools.srwiki Migrated to 2020 Kubernetes cluster [17:06:37] so this migration process appends the kubectl alias to every tool’s ~/.profile? :/ [17:06:49] I was hoping /usr/local/bin/kubectl would just be removed when the migration was done [17:07:00] !log tools.statistics-api Migrated to 2020 Kubernetes cluster [17:07:27] !log tools.status Migrated to 2020 Kubernetes cluster [17:07:49] !log tools.stemmeberettigelse Migrated to 2020 Kubernetes cluster [17:08:12] !log tools.stewardbots Migrated to 2020 Kubernetes cluster [17:08:34] !log tools.strephit Migrated to 2020 Kubernetes cluster [17:08:46] Lucas_WMDE: that will happen, but we can't do it until the legacy cluster is shutdown [17:08:56] !log tools.stylize Migrated to 2020 Kubernetes cluster [17:09:23] !log tools.supercount Migrated to 2020 Kubernetes cluster [17:09:42] the problem is that /usr/bin/kubectl is too new to work with the API of the legacy cluster and /usr/local/bin/kubectl is too old to work with the API of the 2020 cluster [17:09:45] !log tools.svgcheck Migrated to 2020 Kubernetes cluster [17:09:50] so we need both as long as we have both [17:10:07] !log tools.svgedit Migrated to 2020 Kubernetes cluster [17:10:18] I am hoping that next week we will be able to do the steps to remove the legacy cluster entirely [17:10:29] !log tools.swviewer Migrated to 2020 Kubernetes cluster [17:11:24] !log tools.admin Running k8s-2020-migrate.sh with prefix=t [17:11:35] bd808: oh, I didn’t know that part [17:11:42] that the newer kubectl doesn’t support the older version [17:11:47] !log tools.tabulist Migrated to 2020 Kubernetes cluster [17:11:48] that sounds… annoying ^^ [17:12:09] !log tools.teg Migrated to 2020 Kubernetes cluster [17:12:13] the legacy version is too old. 
They usually do +/- 2 version compatability [17:12:31] !log tools.templatecheck Migrated to 2020 Kubernetes cluster [17:12:53] !log tools.templatetiger Migrated to 2020 Kubernetes cluster [17:13:15] !log tools.templatetransclusioncheck Migrated to 2020 Kubernetes cluster [17:13:37] !log tools.tesseract-ocr-service Migrated to 2020 Kubernetes cluster [17:13:59] !log tools.test-webservice-generic Migrated to 2020 Kubernetes cluster [17:14:25] !log tools.textcatdemo Migrated to 2020 Kubernetes cluster [17:14:48] !log tools.tfaprotbot Migrated to 2020 Kubernetes cluster [17:15:10] !log tools.thankyou Migrated to 2020 Kubernetes cluster [17:15:32] !log tools.thibaut120094 Migrated to 2020 Kubernetes cluster [17:15:54] !log tools.thibtools Migrated to 2020 Kubernetes cluster [17:16:16] !log tools.threed2commons Migrated to 2020 Kubernetes cluster [17:16:38] !log tools.tilde Migrated to 2020 Kubernetes cluster [17:17:01] !log tools.timescale Migrated to 2020 Kubernetes cluster [17:17:24] !log tools.toolforge-gallery Migrated to 2020 Kubernetes cluster [17:17:46] !log tools.toolschecker-k8s-ws Migrated to 2020 Kubernetes cluster [17:18:08] !log tools.toolserver-home-archive Migrated to 2020 Kubernetes cluster [17:18:30] !log tools.toolserver Migrated to 2020 Kubernetes cluster [17:18:52] !log tools.tools-info Migrated to 2020 Kubernetes cluster [17:19:20] !log tools.topviews-test Migrated to 2020 Kubernetes cluster [17:19:42] !log tools.totoazero Migrated to 2020 Kubernetes cluster [17:20:04] !log tools.tour Migrated to 2020 Kubernetes cluster [17:20:26] !log tools.tptools Migrated to 2020 Kubernetes cluster [17:20:48] !log tools.traffic-grapher Migrated to 2020 Kubernetes cluster [17:21:21] !log tools.translatemplate Migrated to 2020 Kubernetes cluster [17:21:43] !log tools.translate Migrated to 2020 Kubernetes cluster [17:22:05] !log tools.translation-server Migrated to 2020 Kubernetes cluster [17:22:27] !log tools.trusty-deprecation Migrated to 2020 Kubernetes cluster [17:22:49] !log tools.tsbot Migrated to 2020 Kubernetes cluster [17:23:11] !log tools.tts-comparison Migrated to 2020 Kubernetes cluster [17:23:33] !log tools.twl17 Migrated to 2020 Kubernetes cluster [17:23:55] !log tools.twltools Migrated to 2020 Kubernetes cluster [17:24:17] !log tools.typoscan Migrated to 2020 Kubernetes cluster [17:25:43] !log tools.admin Running k8s-2020-migrate.sh with prefix=u [17:26:06] !log tools.universalviewer Migrated to 2020 Kubernetes cluster [17:26:28] !log tools.unpkg Migrated to 2020 Kubernetes cluster [17:26:49] !log tools.uploadhelper-ir Migrated to 2020 Kubernetes cluster [17:27:11] !log tools.upload-stats-bot Migrated to 2020 Kubernetes cluster [17:27:34] !log tools.usage Migrated to 2020 Kubernetes cluster [17:27:56] !log tools.user-contributions-feed Migrated to 2020 Kubernetes cluster [17:28:18] !log tools.usernamesearch Migrated to 2020 Kubernetes cluster [17:28:40] !log tools.userrank Migrated to 2020 Kubernetes cluster [17:29:02] !log tools.usrd-tools Migrated to 2020 Kubernetes cluster [17:30:55] !log tools.admin Running k8s-2020-migrate.sh with prefix=v [17:31:18] !log tools.vendor Migrated to 2020 Kubernetes cluster [17:31:41] !log tools.video2commons-test Migrated to 2020 Kubernetes cluster [17:32:04] !log tools.video-cat-bot Migrated to 2020 Kubernetes cluster [17:32:26] !log tools.visualcategories Migrated to 2020 Kubernetes cluster [17:32:39] phuzion: the “meta” tool was migrated hours ago according to the log https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.meta/SAL 
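On the "how would I check?" exchange a little further up: kubectl only talks reliably to API servers within a narrow version skew, which is why the bastions carry both an old /usr/local/bin/kubectl (for the legacy cluster) and a newer /usr/bin/kubectl (for the 2020 cluster) until the legacy cluster is retired. A quick way for a maintainer to see where, if anywhere, their webservice is running; a sketch, run from the tool account on a bastion:

# Which cluster does the tool's kubeconfig point at?
# "default" means the legacy cluster, "toolforge" the 2020 cluster.
kubectl config current-context

# Compare client and server versions; a skew of more than a minor version
# or two means you are holding the wrong binary for that cluster.
/usr/local/bin/kubectl version --short
/usr/bin/kubectl version --short

# If nothing shows up here, the webservice is not running on Kubernetes at all.
kubectl get pods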
[17:32:58] I would assume that should no longer be affected by transient issues, might be a persistent error [17:33:41] yeah, seems to be an intermittent issue. I've gotten it to load a handful of times. [17:34:27] Problems with stalktoy or another thing in that grab bag tool? [17:34:41] eligibility, apparently (it was mentioned in -operations) [17:34:41] * bd808 hates tools that do N>1 things [17:34:46] * Lucas_WMDE sighs [17:35:00] bd808: accounteligibility, the steward election eligibility checker tool [17:35:05] https://tools.wmflabs.org/meta/accounteligibility/52 [17:35:28] but yeah, stalktoy is also showing 502s for me [17:36:14] I loaded the UI for stalktoy but did not try to search for anything. accounteligibility is spinning now which is not a good sign [17:37:38] I figured I'd let y'all know about it since, you know, steward elections are happening now. :) [17:39:10] phuzion: maintainers are listed here -- https://tools.wmflabs.org/admin/tool/meta [17:40:04] bd808: is it likely that there's a config that needs to be fixed on their part, or is it possible it's on the infra somehow? [17:40:59] phuzion: Until proven otherwise I generally blame the software [17:41:09] the tool's software [17:41:21] bd808: extraordinarily fair [17:41:49] Is the Planned NFS maintenance https://lists.wikimedia.org/pipermail/cloud-announce/2020-February/000260.html going on now? [17:42:11] fnielsen: I don't think bstorm_ has started yet [17:42:33] Not yet. In about 15 min [17:42:46] Ok. I am getting "SIGINT/SIGQUIT received...killing workers..." "Fatal Python error: PyInterpreterState_Delete: remaining subinterpreters" and "worker 1 buried after 1 seconds" on one of my tools [17:43:52] I suppose that could be my setup somehow [17:44:35] It seems to have started at around 16:55 today [17:44:46] fnielsen: that could be the tool hitting the memory limits for its container. [17:45:18] fnielsen: some info at https://wikitech.wikimedia.org/wiki/News/2020_Kubernetes_cluster_migration#Lower_default_resource_limits_for_webservice [17:47:57] $ kubectl describe pod `kubectl get pods | tail -n 1 | awk '{print $1}'` | tail -n 1 [17:47:57] Warning FailedScheduling 119s (x103 over 53m) default-scheduler 0/38 nodes are available: 3 node(s) had taints that the pod didn't tolerate, 35 Insufficient cpu. [17:48:55] The 3 nodes with taints are the control plane [17:49:08] What did you set the limits to? [17:49:12] Defaults [17:49:21] Oh! That's interesting [17:49:22] (I think) [17:49:50] https://tools.wmflabs.org/k8s-status/ thinks there is lots of space still, but ? [17:50:02] fnielsen: what's the tool name? [17:50:08] Scholia [17:51:01] I changed it last week to Python3.7. Maybe I did not set up the virtualenv correctly. [17:51:31] The scheduler is all about the "requests" line. The request for most tools defaults to really low, so insufficient CPU would be very surprising to me [17:51:58] requests is only the default 250m/256Mi [17:52:40] So then the scheduler really thinks we need more...or possibly the remaining nodes have too many tools and possibly also the ingress controllers on them [17:52:44] They reserve a lot of CPU [17:52:44] Now the k8s webservice gives me an error https://tools.wmflabs.org/k8s-status/namespaces/tool-scholia/ [17:52:47] the scheduler failure is timestamped 2020-02-26T16:54:23Z [17:53:34] fnielsen: yeah. the pod is stuck in scheduling. I'm going to try a soft restart and hope this was some temporary issue. [17:53:37] It is only for that tool.
My other tool shows ok: https://tools.wmflabs.org/k8s-status/namespaces/tool-ordia/ [17:53:44] It recovered [17:53:56] bd808: I think the cluster is thrashing a bit [17:54:07] Too many pods for the scheduler. [17:54:12] redirectviews, siteviews and topviews-test (latter isn't important) didn't migrate properly. All appear to be stuck in the "Pending" state. I tried manually restarting, etc. [17:55:11] musikanimal: ack. It seems you are not alone [17:56:06] bd808: we'd need to check stats for requests on nodes and do some math about what the cluster is doing with it...buuuut I need to do the NFS change soon. We should probably move some tools back [17:56:22] Wiki-replicas should be back to normal actually [17:56:44] I see 137 pending pods [17:56:46] Sorry about my problem just got on top of the maintenance. [17:56:56] We only succeeded moving 506 [17:56:58] 605 [17:57:02] Sorry, I reversed it [17:57:02] bstorm_: I'm going to poke around a bit more. The data I'm getting from the API says things are fine, but obviously they are not [17:57:13] https://grafana-labs.wikimedia.org/d/toolforge-kubernetes/toolforge-kubernetes?orgId=1&refresh=1m [17:57:25] Basically, we hit limits [17:57:33] Spinning up more nodes shoudl fix it [17:57:36] *should [17:58:09] In fact, that's the best fix until we dig in deeper. We can reduce the default CPU request in webservice thereafter [17:58:18] Sound good? [17:58:43] bd808: ^^ [17:58:56] I can work on adding nodes, but I'm confused about `kubectl top node` looking fine [17:59:04] packing problem I guess? [17:59:20] I'm not. That makes a lot of sense that it would be fine. This all the scheduler and what it thinks it should do [17:59:44] * bstorm_ doesn't disagree with the scheduler entirely [18:00:01] however, we should probably reduce the default CPU request in webservice [18:00:19] kubectl top node is the actual current usage [18:00:30] This is all reservations and packing [18:00:33] like you said [18:01:16] Our nodes are horribly sized if they stop accepting workloads while showing this much unused capacity, but I'm probably not thinking about burst room correctly when I do the math [18:02:20] most nodes are only showing 33% mem in use [18:02:33] I think it is all cpu [18:02:47] and it's the request being too high [18:02:49] avg cpu load is ~10% [18:03:06] Yeah, scheduler won't care about load until the kubelet taints the node [18:03:32] !log tools downtimed toolschecker for nfs maintenance [18:04:21] bstorm_: should I try building new nodes while you are workign on NFS, or should I just fret about things for a bit? [18:04:45] lol, well...you can honestly build new nodes just fine until puppet starts failing [18:05:03] #wikimedia-cloud Wikimedia Cloud Services (wikitech.wikimedia.org) | Status: Kubernetes issues; NFS maintenance | Ask questions here, but please provide links and context. Use "!help" if nobody responds | More details and channel logs at https://wikitech.wikimedia.org/wiki/Help:IRC | Code of Conduct applies: https://www.mediawiki.org/wiki/CoC [18:05:09] Oops [18:05:27] Works better with the command in place [18:06:24] But if you want to be on the safe side, maybe wait until client side dust settles. Any tool that really needs to be up can move back to the old cluster until this is done? [18:07:24] yeah, I will try to get a report of messed up tools [18:10:09] 135 of them. 
brilliant :/ [18:10:26] https://grafana-labs.wikimedia.org/d/toolforge-kubernetes/toolforge-kubernetes?orgId=1&refresh=1m [18:10:40] I see 137 pods pending...hope that doesn't include some control plane pods [18:11:12] no. they are all in "tool-*" namespaces [18:11:35] names starting q-v [18:11:58] Ah, I see [18:12:01] That makes sense [18:12:28] * bd808 wishes he had coded an "unmigrate" command [18:12:52] I might just hot patch webservice for that actually [18:13:09] That might not be the worst thing. [18:18:36] Everything is likely to start breaking now for at least a bit [18:19:23] `-bash: cd: /home/bd808: Stale file handle` -- confirmed [18:19:34] Getting it back will take a little bit [18:22:50] Running mass puppet runs [18:23:01] Oops, need to run puppet on the maps/scratch cluster [18:27:09] Ugh, looks like I missed something in the scratch mount. Fixing [18:28:25] Maybe not... [18:29:05] Yes I did [18:29:51] When reading the topic, "ls: cannot open directory '.': Stale file handle" on tools-sgebastion-07 seems to be known already? [18:30:08] Yes. NFS maintenance [18:30:43] thx [18:30:55] Hope to have it over soon [18:31:49] cumin is 65% done running puppet across the fleet [18:33:25] 90% [18:35:42] "ImportError: No module named debug " https://tools.wmflabs.org/video2commons/ [18:35:57] * Nemo_bis retries in a few min given what above [18:36:08] Nemo_bis: nfs changes happening will mess with pretty much everything in toolforge [18:36:20] * Nemo_bis nods [18:37:16] Yeah, sorry. 97%... [18:37:24] After this come the reboots where needed [18:38:42] That fixed scratch mounts [18:38:44] good. [18:41:05] Reboot time [18:43:16] !log tools rebooting all kubernetes workers [18:45:06] !log tools rebooting tools-sgebastion-07.tools.eqiad.wmflabs [18:47:32] !log tools rebooting tools-sgegrid-master [18:48:16] !log tools rebooting tools-sgegrid-shadow [18:48:28] LOL, the bot is dead. Oops [18:49:55] Uh oh. [18:50:12] I don't have perms on NFS on tools-sgebastion-07. Why is that.... [18:50:35] Yes. Wondering too [18:50:35] * jeh was just looking at that too `Operation not permitted` [18:52:18] idmapd not running or something. Checking [18:52:23] We had it set not to run previously [18:53:44] Failed to restart nfs-utils.service: Unit rpcbind.socket is masked. [18:53:47] WHY??? [18:53:48] Ok [18:54:05] I'll need to unmask that [18:54:17] That should not be needed, but there's likely some extra config to fix that [18:55:55] bstorm_: rpcbind is disabled in ::profile::wmcs::instance [18:56:04] it has been for a long time [18:56:06] Intentionally [18:56:30] well, since 2020-01-08, but yes [18:56:33] That's not something I'm worried about. I additionally masked the socket because it should be masked. However, going to fix the services and worry about that later [18:57:10] ah, got it [18:59:13] Strange. ls -l works. [18:59:23] So directories can be traversed [18:59:30] but file content not [19:00:14] Yes. It's because it is read only [19:00:31] It isn't recognizing users. This is a problem with nfs-utils on stretch or similar [19:00:49] I might have to rollback soon... [19:03:16] Well. My home directory /home/wurgl has permission 0700 and I can look at it (but not at e.g. /home/zerabat). 
So it recognizes me somehow [19:03:24] heh [19:04:00] Rebooting it to remount everything again [19:04:07] In case it just needed that service reset [19:04:27] !log tools rebooting tools-sgebastion-07 [19:05:44] !log tools rebooting tools-sgebastion-08 [19:06:12] * bd808 will probably never import these lost !log's but it is technically possible [19:06:19] That worked. I have to unmask the socket [19:06:25] Setting puppet to stop breaking this [19:08:10] I need puppet to stop masking it, but I can unmask via cumin [19:08:35] https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/575067 [19:08:48] Stupid incorrect dependency in the OS [19:17:01] Alright, that should start fixing things, but possibly only after reboots [19:17:07] Checking the work [19:17:32] at least one exec node looks totally fine [19:17:59] openstack browser is back up [19:18:49] Checking the paws master [19:19:45] PAWS is fine [19:20:35] The grid is in terrible shape [19:20:47] that's what my inbox is telling me :/ [19:20:54] jeh: in case you were doing that, I'll reboot the grid exec nodes now, ok [19:21:09] Just don't want to both be doing that... [19:21:30] I'm going through hosts with Stale file handles, if you want to get them I'll grab the others [19:21:46] !log tools rebooting tools-sgebastion-08 again [19:22:07] Rebooting tools-k8s-master [19:22:17] jeh: yeah, I can just reboot all the exec nodes with cumin [19:22:37] There's enough reporting shenanigans that I may as well. [19:23:14] tools-sgebastion-08 looks good so far after the second reboot [19:24:52] * bd808 tries to get stashbot up [19:25:09] !log catgraph restart fridolin, fishbone, sylvester recovering from NFS maintenance [19:25:31] !log rebooting every single exec node [19:25:35] :-p [19:26:23] and stashbot can't restart because I filled up the 2020 k8s cluster :/ [19:27:29] Lemme check that cluster [19:28:52] !log maps restarting maps-tiles1 recovering from NFS maintenance [19:28:55] I suspect the old cluster needs some reboots [19:29:45] I'm going to reboot the control plane nodes of the 2020 cluster one at a time [19:29:48] They need it [19:29:49] !log maps restarting maps-warper3 [19:30:33] Wikitech is 502 [19:31:19] Back [19:31:44] tools-k8s-control-1 still isn't up. Reboot is taking a bit [19:32:29] !log video restarting encoding04 [19:32:49] !log tools rebooted tools-package-builder-02 [19:34:10] bstorm_: do you think it is safe for me to try to move some tools back to the legacy k8s cluster? [19:34:16] bd808: if you want to build out nodes, I think that'll go fine now. [19:34:46] Quite the opposite, really. The legacy cluster seems ok except a couple nodes, but the new one is healthier [19:35:31] building new nodes takes quite a while. I was hoping to flip the tools stuck in scheduling back first [19:35:35] Ah ok [19:35:39] !log snuggle restarting snuggle-enwiki-02 [19:35:42] Well...let me check something [19:36:52] Load avg looks ok [19:36:56] Go for it [19:37:45] Cool. I guess I should get stashbot up first to catch the actions [19:38:09] !log video restarting gfg [19:38:29] webgrid nodes are still sucking [19:38:35] Sounds good :) [19:40:27] I'm glad I didn't have to revert...and still have to do all this cleanup [19:43:05] stuck in ContainerCreating for 2m on the legacy grid is not a good sign [19:44:57] There's some workers that need puppet runs and then reboots again [19:45:13] I can puppet run again via cumin [19:45:32] thanks bstorm_ [19:47:03] It was affecting the 2020 cluster as well [19:47:47] After that puppet run, it might recover. 
May need another reboot run [19:48:06] Without the puppet run, it'll miss some changes, though, so best to wait for it [19:49:14] 50%... [19:49:30] Hello stashbot!! [19:49:33] welcome back stashbot [19:49:47] its on the legacy cluster again for now [19:50:01] That'll do until we unscrew the 2020 cluster [19:50:13] yup. starting on that now [19:50:52] bd808: the cpu default in the new cluster is created via podpreset [19:51:00] In maintain-kubeusers [19:51:26] That means, we can fix it in place [19:51:34] sweet [19:51:34] !log tools.qrcode-generator Reverted to legacy Kubernetes cluster [19:51:45] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.qrcode-generator/SAL [19:51:49] Well...not really. I just realized that the preset likely won't retroactively affect pods. :( [19:51:54] !log tools.quarrybot-enwiki Reverted to legacy Kubernetes cluster [19:51:57] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.quarrybot-enwiki/SAL [19:52:06] But we can adjust those. It'll need a script because I think there's one per namespace [19:52:14] !log tools.query Reverted to legacy Kubernetes cluster [19:52:17] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.query/SAL [19:52:23] !log tools.quickpreset-migrate Reverted to legacy Kubernetes cluster [19:52:26] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.quickpreset-migrate/SAL [19:52:29] 70% puppet run finished [19:52:32] !log tools.railways Reverted to legacy Kubernetes cluster [19:52:34] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.railways/SAL [19:52:41] !log tools.random-featured Reverted to legacy Kubernetes cluster [19:52:43] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.random-featured/SAL [19:52:52] !log tools.rang Reverted to legacy Kubernetes cluster [19:52:54] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.rang/SAL [19:52:59] !log tools.rangeblockfinder Reverted to legacy Kubernetes cluster [19:53:01] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.rangeblockfinder/SAL [19:53:09] !log tools.rank Reverted to legacy Kubernetes cluster [19:53:11] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.rank/SAL [19:53:18] !log tools.raun Reverted to legacy Kubernetes cluster [19:53:27] !log tools.readmore Reverted to legacy Kubernetes cluster [19:53:47] !log tools.recitation-bot Reverted to legacy Kubernetes cluster [19:53:51] !log tools.redirectviews Reverted to legacy Kubernetes cluster [19:53:57] !log tools.redirtest Reverted to legacy Kubernetes cluster [19:54:18] !log tools.refill-api Reverted to legacy Kubernetes cluster [19:54:20] snuggle-enwiki-02.snuggle.eqiad.wmflabs is not in good shape, puppet was uninstalled on Jan 8 2020. 
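On the default-request discussion above: the defaults are set once per tool namespace (maintain-kubeusers creates them, and the 23:12:34 entry further down logs the actual replacement of all tool limit-ranges), so lowering the 250m CPU request means rewriting one LimitRange per tool-* namespace, and, as noted, pods that are already running keep the requests they were created with. A sketch of that kind of loop; the LimitRange name and the new values are illustrative assumptions, not the ones actually deployed:

# Sketch only: push a lower default cpu request into every tool namespace.
for ns in $(kubectl get namespaces -o name | grep '^namespace/tool-' | cut -d/ -f2); do
    kubectl -n "$ns" apply -f - <<EOF
apiVersion: v1
kind: LimitRange
metadata:
  name: $ns
spec:
  limits:
  - type: Container
    defaultRequest:
      cpu: 150m        # what the scheduler packs on; was 250m
      memory: 256Mi
    default:
      cpu: 500m        # limit handed to containers that set nothing themselves
      memory: 512Mi
EOF
done
# Only pods created after this pick up the new defaults; running webservices
# see them on their next restart.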
[19:54:27] !log tools.refill Reverted to legacy Kubernetes cluster [19:54:47] !log tools.refswikipedia Reverted to legacy Kubernetes cluster [19:55:01] !log tools.remarkup2wikitext Reverted to legacy Kubernetes cluster [19:55:11] !log tools.reviewtools Reverted to legacy Kubernetes cluster [19:55:31] !log tools.rfastats Reverted to legacy Kubernetes cluster [19:55:49] !log tools.ri-diff-fixture-updater Reverted to legacy Kubernetes cluster [19:56:09] !log tools.ricordisamoa Reverted to legacy Kubernetes cluster [19:56:32] !log tools.rightstool Reverted to legacy Kubernetes cluster [19:56:41] !log tools.rm-stats Reverted to legacy Kubernetes cluster [19:56:52] jeh: "fun" we have had folks do that before. I think we generally tell them that is not allowed and give them a tiny grace period to try and get data off before we nuke the instance [19:57:01] !log tools.rmstats Reverted to legacy Kubernetes cluster [19:57:12] !log tools.robin Reverted to legacy Kubernetes cluster [19:57:32] !log tools.roundtripping Reverted to legacy Kubernetes cluster [19:57:34] Image pulls are going very slow on the legacy cluster, but they seem to work eventually [19:57:45] !log tools.ruarbcom-js Reverted to legacy Kubernetes cluster [19:57:54] the first tool I moved back took ~5m to get to Running state [19:57:56] !log tools.ruarbcom Reverted to legacy Kubernetes cluster [19:58:28] !log tools.sammour Reverted to legacy Kubernetes cluster [19:58:42] !log tools.sau226test Reverted to legacy Kubernetes cluster [19:58:49] !log tools.scholia Reverted to legacy Kubernetes cluster [19:59:03] !log tools.scribe Reverted to legacy Kubernetes cluster [19:59:12] !log tools.sdbot Reverted to legacy Kubernetes cluster [19:59:23] !log tools.section-links Reverted to legacy Kubernetes cluster [19:59:35] !log tools.secwatch Reverted to legacy Kubernetes cluster [19:59:41] !log dumps restart dumps-0 [19:59:47] !log tools.serviceawards Reverted to legacy Kubernetes cluster [20:00:03] !log tools.sge-jobs Reverted to legacy Kubernetes cluster [20:00:09] !log tools.shex-simple Reverted to legacy Kubernetes cluster [20:00:18] !log tools.shexia Reverted to legacy Kubernetes cluster [20:00:25] !log tools.shextranslator Reverted to legacy Kubernetes cluster [20:00:33] !log tools.shields Reverted to legacy Kubernetes cluster [20:00:40] !log tools.shortnames Reverted to legacy Kubernetes cluster [20:00:51] !log tools.shorturls Reverted to legacy Kubernetes cluster [20:01:11] !log tools.sibu Reverted to legacy Kubernetes cluster [20:01:26] !log tools.sibutest Reverted to legacy Kubernetes cluster [20:01:36] !log tools.similarity Reverted to legacy Kubernetes cluster [20:01:57] !log tools.simplewd Reverted to legacy Kubernetes cluster [20:02:05] !log tools.sistercities Reverted to legacy Kubernetes cluster [20:02:09] !log tools.siteviews Reverted to legacy Kubernetes cluster [20:02:29] !log tools.slow-parse Reverted to legacy Kubernetes cluster [20:02:39] !log tools.smv-description-translations Reverted to legacy Kubernetes cluster [20:02:47] !log tools.snapshots Reverted to legacy Kubernetes cluster [20:03:00] !log tools.sonarqubebot Reverted to legacy Kubernetes cluster [20:03:07] !log tools.sowhy Reverted to legacy Kubernetes cluster [20:03:27] !log tools.spacemedia Reverted to legacy Kubernetes cluster [20:03:38] !log tools.spdx Reverted to legacy Kubernetes cluster [20:03:50] !log tools.speed-patrolling Reverted to legacy Kubernetes cluster [20:03:57] !log tools.speedpatrolling Reverted to legacy Kubernetes cluster [20:04:18] !log 
tools.sphinxcapt-leaderboard Reverted to legacy Kubernetes cluster [20:04:23] I see my downtimes are expiring [20:04:38] !log tools.spiarticleanalyzer Reverted to legacy Kubernetes cluster [20:04:48] !log tools.sqid Reverted to legacy Kubernetes cluster [20:05:08] !log tools.sql-optimizer Reverted to legacy Kubernetes cluster [20:05:13] !log fastcci restart fastcci-new-master [20:05:28] !log tools.srwiki Reverted to legacy Kubernetes cluster [20:05:39] !log tools.statistics-api Reverted to legacy Kubernetes cluster [20:05:50] !log tools.status Reverted to legacy Kubernetes cluster [20:05:58] !log tools.stemmeberettigelse Reverted to legacy Kubernetes cluster [20:06:11] !log tools.stewardbots Reverted to legacy Kubernetes cluster [20:06:38] !log huggle restart huggle-wl [20:06:42] !log tools.strephit Reverted to legacy Kubernetes cluster [20:07:02] !log tools.stylize Reverted to legacy Kubernetes cluster [20:07:09] !log tools.supercount Reverted to legacy Kubernetes cluster [20:07:29] !log tools.svgcheck Reverted to legacy Kubernetes cluster [20:07:42] !log tools.svgedit Reverted to legacy Kubernetes cluster [20:07:52] !log tools.swviewer Reverted to legacy Kubernetes cluster [20:08:01] !log tools.tabulist Reverted to legacy Kubernetes cluster [20:08:10] !log tools.teg Reverted to legacy Kubernetes cluster [20:08:19] !log tools.templatecheck Reverted to legacy Kubernetes cluster [20:08:27] !log tools.templatetiger Reverted to legacy Kubernetes cluster [20:08:40] !log tools.templatetransclusioncheck Reverted to legacy Kubernetes cluster [20:08:55] !log tools.tesseract-ocr-service Reverted to legacy Kubernetes cluster [20:09:04] !log tools.test-webservice-generic Reverted to legacy Kubernetes cluster [20:09:08] The cumin puppet runs on k8s is still only at 70% [20:09:11] which sucks [20:09:13] !log tools.textcatdemo Reverted to legacy Kubernetes cluster [20:09:20] It won't give me a report of what's done or not until the end [20:09:22] !log tools.tfaprotbot Reverted to legacy Kubernetes cluster [20:09:42] !log tools.thankyou Reverted to legacy Kubernetes cluster [20:09:50] !log tools.thibaut120094 Reverted to legacy Kubernetes cluster [20:09:59] !log tools.thibtools Reverted to legacy Kubernetes cluster [20:10:19] !log tools.threed2commons Reverted to legacy Kubernetes cluster [20:10:29] !log tools.tilde Reverted to legacy Kubernetes cluster [20:10:37] !log tools.timescale Reverted to legacy Kubernetes cluster [20:10:57] !log tools.toolforge-gallery Reverted to legacy Kubernetes cluster [20:11:08] !log tools.tools-info Reverted to legacy Kubernetes cluster [20:11:20] !log tools.toolschecker-k8s-ws Reverted to legacy Kubernetes cluster [20:11:27] !log tools.toolserver-home-archive Reverted to legacy Kubernetes cluster [20:11:37] !log tools.toolserver Reverted to legacy Kubernetes cluster [20:11:50] !log tools.topviews-test Reverted to legacy Kubernetes cluster [20:11:59] !log tools.totoazero Reverted to legacy Kubernetes cluster [20:12:08] !log tools.tour Reverted to legacy Kubernetes cluster [20:12:21] !log tools.tptools Reverted to legacy Kubernetes cluster [20:12:41] !log tools.traffic-grapher Reverted to legacy Kubernetes cluster [20:12:43] !log quarry restart quarry-web-01 and quarry-worker-01 [20:12:51] !log tools.translate Reverted to legacy Kubernetes cluster [20:13:00] !log tools.translatemplate Reverted to legacy Kubernetes cluster [20:13:20] !log tools.translation-server Reverted to legacy Kubernetes cluster [20:13:41] !log tools.trusty-deprecation Reverted to legacy Kubernetes 
cluster [20:13:47] !log tools.tsbot Reverted to legacy Kubernetes cluster [20:14:03] !log tools.tts-comparison Reverted to legacy Kubernetes cluster [20:14:10] !log tools.twl17 Reverted to legacy Kubernetes cluster [20:14:21] !log tools.twltools Reverted to legacy Kubernetes cluster [20:14:30] !log tools.typoscan Reverted to legacy Kubernetes cluster [20:14:38] !log tools.universalviewer Reverted to legacy Kubernetes cluster [20:14:58] !log tools.unpkg Reverted to legacy Kubernetes cluster [20:15:05] !log tools.upload-stats-bot Reverted to legacy Kubernetes cluster [20:15:15] !log tools.uploadhelper-ir Reverted to legacy Kubernetes cluster [20:15:25] !log tools.usage Reverted to legacy Kubernetes cluster [20:15:38] !log tools.user-contributions-feed Reverted to legacy Kubernetes cluster [20:15:47] !log tools rebooting tools-sgegrid-master because it actually had the permissions thing going on still [20:15:48] !log tools.usernamesearch Reverted to legacy Kubernetes cluster [20:15:56] !log tools.userrank Reverted to legacy Kubernetes cluster [20:16:05] !log tools.usrd-tools Reverted to legacy Kubernetes cluster [20:16:10] Stashbot seems overloaded? [20:16:25] How can you tell what cluster you’re on [20:16:26] !log tools.vendor Reverted to legacy Kubernetes cluster [20:16:46] !log tools.video-cat-bot Reverted to legacy Kubernetes cluster [20:17:09] !log tools.video2commons-test Reverted to legacy Kubernetes cluster [20:17:18] !log tools.visualcategories Reverted to legacy Kubernetes cluster [20:17:24] bstorm_: I muted it's ack of the log [20:17:27] RhinosF1 there's a few ways. `kubectl config current-context` is ideal [20:17:58] If it says "default" old cluster [20:17:59] I’ll look later then. I’m mobile. [20:18:08] "toolforge" = 2020 cluster [20:19:22] Does cumin ever just hang? [20:20:11] I suppose it could, at least for a while. Not sure if there is a max time it waits for a command response [20:20:13] !log toolsbeta restart toolsbeta-sgegrid-shadow [20:20:49] !log tools.portal-stats Reverted to legacy Kubernetes cluster [20:20:51] * bd808 found some more tools stuck in Pending [20:21:03] !log tools.position-holder-history Reverted to legacy Kubernetes cluster [20:21:23] !log tools.prompter Reverted to legacy Kubernetes cluster [20:21:42] !log tools.proneval-gsoc17 Reverted to legacy Kubernetes cluster [20:21:53] !log tools.proxies Reverted to legacy Kubernetes cluster [20:22:16] tools grid is finally recoverd [20:22:18] !log tools.ptable Reverted to legacy Kubernetes cluster [20:22:27] !log tools.pub Reverted to legacy Kubernetes cluster [20:22:39] !log tools.pyshexy Reverted to legacy Kubernetes cluster [20:22:39] I'd love to be able to say the same for sure of kubernetes [20:22:49] !log tools.pywikibot-testwiki Reverted to legacy Kubernetes cluster [20:22:57] the grid being quite so dependent on NFS is easier to tell [20:22:59] !log tools.pywikipedia Reverted to legacy Kubernetes cluster [20:23:36] On the k8s cluster, I'm using a combination of puppet failures and load to tell [20:23:40] Can I blame this on you? [20:23:45] https://www.irccloud.com/pastebin/ffdZeMjW [20:24:05] Yup! I rebooted it. Try again RhinosF1 [20:24:22] It’s cron [20:24:27] Ah ok [20:24:28] It’ll run again soon [20:24:30] Then definitely [20:24:38] The grid should be healthy no [20:24:40] *now [20:26:00] I'm going to try killing my cumin run. 
I just wish I could tell what it is hanging on [20:26:52] these are the last VMs that need to be rebooted: tools-worker-[1008,1015,1021].tools.eqiad.wmflab [20:27:02] they have the operation not permitted error [20:27:38] !log tools rebooting tools-worker-[1008,1015,1021] [20:27:48] RhinosF1: just so you don't feel singled out, there were easily 500 cron job failures during this NFS switch [20:28:16] stashbot: can you talk again? [20:28:17] See https://wikitech.wikimedia.org/wiki/Tool:Stashbot for help. [20:28:20] sweet [20:28:36] Thanks jeh! [20:28:44] I really appreciate the help with the cleanup [20:28:45] 4 of them were on one tool I maintain but np [20:30:31] RIP my inbox [20:31:24] I was really glad that I add a tag to all that tools.admin spam. Made it easy to select and archive the whole lot [20:32:08] Ha [20:32:23] I need to improve my tagging + mail workflow [20:32:59] I have thousands of emails to root...they all skip the inbox [20:34:05] Hmm. I wonder what state the toolschecker host is in [20:34:12] I think I have one auto-skip rule [20:35:37] !log toolsbeta hard rebooting the grid master for toolsbeta [20:35:45] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL [20:37:42] I think that's it... [20:37:49] Things seem ok now [20:38:10] checking some stuff [20:39:27] awesome, great job putting the new config together bstorm_ [20:39:55] Thanks :) [20:40:56] toolschecker is green except the mtime on a cron job [20:57:38] It had a stuck job [20:57:46] deleted it and all is well [20:58:00] This makes me wonder how many more stuck jobs exist in the grid... [20:59:29] tons of them [21:00:01] https://www.irccloud.com/pastebin/RtmrwDco/ [21:06:25] !log tools deleting loads of stuck grid jobs [21:06:36] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [21:08:09] Ok, some are coming back. Some grid node is still having issues [21:14:19] Cleaned up. It was old stuff [22:03:08] snuggle [22:03:49] !log admin powering down cloudvirt1014 for hardware maintenance [22:03:52] * zhuyifei1999_ was reading backlog and tried to ctrl-f for this ^, oops [22:04:01] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL [22:29:48] !log tools deleted pod maintain-kubeusers-6d9c45f4bc-5bqq5 to deploy new image [22:29:57] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [23:12:34] !log tools replacing all tool limit-ranges in the 2020 cluster with a lower cpu request version [23:12:44] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [23:41:28] !log tools Cordoned tools-worker-10[16-40] in preparation for shrinking legacy Kubernetes cluster [23:41:37] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [23:42:30] !log tools Drained tools-worker-1040 [23:42:33] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [23:42:56] how can I get access to: ssh deployment-eventlog05.eqiad.wmflabs? [23:43:20] EdTadros__: you need to be a member of the deployment-prep Cloud VPS project. [23:43:42] https://tools.wmflabs.org/openstack-browser/project/deployment-prep [23:44:37] bd808: is there some sort of maint going on right? [23:44:41] right now* [23:44:44] bd808 where do I go to request access/membership? [23:45:11] Zppix: yes? Do you have a particular issue to report? [23:45:25] EdTadros__: That is actually a really good question. 
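The "deleting loads of stuck grid jobs" step above is the usual post-NFS-outage cleanup on the grid engine side: jobs wedged in error or half-deleted states have to be force-removed before the queues behave normally again. A rough sketch of what that looks like on the grid master; the column position and the state letters (Eqw, dr and friends) assume standard qstat output:

# Collect every job whose state column shows an error ("E...") or a stuck
# deletion ("d..."), then force-delete them. Review the list before running qdel.
qstat -u '*' | awk 'NR > 2 && ($5 ~ /E/ || $5 ~ /d/) {print $1}' | sort -un > /tmp/stuck-jobs.txt

while read -r job; do
    qdel -f "$job"
done < /tmp/stuck-jobs.txt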
[23:45:27] bd808: no, just trying to figure out why my ircbot keeps dying [23:46:44] Zppix: earlier that would have been from the NFS maintenance. Just now it was because your bot was running on tools-worker-1039 which I just drained of jobs [23:46:57] it should start back up on a different node [23:47:19] bd808: it did i was curious, thanks [23:47:51] EdTadros__: first question for you, Do you have a Wikimedia developer account yet? -- https://www.mediawiki.org/wiki/Developer_account [23:48:44] I believe so but will confirm and hop back in here in a bit. [23:49:36] EdTadros__: cool. Once that's sorted then you can start adding the things you will need to join any Cloud VPS project. Then we can get you added as a member of deployment-prep [23:50:40] EdTadros__: https://wikitech.wikimedia.org/wiki/Help:Cloud_services_user_roles_and_rights may help you understand things a bit. [23:51:20] * bd808 wonders why stashbot isn't started up again yet [23:52:36] bd808: stashbot has other plans in mind /s [23:53:13] we may have some busted nodes in the legacy Kubernetes cluster it seems :/ [23:59:20] EdTadros__, I may be able to grant access to deployment-prep stuff depending on your position
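The cordon/drain entries near the end, and the IRC bot that simply came back on a different node, are the standard way a Kubernetes worker is taken out of service: cordoning keeps new pods off it, draining evicts the pods it already has so their controllers reschedule them elsewhere. Roughly, using one of the node names from the log:

# Stop scheduling anything new onto a worker that is being retired.
kubectl cordon tools-worker-1039

# Evict its pods; their Deployments recreate them on the remaining
# uncordoned workers, which is why the bot above restarted elsewhere.
kubectl drain tools-worker-1039 --ignore-daemonsets --delete-local-data

# Once the node is empty it can be removed from the cluster entirely.
kubectl delete node tools-worker-1039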