[00:00:43] paravoid: i wouldn't have guessed that you listened to neko case, but i'm not sure what about it is surprising
[00:01:14] (CR) Nemo bis: "Thanks. How about "Thanks" too? And BetaFeatures, CentralNotice, Echo?" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105622 (owner: Reedy)
[00:01:22] heh
[00:01:27] Nemo_bis: I was doing the obvious ones
[00:01:46] Wikimedians are never satisfied.
[00:01:52] Reedy: and I was just dropping some only-slightly-less-obvious in the comments :)
[00:01:59] Gloria: I prepended a "Thanks"
[00:02:07] You're saintly.
[00:02:19] consider that the standard nowadays is just clicking a button and letting a robot/program do the job for you
[00:02:39] $wgCategoryTreeDynamicTag = true;
[00:02:39] require( $IP . '/extensions/CategoryTree/CategoryTree.php' );
[00:02:45] How did anyone think that was going to work?
[00:02:47] Actually writing or pronouncing the word is probably an habit going to extintion
[00:02:50] +c
[00:05:47] wtf
[00:09:17] !log reedy updated /a/common to {{Gerrit|I624133239}}: Disable more stuff on loginwiki and votewiki
[00:09:35] Logged the message, Master
[00:10:23] (PS1) Reedy: Fixup CategoryTree [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105623
[00:11:45] greg-g: did you enjoy the video?
[00:12:37] (PS2) Reedy: Fixup CategoryTree config, disable where necessary [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105623
[00:13:39] (CR) Reedy: [C: 2] Fixup CategoryTree config, disable where necessary [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105623 (owner: Reedy)
[00:13:48] (Merged) jenkins-bot: Fixup CategoryTree config, disable where necessary [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105623 (owner: Reedy)
[00:15:44] !log reedy synchronized wmf-config/
[00:16:02] Logged the message, Master
[00:18:38] !log reedy updated /a/common to {{Gerrit|Ibdd8671a1}}: Fixup CategoryTree config, disable where necessary
[00:18:44] (PS1) Reedy: Disable BetaFeatures on votewiki and loginwiki [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105624
[00:18:55] Logged the message, Master
[00:19:46] (CR) Reedy: [C: 2] Disable BetaFeatures on votewiki and loginwiki [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105624 (owner: Reedy)
[00:19:56] (Merged) jenkins-bot: Disable BetaFeatures on votewiki and loginwiki [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105624 (owner: Reedy)
[00:20:08] Reedy: An if branch + require seems weird.
[00:20:52] why
[00:20:53] ?
[00:20:54] !log reedy synchronized wmf-config/InitialiseSettings.php
[00:21:10] Dunno.
[00:21:12] Logged the message, Master
[00:21:23] Because the others near it use include.
[00:21:34] we're pretty inconsistent throughout the file
[00:28:41] (Abandoned) Andrew Bogott: Add support for miscellaneous proxy-specific settings. [operations/puppet] - https://gerrit.wikimedia.org/r/105513 (owner: Andrew Bogott)
[00:35:24] !log reedy updated /a/common to {{Gerrit|Ie45a06c38}}: Disable BetaFeatures on votewiki and loginwiki
[00:35:31] (PS1) Reedy: Disable a handful more extensions form loginwiki and votewiki [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105627
[00:35:43] Logged the message, Master
[00:36:30] (CR) Reedy: [C: 2] Disable a handful more extensions form loginwiki and votewiki [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105627 (owner: Reedy)
[00:36:52] (Merged) jenkins-bot: Disable a handful more extensions form loginwiki and votewiki [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105627 (owner: Reedy)
[00:37:17] !log reedy synchronized wmf-config/InitialiseSettings.php
[00:37:33] Logged the message, Master
[00:44:52] !log reedy updated /a/common to {{Gerrit|I854ea451b}}: Disable a handful more extensions form loginwiki and votewiki
[00:44:56] (PS1) Reedy: Disable a few more extensions from VoteWiki and LoginWiki [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105628
[00:45:10] Logged the message, Master
[00:45:37] (CR) Reedy: [C: 2] Disable a few more extensions from VoteWiki and LoginWiki [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105628 (owner: Reedy)
[00:45:45] (Merged) jenkins-bot: Disable a few more extensions from VoteWiki and LoginWiki [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105628 (owner: Reedy)
[00:47:33] !log reedy synchronized wmf-config/
[00:47:51] Logged the message, Master
[00:48:40] !log reedy updated /a/common to {{Gerrit|I23469a49a}}: Disable a few more extensions from VoteWiki and LoginWiki
[00:48:58] Logged the message, Master
[00:49:20] !log reedy synchronized wmf-config/CommonSettings.php 'Fix notice'
[00:49:33] (PS1) Reedy: Fix NOTICE/typo [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105630
[00:49:35] Logged the message, Master
[00:52:51] !log reedy synchronized wmf-config/ '$wgUseAjax = true;'
[00:52:57] 'wgUseAjax' => array(
[00:52:57] 'default' => false,
[00:52:58] ),
[00:53:08] Logged the message, Master
[00:53:08] Reedy: You found that in InitialiseSettings.php?
[00:53:11] Krenair: ^
[00:53:19] Gah
[00:53:22] lol
[00:53:28] I don't even.
[00:54:09] (PS1) Reedy: [x] Yes to AJAX [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105632
[00:54:46] Reedy: Give me a second to review that.
[00:55:07] It's fine, Gloria...
[00:55:18] (CR) MZMcBride: "Looks good to me." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105632 (owner: Reedy)
[00:55:34] Krenair: I was double-checking that InitialiseSettings.php got updated.
[00:55:55] https://noc.wikimedia.org/conf/CommonSettings.php.txt seems outdated.
[00:55:57] (CR) Reedy: [C: 2] [x] Yes to AJAX [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105632 (owner: Reedy)
[00:55:58] But might be local cache.
[00:56:19] I see "$wgUseAjax = true;" in there.
[00:57:06] https://git.wikimedia.org/raw/operations%2Fmediawiki-config/7047848725cc099d76094c16ebf1e4b8ce5299d5/wmf-config%2FCommonSettings.php
[00:57:11] Doesn't have the same line.
[00:57:13] (CR) Reedy: [C: 2] Fix NOTICE/typo [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105630 (owner: Reedy)
[00:57:23] (Merged) jenkins-bot: Fix NOTICE/typo [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105630 (owner: Reedy)
[00:57:25] (Merged) jenkins-bot: [x] Yes to AJAX [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105632 (owner: Reedy)
[00:59:14] !log reedy synchronized wmf-config/ '[x] Yes to AJAX'
[00:59:30] Logged the message, Master
[00:59:42] reedy@tin:/a/common$ mwscript eval.php enwiki
[00:59:43] > echo $wgUseAjax;
[00:59:43] 1
[00:59:43] WFM
[01:00:28] Remind me what /a is?
[01:00:46] 1st extra storage partition
[01:01:07] Or not even
[01:01:14] It's a folder
[01:36:56] Reedy: Yeah, I think the conf/ files at noc can be slightly oudated.
[01:37:02] Outdated, even.
[01:37:05] Possibly
[01:37:11] https://noc.wikimedia.org/conf/CommonSettings.php.txt looks fine now.
[01:37:13] I think there's still on fenari...
[01:37:25] So syncing should hit there too
[01:37:40] Probably a caching layer or two.
[01:40:58] curl says no
[01:40:59] Server: Apache/2.2.22 (Ubuntu)
[01:41:13] Probably should be behind the misc proxy...
[01:58:06] (CR) MZMcBride: "Related: bug 59704" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105476 (owner: Reedy)
[02:00:36] (CR) MZMcBride: "Related: bug 59702" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105622 (owner: Reedy)
[02:00:45] (CR) MZMcBride: "Related: bug 59702" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105623 (owner: Reedy)
[02:00:50] (CR) MZMcBride: "Related: bug 59702" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105624 (owner: Reedy)
[02:01:08] (CR) MZMcBride: "Related: bug 59702" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105627 (owner: Reedy)
[02:01:22] (CR) MZMcBride: "Related: bug 59702" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105628 (owner: Reedy)
[02:01:36] (CR) MZMcBride: "Related: bug 59702" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105630 (owner: Reedy)
[02:01:45] (CR) MZMcBride: "Related: bug 59702" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105632 (owner: Reedy)
[02:01:51] (Finished.)
[02:18:30] !log LocalisationUpdate completed (1.23wmf8) at Mon Jan 6 02:18:30 UTC 2014
[02:18:54] Logged the message, Master
[02:35:14] !log LocalisationUpdate completed (1.23wmf9) at Mon Jan 6 02:35:14 UTC 2014
[02:35:31] Logged the message, Master
[02:44:44] PROBLEM - SSH on lvs1001 is CRITICAL: Server answer:
[02:45:44] RECOVERY - SSH on lvs1001 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0)
[03:00:27] !log LocalisationUpdate ResourceLoader cache refresh completed at Mon Jan 6 03:00:27 UTC 2014
[03:00:44] Logged the message, Master
[05:18:49] Aaron|home: huh, apparently salt uses msgpack for serialization
[05:19:03] maybe that php code will have other uses
[05:25:13] * Aaron|home detects evil grinning
[05:33:34] PROBLEM - RAID on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[05:35:34] RECOVERY - RAID on searchidx1001 is OK: OK: optimal, 1 logical, 4 physical
[05:36:23] !log aaron started scap: timing test (beta)
[05:36:40] Logged the message, Master
[05:45:40] ugh searchidx1001 again
[06:10:09] Aaron|home: that system needs ops attention
[06:10:25] it's not something we should hack around in scap or by renicing
[06:10:47] dmesg is filled w/:
[06:10:50] [11604026.902923] INFO: task xfsbufd/sda6:731 blocked for more than 120 seconds.
[06:10:50] [11604026.910081] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[06:10:59] except it's any number of processes
[06:12:01] * Aaron|home killed the resync proc after a while
[06:12:24] my main terminal seemed to have disconnected by then
[06:12:55] but yeah that box is fcked
[06:14:23] just running 'ls' hangs
[06:16:28] could be an XFS deadlock or a bad disk
[06:17:52] apergo.s should be up in a bit & could possibly investigate. probably best to file an RT ticket tho
[06:25:34] RECOVERY - udp2log log age for oxygen on oxygen is OK: OK: all log files active
[06:26:35] first relevant dmesg entry is "[9997623.601184] INFO: task kswapd0:115 blocked for more than 120 seconds." tc
[06:29:54] PROBLEM - MySQL Slave Running on db68 is CRITICAL: CRIT replication Slave_IO_Running: Yes Slave_SQL_Running: No Last_Error: Error You have an error in your SQL syntax: check the manual that co
[06:30:32] I would just reboot the box and see how it is after that
[06:30:38] ori: Aaron|home ^^
[06:31:54] RECOVERY - MySQL Slave Running on db68 is OK: OK replication Slave_IO_Running: Yes Slave_SQL_Running: Yes Last_Error:
[06:38:58] !log rebooting searchidx1001, hanging on various things, first sign of trouble was 'task kswapd0:115 blocked for more than 120 seconds.'
[06:39:16] Logged the message, Master
[06:40:04] PROBLEM - Disk space on searchidx1001 is CRITICAL: Connection refused by host
[06:40:15] PROBLEM - DPKG on searchidx1001 is CRITICAL: Connection refused by host
[06:40:15] PROBLEM - SSH on searchidx1001 is CRITICAL: Connection refused
[06:40:24] PROBLEM - puppet disabled on searchidx1001 is CRITICAL: Connection refused by host
[06:40:24] PROBLEM - RAID on searchidx1001 is CRITICAL: Connection refused by host
[06:42:54] PROBLEM - Host searchidx1001 is DOWN: PING CRITICAL - Packet loss = 100%
[06:44:14] RECOVERY - SSH on searchidx1001 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0)
[06:44:24] RECOVERY - puppet disabled on searchidx1001 is OK: OK
[06:44:24] RECOVERY - Host searchidx1001 is UP: PING OK - Packet loss = 0%, RTA = 0.30 ms
[06:44:24] RECOVERY - RAID on searchidx1001 is OK: OK: optimal, 1 logical, 4 physical
[06:45:04] RECOVERY - Disk space on searchidx1001 is OK: DISK OK
[06:45:14] RECOVERY - DPKG on searchidx1001 is OK: All packages OK
[06:58:44] PROBLEM - NTP on searchidx1001 is CRITICAL: NTP CRITICAL: Offset unknown
[07:04:44] RECOVERY - NTP on searchidx1001 is OK: NTP OK: Offset -0.0007642507553 secs
[07:27:34] PROBLEM - udp2log log age for oxygen on oxygen is CRITICAL: CRITICAL: log files /var/log/udp2log/packet-loss.log, have not been written in a critical amount of time. For most logs, this is 4 hours. For slow logs, this is 4 days.
[07:56:52] apergos: looks better. thanks!
[07:57:04] sure
[09:04:56] (CR) DrTrigon: [C: 1] "Merge please! :)" [operations/puppet] - https://gerrit.wikimedia.org/r/105192 (owner: Tim Landscheidt)
[10:06:46] (CR) Dzahn: [C: 2] Add Frank Schulenburg's blog to the English Planet [operations/puppet] - https://gerrit.wikimedia.org/r/105460 (owner: Odder)
[10:13:25] WEE
[10:15:00] !log jenkins: refreshing jslint files for mediawiki extensions, making sure they are in sync
[10:15:17] Logged the message, Master
[10:16:02] (CR) Dzahn: [C: 2] install graphviz on bugzilla role [operations/puppet] - https://gerrit.wikimedia.org/r/103525 (owner: Dzahn)
[10:18:50] (CR) Dzahn: "Checking for GraphViz (any) ok" [operations/puppet] - https://gerrit.wikimedia.org/r/103525 (owner: Dzahn)
[10:35:12] (PS1) TTO: Add Wikimania wikis to global abuse filters [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105655
[10:36:50] (CR) TTO: "Yay for cleanup!! Thanks Reedy." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105632 (owner: Reedy)
[10:40:42] (CR) Dzahn: [C: 1] Add Wikimania wikis to global abuse filters [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105655 (owner: TTO)
[10:41:18] (CR) Odder: "https://gerrit.wikimedia.org/r/#/c/104726/ closes the wikimania2013wiki, so it probably isn't worth the hassle to have to remove the setti" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105655 (owner: TTO)
[10:50:53] !log jenkins: made mediawiki-core-phpunit-parser job executable concurrently, might cause race conditions. {{gerrit|102152}}
[10:51:09] Logged the message, Master
[10:59:21] (CR) TTO: "Good point, didn't know it was happening so soon." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105655 (owner: TTO)
[10:59:30] (PS2) TTO: Add Wikimania 2014 wiki to global abuse filters [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105655
[11:16:49] (CR) Odder: [C: 1] Add Wikimania 2014 wiki to global abuse filters [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105655 (owner: TTO)
[11:17:18] !log jenkins: uninstalling phpcs on gallium (was installed with pear, no deployed using our deploy system) {{bug|57064}}
[11:17:35] Logged the message, Master
[11:50:30] PROBLEM - Disk space on wtp1007 is CRITICAL: DISK CRITICAL - free space: / 200 MB (2% inode=72%):
[11:54:34] PROBLEM - Disk space on wtp1024 is CRITICAL: DISK CRITICAL - free space: / 184 MB (2% inode=72%):
[11:54:44] PROBLEM - Parsoid on wtp1007 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[11:54:45] PROBLEM - Disk space on wtp1010 is CRITICAL: DISK CRITICAL - free space: / 144 MB (1% inode=72%):
[11:56:44] PROBLEM - Disk space on wtp1014 is CRITICAL: DISK CRITICAL - free space: / 257 MB (2% inode=72%):
[11:58:44] PROBLEM - Parsoid on wtp1010 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[11:58:45] RECOVERY - Disk space on wtp1014 is OK: DISK OK
[11:59:04] PROBLEM - Parsoid on wtp1024 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:00:02] (PS1) Dzahn: hackish fix for wikisource portal special case [operations/debs/wikistats] - https://gerrit.wikimedia.org/r/105668
[12:01:04] PROBLEM - Parsoid on wtp1014 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:05:34] PROBLEM - Disk space on wtp1019 is CRITICAL: DISK CRITICAL - free space: / 320 MB (3% inode=72%):
[12:06:30] re: socket timeout on wtp1014 - processes are running, can login, disk space available ..
[12:07:34] does this cause an actual problem on the site or not
[12:09:42] suppose not because no paging or other reports
[12:09:43] (CR) Addshore: Start wikidata puppet module for builder (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/96552 (owner: Addshore)
[12:10:14] PROBLEM - Parsoid on wtp1019 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[12:11:34] RECOVERY - Disk space on wtp1024 is OK: DISK OK
[12:11:47] (PS10) Addshore: Start wikidata puppet module for builder [operations/puppet] - https://gerrit.wikimedia.org/r/96552
[12:13:24] RECOVERY - Disk space on wtp1007 is OK: DISK OK
[12:13:35] (PS11) Addshore: Start wikidata puppet module for builder [operations/puppet] - https://gerrit.wikimedia.org/r/96552
[12:15:30] (PS2) Parent5446: Undeploy AssertEdit (merged into core) [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/96931 (owner: Legoktm)
[12:24:44] RECOVERY - Disk space on wtp1010 is OK: DISK OK
[12:32:34] RECOVERY - Disk space on wtp1019 is OK: DISK OK
[12:34:44] PROBLEM - Disk space on wtp1014 is CRITICAL: DISK CRITICAL - free space: / 263 MB (2% inode=72%):
[12:49:34] PROBLEM - Disk space on wtp1024 is CRITICAL: DISK CRITICAL - free space: / 186 MB (2% inode=72%):
[12:50:24] PROBLEM - Disk space on wtp1007 is CRITICAL: DISK CRITICAL - free space: / 211 MB (2% inode=72%):
[13:01:44] PROBLEM - Disk space on wtp1010 is CRITICAL: DISK CRITICAL - free space: / 148 MB (1% inode=72%):
[13:06:37] (PS1) Dzahn: let metapedias table have language columns [operations/debs/wikistats] - https://gerrit.wikimedia.org/r/105675
[13:07:38] (CR) Dzahn: [C: 2] hackish fix for wikisource portal special case [operations/debs/wikistats] - https://gerrit.wikimedia.org/r/105668 (owner: Dzahn)
[13:08:23] (CR) Dzahn: [C: 2] let metapedias table have language columns [operations/debs/wikistats] - https://gerrit.wikimedia.org/r/105675 (owner: Dzahn)
[13:10:34] PROBLEM - Disk space on wtp1019 is CRITICAL: DISK CRITICAL - free space: / 264 MB (2% inode=72%):
[13:11:23] (CR) Dzahn: "reported by Robert Hanke, thx, fixed" [operations/debs/wikistats] - https://gerrit.wikimedia.org/r/105675 (owner: Dzahn)
[13:45:06] (PS1) Odder: Enable UploadWizard on Romanian Wikipedia [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105679
[13:48:42] mutante: interested in importing some 2k independent mediawikis into wikistats?
[13:49:10] they're at https://wikiteam.googlecode.com/svn/trunk/listsofwikis/mediawikis_2013_checked.txt , already added to https://wikiapiary.com/wiki/Category:January_2014_Import
[13:50:01] Nemo_bis: yes, i am, thanks. can make bug ?:)
[13:50:10] it has a component
[13:51:16] sure
[13:56:13] mutante: btw it's always such a pain to navigate the *three* _pruducts_ for labs/tools
[13:56:40] !log jenkins: migrated mediawiki extension loader from /tools/extensions-loader.php to mediawiki/conf.d/50_mw_ext_loader.php {{gerrit|105680}}
[13:56:46] Nemo_bis: :P i know heh:) thank you
[13:56:56] Logged the message, Master
[13:57:00] Nemo_bis: Bugzilla main navigation has a new link since today :)
[13:57:23] mutante: what link ?
[13:57:50] My Requests ?
[13:59:17] hashar: https://gerrit.wikimedia.org/r/#/c/104342/
[13:59:23] Browse projects
[13:59:28] ahhh
[14:10:09] I know how to find projects, it's just frustrating to always be presented with all that confusion of similar products
[14:12:37] not supposed to change any products or components without tickets to andre because of confusion in the past and multi-admins :p
[14:13:53] (PS1) Aude: Enable Wikibase Client on testwiki [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105682
[14:15:14] sure
[14:25:33] (CR) Reedy: [C: 2] Enable Wikibase Client on testwiki [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105682 (owner: Aude)
[14:25:47] (Merged) jenkins-bot: Enable Wikibase Client on testwiki [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105682 (owner: Aude)
[14:26:12] Bugzilla products too confusing? Add your opinion and grand plan to https://bugzilla.wikimedia.org/show_bug.cgi?id=38990 ! :P
[14:26:37] !log reedy synchronized wmf-config/InitialiseSettings.php
[14:26:55] Logged the message, Master
[14:27:05] aude: Any other prep we need?
[14:29:09] don't think so
[14:29:21] I can't find your config change commit again...
[14:29:32] * aude looks
[14:29:39] https://github.com/wikimedia/operations-mediawiki-config/commits/master?page=2
[14:30:12] https://github.com/wikimedia/operations-mediawiki-config/commit/e6fc8d9db38947776523bfdcc330bdecbaadd034
[14:30:15] Need to be looking for the right thing ;)
[14:30:20] I7cf7af5525f3223dbd044e7676b5c0255b45928c
[14:30:25] shhh, don't let them catch you using github links
[14:30:47] too late
[14:30:51] Reedy, do you have a script or something that reports fatal errors in the production logs?
[14:31:01] Sort of
[14:31:21] There's one that do some grouping and combining of the last 1000 lines of the apache syslogs
[14:31:27] Useful as an overview of what's going on
[14:31:38] Then can grep through the full exception/fatal logs as necessary
[14:31:42] Reedy: so the goal is to have beta run the Wikidata git repo (everything is inside there)
[14:31:45] on beta
[14:31:50] and keep production as-is
[14:31:54] Reedy: is it checked in somewhere?
[14:31:55] YuviPanda: can i ask re: commons app ?:
[14:32:05] mutante: sure! switch to -mobile?
[14:32:12] YuviPanda: kthx
[14:32:24] for localisation cache, i tried to tell beta to use one set of stuff for wikibase and production the other
[14:32:31] Reedy: I'm looking for something that will monitor fatal.log on beta, and wondering if I should just write something myself.
[14:32:50] Uhh
[14:33:08] chrismcmahon: we don't have such thing?
[14:33:20] watch "tail -n 1000 /home/wikipedia/syslog/apache.log | grep -v 'Search backend error' | grep -v -i 'swift' | grep 'PHP\|Segmentation fault' | grep -v 'filemtime\|failed to mkdir\|GC cache entry\|cache slam averted\|SHA-1 metadata' | sed -r 's/\[notice\] child pid [0-9]+ exit signal //g' | sed 's/, referer.*$//g' | cut -d ' ' -f 7- | sort | uniq -c | sort -rn"
[14:33:23] aude: not on beta, not right now. It's time.
[14:33:27] is the summary script
[14:33:28] oh
[14:34:52] Reedy: hmm, I was thinking of something simpler.
[14:35:09] (PS1) Reedy: Enable Wikidata build on beta labs [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105684
[14:35:22] chrismcmahon: The fatal logs are even more complex than the syslogs...
[14:35:40] (PS2) Reedy: Enable Wikidata build on beta labs [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105684
[14:36:11] Reedy: on beta, I think we just want to know if fatal.log just changes at all. And check at intervals.
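A minimal sketch of the "simpler" check described at 14:36:11: poll fatal.log at intervals and report when it changes at all. This is an illustration, not a script anyone in the channel wrote. The path in FATAL_LOG is a placeholder (the log never says where fatal.log lives on beta), the function names are hypothetical, and notification is left as a callback because the discussion does not settle how mail would be sent from labs.

```python
import os
import time
from typing import Callable, Optional, Tuple

# Placeholder: the channel names "fatal.log" but not its location on beta.
FATAL_LOG = "/path/to/fatal.log"

Snapshot = Optional[Tuple[int, int]]


def snapshot(path: str) -> Snapshot:
    """Return (size, mtime in ns) for path, or None if the file is missing."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return None
    return (st.st_size, st.st_mtime_ns)


def has_changed(previous: Snapshot, current: Snapshot) -> bool:
    """True if the file appeared, vanished, or its size or mtime differs."""
    return previous != current


def watch(path: str,
          interval_seconds: float = 300,
          on_change: Callable[[str], None] = print,
          max_checks: Optional[int] = None) -> None:
    """Poll path every interval_seconds and call on_change when it changes.

    max_checks=None polls forever; a number stops after that many checks.
    """
    last = snapshot(path)
    checks = 0
    while max_checks is None or checks < max_checks:
        time.sleep(interval_seconds)
        current = snapshot(path)
        if has_changed(last, current):
            on_change(f"{path} changed: {last} -> {current}")
            last = current
        checks += 1
```

Calling watch(FATAL_LOG) would print one line per detected change; swapping on_change for a function that sends mail or posts to a channel is where a real deployment would have to decide on a delivery route.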
[14:36:39] We have graphs for those on ganglia
[14:38:31] Reedy: we have http://ganglia.wmflabs.org/latest/?c=deployment-prep
[14:38:59] Nemo_bis: Where are those graphs? :P
[14:40:08] so Reedy, that script above just runs watch() in a screen then? does it do notifications?
[14:40:57] !log reedy updated /a/common to {{Gerrit|I349589a49}}: Enable Wikibase Client on testwiki
[14:40:58] chrismcmahon: Nope
[14:40:58] Logged the message, Master
[14:40:58] Not even run in a screen
[14:41:00] logmsgbot: LIES
[14:41:19] Reedy: so it's basically sync-common mw1017
[14:41:32] That's exactly what I've just done ;)
[14:41:35] k
[14:41:37] * aude reading https://wikitech.wikimedia.org/wiki/Test.wikipedia.org
[14:41:44] No l10n stuff done yet
[14:41:49] Which is where things should break again
[14:42:06] ottomata: I just realized that oxygen's udp2log is not producing data.
[14:42:24] ottomata: Its produced tsv since the last rotations are empty :-(
[14:42:51] ottomata: e.g.: /a/log/webrequest/mobile-sampled-100.tsv.log on oxygen.
[14:43:39] so testwiki has no sites table
[14:43:48] shouldn't break anything though
[14:43:55] might aswell fix it
[14:43:57] easily enough done
[14:44:02] ok
[14:44:17] Reedy: while I have you on the line then, if had a script checking beta for fatals, and I wanted it to send email, what MTA would it send through?
[14:44:32] OOoooo
[14:44:33] wa
[14:44:33] ottomata: Seems it's broken since around 2014-01-05 03:39:25 UTC
[14:44:36] Not sure if you can from labs
[14:44:50] Reedy: yes, this would be a deployment-foo host.
[14:45:00] ottomata: (At least those are the last time stamps in mobile and zero tsvs. Typically rotation is around 06:30)
[14:45:03] qchris, yeah i see some alerts in my email
[14:45:03] yeah
[14:45:06] checking into it now
[14:45:10] ottomata: Thanks.
[14:46:40] Query OK, 798 row(s) affected
[14:46:44] Query OK, 1588 row(s) affected
[14:47:00] sounds right
[14:47:12] what's the command to flush the caches for those?
[14:47:26] it should be automatic
[14:47:38] Reedy: these graphs linked in the /topic you mean? http://ur1.ca/edq1f
[14:47:48] chrismcmahon: ^
[14:47:50] aude: ok...
[14:47:53] Reedy: looks good
[14:47:58] https://test.wikipedia.org/wiki/New_York_City
[14:48:24] PROBLEM - Disk space on wtp1004 is CRITICAL: DISK CRITICAL - free space: / 1333 MB (3% inode=92%):
[14:48:34] RECOVERY - udp2log log age for oxygen on oxygen is OK: OK: all log files active
[14:48:43] (CR) Faidon Liambotis: [C: 2] Fix dependency libtiff5-alt-dev -> libtiff4-dev [operations/debs/vips] - https://gerrit.wikimedia.org/r/102617 (owner: coren)
[14:49:08] I think we've gotta update the l10ncache on tin now
[14:49:10] Got a few hours
[14:49:18] I'll make a backup of it before changing it though ;)
[14:49:23] ok
[14:50:45] still no updates @ https://wikitech.wikimedia.org/wiki/RT_Triage_Duty#Duty_desk_rotation_-_who_is_next.3F
[14:51:11] jeremyb: still waiting on the apache stuff?
[14:51:25] aude: yah
[14:51:27] * aude can't do anything about it
[14:53:14] i'm still getting used to the new redirects.dat workflow
[14:53:18] Reedy: speaking of which... wikisource needs sites table
[14:53:22] before deployment next week
[14:53:39] (PS1) Yuvipanda: dynamicproxy: Enable XFF based on a parameter [operations/puppet] - https://gerrit.wikimedia.org/r/105687
[14:53:43] oh gadolinum is down!?
[14:53:45] eef
[14:54:01] Oooh :-)
[14:54:15] apergos: Gadolinium died Jan 5, 3AM and is now no longer on ganglia. will it resurrect from the deaths?
[14:54:59] dunno, can't seem to log into mgmt either
[14:55:06] hmmmmMMM
[14:55:16] damn you file permissions
[14:55:18] cmjohnson1: got helping ability?
[14:57:16] hedonil: 20:24 apergos: gadolinium down/unreachable, can't get to mgmt console, no ping even
[14:57:29] jeremyb: 6572
[14:57:44] mutante: haha
[14:58:02] mutante: danke :-)
[14:58:33] mutante: few minutes ago gadolinium was at least /dead/ on ganglia, but now disappeared completely
[14:58:51] yeah guess it expired there, its dead in icinga
[14:58:59] doo dee doo, well not much I can do atm :/
[14:59:07] ottomata: what's up?
[14:59:09] doesnt sound good
[14:59:14] gadolinum is dead!
[14:59:15] when apergos said not even mgmt
[14:59:18] its the udp2log multicast relay
[14:59:22] then i dont think it will come back
[14:59:26] any time soon
[14:59:27] well, i said that i think
[14:59:59] hedonil: i think that part was just the ganglia update
[15:00:04] PROBLEM - Parsoid on wtp1004 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[15:00:10] and it was dead since that SAL entry
[15:00:12] i can't seem to get anything out of mgmt
[15:00:19] could be doing something wrong, but ja
[15:00:23] yeah...i can't either
[15:00:56] confirmed, no reply
[15:02:06] yea, at this points it needs smart hands
[15:02:27] and connecting monitor cable
[15:03:02] ottomata mutante: I will be there within the hour...need to wrap up what I am doing
[15:03:05] who wants to make ticket? seems we dont have one yet
[15:03:32] cmjohnson1: cool, thanks, i think otto knows better than I do how urgent it really is
[15:04:49] i can do it
[15:05:23] ottomata: 6573
[15:05:27] but please add some details?
[15:05:32] if it makes sense
[15:05:35] ok thanks
[15:06:07] ottomata: i am heading there now
[15:08:54] assigned to chris , we'll see later
[15:11:24] RECOVERY - Disk space on wtp1004 is OK: DISK OK
[15:15:28] (CR) Manybubbles: [C: 1] "Will deploy with the config change today." (1 comment) [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105479 (owner: Chad)
[15:17:58] moving to the library, back in a few
[15:21:35] (PS3) Aude: Enable Wikidata build on beta labs [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105684 (owner: Reedy)
[15:41:06] (PS1) Jgreen: increase max message size for spam scanning on OTRS server [operations/puppet] - https://gerrit.wikimedia.org/r/105692
[15:44:29] (CR) Jgreen: [C: 2 V: 1] increase max message size for spam scanning on OTRS server [operations/puppet] - https://gerrit.wikimedia.org/r/105692 (owner: Jgreen)
[15:44:56] (CR) Reedy: [C: 2] "Tested and confirmed working on testwiki" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105684 (owner: Reedy)
[15:45:05] (Merged) jenkins-bot: Enable Wikidata build on beta labs [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105684 (owner: Reedy)
[15:46:49] !log rebooting gadolinium
[15:47:07] Logged the message, Master
[15:47:12] (PS1) Hashar: multiversion: replace die() with print; exit(1); [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105693
[15:50:08] !log reedy started scap: Rebuild lolcalisation cache after https://gerrit.wikimedia.org/r/#/c/105684
[15:50:26] Logged the message, Master
[15:53:22] ottomata: definitely a h/w issue...getting an idrac not responding error. this requires a lengthy call to dell...will keep you posted
[15:53:32] oof ok
[15:53:33] hm
[15:53:38] i wonder if we should move the relay
[15:53:41] hmmmmmMMMMM
[15:55:05] cmjohnson1: so this will probably take more than a few hours to get fixed?
[15:55:12] probably
[15:55:26] could be just a firmware update
[15:55:31] Eeeeeee
[15:55:34] hmph
[15:56:18] hmmmm
[15:56:25] i think i could move the relay to an26 temporarily
[15:57:29] 2014-01-06 15:56:57 mw1080 enwiki: [eb8c5447] /?search=Hhvvvvv Exception from line 468 of /usr/local/apache/common-local/php-1.23wmf8/includes/cache/LocalisationCache.php: No localisation cache found for English. Please run maintenance/rebuildLocalisationCache.php.
[15:57:32] eh?
[15:57:38] English?
[15:57:50] it isn't common, but it is weird
[15:58:20] OOF, this also means eventlogging is down i think
[15:58:35] huh
[15:58:37] ah and webstatscollector
[15:59:26] manybubbles: I'm scapping currently
[15:59:31] OOooooof
[15:59:39] Might be hitting at te magic time
[15:59:51] Reedy: grr - that is a new thing, right?
[16:00:05] What?
[16:00:40] The way localisation stuff is deployed has been
[16:00:52] cmjohnson1: do you perchance have another machien lying around we can use for the relay?
[16:00:58] until gadolinium is back?
[16:01:12] i think I can use analytics1026 if you don't
[16:01:34] PROBLEM - udp2log log age for oxygen on oxygen is CRITICAL: CRITICAL: log files /var/log/udp2log/packet-loss.log, have not been written in a critical amount of time. For most logs, this is 4 hours. For slow logs, this is 4 days.
[16:02:19] relay for what?
[16:02:56] udp2log and eventlogging
[16:03:02] gadolinium is down
[16:03:32] relay from what to what?
[16:06:04] the multicast relay for udp2log
[16:06:11] and the unicast relay for eventlogging
[16:06:11] so
[16:06:25] udp2log: varnishes -> gadolinium -> multiple udp2log hosts
[16:06:34] evetnlogging: varnishes -> gadolinium -> vanadium
[16:11:34] paravoid^
[16:11:46] ottomata: looks to be fixe
[16:11:50] !
[16:12:54] RECOVERY - Host gadolinium is UP: PING OK - Packet loss = 0%, RTA = 0.40 ms
[16:13:07] multicast relay?
[16:13:11] for eqiad->pmtpa?
[16:13:12] yes paravoid! hah
[16:13:13] no
[16:13:14] RECOVERY - udp2log log age for erbium on erbium is OK: OK: all log files active
[16:13:33] there is a socat daemon on gadolinium
[16:13:42] is there anyone in-house who can administer 2-factor auth, or do I still need to track down mr. lane?
[16:13:47] that sends udp2log data to a mulitcast addy
[16:13:59] and a couple of udp2log instances get their stream from that multicast group
[16:14:00] from?
[16:14:23] this was setup so that varnishes wouldn't have to run so many instances of varnishncsa
[16:14:34] by asher before my time
[16:14:57] draining the flea power corrected the idrac not responding
[16:15:05] andrewbogott: Coren ^^ can you help Jeff ?
[16:15:13] ottomata: confirmed fixed
[16:15:16] * Coren reads scrollback.
[16:15:21] I can
[16:15:38] awesome, thanks cmjohnson1
[16:16:00] andrewbogott just volunteered. :-)
[16:16:04] whooo!
[16:16:09] cmjohnson1: All of the virt1001-1009 boxes are now working! Thanks for all the juggling.
[16:16:10] glad I don't have to move things around
[16:16:16] But if he's too busy I can help too.
[16:16:32] Jeff_Green: what's up?
[16:16:40] Or am I just not scrolling far enough?
[16:17:14] my phone croaked yet again, and I suspect I've just burned through all of my emergency keys
[16:17:30] i have no idea how to generate a new set and I don't see any documentation
[16:17:32] On wikitech this is?
[16:17:42] Are you currently locked out?
[16:18:21] yep
[16:19:52] ok… I think the for-the-future lesson is to reset your 2fa /before/ you use your last paper key :)
[16:20:06] which, I don't know what the proper method is, but you can always just turn it off and on again, which generates a new everything.
[16:20:11] ?
[16:20:14] how can I turn it off?
[16:20:19] there's 0 documentation
[16:21:02] also I'm not positive I lost my last key. i've been deleting them as I go, and I thought I had 3 left
[16:21:40] It's on your account setting page, which I can't find right now :(
[16:22:00] :-(
[16:22:03] Oh, 'preferences' of course.
[16:22:06] There's a link to disable.
[16:22:19] But you can't disable without logging in and having a key :p So, I will disable it for you.
[16:22:24] :-)
[16:22:34] Which I think I know how to do because I recently wiped my phone and had to do it for myself.
[16:22:37] how are the emergency keys generated?
[16:23:11] You mean, mathwise? Or user interface wise
[16:23:11] ?
[16:23:20] interface-wise
[16:26:21] Jeff_Green, can you call me via google hangout?
[16:26:35] ya. lemee get to a conference room. sec
[16:26:45] I only need 10 seconds
[16:26:50] just ID verify
[16:28:38] !log reedy finished scap: Rebuild lolcalisation cache after https://gerrit.wikimedia.org/r/#/c/105684 (duration: 39m 45s)
[16:28:52] Logged the message, Master
[16:30:11] scap completed in 39m 45s.
[16:30:13] Jeff_Green: thanks, sorry to make you run around.
[16:30:15] * Reedy nearly died of boredom
[16:30:34] andrewbogott: no prob at all
[16:30:43] (PS2) Reedy: Bump various epochs to start of 2013 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105476
[16:30:46] but: stupid linux audio grumble grumble
[16:30:50] finally
[16:30:53] (CR) Reedy: [C: 2] Bump various epochs to start of 2013 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105476 (owner: Reedy)
[16:30:58] 39m
[16:31:02] (Merged) jenkins-bot: Bump various epochs to start of 2013 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105476 (owner: Reedy)
[16:31:25] I'm so face blind that just now when my face came up on the hangout screen (without my glasses) I didn't know who it was. I spent a while thinking, 'That doesn't look like Jeff!'
[16:31:32] ha
[16:32:13] It's going to take me 5 or 10 to remember how to do this, I'll ping you when it's reset.
[16:32:36] thanks and sorry for the time [16:32:45] (PS3) Reedy: Add Wikimania 2014 wiki to global abuse filters [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105655 (owner: TTO) [16:32:51] (CR) Reedy: [C: 2] Add Wikimania 2014 wiki to global abuse filters [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105655 (owner: TTO) [16:33:00] (Merged) jenkins-bot: Add Wikimania 2014 wiki to global abuse filters [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105655 (owner: TTO) [16:33:22] andrewbogott: I just did that a few days ago and it's fresh in my memory. Need help? [16:33:33] (PS3) Reedy: Swap method for closure [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105471 [16:33:42] Coren: I barely ever use the mysql commandline, it'll be good for me to figure it out. [16:33:46] (CR) Reedy: [C: 2] Swap method for closure [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105471 (owner: Reedy) [16:33:55] andrewbogott: kk [16:33:56] (Merged) jenkins-bot: Swap method for closure [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105471 (owner: Reedy) [16:34:12] (PS2) Reedy: Swap global functions for closures [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105473 [16:34:23] (CR) Reedy: [C: 2] Swap global functions for closures [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105473 (owner: Reedy) [16:34:35] (Merged) jenkins-bot: Swap global functions for closures [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105473 (owner: Reedy) [16:35:00] (PS2) Reedy: Enable UploadWizard on Romanian Wikipedia [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105679 (owner: Odder) [16:35:06] (CR) Reedy: [C: 2] Enable UploadWizard on Romanian Wikipedia [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105679 (owner: Odder) [16:35:17] (Merged)
jenkins-bot: Enable UploadWizard on Romanian Wikipedia [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105679 (owner: Odder) [16:35:40] (PS2) Reedy: Add more settings related to page imports on hewikivoyage [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105221 (owner: Odder) [16:35:44] (CR) Reedy: [C: 2] Add more settings related to page imports on hewikivoyage [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105221 (owner: Odder) [16:35:59] (Merged) jenkins-bot: Add more settings related to page imports on hewikivoyage [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105221 (owner: Odder) [16:36:47] (PS7) Reedy: Add templateeditor right, group, and restriction to testwiki [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/104912 (owner: Rschen7754) [16:36:55] (CR) Reedy: [C: 2] Add templateeditor right, group, and restriction to testwiki [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/104912 (owner: Rschen7754) [16:37:07] (Merged) jenkins-bot: Add templateeditor right, group, and restriction to testwiki [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/104912 (owner: Rschen7754) [16:38:14] (PS5) Reedy: Simplify Drafts related TitleQuickPermissions hook subscriber [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/102366 [16:38:18] (CR) Reedy: [C: 2] Simplify Drafts related TitleQuickPermissions hook subscriber [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/102366 (owner: Reedy) [16:38:20] Jeff_Green: what's your username on wikitech?
[16:38:28] (Merged) jenkins-bot: Simplify Drafts related TitleQuickPermissions hook subscriber [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/102366 (owner: Reedy) [16:39:46] (CR) Reedy: [C: -1] "Not yet" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/104726 (owner: Odder) [16:40:16] (PS2) Reedy: Disable oai auditing. No one uses it for anything [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105475 [16:40:17] (CR) Reedy: [C: 2] Disable oai auditing. No one uses it for anything [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105475 (owner: Reedy) [16:40:40] (Merged) jenkins-bot: Disable oai auditing. No one uses it for anything [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105475 (owner: Reedy) [16:40:51] oh, reedy spam [16:40:54] (CR) Odder: "Not yet what." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/104726 (owner: Odder) [16:41:05] I thought I was in #mediawiki for a second :) [16:41:24] !log reedy synchronized wmf-config/ [16:41:42] Logged the message, Master [16:42:24] Jeff_Green: OK, you can now log into wikitech without a token. [16:42:29] (CR) Reedy: "https://bugzilla.wikimedia.org/show_bug.cgi?id=59157#c3" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/104726 (owner: Odder) [16:42:53] You'll need to turn 2fa back on and start with a whole new QR and paper keys &c.
[16:43:45] Reedy: not bothered too much, but the bug reporter sounds Hong Kong-anese to me [16:43:54] so if they say close, then close :-) [16:44:16] andrewbogott: looking [16:46:12] (03PS1) 10Hashar: sanity test for refreshWikiversionsCDB [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105698 [16:46:29] (03CR) 10jenkins-bot: [V: 04-1] sanity test for refreshWikiversionsCDB [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105698 (owner: 10Hashar) [16:50:31] andrewbogott: worked, and I see the new set of emergency keys. thanks! [16:50:46] Jeff_Green: So there's a 'reset 2fa' link on the preferences page. [16:50:59] I'm pretty sure that's the only existing way to get a new set of paper keys. [16:51:21] right [16:51:39] So… I guess we should all save our last (last two?) wishes to use for more wishes. [16:52:23] yeah [16:52:42] the workflow almost makes sense now... [16:53:29] i'm a little bothered by the fact that it gives you a list of emergency paper keys before you complete the qr-app-code confirmation, seems counterintuitive [16:53:52] It does. [16:54:03] seem counterintuitive, I mean [16:54:24] hopefully those actually work next time, but I'm pretty much convinced they wont :-) [17:01:22] greg-g: am I good to go? [17:01:50] yessir [17:02:07] going@ [17:02:27] <^d> Started the merge process on the submodule updates. [17:03:52] !log reedy synchronized php-1.23wmf9/resources/mediawiki/images [17:04:09] Logged the message, Master [17:04:12] thanks! [17:04:28] Reedy: are you doing things I should wait for? [17:04:51] Nope that's me all done. Thanks [17:05:42] starting then [17:07:06] <^d> branch changes merged [17:08:59] !log manybubbles synchronized php-1.23wmf9/extensions/CirrusSearch/ [17:09:17] Logged the message, Master [17:10:00] test2wiki looks good. 
doing wmf8 [17:12:15] !log manybubbles synchronized php-1.23wmf8/extensions/CirrusSearch/ [17:12:19] that isn't good [17:12:33] Logged the message, Master [17:12:33] CirrusSearch\BetaFeatures not found [17:12:40] fixing [17:13:52] ^d: merge https://gerrit.wikimedia.org/r/#/c/105707/ [17:15:09] <^d> Merged. [17:17:10] <^d> manybubbles: Merged wmf8 change through, rather than wait on jenkins. [17:17:58] !log manybubbles synchronized php-1.23wmf8/extensions/CirrusSearch/ [17:18:15] (03PS2) 10Hashar: sanity test for refreshWikiversionsCDB [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105698 [17:18:15] Logged the message, Master [17:18:26] (03CR) 10jenkins-bot: [V: 04-1] sanity test for refreshWikiversionsCDB [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105698 (owner: 10Hashar) [17:18:33] !log manybubbles synchronized php-1.23wmf8/extensions/CirrusSearch/ [17:18:39] :( [17:18:49] Logged the message, Master [17:19:02] fixed [17:19:27] not entirely [17:19:42] chad, it looks like it still wants CirrusSearchHooks.php rather than Hooks.php [17:20:35] ^d: ^ [17:20:41] <^d> I'm looking [17:20:45] hanks [17:23:02] <^d> It's all mw1198, which is an api server. [17:23:26] yeah! [17:23:29] funky [17:23:33] maybe it isn't getting synced [17:23:51] I can sync again just to see [17:23:59] (03PS3) 10Hashar: sanity test for refreshWikiversionsCDB [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105698 [17:24:10] (03CR) 10jenkins-bot: [V: 04-1] sanity test for refreshWikiversionsCDB [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105698 (owner: 10Hashar) [17:24:28] <^d> Files look ok on the server. [17:24:35] * ^d wonders if it's apc. [17:25:03] <^d> Can I get an opsen to kick apache on mw1198? 
[17:25:35] (03PS4) 10Hashar: sanity test for refreshWikiversionsCDB [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105698 [17:25:43] andrewbogott: ^ [17:26:03] alex is on RT duty but I think that's from last week [17:26:14] it is [17:26:19] alex is also on vac today, it's a holiday here [17:26:23] ^d: Just a service restart? [17:26:33] oh right [17:26:35] (03CR) 10Hashar: "That is a bit hacky but seems to get stuff done. Should generate the php-* directories in a temp directory or maybe delete all directorie" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105698 (owner: 10Hashar) [17:26:36] ^d: going to sync out wmf9 again so we're the same [17:26:39] <^d> andrewbogott: Yes. [17:27:08] hm, there's a typo in the init script. Is that you? [17:27:48] not us [17:27:49] !log manybubbles synchronized php-1.23wmf9/extensions/CirrusSearch/ [17:27:49] <^d> Nope. [17:28:05] Logged the message, Master [17:28:06] is all better now [17:28:33] ^d: I was _just_ typing up your congratualtions message for moving us to a namespace without breaking anything [17:28:51] but ofcourse, neither of us tested it as a betafeature [17:28:54] , [17:30:04] ^d: ok. my heart rate has gone down so I'm not ready for the config changes [17:30:27] s/not/now/? [17:30:28] What's the story with mw1198? Is it hand-configured for some reason? [17:30:49] <^d> Not sure. It's definitely an API apache, all the fatals I was seeing were coming from api.php [17:30:53] what is mw1198? [17:31:45] <^d> andrewbogott: Is apache down on the box right now? 
Just so we're on the same page status-wise :) [17:31:48] ok, nothing to do with test.wikipedia [17:31:51] from site.pp: # mw1189-1208 are api apaches (precise) [17:32:19] ^d: It looks to be running, but 'service' throws an error which I'm trying to track down [17:32:21] (03CR) 10Manybubbles: [C: 032] Finalize commons config for Cirrus [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105479 (owner: 10Chad) [17:32:32] (03Merged) 10jenkins-bot: Finalize commons config for Cirrus [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105479 (owner: 10Chad) [17:32:39] ^d: did you get a chance to read https://gerrit.wikimedia.org/r/#/c/105228/ ? [17:33:12] <^d> Yes, and I'm fine merging it when you're ready to sync. [17:33:39] ready! [17:35:14] /etc/init.d/apache2: 55: [: nice: unexpected operator [17:35:19] Does ^ look familiar to anyone? [17:35:28] (03CR) 10Aaron Schulz: [C: 031] multiversion: replace die() with print; exit(1); [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105693 (owner: 10Hashar) [17:35:31] Best I can tell that init script comes straight from the .deb [17:35:43] mutante loves these things [17:36:02] (03CR) 10Chad: [C: 032] Cirrus config updates [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105228 (owner: 10Manybubbles) [17:36:11] (03Merged) 10jenkins-bot: Cirrus config updates [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105228 (owner: 10Manybubbles) [17:37:43] !log manybubbles synchronized wmf-config/ [17:37:56] !log manybubbles synchronized cirrus.dblist [17:38:00] Logged the message, Master [17:38:15] ^d and greg-g: I'm done syncing things for the day [17:38:18] Logged the message, Master [17:39:01] building indexes now. [17:39:10] you may see elasticsearch complain via nagios. please ignore [17:39:21] it complains because the shards are allocating which they kinda have to do [17:39:28] manybubbles: when should I not ignore it? 
[17:39:32] (serious) [17:39:38] I tell you [17:39:39] #notsnark [17:39:48] k [17:39:49] :) [17:41:09] andrewbogott: the "nice" thing with apache is oooold existing issue [17:41:21] mutante: OK, I will ignore it then [17:41:26] andrewbogott: it's a bug but not new in any way [17:41:28] yep [17:41:57] <^d> greg-g: If some mild amount of complaining about shards happens *when someone is (re)building an index*, those can be more-or-less ignored. [17:41:59] it might even be related to bug number [17:42:00] 9! :) [17:42:08] RT #9 that is [17:42:08] <^d> Now, if nobody's indexing anything that'd be a little more worrying! :) [17:43:23] greg-g: looks like one more pair of syncs [17:43:24] ^d: k :) [17:43:26] this one is a job error [17:45:34] RECOVERY - udp2log log age for oxygen on oxygen is OK: OK: all log files active [17:51:37] (03Abandoned) 10Reedy: Tabs to spaces [operations/dns] - 10https://gerrit.wikimedia.org/r/96188 (owner: 10Reedy) [17:54:15] (03Abandoned) 10Reedy: Cache loaded dblists when tagged. Reuse for SiteMatrix, CentralAuth and Incubator [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/57173 (owner: 10Reedy) [17:55:02] greg-g: heading back to tin for a minute [17:55:09] k [17:56:15] !log manybubbles synchronized php-1.23wmf9/extensions/CirrusSearch/ [17:56:33] Logged the message, Master [17:58:38] (03CR) 10Anomie: "I see die() calls in multiversion/activeMWVersions.php and wmf-config/CommonSettings.php, and also a wrong mention of die() in a comment i" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105693 (owner: 10Hashar) [17:58:57] (03Abandoned) 10Reedy: Install Thanks and Echo extensions on enwikivoyage [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/70861 (owner: 10Reedy) [17:59:55] greg-g and ^d and anyone else: I've finished making the indexes for the new wikis. 
any nagios alerts for elasticsearch are now real [18:00:57] * greg-g nods [18:01:05] !log manybubbles synchronized php-1.23wmf8/extensions/CirrusSearch/ [18:01:23] Logged the message, Master [18:01:23] there. now otherindexjobs shouldn't fail [18:03:12] greg-g: When can I get a release timeslot for Scholarships? I've got a bug fix that reduces log output and some i18n updates. [18:03:30] <^d> manybubbles: I'm testing the addWiki thing, will be creating a "testfoowiki", then deleting it. [18:04:38] <^d> Maybe I won't. Stupid maintenance class, who wrote that crap? [18:04:58] bd808: neat, uh, let's see [18:05:29] (03CR) 10Anomie: [C: 04-1] "You'd need either a config variable that would be progressively toggled as the appropriate 1.23wmfX version is rolled out, or else a direc" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/96931 (owner: 10Legoktm) [18:05:35] bd808: (my laxness will change on this, remember) but what are you doing tomorrow afternoon? [18:06:09] greg-g: My schedule is wide open tomorrow afternoon [18:07:08] cool, 1pm Pacific? [18:07:34] That sounds good to me. [18:07:51] cool [18:10:12] greg-g and ^d: something is funky. wikis seem to be getting cirrus randomly [18:10:23] <^d> ...what? [18:10:29] stealth deploy [18:10:34] but, weird [18:11:13] https://www.wikidata.org/w/index.php?search=nikfoo&title=Special%3ASearch [18:11:18] just reload it over and over [18:11:18] bd808: added to calendar(s) [18:11:43] about every other load doesn't have our message [18:11:46] greg-g: Thanks. I was trying to figure out how to do that :) [18:11:55] manybubbles: the "did you mean" part? [18:12:12] This wiki is using a new search engine. 
(Learn more) [18:12:22] https://www.wikidata.org/wiki/Special:Version contains us half the time [18:13:06] I'm not getting that, but I am getting the "did you mean" about half the time [18:13:13] greg-g and ^d: I think mw.org gets it every time [18:13:23] I call deployment craziness [18:13:29] oh, yeah, same thing (I was missing the text) [18:13:45] and, the maintenance scripts all work [18:13:46] lack of "did you mean" is the new search [18:14:45] yeah [18:14:48] in that case [18:15:02] I think it is "getting better" [18:15:05] <^d> Dafuq? [18:15:13] like the apache processess aren't reloading the code [18:15:27] and are plowing ahead ok until they die a natural death and the new one is right [18:15:36] that's not cool [18:15:40] you can see an increase in search traffic http://ganglia.wikimedia.org/latest/?r=hour&cs=&ce=&m=es_queries&s=by+name&c=Elasticsearch+cluster+eqiad&h=&host_regex=&max_graphs=0&tab=m&vn=&hide-hf=false&sh=1&z=small&hc=4 [18:16:14] or maybe that bump was temporary [18:16:23] <^d> I'm getting old search on wikidata still. [18:18:08] (03CR) 10Anomie: sanity test for refreshWikiversionsCDB (031 comment) [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/105698 (owner: 10Hashar) [18:18:43] almost like not all servers got cirrus.dblist? [18:18:49] or they are caching that somehow? [18:19:48] I haven't started building the indexes yet until this gets resolved [18:19:51] because weird [18:20:16] touch initialisesettings and sync? [18:20:23] I can [18:20:33] is there something special to do if I change a dblist? [18:20:38] touch initialisesettings and sync? 
[18:20:46] heh [18:21:03] automate it [18:21:09] syncing [18:21:15] !log manybubbles synchronized wmf-config/InitialiseSettings.php [18:21:17] greg-g: I think I already opened a bug for that ;) [18:21:31] Yup [18:21:32] https://bugzilla.wikimedia.org/show_bug.cgi?id=58618 [18:21:33] Logged the message, Master [18:21:49] <^d> #allfixed [18:21:50] Reedy: :) [18:21:58] seems to have fixed it [18:22:17] Reedy: would it ever be bad *not* to touch it? [18:22:38] If you're touching other stuff in wmf-config, or a dblist etc [18:23:07] ah, so the dumb route isn't good enough :/ [18:23:09] It'd probably solve some problems [18:23:45] you can see the extra traffic: http://ganglia.wikimedia.org/latest/?r=hour&cs=&ce=&m=cpu_report&s=by+name&c=Elasticsearch+cluster+eqiad&h=&host_regex=&max_graphs=0&tab=m&vn=&hide-hf=false&sh=1&z=small&hc=4 [18:24:03] <^d> I was just looking at that. [18:24:29] probably mostly from wikidata? or? [18:24:46] checking [18:27:29] gwicke: wtp* boxes have their / 100% full [18:27:40] 1004, 1007, 1010, 1014, 1019, 1024 [18:27:43] 34746 itwiki: [18:27:45] 16905 frwiktionary: [18:27:47] 14381 wikidatawiki: [18:28:03] so mostly new ones [18:28:05] gwicke: want to have a look before I kill these? [18:28:11] manybubbles: huh, neat [18:29:24] PROBLEM - Disk space on wtp1004 is CRITICAL: DISK CRITICAL - free space: / 1418 MB (3% inode=92%): [18:29:27] I'll fun some capacity planning number soon. 
we way way way more than doubled our search traffic with this release [18:29:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 06:19:47 PM UTC [18:29:54] RECOVERY - Puppet freshness on cp1065 is OK: puppet ran at Mon Jan 6 18:29:53 UTC 2014 [18:30:09] better numbers: [18:30:10] 17670 itwiki: [18:30:12] 4250 wikidatawiki: [18:30:14] 1542 cawiki: [18:30:26] we've been doing cawiki for a while [18:30:28] but itwiki is new [18:30:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 06:29:53 PM UTC [18:31:03] ottomata: hey [18:31:13] heya [18:31:14] <^d> manybubbles: Watching the searches go by in real time is kind of neat :) [18:31:24] ottomata: icinga has a bunch of varnishkafka warnings/tracebacks [18:31:30] thanks [18:31:34] and a procs critical [18:31:41] PROCS CRITICAL: 2 processes with command name 'varnishkafka' [18:32:17] (PS1) Reedy: Use getRealmSpecificFilename on extension-list-wikidata [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105718 [18:32:35] 2 process, hm [18:32:43] if you want to accept 2 of them it's easy to change the limits, but .. i guess this case you dont [18:33:08] sometimes it's a bug where it counts the check itself [18:33:13] depending on the args you use [18:33:50] kinda like [18:33:54] grep | grep -v grep [18:33:55] RECOVERY - Varnishkafka log producer on cp1047 is OK: PROCS OK: 1 process with command name varnishkafka [18:33:56] oh hm, thanks icinga!
[18:34:04] i had an extra process there when I was checking something for snaps, don't need it anymore [18:34:09] (and thanks paravoid!) [18:35:08] (PS1) Reedy: Remove DEBUG_LOG usage [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105720 [18:39:07] (PS1) Reedy: Disable Timeline and wikiheiro on votewiki and loginwiki [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105721 [18:44:51] (PS3) Reedy: Completely undeploy AssertEdit (merged into core) [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/96931 (owner: Legoktm) [18:44:52] (PS1) Reedy: Enable AssertEdit
extension only on 1.23wmf9 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105729 [18:49:14] RECOVERY - Puppet freshness on cp1065 is OK: puppet ran at Mon Jan 6 18:49:13 UTC 2014 [18:49:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 06:49:13 PM UTC [18:52:31] Am I the only one for whom grrrit-wm doesn't properly colourize his message since 17:18Z? I see ^C and ^Bs. [18:53:06] heya paravoid, any idea how iptables rules are applied on hooft? [18:53:24] ferm? [18:53:28] ferm? [18:53:29] ha [18:53:37] i don't see the ones i'm looking for specifically [18:53:40] puppet [18:53:42] what are you looking for?
[18:53:46] there are a bunch there about ganglia ports [18:53:53] and hosts that are allowed to connect to them [18:54:08] /etc/ferm/conf.d is full of ports [18:54:11] too many actually [18:54:23] loking [18:54:32] ok ok ok cool, this must be in ganglia in puppet [18:55:01] found it, thanks [18:55:10] neon needs to be able to query ganglia ports too [18:55:12] for ganglios [18:57:20] hmm neon should be able to [18:58:54] RECOVERY - Puppet freshness on cp1065 is OK: puppet ran at Mon Jan 6 18:58:52 UTC 2014 [18:59:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 06:58:52 PM UTC [19:00:44] RECOVERY - Puppet freshness on cp1065 is OK: puppet ran at Mon Jan 6 19:00:38 UTC 2014 [19:01:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:00:38 PM UTC [19:06:18] PROBLEM -
Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:00:38 PM UTC [19:19:44] RECOVERY - Puppet freshness on cp1065 is OK: puppet ran at Mon Jan 6 19:19:37 UTC 2014 [19:20:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL:
Last successful Puppet run was Mon 06 Jan 2014 07:19:37 PM UTC [19:21:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:19:37 PM UTC [19:22:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:19:37 PM UTC [19:23:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:19:37 PM UTC [19:24:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:19:37 PM UTC [19:24:50] wth puppet [19:25:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:19:37 PM UTC [19:26:37] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:19:37 PM UTC [19:27:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:19:37 PM UTC [19:28:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:19:37 PM UTC [19:29:35] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:19:37 PM UTC [19:30:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:19:37 PM UTC [19:31:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:19:37 PM UTC [19:32:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:19:37 PM UTC [19:33:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:19:37 PM UTC [19:34:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:19:37 PM UTC [19:35:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:19:37 PM UTC [19:36:34] PROBLEM - Puppet freshness on cp1065 is 
CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:19:37 PM UTC [19:37:40] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:19:37 PM UTC [19:38:41] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:19:37 PM UTC [19:39:53] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:19:37 PM UTC [19:40:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:19:37 PM UTC [19:41:34] RECOVERY - Puppet freshness on cp1065 is OK: puppet ran at Mon Jan 6 19:41:25 UTC 2014 [19:41:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:19:37 PM UTC [19:42:18] puppet runs fine on cp1065, I don't know why icinga is freaking out [19:42:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:41:25 PM UTC [19:43:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:41:25 PM UTC [19:44:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:41:25 PM UTC [19:45:35] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:41:25 PM UTC [19:46:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:41:25 PM UTC [19:47:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:41:25 PM UTC [19:48:00] PROBLEM - icinga-wm above annoyance treshold [19:48:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:41:25 PM UTC [19:49:24] RECOVERY - Puppet freshness on cp1065 is OK: puppet ran at Mon Jan 6 19:49:16 UTC 2014 [19:49:34] PROBLEM - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 
07:49:16 PM UTC [19:50:03] ACKNOWLEDGEMENT - Puppet freshness on cp1065 is CRITICAL: Last successful Puppet run was Mon 06 Jan 2014 07:49:16 PM UTC daniel_zahn interesting, how shut it [19:52:13] !log disabled notificatios for puppet fresheness on cp1065, annoying icinga-wm told us about it evevery second [19:52:29] Logged the message, Master [19:52:39] can't type on this, obviously [19:52:51] why? [19:53:01] why is it doing this? [19:53:09] why i disabled it or why i cant type? [19:53:20] that answer i don't know yet either [19:53:25] but i know it's just 1 host [19:53:52] re-install? or something similar? [19:54:02] (03PS1) 10RobH: RT: 6579 Replacing wildcard with one-off certificates replacing the use of *.wikimedia.org certificate use with certificates specific to service fqdn. [operations/puppet] - 10https://gerrit.wikimedia.org/r/105735 [19:54:48] (03CR) 10Faidon Liambotis: [C: 04-1] "http://www.mediawiki.org/wiki/Gerrit/Commit_message_guidelines" [operations/puppet] - 10https://gerrit.wikimedia.org/r/105735 (owner: 10RobH) [19:55:12] ? [19:55:19] paravoid: whats wrong with my commit msg? [19:55:25] it has what it does and links to rt [19:55:37] did you read the URL? [19:55:48] newlines [19:55:49] RobH: a newline after the first line [19:55:57] no subject/body, first line over 50 characters [19:56:03] or it does not separate it [19:56:14] doesn't look well in searches and mail [19:56:20] paravoid: there is no requirement that says your first line's gotta be less than 50 chars [19:56:38] it's preferred [19:56:48] so that git log --oneline output is sane on 80x24 [19:56:51] what's the limit on the other lines? [19:56:55] 80? [19:56:57] it even says so on the URL I gave [19:56:58] 100 chars [19:57:00] 78, typically. [19:57:01] ok [19:57:01] 80 for first line [19:57:07] ok [19:57:18] that's what that page says, at least. 
[19:57:20] 100 chars is evil if you ask me, but the page says "between 70 and 100"
[19:57:26] for 78, i knew the 50 for the first and try to stick to it
[19:57:30] though sometimes it's hard
[19:57:51] (PS2) RobH: Replacing wildcard with one-off certificates [operations/puppet] - https://gerrit.wikimedia.org/r/105735
[19:58:13] I just use vim's defaults tbh
[19:58:32] better?
[19:58:34] syntax=gitcommit textwidth=72 filetype=gitcommit
[19:59:08] set tabstop=4 set shiftwidth=4
[19:59:13] set expandtab
[19:59:19] highlight ExtraWhitespace ctermbg=red guibg=red
[19:59:30] match ExtraWhitespace /\s\+$/
[19:59:43] paravoid: what is our standard for tabs/spaces in DNS zones files
[19:59:54] i just saw Reed abandon a change there, f.w.
[19:59:56] iw
[19:59:58] RobH: much. I'd still use "Replace" instead of "replacing", and FQDN. it also says "replace" but only adds certificates, doesn't remove, so it might be worth explaining this further
[20:00:47] (PS3) RobH: Replacing wildcard with one-off certificates [operations/puppet] - https://gerrit.wikimedia.org/r/105735
[20:01:03] bah, i forgot the damned subject.
[20:01:14] i now have an abiding hatred for commit messages.
[20:01:22] RobH: if you just want to change the commit message, you can just Edit in gerrit web ui
[20:01:25] (PS1) Springle: Segregate contributions and lopager traffic on s[234567] [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105738
[20:01:36] ...
[20:01:38] god dman it
[20:01:54] mutante: yes... that would be smarter huh?
[20:01:58] RobH: vs. using local editor, that just works if it's only the message but not any files
[20:02:18] (PS4) RobH: Replacing wildcard with one-off certificates [operations/puppet] - https://gerrit.wikimedia.org/r/105735
[20:02:24] but then gerrit just creates a new PS for you
[20:02:31] huh, interesting, hadnt done that
[20:02:32] now i have
[20:02:34] thx for info =]
[20:02:38] np
[20:02:45] usually when paravoid says my commit messages suck
[20:02:50] its after i self merged and i dont fix ;]
[20:02:55] :)
[20:03:07] (CR) Springle: [C: 2] Segregate contributions and lopager traffic on s[234567] [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105738 (owner: Springle)
[20:03:32] Ok, time to update other repo with the keys
[20:04:10] !log springle synchronized wmf-config/db-eqiad.php 'LB changes'
[20:04:29] Logged the message, Master
[20:05:53] (PS1) Ottomata: Adding neon in the list of gmetad hosts [operations/puppet] - https://gerrit.wikimedia.org/r/105774
[20:06:28] (CR) Ottomata: [C: 2 V: 2] Adding neon in the list of gmetad hosts [operations/puppet] - https://gerrit.wikimedia.org/r/105774 (owner: Ottomata)
[20:08:58] (PS1) Ottomata: Removing typo from configuration.pp [operations/puppet] - https://gerrit.wikimedia.org/r/105809
[20:09:16] (CR) Ottomata: [C: 2 V: 2] Removing typo from configuration.pp [operations/puppet] - https://gerrit.wikimedia.org/r/105809 (owner: Ottomata)
[20:10:28] hi springle
[20:10:37] hi aude
[20:10:49] (CR) RobH: [C: 2] Replace wildcard with one-off certificates [operations/puppet] - https://gerrit.wikimedia.org/r/105735 (owner: RobH)
[20:11:12] i want to check on https://gerrit.wikimedia.org/r/#/c/99660/ (wb_terms for wikidata)
[20:11:31] looks like though you have an outstanding comment for daniel
[20:12:40] yep?
[20:12:51] should we try another patch with "KEY term_search (term_language, term_type, term_entity_type, term_search_key(12), term_entity_id)"
[20:14:15] actually, leave it a couple days. i'm also trialing a partitioned version of wb_terms (hash on term_language). i'll add another comment about it
[20:15:18] ok, great
[20:18:44] paravoid: around?
[20:24:20] (PS1) RobH: ganglia to use fqdn cert - replace wildcard [operations/puppet] - https://gerrit.wikimedia.org/r/105820
[20:25:21] man apache templates are awesome.
[20:25:28] be back in a bit
[20:29:34] !log jenkins: blacklisted l10n-bot in Zuul so it should no more trigger anything {{gerrit|102636}}
[20:29:52] Logged the message, Master
[20:31:15] (CR) RobH: [C: 2] ganglia to use fqdn cert - replace wildcard [operations/puppet] - https://gerrit.wikimedia.org/r/105820 (owner: RobH)
[20:31:28] !log going to merge cert update on ganglia, so if it dies, i messed up
[20:31:46] Logged the message, RobH
[20:33:15] (PS1) Springle: reduce db1006 LB during reindexing [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105821
[20:33:50] (CR) Springle: [C: 2] reduce db1006 LB during reindexing [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105821 (owner: Springle)
[20:34:45] !log springle synchronized wmf-config/db-eqiad.php 'reduce db1006 LB during reindexing'
[20:35:03] Logged the message, Master
[20:36:10] bleh, missed the install entry, doh
[20:36:12] (PS1) RobH: ganglia to use fqdn cert - replace wildcard [operations/puppet] - https://gerrit.wikimedia.org/r/105822
[20:37:53] (CR) RobH: [C: 2] ganglia to use fqdn cert - replace wildcard [operations/puppet] - https://gerrit.wikimedia.org/r/105822 (owner: RobH)
[20:40:07] !log ganglia cert update complete, its all workin
[20:40:26] Logged the message, RobH
[21:05:56] (PS1) RobH: Replace wildcard with icinga.wikimedia.org certificate [operations/puppet] - https://gerrit.wikimedia.org/r/105824
[21:08:43] (CR) Lcarr: [C: 2] Replace wildcard with icinga.wikimedia.org certificate [operations/puppet] - https://gerrit.wikimedia.org/r/105824 (owner: RobH)
[21:15:06] reedy@bast1001:~$ ssh -A flock
[21:15:06] ssh: Could not resolve hostname flock: Name or service not known
[21:15:08] I love that
[21:15:12] When it autocompleted to flock
[21:47:04] (PS2) Hashar: zuul: let us change branch of zuul-config repo [operations/puppet] - https://gerrit.wikimedia.org/r/98155
[21:54:32] (PS1) Hashar: zuul: support for zuul.zuul_url [operations/puppet] - https://gerrit.wikimedia.org/r/105837
[21:54:53] any nice ops could please merge in the two changes above ? :-D
[21:55:03] Zaaarooo impact on production
[22:01:55] (PS8) Mwalker: Collection Renderer (Now a module!) [operations/puppet] - https://gerrit.wikimedia.org/r/102352
[22:02:04] (CR) jenkins-bot: [V: -1] Collection Renderer (Now a module!) [operations/puppet] - https://gerrit.wikimedia.org/r/102352 (owner: Mwalker)
[22:08:33] !log reedy updated /a/common to {{Gerrit|Ie0884149b}}: reduce db1006 LB during reindexing
[22:08:37] (PS1) Reedy: Change $wgTranslateCC for $wgHooks['TranslatePostInitGroups'] [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105839
[22:08:52] Logged the message, Master
[22:11:59] (PS2) Reedy: Change $wgTranslateCC for $wgHooks['TranslatePostInitGroups'] [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105839
[22:13:47] (PS9) Mwalker: Collection Renderer (Now a module!) [operations/puppet] - https://gerrit.wikimedia.org/r/102352
[22:15:11] (PS1) Lcarr: replacing star cert with noc.wikimedia.org on fenari [operations/puppet] - https://gerrit.wikimedia.org/r/105842
[22:15:18] (CR) Reedy: [C: 2] Change $wgTranslateCC for $wgHooks['TranslatePostInitGroups'] [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105839 (owner: Reedy)
[22:15:53] hashar: ok, looking
[22:16:27] (CR) Lcarr: [C: 2] zuul: let us change branch of zuul-config repo [operations/puppet] - https://gerrit.wikimedia.org/r/98155 (owner: Hashar)
[22:16:28] (Merged) jenkins-bot: Change $wgTranslateCC for $wgHooks['TranslatePostInitGroups'] [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105839 (owner: Reedy)
[22:17:00] (CR) Lcarr: [C: 2] zuul: support for zuul.zuul_url [operations/puppet] - https://gerrit.wikimedia.org/r/105837 (owner: Hashar)
[22:17:11] (PS2) Lcarr: replacing star cert with noc.wikimedia.org on fenari [operations/puppet] - https://gerrit.wikimedia.org/r/105842
[22:17:21] !log reedy synchronized wmf-config/CommonSettings.php
[22:17:39] Logged the message, Master
[22:18:09] hashar: i'll merge it once jenkins bot verifies https://gerrit.wikimedia.org/r/#/c/105842/
[22:18:14] (just merge all 3 at once)
[22:18:26] (CR) Lcarr: [C: 2] replacing star cert with noc.wikimedia.org on fenari [operations/puppet] - https://gerrit.wikimedia.org/r/105842 (owner: Lcarr)
[22:18:56] hashar: all merged up :)
[22:21:26] LeslieCarr: and applied. Perfect, thank you very much.
[22:25:34] PROBLEM - HTTP on fenari is CRITICAL: Connection refused
[22:27:37] !log killed nrpe on gallium
[22:27:52] Logged the message, Master
[22:31:34] RECOVERY - HTTP on fenari is OK: HTTP OK: HTTP/1.1 200 OK - 4775 bytes in 0.071 second response time
[22:36:38] hiiii jgage!
[22:43:24] (CR) Ottomata: [C: 1] "This looks good, Christian, let's see about merging and deploying this tomorrow." [operations/puppet] - https://gerrit.wikimedia.org/r/105449 (owner: QChris)
[22:43:29] (CR) Ottomata: "This looks good, Christian, let's see about merging and deploying this tomorrow." [operations/puppet] - https://gerrit.wikimedia.org/r/105450 (owner: QChris)
[22:43:59] (CR) Ottomata: [C: 1] "I haven't checked if that is valid log format variable, but I trust you! Let's see about deploying this tomorrow." [operations/puppet] - https://gerrit.wikimedia.org/r/105451 (owner: QChris)
[22:49:57] (CR) QChris: "> [...], but I trust you!" [operations/puppet] - https://gerrit.wikimedia.org/r/105451 (owner: QChris)
[22:53:04] greg-g and ^d: I've prepared the cirrus pushes. the net caught more than the one tiny change but all changes are simple, helpful, and pass regression tests
[22:54:02] hiii, anybody know where our ganglios package came from?
[22:54:04] did we build it?
[22:54:05] someone else"
[22:54:10] did we build it before gerrit was set up?
[22:54:21] I need to fix somethign in it (hardcoded gmond port)
[22:54:32] Coren, this still needs testing and cleanup, but I welcome your comments on design. Patchset starts here: https://gerrit.wikimedia.org/r/#/c/105845/
[22:54:52] should I convert it to git and add it to gerrit (it is mercurial on bitbucket)
[23:01:07] manybubbles: k, it's you and maybe marktraceur, and I have a 15 minute appt starting at 4, so, coordinate amongst yourselves :)
[23:01:17] I backed out
[23:01:20] nvm
[23:01:23] Hah
[23:01:33] so, just me.
[23:01:37] yep
[23:01:40] and the window is now?
[23:01:53] in one hour, officially
[23:02:03] but, you're east coast, and nothing else is going on...
[23:02:15] so yeah, go for it
[23:02:27] sweet. I do now
[23:03:57] wmf9 is away
[23:04:04] !log manybubbles synchronized php-1.23wmf9/extensions/CirrusSearch 'update cirrus to master to hopefully reduce load on elasticsearch'
[23:04:21] Logged the message, Master
[23:05:02] wmf9 looks good, doing wmf8
[23:06:14] !log manybubbles synchronized php-1.23wmf8/extensions/CirrusSearch 'update cirrus to master to hopefully reduce load on elasticsearch'
[23:06:31] Logged the message, Master
[23:06:54] greg-g: all done
[23:06:56] ^d: done
[23:07:31] too soon to tell re perf?
[23:07:58] eek, that heat map looks hot
[23:10:45] much better
[23:10:56] on server went nuts but came back
[23:11:13] yeah, 0008 looked scary
[23:11:17] 1008
[23:11:46] yeah. he was just doing the same stuff that they were doing before. maybe more of it
[23:12:13] oh man, much better!
[23:13:08] and searches for long strings are fast again too!
[23:13:09] yay
[23:13:29] 2! < 5!
[23:14:25] (PS1) MaxSem: Enable beta mobile diff on betalabs [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105856
[23:14:31] ok, now I don't mind walking away from the computer
[23:14:37] I'mma gonna go now
[23:14:56] <^d> How about searches for upper unicode? Like U+1F4AF?
[23:14:57] <^d> :)
[23:14:58] greg-g or ^d: something blows up my number is on the contact list
[23:15:05] ^d: tomorrow
[23:21:37] (CR) MaxSem: [C: 2] Enable beta mobile diff on betalabs [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105856 (owner: MaxSem)
[23:21:45] (Merged) jenkins-bot: Enable beta mobile diff on betalabs [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105856 (owner: MaxSem)
[23:24:53] (PS1) MaxSem: Redo beta diff on labs in a nicer way [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105860
[23:25:28] (CR) MaxSem: [C: 2] Redo beta diff on labs in a nicer way [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105860 (owner: MaxSem)
[23:25:38] (Merged) jenkins-bot: Redo beta diff on labs in a nicer way [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/105860 (owner: MaxSem)
[23:49:17] going to deploy a couple of javascript changes to wmf9, then 8; they re-launch the module storage experiment.
[23:51:52] !log ori synchronized php-1.23wmf9/extensions/WikimediaEvents 'Ieef052279: Update WikimediaEvents for module storage exp.'
[23:52:10] Logged the message, Master
[23:52:10] !log ori synchronized php-1.23wmf9/resources/mediawiki/mediawiki.js 'Ifa97d36d3a: Restore module storage experiment'
[23:52:27] Logged the message, Master
[23:54:17] !log ori synchronized php-1.23wmf8/extensions/WikimediaEvents 'Ieef052279: Update WikimediaEvents for module storage exp.'
[23:54:35] Logged the message, Master
[23:54:36] !log ori synchronized php-1.23wmf8/resources/mediawiki/mediawiki.js 'Ifa97d36d3a: Restore module storage experiment'
[23:54:53] Logged the message, Master
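
Note on the commit-message exchange around 19:55-19:58: the rules raised there (a short subject line, a blank line after it, wrapped body lines) can be checked mechanically. The following is only a minimal sketch, not anything the channel or Gerrit actually runs. The function name `check_commit_message` and both default limits are assumptions: 50 is the "preferred" subject length mentioned in the log, and 72 is the vim `textwidth` quoted there, while the guidelines page is said to allow "between 70 and 100". The sample body text is hypothetical.

```python
def check_commit_message(message, subject_max=50, body_max=72):
    """Return a list of problems found in a commit message (empty list = OK).

    Both limits are assumptions taken from the discussion in the log,
    not values enforced by any real tool.
    """
    problems = []
    lines = message.rstrip("\n").split("\n")
    subject = lines[0]

    if not subject.strip():
        problems.append("subject line is empty")
    if len(subject) > subject_max:
        problems.append(
            f"subject is {len(subject)} chars (preferred max {subject_max})"
        )
    # The second line must be blank so tools can tell subject from body.
    if len(lines) > 1 and lines[1].strip():
        problems.append("no blank line between subject and body")
    # Every line after the subject is held to the body wrap limit.
    for number, line in enumerate(lines[1:], start=2):
        if len(line) > body_max:
            problems.append(f"line {number} is {len(line)} chars (max {body_max})")
    return problems


if __name__ == "__main__":
    # Hypothetical well-formed message using the subject that was finally merged.
    good = (
        "Replace wildcard with one-off certificates\n"
        "\n"
        "RT: 6579\n"
        "Use certificates specific to each service FQDN.\n"
    )
    for problem in check_commit_message(good):
        print(problem)
```

A message with everything run together on the first line would be reported for subject length, which matches the first complaint made in the review.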