[04:13:37] hi wm-bot!
[05:20:05] PROBLEM - Puppet freshness on db62 is CRITICAL: Puppet has not run in the last 10 hours
[06:49:34] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours
[06:49:34] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours
[06:49:34] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours
[07:05:28] PROBLEM - Puppet freshness on magnesium is CRITICAL: Puppet has not run in the last 10 hours
[07:05:28] PROBLEM - Puppet freshness on zinc is CRITICAL: Puppet has not run in the last 10 hours
[08:06:39] PROBLEM - Puppet freshness on zhen is CRITICAL: Puppet has not run in the last 10 hours
[08:11:51] New review: Hashar; "recheck" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/34850
[08:25:26] New patchset: Hashar; "Jenkins test please ignore." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35135
[09:22:32] New patchset: ArielGlenn; "ms1001 -> external-facing ip long since, updating stanza" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35136
[09:23:34] New patchset: Hashar; "zuul: support specifying a baseurl" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35137
[09:23:56] Change merged: ArielGlenn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35136
[09:33:32] apergos: good morning Ariel! Whenever you have time, could you possibly merge in https://gerrit.wikimedia.org/r/35137
[09:33:47] apergos: it tweaks some Zuul configuration on gallium (continuous integration server)
[09:34:00] oh whitespace
[09:34:01] grr
[09:34:49] New patchset: Hashar; "zuul: support specifying a baseurl" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35137
[09:49:05] in a little while, I'm in a discussion about disabled special pages atm
[10:01:44] Can 3rd parties use bits.wikimedia.org/geoiplookup? I tested that it works, but I don't know whether it is okay to do so.
[10:04:59] third parties as in TWN?
[10:07:18] that page says I'm in Engadine, I don't even know where that is ;)
[10:08:10] https://en.wikipedia.org/wiki/Engadine,_New_South_Wales ;)
[10:11:58] TimStarling: ULS extension users, including twn
[10:13:14] normally people complain about privacy if we make MW dial home without explicit consent
[10:13:36] and if there's no obvious other way to set it up, then that is a privacy problem in itself
[10:14:35] do geoip doesn't know me : Geo = {}
[10:17:02] bah, it knows a lot about me)
[10:18:04] simply does not work over IPv6 :(
[10:18:34] TimStarling: well, surely wmf is better default than freegeoip.net (which doesn't support https) and it can of course be pointed to another service or be turned off
[10:18:50] yes, I guess so
[10:19:56] in terms of traffic, it's hard to imagine the combined external users of ULS making any significant impact on the traffic to bits
[10:20:23] but maybe append ULS to the User-Agent string just in case
[10:20:30] hmm, right, can't do that
[10:20:43] * TimStarling tries thinking before typing
[10:22:13] yeah can only alter the url
[10:23:07] and the conf has:
[10:23:07] if (req.url == "/geoiplookup") {
[10:23:19] but I guess Referer will be set
[10:23:33] usually yes
[10:45:10] Change merged: ArielGlenn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35137
[10:48:48] apergos: thanks )
[10:49:20] err: /Stage[main]/Zuul/Git::Clone[integration/zuul]/Exec[git_pull_integration/zuul]/returns: change from notrun to 0 failed: git pull --quiet returned 1 instead of one of [0] at /var/lib/git/operations/puppet/manifests/generic-definitions.pp:679
[10:49:29] apergos: this is never ending
[10:49:35] warning: /Stage[main]/Zuul/Exec[install_zuul]: Skipping because of failed dependencies
[10:49:49] your conf file is updated however.
[10:50:05] yeah that is at least a good point
[10:51:48] apergos: ohhh I found out there is a conflict in the git repository :/ cd /var/lib/git/integration/zuul then git reset --hard origin/master
[10:51:58] apergos: I force pushed a change which does not play nice with git::clone
[10:52:32] tsk tsk
[10:52:36] it apparently tries to merge in
[10:53:00] I guess that is to prevent git::clone from overriding possible live hacks
[10:55:18] apergos: can you possibly reset the local git repo at /var/lib/git/integration/zuul on gallium ? It is owned by root:root :-]
[10:55:35] that is the last time I force push to a git repo :-]
[10:56:39] same as your reset above I assume?
[10:56:48] yeah git reset --hard origin/master
[10:56:49] should fix it
[10:57:01] master is currently ahead3, behind 1 :(
[10:57:07] nice
[10:57:14] another puppetd -tv will reinstall the software
[10:57:19] I reset it
[10:57:26] \O/
[10:57:28] running
[10:57:47] luckily Zuul is not yet fully in production :-]
[10:58:26] :-D
[10:58:28] I did not face the issue on labs earlier because puppet did not run before the forced push
[10:58:33] well this is why you are getting the bugs out now
[10:58:50] I should use a branch for labs I gues
[10:58:50] s
[10:59:11] or a production tag
[10:59:12] hm
[10:59:41] this run had a bunch of iptables crap in it but that's about it
[10:59:53] so it might be fine
[11:01:56] New review: Hashar; "recheck" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/35135
[11:03:58] apergos: one last thing, the software did not get refreshed. could you manually set it up with : cd /var/lib/git/integration/zuul ; python setup.py install
[11:04:03] that will fix it up :-)
[11:04:27] puppet trigger an installation of the software if and only if it update the git clone
[11:04:49] ok, you'll want to check it to see that it's ok now
[11:04:51] but since you did update the local working directory manually, we still need to manually run the install :(
[11:05:03] PROBLEM - Puppet freshness on ms1002 is CRITICAL: Puppet has not run in the last 10 hours
[11:07:04] checking
[11:07:37] New review: Hashar; "recheck" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/35135
[11:08:36] apergos: got another bug but it is unrelated. Thanks a ton Ariel!
[11:09:04] σθρε
[11:09:06] er
[11:09:07] sure
[11:09:11] stupid keyboard layout
[11:27:06] PROBLEM - Puppet freshness on ssl3001 is CRITICAL: Puppet has not run in the last 10 hours
[11:36:03] PROBLEM - Frontend Squid HTTP on amssq62 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[11:37:33] RECOVERY - Frontend Squid HTTP on amssq62 is OK: HTTP OK HTTP/1.0 200 OK - 656 bytes in 0.251 seconds
[11:50:28] out to coworking space
[12:56:18] PROBLEM - Host cp1041 is DOWN: PING CRITICAL - Packet loss = 100%
[12:56:45] RECOVERY - Host cp1041 is UP: PING OK - Packet loss = 0%, RTA = 26.52 ms
[13:10:51] PROBLEM - Puppet freshness on neon is CRITICAL: Puppet has not run in the last 10 hours
[13:11:00] PROBLEM - Host cp1042 is DOWN: PING CRITICAL - Packet loss = 100%
[13:11:45] RECOVERY - Host cp1042 is UP: PING OK - Packet loss = 0%, RTA = 26.61 ms
[13:20:48] New patchset: Dereckson; "(bug 42280) Temporary disable WebFonts on fa.wikipedia" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/35145
[13:27:39] PROBLEM - Host cp1021 is DOWN: PING CRITICAL - Packet loss = 100%
[13:28:06] RECOVERY - Host cp1021 is UP: PING OK - Packet loss = 0%, RTA = 26.92 ms
[13:31:55] New patchset: Mark Bergsma; "Allow upgrades of Varnish on upload cluster" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35148
[13:32:07] Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35148
[13:32:36] PROBLEM - Varnish HTTP upload-frontend on cp1021 is CRITICAL: Connection refused
[13:32:36] PROBLEM - Varnish traffic logger on cp1021 is CRITICAL: PROCS CRITICAL: 0 processes with command name varnishncsa
[13:34:15] RECOVERY - Varnish traffic logger on cp1021 is OK: PROCS OK: 3 processes with command name varnishncsa
[13:35:45] RECOVERY - Varnish HTTP upload-frontend on cp1021 is OK: HTTP OK HTTP/1.1 200 OK - 643 bytes in 0.053 seconds
[13:55:24] PROBLEM - Host cp1023 is DOWN: PING CRITICAL - Packet loss = 100%
[13:56:13] afk for awhile (feeling like crap since yesterday and getting steadily worse, will try rest for a bit)
[13:56:45] RECOVERY - Host cp1023 is UP: PING OK - Packet loss = 0%, RTA = 26.78 ms
[14:05:59] arhghghg
[14:06:03] PROBLEM - Host cp1024 is DOWN: PING CRITICAL - Packet loss = 100%
[14:06:05] the stupid SQLite is super slow
[14:06:06] :(
[14:06:30] New review: Nemo bis; "Bug was closed LATER" [operations/mediawiki-config] (master) C: -1; - https://gerrit.wikimedia.org/r/35145
[14:06:30] RECOVERY - Host cp1024 is UP: PING OK - Packet loss = 0%, RTA = 26.62 ms
[14:14:35] New review: Dereckson; "@Nemo Huji dropped in the current discussion and closed the bug "deploy on fa." as later, probably w..." [operations/mediawiki-config] (master) C: 0; - https://gerrit.wikimedia.org/r/35145
[14:17:09] New patchset: Hashar; "Contint requires a 512M tmpfs file fs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35159
[14:19:48] New review: Nemo bis; "Yes, bug was reopened but I confirm my -1 because they're not asking undeploy but only a configurati..." [operations/mediawiki-config] (master) C: -1; - https://gerrit.wikimedia.org/r/35145
[14:20:00] PROBLEM - Host cp1025 is DOWN: PING CRITICAL - Packet loss = 100%
[14:21:57] RECOVERY - Host cp1025 is UP: PING OK - Packet loss = 0%, RTA = 26.92 ms
[14:34:52] PROBLEM - Host cp1026 is DOWN: PING CRITICAL - Packet loss = 100%
[14:35:36] RECOVERY - Host cp1026 is UP: PING OK - Packet loss = 0%, RTA = 27.75 ms
[14:39:42] New review: Dereckson; "Actually, the fa. community requested to disable extension." [operations/mediawiki-config] (master) C: 0; - https://gerrit.wikimedia.org/r/35145
[14:42:35] PROBLEM - Host cp1027 is DOWN: PING CRITICAL - Packet loss = 100%
[14:44:50] RECOVERY - Host cp1027 is UP: PING OK - Packet loss = 0%, RTA = 26.77 ms
[14:46:25] New review: Nemo bis; "Removing myself from the discussion: Amir said they'd disable it if there's no quick fix, not in any..." [operations/mediawiki-config] (master) C: 0; - https://gerrit.wikimedia.org/r/35145
[15:02:05] PROBLEM - Host cp1028 is DOWN: PING CRITICAL - Packet loss = 100%
[15:02:59] RECOVERY - Host cp1028 is UP: PING OK - Packet loss = 0%, RTA = 27.08 ms
[15:06:26] PROBLEM - Varnish traffic logger on cp1028 is CRITICAL: PROCS CRITICAL: 0 processes with command name varnishncsa
[15:07:02] PROBLEM - Varnish HTTP upload-frontend on cp1028 is CRITICAL: Connection refused
[15:08:41] RECOVERY - Varnish HTTP upload-frontend on cp1028 is OK: HTTP OK HTTP/1.1 200 OK - 643 bytes in 0.086 seconds
[15:09:35] RECOVERY - Varnish traffic logger on cp1028 is OK: PROCS OK: 3 processes with command name varnishncsa
[15:16:39] apergos: around still ? :-)
[15:19:22] what do you need?
[15:19:26] oh
[15:19:26] :)
[15:19:36] mark: some file permission fix on gallium.wikimedia.org
[15:19:55] mark: I had some perf issues so Tim created a tmpfs but files belong to root:root
[15:20:03] the path is /mnt/jenkins-tmp
[15:20:11] if you could chown -R jenkins:jenkins that would be nice :)
[15:20:20] done
[15:20:26] our hero
[15:20:46] I have documented the hack in puppet at https://gerrit.wikimedia.org/r/#/c/35159/ :)
[15:20:57] I guess Tim will have a look at it first
[15:20:59] thanks again mark!
[15:21:17] PROBLEM - Puppet freshness on db62 is CRITICAL: Puppet has not run in the last 10 hours
[15:21:56] New review: Mark Bergsma; "This should preferably not be in /mnt, but in /var/lib/jenkins (or something) instead, or perhaps un..." [operations/puppet] (production); V: 0 C: -1; - https://gerrit.wikimedia.org/r/35159
[15:22:02] PROBLEM - Host cp1029 is DOWN: PING CRITICAL - Packet loss = 100%
[15:23:23] RECOVERY - Host cp1029 is UP: PING OK - Packet loss = 0%, RTA = 26.50 ms
[15:26:11] !log dist-upgrades & reboots for payments100[1-4], pay-lvs100[12]
[15:26:20] Logged the message, Master
[15:31:47] PROBLEM - Host cp1030 is DOWN: PING CRITICAL - Packet loss = 100%
[15:32:03] New patchset: Hashar; "Contint requires a 512M tmpfs file fs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35159
[15:33:03] New review: Hashar; "PS2 uses /var/lib/jenkins as a base directory and keeps a symbolic behind to preserves back compatib..." [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/35159
[15:33:08] RECOVERY - Host cp1030 is UP: PING OK - Packet loss = 0%, RTA = 26.61 ms
[15:42:44] PROBLEM - Host cp1032 is DOWN: PING CRITICAL - Packet loss = 100%
[15:43:20] RECOVERY - Host cp1032 is UP: PING OK - Packet loss = 0%, RTA = 26.54 ms
[15:52:51] PROBLEM - Host cp1033 is DOWN: PING CRITICAL - Packet loss = 100%
[15:53:18] RECOVERY - Host cp1033 is UP: PING OK - Packet loss = 0%, RTA = 26.49 ms
[16:02:36] PROBLEM - Host cp1034 is DOWN: PING CRITICAL - Packet loss = 100%
[16:05:09] RECOVERY - Host cp1034 is UP: PING OK - Packet loss = 0%, RTA = 26.67 ms
[16:14:00] PROBLEM - Host cp1035 is DOWN: PING CRITICAL - Packet loss = 100%
[16:14:36] RECOVERY - Host cp1035 is UP: PING OK - Packet loss = 0%, RTA = 26.47 ms
[16:17:28] New review: RobLa; "Looks fine to me. I could bikeshed a little (put the true condition first) but this version works. ..." [operations/mediawiki-config] (master); V: 0 C: 1; - https://gerrit.wikimedia.org/r/33542
[16:25:24] PROBLEM - Host cp1036 is DOWN: PING CRITICAL - Packet loss = 100%
[16:26:10] RECOVERY - Host cp1036 is UP: PING OK - Packet loss = 0%, RTA = 26.52 ms
[16:29:10] !log Dist-upgraded & rebooted eqiad upload Varnish servers
[16:29:16] Logged the message, Master
[16:30:55] https://bugzilla.wikimedia.org/show_bug.cgi?id=40910#c5 Who needs to be asked to deploy these sorts of changes? (wikimedia/communications/WP-Victor for blog.wm.o)
[16:32:51] robla: also not necessarily worth the bikeshed but you could do `$wgEnableTranscode = (bool)$wgEnableUploads`
[16:33:34] yeah, j^ and I spoke about that too :)
[16:34:23] * robla goes offline to get into the office now
[16:50:36] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours
[16:50:37] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours
[16:50:37] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours
[16:54:32] Ugh
[16:54:40] Java is using 300% CPU on manganese
[16:54:47] Explains why git clone is so sloooow
[16:55:08] 18296 gerrit2 20 0 6639m 2.3g 5716 S 310 29.0 1482:01 java
[16:57:08] New patchset: Silke Meyer; "Added a generic definition to install mediawiki extensions" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35173
[16:57:24] jeremyb: did you review https://gerrit.wikimedia.org/r/#/c/34481/ ?
[16:57:46] 18296 gerrit2 20 0 6639m 2.3g 5716 S 413 29.0 1490:31 java
[16:57:47] preilly: erm?
[16:57:49] 400%!
[16:57:56] preilly: i read part of it
[16:57:59] preilly: why?
[16:58:00] ^demon|sick: ^^ I'm guessing that's not right?
[16:58:25] Reedy: he's probably sleeping
[16:58:28] <^demon|sick> *sigh*
[16:58:29] <^demon|sick> Probably not
[16:58:44] well you see how accurate i am
[16:58:49] He made a commit in the last half an hour ;)
[16:59:04] If Ryan had been around, I would've poked him instead
[16:59:46] Connection to gerrit.wikimedia.org closed by remote host.
[16:59:46] fatal: The remote end hung up unexpectedly
[16:59:47] heh
[16:59:50] <^demon|sick> If I plan on sleeping, I should actually step away from the computer.
[16:59:59] Write failed: Broken pipe1260/1615), 6.19 MiB | 1.27 MiB/s
[16:59:59] fatal: The remote end hung up unexpectedly
[17:00:01] fatal: early EOF
[17:00:01] fatal: index-pack failed
[17:00:18] * Reedy wonders what the hell preilly is pushing
[17:00:45] Reedy: that was on a clone of operations/debs/lucene-search-2
[17:00:48] <^demon|sick> I kicked gerrit.
[17:01:09] ^demon|sick: much better
[17:01:57] <^demon|sick> Ok, now I'm actually going to walk away from the computer.
[17:02:01] <^demon|sick> Pings will no longer raise me
[17:02:03] <^demon|sick> :)
[17:02:23] thanks
[17:06:32] PROBLEM - Puppet freshness on magnesium is CRITICAL: Puppet has not run in the last 10 hours
[17:06:32] PROBLEM - Puppet freshness on zinc is CRITICAL: Puppet has not run in the last 10 hours
[17:10:04] 13477 gerrit2 20 0 6608m 1.8g 12m S 299 23.2 25:36.11 java
[17:10:07] Didn't take very long...
[17:14:36] New review: Andrew Bogott; "I like this! The git::extension is entirely mediawiki-specific, so could it live in a mediawiki fil..." [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/35173
[17:14:55] 500% CPU!
[17:15:35] seems we can probably do 800%
[17:17:02] New review: Silke Meyer; "Yeah, true. Would that be in the mediawiki.pp file? I'm not sure if the keep_up_to_date parts could ..." [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/35173
[17:23:47] New patchset: Demon; "Fix gerritslave's environment so he can connect" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35176
[17:26:17] ^demon|sick: go away!
[17:26:41] New review: Andrew Bogott; "Yep, I think mediawiki.pp is the right place for it." [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/35173
[17:55:29] !log authdns-update - adding wikipedia.bg
[17:55:36] Logged the message, Master
[17:56:10] bulgaria!
[17:58:04] !log aaron synchronized php-1.21wmf4/extensions/FlaggedRevs
[17:58:11] Logged the message, Master
[17:59:40] New patchset: Reedy; "Add wmf5 symlinks" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/35186
[18:00:03] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/35186
[18:01:25] !log reedy synchronized live-1.5/static-1.21wmf5/
[18:01:31] Logged the message, Master
[18:02:49] !log reedy synchronized wmf-config/ExtensionMessages-1.21wmf5.php
[18:02:55] Logged the message, Master
[18:03:06] !log aaron synchronized php-1.21wmf5/extensions/FlaggedRevs
[18:03:12] Logged the message, Master
[18:07:36] PROBLEM - Puppet freshness on zhen is CRITICAL: Puppet has not run in the last 10 hours
[18:14:48] !log reedy synchronized php-1.21wmf5 'Initial sync of php-1.21wmf5'
[18:14:54] Logged the message, Master
[18:15:36] !log aaron synchronized php-1.21wmf4/thumb.php 'deployed 81c1d0133f74eec600277ab744a9730c4e974e47'
[18:15:42] Logged the message, Master
[18:16:27] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: test2wiki to 1.21wmf5 for l10n stuffs
[18:16:34] Logged the message, Master
[18:20:15] New patchset: Kaldari; "Switching to secure Flickr API URL" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/35190
[18:20:49] Change merged: Kaldari; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/35190
[18:31:51] !log reedy Started syncing Wikimedia installation... : Build localisation cache for 1.21wmf5
[18:31:57] Logged the message, Master
[18:32:31] New patchset: preilly; "fix whitespace issue" [operations/debs/lucene-search-2] (master) - https://gerrit.wikimedia.org/r/35192
[18:32:59] New patchset: Reedy; "ULS is there twice..." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/35193
[18:33:40] New patchset: preilly; "fix whitespace issue" [operations/debs/lucene-search-2] (master) - https://gerrit.wikimedia.org/r/35192
[18:33:44] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/35193
[18:34:22] notpeter: can you approve https://gerrit.wikimedia.org/r/#/c/35192/
[18:34:35] sure
[18:34:41] (probably ;) )
[18:35:09] Change merged: Pyoungmeister; [operations/debs/lucene-search-2] (master) - https://gerrit.wikimedia.org/r/35192
[18:35:34] New review: preilly; "It would be nice to have some sort of review before merge on these changes." [operations/debs/lucene-search-2] (master) - https://gerrit.wikimedia.org/r/34481
[18:37:17] !log aaron synchronized php-1.21wmf5/includes/filerepo/file/LocalFile.php 'deployed 1fa41352d4e0b3cd2e53e7d8d485ac9098b8bfcc'
[18:37:24] Logged the message, Master
[18:41:20] New patchset: preilly; "fix whitespace issue" [operations/debs/lucene-search-2] (master) - https://gerrit.wikimedia.org/r/35195
[18:41:37] Change merged: preilly; [operations/debs/lucene-search-2] (master) - https://gerrit.wikimedia.org/r/35195
[18:43:48] New patchset: Dzahn; "RT-804, umask for wikidev users, overwritten by /etc/profile on <= lucid" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34223
[18:45:06] New patchset: preilly; "fix whitespace issue" [operations/debs/lucene-search-2] (master) - https://gerrit.wikimedia.org/r/35197
[18:45:09] !log reedy synchronized php-1.21wmf5/extensions/Diff
[18:45:16] Logged the message, Master
[18:45:25] Change merged: preilly; [operations/debs/lucene-search-2] (master) - https://gerrit.wikimedia.org/r/35197
[18:47:35] !log reedy synchronized php-1.21wmf5/extensions/Wikibase
[18:47:41] Logged the message, Master
[18:58:50] !log reedy synchronized php-1.21wmf5/cache/l10n/
[18:58:54] yay
[18:58:57] Logged the message, Master
[19:10:43] !log reedy Started syncing Wikimedia installation... : Rebuild localisation cache for 1.21wmf5
[19:10:49] Logged the message, Master
[19:12:41] Can someone please run chown mwdeploy /home/wikipedia/common/wmf-config/ExtensionMessages-1.21wmf5.php on fenari? Thanks
[19:13:02] apergos: paravoid either of you there?
[19:17:35] Reedy: done
[19:17:40] thanks
[19:24:03] !log reedy synchronized php-1.21wmf5/cache/l10n/
[19:24:10] Logged the message, Master
[19:27:11] !log reedy synchronized php-1.21wmf5/extensions/UploadWizard
[19:27:18] Logged the message, Master
[19:29:38] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: testwiki and mediawikiwiki to 1.21wmf5
[19:29:45] Logged the message, Master
[19:30:25] New patchset: Reedy; "testwiki, test2wiki and mediawikiwiki to 1.21wmf5" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/35199
[19:30:58] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/35199
[19:33:27] New review: Umherirrender; "FYI: Some category names have changed" [operations/mediawiki-config] (master) C: 0; - https://gerrit.wikimedia.org/r/34964
[20:06:46] New patchset: Reedy; "Fix testwiki version to 1.21wmf5 not 6!!" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/35204
[20:07:03] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/35204
[20:08:42] New patchset: Reedy; "Add TemplateSandbox" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/35205
[20:10:56] New patchset: Reedy; "Add TemplateSandbox" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/35205
[20:11:16] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/35205
[20:13:03] New patchset: Asher; "enable binlog on pc dbs with sync_binlog=0" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35206
[20:13:51] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35206
[20:20:23] PROBLEM - Host pc3 is DOWN: PING CRITICAL - Packet loss = 100%
[20:21:02] that was me ^^
[20:22:59] New patchset: Reedy; "Enable TemplateSandbox on cluster, on for test2wiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/35207
[20:23:15] RECOVERY - Host pc3 is UP: PING OK - Packet loss = 0%, RTA = 0.38 ms
[20:36:10] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/35207
[20:39:53] PROBLEM - Host pc2 is DOWN: PING CRITICAL - Packet loss = 100%
[20:42:35] RECOVERY - Host pc2 is UP: PING OK - Packet loss = 0%, RTA = 1.27 ms
[20:42:35] PROBLEM - Host pc3 is DOWN: PING CRITICAL - Packet loss = 100%
[20:45:44] RECOVERY - Host pc3 is UP: PING OK - Packet loss = 0%, RTA = 0.71 ms
[20:48:58] New patchset: MarkTraceur; "Put Parsoid bug messages into #mediawiki-parsoid" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35285
[20:54:32] -rw-r--r-- 1 udp2log udp2log 0 Nov 21 06:50 fatal.log
[20:54:51] Anyone any idea why the fatal log on fluorine is empty?
[20:55:24] Can someone also please delete .php(3819):.log and .php(253):.log and .log from /a/mw-log?
[20:55:31] New patchset: Ottomata; "Installing geoip packages on contint server for udp-filter" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35287
[20:55:59] Change merged: Ottomata; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35287
[20:56:26] Reedy: deleted
[20:56:34] thanks
[20:57:29] New patchset: Reedy; "Enable TemplateSandbox on mediawikiwiki" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/35288
[20:58:07] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/35288
[20:58:13] fatal.log might be a precise upgrade issue
[20:59:24] -rw-r--r-- 1 udp2log udp2log 181157 Nov 20 06:20 fatal.log-20121120.gz
[20:59:24] -rw-r--r-- 1 udp2log udp2log 52694 Nov 21 02:56 fatal.log-20121121.gz
[20:59:31] Last couple of days at least had content
[20:59:34] duh
[20:59:44] with the last couple of days being a week ago
[20:59:49] Yup, looks very likely
[20:59:52] notpeter: ^^ ;)
[21:00:09] in fatal.log-20121121.gz there are only mw* servers that weren't upgraded yet
[21:00:32] I realised straight after I said it, the files decrease massively, then disappear
[21:02:21] files/php/wmerrors.ini logged to udp://10.64.0.21:8420
[21:02:31] modules/applicationserver/files/php/wmerrors.ini to udp://10.64.0.21:8421
[21:03:37] notpeter: do you remember why?
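(Side note on the two wmerrors.ini lines quoted above, one pointing at udp://10.64.0.21:8420 and the other at udp://10.64.0.21:8421: this kind of drift between two copies of a config file can be caught with a small consistency check. What follows is a minimal sketch only. The helper names `udp_targets` and `find_mismatch` are hypothetical, and the file contents in the usage example are made-up stand-ins with a placeholder key, since the log shows neither the real setting name nor the real file contents.)

```python
import re

# Matches udp://host:port targets. The actual setting name in wmerrors.ini is
# not shown in the log, so this sketch only inspects the URL itself.
UDP_TARGET = re.compile(r"udp://([0-9A-Za-z.\-]+):(\d+)")


def udp_targets(text):
    """Return the set of (host, port) pairs found in a config file's text."""
    return {(host, int(port)) for host, port in UDP_TARGET.findall(text)}


def find_mismatch(configs):
    """Given {path: text}, return {path: targets} if the files disagree, else {}."""
    found = {path: udp_targets(text) for path, text in configs.items()}
    distinct = {frozenset(targets) for targets in found.values()}
    return found if len(distinct) > 1 else {}


if __name__ == "__main__":
    # Hypothetical contents: "placeholder_key" is NOT a real wmerrors setting.
    configs = {
        "files/php/wmerrors.ini":
            "placeholder_key = udp://10.64.0.21:8420\n",
        "modules/applicationserver/files/php/wmerrors.ini":
            "placeholder_key = udp://10.64.0.21:8421\n",
    }
    for path, targets in sorted(find_mismatch(configs).items()):
        print(path, sorted(targets))
```

(Run against real checkouts, the dictionary would be filled by reading the two files from disk; an empty result means both copies point at the same UDP target.)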
[21:04:09] New patchset: Dzahn; "RT-804, umask for wikidev users, overwritten by /etc/profile on <= lucid" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34223
[21:04:23] I would guess because someone changed it in only 1 of 2 places :/
[21:05:14] ok, i'm fixing it, just wondering if there was a reason for the port change
[21:05:53] I don't remember specifically
[21:05:57] PROBLEM - Puppet freshness on ms1002 is CRITICAL: Puppet has not run in the last 10 hours
[21:06:09] but, sorry that that slipped through the cracks :/
[21:06:20] no worries
[21:06:21] New patchset: Asher; "fix the udp errorlog port" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35289
[21:06:38] Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35289
[21:07:01] i guess we'll need to restart all apaches later
[21:07:51] want me to do it in 30 minutes?
[21:07:55] New patchset: MarkTraceur; "Put Parsoid bug messages into #mediawiki-parsoid" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35285
[21:07:58] New patchset: Dzahn; "RT-804, umask for wikidev users, overwritten by /etc/profile on <= lucid" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34223
[21:08:32] notpeter: how about in a few hours in case the puppets are slow
[21:10:47] New patchset: Dzahn; "RT-804, umask for wikidev users, overwritten by /etc/profile on <= lucid" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34223
[21:10:56] binasher: sure, I'll do that
[21:14:27] New patchset: Andrew Bogott; "Replace keep_up_to_date with ensure present/latest." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35293
[21:22:25] New patchset: MaxSem; "Update device detection to match MobileFrontend" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35298
[21:22:27] gerrit-wm: move it
[21:24:51] Reedy: Do you know what (kind of) password I need to use for sudo on production machines (gallium to be exact)
[21:24:59] https://gerrit.wikimedia.org/r/34226 was merged
[21:25:02] You don't
[21:25:04] but I have no idea what password to use.
[21:25:05] err
[21:25:09] nothing seems to work.
[21:25:15] oh
[21:25:24] just do sudo /etc/init.d/jenkins status
[21:25:40] You've only got limited access to certain commands
[21:25:45] oh, I see.
[21:25:54] So what do I use to e.g. fix chmod on a directory owned by jenkins?
[21:26:00] (or is that not possible with the given change)
[21:26:07] sudo -u jenkins chmod 123 foo
[21:26:08] ?
[21:26:26] ah, right "-u jenkins"
[21:26:38] thanks, that worked
[21:26:45] (sudo -u jenkins git clone .. ; in this case)
[21:26:53] same for postgres/testswarm users
[21:27:03] err, not for postgres
[21:27:38] yeah, already got it. Thanks, I thought I had passwordless sudo in general (which I would't need anyway)
[21:27:48] or passworded for that matter (with my ldap password)
[21:28:00] PROBLEM - Puppet freshness on ssl3001 is CRITICAL: Puppet has not run in the last 10 hours
[21:28:04] since "sudo " asks for a password
[21:30:37] New patchset: Dereckson; "(bug 41167) Namespace configuration for ba.wikipedia" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/35300
[21:35:53] New review: Dereckson; "Followup: Gerrit change I9a65708e" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/28505
[21:36:42] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34223
[21:44:24] anomie: Reedy: the umask thing is fixed for real now
[21:44:33] i got 0002
[21:44:41] (on fenari)
[21:45:01] mutante- Works for me too. Thanks!
[21:45:21] (that umask line in /etc/profile is just there on lucid, it is not anymore once this is precise)
[21:45:24] kk,cool
[21:49:12] mutante: hey Daniel :-] I have been asked about redirects.conf which hardcode redirects to HTTP :-] So just poking you to remember you about https://bugzilla.wikimedia.org/show_bug.cgi?id=31369 :-]
[21:49:33] mutante: probably not sooo urgent though :-]
[21:53:16] New review: Hashar; "See also https://gerrit.wikimedia.org/r/#/c/26325/ :-)" [operations/puppet] (production); V: 0 C: -1; - https://gerrit.wikimedia.org/r/35285
[21:55:55] hashar: it could start with a comment and new patch set .. that has a path conflict since September
[21:56:25] _and_ i already pointed out the issue there..it broke Apache too
[22:01:08] New patchset: Nemo bis; "(bug 42105) Restore normal bureaucrat permissions where changed without consensus" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/33390
[22:04:17] New review: Alex Monk; "It looks like PS2 didn't change anything, so issues from PS1 still apply." [operations/mediawiki-config] (master) C: -1; - https://gerrit.wikimedia.org/r/33390
[22:04:20] New patchset: Pyoungmeister; "setting all lucene nodes to latest version of lucene" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35308
[22:06:15] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35308
[22:07:00] hmm
[22:07:45] New patchset: Nemo bis; "(bug 42105) Restore normal bureaucrat permissions where changed without consensus" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/33390
[22:19:55] Could someone please review, merge and push https://gerrit.wikimedia.org/r/#/c/35176/ please?
[22:24:00] Reedy: mergin'
[22:25:45] Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35176
[22:26:05] Reedy: ok, merged on sockpuppet
[22:27:55] Thanks
[22:28:23] New patchset: Ottomata; "Granting Dario access to Hadoop" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35312
[22:28:49] Change merged: Ottomata; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35312
[22:30:50] New review: Tim Starling; "I wanted to deploy it before the cluster went down again and paged all the Americans on their long w..." [operations/debs/lucene-search-2] (master) - https://gerrit.wikimedia.org/r/34481
[22:34:13] New patchset: Andrew Bogott; "Add the install_path param for single-mode mediawiki." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35313
[22:34:14] New patchset: Andrew Bogott; "Replace keep_up_to_date with ensure present/latest." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/35293
[22:49:10] New review: MarkTraceur; "So should I wait for the wikibugs git change to be merged, then re-submit?" [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/35285
[22:49:54] Tim-away: https://gerrit.wikimedia.org/r/#/c/35289/
[22:50:03] I think it was merged, not sure the apaches were gracefulled yet
[22:50:15] I know notpeter was going to do it a bit later when puppet catches up...
[22:52:52] right
[22:58:31] New patchset: Hashar; "beta: use IP for memcached server" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/35318
[23:04:10] mutante: I'm looking for lsearch.conf and the OAI.username and OAI.password
[23:04:20] mutante: thanks, for your help
[23:04:27] preilly: it is in the private puppet repo
[23:06:51] New review: Cmcmahon; "this should fix beta labs (again), so +1 and please +2 as soon as possible" [operations/mediawiki-config] (master) C: 1; - https://gerrit.wikimedia.org/r/35318
[23:06:57] preilly: i put it in your home
[23:06:57] woosters: ping
[23:07:01] mutante: thanks
[23:07:21] wassup preilly?
[23:07:27] the rest is in ./templates/lucene/ in public puppet
[23:08:31] woosters: Can I send you a pm
[23:08:38] mutante: okay cool
[23:12:02] PROBLEM - Puppet freshness on neon is CRITICAL: Puppet has not run in the last 10 hours
[23:16:12] Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/34353
[23:19:47] paravoid: ping
[23:20:03] * preilly — yes, I know it's 1:19am Tuesday (EET)
[23:21:36] TimStarling: ping
[23:21:44] hello
[23:21:50] TimStarling: can you comment on https://rt.wikimedia.org/Ticket/Display.html?id=3963&results=732e9d335aef007ccb33ad6b30f772cf
[23:21:57] TimStarling: also did you notice my email to Rob
[23:22:13] TimStarling: I wanted to ping him again after Sandy and remind him
[23:24:35] TimStarling: I also noticed the tests in https://gerrit.wikimedia.org/r/#/c/34852/ and it all looked good
[23:25:20] preilly: just came home from the airport
[23:25:32] do you still want a VM on labs with something similar to my test setup?
[23:25:52] TimStarling: no need
[23:27:14] paravoid: oh nice fun trip?
[23:27:34] paravoid: if you have the cycles can you comment on https://rt.wikimedia.org/Ticket/Display.html?id=3963&results=732e9d335aef007ccb33ad6b30f772cf
[23:28:06] TimStarling: I've got everything on my local VM for lsearch so I really don't think that labs is needed right now
[23:28:46] ok
[23:29:46] it's pretty easy to set up, at least at the level of complexity I used for my tests
[23:30:04] TimStarling: yeah totally
[23:30:40] TimStarling: can I send you a pm?
[23:30:45] yes
[23:32:15] preilly: I think this you "moving more to an architect role" should be communicated better
[23:32:47] paravoid: in what sense?
[23:32:50] New patchset: Dereckson; "(bug 41992) Set timezones for Wikivoyage wikis" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/35330
[23:33:01] is it announced anywhere and I missed it?
[23:33:10] I'm not disputing that it has happened
[23:33:18] just saying that people should be aware of it :)
[23:34:18] paravoid: yeah, I agree
[23:34:35] paravoid: there hasn't really been any official announcement
[23:37:43] access requests are being discussed at the ops meetings
[23:37:46] next one is this coming Monday
[23:38:09] not that discussing it in RT is bad
[23:38:13] paravoid: I asked for process guidance from woosters and he told me to create an RT ticket
[23:38:23] yes
[23:38:26] paravoid: so that is what I did
[23:38:40] just saying that it's unlikely you'll get an answer before this coming Monday
[23:38:49] not that you did wrong by filing the RT
[23:39:23] trying to avoid the frustration over the delay :)
[23:40:34] paravoid: ah, I see
[23:40:37] paravoid: thanks for that
[23:40:55] paravoid: I can just be like the US Congress and lame duck until next Monday
[23:43:12] as for the request, I'm generally in favor of this, although I'd love to know more about this role and your new responsibilities
[23:43:16] New patchset: Dereckson; "(bug 41322) Namespace configuration on be.wikisource" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/35331
[23:45:17] paravoid: okay that makes sense — it's basically me being a performance engineer
[23:45:43] paravoid: working on analysis and deployment of tweaks, etc.
[23:45:54] paravoid: as well as hopefully bigger things like HHVM
[23:48:10] mutante: what is ticket 2901
[23:48:25] mutante: I see the following: No permission to view ticket
[23:49:42] your request for sudo access on mobile varnish boxes
[23:49:51] paravoid: ah, I see
[23:50:01] paravoid: I wonder why I don't have access to view that ticket
[23:50:03] well, tfinc's request actually
[23:50:29] hmm ?
[23:50:45] tfinc: we are talking about RT ticket 2901
[23:50:47] sorry for the notify, nothing important :)
[23:50:55] tfinc: sudo access on mobile varnish boxes
[23:51:12] ahh ok
[23:51:14] tfinc: as a parallel to ticket https://rt.wikimedia.org/Ticket/Display.html?id=3963&results=230f619d93d6ce01ccb9a905bf32ca6a
[23:51:17] which is kind of obsoleted if this new request goes through
[23:51:24] paravoid: indeed
[23:51:28] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/35331
[23:51:38] woosters: any idea why I can't view ticket number 2901?
[23:52:49] is it in the procurement queue?
[23:54:35] preilly: an older access request for sudo on mobile varnish boxes
[23:54:58] mutante: okay got it
[23:55:07] mutante: for some reason I can't view that ticket