[00:00:57] hoo, I thought those three dots were its attempts to reach the next hop failing. I am fairly certain that it's not masking 64 hops.
[00:01:54] let me search for a nice explanation online
[00:02:00] PROBLEM - gitblit.wikimedia.org on antimony is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Server Error - 1703 bytes in 6.872 second response time
[00:02:30] Cyberpower678: https://serverfault.com/a/334039
[00:02:56] the Wikimedia nodes coming after that simply aren't responding
[00:04:49] hoo, so good thing or bad thing?
[00:05:09] Cyberpower678: As I said, that's totally normal behavior
[00:05:15] oh.
[00:05:34] your traceroute looks totally fine
[00:05:57] Ok. So traceroute is ok. So why are milkipedia sites taking significantly longer to load? :/
[00:06:13] milkipedia?
[00:06:21] what's that? Wikia?
[00:06:23] :D
[00:06:29] Wikimedia
[00:06:37] LMFAO
[00:06:46] By far the best autocorrect ever.
[00:07:02] hilarious :P
[00:07:30] Anyway... given the description of your problems before, that seemed more like you have problems with bits than with the actual html pages
[00:08:22] It took 5 minutes to load the Application Management page in preferences.
[00:08:38] Firefox said it was transferring data from bits.wikimedia.org
[00:08:46] for 5 minutes
[00:08:52] icky
[00:09:19] (CR) Reedy: "166 or so apaches apparently online in tampa (based on API, non api and bits). Ganglia suggests that's out of 243 machines online." [operations/puppet] - https://gerrit.wikimedia.org/r/108070 (owner: Chad)
[00:10:10] Block logs take 3-6 seconds.
[00:10:24] Usually it's a half of a second to load.
[00:10:32] Cyberpower678: ping bits-lb.eqiad.wikimedia.org and see whether you have packet losses
[00:10:40] hoo, how?
[00:11:00] RECOVERY - gitblit.wikimedia.org on antimony is OK: HTTP OK: HTTP/1.1 200 OK - 229854 bytes in 7.035 second response time
[00:11:04] nvm
[00:11:57] Alright, lightning deploy time
[00:12:50] PROBLEM - Puppet freshness on labstore1001 is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 08:45:44 AM UTC
[00:13:18] hoo, How many pings will it do?
[00:13:42] Cyberpower678: Until you stop it using Ctrl+C (usually)
[00:14:40] hoo, http://pastebin.com/mzXCY5mw
[00:15:37] "0.5% packet loss " that's not as good as it can get, but I doubt that's what you were hitting
[00:16:54] hoo, any other ideas?
[00:17:35] Try turning the internet off and on again
[00:18:04] -.-
[00:18:13] I'm not that clueless. :p
[00:18:20] I've already tried that.
[00:18:29] As well as rebooting the computer.
[00:18:49] Connection speeds, as pointed out, are at peak levels.
[00:21:02] !log catrope synchronized php-1.23wmf16/extensions/VisualEditor/modules/ve-mw/ui/dialogs/ve.ui.MWMediaInsertDialog.js 'https://gerrit.wikimedia.org/r/#/c/116211/'
[00:21:10] Logged the message, Master
[00:21:38] (CR) Catrope: [C: 2] Fix VisualEditor/Parsoid on private wikis [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/116662 (owner: Catrope)
[00:21:47] (Merged) jenkins-bot: Fix VisualEditor/Parsoid on private wikis [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/116662 (owner: Catrope)
[00:25:03] (CR) GWicke: Fix VisualEditor/Parsoid on private wikis (1 comment) [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/116662 (owner: Catrope)
[00:25:48] !log catrope synchronized wmf-config/CommonSettings.php 'Fix VE on officewiki'
[00:25:51] hoo, I just received an incomplete page through a timeout after 5 minutes of attempting to load my watchlist
[00:25:57] Logged the message, Master
[00:27:57] https://en.wikipedia.org/w/index.php?title=Saint_Joseph%27s_University&action=history 11 seconds
[00:28:09] what exactly takes 11s
[00:28:14] I
mean which http request
[00:28:18] all of them
[00:28:24] simultaneously
[00:28:41] https://en.wikipedia.org/w/index.php?title=Encyclopedia_Dramatica&action=history 23 seconds
[00:29:09] Reedy: Even Wikipedia would have been a better answer :D
[00:29:33] computers
[00:29:37] hoo, From the time I click that link to the time something shows up on my screen.
[00:29:52] till something shows? :|
[00:30:11] Reedy, I can't be more specific atm
[00:30:56] it's quite difficult to help you without better information
[00:31:19] Reedy, how do I get better information? It helps to know where to look.
[00:31:26] What browser are you using?
[00:31:31] Safari on Mac
[00:32:26] https://developer.apple.com/library/safari/documentation/AppleApplications/Conceptual/Safari_Developer_Guide/Instruments/Instruments.html#//apple_ref/doc/uid/TP40007874-CH4-SW1
[00:33:41] i'd just try another browser before i tried troubleshooting deeply within a single browser personally...
[00:34:03] (but usually it's crazy addons that cause issues, safari isn't known for that)
[00:34:38] RobH, Other browsers are having the same issues. Firefox on windows is slow to load wikipedia too
[00:34:45] ahh, then no clue
[00:34:49] do what reedy says =]
[00:35:10] Reedy, is the web inspector an add-on?
[00:35:30] I have nfi
[00:35:47] No
[00:35:49] "Web Inspector is an open source web development tool built into Safari"
[00:36:59] Reedy, except I can't find the damn thing.
[00:37:08] I don't use a mac
[00:37:11] Or safari
[00:39:56] Reedy, found it
[00:40:38] You've got to enable it in advanced preferences because apparently no one would ever want development tools in their browser
[00:42:20] index.php en.wikipedia.org resource-type-document 200 false 99945 15803 3.183137893676758 6.03571891784668 1393893692.505842
[00:42:34] 6.03571891784668s
[00:44:03] a bit of context would help
[00:44:04] (CR) Dzahn: [C: 2] remove 'zhen' from site.pp,dsh,dhcpd [operations/puppet] - https://gerrit.wikimedia.org/r/116659 (owner: Dzahn)
[00:44:12] I can guess most, but well :P
[00:44:33] load.php bits.wikimedia.org resource-type-stylesheet 200 false 140985 34677 0.6567199230194092 25.39751410484314 1393893839.557539
[00:45:43] Name Domain Type Status Cached Size Transferred Latency Duration Timeline
[00:45:50] load.php bits.wikimedia.org resource-type-stylesheet 200 false 140985 34677 0.6567199230194092 25.39751410484314 1393893839.557539
[00:46:16] the duration is... wow
[00:47:50] hoo, does that help?
[00:48:10] Do you have a load of useless crap enabled?
[00:48:16] gadgets, user js...
[00:48:30] does it change being logged in vs logged out?
[00:48:39] Reedy, just what I've had enabled for the past year.
[00:48:40] mutante: that's the question ;)
[00:48:49] Hasn't affected performance.
[00:49:21] mutante, I can't access preferences logged out.
[00:49:34] That load.php entry comes from preferences.
[00:49:53] !log zhen - disable puppet,revoke puppet cert,delete salt key,delete stored configs, disable monitoring...
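An aside on the Web Inspector rows pasted above: they follow the column header quoted at 00:45:43 (Name Domain Type Status Cached Size Transferred Latency Duration Timeline). The following is a minimal sketch for comparing such rows, assuming the fields are whitespace-separated in exactly that order and that Latency and Duration are in seconds (the unit is an assumption, though it matches the "6.03571891784668s" remark). The `Row`, `parse_row` and `slowest` names are illustrative only and are not part of Safari or any tool.

```python
from dataclasses import dataclass


@dataclass
class Row:
    name: str
    domain: str
    resource_type: str
    status: int
    cached: bool
    size: int          # bytes, uncompressed (assumed)
    transferred: int   # bytes on the wire (assumed)
    latency: float     # seconds until the first byte (assumed unit)
    duration: float    # seconds spent downloading (assumed unit)
    timeline: float    # looks like a Unix timestamp


def parse_row(line: str) -> Row:
    """Parse one pasted Web Inspector row into a Row."""
    parts = line.split()
    if len(parts) != 10:
        raise ValueError(f"expected 10 fields, got {len(parts)}")
    name, domain, rtype, status, cached, size, xfer, lat, dur, ts = parts
    return Row(name, domain, rtype, int(status), cached == "true",
               int(size), int(xfer), float(lat), float(dur), float(ts))


def slowest(rows):
    """Return the row with the largest latency + duration."""
    return max(rows, key=lambda r: r.latency + r.duration)


# The two distinct rows pasted in the channel.
rows = [
    parse_row("index.php en.wikipedia.org resource-type-document 200 false "
              "99945 15803 3.183137893676758 6.03571891784668 1393893692.505842"),
    parse_row("load.php bits.wikimedia.org resource-type-stylesheet 200 false "
              "140985 34677 0.6567199230194092 25.39751410484314 1393893839.557539"),
]
worst = slowest(rows)
print(worst.name, worst.domain, round(worst.latency + worst.duration, 1))
```

Summing latency and duration separates "slow to start responding" from "slow to transfer": here the bits.wikimedia.org stylesheet starts quickly but takes far longer to download than the HTML document.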
[00:50:02] Logged the message, Master
[00:50:09] Cyberpower678: ah, understand
[00:50:49] Special:Watchlist en.wikipedia.org resource-type-document 200 false 230319 34417 9.361343145370483 10.17725396156311 1393894224.499453
[00:51:28] Cyberpower678: next step would be comparing to the same computer and user, but connected via a different provider (if that is possible)
[00:51:37] Cyberbot_II en.wikipedia.org resource-type-document 200 false 99589 15749 1.6356449127197266 6.37595796585083 1393894275.096077
[00:52:16] load.php bits.wikimedia.org resource-type-stylesheet 200 false 150438 37766 1.9066860675811768 3.514871120452881 1393894318.734478
[00:52:34] Another computer on the same network might work also, if you can't just hop onto another connection
[00:52:44] Not much of a significant difference. Time until something appears is still the same. Logged out
[00:53:40] I can't login anymore.
[00:54:02] nvm
[00:54:04] Cyberbot_II en.wikipedia.org resource-type-document 200 false 102051 15950 27.839799165725708 1.7688369750976562 1393894424.443726
[00:54:20] Latency of 27.839799165725708 seconds when logging in.
[00:54:56] hoo, mutante ^
[00:55:05] Cyberpower678: If you have another computer or a tablet or whatever around, you might want to try that
[00:55:41] Takes forever on my iPhone
[00:56:00] Just to tell me that I'm not even logged in.
[00:56:27] Login page loaded in 19 seconds
[00:56:51] Logs in slightly faster though.
[00:58:09] Cyberpower678: Is anyone besides you having an issue?
[00:58:10] Cyberpower678: Is that using your local internet connection or going via some mobile network?
[00:58:20] I'm not sure shitty local wifi is an operations issue.
[00:58:20] Both
[00:58:46] No one else is on Wikipedia.
[00:59:50] hoo, On my LTE connection it took 10 seconds. That was slightly faster, but my LTE is a faster connection in general than the WiFi I have.
[01:02:31] Well I have to go.
[01:10:08] !log shutting down 'zhen' permanently
[01:10:19] Logged the message, Master
[01:20:38] PROBLEM - Puppet freshness on labstore4 is CRITICAL: Last successful Puppet run was Tue 25 Feb 2014 06:33:37 PM UTC
[01:32:45] !log catrope synchronized php-1.23wmf16/resources/oojs-ui/oojs-ui.js 'oojs-ui fixes'
[01:32:56] Logged the message, Master
[02:25:05] !log schema change bug 31397 afl_namespace, slave by slave
[02:25:15] Logged the message, Master
[02:28:16] !log LocalisationUpdate completed (1.23wmf15) at 2014-03-04 02:28:16+00:00
[02:28:24] Logged the message, Master
[02:41:35] (CR) Hoo man: [C: 2] "Trivial one" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/115992 (owner: BryanDavis)
[02:41:52] (Merged) jenkins-bot: Fix documentation of `--home` option for activeMWVersions.php [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/115992 (owner: BryanDavis)
[02:43:06] hoo: make sure you at least pull it onto tin
[02:43:16] Reedy: ;)
[02:43:17] !log LocalisationUpdate completed (1.23wmf16) at 2014-03-04 02:43:17+00:00
[02:43:24] Logged the message, Master
[02:43:42] !log hoo synchronized multiversion/activeMWVersions.php 'Comment only change {{Gerrit'
[02:43:49] Logged the message, Master
[02:43:55] well :P
[02:45:05] You don't want RoanKattouw_away chasing you about it
[02:45:51] ok, there we go... :P
[02:46:01] !log hoo synchronized multiversion/activeMWVersions.php 'Comment only change {{Gerrit|I5e68518}}'
[02:46:08] Logged the message, Master
[02:46:12] if anyone tries to chase hoo, he can simply lock the chaser xD
[02:46:16] I knew it was smart to start off with a file outside the normal tree :P
[02:46:45] Time to open a betting pool
[02:47:09] for what?
[02:47:36] Till you cause your first outage ;)
[02:47:47] /downtime
[02:48:06] :P
[02:48:14] Shall I delete the first log entry?
[02:48:17] The broken one?
[02:48:29] I wouldn't worry about it
[02:48:39] wikitech won't run out of space :p
[02:48:47] might not :P
[02:49:26] Reedy: What's the procedure to know that nobody else is deploying (except for scanning tin's process list like crazy)?
[02:50:04] Scheduled stuff is https://wikitech.wikimedia.org/wiki/Deployments
[02:50:12] Yeah, I'm not that bad :P
[02:50:32] But you don't schedule every "Allow foowiki sysops to destroy the world"
[02:50:33] Beyond that, there isn't really
[02:51:27] Common sense mostly
[02:51:50] Being in here
[02:52:20] I mostly checked that nobody had merged any deployment stuff recently and that nobody was doing suspicious stuff on tin
[02:53:27] I wonder if gerrit should report action on gerrit on wmf/* branches in here
[02:53:46] Probably not needed in most cases. solved by being in -dev
[02:54:12] that sounds like a somewhat smart idea... sometimes people flood -dev so hard ... :P
[02:54:32] Like that guy adding COPYING to everything :D
[02:56:08] If you go back to living on CET rather than PST, you shouldn't conflict ;)
[02:58:51] Reedy: I'm on CET (physically), just not living it... I have to find a rhythm again, I guess...
[02:59:42] ;)
[02:59:54] To which extent, I really shouldn't be (still) here
[03:01:56] I guess this is a good moment to say good night
[03:13:38] PROBLEM - Puppet freshness on labstore1001 is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 08:45:44 AM UTC
[03:29:53] !log LocalisationUpdate ResourceLoader cache refresh completed at 2014-03-04 03:29:53+00:00
[03:30:01] Logged the message, Master
[04:18:50] (PS1) Greg Grossmeier: Log length of l10nupdate [operations/puppet] - https://gerrit.wikimedia.org/r/116718
[04:21:25] (CR) Greg Grossmeier: "Haven't tested on production, but I did test my syntax locally with a stub shell script: http://paste.debian.net/85186/" [operations/puppet] - https://gerrit.wikimedia.org/r/116718 (owner: Greg Grossmeier)
[04:21:38] PROBLEM - Puppet freshness on labstore4 is CRITICAL: Last successful Puppet run was Tue 25 Feb 2014 06:33:37 PM UTC
[04:22:36] bd808: ^^ step 1 of my yet-to-be-report bug report
[04:22:42] +ed
[04:23:15] and.. sleep time
[04:34:35] (CR) BryanDavis: "You should also add:" [operations/puppet] - https://gerrit.wikimedia.org/r/116718 (owner: Greg Grossmeier)
[05:24:22] (PS1) Ori.livneh: Enable GeoIP cookie on Labs [operations/puppet] - https://gerrit.wikimedia.org/r/116719
[05:26:06] (PS2) Ori.livneh: Enable GeoIP cookie on Labs [operations/puppet] - https://gerrit.wikimedia.org/r/116719
[05:27:20] (CR) Ori.livneh: "hashar: please +1 if this is OK." [operations/puppet] - https://gerrit.wikimedia.org/r/116719 (owner: Ori.livneh)
[06:14:38] PROBLEM - Puppet freshness on labstore1001 is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 08:45:44 AM UTC
[06:27:18] PROBLEM - udp2log log age for emery on emery is CRITICAL: CRITICAL: log files /a/log/webrequest/packet-loss.log, have not been written in a critical amount of time. For most logs, this is 4 hours. For slow logs, this is 4 days.
[06:29:18] RECOVERY - udp2log log age for emery on emery is OK: OK: all log files active
[07:22:38] PROBLEM - Puppet freshness on labstore4 is CRITICAL: Last successful Puppet run was Tue 25 Feb 2014 06:33:37 PM UTC
[07:23:17] (CR) Matanya: [C: 1] remove "zhen" public IP, decom [operations/dns] - https://gerrit.wikimedia.org/r/116658 (owner: Dzahn)
[08:04:59] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000
[08:48:51] (PS3) Ori.livneh: geoip.inc.vcl: don't increment loop counter twice [operations/puppet] - https://gerrit.wikimedia.org/r/116469
[08:57:52] (CR) Faidon Liambotis: [C: 2] geoip.inc.vcl: don't increment loop counter twice [operations/puppet] - https://gerrit.wikimedia.org/r/116469 (owner: Ori.livneh)
[08:58:16] (CR) Faidon Liambotis: [C: 2] Enable GeoIP cookie on Labs [operations/puppet] - https://gerrit.wikimedia.org/r/116719 (owner: Ori.livneh)
[09:01:31] paravoid: thanks!
[09:02:58] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: reqstats.5xx [warn=250.000
[09:15:38] PROBLEM - Puppet freshness on labstore1001 is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 08:45:44 AM UTC
[09:22:30] (CR) Alexandros Kosiaris: [C: 2] contint: slaves now have openjdk-{6,7}-jdk [operations/puppet] - https://gerrit.wikimedia.org/r/114619 (owner: Hashar)
[09:37:15] (CR) Faidon Liambotis: "What's the rationale for this?" [operations/puppet] - https://gerrit.wikimedia.org/r/116019 (owner: Hoo man)
[09:51:30] paravoid: around? https://gerrit.wikimedia.org/r/116019 =
[09:51:31] *?
[09:51:53] yes?
[09:52:21] I think the rationale is pretty clear: Let's only give everyone the access they need
[09:52:37] eg. these people don't need access to the database and don't need to be able to run maint.
scripts
[09:52:46] so I don't see why they should be able to do that
[09:54:00] we shouldn't have given these people access to the bastion in the first place
[09:55:07] well, that's obviously too late
[09:56:44] Of course the bastions are already rather high risks as they have access to the internal network and stuff... but that's by far lower than just giving *everybody* full access to the Database and logs and whatever
[09:57:10] it's not about these two users, but a more general thing
[09:57:39] I only took them because I was sure they really don't need any other machines than bastion and their release server
[09:59:11] our access control sucks
[09:59:28] so no disagreement there
[09:59:38] however, a group for bastion sounds wrong to me
[09:59:47] bastion is the means to accessing some resource, not the end
[10:00:22] paravoid: It's not the end, the group is meant for people who need only (few) specific machines
[10:00:44] so that we let them access the bastions and the boxes they need (via site.pp)
[10:03:01] no, that's not the right way to handle access control
[10:03:12] create a class admins::releases if you want
[10:03:15] and add them there
[10:03:41] that way, we know why they have access and why a machine has admins::releases in it
[10:03:53] and if you're feeling like it, merge that into manifests/role/releases.pp as well
[10:04:11] which has the same information of who is a member of the release group there
[10:04:31] the way we should do this is having a different group per *purpose*, not per e.g. server
[10:04:49] yeah, that idea is sane
[10:04:50] btw, https://gerrit.wikimedia.org/r/#/c/107848/ :)
[10:05:04] but I'm not a fan of putting users into roles
[10:05:53] why not?
[10:06:35] cause I like to have the *real* list of who has access to what at a single (or multiple) well known places
[10:06:56] why?
[10:07:13] you will have that information, but it will be grouped into roles
[10:07:26] with your changeset, how can I know why markus has access to the bastion?
[10:07:41] Ok, my changeset is crap, agreed
[10:07:45] what if we change the way we upload releases, for instance (e.g. jenkins does it, as it has been floated before)
[10:07:52] no, it's a step in the right direction
[10:07:58] I just don't agree fully with it :)
[10:08:26] so, if we remove the access to caesium, will we remove mglaser from admins::bastion?
[10:08:45] the RT could help there, but what if in the meantime mglaser gets access to another box
[10:08:51] I just would like to have all these lists in one place... so an admin::releases in admins.pp is fine with me, also using that group in the releases role sounds sane
[10:09:11] while having mglaser in the releases group, and then saying that the releases group has access to a) caesium, b) bastion
[10:09:13] but specifying a list of users with access *in* the releases role sounds scary
[10:09:14] is the right way to do this
[10:09:23] oh, yes, I agree with that
[10:09:32] I was suggesting the opposite actually
[10:09:40] the releases role reusing admins::releases, not the other way around
[10:09:50] Oh, I got that wrong then
[10:10:25] yeah I wasn't clear enough, sorry about that
[10:10:51] the duplication right now is a bit scary
[10:11:16] btw, note from my admins.pp overhaul this bit too:
[10:11:17] - Tie Unix groups with our arbitrary class groupings; grouping users together in a class now means grouping them in a Unix group too.
[10:11:59] so as to have multiple levels of protection
[10:12:15] right now we only have 'wikidev' as unix group right?
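An aside on the "one group per purpose, not per server" idea discussed above: a minimal sketch of that model in plain Python, not Puppet. It assumes a single table that maps each purpose to its members and to the hosts that purpose needs. The `releases` entry (mglaser, caesium, bastion) comes from the discussion; the `analytics` entry, its user and host, and all function names are hypothetical and exist only for illustration.

```python
# Purpose-based access table. Only the "releases" entry is taken from the
# discussion; "analytics", "alice" and "stat-host" are hypothetical.
PURPOSE_GROUPS = {
    "releases": {
        "members": {"mglaser"},
        "hosts": {"caesium", "bastion"},
    },
    "analytics": {
        "members": {"alice"},
        "hosts": {"stat-host", "bastion"},
    },
}


def hosts_for(user, groups=PURPOSE_GROUPS):
    """All hosts a user can reach, derived only from group membership."""
    reachable = set()
    for group in groups.values():
        if user in group["members"]:
            reachable |= group["hosts"]
    return reachable


def reasons(user, host, groups=PURPOSE_GROUPS):
    """Which purposes explain why this user has access to this host."""
    return sorted(
        purpose
        for purpose, group in groups.items()
        if user in group["members"] and host in group["hosts"]
    )


print(hosts_for("mglaser"))
print(reasons("mglaser", "bastion"))
```

With this shape, the question "why does this user have access to the bastion?" has a direct answer, and dropping caesium from the `releases` hosts changes access for every member at once without touching a separate bastion list.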
[10:12:17] so even if a mistake happens and a user gets access to a box they shouldn't, unix permissions shouldn't let them run code as wikidev
[10:12:23] Apart from mwdeploy and stuff like that
[10:12:25] or read the database password
[10:12:44] wikidev is the primary gid for all users, yes :((
[10:13:08] what is wikidev?
[10:13:10] yeah, I thought about restricting some of the password scripts, but as long as everyone from analytics intern to root is a wikidev, that's hardly possible
[10:13:11] you seem interested in this; I'd love it if you could review my WIP changeset and see if it makes sense to you
[10:13:36] I like security stuffs, yeah :P
[10:13:50] But I guess I should really go and do some of my real work now... might be back later
[10:14:46] no worries, the patchset has been there for almost a month now
[10:16:38] :)
[10:23:38] PROBLEM - Puppet freshness on labstore4 is CRITICAL: Last successful Puppet run was Tue 25 Feb 2014 06:33:37 PM UTC
[10:29:30] and it is time to start thinking about merging it paravoid :)
[10:29:39] it's not ready yet
[10:30:01] agreed, but you don't seem to have time to work on it
[11:03:11] (PS2) Nemo bis: Enable autopatrolled group on itwikiquote [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/114945 (owner: Ricordisamoa)
[11:04:32] (CR) Nemo bis: [C: 1] "Ouch, I missed the question there: you should have filed a bug (if you file one now, better late than never). Anyway, consensus is clear a" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/114945 (owner: Ricordisamoa)
[11:12:06] !log Jenkins web service unavailable, investigating. Builds should not be affected though since they dont use the web service (but gearman)
[11:12:14] Logged the message, Master
[11:12:48] (CR) Nemo bis: "I'm told (on IRC) that I should comment on gerrit..."
(2 comments) [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/114656 (owner: Hoo man)
[11:19:03] !log Restarting Jenkins, it is stalled ....
[11:19:10] Logged the message, Master
[11:22:08] PROBLEM - jenkins_service_running on gallium is CRITICAL: PROCS CRITICAL: 2 processes with regex args ^/usr/bin/java .*-jar /usr/share/jenkins/jenkins.war
[11:24:40] (CR) Ricordisamoa: ""Why file a bug when you can file a patch?"" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/114945 (owner: Ricordisamoa)
[11:30:29] (CR) Nemo bis: "Says who?" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/114945 (owner: Ricordisamoa)
[11:32:21] Do we really need bugs if there's consensus etc?
[11:32:48] Reedy: You once said we do :P
[11:54:20] (Restored) Hoo man: Simplify the AbuseFilter configuration a little [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/114656 (owner: Hoo man)
[12:04:21] (PS2) Hoo man: Simplify the AbuseFilter configuration a little [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/114656
[12:05:07] (CR) Hoo man: "Addressed Nemo bis's comments" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/114656 (owner: Hoo man)
[12:13:49] (CR) Ricordisamoa: "Me." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/114945 (owner: Ricordisamoa)
[12:16:11] (CR) Hoo man: [C: -1] "Please open a bug, there really should be one for each change." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/114945 (owner: Ricordisamoa)
[12:16:38] PROBLEM - Puppet freshness on labstore1001 is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 08:45:44 AM UTC
[12:54:32] apergos: https://commons.wikimedia.org/wiki/File:Happiness_Wrapped_in_a_Blanket_13.jpg ?
[12:54:43] there are quite a few of those
[12:54:52] around 6 recently
[12:55:23] matanya: https://bugzilla.wikimedia.org/show_bug.cgi?id=32551
[12:56:25] thanks hoo
[13:24:38] PROBLEM - Puppet freshness on labstore4 is CRITICAL: Last successful Puppet run was Tue 25 Feb 2014 06:33:37 PM UTC
[13:27:17] * hoo looks at manifests/role/releases.pp
[13:27:21] :/
[13:31:21] (PS3) Ricordisamoa: Enable autopatrolled group on itwikiquote [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/114945
[13:34:13] yay for dynamic scoping all over the place... but ok :P
[13:49:56] (PS4) Hoo man: Create an autopatrolled group on itwikiquote [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/114945 (owner: Ricordisamoa)
[13:50:21] (CR) Hoo man: [C: 2] Create an autopatrolled group on itwikiquote [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/114945 (owner: Ricordisamoa)
[13:50:29] (Merged) jenkins-bot: Create an autopatrolled group on itwikiquote [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/114945 (owner: Ricordisamoa)
[13:51:52] !log hoo synchronized wmf-config/InitialiseSettings.php '{{Gerrit|I62b3288}}'
[13:52:00] Logged the message, Master
[14:16:54] (PS1) Manybubbles: Require config file for staring Elasticsearch [operations/puppet] - https://gerrit.wikimedia.org/r/116743
[14:23:09] (CR) Matanya: [C: 1] Require config file for staring Elasticsearch [operations/puppet] - https://gerrit.wikimedia.org/r/116743 (owner: Manybubbles)
[14:42:32] (CR) Ottomata: [C: 2 V: 2] Require config file for staring Elasticsearch [operations/puppet] - https://gerrit.wikimedia.org/r/116743 (owner: Manybubbles)
[14:52:53] !log stopping puppet on gallium to play with apache configuration
[14:53:02] Logged the message, Master
[14:53:43] !log switched wikitech to allow eqiad access, turned off pmtpa instance creation
[14:53:51] Logged the message, Master
[14:55:36] andrewbogott: eqiad labs
looks really good
[14:55:49] i even logged in to there yesterday
[14:55:49] good! Hope it holds up today
[14:56:11] what is happening today?
[14:56:13] matanya: starting today you should only access it via wikitech. Things will get a bit out of sync otherwise.
[14:56:23] matanya, switched on in wikitech so everyone can create instances there.
[14:56:59] so anything i currently do via tampa will break? or is there anything i do that needs to change?
[14:59:15] um...
[14:59:34] matanya: wikitech is in tampa.
[14:59:42] (PS2) Greg Grossmeier: Log length of l10nupdate to SAL and Graphite [operations/puppet] - https://gerrit.wikimedia.org/r/116718
[15:00:00] But, for a while some of us were using virt1000 (in eqiad) as the eqiad labs gui. That's all I was referring to.
[15:00:28] ok, but in regards to user stuff, any change projected?
[15:01:57] matanya: https://wikitech.wikimedia.org/wiki/Labs_Eqiad_Migration_Howto
[15:02:30] thanks
[15:03:20] matanya, speaking of which, are you attached to the site-testing instance in the 'puppet' project?
[15:03:31] If you're using it currently then I'll move it to eqiad; if not, I'll just delete it.
[15:03:55] since akosiaris merged the site.pp change it can now go :)
[15:04:05] great
[15:04:55] andrewbogott: same for puppet-svn
[15:04:59] was merged too
[15:05:14] ok, I think you deleted that one already anyway
[15:05:21] i have?
[15:05:33] !log reenabling puppet on gallium
[15:05:35] hm… 'puppet' project looks empty to me. Are you seeing an instance there still?
[15:05:41] Logged the message, Master
[15:05:56] yes, i see an instance, but it is empty
[15:06:40] i see two in fact, but both are marked deleted, i wonder why they show up
[15:06:53] what url are you looking at?
[15:07:23] https://wikitech.wikimedia.org/wiki/Nova_Resource:I-000009e0.pmtpa.wmflabs
[15:07:31] and https://wikitech.wikimedia.org/wiki/Nova_Resource:I-000009b8.pmtpa.wmflabs
[15:08:28] Oh, ok.
Just so long as you don't see them on the 'manage instances' page I'm not too worried.
[15:09:38] but i do need the etherpad instances moved
[15:15:38] matanya: OK, best to create bugzilla tickets for that.
[15:15:44] i will
[15:15:53] thx
[15:16:00] (PS1) Hashar: contint: do not cache api/json calls [operations/puppet] - https://gerrit.wikimedia.org/r/116748
[15:16:46] andrewbogott: under toollabs?
[15:17:08] there's a link on the above page that will file a bug properly. We have a tracking bug for migration tasks.
[15:17:38] PROBLEM - Puppet freshness on labstore1001 is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 08:45:44 AM UTC
[15:18:27] (CR) Hashar: [C: 1 V: 2] "This solve jenkins job builder throwing error:" [operations/puppet] - https://gerrit.wikimedia.org/r/116748 (owner: Hashar)
[15:19:30] https://bugzilla.wikimedia.org/show_bug.cgi?id=62207
[15:28:27] as I understand it, the daily backups of Wikidata have failed for several days
[15:28:58] can someone tell me what the issue is?
[15:29:14] matanya: do you happen to know what if anything is in the /data/projects dir in etherpad? Do you want to do a search and rescue or should I copy over everything?
[15:29:46] let me take a look
[15:32:23] apergos: ^^ see at XX:28
[15:33:05] andrewbogott: it wasn't created by me
[15:33:18] what daily backups?
[15:33:22] matanya: ok
[15:33:46] i only need the puppet/webserver of that
[15:33:47] apergos: /me shrugs
[15:34:17] matanya: do you know if anyone is using/cares about that project?
[15:34:35] GerardM-: what backups, specifically?
[15:34:51] greg-g: Dumps
[15:35:07] andrewbogott: i know ^d did in the past
[15:35:13] need to check with him
[15:35:32] what dumps? sorry but we don't have daily xml dumps of anything and that's what I would be aware of
[15:35:49] matanya: ok. Mind making a note to that effect on the progress page?
https://wikitech.wikimedia.org/wiki/Labs_Eqiad_Migration/Progress
[15:36:05] i guess a short mail to members will reveal the real status
[15:36:18] * greg-g steps out of this conversation and lets GerardM- and apergos figure it out :)
[15:37:26] done
[15:37:42] GerardM-: is there a bug report?
[15:38:13] matanya: ty
[15:38:55] Last dump was 1 hour and 20 minutes ago for Wikidata.
[15:39:08] greg-g: the daily backups have failed several times (this was confirmed by Lydia)
[15:39:43] GerardM-: please direct your questions/statements to apergos :)
[15:39:58] what daily backups? I'm trying to understand which these are
[15:40:53] since when do we have backups?
[15:41:29] mark; database dumps
[15:41:47] !log upgraded php5 packages, php5-wmerrors package and libmemcached11 on mw1017. This will make puppet and the corresponding icinga check unhappy.
[15:41:52] mysql snapshots?
[15:41:55] Logged the message, Master
[15:41:57] mark: https://dumps.wikimedia.org/ fwiw :p
[15:42:09] those are not daily
[15:42:30] blurg not https for that
[15:43:08] we don't do any daily dumps; as a side project we do experimental adds/changes of the wikis but those are not guaranteed to run or even have reasonable content
[15:43:32] (huh re no tls for dumps.wikimedia)
[15:44:05] but the regular dump runs are not daily and never have been
[15:44:07] akosiaris: Can we check in about the state of your labs project? Either I failed to migrate some instances to eqiad or you deleted some there...?
[15:44:19] i deleted
[15:44:45] andrewbogott: in general the migration was rather uneventful
[15:44:52] ok. So are you happy with the state of eqiad there? Shall I mark that project as finished?
[15:44:58] they also have run successfully for as far back as we keep them...
[15:45:02] yes :-)
[15:45:08] (That means that the pmtpa instances may get deleted sometime)
[15:45:20] ok, great!
two down
[15:45:21] the sooner the better
[15:46:28] Well, I'll delete them right now if you don't mind, still having some space issues in pmtpa
[15:46:29] andrewbogott: i think you can mark the puppet one too
[15:46:36] matanya: yep, done.
[15:46:42] thanks
[15:46:53] matanya: there's also the 'puppet-cleanup' project which I'm unsure about
[15:47:04] it is mostly abandoned
[15:47:34]