[00:00:39] mysql's 'utf8' wasn't real utf8 until a fairly recent version (5.1 iirc) [00:00:49] (I ask because I am currently trying to resolve https://bugzilla.wikimedia.org/show_bug.cgi?id=53751) [00:00:49] (and was surprised at our default) [00:00:50] it was restricted to the basic multilingual plane [00:01:24] which is a problem if you want to support certain languages [00:01:31] ah; yes; that makes sense [00:01:31] and we do [00:02:01] do you know off the top of your head how we do case insensitive db lookups then? [00:03:37] ^d i can start up the lvs thing in about 15 minutes, will you be here then ? [00:04:23] oh... answer -- we don't do case insensitive lookups... [00:10:18] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [00:15:38] PROBLEM - MySQL Replication Heartbeat on db1046 is CRITICAL: CRIT replication delay 301 seconds [00:17:38] PROBLEM - MySQL Replication Heartbeat on db1046 is CRITICAL: CRIT replication delay 303 seconds [00:33:58] RECOVERY - Puppet freshness on labstore4 is OK: puppet ran at Wed Oct 2 00:33:52 UTC 2013 [00:34:18] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [00:35:58] PROBLEM - Host mw31 is DOWN: PING CRITICAL - Packet loss = 100% [00:37:18] RECOVERY - Host mw31 is UP: PING OK - Packet loss = 0%, RTA = 26.58 ms [00:39:30] PROBLEM - Apache HTTP on mw31 is CRITICAL: Connection refused [00:40:31] RECOVERY - Apache HTTP on mw31 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.412 second response time [00:46:10] PROBLEM - Puppet freshness on ms-be1006 is CRITICAL: No successful Puppet run in the last 10 hours [00:46:10] PROBLEM - Puppet freshness on ms-be1012 is CRITICAL: No successful Puppet run in the last 10 hours [00:53:10] PROBLEM - Puppet freshness on ms-be1007 is CRITICAL: No successful Puppet run in the last 10 hours [00:53:10] PROBLEM - Puppet freshness on ms-be1009 is CRITICAL: No successful Puppet run in 
the last 10 hours [00:58:10] PROBLEM - Puppet freshness on ms-be1002 is CRITICAL: No successful Puppet run in the last 10 hours [00:58:10] PROBLEM - Puppet freshness on ms-be1004 is CRITICAL: No successful Puppet run in the last 10 hours [00:59:10] PROBLEM - Puppet freshness on ms-be1010 is CRITICAL: No successful Puppet run in the last 10 hours [01:02:10] PROBLEM - Puppet freshness on ms-be1003 is CRITICAL: No successful Puppet run in the last 10 hours [01:04:10] PROBLEM - Puppet freshness on ms-be1008 is CRITICAL: No successful Puppet run in the last 10 hours [01:05:10] PROBLEM - Puppet freshness on ms-be1005 is CRITICAL: No successful Puppet run in the last 10 hours [01:06:34] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [01:07:14] PROBLEM - Puppet freshness on ms-be1011 is CRITICAL: No successful Puppet run in the last 10 hours [01:07:44] RECOVERY - Puppet freshness on labstore4 is OK: puppet ran at Wed Oct 2 01:07:40 UTC 2013 [01:08:34] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [01:13:55] (03CR) 10TTO: "This restricts uploads to sysops, rather than disabling them altogether... is this what was intended?" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/86643 (owner: 10Danny B.) 
[01:33:54] RECOVERY - Puppet freshness on labstore4 is OK: puppet ran at Wed Oct 2 01:33:49 UTC 2013 [01:34:34] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [02:07:00] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [02:16:20] !log LocalisationUpdate completed (1.22wmf19) at Wed Oct 2 02:16:19 UTC 2013 [02:16:35] Logged the message, Master [02:30:23] !log LocalisationUpdate completed (1.22wmf18) at Wed Oct 2 02:30:23 UTC 2013 [02:30:34] Logged the message, Master [02:34:00] RECOVERY - Puppet freshness on labstore4 is OK: puppet ran at Wed Oct 2 02:33:53 UTC 2013 [02:35:00] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [02:56:05] !log LocalisationUpdate ResourceLoader cache refresh completed at Wed Oct 2 02:56:05 UTC 2013 [02:56:18] Logged the message, Master [03:02:04] (03CR) 10MZMcBride: "An associated Bugzilla bug would be nice here." [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/86643 (owner: 10Danny B.) 
[03:03:50] RECOVERY - Puppet freshness on labstore4 is OK: puppet ran at Wed Oct 2 03:03:49 UTC 2013 [03:04:00] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [03:34:29] RECOVERY - Puppet freshness on labstore4 is OK: puppet ran at Wed Oct 2 03:34:26 UTC 2013 [03:34:59] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [04:04:39] RECOVERY - Puppet freshness on labstore4 is OK: puppet ran at Wed Oct 2 04:04:37 UTC 2013 [04:04:59] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [04:34:22] RECOVERY - Puppet freshness on labstore4 is OK: puppet ran at Wed Oct 2 04:34:15 UTC 2013 [04:34:52] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [04:44:32] RECOVERY - MySQL Replication Heartbeat on db1046 is OK: OK replication delay 0 seconds [05:03:52] RECOVERY - Puppet freshness on labstore4 is OK: puppet ran at Wed Oct 2 05:03:43 UTC 2013 [05:04:52] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [05:08:01] Intermittent search failures on Commons: [05:08:06] "An error has occurred while searching: HTTP request timed out." [05:09:25] i'd !log that [05:15:16] !log Commons has intermittent search failures: "An error has occurred while searching: HTTP request timed out." [05:15:28] cc LeslieCarr [05:15:31] Logged the message, Master [05:17:27] superm401: no report in bugzilla yet, right? [05:17:37] greg-g, didn't check or file. [05:18:29] k [05:18:31] superm401: what's strange about it? [05:19:02] Nemo_bis, didn't say it was strange, but it shouldn't happen either. Not sure what you mean. 
[05:19:09] superm401: https://gerrit.wikimedia.org/r/#/c/60759/ [05:19:18] we even stopped logging those errors because there were too many [05:19:36] it's totally normal [05:19:39] SNAFU [05:20:16] We could just raise the timeout as a workaround. [05:20:21] Although fixing it would be nice too. :) [05:26:37] superm401: and maybe that would kill the lucene hosts completely? :) [05:27:24] I mean, I'd like someone to work on lucene but it's not simple, that's why they're abandoning it [05:27:28] Maybe, depending on how much it was raised. [05:28:09] Lucene has been that broken for many months and probably years, we just didn't notice clearly because there are no logs and it said "0 results" when it failed [05:28:26] result is that now that search is being improved some think it's worse ;) [05:28:33] e.g. https://en.wiktionary.org/wiki/Wiktionary:Grease_pit/2013/September#w00t.21_New_search_indexing_with_all_scripts.2Ftemplate_expansion [05:28:45] (very short-sighted reply warning) [05:28:53] I don't recall getting either timeouts or spurious 0 results. [05:29:03] Not saying it never happened, but it wasn't often enough for me to notice. [05:29:49] And you probably know, but they're not abandoning Lucene, just using a tool that wraps it (solving some of the problems for us). [05:33:56] RECOVERY - Puppet freshness on labstore4 is OK: puppet ran at Wed Oct 2 05:33:54 UTC 2013 [05:34:46] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [05:59:50] (03PS1) 10Yuvipanda: labs-vagrant: Ensure that vagrant homedir is created [operations/puppet] - 10https://gerrit.wikimedia.org/r/87041 [06:00:48] anyone to merge a trivial patch? 
[06:07:07] (03PS2) 10Legoktm: labs-vagrant: Ensure that vagrant homedir is created [operations/puppet] - 10https://gerrit.wikimedia.org/r/87041 (owner: 10Yuvipanda) [06:08:22] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [06:11:24] superm401: the timeouts were not reported as such, the spurious 0 results are not trivial to identify as spurious :) [06:11:51] Nemo_bis, yeah, sometimes I wouldn't know. [06:12:04] But a lot of times, I search for stuff I know there should be results for. [06:14:10] Yes, that's the only case when you can know [06:14:27] superm401: personally I think something useful to do would be this: https://bugzilla.wikimedia.org/show_bug.cgi?id=43544#c23 [06:15:15] being able to at least grep -c the logs so that we notice if the errors are suddenly an order of magnitude more frequent would be nice, even just for few months :) [06:15:28] Yeah [06:21:13] Ryan_Lane: trivial merge of https://gerrit.wikimedia.org/r/87041? [06:22:58] superm401: you could file a bug for it then ;) [06:23:17] I'm surely not as good at filing bugs in a focused way as you are [06:24:20] Nemo_bis, do you know what proportion of the errors are timeouts? [06:25:27] superm401: define "errors"? :) [06:25:47] Whatever makes it to the mwsearch log. [06:26:07] superm401: I think you are the one with shell access here? :P [06:26:15] it may be on fluorine:/a/mw-log/mwsearch.log still [06:26:21] or wherever it's rotated [06:27:49] a possible approach, if it's really so huge, would be to just turn on logging for, say, 24h and compare the length of the log to a "standard" one [06:28:24] No files in that directory with search anywhere in the name. 
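Editor's sketch of the "grep -c the logs" idea from the exchange above: count matching lines, work out what share of logged errors are timeouts, and flag a count that is an order of magnitude above a baseline. The format of mwsearch.log is not shown anywhere in this log, so the match pattern, the sample lines and all function names here are assumptions for illustration only, not the real log format or an existing tool.

```python
import re

# Assumed pattern, taken from the error text quoted in the channel.
# The real mwsearch.log line format is not shown in this log.
TIMEOUT_RE = re.compile(r"HTTP request timed out", re.IGNORECASE)


def count_matches(lines, pattern=TIMEOUT_RE):
    """Rough equivalent of `grep -c`: number of lines matching the pattern."""
    return sum(1 for line in lines if pattern.search(line))


def timeout_share(lines, pattern=TIMEOUT_RE):
    """Fraction of non-empty log lines that match the timeout pattern."""
    total = 0
    hits = 0
    for line in lines:
        if not line.strip():
            continue  # skip blank lines so they do not dilute the ratio
        total += 1
        if pattern.search(line):
            hits += 1
    return hits / total if total else 0.0


def is_spike(current, baseline, factor=10):
    """True when `current` is at least `factor` times the baseline count."""
    if baseline == 0:
        return current > 0
    return current >= factor * baseline


if __name__ == "__main__":
    # Made-up sample lines, only to show how the helpers fit together.
    sample = [
        "search: HTTP request timed out.",
        "search: some other failure",
        "search: HTTP request timed out.",
    ]
    print(count_matches(sample), timeout_share(sample))
    print(is_spike(current=count_matches(sample), baseline=1))
```

With a real log one would pass an open file object as `lines` (for example a 24h capture against a "standard" day), since the helpers only iterate over lines.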
[06:29:00] hmpf [06:29:14] where is the archive [06:29:50] only 5 years old page https://wikitech.wikimedia.org/wiki/MediaWiki_UDP_logging [06:31:42] (03CR) 10Akosiaris: [C: 032] labs-vagrant: Ensure that vagrant homedir is created [operations/puppet] - 10https://gerrit.wikimedia.org/r/87041 (owner: 10Yuvipanda) [06:31:50] ty, akosiaris [06:32:19] sigh https://wikitech.wikimedia.org/wiki/Search/UDP_Logger [06:32:19] wow you are fast :-) [06:32:47] akosiaris: :D one more patch coming up [06:32:52] hope there's a timezone chart of all ops [06:32:55] s/hope/wish [06:33:39] ahahaha... now that we got Sean from australia we pretty much cover the world :-) [06:34:38] :D [06:34:38] true [06:35:02] RECOVERY - Puppet freshness on labstore4 is OK: puppet ran at Wed Oct 2 06:34:55 UTC 2013 [06:35:22] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [06:42:40] Nemo_bis, CCed you on https://bugzilla.wikimedia.org/show_bug.cgi?id=54865 [06:44:46] superm401: thanks [07:12:17] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [07:12:27] RECOVERY - search indices - check lucene status page on search1022 is OK: HTTP OK: HTTP/1.1 200 OK - 56465 bytes in 0.016 second response time [07:33:57] RECOVERY - Puppet freshness on labstore4 is OK: puppet ran at Wed Oct 2 07:33:56 UTC 2013 [07:34:17] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [08:08:26] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [08:33:56] RECOVERY - Puppet freshness on labstore4 is OK: puppet ran at Wed Oct 2 08:33:54 UTC 2013 [08:34:26] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [08:53:56] (03PS1) 10Mattflaschen: Labs: Turn off secure login on loginwiki due to untrusted SSL [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/87045 [08:54:33] (03PS2) 
10Mattflaschen: Labs: Turn off secure login on loginwiki due to untrusted SSL [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/87045 [09:36:06] RECOVERY - Puppet freshness on labstore4 is OK: puppet ran at Wed Oct 2 09:36:00 UTC 2013 [09:36:36] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [09:49:26] (03CR) 10Danny B.: "@TTO:" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/86643 (owner: 10Danny B.) [09:51:10] PROBLEM - Puppet freshness on virt1000 is CRITICAL: No successful Puppet run in the last 10 hours [09:53:28] aaaaaaaaaaaaaaaarghhh [09:53:36] mw1125 is out of sync [09:58:36] ? [09:59:25] MaxSem: ? [09:59:32] I see errors on it that were resolved by yesterday's deploy and are not coming from other boxes anymore [09:59:46] running sync-common [10:00:08] it's in dsh groups, wonder how it would have failed [10:01:46] happens at times [10:10:36] (03PS2) 10TTO: skwikisource: Disable upload [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/86643 (owner: 10Danny B.) 
[10:13:41] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [10:13:52] DatabaseInstaller::setupSchemaVars: unexpected DB connection error [10:14:03] that is sooo useful :-D [10:30:20] !log Resynched MW on mw1125, looked like slightly out of sync [10:30:31] Logged the message, Master [10:34:01] RECOVERY - Puppet freshness on labstore4 is OK: puppet ran at Wed Oct 2 10:33:58 UTC 2013 [10:34:41] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [10:46:06] (03PS1) 10Hashar: contint: browsertests needs php5-sqlite [operations/puppet] - 10https://gerrit.wikimedia.org/r/87056 [10:46:44] PROBLEM - Puppet freshness on ms-be1006 is CRITICAL: No successful Puppet run in the last 10 hours [10:46:44] PROBLEM - Puppet freshness on ms-be1012 is CRITICAL: No successful Puppet run in the last 10 hours [10:53:44] PROBLEM - Puppet freshness on ms-be1007 is CRITICAL: No successful Puppet run in the last 10 hours [10:53:44] PROBLEM - Puppet freshness on ms-be1009 is CRITICAL: No successful Puppet run in the last 10 hours [10:54:48] (03CR) 10MaxSem: [C: 04-1] "Please provide a link to bug requesting this." [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/86643 (owner: 10Danny B.) [10:57:50] (03CR) 10Danny B.: "What is the sense of creating of a new bug and immediately closing it as fixed? No prob to do it, but seems like unnecessary bureaucracy t" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/86643 (owner: 10Danny B.) 
[10:58:44] PROBLEM - Puppet freshness on ms-be1002 is CRITICAL: No successful Puppet run in the last 10 hours [10:58:44] PROBLEM - Puppet freshness on ms-be1004 is CRITICAL: No successful Puppet run in the last 10 hours [10:59:44] PROBLEM - Puppet freshness on ms-be1010 is CRITICAL: No successful Puppet run in the last 10 hours [11:02:44] PROBLEM - Puppet freshness on ms-be1003 is CRITICAL: No successful Puppet run in the last 10 hours [11:04:44] PROBLEM - Puppet freshness on ms-be1008 is CRITICAL: No successful Puppet run in the last 10 hours [11:05:43] (03PS1) 10Hashar: contint: fetch slave scripts on slaves [operations/puppet] - 10https://gerrit.wikimedia.org/r/87058 [11:05:44] PROBLEM - Puppet freshness on ms-be1005 is CRITICAL: No successful Puppet run in the last 10 hours [11:06:24] (03CR) 10MaxSem: "Sure, no problem - but then a link to community consensus would be appreciated:)" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/86643 (owner: 10Danny B.) [11:07:43] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [11:08:13] PROBLEM - Puppet freshness on ms-be1011 is CRITICAL: No successful Puppet run in the last 10 hours [11:15:34] (03CR) 10Danny B.: "There is no such link, this configuration is just logical outcome of the status quo (like on other projects):" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/86643 (owner: 10Danny B.) [11:33:53] RECOVERY - Puppet freshness on labstore4 is OK: puppet ran at Wed Oct 2 11:33:44 UTC 2013 [11:34:43] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [11:36:18] (03CR) 10TTO: "In the spirit of fairness, at least post an announcement on the community portal to give anyone a chance to object." [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/86643 (owner: 10Danny B.) [11:40:15] (03CR) 10Danny B.: "OK, as you wish. Although I am actually the only active user there ATM... 
;-)" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/86643 (owner: 10Danny B.) [12:07:38] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [12:34:38] RECOVERY - Puppet freshness on labstore4 is OK: puppet ran at Wed Oct 2 12:34:32 UTC 2013 [12:35:38] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [13:10:27] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [13:33:57] RECOVERY - Puppet freshness on labstore4 is OK: puppet ran at Wed Oct 2 13:33:48 UTC 2013 [13:34:27] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [13:37:56] (03CR) 10Dzahn: [C: 04-1] "the command => "usermod -a -G Debian-exim nagios is necessary because puppet can't add an existing user to an existing group, it can only " [operations/puppet] - 10https://gerrit.wikimedia.org/r/86889 (owner: 10Matanya) [13:49:53] mutante: thanks for this [13:50:03] I see a better way to do it [13:51:19] (03PS1) 10Yuvipanda: Add vagrant user to sudoers by default [operations/puppet] - 10https://gerrit.wikimedia.org/r/87080 [13:51:32] (03CR) 10jenkins-bot: [V: 04-1] Add vagrant user to sudoers by default [operations/puppet] - 10https://gerrit.wikimedia.org/r/87080 (owner: 10Yuvipanda) [13:51:48] (03PS2) 10Yuvipanda: Add vagrant user to sudoers by default [operations/puppet] - 10https://gerrit.wikimedia.org/r/87080 [13:58:00] hmm, do we have any python services deployed in production? [13:58:12] php is most of our stuff, and then there's parsoid... [14:03:04] yuvipanda: if you count IRC bots as services ...and monitoring scripts [14:03:15] find . 
-name *.py in puppet repo [14:03:37] hmm [14:03:52] (03PS3) 10Yuvipanda: Add vagrant user to sudoers by default [operations/puppet] - 10https://gerrit.wikimedia.org/r/87080 [14:03:53] mutante: was thinking of setting aside some time to write up code for the ShortURL service from the RFC [14:04:08] YuviPanda: isn't analytics using some python stuff? [14:04:29] siebrand: I don't know if that is 'in production' as such yet, though. [14:04:35] metrics is on labs... [14:04:43] YuviPanda: pywikibot? ;) [14:04:49] heh :D [14:05:15] siebrand: puppet is pretty much the only ruby thing in our stack, afaik [14:05:26] but python is littered here and there, no big service as such uses it [14:05:31] YuviPanda: nope. All the front end QA is too. [14:05:45] YuviPanda: (ruby, that is) [14:05:47] siebrand: ah, right. but I was considering 'things that run in the cluster' [14:05:54] swift , ldap, salt, ganglia, there are some random python scripts in lots of places, but that doesnt make them python services i guess [14:06:05] yuvipanda: ( hashar: where is the meetbot repo? ) [14:06:08] YuviPanda: puppet doesn't run on the cluster, it configures the cluster? [14:06:21] siebrand: it runs on each machine in order to configure the cluster [14:06:25] siebrand: plus there's the puppetmaster too [14:06:28] yuvipanda: nothing and I don't have the time to work on meetbot. Consider the current install a teaser [14:06:40] hashar: okay, let me rephrase - where is the meetbot code? :) [14:06:53] yuvipanda: "I don't have time "sorry [14:07:13] hashar: to rephrase again - is it something that exists somewhere and you just used it, or did you write it yourself? [14:07:16] no more questions, I promise! [14:07:30] I already regret having done that proof of concept cause now people are distracting me and thought meetbot is important [14:07:31] :( [14:07:57] well, ok. [14:08:35] just saying that if the code is somewhere perhaps other people can fix things, set it up in a semi-production state, etc. 
[14:08:41] but I understand if you don't have time for it [14:08:58] yeah none sorry [14:09:10] but will ping ya in a couple weeks when I puppetize it :-] [14:09:20] fine, hoard the code :P [14:09:25] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [14:09:29] focusing on having browser tests triggered in Gerrit when people send patchsets in Gerrit [14:10:22] yuvipanda: integration-meetbot .pmtpa.wmflabs [14:10:34] thank you, that is all that I was looking for :) [14:10:38] yuvipanda: you are already member of that project and get root [14:10:44] let me look [14:10:48] * hashar context switch [14:11:05] yuvipanda: so basically : install supybot using the ubuntu package [14:11:34] yuvipanda: MeetBot is a plugin, I have fetched it under /mnt/meetbot [14:11:49] I see it [14:11:51] then /mnt/supybot is the base dir for supybot to run into [14:12:11] you can drop plugins in /mnt/supybot/plugins [14:12:15] I simply created a symbolic link [14:12:33] yeah, I understand. MeetBot is just a supybot plugin, and I see where the code is [14:12:37] which is ugly [14:13:03] hah! :D [14:13:05] and the configuration for Meetbot is in /mnt/meetbot/MeetBot/meetingLocalConfig.py [14:13:09] the file must be named like that [14:13:14] it is hardcoded inside MeetBot [14:13:31] note that /mnt/meetbot/MeetBot is a darcs repository :-] [14:13:39] yeah, i noticed the _darcs folder [14:13:48] should be interesting, have only 'heard' of it before [14:14:16] then one can start meetbot using the upstart conf I imported from openstack: cat /etc/init/meetbot.conf [14:14:18] (03PS1) 10Dzahn: change wikimania.wm redirect from 2013 to 2014 [operations/apache-config] - 10https://gerrit.wikimedia.org/r/87112 [14:14:20] aka start meetbot [14:14:21] or stop meetbot [14:14:27] (03CR) 10Cmcmahon: [C: 031] "I'd like to test this right away in beta labs. It would be easy enough to revert if something goes haywire." 
[operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/87045 (owner: 10Mattflaschen) [14:14:45] mm, I see that too! [14:15:12] i'll poke around [14:15:13] so it is really simple really: create a git repo to mirror the darcs repo (just need the current version) [14:15:14] thanks hashar! [14:15:55] write a manifest that provides the upstart conf, install supybot, create a meetbot user, clone the meetbot repo, service { meetbot: ensure => running } [14:16:04] + a template for meetingLocalConfig.py :D [14:16:26] Reedy: around? can you tell what the next move is to get this further along to +2? https://gerrit.wikimedia.org/r/#/c/84897/ [14:16:31] * hashar switch back [14:16:41] hashar: thank you! [14:17:07] (03CR) 10Dzahn: [C: 031] "i'm not touching the "wikimania.asia" links because 2014 wikimania isn't in Asia" [operations/apache-config] - 10https://gerrit.wikimedia.org/r/87112 (owner: 10Dzahn) [14:18:30] (03CR) 10Dzahn: [C: 032] change wikimania.wm redirect from 2013 to 2014 [operations/apache-config] - 10https://gerrit.wikimedia.org/r/87112 (owner: 10Dzahn) [14:18:36] lolol [14:20:52] mutante: think you can +2 https://gerrit.wikimedia.org/r/#/c/87080/? [14:21:27] Reedy: ? [14:21:37] Still needs a better name... [14:21:44] The rest of it is simple enough to fix up [14:21:57] Reedy: paint that bikeshed, man! [14:22:00] yuvipanda: need to sync and graceful Apaches first.. [14:22:06] mutante: heh, sure! [14:22:17] and i dont know much vagrant stuff.. well.. [14:22:33] mutante: well, the only person doing vagrant stuff on ops repo is... me [14:22:39] chrismcmahon: It's more of the name doesn't make much sense... [14:22:53] $wgExtensionsEntryPointListFile [14:22:55] s [14:23:03] mark, if you were around, I'd appreciate another look at that same pybal.conf.erb puppet error on lvs4001 [14:23:11] mutante: can't +2 them myself :) [14:24:04] Reedy: I don't want to get stuck in loop where every new name gets a -1. that would take months at this rate. 
what IS a good name? (and who would know?) [14:24:28] Something that makes sense as to what the global does/is used for [14:25:28] Reedy: $messagesToNonProdEnvs [14:25:35] !log sync-apache, apache-graceful-all for wikimania2014 redirect [14:25:48] Logged the message, Master [14:32:28] Reedy: $splitMessagesForBetaLabs [14:34:55] RECOVERY - Puppet freshness on labstore4 is OK: puppet ran at Wed Oct 2 14:34:46 UTC 2013 [14:35:17] (03PS3) 10Ottomata: Adding kafka::udp2log::relay define to consume from Kafka and send to udp2log. [operations/puppet] - 10https://gerrit.wikimedia.org/r/86894 [14:35:25] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [14:41:48] (03CR) 10Dzahn: [C: 032] "i guess there is not really a way around this since the vagrant user sets stuff up" [operations/puppet] - 10https://gerrit.wikimedia.org/r/87080 (owner: 10Yuvipanda) [14:42:09] mutante: thanks! [14:47:08] (03CR) 10Reedy: "(1 comment)" [operations/apache-config] - 10https://gerrit.wikimedia.org/r/84707 (owner: 10Reedy) [14:53:24] (03PS6) 10Reedy: Simplify wikimania apache conf, reuse wikimedia.org docroot. [operations/apache-config] - 10https://gerrit.wikimedia.org/r/84707 [14:55:04] (03PS7) 10Reedy: Simplify wikimania apache confs, reuse wikimedia.org docroot. [operations/apache-config] - 10https://gerrit.wikimedia.org/r/84707 [14:55:14] (03PS8) 10Reedy: Simplify wikimania apache confs, reuse wikimedia.org docroot. [operations/apache-config] - 10https://gerrit.wikimedia.org/r/84707 [15:12:34] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [15:16:45] (03PS1) 10Ottomata: Updating kafka module to latest commit, modifying kafka role to reflect recent changes there. [operations/puppet] - 10https://gerrit.wikimedia.org/r/87151 [15:18:07] (03PS2) 10Ottomata: Updating kafka module to latest commit, modifying kafka role to reflect recent changes there. 
[operations/puppet] - 10https://gerrit.wikimedia.org/r/87151 [15:18:48] (03CR) 10Ottomata: [C: 032 V: 032] Updating kafka module to latest commit, modifying kafka role to reflect recent changes there. [operations/puppet] - 10https://gerrit.wikimedia.org/r/87151 (owner: 10Ottomata) [15:21:33] (03CR) 10Dzahn: [C: 04-1] "unfortunately this looks like 404s when i tested" [operations/apache-config] - 10https://gerrit.wikimedia.org/r/84707 (owner: 10Reedy) [15:27:05] (03PS4) 10Ottomata: Adding kafka::udp2log::relay define to consume from Kafka and send to udp2log. [operations/puppet] - 10https://gerrit.wikimedia.org/r/86894 [15:30:52] yuvipanda: an URL shortener sounds like an excellent task for Twisted or (faster) node [15:31:11] gwicke: indeed, i was thinking of Node [15:31:23] gwicke: with the idea that 99% of hits or more should be handled by varnish [15:31:34] yeah [15:31:44] gwicke: it's not terribly hard to do, I might see if I can spend an hour or so writing it [15:31:51] a simple node http server gets me something like 7k req/s [15:32:11] with both ab and the server on my aging laptop [15:32:51] gwicke: right [15:33:04] gwicke: only thing I was wondering is the status of npm packages and our ops repo [15:33:11] gwicke: since I'd need to use node-mysql [15:33:15] at the least [15:33:38] we handle that with a contrib/config repo [15:33:53] gwicke: how so? [15:34:08] currently we deploy that manually, but want to make it a subrepository instead that is automatically deployed along with the main code [15:34:10] gwicke: that's the one thing python has going for it, since the packages i'd use there are all in repo. 
[15:34:12] aaah [15:34:13] right [15:34:16] all with git-deploy [15:34:19] right [15:34:25] in repo dependencies [15:34:54] RECOVERY - Puppet freshness on labstore4 is OK: puppet ran at Wed Oct 2 15:34:50 UTC 2013 [15:35:04] that will also let us test with the libraries before deploying [15:35:34] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [15:35:41] gwicke: right [15:35:46] gwicke: are ops okay with that? [15:36:22] yes, it is better than what we do right now [15:36:27] heh [15:36:46] https://bugzilla.wikimedia.org/show_bug.cgi?id=53723 [15:37:25] discussed this with Ryan, in the context of also supporting Debian packaging this is basically what we came up with [15:38:17] reading https://www.mediawiki.org/wiki/Parsoid/Packaging now [15:40:21] yuvipanda: in the longer term I hope that we can package more dependencies too [15:40:34] gwicke: i guess the nodejs packaging scene isn't that hot.. [15:41:00] gwicke: there are some, but not that many [15:41:17] iirc there is a tool that automatically packages npm modules [15:41:34] https://npmjs.org/package/npm2debian [15:42:39] oh [15:44:34] * Reedy kicks grrrit-wm [15:45:22] Reedy: what's happening? [15:45:34] Reedy: your comments came through on -dev... [15:46:01] thanks Reedy [15:46:02] Closest we have to kicking gerrit [15:46:03] Permission denied (publickey). [15:46:03] fatal: Could not read from remote repository. [15:46:03] Please make sure you have the correct access rights [15:46:03] and the repository exists. [15:46:44] <^d> Gerrit hides in a castle so you can't kick him :) [15:47:11] * Reedy just kicks the messenger instead [15:54:56] paravoid: Have you had any time yet to look over my RfC draft? [15:56:22] not yet :( [15:56:37] * bd808 sulks [15:57:29] paravoid: I'll keep busy today trying to make a better version of purgeDeletedFiles [15:59:18] sorry... [16:00:11] no worries. Everybody is busy. 
If I run out of other things I'll "be bold" and just move it over to the proper namespace [16:00:15] no need to apologize to bd808, I don't he has feelings. [16:00:44] s/don't he/don't think he/ # ugh [16:00:49] bd808: Will it just delete all the files with no prejudice? [16:01:29] Reedy: Some notes at https://www.mediawiki.org/wiki/User:BDavis_(WMF)/Notes/Finding_Files_To_Purge#purgeChangedFiles [16:02:07] basically want to expand it to handle things other than deletes and be able to limit htcp broadcast range [16:03:16] * bd808 ignores greg-g's trolling :P [16:04:19] bd808: awwwwww [16:04:43] Anyone know where the squid logs go now? locke has no recent writes in /a/squid and emery has no /var/log/squid - https://wikitech.wikimedia.org/wiki/Squid_logging [16:05:13] PROBLEM - Puppet freshness on labstore4 is CRITICAL: No successful Puppet run in the last 10 hours [16:06:15] you had it almost right [16:06:22] just combine the two :) [16:06:25] emery /a/log [16:08:15] That's got sampled-1000.tsv.log but no sign of the 5XX logs [16:08:50] (I know, I didn't specify ;)) [16:08:53] oxygen /a/log ? [16:09:10] reedy@bast1001:~$ ssh -A oxygen [16:09:10] Permission denied (publickey). [16:09:15] oh heh [16:09:28] ottomata1: here? [16:09:32] I can get into emery, locke, stat1 [16:09:34] logging's hard [16:09:41] Not into oxygen and stat1001 etc... [16:10:28] yo [16:10:49] ottomata1: Reedy is looking for 5xx.log, is it anywhere else but oxygen? [16:10:54] Reedy, if you want historical [16:11:01] stat1002:/a/squid/archive [16:11:05] stat1002.eqiad.wmnet [16:11:16] Can't get onto stat1002 [16:11:22] need account? [16:11:25] I'm not sure what's wanted... manybubbles ^^ What're you wanting with the 5XX logs? [16:11:27] * bd808 is ready to think seriously about a giant logstash instance [16:11:47] bd808 yes! 
[16:11:50] let's talk about that [16:12:01] It's somewhat amusing I can apparently get onto rand() analytics hosts but not others [16:12:02] let's build the proper elasticsearch cluster first [16:12:07] (03PS1) 10Chad: Cirrus: Remove commented officewiki, cawiki to primary [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/87157 [16:12:16] drdee: It's paravoid's pet project. [16:12:18] I was thinking of ordering a few more of the ES boxes for logstash [16:12:25] i have been wanting to look into that hadoop log files [16:12:29] it's also my pet project :D [16:12:36] yeahhhhh! [16:12:37] lets do it! [16:12:48] we have a procurement ticket where we discuss ES boxes with manybubbles [16:12:53] cool [16:12:57] let's figure out the details for these first [16:13:00] ok [16:13:19] We want all the fast servers [16:13:19] Done [16:13:20] yeah, and Reedy, if you want live logs, you need to get onto oxygen [16:13:45] Need to wait for manybubbles to reply [16:14:03] Reedy: sorry, you want to know why I want those 500s I emailed about earlier? [16:14:06] https://wikitech.wikimedia.org/wiki/Squid_logging is pretty out of date it seems [16:14:29] Reedy: someone was complaining about seeing gateway timeouts but I couldn't find them in any logs I knew how to look at [16:14:30] manybubbles: Not necessarily why, just whether you want live and now, or archive, or both [16:14:35] paravoid:what's the rt ticket? [16:14:49] As it's different boxes that you need access to for each... [16:14:54] drdee: 5883 [16:15:01] it's 5883 but please don't distract the conversation with logstash for now [16:15:06] <^d> Reedy: We getting a 1.22wmf20 today? :) [16:15:11]