[00:03:34] New patchset: Reedy; "Bug 44460 - Create Wikiversity Korean" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/47349 [00:03:49] Change merged: Tim Starling; [operations/debs/python-voluptuous] (master) - https://gerrit.wikimedia.org/r/45599 [00:03:53] voluptuous? [00:04:13] Change merged: Tim Starling; [operations/debs/nginx] (master) - https://gerrit.wikimedia.org/r/45598 [00:09:26] New patchset: Reedy; "Remove wgEnableUpload entries same as default" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/47678 [00:10:22] New patchset: Reedy; "Remove wgEnableUpload entries same as default" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/47678 [00:13:44] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/47678 [00:14:14] Thanks Tim [00:37:37] !log stopping mailman to scrub an entry out of archives, will restart shortly [00:37:38] Logged the message, RobH [00:38:20] RobH, one sec [00:38:25] too late. [00:38:27] why? [00:38:39] did you do it following wikitech instructions and replace the text with something else [00:38:44] instead of just deleting the email? [00:38:49] im doing it now, via those yes [00:38:51] why do you ask? [00:39:10] good thanks, because doing it by deleting the email breaks the archives [00:39:40] im just ripping out the address info the dude left [00:39:46] and leaving the majority of the message intact anyhow [00:40:28] mostly done, rebuilding mbox file now [00:40:42] (shouldnt break as no actual full email was removed, headers all intact) [00:41:09] yep [00:41:15] so slow.... [00:41:22] wikitech-l was a huge ass list. [00:41:26] was/is/whatever [00:41:35] New patchset: Andrew Bogott; "In mediawiki::singlenode use a more modest memcached size." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/47680 [00:41:52] I thought you could rebuild just for a particular list? 
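(Editor's sketch, not part of the channel log.) The exchange above contrasts two ways of scrubbing an archive: deleting the whole email, which is said to break the archives, versus replacing only the offending text so that every message and its headers stay in place. The Python below is a minimal illustration of the second approach using the standard `mailbox` module. It is not the wikitech procedure referred to in the log, which is not quoted here. The function name `redact_in_mbox`, the `[redacted]` placeholder, and the choice to skip multipart messages are all assumptions made for the example.

```python
import mailbox


def redact_in_mbox(path, needle, replacement="[redacted]"):
    """Replace `needle` in message bodies of an mbox file, in place.

    Hypothetical helper for illustration only. Messages are never deleted,
    so the message count and all headers are preserved. Returns the number
    of messages that were changed.
    """
    box = mailbox.mbox(path)
    changed = 0
    try:
        for key in box.keys():
            msg = box[key]
            if msg.is_multipart():
                # Assumption: this sketch only handles single-part messages.
                continue
            body = msg.get_payload()
            if isinstance(body, str) and needle in body:
                msg.set_payload(body.replace(needle, replacement))
                box[key] = msg  # rewrite the message under the same key
                changed += 1
        box.flush()  # write the modified mailbox back to disk
    finally:
        box.close()
    return changed
```

As in the log, a real run would presumably happen with the list server stopped, so nothing appends to the mbox while it is being rewritten, and the archive pages would still need to be regenerated afterwards.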
[00:41:56] !log done with data removal on mailing list server, rebuilding mbox [00:41:56] Logged the message, RobH [00:42:02] its just this list. [00:42:04] its a huge list [00:42:13] up to nov 2012, almost to present [00:42:51] !log mailing lists returned to normal [00:42:51] Logged the message, RobH [00:43:02] New patchset: Andrew Bogott; "In mediawiki::singlenode use a more modest memcached size." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/47680 [00:43:40] Thehelpfulone: did daniel say to reassign rt 4478 to him? [00:43:40] Change merged: Andrew Bogott; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/47680 [00:43:53] cuz if not, dont go assigning them on a whim, cuz we have an RT triage person every week to do that [00:44:00] otherwise it will sit forever, ie: daniel is at conference now [00:44:08] so dispatching tickets to him isnt great. [00:44:10] nah I just did it because he's always done it - but yeah fair enough [00:44:21] are you on duty this week (topic?) [00:44:25] andrew otto [00:44:26] I thought you were last week? [00:44:36] yea, i just happen to watch all the RT queues as they come in [00:44:42] as RT admin i get every queue every update. 
[00:45:09] ah fun [00:46:31] that isnt the word i'd use ;P [00:46:43] so yea, im not saying dont dispatch tickets, but you wanna find out with the person before you do [00:46:54] ie: the rt triage person won't assign someone a ticket without touching base with said person [00:47:06] so you will wanna follow same procedure if possible [00:47:37] (or it may sit without notice ;_; ) [00:50:03] heh fair enough [00:50:20] MW config change deployment should be working, only Apache config change deployment is still broken [00:50:36] hah, wrong channel, sorry [00:51:10] yet i saw it anyhow [00:51:13] so it counts [01:32:19] RECOVERY - Puppet freshness on mw1128 is OK: puppet ran at Wed Feb 6 01:31:47 UTC 2013 [01:36:39] PROBLEM - Puppet freshness on db62 is CRITICAL: Puppet has not run in the last 10 hours [01:58:38] New patchset: J; "add cgroup to limit memory of sub processes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/40784 [02:04:01] New patchset: Reedy; "Rename $wmgVectorEditSectionLinks to $wmgVectorSectionEditLinks" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/47690 [02:04:26] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/47690 [02:04:33] PROBLEM - Puppet freshness on db1047 is CRITICAL: Puppet has not run in the last 10 hours [02:05:00] !log reedy synchronized wmf-config/ [02:05:03] Logged the message, Master [02:28:06] !log LocalisationUpdate completed (1.21wmf8) at Wed Feb 6 02:28:05 UTC 2013 [02:28:08] Logged the message, Master [02:51:37] !log LocalisationUpdate completed (1.21wmf9) at Wed Feb 6 02:51:36 UTC 2013 [02:51:38] Logged the message, Master [05:49:52] PROBLEM - MySQL Replication Heartbeat on db32 is CRITICAL: CRIT replication delay 195 seconds [05:51:40] RECOVERY - MySQL Replication Heartbeat on db32 is OK: OK replication delay 0 seconds [07:03:13] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours [07:11:19] 
morning [08:00:27] PROBLEM - Puppet freshness on msfe1002 is CRITICAL: Puppet has not run in the last 10 hours [08:00:27] PROBLEM - Puppet freshness on ms1004 is CRITICAL: Puppet has not run in the last 10 hours [08:00:28] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours [08:00:28] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours [08:00:28] PROBLEM - Puppet freshness on vanadium is CRITICAL: Puppet has not run in the last 10 hours [08:02:24] PROBLEM - Puppet freshness on professor is CRITICAL: Puppet has not run in the last 10 hours [08:41:24] PROBLEM - Puppet freshness on cp3020 is CRITICAL: Puppet has not run in the last 10 hours [09:57:00] New patchset: Hashar; "cleanout testswarm from the manifests" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/47665 [09:57:51] New review: Hashar; "PS2 make this change independent for easier merging." [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/47665 [09:59:08] paravoid: hi :-] If you are around I did my first "per software" module with https://gerrit.wikimedia.org/r/#/c/47665/ :-] [09:59:39] that clean out the awful manifests/misc/contint.pp of anything related to the "testswarm" software and put the remaining part we are interested in in a testswarm module :-] [10:10:07] New review: Hashar; "The idea was merely to let me execute mw-update-l10n on my Mac laptop for debugging purpose. I am p..." [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/46907 [10:21:37] PROBLEM - MySQL Replication Heartbeat on db33 is CRITICAL: CRIT replication delay 200 seconds [10:22:04] PROBLEM - MySQL Slave Delay on db33 is CRITICAL: CRIT replication delay 209 seconds [10:51:38] RECOVERY - MySQL Replication Heartbeat on db33 is OK: OK replication delay 0 seconds [10:52:05] RECOVERY - MySQL Slave Delay on db33 is OK: OK replication delay 0 seconds [11:05:55] Ryan_Lane: what did you remove? 
libpam-ldapd? [11:06:00] yes [11:06:15] which should also purge the ldap config from the pam files [11:06:16] libnss-ldapd Recommends libpam-ldapd [11:06:24] which means that it gets installed by default [11:06:30] -_ [11:06:30] err [11:06:32] -_- [11:06:41] so this needs to be ensure => absent in puppet [11:07:03] is this something new? [11:07:09] virt0 doesn't have it [11:07:15] neither does virt1000 [11:07:18] or mchenry [11:08:22] either way, yeah, it needs to be set absent [11:08:49] The following NEW packages will be installed: ldap-utils libnss-ldapd libpam-ldapd nscd nslcd [11:08:56] -_- [11:08:56] that's what I get for apt-get install libnss-ldapd [11:09:21] virt0/virt1000/mchenry don't have libnss-ldapd installed [11:09:22] well, that explains that [11:09:25] ah [11:09:31] * Ryan_Lane hates recommended packages [11:09:34] as far as I can see [11:09:51] installing recommends by default can be disabled [11:09:53] but I don't think we should [11:09:56] it's usually a good idea [11:09:59] sometimes it isn't :) [11:10:03] rightr [11:10:07] *right [11:10:15] from what I've read it'll break ubuntu, generally [11:10:39] oh, I don't know about that [11:10:42] it works in Debian [11:10:48] I have it on some systems at least [11:16:00] man. 
I *really* despise that the package changes the pam config, too [11:16:04] what the fuck [11:19:30] New patchset: Ryan Lane; "If pam ldap isn't enabled, then ensure it's absent" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/47722 [11:20:04] Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/47722 [11:22:20] New patchset: Hashar; "(bug 44061) initial release" [operations/debs/python-voluptuous] (master) - https://gerrit.wikimedia.org/r/44408 [11:22:35] New review: Hashar; "typo in debian/control" [operations/debs/python-voluptuous] (master); V: 0 C: 0; - https://gerrit.wikimedia.org/r/44408 [11:24:47] hashar: give me a few, then I'll be yours :) [11:24:50] definitely today [11:25:17] good :-] [11:25:34] trying to figure out the commands to get the package to build manually [11:25:34] :-] [11:27:19] dpkg-buildpackage -uc -us is the canonical one [11:27:24] git-buildpackage for git [11:27:46] I came up with: uscan --verbose --rename --download-current-version && dpkg-buildpackage -rfakeroot -us -uc -b [11:28:41] that's okay [11:28:44] drop the -b though [11:28:53] I always prefer including source packages to our apt too [11:29:18] it's also necessary from a licensing perspective for GPLed binaries [11:30:38] ultimately I would like Jenkins to build packages on each change set / after merge [11:38:21] PROBLEM - Puppet freshness on db62 is CRITICAL: Puppet has not run in the last 10 hours [11:41:30] PROBLEM - MySQL Replication Heartbeat on db53 is CRITICAL: CRIT replication delay 181 seconds [11:41:30] PROBLEM - MySQL Slave Delay on db53 is CRITICAL: CRIT replication delay 181 seconds [11:42:03] New patchset: Hashar; "gallium blessed with misc::package-builder" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/47725 [11:46:11] New patchset: Hashar; "(bug 44061) initial release" [operations/debs/python-voluptuous] (master) - https://gerrit.wikimedia.org/r/44408 [11:47:13] New review: Hashar; 
"PS8 debian/copyright now points to /usr/share/common-licenses/GPL-2 to make lintian happy" [operations/debs/python-voluptuous] (master); V: 0 C: 0; - https://gerrit.wikimedia.org/r/44408 [11:49:04] bah empty binary package [11:52:15] New patchset: Silke Meyer; "Added documentation to the Wikidata roles" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/47726 [12:05:39] PROBLEM - Puppet freshness on db1047 is CRITICAL: Puppet has not run in the last 10 hours [12:24:42] RECOVERY - MySQL Replication Heartbeat on db53 is OK: OK replication delay 0 seconds [12:25:00] RECOVERY - MySQL Slave Delay on db53 is OK: OK replication delay 0 seconds [12:43:04] New patchset: Hashar; "gallium blessed with misc::package-builder" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/47725 [12:43:17] New review: Hashar; "fixed up space / tabs" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/47725 [12:46:13] New review: Hashar; "works for me, pending mark approval." [operations/puppet] (production); V: 1 C: 1; - https://gerrit.wikimedia.org/r/47567 [12:48:06] PROBLEM - MySQL Replication Heartbeat on db32 is CRITICAL: CRIT replication delay 231 seconds [12:48:24] PROBLEM - MySQL Slave Delay on db32 is CRITICAL: CRIT replication delay 240 seconds [12:49:50] * paravoid is having lunch while watching Ceph talks :) [12:59:55] paravoid: while eating, could you potentially merge in https://gerrit.wikimedia.org/r/47725 :-] that get misc::package-builder on gallium [13:00:02] so jenkins can build / lint deb packages eventually [13:07:14] why do you want to build packages with jenkins? [13:09:29] to run the linting there and have them build automatically instead of mannually ? 
:-D [13:09:53] then people can send their patchset, receive a temp .deb as a result and test it out in labs :-] [13:11:55] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/47726 [13:12:44] New review: Faidon; "Why do we configure MediaWiki via puppet? We don't generally do that, please explain why Wikidata is..." [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/47585 [13:13:11] forgot to eat myself damn [13:13:11] thanks for the reminder [13:13:31] http://commons.wikimedia.org/wiki/File:2006_sardines_can_open.jpg miam [13:14:00] that's... scary [13:14:07] the automated building of debs [13:14:42] well this way we can have jenkins report about lintian failure in Gerrit [13:15:03] that is really the only things I wanted to achieve [13:19:05] New review: Faidon; "See inline for a few comments. Additional to that, debian/python-jsonschema.* and debian/python-json..." [operations/debs/python-jsonschema] (master); V: 0 C: 0; - https://gerrit.wikimedia.org/r/47662 [13:21:51] New review: Faidon; "Any reason to have a separate class for "systemuser" and reference that directly? Maybe just put it ..." [operations/puppet] (production); V: 0 C: -1; - https://gerrit.wikimedia.org/r/47665 [13:22:01] PROBLEM - MySQL Slave Delay on db53 is CRITICAL: CRIT replication delay 203 seconds [13:22:18] PROBLEM - MySQL Replication Heartbeat on db53 is CRITICAL: CRIT replication delay 208 seconds [13:22:40] <^demon> paravoid: Speaking of debs...I think some hands-on training in how to build debs would be useful (and I suspect I'm not alone). [13:22:51] <^demon> I've tried reading the docs before but they're not very user-friendly. [13:22:57] <^demon> (various docs, for that matter) [13:22:59] maybe we can set up a workshop while we are all in SF ? [13:23:33] I guess you're right [13:23:43] although we have a pretty busy schedule planned for those days [13:23:50] how long are you staying? 
[13:23:51] <^demon> I know, and it doesn't have to be that week. [13:23:51] but it seems it might be needed, that's true [13:24:17] I'll be there for three weeks, starting from the next one [13:24:19] I am there from Sunday 24th feb till Sat 9th march [13:24:50] oh heh [13:25:00] <^demon> I'm on the same schedule as hashar. [13:25:31] until this happens though, feel free to ask me and/or put me as a reviewer [13:25:41] and I'll happily provide feedback [13:26:07] well we better schedule that in advance if we want it to happens :-] [13:26:30] so [13:26:36] I just reviewed two different debs [13:26:42] and they apparently are for the exact same purpose [13:26:42] cause I suspect we will all be very busy and that might be hard to all gather at some place [13:26:57] https://gerrit.wikimedia.org/r/#/c/44408/ [13:26:59] https://gerrit.wikimedia.org/r/#/c/47662/ [13:27:13] oh [13:27:19] ori is packaging too :-] [13:28:23] which highlight how I skipped "git-buildpackage" from your review [13:28:24] doh [13:28:29] need to find out the doc for that one [13:30:02] New review: Hashar; "in site.pp we have:" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/47665 [13:30:06] RECOVERY - MySQL Replication Heartbeat on db32 is OK: OK replication delay 0 seconds [13:30:24] RECOVERY - MySQL Slave Delay on db32 is OK: OK replication delay 0 seconds [13:30:38] New review: Faidon; "This has nothing to do with packaging but also on the Gerrit queue is Antoine's packaging of Voluptu..." [operations/debs/python-jsonschema] (master); V: 0 C: -1; - https://gerrit.wikimedia.org/r/47662 [13:31:51] hmmm? system user with /bin/bash for a shell? [13:32:03] is that really needed here? 
[13:32:06] (sometimes it is) [13:32:10] cant remember [13:32:17] i merely copy pasted it [13:32:40] we can move it back to sillyshell / /bin/false [13:32:48] and put back bash later on if that is needed [13:32:54] /bin/false is fine [13:32:58] that or a FIXME [13:33:00] amending [13:33:18] sorry, I guess I'm too strict of a reviewer... :) [13:33:22] na it is ok [13:33:32] strictness is cool :-] [13:33:37] that ensure we don't produce crap [13:33:46] considering I found a few vulnerabilities today I'm not feeling bad [13:33:48] <^demon> Don't use sillyshell for anything except svn. [13:33:54] <^demon> It's not meant for anything but svn. [13:34:06] nod [13:34:12] I just read sillyshell today [13:34:34] <^demon> sillyshell is silly ;-) [13:34:36] New patchset: Hashar; "cleanout testswarm from the manifests" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/47665 [13:34:55] New review: Hashar; "cleanup: made the shell /bin/false" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/47665 [13:35:02] paravoid: ^^^^ [13:35:22] New review: Faidon; "That's a fair reason. Maybe we can simplify it when it gets complete." [operations/puppet] (production); V: 0 C: 2; - https://gerrit.wikimedia.org/r/47665 [13:35:24] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/47665 [13:35:45] goood [13:36:03] contint.pp is slightly lighter now :-) [13:36:32] Ryan_Lane: merging libpam-ldapd on sockpuppet btw [13:37:30] wanna look at my other puppet changes? :-] [13:38:14] Change abandoned: Hashar; "yeah this is being split in smaller modules." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43429 [13:41:00] yes [13:41:02] which ones? 
[13:44:00] so yesterday [13:44:07] we rejected the wikimedia module :-] [13:44:18] I started extracting stuff out of misc/contint.pp to some new modules [13:44:38] Change abandoned: Hashar; "(no reason)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/43420 [13:45:08] paravoid: https://gerrit.wikimedia.org/r/#/c/47663/ that moves ton of packages definitions to the new "contint" modules [13:45:19] that merely list random packages we need for jenkins jobs [13:45:23] such as php-* rake .. [13:45:38] I haven't moved everything though cause some packages are a bit scary and I have no idea where to move them [13:46:57] I think I will try to make several small changes, I guess that will be easier to review / apply [13:48:35] New patchset: Demon; "Install git on all servers" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/37247 [13:51:31] RECOVERY - MySQL Replication Heartbeat on db53 is OK: OK replication delay 0 seconds [13:51:40] RECOVERY - MySQL Slave Delay on db53 is OK: OK replication delay 0 seconds [13:52:18] <^demon> paravoid: I cleaned up that git change ^. Now puts it in $packages like you suggested. [13:54:26] sec [13:57:19] ^demon: that wasn't the reason I didn't merge it, but okay :) [13:57:25] Change merged: Faidon; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/37247 [13:58:15] New patchset: Tpt; "(bug 44032) Deploy Universal Language Selector to oldwikiource" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/47732 [13:58:27] hashar: so... 
[13:58:50] paravoid: I am not sure if you prefer ton of small patchets [13:59:04] paravoid: or just a massive refactoring one that split misc/contint.pp to various small modules [14:02:35] wikidata seems down :( [14:02:38] PROBLEM - Apache HTTP on mw1078 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:02:38] PROBLEM - Apache HTTP on mw1109 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:02:38] PROBLEM - Apache HTTP on mw1066 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:02:47] PROBLEM - Apache HTTP on mw1029 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:02:47] PROBLEM - Apache HTTP on mw1033 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:02:47] PROBLEM - Apache HTTP on mw1045 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:02:47] PROBLEM - Apache HTTP on mw1025 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:02:47] PROBLEM - Apache HTTP on mw1021 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:02:47] PROBLEM - Apache HTTP on mw1037 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:02:47] PROBLEM - Apache HTTP on mw1061 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:02:48] PROBLEM - Apache HTTP on mw1053 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:02:48] PROBLEM - Apache HTTP on mw1069 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:02:49] PROBLEM - Apache HTTP on mw1059 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:02:49] yeah [14:02:53] well maybe more [14:02:55] PROBLEM - Apache HTTP on mw1057 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:02:56] PROBLEM - Apache HTTP on mw1080 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:02:56] PROBLEM - Apache HTTP on mw1111 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:02:56] PROBLEM - Apache HTTP on mw1065 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:02:56] PROBLEM - Apache HTTP on mw1077 is CRITICAL: 
CRITICAL - Socket timeout after 10 seconds [14:02:56] PROBLEM - Apache HTTP on mw1081 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:03:04] PROBLEM - Apache HTTP on mw1097 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:03:04] PROBLEM - Apache HTTP on mw1044 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:03:04] PROBLEM - Apache HTTP on mw1052 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:03:04] PROBLEM - Apache HTTP on mw1068 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:03:04] PROBLEM - Apache HTTP on mw1020 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:03:05] PROBLEM - Apache HTTP on mw1102 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:03:05] PROBLEM - Apache HTTP on mw1017 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:03:06] PROBLEM - Apache HTTP on mw1088 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:03:06] PROBLEM - Apache HTTP on mw1036 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:03:07] PROBLEM - Apache HTTP on mw1076 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:03:07] PROBLEM - Apache HTTP on mw1048 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:03:08] PROBLEM - Apache HTTP on mw1032 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:03:08] PROBLEM - Apache HTTP on mw1060 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:03:13] PROBLEM - Apache HTTP on mw1103 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:03:22] panic! 
[14:03:22] PROBLEM - Apache HTTP on mw1107 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:03:23] PROBLEM - Apache HTTP on mw1092 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:03:23] PROBLEM - Apache HTTP on mw1084 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:03:31] PROBLEM - Apache HTTP on mw1024 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:03:32] PROBLEM - Apache HTTP on mw1110 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:03:32] PROBLEM - Apache HTTP on mw1073 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:03:32] what the hell [14:03:41] PROBLEM - Apache HTTP on mw1035 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:03:41] PROBLEM - Apache HTTP on mw1091 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:03:41] PROBLEM - Apache HTTP on mw1099 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:03:41] PROBLEM - Apache HTTP on mw1087 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:03:49] PROBLEM - Apache HTTP on mw1105 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:03:49] PROBLEM - Apache HTTP on mw1051 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:03:58] PROBLEM - Apache HTTP on mw1042 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:04:07] PROBLEM - Apache HTTP on mw1075 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:04:07] PROBLEM - Apache HTTP on mw1062 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:04:16] PROBLEM - Apache HTTP on mw1108 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:04:21] at least one of them has apaches running [14:04:24] might just be overloaded [14:04:25] RECOVERY - Apache HTTP on mw1033 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.096 second response time [14:04:26] RECOVERY - Apache HTTP on mw1109 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 9.407 second response time [14:04:26] RECOVERY - Apache HTTP on mw1025 is OK: HTTP OK 
- HTTP/1.1 301 Moved Permanently - 3.072 second response time [14:04:26] RECOVERY - Apache HTTP on mw1029 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 7.632 second response time [14:04:34] RECOVERY - Apache HTTP on mw1065 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.109 second response time [14:04:34] PROBLEM - Apache HTTP on mw1049 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:04:43] RECOVERY - Apache HTTP on mw1020 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.106 second response time [14:04:44] RECOVERY - Apache HTTP on mw1068 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.110 second response time [14:04:44] RECOVERY - Apache HTTP on mw1102 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.131 second response time [14:04:44] RECOVERY - Apache HTTP on mw1036 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 5.379 second response time [14:04:52] PROBLEM - Apache HTTP on mw1106 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:05:01] RECOVERY - Apache HTTP on mw1107 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.111 second response time [14:05:01] RECOVERY - Apache HTTP on mw1092 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.123 second response time [14:05:01] RECOVERY - Apache HTTP on mw1084 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.094 second response time [14:05:10] RECOVERY - Apache HTTP on mw1110 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.099 second response time [14:05:11] RECOVERY - Apache HTTP on mw1073 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.106 second response time [14:05:20] RECOVERY - Apache HTTP on mw1024 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 8.825 second response time [14:05:20] RECOVERY - Apache HTTP on mw1035 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.102 second response time [14:05:20] RECOVERY - Apache HTTP on mw1091 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.102 second response time [14:05:20] RECOVERY - Apache HTTP on mw1087 
is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.099 second response time [14:05:20] RECOVERY - Apache HTTP on mw1099 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.111 second response time [14:05:29] RECOVERY - Apache HTTP on mw1105 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.105 second response time [14:05:29] RECOVERY - Apache HTTP on mw1051 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 3.506 second response time [14:05:31] seems to be better now [14:05:37] RECOVERY - Apache HTTP on mw1042 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 5.042 second response time [14:06:04] RECOVERY - Apache HTTP on mw1078 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.101 second response time [14:06:05] RECOVERY - Apache HTTP on mw1066 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.109 second response time [14:06:13] RECOVERY - Apache HTTP on mw1059 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.103 second response time [14:06:14] RECOVERY - Apache HTTP on mw1037 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.093 second response time [14:06:14] RECOVERY - Apache HTTP on mw1045 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.099 second response time [14:06:14] RECOVERY - Apache HTTP on mw1021 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.097 second response time [14:06:14] RECOVERY - Apache HTTP on mw1049 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.115 second response time [14:06:14] RECOVERY - Apache HTTP on mw1069 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.093 second response time [14:06:14] RECOVERY - Apache HTTP on mw1053 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 1.855 second response time [14:06:14] maybe a rack switch in eqiad [14:06:15] RECOVERY - Apache HTTP on mw1061 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 3.742 second response time [14:06:22] RECOVERY - Apache HTTP on mw1081 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.116 second response time [14:06:22] RECOVERY - Apache HTTP 
on mw1057 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.116 second response time [14:06:22] RECOVERY - Apache HTTP on mw1077 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 6.400 second response time [14:06:31] RECOVERY - Apache HTTP on mw1097 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.110 second response time [14:06:32] RECOVERY - Apache HTTP on mw1048 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.094 second response time [14:06:32] RECOVERY - Apache HTTP on mw1017 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.112 second response time [14:06:32] RECOVERY - Apache HTTP on mw1076 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.099 second response time [14:06:32] RECOVERY - Apache HTTP on mw1032 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.103 second response time [14:06:32] RECOVERY - Apache HTTP on mw1106 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 3.456 second response time [14:06:32] RECOVERY - Apache HTTP on mw1088 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 3.509 second response time [14:06:33] RECOVERY - Apache HTTP on mw1044 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 3.768 second response time [14:06:33] RECOVERY - Apache HTTP on mw1060 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 5.440 second response time [14:06:34] RECOVERY - Apache HTTP on mw1052 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 7.603 second response time [14:07:34] RECOVERY - Apache HTTP on mw1062 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.108 second response time [14:07:34] RECOVERY - Apache HTTP on mw1075 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 5.300 second response time [14:07:43] RECOVERY - Apache HTTP on mw1108 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.721 second response time [14:08:10] RECOVERY - Apache HTTP on mw1080 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 0.110 second response time [14:08:29] RECOVERY - Apache HTTP on mw1103 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 
0.111 second response time [14:09:31] PROBLEM - Apache HTTP on mw1026 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:09:58] RECOVERY - Apache HTTP on mw1111 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 8.194 second response time [14:11:11] RECOVERY - Apache HTTP on mw1026 is OK: HTTP OK - HTTP/1.1 301 Moved Permanently - 1.736 second response time [14:14:02] New patchset: Silke Meyer; "Definition of a function that gets MW extensions with less code" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/46809 [14:14:33] db1027's mysql was killed by OOM [14:15:38] but that was 15' before the alerts, hmm [14:20:49] paravoid: are you investigating that more or can we proceed on the puppet cleanup I have sent ? :-∆ [14:20:50] New review: Demon; "I really don't think this is necessary anymore. Can we just abandon this?" [operations/puppet] (production) C: 0; - https://gerrit.wikimedia.org/r/8120 [14:20:51] ∆∆∆∆ [14:20:52] ho [14:37:10] hashar: go ahead :) [14:37:26] just noticed you reviewed one of the changes ;-) [14:37:33] refactoring [14:44:33] New patchset: Hashar; "move contint packages under a submodule" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/47663 [14:44:44] New patchset: Hashar; "Jenkins module created out of contint manifests" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/47664 [14:44:49] bah [14:46:33] New patchset: Hashar; "move contint packages under a submodule" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/47663 [14:47:21] New review: Hashar; "rebased" [operations/puppet] (production); V: 0 C: 0; - https://gerrit.wikimedia.org/r/47663 [14:47:47] paravoid: updated https://gerrit.wikimedia.org/r/#/c/47663/1..3/modules/contint/manifests/packages.pp,unified [14:48:07] New patchset: Hashar; "Jenkins module created out of contint manifests" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/47664 [14:50:28] what about the generic:: and misc:: includes? 
[14:51:09] looking at the jenkins one too [14:51:17] this starts to feel much cleaner, doesn't it? [14:51:53] well the generic::packages::ant18 I have done that explicitly for others to use it if needed [14:51:59] I can clean it out in another change [14:52:12] the misc::irc::wikibugs::packages , I am not sure what you mean :-] [14:52:24] that is used by a job that validates wikibugs (a perl script) [14:52:30] which has its own puppet class [14:52:41] so instead of copy pasting the list of dependencies, I am just including them [14:52:47] save us from some code duplication [14:53:24] ideally we shouldn't have modules depending on manifests at all [14:53:36] until we get there, sure, we can do that [14:54:49] yeah in an ideal world :-] [14:55:01] hashar: for jenkins: use multiline definitions everywhere, don't use compression (i.e. multiple files in a file { } stanza) [14:55:06] I can move the wikibugs stuff to a wikibugs module :) [14:55:08] we haven't been consistent *at all* for those [14:55:17] but since you're cleaning up and adhering to the style guide more or less [14:55:26] let's do that too if you don't mind [14:55:47] oh and same thing about /bin/bash for jenkins [14:55:48] so one file{} per file ? [14:55:48] :) [14:55:58] http://docs.puppetlabs.com/guides/style_guide.html 9.4 Compression [14:56:11]