[00:31:09] <^d> Ryan_Lane: What's the command for "users in 'foo' ldap group" again?
[11:53:53] !ping
[11:53:53] pong
[13:40:52] 09/13/2012 - 13:40:52 - Updating keys for btongminh at /export/keys/btongminh
[14:47:28] Change on mediawiki a page Developer access was modified, changed by Dneary link https://www.mediawiki.org/w/index.php?diff=582918 edit summary:
[14:51:36] Damianz what's status of nagios
[14:52:15] Damianz what's up with free ram check
[14:52:21] it returns some crap
[14:52:32] It should be mostly working afaik at the moment
[14:52:46] it returns number
[14:52:48] there's still a bunch of broken ram checks on hosts that run local puppetmasters that havn't updated their repos
[14:52:52] it used to be %
[14:53:00] which is much better
[14:53:08] should be % or both
[14:53:19] you don't even know how large is ram from nagios
[14:53:31] that number could be anything and you can't know if it's ok or not
[14:53:33] dunno, that file was in the repo I just told puppet to install it... % would be better though
[14:53:39] hmm
[14:53:48] whatever is this, it's not written by me
[14:54:15] Dunno, I could go find the version you wrote and push it into prod I guess
[14:54:37] 10796 root 20 0 156m 30m 2552 R 98.4 1.5 5:05.60 glusterf :D
[14:54:47] LOL
[14:54:48] teh driver is crap
[14:54:58] 5h of CPU time
[14:55:06] it eats ton of cpu
[14:55:17] Hmm
[14:55:29] 98.4 cpu use
[14:55:31] %
[14:55:33] meh
[14:56:20] I liked when nagios told me about problem 2 months before actually finding it
[14:56:42] reason why I found it, was that bunch of people on enwiki complained
[14:56:49] for long time
[14:57:12] We really need better graphing <> monitoring stuff, could easily pick up trends
[14:57:18] nagios is good
[14:57:21] we just need to fix it
[14:57:43] Hmm wtf is the puppet check not updated for some hosts
[14:57:52] I don't know
[14:57:57] It needs quite a lot of work on some bits
[14:58:08] it would make it easier if I had global shell access
[14:58:15] so that I could check it on instances
[14:58:24] now we can only make a ticket in bz and wait
[14:59:17] svc: failed to register lockdv1 RPC service < Oh I loe nfs
[14:59:25] where
[14:59:42] nagios
[14:59:59] there is no nfs
[15:00:15] oh wait
[15:00:16] there is :D
[15:00:20] home dir
[15:00:22] mhm
[15:00:28] really need to move homedirs
[15:00:32] haha
[15:00:59] labs-nagios-wm is bleh again also
[15:02:22] !log nagios petrb: restarting bot
[15:03:00] !log nagios petrb: restarting bot
[15:03:44] lol
[15:03:46] another bot died
[15:03:53] <3 c# :D
[15:03:57] sigh
[15:04:00] wm-bot has uptime like month
[15:04:11] wm-bot also randomly doesn't respond :P
[15:04:16] these python bots crashes everytime you need them
[15:04:20] Damianz don't believe it
[15:04:23] !ping
[15:04:23] pong
[15:04:30] Only caus ethey're written shittly
[15:04:31] it always respond
[15:04:56] wm-bot: talk!!
[15:04:56] Hi petan, there is some error, I am a stupid bot and I am not intelligent enough to hold a conversation with you :-)
[15:05:29] Damianz but there was ipv6 bug
[15:05:35] so maybe if you were using ipv6 it didn't see you
[15:05:42] it's fixed like 1 month
[15:05:55] possibly
[15:09:45] RECOVERY Free ram is now: OK on bots-sql2 i-000000af output: 844012
[15:11:05] RECOVERY Disk Space is now: OK on hugglewiki i-000000aa output: DISK OK
[15:11:57] ARGH
[15:11:58] I see
[15:13:45] PROBLEM host: deployment-backup is DOWN address: i-000000f8 CRITICAL - Host Unreachable (i-000000f8)
[15:14:55] PROBLEM host: analytics is DOWN address: i-000000e2 CRITICAL - Host Unreachable (i-000000e2)
[15:15:18] * Damianz kicks the fuck out of nagios
[15:15:35] PROBLEM host: configtest-main is DOWN address: i-000002dd CRITICAL - Host Unreachable (i-000002dd)
[15:15:35] PROBLEM host: wep is DOWN address: i-000000c2 CRITICAL - Host Unreachable (i-000000c2)
[15:15:57] !log nagios reset rw owernship to snmptt so the puppet check can be entered - probably should switch this to a setuid'd binary or setup group memberships properly.
[15:15:58] Logged the message, Master [15:20:10] Change on 12mediawiki a page Developer access was modified, changed by Jeremyb link https://www.mediawiki.org/w/index.php?diff=582942 edit summary: /* User:Dneary */ done [15:22:05] PROBLEM host: mobile-feeds is DOWN address: i-000000c1 CRITICAL - Host Unreachable (i-000000c1) [15:23:15] PROBLEM host: conventionextension-test is DOWN address: i-000003c0 CRITICAL - Host Unreachable (i-000003c0) [15:23:25] PROBLEM host: lynwood is DOWN address: i-000003e5 PING CRITICAL - Packet loss = 100% [15:27:15] PROBLEM Total Processes is now: WARNING on wikistats-01 i-00000042 output: PROCS WARNING: 178 processes [15:27:35] PROBLEM host: pageviews is DOWN address: i-000000b2 CRITICAL - Host Unreachable (i-000000b2) [15:28:15] PROBLEM Puppet freshness is now: CRITICAL on gerrit-build i-000003f5 output: Puppet has not run in the last 10 hours [15:28:15] PROBLEM Puppet freshness is now: CRITICAL on robh-mingle i-000003f6 output: Puppet has not run in the last 10 hours [15:28:15] PROBLEM host: deployment-feed is DOWN address: i-00000118 CRITICAL - Host Unreachable (i-00000118) [15:32:05] PROBLEM host: mobile-wlm2 is DOWN address: i-0000038e CRITICAL - Host Unreachable (i-0000038e) [15:34:05] PROBLEM HTTP is now: WARNING on incubator-apache i-00000211 output: HTTP WARNING: HTTP/1.1 404 Not Found - 477 bytes in 0.004 second response time [15:41:55] RECOVERY Puppet freshness is now: OK on bots-1 i-000000a9 output: puppet ran at Thu Sep 13 15:41:44 UTC 2012 [15:41:57] woow fixed [15:42:05] RECOVERY Puppet freshness is now: OK on mediawiki-bugfix-kozuch i-000003e4 output: puppet ran at Thu Sep 13 15:42:01 UTC 2012 [15:42:25] RECOVERY Puppet freshness is now: OK on bots-sql2 i-000000af output: puppet ran at Thu Sep 13 15:42:12 UTC 2012 [15:42:35] RECOVERY Puppet freshness is now: OK on vumi i-000001e5 output: puppet ran at Thu Sep 13 15:42:23 UTC 2012 [15:42:35] RECOVERY Puppet freshness is now: OK on demo-deployment1 i-00000276 output: 
puppet ran at Thu Sep 13 15:42:31 UTC 2012 [15:42:35] !log nagios fixed snmp trap config [15:42:36] Logged the message, Master [15:43:05] RECOVERY Puppet freshness is now: OK on otrs-jgreen i-0000015a output: puppet ran at Thu Sep 13 15:43:03 UTC 2012 [15:43:25] RECOVERY Puppet freshness is now: OK on wep2 i-000003ad output: puppet ran at Thu Sep 13 15:43:12 UTC 2012 [15:43:25] RECOVERY Puppet freshness is now: OK on exim-test i-00000265 output: puppet ran at Thu Sep 13 15:43:17 UTC 2012 [15:43:35] RECOVERY Puppet freshness is now: OK on robh2-mingle i-000003f8 output: puppet ran at Thu Sep 13 15:43:27 UTC 2012 [15:43:35] RECOVERY Puppet freshness is now: OK on bots-2 i-0000009c output: puppet ran at Thu Sep 13 15:43:28 UTC 2012 [15:43:45] PROBLEM host: deployment-backup is DOWN address: i-000000f8 CRITICAL - Host Unreachable (i-000000f8) [15:44:25] RECOVERY Puppet freshness is now: OK on nova-precise1 i-00000236 output: puppet ran at Thu Sep 13 15:44:09 UTC 2012 [15:44:35] RECOVERY Puppet freshness is now: OK on log1 i-00000239 output: puppet ran at Thu Sep 13 15:44:25 UTC 2012 [15:44:35] RECOVERY Puppet freshness is now: OK on building i-0000014d output: puppet ran at Thu Sep 13 15:44:34 UTC 2012 [15:44:55] RECOVERY Puppet freshness is now: OK on robh2 i-000001a2 output: puppet ran at Thu Sep 13 15:44:35 UTC 2012 [15:44:55] RECOVERY Puppet freshness is now: OK on cn-wiki-db-lucid i-00000241 output: puppet ran at Thu Sep 13 15:44:37 UTC 2012 [15:44:55] RECOVERY Puppet freshness is now: OK on su-aux1 i-000002ea output: puppet ran at Thu Sep 13 15:44:42 UTC 2012 [15:44:55] RECOVERY Puppet freshness is now: OK on php-packaging i-000003ae output: puppet ran at Thu Sep 13 15:44:47 UTC 2012 [15:44:55] PROBLEM host: analytics is DOWN address: i-000000e2 CRITICAL - Host Unreachable (i-000000e2) [15:45:05] RECOVERY Puppet freshness is now: OK on deployment-sql i-000000d0 output: puppet ran at Thu Sep 13 15:44:57 UTC 2012 [15:45:05] RECOVERY Puppet freshness is now: OK on 
scribunto i-0000022c output: puppet ran at Thu Sep 13 15:45:00 UTC 2012 [15:45:05] RECOVERY Puppet freshness is now: OK on secondinstance i-0000015b output: puppet ran at Thu Sep 13 15:45:02 UTC 2012 [15:45:25] RECOVERY Puppet freshness is now: OK on bots-sql3 i-000000b4 output: puppet ran at Thu Sep 13 15:45:14 UTC 2012 [15:45:25] RECOVERY Puppet freshness is now: OK on nginx-ffuqua-doom1-3 i-00000196 output: puppet ran at Thu Sep 13 15:45:17 UTC 2012 [15:45:35] PROBLEM host: configtest-main is DOWN address: i-000002dd CRITICAL - Host Unreachable (i-000002dd) [15:45:35] PROBLEM host: wep is DOWN address: i-000000c2 CRITICAL - Host Unreachable (i-000000c2) [15:46:05] RECOVERY Puppet freshness is now: OK on dumps-bot2 i-000003f4 output: puppet ran at Thu Sep 13 15:45:52 UTC 2012 [15:46:05] RECOVERY Puppet freshness is now: OK on kripke i-00000268 output: puppet ran at Thu Sep 13 15:45:53 UTC 2012 [15:46:05] RECOVERY Puppet freshness is now: OK on bots-dev i-00000190 output: puppet ran at Thu Sep 13 15:45:55 UTC 2012 [15:46:25] RECOVERY Puppet freshness is now: OK on memcache-puppet i-00000153 output: puppet ran at Thu Sep 13 15:46:06 UTC 2012 [15:46:25] RECOVERY Puppet freshness is now: OK on swift-aux1 i-0000024b output: puppet ran at Thu Sep 13 15:46:07 UTC 2012 [15:46:35] RECOVERY Puppet freshness is now: OK on wikiversity-sandbox-frontend i-0000033a output: puppet ran at Thu Sep 13 15:46:22 UTC 2012 [15:46:55] RECOVERY Puppet freshness is now: OK on ganglia-test2 i-00000250 output: puppet ran at Thu Sep 13 15:46:43 UTC 2012 [15:47:05] RECOVERY Puppet freshness is now: OK on tutorial-puppet2 i-000003fe output: puppet ran at Thu Sep 13 15:46:58 UTC 2012 [15:47:25] RECOVERY Puppet freshness is now: OK on wikistats-history-01 i-000002e2 output: puppet ran at Thu Sep 13 15:47:11 UTC 2012 [15:47:25] RECOVERY Puppet freshness is now: OK on mediawiki-dev-1 i-0000039c output: puppet ran at Thu Sep 13 15:47:13 UTC 2012 [15:47:55] RECOVERY Puppet freshness is now: OK on 
labs-nfs1 i-0000005d output: puppet ran at Thu Sep 13 15:47:39 UTC 2012 [15:48:05] RECOVERY Puppet freshness is now: OK on varnish-precise i-00000311 output: puppet ran at Thu Sep 13 15:47:58 UTC 2012 [15:48:05] RECOVERY Puppet freshness is now: OK on syslogcol-srv i-000003a9 output: puppet ran at Thu Sep 13 15:48:03 UTC 2012 [15:48:25] RECOVERY Puppet freshness is now: OK on wikistream-1 i-0000016e output: puppet ran at Thu Sep 13 15:48:09 UTC 2012 [15:48:55] RECOVERY Puppet freshness is now: OK on bots-4 i-000000e8 output: puppet ran at Thu Sep 13 15:48:43 UTC 2012 [15:48:55] RECOVERY Puppet freshness is now: OK on bots-labs i-0000015e output: puppet ran at Thu Sep 13 15:48:48 UTC 2012 [15:49:05] RECOVERY Puppet freshness is now: OK on mingledbtest i-00000283 output: puppet ran at Thu Sep 13 15:48:52 UTC 2012 [15:49:05] RECOVERY Puppet freshness is now: OK on hume i-000003cc output: puppet ran at Thu Sep 13 15:48:53 UTC 2012 [15:49:05] RECOVERY Puppet freshness is now: OK on signwriting-ase11 i-000003dc output: puppet ran at Thu Sep 13 15:49:04 UTC 2012 [15:49:25] RECOVERY Puppet freshness is now: OK on paste1 i-000003fa output: puppet ran at Thu Sep 13 15:49:12 UTC 2012 [15:49:55] RECOVERY Puppet freshness is now: OK on worker1 i-00000208 output: puppet ran at Thu Sep 13 15:49:35 UTC 2012 [15:50:05] RECOVERY Puppet freshness is now: OK on phabricator i-000003a2 output: puppet ran at Thu Sep 13 15:49:58 UTC 2012 [15:50:25] RECOVERY Puppet freshness is now: OK on util-abogott i-00000386 output: puppet ran at Thu Sep 13 15:50:17 UTC 2012 [15:50:35] RECOVERY Puppet freshness is now: OK on sultest2 i-00000330 output: puppet ran at Thu Sep 13 15:50:28 UTC 2012 [15:50:55] RECOVERY Puppet freshness is now: OK on opengrok-web i-000001e1 output: puppet ran at Thu Sep 13 15:50:39 UTC 2012 [15:50:55] RECOVERY Puppet freshness is now: OK on publicdata-administration i-0000019e output: puppet ran at Thu Sep 13 15:50:46 UTC 2012 [15:52:05] RECOVERY Puppet freshness is now: OK 
on dumps-1 i-00000355 output: puppet ran at Thu Sep 13 15:51:51 UTC 2012 [15:52:05] RECOVERY Puppet freshness is now: OK on migration1 i-00000261 output: puppet ran at Thu Sep 13 15:51:59 UTC 2012 [15:52:05] PROBLEM host: mobile-feeds is DOWN address: i-000000c1 CRITICAL - Host Unreachable (i-000000c1) [15:52:39] RECOVERY Puppet freshness is now: OK on incubator-sql i-000003f3 output: puppet ran at Thu Sep 13 15:52:06 UTC 2012 [15:52:39] RECOVERY Puppet freshness is now: OK on echo-xmpp i-00000351 output: puppet ran at Thu Sep 13 15:52:07 UTC 2012 [15:52:39] RECOVERY Puppet freshness is now: OK on wikibits-mysql i-00000341 output: puppet ran at Thu Sep 13 15:52:27 UTC 2012 [15:53:09] RECOVERY Puppet freshness is now: OK on ee-prototype i-0000013d output: puppet ran at Thu Sep 13 15:52:54 UTC 2012 [15:53:09] RECOVERY Puppet freshness is now: OK on integration-jenkins i-00000363 output: puppet ran at Thu Sep 13 15:52:54 UTC 2012 [15:53:09] RECOVERY Puppet freshness is now: OK on sube i-000003d0 output: puppet ran at Thu Sep 13 15:53:00 UTC 2012 [15:53:09] RECOVERY Puppet freshness is now: OK on simplewikt i-00000149 output: puppet ran at Thu Sep 13 15:53:02 UTC 2012 [15:53:19] RECOVERY Puppet freshness is now: OK on wep3 i-000003af output: puppet ran at Thu Sep 13 15:53:07 UTC 2012 [15:53:19] RECOVERY Puppet freshness is now: OK on bots-nfs i-000000b1 output: puppet ran at Thu Sep 13 15:53:12 UTC 2012 [15:53:19] RECOVERY Puppet freshness is now: OK on en-wiki-db-lucid i-0000023b output: puppet ran at Thu Sep 13 15:53:13 UTC 2012 [15:53:19] RECOVERY Puppet freshness is now: OK on queue-wiki1 i-000002b8 output: puppet ran at Thu Sep 13 15:53:13 UTC 2012 [15:53:19] RECOVERY Puppet freshness is now: OK on puppet-tutorial1 i-000003fd output: puppet ran at Thu Sep 13 15:53:15 UTC 2012 [15:53:28] !log nagios fixed parser fetch so it rebuilds when the old file is missing [15:53:29] PROBLEM host: lynwood is DOWN address: i-000003e5 PING CRITICAL - Packet loss = 100% 
[15:53:29] Logged the message, Master [15:53:29] PROBLEM host: conventionextension-test is DOWN address: i-000003c0 CRITICAL - Host Unreachable (i-000003c0) [15:53:39] RECOVERY Puppet freshness is now: OK on webserver-lcarr i-00000134 output: puppet ran at Thu Sep 13 15:53:33 UTC 2012 [15:53:49] RECOVERY Puppet freshness is now: OK on demo-mysql1 i-00000256 output: puppet ran at Thu Sep 13 15:53:41 UTC 2012 [15:54:39] RECOVERY Puppet freshness is now: OK on demo-deployment2 i-000003fb output: puppet ran at Thu Sep 13 15:54:23 UTC 2012 [15:55:44] RECOVERY Puppet freshness is now: OK on testing-arky i-0000033b output: puppet ran at Thu Sep 13 15:55:17 UTC 2012 [15:55:44] RECOVERY Puppet freshness is now: OK on maps-test-osm2pgsql2 i-00000375 output: puppet ran at Thu Sep 13 15:55:27 UTC 2012 [15:55:44] RECOVERY Puppet freshness is now: OK on hugglewa-1 i-000001e0 output: puppet ran at Thu Sep 13 15:55:28 UTC 2012 [15:55:44] RECOVERY Puppet freshness is now: OK on robh-mingle i-000003f6 output: puppet ran at Thu Sep 13 15:55:29 UTC 2012 [15:55:54] RECOVERY Puppet freshness is now: OK on search1 i-000003f7 output: puppet ran at Thu Sep 13 15:55:41 UTC 2012 [15:55:54] RECOVERY Puppet freshness is now: OK on firstinstance i-0000013e output: puppet ran at Thu Sep 13 15:55:45 UTC 2012 [15:56:14] RECOVERY Puppet freshness is now: OK on cvn-apache2 i-00000339 output: puppet ran at Thu Sep 13 15:56:01 UTC 2012 [15:56:44] RECOVERY Puppet freshness is now: OK on reportcard2 i-000001ea output: puppet ran at Thu Sep 13 15:56:36 UTC 2012 [15:57:44] PROBLEM host: pageviews is DOWN address: i-000000b2 CRITICAL - Host Unreachable (i-000000b2) [15:57:54] RECOVERY Puppet freshness is now: OK on syslogcol-ab i-0000035e output: puppet ran at Thu Sep 13 15:57:46 UTC 2012 [15:57:54] RECOVERY Puppet freshness is now: OK on rds i-00000207 output: puppet ran at Thu Sep 13 15:57:49 UTC 2012 [15:57:54] RECOVERY Puppet freshness is now: OK on bots-cb i-0000009e output: puppet ran at Thu Sep 13 
15:57:50 UTC 2012 [15:58:24] RECOVERY Puppet freshness is now: OK on wikisource-web i-000000fe output: puppet ran at Thu Sep 13 15:58:15 UTC 2012 [15:58:34] PROBLEM host: deployment-feed is DOWN address: i-00000118 CRITICAL - Host Unreachable (i-00000118) [15:58:54] RECOVERY Puppet freshness is now: OK on conventionextension-trial i-000003bf output: puppet ran at Thu Sep 13 15:58:39 UTC 2012 [15:58:54] RECOVERY Puppet freshness is now: OK on ubuntu1-pgehres i-000000fb output: puppet ran at Thu Sep 13 15:58:50 UTC 2012 [15:59:14] RECOVERY Puppet freshness is now: OK on signwriting-ase10 i-00000322 output: puppet ran at Thu Sep 13 15:58:54 UTC 2012 [15:59:14] RECOVERY Puppet freshness is now: OK on mailman-01 i-00000235 output: puppet ran at Thu Sep 13 15:59:01 UTC 2012 [15:59:54] RECOVERY Puppet freshness is now: OK on mobile-b2g i-000003b1 output: puppet ran at Thu Sep 13 15:59:42 UTC 2012 [15:59:54] RECOVERY Puppet freshness is now: OK on deployment-integration i-0000034a output: puppet ran at Thu Sep 13 15:59:46 UTC 2012 [16:00:24] RECOVERY Puppet freshness is now: OK on wikiversity-sandbox-test i-00000374 output: puppet ran at Thu Sep 13 16:00:14 UTC 2012 [16:00:44] RECOVERY Puppet freshness is now: OK on pediapress-ocg1 i-00000233 output: puppet ran at Thu Sep 13 16:00:28 UTC 2012 [16:00:54] RECOVERY Puppet freshness is now: OK on tutorial-mysql i-0000028b output: puppet ran at Thu Sep 13 16:00:39 UTC 2012 [16:00:54] RECOVERY Puppet freshness is now: OK on shop-analytics-main i-000001e6 output: puppet ran at Thu Sep 13 16:00:53 UTC 2012 [16:01:14] RECOVERY Puppet freshness is now: OK on varnish i-000001ac output: puppet ran at Thu Sep 13 16:01:07 UTC 2012 [16:01:24] RECOVERY Puppet freshness is now: OK on jawiki-demo i-000003cf output: puppet ran at Thu Sep 13 16:01:13 UTC 2012 [16:01:24] RECOVERY Puppet freshness is now: OK on deployment-apache32 i-0000031a output: puppet ran at Thu Sep 13 16:01:24 UTC 2012 [16:01:44] RECOVERY Puppet freshness is now: OK on 
bob i-0000012d output: puppet ran at Thu Sep 13 16:01:29 UTC 2012 [16:01:44] RECOVERY Puppet freshness is now: OK on patchtest i-000000f1 output: puppet ran at Thu Sep 13 16:01:37 UTC 2012 [16:01:54] RECOVERY Puppet freshness is now: OK on ipv6test1 i-00000282 output: puppet ran at Thu Sep 13 16:01:52 UTC 2012 [16:02:14] RECOVERY Puppet freshness is now: OK on upload-wizard i-0000021c output: puppet ran at Thu Sep 13 16:02:00 UTC 2012 [16:02:14] PROBLEM host: mobile-wlm2 is DOWN address: i-0000038e CRITICAL - Host Unreachable (i-0000038e) [16:02:44] RECOVERY Puppet freshness is now: OK on wmde-test i-000002ad output: puppet ran at Thu Sep 13 16:02:30 UTC 2012 [16:02:44] RECOVERY Puppet freshness is now: OK on greensmw1 i-0000032c output: puppet ran at Thu Sep 13 16:02:38 UTC 2012 [16:03:14] RECOVERY Puppet freshness is now: OK on asher1 i-0000003a output: puppet ran at Thu Sep 13 16:03:01 UTC 2012 [16:03:24] RECOVERY Puppet freshness is now: OK on pediapress-packager i-000001e4 output: puppet ran at Thu Sep 13 16:03:19 UTC 2012 [16:03:44] RECOVERY Puppet freshness is now: OK on wikiminiatlas i-0000038c output: puppet ran at Thu Sep 13 16:03:33 UTC 2012 [16:03:44] RECOVERY Puppet freshness is now: OK on incubator-apache i-00000211 output: puppet ran at Thu Sep 13 16:03:33 UTC 2012 [16:03:54] RECOVERY Puppet freshness is now: OK on feeds i-000000fa output: puppet ran at Thu Sep 13 16:03:43 UTC 2012 [16:03:54] RECOVERY Puppet freshness is now: OK on bastion1 i-000000ba output: puppet ran at Thu Sep 13 16:03:53 UTC 2012 [16:04:14] RECOVERY Puppet freshness is now: OK on labs-relay i-00000103 output: puppet ran at Thu Sep 13 16:04:03 UTC 2012 [16:04:44] RECOVERY Puppet freshness is now: OK on ve-nodejs i-00000245 output: puppet ran at Thu Sep 13 16:04:33 UTC 2012 [16:05:14] RECOVERY Puppet freshness is now: OK on gerrit-build i-000003f5 output: puppet ran at Thu Sep 13 16:04:55 UTC 2012 [16:05:14] RECOVERY Puppet freshness is now: OK on kubo i-000003dd output: puppet 
ran at Thu Sep 13 16:04:59 UTC 2012 [16:05:24] RECOVERY Puppet freshness is now: OK on fundraising-civicrm i-00000169 output: puppet ran at Thu Sep 13 16:05:10 UTC 2012 [16:05:44] RECOVERY Puppet freshness is now: OK on deployment-cache-upload04 i-00000357 output: puppet ran at Thu Sep 13 16:05:35 UTC 2012 [16:05:54] RECOVERY Puppet freshness is now: OK on nginx-dev1 i-000000f0 output: puppet ran at Thu Sep 13 16:05:50 UTC 2012 [16:05:54] RECOVERY Puppet freshness is now: OK on dumps-bot3 i-000003ef output: puppet ran at Thu Sep 13 16:05:52 UTC 2012 [16:05:54] RECOVERY Puppet freshness is now: OK on bots-sql1 i-000000b5 output: puppet ran at Thu Sep 13 16:05:53 UTC 2012 [16:06:14] RECOVERY Puppet freshness is now: OK on nova-dev3 i-000000e9 output: puppet ran at Thu Sep 13 16:05:58 UTC 2012 [16:06:54] RECOVERY Puppet freshness is now: OK on bastion-restricted1 i-0000019b output: puppet ran at Thu Sep 13 16:06:39 UTC 2012 [16:07:14] RECOVERY Puppet freshness is now: OK on gerrit-dev i-000003e3 output: puppet ran at Thu Sep 13 16:06:58 UTC 2012 [16:08:14] RECOVERY Puppet freshness is now: OK on pediapress-ocg2 i-00000234 output: puppet ran at Thu Sep 13 16:08:04 UTC 2012 [16:08:24] RECOVERY Puppet freshness is now: OK on master i-0000007a output: puppet ran at Thu Sep 13 16:08:10 UTC 2012 [16:08:24] RECOVERY Puppet freshness is now: OK on bots-3 i-000000e5 output: puppet ran at Thu Sep 13 16:08:11 UTC 2012 [16:08:24] RECOVERY Puppet freshness is now: OK on hugglewiki i-000000aa output: puppet ran at Thu Sep 13 16:08:12 UTC 2012 [16:08:24] RECOVERY Puppet freshness is now: OK on catsort-pub i-000001cc output: puppet ran at Thu Sep 13 16:08:21 UTC 2012 [16:08:24] RECOVERY Puppet freshness is now: OK on wikidata-dev-1 i-0000020c output: puppet ran at Thu Sep 13 16:08:24 UTC 2012 [16:09:14] RECOVERY Puppet freshness is now: OK on sultestdb i-0000032f output: puppet ran at Thu Sep 13 16:08:58 UTC 2012 [16:09:44] RECOVERY Puppet freshness is now: OK on nagios-main 
i-0000030d output: puppet ran at Thu Sep 13 16:09:27 UTC 2012 [16:09:44] RECOVERY Puppet freshness is now: OK on wikibits-apache i-00000342 output: puppet ran at Thu Sep 13 16:09:37 UTC 2012 [16:09:54] RECOVERY Puppet freshness is now: OK on mobile-testing i-00000271 output: puppet ran at Thu Sep 13 16:09:44 UTC 2012 [16:09:54] RECOVERY Puppet freshness is now: OK on syslogcol-ac i-00000362 output: puppet ran at Thu Sep 13 16:09:52 UTC 2012 [16:09:54] RECOVERY Puppet freshness is now: OK on apachemxetc i-00000348 output: puppet ran at Thu Sep 13 16:09:53 UTC 2012 [16:10:14] RECOVERY Puppet freshness is now: OK on follow01-dev i-000003c6 output: puppet ran at Thu Sep 13 16:09:58 UTC 2012 [16:10:24] RECOVERY Puppet freshness is now: OK on demo-web1 i-00000255 output: puppet ran at Thu Sep 13 16:10:10 UTC 2012 [16:10:24] RECOVERY Puppet freshness is now: OK on dumps-bot1 i-000003ed output: puppet ran at Thu Sep 13 16:10:17 UTC 2012 [16:10:44] RECOVERY Puppet freshness is now: OK on nova-osm-keystone i-00000359 output: puppet ran at Thu Sep 13 16:10:33 UTC 2012 [16:11:24] RECOVERY Puppet freshness is now: OK on outreacheval i-0000012e output: puppet ran at Thu Sep 13 16:11:11 UTC 2012 [16:11:44] RECOVERY Puppet freshness is now: OK on dumps-incr i-0000035d output: puppet ran at Thu Sep 13 16:11:29 UTC 2012 [16:11:54] RECOVERY Puppet freshness is now: OK on build-precise1 i-00000273 output: puppet ran at Thu Sep 13 16:11:46 UTC 2012 [16:12:14] RECOVERY Puppet freshness is now: OK on deployment-squid i-000000dc output: puppet ran at Thu Sep 13 16:11:54 UTC 2012 [16:13:14] RECOVERY Puppet freshness is now: OK on translatesvg i-00000353 output: puppet ran at Thu Sep 13 16:12:56 UTC 2012 [16:13:54] PROBLEM host: deployment-backup is DOWN address: i-000000f8 CRITICAL - Host Unreachable (i-000000f8) [16:14:14] RECOVERY Puppet freshness is now: OK on deployment-apache33 i-0000031b output: puppet ran at Thu Sep 13 16:14:03 UTC 2012 [16:15:04] PROBLEM host: analytics is DOWN 
address: i-000000e2 CRITICAL - Host Unreachable (i-000000e2) [16:15:44] PROBLEM host: configtest-main is DOWN address: i-000002dd CRITICAL - Host Unreachable (i-000002dd) [16:15:44] PROBLEM host: wep is DOWN address: i-000000c2 CRITICAL - Host Unreachable (i-000000c2) [16:22:14] PROBLEM host: mobile-feeds is DOWN address: i-000000c1 CRITICAL - Host Unreachable (i-000000c1) [16:24:04] PROBLEM host: lynwood is DOWN address: i-000003e5 PING CRITICAL - Packet loss = 100% [16:24:24] PROBLEM host: conventionextension-test is DOWN address: i-000003c0 CRITICAL - Host Unreachable (i-000003c0) [16:27:44] PROBLEM host: pageviews is DOWN address: i-000000b2 CRITICAL - Host Unreachable (i-000000b2) [16:29:34] PROBLEM host: deployment-feed is DOWN address: i-00000118 CRITICAL - Host Unreachable (i-00000118) [16:32:14] PROBLEM host: mobile-wlm2 is DOWN address: i-0000038e CRITICAL - Host Unreachable (i-0000038e) [16:33:44] RECOVERY Puppet freshness is now: OK on udp-filter i-000001df output: puppet ran at Thu Sep 13 16:33:30 UTC 2012 [16:39:54] RECOVERY Puppet freshness is now: OK on deployment-mc i-0000021b output: puppet ran at Thu Sep 13 16:39:41 UTC 2012 [16:39:54] RECOVERY Puppet freshness is now: OK on grail i-000003aa output: puppet ran at Thu Sep 13 16:39:44 UTC 2012 [16:43:24] RECOVERY Free ram is now: OK on dumps-bot3 i-000003ef output: 5765284 [16:44:54] PROBLEM host: deployment-backup is DOWN address: i-000000f8 CRITICAL - Host Unreachable (i-000000f8) [16:45:44] PROBLEM host: analytics is DOWN address: i-000000e2 CRITICAL - Host Unreachable (i-000000e2) [16:45:44] PROBLEM host: configtest-main is DOWN address: i-000002dd CRITICAL - Host Unreachable (i-000002dd) [16:45:44] PROBLEM host: wep is DOWN address: i-000000c2 CRITICAL - Host Unreachable (i-000000c2) [16:52:14] PROBLEM host: mobile-feeds is DOWN address: i-000000c1 CRITICAL - Host Unreachable (i-000000c1) [16:54:04] PROBLEM host: lynwood is DOWN address: i-000003e5 PING CRITICAL - Packet loss = 100% [16:55:24] 
PROBLEM host: conventionextension-test is DOWN address: i-000003c0 CRITICAL - Host Unreachable (i-000003c0) [16:57:44] PROBLEM host: pageviews is DOWN address: i-000000b2 CRITICAL - Host Unreachable (i-000000b2) [16:59:54] PROBLEM host: deployment-feed is DOWN address: i-00000118 CRITICAL - Host Unreachable (i-00000118) [17:02:14] PROBLEM host: mobile-wlm2 is DOWN address: i-0000038e CRITICAL - Host Unreachable (i-0000038e) [17:02:14] PROBLEM Total Processes is now: CRITICAL on wikistats-01 i-00000042 output: PROCS CRITICAL: 271 processes [17:04:55] Bleh, is nfs being crappy today? [17:05:11] !log bots fixed pupept hostname on bots-apache1 [17:05:12] Logged the message, Master [17:05:14] RECOVERY Puppet freshness is now: OK on bots-apache1 i-000000b0 output: puppet ran at Thu Sep 13 17:04:54 UTC 2012 [17:05:20] :o [17:05:22] * Damianz sees salt [17:07:14] PROBLEM Total Processes is now: WARNING on wikistats-01 i-00000042 output: PROCS WARNING: 178 processes [17:08:34] PROBLEM Puppet freshness is now: CRITICAL on build1 i-000002b3 output: Puppet has not run in the last 10 hours [17:09:34] PROBLEM Puppet freshness is now: CRITICAL on maps-osmmapnik i-0000039b output: Puppet has not run in the last 10 hours [17:09:34] PROBLEM Puppet freshness is now: CRITICAL on micro-design i-000003e8 output: Puppet has not run in the last 10 hours [17:15:44] PROBLEM host: analytics is DOWN address: i-000000e2 CRITICAL - Host Unreachable (i-000000e2) [17:15:44] PROBLEM host: configtest-main is DOWN address: i-000002dd CRITICAL - Host Unreachable (i-000002dd) [17:15:44] PROBLEM host: deployment-backup is DOWN address: i-000000f8 CRITICAL - Host Unreachable (i-000000f8) [17:15:44] PROBLEM host: wep is DOWN address: i-000000c2 CRITICAL - Host Unreachable (i-000000c2) [17:22:14] PROBLEM host: mobile-feeds is DOWN address: i-000000c1 CRITICAL - Host Unreachable (i-000000c1) [17:24:04] PROBLEM host: lynwood is DOWN address: i-000003e5 PING CRITICAL - Packet loss = 100% [17:25:44] PROBLEM 
host: conventionextension-test is DOWN address: i-000003c0 CRITICAL - Host Unreachable (i-000003c0) [17:27:44] PROBLEM host: pageviews is DOWN address: i-000000b2 CRITICAL - Host Unreachable (i-000000b2) [17:29:54] PROBLEM host: deployment-feed is DOWN address: i-00000118 CRITICAL - Host Unreachable (i-00000118) [17:32:14] PROBLEM host: mobile-wlm2 is DOWN address: i-0000038e CRITICAL - Host Unreachable (i-0000038e) [17:45:44] PROBLEM host: configtest-main is DOWN address: i-000002dd CRITICAL - Host Unreachable (i-000002dd) [17:45:44] PROBLEM host: analytics is DOWN address: i-000000e2 CRITICAL - Host Unreachable (i-000000e2) [17:45:44] PROBLEM host: wep is DOWN address: i-000000c2 CRITICAL - Host Unreachable (i-000000c2) [17:45:44] PROBLEM host: deployment-backup is DOWN address: i-000000f8 CRITICAL - Host Unreachable (i-000000f8) [17:52:14] PROBLEM host: mobile-feeds is DOWN address: i-000000c1 CRITICAL - Host Unreachable (i-000000c1) [17:54:04] PROBLEM host: lynwood is DOWN address: i-000003e5 PING CRITICAL - Packet loss = 100% [17:55:44] PROBLEM host: conventionextension-test is DOWN address: i-000003c0 CRITICAL - Host Unreachable (i-000003c0) [17:57:44] PROBLEM host: pageviews is DOWN address: i-000000b2 CRITICAL - Host Unreachable (i-000000b2) [18:00:24] PROBLEM host: deployment-feed is DOWN address: i-00000118 CRITICAL - Host Unreachable (i-00000118) [18:02:14] PROBLEM host: mobile-wlm2 is DOWN address: i-0000038e CRITICAL - Host Unreachable (i-0000038e) [18:15:44] PROBLEM host: configtest-main is DOWN address: i-000002dd CRITICAL - Host Unreachable (i-000002dd) [18:15:44] PROBLEM host: analytics is DOWN address: i-000000e2 CRITICAL - Host Unreachable (i-000000e2) [18:15:44] PROBLEM host: wep is DOWN address: i-000000c2 CRITICAL - Host Unreachable (i-000000c2) [18:15:44] PROBLEM host: deployment-backup is DOWN address: i-000000f8 CRITICAL - Host Unreachable (i-000000f8) [18:22:14] PROBLEM host: mobile-feeds is DOWN address: i-000000c1 CRITICAL - Host 
Unreachable (i-000000c1) [18:24:04] PROBLEM host: lynwood is DOWN address: i-000003e5 PING CRITICAL - Packet loss = 100% [18:24:44] PROBLEM Free ram is now: WARNING on bots-sql2 i-000000af output: 878716 [18:25:44] PROBLEM host: conventionextension-test is DOWN address: i-000003c0 CRITICAL - Host Unreachable (i-000003c0) [18:27:44] PROBLEM host: pageviews is DOWN address: i-000000b2 CRITICAL - Host Unreachable (i-000000b2) [18:29:21] * Damianz wonders if ryan's around today [18:30:24] PROBLEM host: deployment-feed is DOWN address: i-00000118 CRITICAL - Host Unreachable (i-00000118) [18:32:14] PROBLEM host: mobile-wlm2 is DOWN address: i-0000038e CRITICAL - Host Unreachable (i-0000038e) [18:34:27] wow linkwatcher bot rapes mysql [18:40:17] petan: https://gerrit.wikimedia.org/r/#/c/23681/ fixes the ram check output so it shows a % [18:45:44] PROBLEM host: configtest-main is DOWN address: i-000002dd CRITICAL - Host Unreachable (i-000002dd) [18:45:44] PROBLEM host: analytics is DOWN address: i-000000e2 CRITICAL - Host Unreachable (i-000000e2) [18:45:44] PROBLEM host: wep is DOWN address: i-000000c2 CRITICAL - Host Unreachable (i-000000c2) [18:45:44] PROBLEM host: deployment-backup is DOWN address: i-000000f8 CRITICAL - Host Unreachable (i-000000f8) [18:47:21] hashar is just evil :P I swear lcarr bites [18:48:00] \O/ [18:48:18] I don't think Leslie is connected [18:48:22] I can see her right now [18:48:34] she probably can't connect to the internet :D [18:48:42] Oh the irony [18:48:43] unfortunate for a network engineer *grins* [18:49:03] + I don't think she is joining the -labs room [18:49:17] She did say the office wifi wasn't here responsibility though, could totally be revenge by the office admins [18:49:44] RECOVERY Free ram is now: OK on bots-sql2 i-000000af output: OK: 20% free memory [18:50:36] Damianz: we are actually in a conference center [18:50:45] :o [18:50:51] Damianz: and of course its wifi does not work that well [18:51:13] Conference wifi never does, 
you should do what fosdem does and get cisco to come impliment a custom network for the weekend heh [18:52:14] PROBLEM host: mobile-feeds is DOWN address: i-000000c1 CRITICAL - Host Unreachable (i-000000c1) [18:54:04] PROBLEM host: lynwood is DOWN address: i-000003e5 PING CRITICAL - Packet loss = 100% [18:55:44] PROBLEM host: conventionextension-test is DOWN address: i-000003c0 CRITICAL - Host Unreachable (i-000003c0) [18:57:44] PROBLEM host: pageviews is DOWN address: i-000000b2 CRITICAL - Host Unreachable (i-000000b2) [18:57:44] PROBLEM Free ram is now: WARNING on bots-sql2 i-000000af output: 856464 [19:00:24] PROBLEM host: deployment-feed is DOWN address: i-00000118 CRITICAL - Host Unreachable (i-00000118) [19:02:14] PROBLEM host: mobile-wlm2 is DOWN address: i-0000038e CRITICAL - Host Unreachable (i-0000038e) [19:02:14] PROBLEM Total Processes is now: CRITICAL on wikistats-01 i-00000042 output: PROCS CRITICAL: 271 processes [19:04:51] Change on 12mediawiki a page Developer access was modified, changed by Ebraminio link https://www.mediawiki.org/w/index.php?diff=583009 edit summary: [19:13:34] PROBLEM Puppet freshness is now: CRITICAL on aggregator-test1 i-000002bf output: Puppet has not run in the last 10 hours [19:13:34] PROBLEM Puppet freshness is now: CRITICAL on aggregator1 i-0000010c output: Puppet has not run in the last 10 hours [19:13:34] PROBLEM Puppet freshness is now: CRITICAL on aggregator2 i-000002c0 output: Puppet has not run in the last 10 hours [19:13:34] PROBLEM Puppet freshness is now: CRITICAL on blamemaps-m1xsmall i-0000039e output: Puppet has not run in the last 10 hours [19:13:34] PROBLEM Puppet freshness is now: CRITICAL on bots-cb-dev i-000003d1 output: Puppet has not run in the last 10 hours [19:13:35] PROBLEM Puppet freshness is now: CRITICAL on demo-web2 i-00000285 output: Puppet has not run in the last 10 hours [19:13:35] PROBLEM Puppet freshness is now: CRITICAL on deployment-bastion i-00000390 output: Puppet has not run in the last 
10 hours [19:13:36] PROBLEM Puppet freshness is now: CRITICAL on deployment-cache-bits02 i-0000031c output: Puppet has not run in the last 10 hours [19:13:36] PROBLEM Puppet freshness is now: CRITICAL on deployment-cache-upload03 i-0000034b output: Puppet has not run in the last 10 hours [19:13:37] PROBLEM Puppet freshness is now: CRITICAL on deployment-dbdump i-000000d2 output: Puppet has not run in the last 10 hours [19:13:37] PROBLEM Puppet freshness is now: CRITICAL on deployment-jobrunner06 i-0000031d output: Puppet has not run in the last 10 hours [19:13:38] PROBLEM Puppet freshness is now: CRITICAL on deployment-video03 i-000003c1 output: Puppet has not run in the last 10 hours [19:13:38] PROBLEM Puppet freshness is now: CRITICAL on dev-solr i-00000152 output: Puppet has not run in the last 10 hours [19:13:39] PROBLEM Puppet freshness is now: CRITICAL on embed-sandbox i-000000d1 output: Puppet has not run in the last 10 hours [19:14:34] PROBLEM Puppet freshness is now: CRITICAL on localpuppet2 i-0000029b output: Puppet has not run in the last 10 hours [19:14:34] PROBLEM Puppet freshness is now: CRITICAL on loon i-000003a5 output: Puppet has not run in the last 10 hours [19:14:34] PROBLEM Puppet freshness is now: CRITICAL on maps-osmrails i-00000373 output: Puppet has not run in the last 10 hours [19:14:34] PROBLEM Puppet freshness is now: CRITICAL on maps-test2 i-00000253 output: Puppet has not run in the last 10 hours [19:14:34] PROBLEM Puppet freshness is now: CRITICAL on maps-tilemill1 i-00000294 output: Puppet has not run in the last 10 hours [19:16:27] Change on mediawiki a page Developer access was modified, changed by Jeremyb link https://www.mediawiki.org/w/index.php?diff=583018 edit summary: /* User:Ebraminio */ done [21:18:09] Change on mediawiki a page Developer access was modified, changed by Ebraminio link https://www.mediawiki.org/w/index.php?diff=583041 edit summary: /* User:Ebraminio */ [21:27:36] Ryan_Lane: I rebased I99bf6a37 for you. 
The conflict was on .gitreview. [21:27:43] Made no other changes. [22:00:18] TomDaley: ah. cool [22:00:19] thanks [22:01:42] I swear all mediawiki staff pictures are too happy [22:25:42] Damianz: heh. true [23:13:59] !nagios [23:13:59] http://208.80.153.210/nagios3 http://nagios.wmflabs.org/nagios3 [23:18:53] !g I23764a9b483f9a1fcba1017cac1c73ef18860374 [23:18:53] https://gerrit.wikimedia.org/r/#q,I23764a9b483f9a1fcba1017cac1c73ef18860374,n,z [23:26:25] !log deployment-prep migrate -dbdump misc::scripts to misc::deployment::scripts [23:26:28] Logged the message, Master