[00:02:00] PROBLEM - gitblit.wikimedia.org on antimony is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Server Error - 1703 bytes in 4.308 second response time
[00:08:20] (PS4) Tim Landscheidt: Initial setup for legalteamwiki [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/112850 (owner: TTO)
[00:11:00] RECOVERY - gitblit.wikimedia.org on antimony is OK: HTTP OK: HTTP/1.1 200 OK - 64995 bytes in 4.908 second response time
[00:42:00] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000
[01:05:30] PROBLEM - Puppet freshness on labstore4 is CRITICAL: Last successful Puppet run was Tue 25 Feb 2014 06:33:37 PM UTC
[01:20:30] PROBLEM - Puppet freshness on labsdb1005 is CRITICAL: Last successful Puppet run was Sat 01 Mar 2014 10:15:50 AM UTC
[01:40:00] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: reqstats.5xx [warn=250.000
[01:47:30] PROBLEM - Puppet freshness on lanthanum is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 04:37:38 PM UTC
[02:02:12] !log LocalisationUpdate completed (1.23wmf15) at 2014-03-02 02:02:12+00:00
[02:02:28] Logged the message, Master
[02:03:00] !log LocalisationUpdate completed (1.23wmf16) at 2014-03-02 02:03:00+00:00
[02:03:09] Logged the message, Master
[02:07:46] hi
[02:08:45] !log LocalisationUpdate ResourceLoader cache refresh completed at 2014-03-02 02:08:45+00:00
[02:08:53] Logged the message, Master
[02:35:30] PROBLEM - Puppet freshness on gallium is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 05:25:59 PM UTC
[02:58:30] PROBLEM - Puppet freshness on labstore1001 is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 08:45:44 AM UTC
[04:06:30] PROBLEM - Puppet freshness on labstore4 is CRITICAL: Last successful Puppet run was Tue 25 Feb 2014 06:33:37 PM UTC
[04:21:30] PROBLEM - Puppet freshness on labsdb1005 is CRITICAL: Last successful Puppet run was Sat 01 Mar 2014 10:15:50 AM UTC
[04:48:30] PROBLEM - Puppet freshness on lanthanum is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 04:37:38 PM UTC
[05:36:30] PROBLEM - Puppet freshness on gallium is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 05:25:59 PM UTC
[05:42:40] PROBLEM - Host mw27 is DOWN: PING CRITICAL - Packet loss = 100%
[05:43:20] RECOVERY - Host mw27 is UP: PING OK - Packet loss = 0%, RTA = 35.34 ms
[05:59:30] PROBLEM - Puppet freshness on labstore1001 is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 08:45:44 AM UTC
[06:27:40] PROBLEM - udp2log log age for emery on emery is CRITICAL: CRITICAL: log files /a/log/webrequest/packet-loss.log, have not been written in a critical amount of time. For most logs, this is 4 hours. For slow logs, this is 4 days.
[06:28:40] RECOVERY - udp2log log age for emery on emery is OK: OK: all log files active
[07:07:30] PROBLEM - Puppet freshness on labstore4 is CRITICAL: Last successful Puppet run was Tue 25 Feb 2014 06:33:37 PM UTC
[07:22:31] PROBLEM - Puppet freshness on labsdb1005 is CRITICAL: Last successful Puppet run was Sat 01 Mar 2014 10:15:50 AM UTC
[07:49:30] PROBLEM - Puppet freshness on lanthanum is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 04:37:38 PM UTC
[08:37:30] PROBLEM - Puppet freshness on gallium is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 05:25:59 PM UTC
[09:00:30] PROBLEM - Puppet freshness on labstore1001 is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 08:45:44 AM UTC
[10:08:30] PROBLEM - Puppet freshness on labstore4 is CRITICAL: Last successful Puppet run was Tue 25 Feb 2014 06:33:37 PM UTC
[10:11:10] PROBLEM - SSH on amslvs1 is CRITICAL: Server answer:
[10:15:10] RECOVERY - SSH on amslvs1 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0)
[10:18:10] PROBLEM - SSH on amslvs1 is CRITICAL: Server answer:
[10:23:30] PROBLEM - Puppet freshness on labsdb1005 is CRITICAL: Last successful Puppet run was Sat 01 Mar 2014 10:15:50 AM UTC
[10:26:10] RECOVERY - SSH on amslvs1 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0)
[10:43:23] (PS1) Alexandros Kosiaris: Fix some variable scoping issues in osm [operations/puppet] - https://gerrit.wikimedia.org/r/116275
[10:45:05] (CR) Alexandros Kosiaris: [C: 2] Fix some variable scoping issues in osm [operations/puppet] - https://gerrit.wikimedia.org/r/116275 (owner: Alexandros Kosiaris)
[10:47:00] RECOVERY - Puppet freshness on labsdb1005 is OK: puppet ran at Sun Mar 2 10:46:49 UTC 2014
[10:50:30] PROBLEM - Puppet freshness on lanthanum is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 04:37:38 PM UTC
[11:00:04] (PS1) Alexandros Kosiaris: Break dependency cycle in osm [operations/puppet] - https://gerrit.wikimedia.org/r/116276
[11:02:52] (CR) Alexandros Kosiaris: [C: 2] Break dependency cycle in osm [operations/puppet] - https://gerrit.wikimedia.org/r/116276 (owner: Alexandros Kosiaris)
[11:38:30] PROBLEM - Puppet freshness on gallium is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 05:25:59 PM UTC
[12:01:30] PROBLEM - Puppet freshness on labstore1001 is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 08:45:44 AM UTC
[13:09:30] PROBLEM - Puppet freshness on labstore4 is CRITICAL: Last successful Puppet run was Tue 25 Feb 2014 06:33:37 PM UTC
[13:51:30] PROBLEM - Puppet freshness on lanthanum is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 04:37:38 PM UTC
[14:05:00] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000
[14:39:30] PROBLEM - Puppet freshness on gallium is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 05:25:59 PM UTC
[15:02:30] PROBLEM - Puppet freshness on labstore1001 is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 08:45:44 AM UTC
[15:03:00] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: reqstats.5xx [warn=250.000
[16:10:30] PROBLEM - Puppet freshness on labstore4 is CRITICAL: Last successful Puppet run was Tue 25 Feb 2014 06:33:37 PM UTC
[16:52:30] PROBLEM - Puppet freshness on lanthanum is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 04:37:38 PM UTC
[17:16:10] PROBLEM - mailman on sodium is CRITICAL: PROCS CRITICAL: 203 processes with args mailman
[17:17:20] PROBLEM - HTTPS on sodium is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[17:18:10] PROBLEM - HTTP on sodium is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[17:18:48] who wants to kick sodium?
[17:20:00] PROBLEM - Exim SMTP on sodium is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[17:20:10] PROBLEM - SSH on sodium is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[17:20:20] this does not look good
[17:20:20] PROBLEM - spamassassin on sodium is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[17:21:07] no mails for you!
[17:21:28] It responds to ping.. what more do you want!?
[17:22:36] ganglia won't let me choose things from the combo boxes
[17:23:39] Reedy: you need to gift the gods before
[17:24:03] https://ganglia.wikimedia.org/latest/?c=Miscellaneous%20eqiad&h=sodium.wikimedia.org&m=cpu_report&r=hour&s=by%20name&hc=4&mc=2
[17:24:34] swapdeath
[17:25:16] the site is slow as hell for me
[17:25:33] hopefully the 2 aren't related
[17:25:41] hope so
[17:25:49] seems ok for me on enwiki. not great, but not that bad
[17:26:05] 26s for rc page
[17:26:24] matanya, you're hitting Amsterdam cluster, right?
[17:26:29] yes
[17:26:42] when going to eqiad it is ok
[17:27:06] performs well for me
[17:27:42] it is bits fault
[17:27:51] i get 503
[17:28:07] sodium is being looked into
[17:29:32] Host: bits.wikimedia.org
[17:29:46] GET /static-1.23wmf15/skins/common/images/poweredby_mediawiki_88x31.png HTTP/1.1
[17:29:46] Host: bits.wikimedia.org
[17:29:56] this one is aborted
[17:30:04] esams bits caches look normal
[17:30:15] 2m 4s (onload: 1m 58s)
[17:30:25] !log powercycled sodium, swapdeath
[17:30:32] Logged the message, Master
[17:31:49] taking a while to do anything
[17:32:18] it is a network issue, not bits
[17:32:38] matanya: stop torrenting
[17:32:41] :D
[17:32:44] :)
[17:32:50] PROBLEM - NTP on sodium is CRITICAL: NTP CRITICAL: Offset unknown
[17:33:50] RECOVERY - Exim SMTP on sodium is OK: SMTP OK - 0.081 sec. response time
[17:34:00] RECOVERY - SSH on sodium is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0)
[17:34:00] RECOVERY - HTTP on sodium is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 190 bytes in 0.341 second response time
[17:34:01] RECOVERY - mailman on sodium is OK: PROCS OK: 13 processes with args mailman
[17:34:10] RECOVERY - spamassassin on sodium is OK: PROCS OK: 4 processes with args spamd
[17:34:11] RECOVERY - HTTPS on sodium is OK: OK - Certificate will expire on 01/31/2016 02:58.
[17:35:50] RECOVERY - NTP on sodium is OK: NTP OK: Offset 0.01452982426 secs
[17:40:30] PROBLEM - Puppet freshness on gallium is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 05:25:59 PM UTC
[18:03:30] PROBLEM - Puppet freshness on labstore1001 is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 08:45:44 AM UTC
[18:27:00] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000
[19:00:48] (CR) PiRSquared17: [C: -1] Remove Flow from Meta-Wiki [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/115412 (owner: Odder)
[19:04:39] greg-g: Why not schedule the about patch to be -2'd? :p
[19:11:30] PROBLEM - Puppet freshness on labstore4 is CRITICAL: Last successful Puppet run was Tue 25 Feb 2014 06:33:37 PM UTC
[19:18:23] about?
[19:25:00] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: reqstats.5xx [warn=250.000
[19:45:00] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000
[19:46:56] (Abandoned) Helder.wiki: Install ArticleFeedbackv5 on pt.wikibooks.org [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/71524 (owner: Helder.wiki)
[19:53:30] PROBLEM - Puppet freshness on lanthanum is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 04:37:38 PM UTC
[19:59:08] odder: above, probably, and I didn't understand it (the suggestion)
[20:41:30] PROBLEM - Puppet freshness on gallium is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 05:25:59 PM UTC
[20:43:00] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: reqstats.5xx [warn=250.000
[21:04:30] PROBLEM - Puppet freshness on labstore1001 is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 08:45:44 AM UTC
[21:44:00] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000
[22:00:40] PROBLEM - Kafka Broker Messages In on analytics1021 is CRITICAL: kafka.server.BrokerTopicMetrics.AllTopicsMessagesInPerSec.FifteenMinuteRate CRITICAL: 958.149862487
[22:12:30] PROBLEM - Puppet freshness on labstore4 is CRITICAL: Last successful Puppet run was Tue 25 Feb 2014 06:33:37 PM UTC
[22:46:00] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: reqstats.5xx [warn=250.000
[22:54:30] PROBLEM - Puppet freshness on lanthanum is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 04:37:38 PM UTC
[23:42:30] PROBLEM - Puppet freshness on gallium is CRITICAL: Last successful Puppet run was Fri 28 Feb 2014 05:25:59 PM UTC