[00:00:00] petan: hah... my previous version was kind of stupid... why not just override the given args... this way even sql -v dewiki "SELECT ..." will work [00:01:56] (CR) Hoo man: "Just came up with a less insane way of doing this... yay for bash" [operations/puppet] - https://gerrit.wikimedia.org/r/113755 (owner: Hoo man) [00:02:22] PROBLEM - gitblit.wikimedia.org on antimony is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Server Error - 1703 bytes in 4.413 second response time [00:05:21] "shUnit2 is a xUnit unit test framework for Bourne based shell scripts, and it is designed to work in a similar manner to JUnit, PyUnit, etc. If you have ever had the desire to write a unit test for a shell script, shUnit2 can do the job. " I shouldn't have googled that... [00:06:55] hoo: sounds scary [00:08:08] If you reach that point you're almost certainly using the wrong language... [00:08:31] agreed [00:09:22] RECOVERY - gitblit.wikimedia.org on antimony is OK: HTTP OK: HTTP/1.1 200 OK - 59445 bytes in 6.902 second response time [01:19:02] PROBLEM - puppetmaster https on virt0 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:25:02] PROBLEM - HTTP on virt0 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:29:02] RECOVERY - HTTP on virt0 is OK: HTTP OK: HTTP/1.1 302 Found - 457 bytes in 9.051 second response time [01:31:53] RECOVERY - puppetmaster https on virt0 is OK: HTTP OK: Status line output matched 400 - 336 bytes in 1.050 second response time [01:51:52] PROBLEM - Puppet freshness on virt1000 is CRITICAL: Last successful Puppet run was Fri 21 Feb 2014 04:42:42 PM UTC [02:01:57] !log LocalisationUpdate completed (1.23wmf14) at 2014-02-23 02:01:56+00:00 [02:02:06] Logged the message, Master [02:02:45] !log LocalisationUpdate completed (1.23wmf15) at 2014-02-23 02:02:44+00:00 [02:02:52] Logged the message, Master [02:08:36] !log LocalisationUpdate ResourceLoader cache refresh completed at 2014-02-23 02:08:36+00:00 [02:08:45] Logged the message,
Master [02:39:52] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Last successful Puppet run was Sat 22 Feb 2014 02:36:40 PM UTC [04:52:52] PROBLEM - Puppet freshness on virt1000 is CRITICAL: Last successful Puppet run was Fri 21 Feb 2014 04:42:42 PM UTC [05:08:22] PROBLEM - RAID on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. [05:08:32] PROBLEM - SSH on searchidx1001 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [05:09:12] RECOVERY - RAID on searchidx1001 is OK: OK: optimal, 1 logical, 4 physical [05:09:22] RECOVERY - SSH on searchidx1001 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0) [05:40:52] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Last successful Puppet run was Sat 22 Feb 2014 02:36:40 PM UTC [06:58:52] PROBLEM - Host mw31 is DOWN: PING CRITICAL - Packet loss = 100% [06:59:42] RECOVERY - Host mw31 is UP: PING OK - Packet loss = 0%, RTA = 35.43 ms [07:06:25] Ryan_Lane, when moving an instance between the two DCs, do you have any thoughts about how to keep the metadata in sync between them? (Most importantly I'm thinking of instance type.) [07:08:11] (PS1) Andrew Bogott: Open up ssh for wikitech hosts. [operations/puppet] - https://gerrit.wikimedia.org/r/114963 [07:09:16] (CR) Andrew Bogott: [C: 2] Open up ssh for wikitech hosts. [operations/puppet] - https://gerrit.wikimedia.org/r/114963 (owner: Andrew Bogott) [07:53:52] PROBLEM - Puppet freshness on virt1000 is CRITICAL: Last successful Puppet run was Fri 21 Feb 2014 04:42:42 PM UTC [08:41:52] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Last successful Puppet run was Sat 22 Feb 2014 02:36:40 PM UTC [10:06:22] PROBLEM - RAID on searchidx1001 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[10:07:12] RECOVERY - RAID on searchidx1001 is OK: OK: optimal, 1 logical, 4 physical [10:08:28] https://twitter.com/internetarchive/status/437268609052573697 [10:09:00] https://threatpost.com/us-cert-warns-of-ntp-amplification-attacks/103573 [10:54:52] PROBLEM - Puppet freshness on virt1000 is CRITICAL: Last successful Puppet run was Fri 21 Feb 2014 04:42:42 PM UTC [11:06:48] (Abandoned) TTO: Get rid of echowikis.dblist [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/111766 (owner: TTO) [11:42:52] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Last successful Puppet run was Sat 22 Feb 2014 02:36:40 PM UTC [11:57:14] (PS1) Odder: Set wmgBabelCategoryNames for Chinese Wikivoyage [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/114970 [13:55:52] PROBLEM - Puppet freshness on virt1000 is CRITICAL: Last successful Puppet run was Fri 21 Feb 2014 04:42:42 PM UTC [14:43:52] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Last successful Puppet run was Sat 22 Feb 2014 02:36:40 PM UTC [15:49:40] (PS1) Aude: make wikiversions-labs.json have valid json [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/114980 [15:50:30] if someone is around and can merge trivial change that fixes beta ^ [16:56:52] PROBLEM - Puppet freshness on virt1000 is CRITICAL: Last successful Puppet run was Fri 21 Feb 2014 04:42:42 PM UTC [17:18:30] Is something wrong with dickson? I got kicked off about 6 hours ago and haven't been able to reconn since. [17:30:11] dicksoff [17:32:07] freenode admins are on it [17:34:54] heh Gloria [17:35:35] rdwrer: why do you connect to dickson? [17:36:45] it was one of three servers surviving yesterday's DDoS [17:37:10] in the chat.f.n rotation that is [17:40:02] so? does that make it better for the future too?
[17:44:52] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Last successful Puppet run was Sat 22 Feb 2014 02:36:40 PM UTC [17:47:02] * Nemo_bis keeps irssi's default irc.freenote.net [17:47:48] freenote :D [17:51:57] whatever [17:52:04] that's why I keep defaults :P [17:52:43] heh :) [17:53:02] PROBLEM - Packetloss_Average on erbium is CRITICAL: packet_loss_average CRITICAL: 10.3618578947 [18:03:02] PROBLEM - Packetloss_Average on oxygen is CRITICAL: packet_loss_average CRITICAL: 10.7731942708 [18:10:54] Nemo_bis: I use it because it was recommended as the server to use for a bunch of us [18:11:14] rdwrer: recommended by whom/what? [18:12:04] I think an internal mailing list thread? [18:12:17] The idea was that if netsplits happened, we would all be stuck together [18:12:31] Obviously if the server goes down not so much. [18:12:56] 1. Please define "us." 2. Please define internal mailing list. [18:13:53] Yes, and if the server goes down you're all outside IRC :P [18:14:31] And if you're all in the same office you keep communication with people on the same floor of the same building while losing communication with the rest of the world [18:15:08] Anyway, ok, I only wanted to know if freenode suggests my setting is incorrect, I don't care about anything else [18:17:03] RECOVERY - Packetloss_Average on erbium is OK: packet_loss_average OKAY: 2.248312 [18:27:02] RECOVERY - Packetloss_Average on oxygen is OK: packet_loss_average OKAY: 1.663409375 [18:36:37] Nemo_bis: Yeah, this was an informal suggestion for staff, it wasn't very well adopted either :) [18:48:12] PROBLEM - Apache HTTP on mw1097 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:48:12] PROBLEM - Apache HTTP on mw1081 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:48:12] PROBLEM - Apache HTTP on mw1029 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:48:12] PROBLEM - Apache HTTP on mw1168 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:48:22] PROBLEM - 
Apache HTTP on mw1085 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:48:32] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000 [18:48:32] PROBLEM - Apache HTTP on mw1047 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:48:32] PROBLEM - Apache HTTP on mw1064 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:49:02] RECOVERY - Apache HTTP on mw1029 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.048 second response time [18:49:02] RECOVERY - Apache HTTP on mw1081 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.049 second response time [18:49:02] RECOVERY - Apache HTTP on mw1097 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.101 second response time [18:49:02] RECOVERY - Apache HTTP on mw1168 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.062 second response time [18:49:12] RECOVERY - Apache HTTP on mw1085 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.063 second response time [18:49:22] RECOVERY - Apache HTTP on mw1064 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.062 second response time [18:49:22] RECOVERY - Apache HTTP on mw1047 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.067 second response time [18:49:42] https://gdash.wikimedia.org/dashboards/reqerror/ is through the roof [18:50:01] czołem jeremyb [18:50:06] yeah, I also got spurious 503s on ru.wikipedia, that's why I went online. :) [18:50:39] odder: errr, translation? :) [18:50:58] wonder why that happens [18:51:54] jeremyb: "forehead jeremyb" is the literal translation [18:51:58] it means hello :-) [18:52:12] odder: hah [18:52:31] so czołem/forehead [18:53:46] odder: I'd translate it as "Cheers" though? [18:55:52] andre__: You'd say cheers in Poland as a toast, or to greet someone [18:56:04] neither of these translates as czołem [18:56:21] heh, true, good point. pretty similar to the usage here. 
:) [18:59:42] github's fault or what https://integration.wikimedia.org/ci/job/translatewiki-puppet-validate/1897/console [19:08:32] brb [19:14:12] PROBLEM - Apache HTTP on mw1168 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:22] PROBLEM - Apache HTTP on mw1085 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:22] PROBLEM - Apache HTTP on mw1109 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:22] PROBLEM - Apache HTTP on mw1167 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:22] PROBLEM - Apache HTTP on mw1062 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:32] PROBLEM - Apache HTTP on mw1041 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:32] PROBLEM - Apache HTTP on mw1030 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:32] PROBLEM - Apache HTTP on mw1064 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:32] PROBLEM - Apache HTTP on mw1058 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:32] PROBLEM - Apache HTTP on mw1078 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:32] PROBLEM - Apache HTTP on mw1033 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:32] PROBLEM - Apache HTTP on mw1166 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:33] PROBLEM - MySQL Processlist on db1009 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[19:14:33] PROBLEM - Apache HTTP on mw1080 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:34] PROBLEM - Apache HTTP on mw1104 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:34] PROBLEM - Apache HTTP on mw1073 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:35] PROBLEM - Apache HTTP on mw1218 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:35] PROBLEM - Apache HTTP on mw1088 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:36] PROBLEM - Apache HTTP on mw1045 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:36] PROBLEM - Apache HTTP on mw1180 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:42] PROBLEM - Apache HTTP on mw1102 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:42] PROBLEM - Apache HTTP on mw1056 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:42] PROBLEM - Apache HTTP on mw1027 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:42] PROBLEM - Apache HTTP on mw1093 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:42] PROBLEM - Apache HTTP on mw1063 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:42] PROBLEM - Apache HTTP on mw1090 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:42] PROBLEM - Apache HTTP on mw1179 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:43] PROBLEM - Apache HTTP on mw1169 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:43] PROBLEM - Apache HTTP on mw1181 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:44] PROBLEM - Apache HTTP on mw1025 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:44] PROBLEM - Apache HTTP on mw1046 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:45] PROBLEM - Apache HTTP on mw1164 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:45] PROBLEM - Apache HTTP on mw1061 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:46] PROBLEM 
- Apache HTTP on mw1050 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:46] PROBLEM - Apache HTTP on mw1184 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:47] PROBLEM - Apache HTTP on mw1172 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:50] not good [19:14:52] PROBLEM - Apache HTTP on mw1209 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:52] PROBLEM - Apache HTTP on mw1040 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:52] PROBLEM - Apache HTTP on mw1057 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:52] PROBLEM - Apache HTTP on mw1187 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:15:02] PROBLEM - Apache HTTP on mw1043 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:15:02] PROBLEM - Apache HTTP on mw1087 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:15:02] PROBLEM - Apache HTTP on mw1060 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:15:02] PROBLEM - Apache HTTP on mw1211 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:15:03] PROBLEM - Apache HTTP on mw1219 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:15:03] PROBLEM - Apache HTTP on mw1112 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:15:03] PROBLEM - Apache HTTP on mw1089 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:15:11] ^ the wikis just got slow for me. [19:15:22] Yes. [19:15:23] Apache problems? [19:15:24] lol [19:15:28] varnish errors here [19:15:38] huh: so many, icinga-wm was flood-kicked for reporting them. [19:15:40] Same here. 
[19:15:45] woah [19:16:19] 503 for Special:Watchlist, 503 for Special:Recentchanges [19:16:22] RECOVERY - Apache HTTP on mw1058 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 7.156 second response time [19:16:30] Request: GET http://commons.wikimedia.org/wiki/Special:Watchlist, from 10.64.32.105 via cp1066 cp1066 ([10.64.0.103]:3128), Varnish XID 650435178 [19:16:32] PROBLEM - Apache HTTP on mw1096 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:16:37] Request: GET http://commons.wikimedia.org/wiki/Special:RecentChanges, from 10.64.0.104 via cp1054 cp1054 ([10.64.32.106]:3128), Varnish XID 1990650877 [19:16:43] so different boxes [19:16:52] PROBLEM - Apache HTTP on mw1069 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:16:52] PROBLEM - Apache HTTP on mw1084 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:16:53] RECOVERY - Apache HTTP on mw1175 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 4.083 second response time [19:16:55] (PS2) Hashar: make wikiversions-labs.json have valid json [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/114980 (owner: Aude) [19:17:02] RECOVERY - Apache HTTP on mw1022 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 8.959 second response time [19:17:02] PROBLEM - Apache HTTP on mw1111 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:17:02] PROBLEM - Apache HTTP on mw1182 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:17:12] RECOVERY - Apache HTTP on mw1174 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 7.536 second response time [19:17:12] PROBLEM - Apache HTTP on mw1037 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:17:22] RECOVERY - MySQL Processlist on db1009 is OK: OK 0 unauthenticated, 0 locked, 0 copy to table, 0 statistics [19:17:22] RECOVERY - Apache HTTP on mw1180 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.319 second response time [19:17:31] ...
[19:17:32] RECOVERY - Apache HTTP on mw1166 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 4.239 second response time [19:17:32] RECOVERY - Apache HTTP on mw1184 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 1.915 second response time [19:17:41] Hi Technical_13. [19:17:42] RECOVERY - Apache HTTP on mw1164 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 7.390 second response time [19:17:48] apergos: Around? [19:17:49] Hi Gloria. [19:17:52] RECOVERY - Apache HTTP on mw1215 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 1.118 second response time [19:18:02] RECOVERY - Apache HTTP on mw1210 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.889 second response time [19:18:02] RECOVERY - Apache HTTP on mw1182 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 4.732 second response time [19:18:02] PROBLEM - Apache HTTP on mw1103 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:18:12] RECOVERY - Apache HTTP on mw1029 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 8.325 second response time [19:18:12] RECOVERY - Apache HTTP on mw1106 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.100 second response time [19:18:32] RECOVERY - Apache HTTP on mw1088 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.300 second response time [19:18:32] RECOVERY - Apache HTTP on mw1169 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.065 second response time [19:18:32] RECOVERY - Apache HTTP on mw1104 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 8.445 second response time [19:18:42] RECOVERY - Apache HTTP on mw1069 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.073 second response time [19:18:42] RECOVERY - Apache HTTP on mw1061 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.652 second response time [19:18:42] RECOVERY - Apache HTTP on mw1084 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 
808 bytes in 3.173 second response time [19:18:42] RECOVERY - Apache HTTP on mw1056 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 9.178 second response time [19:18:52] RECOVERY - Apache HTTP on mw1103 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.073 second response time [19:18:52] RECOVERY - Apache HTTP on mw1111 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.098 second response time [19:18:52] RECOVERY - Apache HTTP on mw1083 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.059 second response time [19:18:52] RECOVERY - Apache HTTP on mw1219 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 3.090 second response time [19:18:53] RECOVERY - Apache HTTP on mw1214 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 3.318 second response time [19:18:53] RECOVERY - Apache HTTP on mw1039 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 3.655 second response time [19:19:02] RECOVERY - Apache HTTP on mw1053 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.149 second response time [19:19:02] RECOVERY - Apache HTTP on mw1185 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 4.488 second response time [19:19:02] RECOVERY - MySQL InnoDB on db1009 is OK: OK longest blocking idle transaction sleeps for 0 seconds [19:19:02] RECOVERY - Apache HTTP on mw1038 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 9.463 second response time [19:19:02] RECOVERY - Apache HTTP on mw1037 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 2.859 second response time [19:19:03] RECOVERY - Apache HTTP on mw1168 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.068 second response time [19:19:03] RECOVERY - Apache HTTP on mw1070 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 8.716 second response time [19:19:03] (CR) Hashar: [C: 2] "Pass PHP json_decode()" [operations/mediawiki-config] -
https://gerrit.wikimedia.org/r/114980 (owner: Aude) [19:19:04] RECOVERY - Apache HTTP on mw1081 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.042 second response time [19:19:09] (Merged) jenkins-bot: make wikiversions-labs.json have valid json [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/114980 (owner: Aude) [19:19:10] Request: POST http://en.wikipedia.org/w/index.php?title=Template:Mailing_list_member&action=submit, from 166.182.83.178 via cp1065 frontend ([208.80.154.224]:80), Varnish XID 3522474967 Forwarded for: 166.182.83.178 Error: 503, Service Unavailable at Sun, 23 Feb 2014 19:17:55 GMT [19:19:12] RECOVERY - Apache HTTP on mw1110 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.073 second response time [19:19:12] RECOVERY - Apache HTTP on mw1062 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 2.727 second response time [19:19:22] RECOVERY - Apache HTTP on mw1167 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 6.804 second response time [19:19:22] RECOVERY - Apache HTTP on mw1080 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.101 second response time [19:19:33] How long is this RECOVERY expected to take? [19:19:44] Technical_13: under a year [19:19:51] Sounds feasible. [19:19:53] RECOVERY - Apache HTTP on mw1087 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 8.638 second response time [19:20:00] Techman224: icinga-wm is rate-limiting itself in order not to be flood-kicked afaik [19:20:05] it sometimes fails :P [19:20:09] Technical_13: ^ [19:20:22] PROBLEM - Apache HTTP on mw1021 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:20:23] MatmaRex: I figured it was a tab fail...
;) [19:20:32] RECOVERY - Apache HTTP on mw1033 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.967 second response time [19:20:42] RECOVERY - Apache HTTP on mw1187 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 7.008 second response time [19:21:12] PROBLEM - Apache HTTP on mw1029 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:21:22] RECOVERY - Apache HTTP on mw1109 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 7.317 second response time [19:21:32] PROBLEM - Apache HTTP on mw1058 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:21:34] just fyi to others in here: [19:21:34] 14:18 greg-g is calling robh [19:21:34] 14:20 greg-g is calling paravoid [19:21:34] 14:20 < greg-g> paravoid answered, is coming, left message for robla [19:21:38] 14:21 < greg-g> er robh [19:21:42] PROBLEM - Apache HTTP on mw1093 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:21:42] PROBLEM - Apache HTTP on mw1061 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:21:44] I'm here now [19:21:52] PROBLEM - Apache HTTP on mw1040 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:21:57] * Technical_13 cheers! 
[19:21:59] it's s2 [19:22:02] PROBLEM - Apache HTTP on mw1032 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:22:02] PROBLEM - Apache HTTP on mw1060 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:22:02] RECOVERY - Apache HTTP on mw1048 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.080 second response time [19:22:02] PROBLEM - Apache HTTP on mw1219 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:22:02] PROBLEM - Apache HTTP on mw1103 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:22:03] PROBLEM - Apache HTTP on mw1035 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:22:03] PROBLEM - Apache HTTP on mw1053 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:22:04] PROBLEM - Apache HTTP on mw1095 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:22:04] PROBLEM - Apache HTTP on mw1039 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:22:05] PROBLEM - Apache HTTP on mw1185 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:22:05] PROBLEM - Apache HTTP on mw1038 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:22:06] PROBLEM - Apache HTTP on mw1217 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:22:12] RECOVERY - Apache HTTP on mw1097 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 8.962 second response time [19:22:12] PROBLEM - Apache HTTP on mw1081 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:22:12] PROBLEM - Apache HTTP on mw1213 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:22:22] RECOVERY - Apache HTTP on mw1021 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 6.348 second response time [19:22:32] RECOVERY - Apache HTTP on mw1078 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 9.405 second response time [19:22:42] RECOVERY - Apache HTTP on mw1209 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 2.494 second response time [19:22:42] RECOVERY - Apache 
HTTP on mw1113 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 6.696 second response time [19:22:42] RECOVERY - Apache HTTP on mw1040 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 7.104 second response time [19:22:52] RECOVERY - Apache HTTP on mw1060 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.191 second response time [19:22:53] * odder hands pom-poms to Technical_13 [19:23:02] PROBLEM - Apache HTTP on mw1210 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:23:02] PROBLEM - Apache HTTP on mw1022 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:23:02] PROBLEM - Apache HTTP on mw1161 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:23:20] I would, only I got a 503 when trying to view an en.WP article on pom-poms [19:23:22] RECOVERY - Apache HTTP on mw1030 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 3.742 second response time [19:23:23] Ironic. [19:23:32] PROBLEM - Apache HTTP on mw1176 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:23:32] PROBLEM - Apache HTTP on mw1088 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:23:32] PROBLEM - Apache HTTP on mw1080 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:23:52] RECOVERY - Apache HTTP on mw1039 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.071 second response time [19:23:52] RECOVERY - Apache HTTP on mw1162 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.058 second response time [19:23:53] * Technical_13 cheers "paravoid! paravoid! he's our man! if he can't do it, so-one else can!" 
while shaking the pom-poms from odder  [19:24:02] RECOVERY - Apache HTTP on mw1185 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 3.815 second response time [19:24:02] RECOVERY - Apache HTTP on mw1107 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.105 second response time [19:24:02] RECOVERY - Apache HTTP on mw1068 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 7.022 second response time [19:24:12] RECOVERY - Apache HTTP on mw1213 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 1.430 second response time [19:24:12] PROBLEM - LVS HTTP IPv4 on appservers.svc.eqiad.wmnet is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:24:15] RECOVERY - Apache HTTP on mw1081 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 8.395 second response time [19:24:15] RECOVERY - Apache HTTP on mw1178 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 8.726 second response time [19:24:22] RECOVERY - Apache HTTP on mw1058 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.803 second response time [19:24:22] RECOVERY - Apache HTTP on mw1041 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 6.045 second response time [19:24:32] RECOVERY - Apache HTTP on mw1088 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.387 second response time [19:24:42] PROBLEM - Apache HTTP on mw1105 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:24:46] tons of SELECT /* SiteStatsInit::edits */ COUNT(*) FROM `revision` LIMIT 1 [19:24:53] wtf [19:25:02] PROBLEM - Apache HTTP on mw1028 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:25:09] wtf indeed [19:25:28] WFT!? is more like it... :p [19:25:38] or WTF!? even.. [19:25:43] * Technical_13 isn't helpful... 
[19:25:52] PROBLEM - MySQL Processlist on db1060 is CRITICAL: CRIT 2 unauthenticated, 0 locked, 0 copy to table, 83 statistics [19:25:52] RECOVERY - Apache HTTP on mw1219 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 3.333 second response time [19:25:55] * Technical_13 knows this and shuts up now.. [19:26:02] RECOVERY - Apache HTTP on mw1044 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 4.299 second response time [19:26:02] RECOVERY - LVS HTTP IPv4 on appservers.svc.eqiad.wmnet is OK: HTTP OK: HTTP/1.1 200 OK - 65441 bytes in 0.290 second response time [19:26:06] RECOVERY - Apache HTTP on mw1091 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 9.079 second response time [19:26:32] RECOVERY - Apache HTTP on mw1181 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.080 second response time [19:26:32] RECOVERY - Apache HTTP on mw1050 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.087 second response time [19:26:32] PROBLEM - Apache HTTP on mw1092 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:26:42] PROBLEM - Apache HTTP on mw1216 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:26:42] PROBLEM - Apache HTTP on mw1113 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:26:42] RECOVERY - Apache HTTP on mw1172 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 9.305 second response time [19:26:52] PROBLEM - Apache HTTP on mw1040 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:26:52] RECOVERY - MySQL Processlist on db1060 is OK: OK 7 unauthenticated, 0 locked, 0 copy to table, 2 statistics [19:27:02] PROBLEM - Apache HTTP on mw1060 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:27:02] PROBLEM - Apache HTTP on mw1087 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:27:02] PROBLEM - Apache HTTP on mw1083 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:27:02] PROBLEM - Apache HTTP on mw1054 is CRITICAL: CRITICAL 
- Socket timeout after 10 seconds [19:27:02] PROBLEM - Apache HTTP on mw1065 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:27:12] PROBLEM - Apache HTTP on mw1072 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:27:22] RECOVERY - Apache HTTP on mw1176 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.059 second response time [19:27:22] PROBLEM - Apache HTTP on mw1106 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:27:32] RECOVERY - Apache HTTP on mw1092 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 1.060 second response time [19:27:32] RECOVERY - Apache HTTP on mw1045 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 2.958 second response time [19:27:32] PROBLEM - Apache HTTP on mw1047 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:27:32] PROBLEM - Apache HTTP on mw1058 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:27:32] PROBLEM - Apache HTTP on mw1030 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:27:32] can people please keep the chatter down in here? 
no need to distract with comments that do add anything, please chatter over in -tech or similar [19:27:33] PROBLEM - Apache HTTP on mw1078 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:27:39] don't* [19:27:42] RECOVERY - Apache HTTP on mw1040 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 6.516 second response time [19:27:52] RECOVERY - Apache HTTP on mw1060 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 2.740 second response time [19:27:52] RECOVERY - Apache HTTP on mw1051 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.052 second response time [19:27:52] RECOVERY - Apache HTTP on mw1083 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.087 second response time [19:27:52] RECOVERY - Apache HTTP on mw1034 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.079 second response time [19:27:52] RECOVERY - Apache HTTP on mw1035 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 2.503 second response time [19:27:53] RECOVERY - Apache HTTP on mw1095 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 2.661 second response time [19:27:53] RECOVERY - Apache HTTP on mw1054 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 3.678 second response time [19:27:54] RECOVERY - Apache HTTP on mw1049 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 3.498 second response time [19:28:02] RECOVERY - Apache HTTP on mw1100 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 4.650 second response time [19:28:02] RECOVERY - Apache HTTP on mw1029 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.074 second response time [19:28:02] PROBLEM - Apache HTTP on mw1042 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:28:02] RECOVERY - Apache HTTP on mw1161 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 9.698 second response time [19:28:12] PROBLEM - Apache HTTP on mw1048 is CRITICAL: CRITICAL - Socket timeout after 10 
seconds [19:28:32] RECOVERY - Apache HTTP on mw1047 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 8.528 second response time [19:28:42] RECOVERY - Apache HTTP on mw1216 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 8.920 second response time [19:28:42] PROBLEM - Apache HTTP on mw1056 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:29:32] RECOVERY - Apache HTTP on mw1090 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 2.515 second response time [19:30:02] RECOVERY - Apache HTTP on mw1042 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.705 second response time [19:30:22] RECOVERY - Apache HTTP on mw1106 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 8.047 second response time [19:30:52] PROBLEM - Apache HTTP on mw1209 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:30:52] PROBLEM - Apache HTTP on mw1040 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:31:02] PROBLEM - Apache HTTP on mw1211 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:31:02] PROBLEM - Apache HTTP on mw1060 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:31:02] PROBLEM - Apache HTTP on mw1039 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:31:02] PROBLEM - Apache HTTP on mw1091 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:31:02] PROBLEM - Apache HTTP on mw1094 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:31:03] PROBLEM - Apache HTTP on mw1054 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:31:03] PROBLEM - Apache HTTP on mw1095 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:31:04] PROBLEM - Apache HTTP on mw1023 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:31:04] PROBLEM - Apache HTTP on mw1083 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:31:05] PROBLEM - Apache HTTP on mw1049 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:31:05] PROBLEM - Apache HTTP on 
mw1077 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:31:06] PROBLEM - Apache HTTP on mw1185 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:31:06] PROBLEM - Apache HTTP on mw1162 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:31:12] PROBLEM - Apache HTTP on mw1171 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:31:12] PROBLEM - Apache HTTP on mw1086 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:31:21] it's all plwiki [19:31:32] PROBLEM - Apache HTTP on mw1098 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:31:32] PROBLEM - Apache HTTP on mw1047 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:31:32] RECOVERY - Apache HTTP on mw1102 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 6.945 second response time [19:31:32] PROBLEM - Apache HTTP on mw1088 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:31:42] RECOVERY - Apache HTTP on mw1209 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.744 second response time [19:31:42] PROBLEM - Apache HTTP on mw1216 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:31:47] MatmaRex probably. [19:31:50] lol [19:31:52] RECOVERY - Apache HTTP on mw1091 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 1.408 second response time [19:32:02] God damn it MatmaRex. 
[19:32:02] PROBLEM - Apache HTTP on mw1066 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:32:02] PROBLEM - Apache HTTP on mw1076 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:32:02] RECOVERY - Apache HTTP on mw1171 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 4.864 second response time [19:32:12] RECOVERY - Apache HTTP on mw1048 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.982 second response time [19:32:12] RECOVERY - Apache HTTP on mw1072 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 6.393 second response time [19:32:12] PROBLEM - Apache HTTP on mw1029 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:32:17] people commented on my talk about RC being slow on my talk page before icinga-wm started spamming here [19:32:27] (on my talk page, too) [19:32:36] !log < paravoid> it's s2 [19:32:44] Logged the message, Master [19:32:49] and just when i came in to report the PROBLEMs appeared [19:32:52] RECOVERY - Apache HTTP on mw1210 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 2.468 second response time [19:32:54] !log < paravoid> tons of SELECT /* SiteStatsInit::edits */ COUNT(*) FROM `revision` LIMIT 1 [19:33:02] RECOVERY - Apache HTTP on mw1077 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 7.852 second response time [19:33:02] PROBLEM - Apache HTTP on mw1059 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:33:03] Logged the message, Master [19:33:05] !log < paravoid> it's all plwiki [19:33:12] RECOVERY - Apache HTTP on mw1074 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.867 second response time [19:33:13] Logged the message, Master [19:33:22] PROBLEM - Apache HTTP on mw1109 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:33:22] PROBLEM - Apache HTTP on mw1106 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:33:22] RECOVERY - Apache HTTP on mw1096 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 
808 bytes in 0.071 second response time [19:33:25] * odder high-fives greg-g for not forgetting to log things [19:33:32] RECOVERY - Apache HTTP on mw1098 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 6.792 second response time [19:34:42] PROBLEM - Apache HTTP on mw1090 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:34:42] PROBLEM - Apache HTTP on mw1172 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:34:52] PROBLEM - Apache HTTP on mw1084 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:34:52] RECOVERY - Apache HTTP on mw1076 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.131 second response time [19:35:02] RECOVERY - Apache HTTP on mw1020 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 7.520 second response time [19:35:02] PROBLEM - Apache HTTP on mw1100 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:35:02] RECOVERY - Apache HTTP on mw1059 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 9.130 second response time [19:35:12] PROBLEM - Apache HTTP on mw1188 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:35:12] PROBLEM - Apache HTTP on mw1168 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:35:22] RECOVERY - Apache HTTP on mw1047 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 3.809 second response time [19:35:22] RECOVERY - Apache HTTP on mw1218 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.067 second response time [19:35:32] RECOVERY - Apache HTTP on mw1172 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.073 second response time [19:35:42] RECOVERY - Apache HTTP on mw1084 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.084 second response time [19:35:42] RECOVERY - Apache HTTP on mw1105 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 8.245 second response time [19:35:42] PROBLEM - Apache HTTP on mw1179 is CRITICAL: CRITICAL - Socket timeout after 10 seconds 
[19:35:52] PROBLEM - MySQL Processlist on db1060 is CRITICAL: CRIT 14 unauthenticated, 0 locked, 0 copy to table, 75 statistics [19:35:52] RECOVERY - Apache HTTP on mw1039 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.076 second response time [19:35:52] RECOVERY - Apache HTTP on mw1060 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 7.343 second response time [19:36:02] PROBLEM - Apache HTTP on mw1175 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:36:02] PROBLEM - Apache HTTP on mw1077 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:36:12] RECOVERY - Apache HTTP on mw1188 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 8.235 second response time [19:36:12] RECOVERY - Apache HTTP on mw1168 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 9.876 second response time [19:36:32] RECOVERY - Apache HTTP on mw1216 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.056 second response time [19:36:32] RECOVERY - Apache HTTP on mw1179 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.064 second response time [19:36:52] RECOVERY - MySQL Processlist on db1060 is OK: OK 1 unauthenticated, 0 locked, 0 copy to table, 3 statistics [19:36:52] RECOVERY - Apache HTTP on mw1099 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 4.461 second response time [19:36:53] RECOVERY - Apache HTTP on mw1175 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 4.625 second response time [19:37:02] RECOVERY - Apache HTTP on mw1065 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 6.063 second response time [19:37:02] RECOVERY - Apache HTTP on mw1100 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.464 second response time [19:37:02] RECOVERY - Apache HTTP on mw1077 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 7.815 second response time [19:37:02] RECOVERY - Apache HTTP on mw1029 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently 
- 808 bytes in 0.084 second response time [19:37:02] RECOVERY - Apache HTTP on mw1185 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 8.208 second response time [19:37:03] RECOVERY - Apache HTTP on mw1022 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 9.526 second response time [19:37:03] RECOVERY - Apache HTTP on mw1019 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 9.777 second response time [19:37:04] PROBLEM - Apache HTTP on mw1091 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:37:12] RECOVERY - Apache HTTP on mw1108 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 8.445 second response time [19:37:12] you people totally need to tell me later whatever it was that plwiki did that made it go down :D [19:37:22] RECOVERY - Apache HTTP on mw1030 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.060 second response time [19:37:32] RECOVERY - Apache HTTP on mw1061 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 1.105 second response time [19:37:39] something that has to do with SiteStatsInit, probably [19:37:42] RECOVERY - Apache HTTP on mw1090 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.409 second response time [19:37:52] RECOVERY - Apache HTTP on mw1211 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.050 second response time [19:37:52] RECOVERY - Apache HTTP on mw1032 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 3.638 second response time [19:37:52] RECOVERY - Apache HTTP on mw1038 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.074 second response time [19:37:52] RECOVERY - Apache HTTP on mw1023 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 3.096 second response time [19:37:52] RECOVERY - Apache HTTP on mw1028 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.055 second response time [19:37:54] !log < paravoid> something that has to do with SiteStatsInit, probably 
[19:37:57] ;) [19:38:01] Logged the message, Master [19:38:02] RECOVERY - Apache HTTP on mw1067 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 2.224 second response time [19:38:02] PROBLEM - Apache HTTP on mw1070 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:38:24] paravoid: well, this i know from what you said, but that doesn't look like it's supposed to ever run in WMF environment [19:38:44] greg-g: You'd think some people might find it condescending for you to log for them. [19:39:22] PROBLEM - Apache HTTP on mw1062 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:39:56] and why it's preventing edits to enwp if it is just a plwp thing... [19:39:57] not me [19:40:32] PROBLEM - Apache HTTP on mw1098 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:40:32] PROBLEM - Apache HTTP on mw1041 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:40:32] PROBLEM - Apache HTTP on mw1092 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:40:32] PROBLEM - Apache HTTP on mw1045 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:40:32] PROBLEM - Apache HTTP on mw1033 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:40:33] PROBLEM - Apache HTTP on mw1166 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:40:33] PROBLEM - Apache HTTP on mw1096 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:40:42] PROBLEM - Apache HTTP on mw1102 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:40:45] Technical_13: bits seems to be affected too. 
that touches almost everything [19:40:52] PROBLEM - Apache HTTP on mw1187 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:41:02] PROBLEM - Apache HTTP on mw1032 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:41:02] PROBLEM - Apache HTTP on mw1023 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:41:02] PROBLEM - Apache HTTP on mw1051 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:41:02] PROBLEM - Apache HTTP on mw1214 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:41:02] PROBLEM - Apache HTTP on mw1034 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:41:03] PROBLEM - Apache HTTP on mw1212 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:41:03] PROBLEM - Apache HTTP on mw1065 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:41:04] PROBLEM - Apache HTTP on mw1101 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:41:04] PROBLEM - Apache HTTP on mw1100 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:41:12] PROBLEM - Apache HTTP on mw1072 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:41:12] PROBLEM - Apache HTTP on mw1029 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:41:12] PROBLEM - Apache HTTP on mw1213 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:41:18] Right. It's a cluster of servers. There are single points of failure still. [19:41:18] ahh.. 
that makes sense [19:41:52] RECOVERY - Apache HTTP on mw1032 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.089 second response time [19:41:52] PROBLEM - MySQL Processlist on db1060 is CRITICAL: CRIT 1 unauthenticated, 0 locked, 0 copy to table, 85 statistics [19:41:52] RECOVERY - Apache HTTP on mw1212 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 1.649 second response time [19:42:02] RECOVERY - Apache HTTP on mw1053 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 6.018 second response time [19:42:02] RECOVERY - Apache HTTP on mw1072 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.076 second response time [19:42:02] RECOVERY - Apache HTTP on mw1214 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 9.874 second response time [19:42:02] PROBLEM - Apache HTTP on mw1075 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:42:02] RECOVERY - Apache HTTP on mw1024 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 8.453 second response time [19:42:03] RECOVERY - Apache HTTP on mw1101 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 9.443 second response time [19:42:03] RECOVERY - Apache HTTP on mw1162 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 9.182 second response time [19:42:04] RECOVERY - Apache HTTP on mw1213 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.061 second response time [19:42:22] RECOVERY - Apache HTTP on mw1033 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.069 second response time [19:42:32] RECOVERY - Apache HTTP on mw1102 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.070 second response time [19:42:42] RECOVERY - Apache HTTP on mw1187 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 4.213 second response time [19:42:52] RECOVERY - MySQL Processlist on db1060 is OK: OK 8 unauthenticated, 0 locked, 0 copy to table, 13 statistics [19:43:02] PROBLEM - Apache HTTP on 
mw1067 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:43:12] PROBLEM - LVS HTTP IPv4 on appservers.svc.eqiad.wmnet is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:44:54] I'm here but migrained out so not any use (just saw the pages) [19:45:00] RECOVERY - Apache HTTP on mw1083 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 1.689 second response time [19:45:09] PROBLEM - Apache HTTP on mw1111 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:45:09] PROBLEM - Apache HTTP on mw1076 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:45:09] PROBLEM - Apache HTTP on mw1212 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:45:09] PROBLEM - Apache HTTP on mw1215 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:45:09] PROBLEM - Apache HTTP on mw1038 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:45:11] PROBLEM - Apache HTTP on mw1024 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:45:11] PROBLEM - Apache HTTP on mw1162 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:45:15] RECOVERY - LVS HTTP IPv4 on appservers.svc.eqiad.wmnet is OK: HTTP OK: HTTP/1.1 200 OK - 65441 bytes in 7.261 second response time [19:45:15] PROBLEM - Apache HTTP on mw1059 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:45:17] PROBLEM - Apache HTTP on mw1174 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:45:17] PROBLEM - Apache HTTP on mw1178 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:45:18] PROBLEM - Apache HTTP on mw1213 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:45:24] PROBLEM - Apache HTTP on mw1021 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:45:24] PROBLEM - Apache HTTP on mw1110 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:45:33] RECOVERY - Apache HTTP on mw1096 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 2.428 second response time [19:45:33] PROBLEM - Apache HTTP on mw1176 is 
CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:45:33] RECOVERY - Apache HTTP on mw1092 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 6.883 second response time [19:45:48] RECOVERY - Apache HTTP on mw1063 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 4.572 second response time [19:45:48] PROBLEM - Apache HTTP on mw1090 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:45:48] PROBLEM - Apache HTTP on mw1164 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:45:48] PROBLEM - Apache HTTP on mw1216 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:45:48] PROBLEM - Apache HTTP on mw1169 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:45:54] PROBLEM - MySQL Processlist on db1060 is CRITICAL: CRIT 0 unauthenticated, 1 locked, 0 copy to table, 65 statistics [19:45:54] RECOVERY - Apache HTTP on mw1215 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 2.105 second response time [19:45:54] RECOVERY - Apache HTTP on mw1111 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.680 second response time [19:46:02] PROBLEM - Apache HTTP on mw1060 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:46:02] RECOVERY - Apache HTTP on mw1054 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 8.499 second response time [19:46:02] PROBLEM - Apache HTTP on mw1099 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:46:03] PROBLEM - Apache HTTP on mw1035 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:46:03] RECOVERY - Apache HTTP on mw1212 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 9.213 second response time [19:46:12] PROBLEM - Apache HTTP on mw1171 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:46:12] PROBLEM - Apache HTTP on mw1072 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:46:12] RECOVERY - Apache HTTP on mw1110 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.063 second response 
time [19:46:32] RECOVERY - Apache HTTP on mw1176 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 8.528 second response time [19:46:32] RECOVERY - Apache HTTP on mw1164 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.073 second response time [19:46:42] RECOVERY - Apache HTTP on mw1090 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 3.931 second response time [19:46:42] PROBLEM - Apache HTTP on mw1102 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:46:42] RECOVERY - Apache HTTP on mw1216 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.583 second response time [19:46:42] PROBLEM - Apache HTTP on mw1105 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:46:52] PROBLEM - Apache HTTP on mw1084 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:46:52] RECOVERY - MySQL Processlist on db1060 is OK: OK 2 unauthenticated, 0 locked, 0 copy to table, 19 statistics [19:46:52] RECOVERY - Apache HTTP on mw1071 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.889 second response time [19:47:02] RECOVERY - Apache HTTP on mw1035 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.148 second response time [19:47:02] RECOVERY - Apache HTTP on mw1072 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.067 second response time [19:47:14] Request: GET http://en.wikipedia.org/w/index.php?title=Special:Watchlist&hideWikibase=1&days=3, from 10.64.0.105 via cp1065 cp1065 ([10.64.0.102]:3128), Varnish XID 142000845 Forwarded for: [my IP redacted], 208.80.154.134, 10.64.0.105 Error: 503, Service Unavailable at Sun, 23 Feb 2014 19:46:01 GMT [19:47:53] !log operations folks are looking into site issues at present [19:48:05] (i know its not really any info, but meh, its what i have) [19:48:13] RECOVERY - Apache HTTP on mw1076 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.722 second response time [19:48:34] lol jackmcbarn known... 
[19:48:34] Logged the message, RobH [19:48:36] oh hey, notpeter1 [19:48:36] PROBLEM - Apache HTTP on mw1092 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:48:36] PROBLEM - Apache HTTP on mw1180 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:48:52] RECOVERY - Apache HTTP on mw1060 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.074 second response time [19:48:52] PROBLEM - Apache HTTP on mw1069 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:49:02] RECOVERY - Apache HTTP on mw1034 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 7.570 second response time [19:49:02] RECOVERY - Apache HTTP on mw1075 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 9.134 second response time [19:49:02] RECOVERY - Apache HTTP on mw1059 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.125 second response time [19:49:10] my expert diagnosis is that the stats stopped being sane (SiteStats::isSane), so MW decided to regenerate them :P [19:49:12] PROBLEM - LVS HTTP IPv4 on appservers.svc.eqiad.wmnet is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:49:15] RECOVERY - Apache HTTP on mw1213 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 7.504 second response time [19:49:23] RECOVERY - Apache HTTP on mw1092 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.071 second response time [19:49:23] RECOVERY - Apache HTTP on mw1180 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.072 second response time [19:49:23] RECOVERY - Apache HTTP on mw1088 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.111 second response time [19:49:32] RECOVERY - Apache HTTP on mw1078 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 2.909 second response time [19:49:32] RECOVERY - Apache HTTP on mw1166 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.750 second response time [19:49:32] RECOVERY - Apache HTTP on mw1073 is OK: HTTP OK: 
HTTP/1.1 301 Moved Permanently - 808 bytes in 0.783 second response time [19:49:32] RECOVERY - Apache HTTP on mw1102 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.077 second response time [19:49:32] PROBLEM - MySQL Processlist on db1009 is CRITICAL: CRIT 0 unauthenticated, 0 locked, 0 copy to table, 141 statistics [19:49:33] RECOVERY - Apache HTTP on mw1098 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 8.972 second response time [19:49:33] RECOVERY - Apache HTTP on mw1041 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 9.420 second response time [19:49:34] RECOVERY - Apache HTTP on mw1056 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.099 second response time [19:49:34] RECOVERY - Apache HTTP on mw1169 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.060 second response time [19:49:35] RECOVERY - Apache HTTP on mw1113 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.123 second response time [19:49:35] RECOVERY - Apache HTTP on mw1105 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.072 second response time [19:49:42] RECOVERY - Apache HTTP on mw1057 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.088 second response time [19:49:42] RECOVERY - Apache HTTP on mw1069 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.081 second response time [19:49:42] RECOVERY - Apache HTTP on mw1084 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.099 second response time [19:49:42] RECOVERY - Apache HTTP on mw1093 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 6.198 second response time [19:49:42] RECOVERY - Apache HTTP on mw1046 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 8.762 second response time [19:49:52] RECOVERY - Apache HTTP on mw1087 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.054 second response time [19:49:52] RECOVERY - Apache HTTP on mw1043 is OK: HTTP OK: HTTP/1.1 
301 Moved Permanently - 808 bytes in 0.102 second response time [19:49:52] RECOVERY - Apache HTTP on mw1099 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.076 second response time [19:49:52] RECOVERY - Apache HTTP on mw1112 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.068 second response time [19:49:52] RECOVERY - Apache HTTP on mw1095 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.128 second response time [19:49:53] RECOVERY - Apache HTTP on mw1018 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.127 second response time [19:49:53] RECOVERY - Apache HTTP on mw1023 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.147 second response time [19:49:54] RECOVERY - Apache HTTP on mw1066 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.113 second response time [19:49:54] RECOVERY - Apache HTTP on mw1051 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.122 second response time [19:49:55] RECOVERY - Apache HTTP on mw1091 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.080 second response time [19:49:55] RECOVERY - Apache HTTP on mw1079 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.159 second response time [19:49:56] RECOVERY - Apache HTTP on mw1103 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 1.714 second response time [19:49:56] RECOVERY - Apache HTTP on mw1082 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.690 second response time [19:49:57] RECOVERY - Apache HTTP on mw1049 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.094 second response time [19:49:57] RECOVERY - Apache HTTP on mw1031 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.622 second response time [19:49:58] RECOVERY - Apache HTTP on mw1094 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 1.989 second response time [19:49:58] RECOVERY - Apache HTTP on mw1024 is OK: HTTP OK: HTTP/1.1 301 
Moved Permanently - 808 bytes in 0.082 second response time [19:49:59] RECOVERY - Apache HTTP on mw1026 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.089 second response time [19:49:59] RECOVERY - Apache HTTP on mw1065 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.069 second response time [19:50:00] RECOVERY - Apache HTTP on mw1038 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.096 second response time [19:50:00] RECOVERY - Apache HTTP on mw1055 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.080 second response time [19:50:01] RECOVERY - Apache HTTP on mw1162 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.057 second response time [19:50:01] RECOVERY - Apache HTTP on mw1067 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.070 second response time [19:50:02] RECOVERY - Apache HTTP on mw1100 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.066 second response time [19:50:02] RECOVERY - Apache HTTP on mw1217 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.057 second response time [19:50:03] RECOVERY - Apache HTTP on mw1070 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.072 second response time [19:50:03] RECOVERY - Apache HTTP on mw1089 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.044 second response time [19:50:04] RECOVERY - LVS HTTP IPv4 on appservers.svc.eqiad.wmnet is OK: HTTP OK: HTTP/1.1 200 OK - 65441 bytes in 0.282 second response time [19:50:06] RECOVERY - Apache HTTP on mw1171 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.061 second response time [19:50:06] RECOVERY - Apache HTTP on mw1174 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.048 second response time [19:50:14] PROBLEM - MySQL InnoDB on db1009 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds. 
[19:50:14] RECOVERY - Apache HTTP on mw1085 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.070 second response time [19:50:14] RECOVERY - Apache HTTP on mw1106 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.068 second response time [19:51:06] RECOVERY - Apache HTTP on mw1109 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 7.674 second response time [19:51:07] RECOVERY - Apache HTTP on mw1058 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.052 second response time [19:51:07] RECOVERY - Apache HTTP on mw1080 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.069 second response time [19:51:07] RECOVERY - Apache HTTP on mw1045 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.073 second response time [19:51:07] RECOVERY - Apache HTTP on mw1064 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.518 second response time [19:51:16] RECOVERY - Apache HTTP on mw1025 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.065 second response time [19:51:25] so that means that Polish Wikipedia is mad? :) [19:51:46] RECOVERY - Apache HTTP on mw1027 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.077 second response time [19:51:50] i'd certainly want to know [19:51:52] Platonides: Seems so. 
[19:52:03] It was brought as an example of madness on a recent Commons RfA [19:52:12] PROBLEM - Apache HTTP on mw1074 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:12] PROBLEM - Apache HTTP on mw1168 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:12] PROBLEM - Apache HTTP on mw1213 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:17] MatmaRex: You'll enjoy this [19:52:22] PROBLEM - Apache HTTP on mw1110 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:22] PROBLEM - Apache HTTP on mw1052 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:22] PROBLEM - Apache HTTP on mw1167 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:22] RECOVERY - MySQL Processlist on db1009 is OK: OK 0 unauthenticated, 0 locked, 0 copy to table, 0 statistics [19:52:32] PROBLEM - Apache HTTP on mw1098 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:32] PROBLEM - Apache HTTP on mw1176 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:32] PROBLEM - Apache HTTP on mw1047 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:32] PROBLEM - Apache HTTP on mw1030 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:32] PROBLEM - Apache HTTP on mw1041 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:33] PROBLEM - Apache HTTP on mw1078 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:33] PROBLEM - Apache HTTP on mw1096 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:34] PROBLEM - Apache HTTP on mw1033 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:34] PROBLEM - Apache HTTP on mw1104 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:35] PROBLEM - Apache HTTP on mw1166 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:35] PROBLEM - Apache HTTP on mw1088 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:36] PROBLEM - Apache HTTP on mw1073 is CRITICAL: CRITICAL - Socket timeout 
after 10 seconds [19:52:36] PROBLEM - Apache HTTP on mw1180 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:37] PROBLEM - Apache HTTP on mw1218 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:37] PROBLEM - Apache HTTP on mw1092 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:42] PROBLEM - Apache HTTP on mw1090 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:42] PROBLEM - Apache HTTP on mw1056 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:42] PROBLEM - Apache HTTP on mw1179 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:42] PROBLEM - Apache HTTP on mw1184 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:42] PROBLEM - Apache HTTP on mw1169 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:43] PROBLEM - Apache HTTP on mw1181 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:43] PROBLEM - Apache HTTP on mw1050 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:44] PROBLEM - Apache HTTP on mw1113 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:44] PROBLEM - Apache HTTP on mw1164 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:45] PROBLEM - Apache HTTP on mw1216 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:45] PROBLEM - Apache HTTP on mw1061 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:45] /if/ I ever beat the 503s [19:52:46] PROBLEM - Apache HTTP on mw1105 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:46] PROBLEM - Apache HTTP on mw1172 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:47] Yay [19:52:52] PROBLEM - Apache HTTP on mw1187 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:52] PROBLEM - Apache HTTP on mw1057 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:52] PROBLEM - Apache HTTP on mw1209 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:52] PROBLEM - Apache HTTP on mw1084 is 
CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:52] PROBLEM - Apache HTTP on mw1069 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:52:55] :-( [19:53:02] PROBLEM - Apache HTTP on mw1032 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:53:02] PROBLEM - Apache HTTP on mw1087 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:53:02] PROBLEM - Apache HTTP on mw1211 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:53:02] PROBLEM - Apache HTTP on mw1219 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:53:02] PROBLEM - Apache HTTP on mw1111 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:53:20] gg icinga bot [19:53:36] RIP in Peace [19:53:53] greg-g: you probably know, is something being done about this, possibly not by just one person? :P [19:53:56] awww.. poor thing.. [19:53:57] greg-g: hey hey [19:54:02] Yes, ops is working on it. [19:54:07] i admin logged this ;p [19:54:11] since someone mentioned stats.. Wiki ViewStats was launched over the past days. http://tools.wmflabs.org/wikiviewstats [19:54:12] RECOVERY - Apache HTTP on mw1178 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 3.690 second response time [19:54:12] RECOVERY - Apache HTTP on mw1174 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 9.942 second response time [19:54:13] MatmaRex: tim and faidon and springle are on it [19:54:31] notpeter1: don't look now, but... 
:) [19:54:32] RECOVERY - Apache HTTP on mw1203 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 7.684 second response time [19:54:32] RECOVERY - Apache HTTP on mw1073 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 3.622 second response time [19:54:42] RECOVERY - Apache HTTP on mw1179 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 4.403 second response time [19:54:42] RECOVERY - Apache HTTP on mw1172 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 3.018 second response time [19:54:42] RECOVERY - Apache HTTP on mw1164 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.065 second response time [19:54:42] RECOVERY - Apache HTTP on mw1040 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 3.027 second response time [19:54:42] RECOVERY - Apache HTTP on mw1184 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 8.705 second response time [19:54:43] PROBLEM - Apache HTTP on mw1205 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:54:43] PROBLEM - Apache HTTP on mw1200 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:54:44] tim always counts as ops, since he was ops before everyone in ops ;] [19:54:47] greg-g: have they considered what i suggested earlier? 
because that looked like a good way to stop the 503s [19:54:52] RECOVERY - Apache HTTP on mw1211 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.068 second response time [19:54:53] RECOVERY - LVS HTTP IPv4 on api.svc.eqiad.wmnet is OK: HTTP OK: HTTP/1.1 200 OK - 4398 bytes in 0.095 second response time [19:54:56] RECOVERY - Apache HTTP on mw1175 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.066 second response time [19:54:56] RECOVERY - Apache HTTP on mw1196 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 1.368 second response time [19:55:03] PROBLEM - Apache HTTP on mw1132 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:55:03] PROBLEM - Apache HTTP on mw1189 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:55:03] PROBLEM - Apache HTTP on mw1201 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:55:08] greg-g: I hear some good people are on the issue :) [19:55:12] PROBLEM - Apache HTTP on mw1198 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:55:24] RobH: WIkidata is missing in http://tools.wmflabs.org/wikiviewstats [19:55:30] MatmaRex: repeat it, plz [19:55:42] 20:49 MatmaRex: my expert diagnosis is that the stats stopped being sane (SiteStats::isSane), so MW decided to regenerate them :P [19:55:42] RECOVERY - Apache HTTP on mw1205 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 3.090 second response time [19:55:52] RECOVERY - Apache HTTP on mw1039 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.073 second response time [19:55:52] RECOVERY - Apache HTTP on mw1132 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.089 second response time [19:55:52] RECOVERY - Apache HTTP on mw1051 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 1.828 second response time [19:55:52] RECOVERY - Apache HTTP on mw1189 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.122 second response time [19:55:53] RECOVERY - Apache HTTP 
on mw1201 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.065 second response time [19:55:55] that's UTC+1, sorry [19:56:02] RECOVERY - MySQL InnoDB on db1009 is OK: OK longest blocking idle transaction sleeps for 0 seconds [19:56:08] +2? :-PP [19:56:12] RECOVERY - Apache HTTP on mw1097 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.599 second response time [19:56:12] RECOVERY - Apache HTTP on mw1198 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.673 second response time [19:56:12] RECOVERY - Apache HTTP on mw1108 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 6.970 second response time [19:56:12] RECOVERY - Apache HTTP on mw1072 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 7.894 second response time [19:56:12] RECOVERY - Apache HTTP on mw1188 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 8.387 second response time [19:56:13] RECOVERY - Apache HTTP on mw1171 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 8.552 second response time [19:56:13] RECOVERY - Apache HTTP on mw1213 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 7.773 second response time [19:56:14] RECOVERY - Apache HTTP on mw1052 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 2.280 second response time [19:56:15] !log tstarling synchronized php-1.23wmf14/includes/SiteStats.php [19:56:22] Logged the message, Master [19:56:22] RECOVERY - Apache HTTP on mw1033 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.074 second response time [19:56:32] RECOVERY - Apache HTTP on mw1200 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.061 second response time [19:56:32] RECOVERY - Apache HTTP on mw1025 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 1.550 second response time [19:56:42] RECOVERY - Apache HTTP on mw1187 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 1.107 second response time [19:56:42] RECOVERY - Apache 
HTTP on mw1169 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.345 second response time [19:56:42] RECOVERY - Apache HTTP on mw1209 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 1.402 second response time [19:56:42] RECOVERY - Apache HTTP on mw1216 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 6.983 second response time [19:56:42] RECOVERY - Apache HTTP on mw1181 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 9.834 second response time [19:56:52] RECOVERY - Apache HTTP on mw1087 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.072 second response time [19:56:52] PROBLEM - MySQL Processlist on db1060 is CRITICAL: CRIT 9 unauthenticated, 0 locked, 0 copy to table, 117 statistics [19:56:52] RECOVERY - Apache HTTP on mw1219 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.065 second response time [19:56:52] RECOVERY - Apache HTTP on mw1103 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.070 second response time [19:56:52] RECOVERY - Apache HTTP on mw1111 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.096 second response time [19:57:01] wow [19:57:10] Wiki13 [19:57:10] !log tstarling synchronized php-1.23wmf15/includes/SiteStats.php [19:57:12] greg-g: ugh, my client died earlier [19:57:16] Logged the message, Master [19:57:19] i suggested commenting out a line in SiteStats.php [19:57:23] which tim apparnetly just did [19:57:24] MatmaRex: I got it, odder pasted it [19:57:29] or zeroing out the site_stats table [19:57:41] since whatever data was in it was apparently not sane [19:58:32] RECOVERY - Apache HTTP on mw1090 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.070 second response time [19:59:02] yeah, there are -1 active users on pl.wp apparently. [19:59:07] which triggers the not-sane case. 
[19:59:13] https://pl.wikipedia.org/wiki/Specjalna:Statystyka?uselang=en [19:59:52] RECOVERY - MySQL Processlist on db1060 is OK: OK 0 unauthenticated, 0 locked, 0 copy to table, 26 statistics [20:00:03] PROBLEM - Apache HTTP on mw1065 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:00:19] yuck that's ugly [20:00:32] MatmaRex: You should ask people to get more active [20:00:32] RECOVERY - Apache HTTP on mw1061 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.073 second response time [20:00:32] RECOVERY - Apache HTTP on mw1105 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.100 second response time [20:00:42] totally [20:00:45] will do [20:00:51] :-) [20:00:52] RECOVERY - Apache HTTP on mw1065 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.085 second response time [20:01:01] !log killed SiteStatsInit queries on db1060 [20:01:09] Logged the message, Master [20:01:23] 19:58 * Holden_Caulfield gives Tim a cookies (if he were here). [20:01:33] Thanks, TimStarling! [20:01:35] ss_active_users = -1 [20:02:47] Eloquence: people are on it :) [20:02:51] sean, tim, and faidon [20:02:52] https://pl.wikipedia.org/wiki/Specjalna:Statystyka?uselang=en -1 here too [20:03:09] better to show that than regenerate everything [20:03:36] even better not to corrupt the data… [20:03:36] imho,, though scream loud in the logs is ok [20:04:24] greg-g, I saw, thanks. [20:05:03] * aude notes when wikidata gets close to 2 billion edits, that sitestats might be a problem [20:05:17] 115 million now [20:05:26] aude: At least a table which is easy to alter :D [20:05:35] * odder hands MatmaRex a Polandball https://commons.wikimedia.org/wiki/File:Austria_can_into_space.png [20:06:15] aude: We should regenerate the edit count for Wikidata at some point, though :P [20:06:19] aude: you mean they'll bring down the sites again, unless the check is updated. 
:P [20:06:31] plwiki won't into broken if Poland could into space, MatmaRex [20:06:33] hope that code is not there anymore [20:06:43] when wikidata gets there [20:07:07] * Technical_13 is not sane either... [20:11:16] MatmaRex: any idea what is calling the site stats request? [20:11:23] *update [20:11:30] the site stats update [20:12:10] greg-g: {{numberofedits}} parser function, special:statistics, apiquerysiteinfo [20:12:20] they call SiteStats::edits() [20:12:32] paravoid: ^ [20:12:56] which gets the cached data, checks if it's sane, and if not, tries to regenerate it, which brings down the cluster [20:13:07] we have -1 active users at pl.wp apparnetly, which is not sane [20:13:48] it just needs to be wrapped in a pool counter [20:14:04] !log killed SiteStatsInit from both wikiuser and wikiadmin on all s2 slaves [20:14:12] Logged the message, Master [20:14:29] I commented out the SiteStatsInit::doAllAndCommit() call in loadAndLazyInit() [20:14:42] because I seem to remember that was the solution last time SiteStatsInit caused downtime [20:14:47] where did the -1 come from? :/ [20:14:57] TimStarling: hey, that's cheating! [20:15:16] * greg-g goes back to taking care of kid [20:15:22] cheers greg-g [20:15:23] i suggested the same earlier, people should watch this channel sometimes during outages :/ [20:15:54] it's a bit difficult with all the bot spam [20:16:01] mh [20:16:28] there's just one bot [20:16:29] I was trying to log things I saw that was relevant, I know others can't read it all/aren't always connect, sorry for not logging yours MatmaRex [20:16:33] * greg-g goes for real [20:16:41] so, what caused this in the first place? [20:16:54] -1 appearing in the site_stats table for plwiki [20:17:00] yeah, why? [20:17:04] i'd love to know what caused *that* [20:18:04] MatmaRex: When software is written by less than sane coders, sanity checks fail in the software from time to time... 
[20:18:09] :p [20:18:22] MatmaRex: | ss_active_users | bigint(20) | YES | | -1 | | [20:18:29] -1 is the default [20:18:36] that's live data from master [20:18:53] hoo: it should be set to 0 during installation, but that's probably not relevant here [20:19:07] or even 1 i guess [20:19:17] MatmaRex: It probably tried to stick something weird in there [20:19:32] well, we could have a check in SiteStatsUpdate::tryDBUpdateInternal() to roll back the transaction if the site_stats row becomes insane by decrementing [20:19:52] hoo: that's updated in only one place by a rather fancy query [20:19:56] but that would increase the lock time by a lot [20:19:58] hoo: but the value goes through intval() [20:20:09] SiteStatsUpdate::cacheUpdate [20:20:32] TimStarling: should we run initSiteStats for the wikis with insane data? [20:21:10] how can COUNT( DISTINCT rc_user_text ) be -1? [20:21:59] it looks like something messed with the table… [20:24:10] the db query in the code returns 4600 for plwiki [20:24:25] for active users [20:24:27] aude: Did you run it? [20:24:33] yes (on labs) [20:24:46] it's against recentchanges [20:24:56] the value on feb 6th or so was 4619 users [20:24:59] (dumps) [20:25:01] nothing wrong with the query [20:25:04] mh [20:25:12] apergos: Shall we update it on master? [20:25:37] it would be nice to know how it got to -1 [20:25:44] yeah [20:25:46] after that it could be updated, sure... [20:26:06] I'm sort of barely able to look at the screen at this point, sorry I'm not more use [20:27:41] if the query was false, then intval() would be 0 [20:27:49] e.g. 
if the query failed somehow [20:29:26] it's not so uncommon for those stats to show negative numbers [20:29:40] !log updated ss_active_users on plwiki master to not be -1 [20:29:49] Logged the message, Master [20:29:51] Nemo_bis: usually by decrementing past zero though [20:30:40] https://bugzilla.wikimedia.org/show_bug.cgi?id=20017 is one example [20:31:13] Nemo_bis: nothing apart from that one query is writing to ss_active_users in core [20:31:18] (or at least nothing greps) [20:31:20] if the query failed [20:31:25] = 0 [20:31:27] then decremented [20:31:27] ? [20:31:41] what would decrement it? [20:32:06] throws in: https://bugzilla.wikimedia.org/show_bug.cgi?id=20017 [20:32:14] no idea [20:32:36] so, who's grepping extensions for ss_active_users? [20:33:16] se4598_2: didn't I just link it above :) [20:34:22] MatmaRex: Nothing useful https://github.com/search?l=&q=ss_active_users+user%3Awikimedia&ref=advsearch&type=Code [20:34:44] hoo: github's search tends to be hopelessly outdated [20:34:53] Nemo_bis: silly backlog, doesn't saw before copying link :) [20:35:23] right, mh [20:37:48] Hm? https://bugzilla.wikimedia.org/show_bug.cgi?id=54888#c6 [20:38:48] eb [20:38:55] Nemo_bis: it's not like we run these patches on the cluster though :/ [20:39:03] -1 is the default value, as hoo already said [20:39:20] you know, doAllAndCommit() doesn't update ss_active_users [20:39:37] but i don't see how would anything cause the default value to be restored [20:39:43] so it doesn't really make sense to include it in isSane() [20:40:16] it would answer the question though, wouldn't it? [20:41:30] it would at least solve the problem... I guess [20:41:33] what question? [20:41:54] isn't the question 'wtf did that -1 come from?'? [20:42:05] if the row was reset to its default, all the fields in it would be updated to their correct values except ss_active_users [20:42:16] why would it be reset to default? 
D: [20:42:20] which would be -1 until Special:Statistics runs [20:42:46] database rows shouldn't randomly reset to their defaults [20:42:57] the binlog may sort it out [20:43:24] got it [20:43:31] $dbw->delete( 'site_stats', $conds, __METHOD__ ); [20:43:50] then we reinsert, but w/o ss_active_users [20:43:54] which makes it -1 [20:44:42] SiteStatsInit::doAllAndCommit calls SiteStatsInit::refresh which does that if no options are set [20:44:57] and SiteStats::SiteStatsInit doesn't set any options... [20:45:13] well done hoo [20:45:42] it should user Database::replace probably [20:45:44] * use [20:45:52] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Last successful Puppet run was Sat 22 Feb 2014 02:36:40 PM UTC [21:00:32] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: reqstats.5xx [warn=250.000 [21:07:23] Reedy: what's the most sensible directory a maintenance script output run by mwdeploy/apache/whatever could output stuff to the WWW on noc.wikimedia.org as you do every now and then? [21:33:29] (03CR) 10Jeremyb: redirect ukwikimedia to wikimedia.org.uk [operations/apache-config] - 10https://gerrit.wikimedia.org/r/113877 (owner: 10Jeremyb) [21:33:53] PROBLEM - ElasticSearch health check on logstash1001 is CRITICAL: CRITICAL - elasticsearch (production-logstash-eqiad) is running. status: red: timed_out: false: number_of_nodes: 2: number_of_data_nodes: 2: active_primary_shards: 36: active_shards: 47: relocating_shards: 0: initializing_shards: 7: unassigned_shards: 20 [22:14:59] (03PS1) 10coren: Make gordon an alternate to dickson [operations/dns] - 10https://gerrit.wikimedia.org/r/115093 [22:15:29] paravoid: ^^ reasoning being obtuse for the obvious reason. [22:15:52] Gah. 
Autoexpand [22:15:57] * Coren fixes spacing [22:16:41] (03PS2) 10coren: Make gordon an alternate to dickson [operations/dns] - 10https://gerrit.wikimedia.org/r/115093 [22:17:15] (03PS3) 10coren: Make gordon an alternate to dickson [operations/dns] - 10https://gerrit.wikimedia.org/r/115093 [22:17:20] mumble, mumble. [22:58:52] PROBLEM - Puppet freshness on virt1000 is CRITICAL: Last successful Puppet run was Fri 21 Feb 2014 04:42:42 PM UTC [23:41:22] PROBLEM - Host mw31 is DOWN: PING CRITICAL - Packet loss = 100% [23:42:42] RECOVERY - Host mw31 is UP: PING OK - Packet loss = 0%, RTA = 35.47 ms [23:46:52] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Last successful Puppet run was Sat 22 Feb 2014 02:36:40 PM UTC [23:54:06] (03Abandoned) 10Brion VIBBER: Revert low-res .ogv transcode enable; player is picking too-small version by default [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/114761 (owner: 10Brion VIBBER) [23:55:58] (03PS1) 10Brion VIBBER: Fix popup video size by ordering transcode settings properly [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/115094 [23:56:58] (03PS2) 10Brion VIBBER: Fix popup video size by ordering transcode settings properly [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/115094 [23:57:32] PROBLEM - NTP on mw31 is CRITICAL: NTP CRITICAL: Offset unknown
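
Editorial sketch of the root cause found at 20:43 in the log: the stats row is deleted and re-inserted without `ss_active_users`, so that column falls back to its default of -1. This uses SQLite and a cut-down, hypothetical schema (the `ss_row_id` and `ss_total_edits` columns and both helper functions are assumptions for illustration, not the MediaWiki code). The in-place `UPDATE` variant is a substitution added here for contrast; it is not the fix proposed in the channel.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical schema; only the -1 default comes from the log.
conn.execute(
    "CREATE TABLE site_stats ("
    " ss_row_id INTEGER PRIMARY KEY,"
    " ss_total_edits INTEGER DEFAULT 0,"
    " ss_active_users INTEGER DEFAULT -1)"
)
conn.execute("INSERT INTO site_stats VALUES (1, 100, 4600)")


def refresh_by_delete_insert(conn, total_edits):
    """Delete the row, then re-insert it without ss_active_users."""
    conn.execute("DELETE FROM site_stats WHERE ss_row_id = 1")
    conn.execute(
        "INSERT INTO site_stats (ss_row_id, ss_total_edits) VALUES (1, ?)",
        (total_edits,),
    )


def refresh_by_update(conn, total_edits):
    """Update only the recomputed column, leaving the others untouched."""
    conn.execute(
        "UPDATE site_stats SET ss_total_edits = ? WHERE ss_row_id = 1",
        (total_edits,),
    )


def active_users(conn):
    """Read back ss_active_users for the single stats row."""
    row = conn.execute(
        "SELECT ss_active_users FROM site_stats WHERE ss_row_id = 1"
    ).fetchone()
    return row[0]
```

Calling `refresh_by_update` leaves `ss_active_users` at its stored value, while `refresh_by_delete_insert` leaves it at the column default, which is the value that later failed the sanity check.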