[00:10:47] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:16:36] paravoid: I'd like to update https://meta.wikimedia.org/wiki/Planet_Wikimedia#Requests_for_inclusion to indicate how developers add someone to the en.planet feed. https://bugzilla.wikimedia.org/show_bug.cgi?id=37929 is fixed but I can't figure out where to submit a changeset if I want a Planet addition
[00:16:47] as in, what repo.
[00:18:53] I figured it out - templates/planet/en_config.erb in operations/puppet
[00:25:56] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.024 seconds
[00:59:50] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[01:05:50] New review: Nemo bis; "Hit and sunk!" [operations/mediawiki-config] (master) C: 1; - https://gerrit.wikimedia.org/r/31205
[01:13:11] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.022 seconds
[01:43:01] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 312 seconds
[01:46:20] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[01:47:45] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 11 seconds
[02:01:02] PROBLEM - MySQL Slave Delay on db1025 is CRITICAL: CRIT replication delay 311 seconds
[02:01:24] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.027 seconds
[02:29:24] !log LocalisationUpdate completed (1.21wmf3) at Sat Nov 3 02:29:23 UTC 2012
[02:29:31] Logged the message, Master
[02:34:38] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:39:17] RECOVERY - Puppet freshness on ms1002 is OK: puppet ran at Sat Nov 3 02:39:08 UTC 2012
[02:39:35] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.030 seconds
[02:50:07] !log LocalisationUpdate completed (1.21wmf2) at Sat Nov 3 02:50:07 UTC 2012
[02:50:14] Logged the message, Master
[03:59:50] RECOVERY - MySQL Slave Delay on db1025 is OK: OK replication delay 6 seconds
[04:07:20] PROBLEM - Puppet freshness on db42 is CRITICAL: Puppet has not run in the last 10 hours
[04:07:20] PROBLEM - Puppet freshness on neon is CRITICAL: Puppet has not run in the last 10 hours
[04:07:20] PROBLEM - Puppet freshness on ms-be7 is CRITICAL: Puppet has not run in the last 10 hours
[06:04:38] PROBLEM - LVS Lucene on search-pool4.svc.eqiad.wmnet is CRITICAL: Connection timed out
[06:07:36] RECOVERY - LVS Lucene on search-pool4.svc.eqiad.wmnet is OK: TCP OK - 3.028 second response time on port 8123
[06:10:06] !Log restarted lucene search on search1015
[06:16:52] PROBLEM - Puppet freshness on db62 is CRITICAL: Puppet has not run in the last 10 hours
[06:18:48] PROBLEM - MySQL Slave Delay on db1042 is CRITICAL: CRIT replication delay 182 seconds
[06:20:02] PROBLEM - MySQL Replication Heartbeat on db1042 is CRITICAL: CRIT replication delay 207 seconds
[06:21:01] lowercase L
[06:22:16] RECOVERY - MySQL Replication Heartbeat on db1042 is OK: OK replication delay 14 seconds
[06:22:40] RECOVERY - MySQL Slave Delay on db1042 is OK: OK replication delay 1 seconds
[06:23:45] PROBLEM - Memcached on virt0 is CRITICAL: Connection refused
[06:45:18] RECOVERY - Memcached on virt0 is OK: TCP OK - 0.001 second response time on port 11000
[06:49:34] grrr
[06:49:45] that's just a typo from the exlamation point of course
[06:50:11] !log (from half an hour ago) restarted lucene search on search1015. stupid shift key
[06:50:14] Logged the message, Master
[07:19:21] PROBLEM - Puppet freshness on ms-fe3 is CRITICAL: Puppet has not run in the last 10 hours
[07:46:12] New patchset: Denny Vrandecic; "Configure Babel for Wikidata\n\nChange-Id: Ia8e9a415b32dd0865fa" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/31608
[07:48:46] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours
[07:48:46] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours
[07:48:46] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours
[08:23:57] New patchset: Nikerabbit; "Configure Babel category for Wikidata" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/31608
[09:01:28] PROBLEM - Puppet freshness on zhen is CRITICAL: Puppet has not run in the last 10 hours
[09:09:03] New review: Danmichaelo; "Can you try again? We deleted existing pages with WP: prefix two days ago." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/30588
[10:21:20] New patchset: Nemo bis; "Fix sourceswiki/oldwikisource name typo" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/31638
[12:40:19] PROBLEM - Puppet freshness on ms1002 is CRITICAL: Puppet has not run in the last 10 hours
[13:01:37] New patchset: Hashar; "Unit testing for InitialiseSettings.php (WIP - DO NOT MERGE)" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/28627
[13:02:21] New review: Hashar; "PS6 fix a few minor glitches I introduced in PS5." [operations/mediawiki-config] (master); V: 0 C: 0; - https://gerrit.wikimedia.org/r/28627
[13:02:25] New patchset: Hashar; "Unit testing for InitialiseSettings.php (WIP - DO NOT MERGE)" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/28627
[13:22:41] New patchset: Hashar; "solve some of the tests" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/31647
[13:22:41] New patchset: Hashar; "phpunit now fails the build whenever a test fail" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/31648
[13:23:38] Change merged: Hashar; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/31648
[13:23:38] Change merged: Hashar; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/31647
[13:28:20] is jobs-loop.sh running on tmh1/2? somehow jobs dont get processed
[14:01:26] New patchset: Hashar; "Unit testing for InitialiseSettings.php (WIP - DO NOT MERGE)" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/28627
[14:01:45] New review: Hashar; "PS8 address most issues I had in PS4." [operations/mediawiki-config] (master); V: 0 C: 0; - https://gerrit.wikimedia.org/r/28627
[14:08:03] PROBLEM - Puppet freshness on db42 is CRITICAL: Puppet has not run in the last 10 hours
[14:08:03] PROBLEM - Puppet freshness on ms-be7 is CRITICAL: Puppet has not run in the last 10 hours
[14:08:03] PROBLEM - Puppet freshness on neon is CRITICAL: Puppet has not run in the last 10 hours
[14:08:19] PROBLEM - MySQL Replication Heartbeat on db1035 is CRITICAL: CRIT replication delay 209 seconds
[14:08:37] PROBLEM - MySQL Slave Delay on db1035 is CRITICAL: CRIT replication delay 200 seconds
[14:10:16] RECOVERY - MySQL Slave Delay on db1035 is OK: OK replication delay 0 seconds
[14:11:37] RECOVERY - MySQL Replication Heartbeat on db1035 is OK: OK replication delay 0 seconds
[14:58:52] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[15:00:31] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 3.248 seconds
[15:35:37] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[15:37:21] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.028 seconds
[16:10:17] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[16:16:44] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.043 seconds
[16:18:06] PROBLEM - Puppet freshness on db62 is CRITICAL: Puppet has not run in the last 10 hours
[16:50:12] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[17:06:32] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.020 seconds
[17:20:25] PROBLEM - Puppet freshness on ms-fe3 is CRITICAL: Puppet has not run in the last 10 hours
[17:25:46] j^: https://bugzilla.wikimedia.org/show_bug.cgi?id=41736 tmh problem??
[17:34:49] !log reedy synchronized php-1.21wmf3/includes/
[17:34:55] Logged the message, Master
[17:39:41] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[17:45:59] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/31638
[17:46:43] !log reedy synchronized wmf-config/InitialiseSettings.php
[17:46:49] Logged the message, Master
[17:48:22] Nemo_bis: yes, fixed in https://gerrit.wikimedia.org/r/31658
[17:50:11] PROBLEM - Puppet freshness on analytics1001 is CRITICAL: Puppet has not run in the last 10 hours
[17:50:11] PROBLEM - Puppet freshness on ocg3 is CRITICAL: Puppet has not run in the last 10 hours
[17:50:11] PROBLEM - Puppet freshness on virt1004 is CRITICAL: Puppet has not run in the last 10 hours
[17:52:44] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.071 seconds
[18:15:32] New patchset: Alex Monk; "(bug 41743) Raise account creation throttle for an enwiki workshop in mumbai" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/31659
[18:17:21] New patchset: Alex Monk; "(bug 41743) Raise account creation throttle for an enwiki workshop in mumbai" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/31659
[18:18:33] New patchset: Dereckson; "(bug 41743) Throttle rule for 2012-11-04 Mumbai conference" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/31660
[18:20:01] New review: Dereckson; "Please abandon this change." [operations/mediawiki-config] (master) C: -1; - https://gerrit.wikimedia.org/r/31659
[18:22:26] New patchset: Platonides; "(Bug 41745) Remove ptwiki and ptwikinews from wmgEmergencyCaptcha." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/31661
[18:23:09] Change abandoned: Alex Monk; "Actually 31660 is a dupe of this, but it doesn't matter." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/31659
[18:27:19] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[18:28:49] New review: Alex Monk; "shellpolicy" [operations/mediawiki-config] (master) C: -1; - https://gerrit.wikimedia.org/r/31661
[18:32:10] Change merged: Reedy; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/31660
[18:33:02] New review: Helder.wiki; "-1 for now, since the community is discussing if this should be done for a test period (and only the..." [operations/mediawiki-config] (master) C: -1; - https://gerrit.wikimedia.org/r/31661
[18:40:30] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 6.960 seconds
[19:02:06] PROBLEM - Puppet freshness on zhen is CRITICAL: Puppet has not run in the last 10 hours
[19:14:24] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[19:29:15] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.024 seconds
[20:02:08] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[20:13:50] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.302 seconds
[20:49:14] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[21:02:26] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.031 seconds
[21:34:51] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[21:51:30] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.025 seconds
[22:23:03] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[22:36:30] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 4.146 seconds
[22:41:22] PROBLEM - Puppet freshness on ms1002 is CRITICAL: Puppet has not run in the last 10 hours
[23:03:30] New patchset: Reedy; "Updates per 30297 30300" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/31670
[23:11:30] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:24:38] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.358 seconds
[23:59:12] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds