[00:25:18] <logmsgbot>	 !log catrope synchronized php-1.22wmf8/extensions/VisualEditor  'Update VE to pick up cssText fix'
[00:25:28] <morebots>	 Logged the message, Master
[00:25:42] <logmsgbot>	 !log catrope synchronized php-1.22wmf9/extensions/VisualEditor  'Update VE to pick up cssText fix'
[00:25:52] <morebots>	 Logged the message, Master
[01:52:21] <icinga-wm>	 PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[01:53:11] <icinga-wm>	 RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.128 second response time
[02:01:21] <icinga-wm>	 PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:02:11] <icinga-wm>	 RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.128 second response time
[02:17:47] <logmsgbot>	 !log LocalisationUpdate completed (1.22wmf9) at Thu Jul 11 02:17:47 UTC 2013
[02:17:58] <morebots>	 Logged the message, Master
[02:26:42] <icinga-wm>	 PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[02:27:32] <icinga-wm>	 RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.148 second response time
[02:37:40] <logmsgbot>	 !log LocalisationUpdate completed (1.22wmf8) at Thu Jul 11 02:37:40 UTC 2013
[02:37:50] <morebots>	 Logged the message, Master
[02:46:56] <icinga-wm>	 PROBLEM - Puppet freshness on grosley is CRITICAL: No successful Puppet run in the last 10 hours
[02:51:17] <logmsgbot>	 !log LocalisationUpdate ResourceLoader cache refresh completed at Thu Jul 11 02:51:16 UTC 2013
[02:51:27] <morebots>	 Logged the message, Master
[04:16:49] <icinga-wm>	 PROBLEM - Puppet freshness on manutius is CRITICAL: No successful Puppet run in the last 10 hours
[04:46:32] <gerrit-wm>	 New review: Dr0ptp4kt; "I believe yurik's already on this, but the VCL update deployment and the corresponding META config b..." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73027
[05:36:46] <icinga-wm>	 PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[05:37:38] <icinga-wm>	 RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.130 second response time
[06:06:18] <icinga-wm>	 PROBLEM - Memcached on mc15 is CRITICAL: Connection timed out
[06:07:08] <icinga-wm>	 RECOVERY - Memcached on mc15 is OK: TCP OK - 0.026 second response time on port 11211
[06:23:25] <gerrit-wm>	 New review: Yurik; "Update/rename of the META pages is not critical here, because this carrier is disabled. Thus, not ha..." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73027
[07:10:41] <icinga-wm>	 PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[07:11:31] <icinga-wm>	 RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.143 second response time
[07:38:28] <hashar>	 !log gallium : fix permissions for android-common nightly build (need to be jenkins owned, not jenkins-slave) {{bug|51137}}
[07:38:38] <morebots>	 Logged the message, Master
[08:02:02] <icinga-wm>	 PROBLEM - Puppet freshness on db78 is CRITICAL: No successful Puppet run in the last 10 hours
[08:02:42] <icinga-wm>	 PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:03:32] <icinga-wm>	 RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.125 second response time
[08:27:43] <icinga-wm>	 PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:29:33] <icinga-wm>	 RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.123 second response time
[09:05:56] <gerrit-wm>	 New patchset: Mark Bergsma; "Detect and exit when persistent storage can't mmap a file at the required address." [operations/debs/varnish] (testing/3.0.3plus-rc1) - https://gerrit.wikimedia.org/r/73145
[09:05:56] <gerrit-wm>	 New patchset: Mark Bergsma; "varnish (3.0.3plus~rc1-wm13) precise; urgency=low" [operations/debs/varnish] (testing/3.0.3plus-rc1) - https://gerrit.wikimedia.org/r/73146
[09:28:07] <icinga-wm>	 PROBLEM - Puppet freshness on erzurumi is CRITICAL: No successful Puppet run in the last 10 hours
[09:28:07] <icinga-wm>	 PROBLEM - Puppet freshness on lvs1004 is CRITICAL: No successful Puppet run in the last 10 hours
[09:28:07] <icinga-wm>	 PROBLEM - Puppet freshness on lvs1005 is CRITICAL: No successful Puppet run in the last 10 hours
[09:28:07] <icinga-wm>	 PROBLEM - Puppet freshness on lvs1006 is CRITICAL: No successful Puppet run in the last 10 hours
[09:28:07] <icinga-wm>	 PROBLEM - Puppet freshness on mc15 is CRITICAL: No successful Puppet run in the last 10 hours
[09:28:07] <icinga-wm>	 PROBLEM - Puppet freshness on virt1 is CRITICAL: No successful Puppet run in the last 10 hours
[09:28:07] <icinga-wm>	 PROBLEM - Puppet freshness on virt3 is CRITICAL: No successful Puppet run in the last 10 hours
[09:28:08] <icinga-wm>	 PROBLEM - Puppet freshness on virt4 is CRITICAL: No successful Puppet run in the last 10 hours
[09:44:54] <gerrit-wm>	 New patchset: Mark Bergsma; "Retry starting Varnish 3 times on temp error 75" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73151
[09:59:40] <gerrit-wm>	 Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73151
[10:20:21] <Krinkle>	 !log Graceful restart of Zuul (forward to I3fc6baba8c6d65)
[10:20:31] <morebots>	 Logged the message, Master
[10:20:37] <Krinkle>	 s/restart/reload
[10:27:59] <gerrit-wm>	 New patchset: Mark Bergsma; "Fix test" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73155
[10:28:39] <gerrit-wm>	 Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73155
[10:46:38] <icinga-wm>	 PROBLEM - Varnish HTTP parsoid-backend on cp1058 is CRITICAL: Connection refused
[10:50:28] <icinga-wm>	 PROBLEM - Disk space on analytics1006 is CRITICAL: DISK CRITICAL - free space: / 705 MB (3% inode=84%):
[10:50:38] <icinga-wm>	 RECOVERY - Varnish HTTP parsoid-backend on cp1058 is OK: HTTP OK: HTTP/1.1 200 OK - 634 bytes in 0.002 second response time
[10:53:18] <icinga-wm>	 PROBLEM - DPKG on cp1059 is CRITICAL: DPKG CRITICAL dpkg reports broken packages
[11:01:18] <icinga-wm>	 RECOVERY - DPKG on cp1059 is OK: All packages OK
[11:05:38] <gerrit-wm>	 New patchset: Mark Bergsma; "Allow persistent connections for HTTP PURGE (error) responses" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/72530
[11:05:38] <gerrit-wm>	 New patchset: Mark Bergsma; "Maintain persistent connections on text cluster redirects" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73157
[11:09:22] <gerrit-wm>	 Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/72530
[11:10:02] <gerrit-wm>	 New patchset: Mark Bergsma; "Revert "Allow persistent connections for HTTP PURGE (error) responses"" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73158
[11:10:13] <gerrit-wm>	 Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73158
[11:28:18] <gerrit-wm>	 New patchset: Mark Bergsma; "Revert "Revert "Allow persistent connections for HTTP PURGE (error) responses""" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73159
[11:29:06] <gerrit-wm>	 Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73159
[11:36:48] <gerrit-wm>	 Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73157
[11:42:58] <gerrit-wm>	 New patchset: Mark Bergsma; "Rename Varnish storage files for consistency" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73161
[11:43:37] <gerrit-wm>	 Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73161
[11:47:10] <icinga-wm>	 PROBLEM - Varnish HTTP text-backend on amssq47 is CRITICAL: Connection refused
[11:56:20] <icinga-wm>	 PROBLEM - Varnish HTTP text-backend on cp1039 is CRITICAL: Connection refused
[11:56:20] <icinga-wm>	 PROBLEM - DPKG on cp1039 is CRITICAL: DPKG CRITICAL dpkg reports broken packages
[12:04:10] <gerrit-wm>	 New patchset: Mark Bergsma; "Add monitoring of esams wikidata/wikivoyage LVS services" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73162
[12:05:30] <icinga-wm>	 PROBLEM - Varnish HTTP text-backend on cp1040 is CRITICAL: Connection refused
[12:06:10] <gerrit-wm>	 Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73162
[12:12:03] <qchris>	 Any op around that could help me with LDAP problems around gerrit and migrating SVN users?
[12:14:15] <icinga-wm>	 RECOVERY - Varnish HTTP text-backend on amssq47 is OK: HTTP OK: HTTP/1.1 200 OK - 190 bytes in 0.190 second response time
[12:24:15] <icinga-wm>	 RECOVERY - Varnish HTTP text-backend on cp1039 is OK: HTTP OK: HTTP/1.1 200 OK - 189 bytes in 0.001 second response time
[12:24:25] <icinga-wm>	 RECOVERY - DPKG on cp1039 is OK: All packages OK
[12:25:17] <gerrit-wm>	 New patchset: Mark Bergsma; "Set wikidata/wikivoyage checks non-critical" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73166
[12:25:59] <gerrit-wm>	 Change merged: Mark Bergsma; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73166
[12:36:05] <icinga-wm>	 PROBLEM - Packetloss_Average on analytics1006 is CRITICAL: CRITICAL: packet_loss_average is 23.7303262416 (gt 8.0)
[12:47:05] <icinga-wm>	 PROBLEM - Puppet freshness on grosley is CRITICAL: No successful Puppet run in the last 10 hours
[13:02:32] <icinga-wm>	 RECOVERY - Varnish HTTP text-backend on cp1040 is OK: HTTP OK: HTTP/1.1 200 OK - 188 bytes in 0.002 second response time
[13:11:39] <Coren>	 Is our naming scheme documented somewhere?
[13:11:49] <icinga-wm>	 PROBLEM - Host virt3 is DOWN: PING CRITICAL - Packet loss = 100%
[13:26:27] <gerrit-wm>	 New patchset: coren; "DHCP: rename virt[34] to labsudb[12]" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73168
[14:17:43] <icinga-wm>	 PROBLEM - Puppet freshness on manutius is CRITICAL: No successful Puppet run in the last 10 hours
[14:28:11] <gerrit-wm>	 New patchset: Reedy; "Add new symlinks" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/73170
[14:28:47] <gerrit-wm>	 Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/73170
[14:29:35] <gerrit-wm>	 New patchset: coren; "Add new labsudb role for Labs users' database" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73171
[14:30:28] <Coren>	 Sean needs to have a gerrit account set up.  :-)
[14:42:50] <icinga-wm>	 RECOVERY - Host virt3 is UP: PING OK - Packet loss = 0%, RTA = 26.58 ms
[14:44:24] <logmsgbot>	 !log reedy synchronized php-1.22wmf10  'initial sync of 1.22wmf10'
[14:44:36] <morebots>	 Logged the message, Master
[14:45:21] <logmsgbot>	 !log reedy synchronized docroot and w
[14:45:31] <morebots>	 Logged the message, Master
[14:47:52] <logmsgbot>	 !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: test2wiki to 1.22wmf10 and rebuild l10n cache
[14:48:03] <morebots>	 Logged the message, Master
[14:49:30] <icinga-wm>	 PROBLEM - Host virt3 is DOWN: PING CRITICAL - Packet loss = 100%
[14:49:37] <Reedy>	 Yay, scap is broken
[14:50:35] <apergos>	 oh joy
[14:51:15] <Reedy>	 I'm guessing it's related to the changes Tim/Ori made to fix up the problems last time
[14:52:20] <icinga-wm>	 RECOVERY - SSH on virt3 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0)
[14:52:30] <icinga-wm>	 RECOVERY - Host virt3 is UP: PING OK - Packet loss = 0%, RTA = 26.67 ms
[14:52:50] <icinga-wm>	 RECOVERY - Disk space on analytics1006 is OK: DISK OK
[14:55:10] <icinga-wm>	 PROBLEM - Host virt4 is DOWN: PING CRITICAL - Packet loss = 100%
[14:59:46] <logmsgbot>	 !log reedy Started syncing Wikimedia installation... : test2wiki to 1.22wmf10 and rebuild l10n cache
[14:59:56] <morebots>	 Logged the message, Master
[15:00:17] <Reedy>	 https://bugzilla.wikimedia.org/show_bug.cgi?id=51174
[15:00:48] <Reedy>	 Updating ExtensionMessages-1.22wmf8.php...
[15:00:48] <Reedy>	 done
[15:00:48] <Reedy>	 Updating LocalisationCache for 1.22wmf8... done
[15:00:57] <Reedy>	 I'm going to have to fix that newline
[15:01:16] <chrismcmahon>	 Reedy: starting early today :-)
[15:04:45] <Reedy>	 Nope! This is the time I usually start to have time to fix up silly issues
[15:05:45] <icinga-wm>	 RECOVERY - Host virt4 is UP: PING OK - Packet loss = 0%, RTA = 26.57 ms
[15:06:25] <icinga-wm>	 PROBLEM - NTP on virt4 is CRITICAL: NTP CRITICAL: No response from NTP server
[15:07:15] <icinga-wm>	 PROBLEM - NTP on virt3 is CRITICAL: NTP CRITICAL: No response from NTP server
[15:10:24] <gerrit-wm>	 New patchset: ArielGlenn; "replace prod ssh key for cmjohnson (rt 5448)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73176
[15:12:21] <logmsgbot>	 !log reedy Finished syncing Wikimedia installation... : test2wiki to 1.22wmf10 and rebuild l10n cache
[15:12:23] <chrismcmahon>	 Reedy: speaking of silly issues, do you have any theory as to why a POST to api.php in beta labs would return just the API HTML doc and not actually do an API call?   It's bugzilla 50622 and 50623
[15:12:25] <icinga-wm>	 PROBLEM - Host virt4 is DOWN: PING CRITICAL - Packet loss = 100%
[15:12:31] <morebots>	 Logged the message, Master
[15:12:41] <Reedy>	 Usually because the request is wrong
[15:12:51] <gerrit-wm>	 New patchset: ArielGlenn; "replace prod ssh key for cmjohnson (rt 5448)" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73176
[15:14:17] <gerrit-wm>	 Change merged: ArielGlenn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73176
[15:14:23] <drdee>	 mark, bblack: can  you guys make it to tomorrow's varnishkafka meeting with Snaps? (see Google Calendar)
[15:15:25] <icinga-wm>	 RECOVERY - SSH on virt4 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0)
[15:15:35] <icinga-wm>	 RECOVERY - Host virt4 is UP: PING OK - Packet loss = 0%, RTA = 26.53 ms
[15:18:41] <gerrit-wm>	 New patchset: Reedy; "test2wiki to 1.22wmf10" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/73180
[15:19:16] <gerrit-wm>	 Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/73180
[15:20:05] <icinga-wm>	 RECOVERY - Packetloss_Average on analytics1006 is OK: OK: packet_loss_average is -0.22630703125
[15:30:20] <yurik>	 mark, hi, could you +2 this tiny rename pls? Analytics wants it to simplify their life. https://gerrit.wikimedia.org/r/#/c/73027/
[15:35:00] <drdee>	 blame the analytics boys for simplifying their life ;)
[15:35:10] <yurik>	 always!
[15:38:46] <gerrit-wm>	 New review: Akosiaris; "(1 comment)" [operations/puppet/cdh4] (master) C: 2;  - https://gerrit.wikimedia.org/r/71569
[15:53:50] <gerrit-wm>	 New review: Ottomata; "(1 comment)" [operations/puppet/cdh4] (master) - https://gerrit.wikimedia.org/r/71569
[15:54:03] <ottomata>	 akosiaris: ^
[21:18:29] <cmjohnson1>	 i believe i gave the go ahead  today but this after I cleared sending it to eqiad...you were on that email
[21:18:32] <RobH>	 ok, but we cannot unplug network gear without the netadmins doing things on thier end
[21:18:32] <RobH>	 their even
[21:18:34] <cmjohnson1>	 oh..i talked to her about it before it was unplugged
[21:18:36] <RobH>	 ok, so she said it was ok to just unplug without them doing stuff, if so that's ok
[21:18:38] <RobH>	 let's spin up two machines for this
[21:18:41] <RobH>	 i've got two, going to name them tmc1 tmc2
[21:18:41] <RobH>	 temp memcache ;]
[21:18:43] <Ryan_Lane>	 no worries about why things broke, we'll write up an outage report after
[21:20:33] <icinga-wm>	 PROBLEM - SSH on pdf2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[21:20:37] <icinga-wm>	 RECOVERY - SSH on pdf2 is OK: SSH OK - OpenSSH_4.7p1 Debian-8ubuntu3 (protocol 2.0)
[21:20:53] <RobH>	 !log authdns-update for tmc1/2 and mgmt
[21:21:07] <morebots>	 Logged the message, RobH
[21:21:08] <RobH>	 hrmm
[21:22:22] <Ryan_Lane>	 RobH: so...
[21:22:25] <logmsgbot>	 !log spage Finished syncing Wikimedia installation... : E3 deploy latest GettingStarted and VE gender survey config
[21:22:35] <Ryan_Lane>	 RobH: don't worry about bringing up some tempcs
[21:22:36] <morebots>	 Logged the message, Master
[21:22:37] <Ryan_Lane>	 err
[21:22:40] <RobH>	 ........
[21:22:40] <Ryan_Lane>	 temp mcs
[21:22:42] <RobH>	 ?
[21:22:47] <Ryan_Lane>	 there's lots of stuff fucked up right now
[21:22:48] <cmjohnson1>	 tmc
[21:22:55] <Ryan_Lane>	 like redis replication
[21:23:00] <Ryan_Lane>	 and the lack of redis
[21:23:01] <RobH>	 so we need to bring the old ones back?
[21:23:10] <Ryan_Lane>	 no. we're going to move test and the misc jobs
[21:29:08] <Ryan_Lane>	 !log depooling mw1017 to make it the new test.wm.o
[21:29:18] <morebots>	 Logged the message, Master
[21:36:27] <icinga-wm>	 RECOVERY - RAID on ms-be5 is OK: OK: State is Optimal, checked 1 logical device(s)
[21:36:37] <icinga-wm>	 RECOVERY - DPKG on ms-be5 is OK: All packages OK
[21:36:37] <icinga-wm>	 RECOVERY - Disk space on ms-be5 is OK: DISK OK
[21:36:55] <Ryan_Lane>	 !log deployed squid change to switch test to mw1017
[21:37:01] <icinga-wm>	 RECOVERY - swift-container-replicator on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-replicator
[21:37:02] <RobH>	 !log authdns update for zinc to move internal wiki
[21:37:05] <morebots>	 Logged the message, Master
[21:37:11] <icinga-wm>	 RECOVERY - swift-object-server on ms-be5 is OK: PROCS OK: 101 processes with regex args ^/usr/bin/python /usr/bin/swift-object-server
[21:37:14] <morebots>	 Logged the message, RobH
[21:37:21] <icinga-wm>	 RECOVERY - swift-object-replicator on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-object-replicator
[21:37:28] <RobH>	 !log move internal ip, not wiki. bleh
[21:37:31] <icinga-wm>	 RECOVERY - swift-container-auditor on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-auditor
[21:37:31] <icinga-wm>	 RECOVERY - swift-account-server on ms-be5 is OK: PROCS OK: 13 processes with regex args ^/usr/bin/python /usr/bin/swift-account-server
[21:37:31] <icinga-wm>	 RECOVERY - swift-account-auditor on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-auditor
[21:37:31] <icinga-wm>	 RECOVERY - swift-object-updater on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-object-updater
[21:37:31] <icinga-wm>	 RECOVERY - swift-object-auditor on ms-be5 is OK: PROCS OK: 2 processes with regex args ^/usr/bin/python /usr/bin/swift-object-auditor
[21:37:31] <icinga-wm>	 RECOVERY - swift-container-updater on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-container-updater
[21:37:31] <icinga-wm>	 RECOVERY - swift-container-server on ms-be5 is OK: PROCS OK: 13 processes with regex args ^/usr/bin/python /usr/bin/swift-container-server
[21:37:43] <morebots>	 Logged the message, RobH
[21:38:42] <qchris>	 One of our users is having trouble logging into gerrit as he has unknowingly received two LDAP accounts whose cns agree (non-case-sensitively). Whom could he contact to get his LDAP accounts fixed?
[21:39:11] <icinga-wm>	 RECOVERY - NTP on ms-be5 is OK: NTP OK: Offset 0.06268358231 secs
[21:41:13] <gerrit-wm>	 New patchset: Mattflaschen; "Enable GuidedTour and VE EventLogging on wikis with survey." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/73338
[21:43:17] <mutante>	 qchris: could you send a mail to the "requests" address in the topic
[21:43:27] <Ryan_Lane>	 AaronSchulz: around?
[21:43:36] <Ryan_Lane>	 AaronSchulz: we could use your help
[21:43:44] <qchris>	 mutante: Thanks.
[21:43:55] <gerrit-wm>	 Change merged: jenkins-bot; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/73338
[21:45:32] <gerrit-wm>	 New patchset: Dzahn; "change dsh group testwikipedia from srv193 to mw1017" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73339
[21:46:01] <icinga-wm>	 RECOVERY - swift-account-reaper on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-reaper
[21:46:01] <icinga-wm>	 RECOVERY - swift-account-replicator on ms-be5 is OK: PROCS OK: 1 process with regex args ^/usr/bin/python /usr/bin/swift-account-replicator
[21:46:10] <YuviPanda>	 anyone with access to the api logs who can grep something for me?
[21:48:01] <icinga-wm>	 PROBLEM - LVS HTTP IPv4 on appservers.svc.pmtpa.wmnet is CRITICAL: Connection refused
[21:48:23] <spagewmf>	 YuviPanda, sure what?
[21:48:27] <logmsgbot>	 !log spage synchronized wmf-config/InitialiseSettings.php  'Config changes for VE survey'
[21:48:45] <morebots>	 Logged the message, Master
[21:48:45] <YuviPanda>	 spagewmf: too late, MaxSem is on it
[21:48:50] <gerrit-wm>	 New review: Dzahn; "< Ryan_Lane> !log deployed squid change to switch test to mw1017" [operations/puppet] (production) C: 2;  - https://gerrit.wikimedia.org/r/73339
[21:48:51] <gerrit-wm>	 Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73339
[21:48:55] <YuviPanda>	 spagewmf: thanks, though. will poke next time :)
[21:49:43] <gerrit-wm>	 New patchset: Asher; "moving misc::maintenance::update_flaggedrev_stats and misc::maintenance::geodata to eqiad" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73340
[21:50:29] <gerrit-wm>	 New patchset: Dzahn; "delete raw-nagios-host-list dsh group file, outdated, not in use, replace with icinga list if needed" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73341
[21:52:36] <gerrit-wm>	 Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73340
[21:54:57] <gerrit-wm>	 New patchset: MaxSem; "Remove device detection from bits" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73342
[21:54:58] <gerrit-wm>	 New patchset: MaxSem; "Switch testwiki to mw1017 in Varnish too" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73343
[21:55:13] <RoanKattouw>	 Ryan_Lane: Yeah just read update-special-pages and it's pretty ridiculous
[21:55:24] <Ryan_Lane>	 RoanKattouw: yeah. wtf?
[21:55:39] <RoanKattouw>	 I know
[21:56:01] <Ryan_Lane>	 RoanKattouw: so, you know what's going on right?
[21:56:10] <Ryan_Lane>	 RoanKattouw: want to help us move the remaining jobs to eqiad?
[21:58:10] <RoanKattouw>	 What jobs?
[21:58:10] <superm401>	 greg-g, we're running just a tad late on the deploy, probably 5-10 minutes.
[21:58:38] <gerrit-wm>	 New patchset: MaxSem; "Remove device detection from bits" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73342
[21:58:40] <greg-g>	 superm401: well fine then.
[21:58:43] <greg-g>	 :P
[22:00:31] <icinga-wm>	 PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[22:01:59] <notpeter>	 RoanKattouw: that's what all of the US is asking, man
[22:02:11] <gerrit-wm>	 New patchset: MaxSem; "Switch testwiki to mw1017 in Varnish too" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73343
[22:02:15] <MaxSem>	 Ryan_Lane, https://gerrit.wikimedia.org/r/73343 is also needed ^^^
[22:02:21] <icinga-wm>	 RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.123 second response time
[22:04:33] <gerrit-wm>	 New review: Dzahn; "if ever needed again can be recreated by script if pointed to neon" [operations/puppet] (production) C: 2;  - https://gerrit.wikimedia.org/r/73341
[22:04:34] <gerrit-wm>	 Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73341
[22:04:41] <gerrit-wm>	 New patchset: Lcarr; "two fixes" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73344
[22:05:50] <gerrit-wm>	 New review: Dzahn; "keep the ❤'s" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73344
[22:07:41] <Ryan_Lane>	 MaxSem: ah, right, forgot about mobile
[22:10:10] <gerrit-wm>	 New patchset: Asher; "moving misc::maintenance::refreshlinks to eqiad." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73346
[22:10:51] <gerrit-wm>	 Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73343
[22:12:04] <gerrit-wm>	 New patchset: Asher; "moving misc::maintenance::refreshlinks to eqiad." [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73346
[22:12:26] <gerrit-wm>	 Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73346
[22:13:26] <gerrit-wm>	 New review: Lcarr; "Don't worry, the hearts are staying forever!!!!" [operations/puppet] (production) C: 2;  - https://gerrit.wikimedia.org/r/73344
[22:13:27] <gerrit-wm>	 Change merged: Lcarr; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73344
[22:13:35] <LeslieCarr>	 :) mutante++
[22:13:40] <mutante>	 hehe:)
[22:14:15] <gerrit-wm>	 New patchset: Tim Starling; "Use eqiad memcached servers for scripts running in pmtpa" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/73347
[22:14:31] <icinga-wm>	 PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[22:14:43] <mutante>	 wanted to scare peter
[22:16:22] <icinga-wm>	 RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.124 second response time
[22:17:21] <Ryan_Lane>	 MaxSem: your change should work now (and mobile should be fixed for test)
[22:18:14] <MaxSem>	 Ryan_Lane, still looks slow:(
[22:18:26] <Ryan_Lane>	 it may still be running on one
[22:18:35] <Ryan_Lane>	 or two
[22:18:36] <Ryan_Lane>	 :)
[22:19:06] <gerrit-wm>	 Change merged: Tim Starling; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/73347
[22:19:30] <gerrit-wm>	 New patchset: Asher; "if creating /home/mwdeploy/refreshlinks dir, also better create /home/mwdeploy" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73349
[22:19:51] <gerrit-wm>	 Change merged: Asher; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73349
[22:22:15] <gerrit-wm>	 New patchset: Reedy; "Update noc symlinks for display" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/73350
[22:22:30] <icinga-wm>	 PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[22:24:20] <icinga-wm>	 RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.124 second response time
[22:26:54] <gerrit-wm>	 New patchset: Springle; "simplewiki change_tags indexes updated, bug 40867" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/73353
[22:27:46] <spagewmf>	 hey ops, two things from my scap.  1.  "snapshot3: sudo: no tty present and no askpass program specified" from all 4 snapshotN hosts.
[22:29:03] <gerrit-wm>	 Change merged: Springle; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/73353
[22:30:06] <spagewmf>	 2.  "cannot delete non-empty directory: php-1.22wmf2/.git/modules/extensions/WikiLove
[22:30:06] <spagewmf>	  cannot delete non-empty directory: php-1.22wmf2/.git/modules/extensions" (and all its parents up to "cannot delete non-empty directory: php-1.22wmf2")
[22:31:48] <spagewmf>	  2 might be because the rsync command doesn't have the (dangerous) --force or --delete options.
[22:32:37] <mutante>	 spagewmf: 1. RT-2644 , re-opening
[22:32:54] <logmsgbot>	 !log springle synchronized wmf-config/InitialiseSettings.php  'simplewiki change_tags indexes updated, bug 40867'
[22:32:54] <gerrit-wm>	 New patchset: Tim Starling; "Use eqiad memcached servers in pmtpa also" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/73354
[22:32:57] <morebots>	 Logged the message, Master
[22:33:20] <gerrit-wm>	 Change merged: Tim Starling; [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/73354
[22:33:48] <binasher>	 https://wikitech.wikimedia.org/wiki/Scap#All-script
[22:34:03] <Nemo_bis>	 mutante: you once +1'd https://gerrit.wikimedia.org/r/33713 , wanna also +2 now that Asher said it's ok? :) pleeeeeeease
[22:35:11] <binasher>	 springle: https://wikitech.wikimedia.org/wiki/Deploy
[22:35:22] <binasher>	 (possibly too detailed)
[22:36:01] <logmsgbot>	 !log tstarling synchronized wmf-config/twemproxy-pmtpa.yaml
[22:36:11] <morebots>	 Logged the message, Master
[22:36:25] <logmsgbot>	 !log tstarling synchronized wmf-config/twemproxy.yaml
[22:36:36] <morebots>	 Logged the message, Master
[22:36:46] <mutante>	 Nemo_bis: but it has to be moved to terbium , right
[22:37:13] <Nemo_bis>	 mutante: I think not, because it must hit the Tampa slaves
[22:37:24] <Nemo_bis>	 or that's what I understood
[22:37:47] <Reedy>	 springle: You might want to consider using !log in here when you're making schema changes and such
[22:39:07] <logmsgbot>	 !log tstarling synchronized wmf-config/mc.php
[22:39:18] <morebots>	 Logged the message, Master
[22:40:09] <logmsgbot>	 !log tstarling restarted twemproxy on all servers
[22:40:20] <morebots>	 Logged the message, Master
[22:40:42] <springle>	 Reedy, ok
[22:40:57] <logmsgbot>	 !log tstarling synchronized wmf-config
[22:41:07] <morebots>	 Logged the message, Master
[22:41:10] <icinga-wm>	 RECOVERY - LVS HTTP IPv4 on appservers.svc.pmtpa.wmnet is OK: HTTP OK: HTTP/1.1 200 OK - 61704 bytes in 0.776 second response time
[22:41:41] <Reedy>	 springle: It's useful for other people to know stuff is going on, in case of any issues that seem related
[22:42:45] <Reedy>	 ie !log Updating change_tags indexes on simplewiki
[22:42:58] <Reedy>	 then !log change_tag index updates completed on simplewiki
[22:44:19] <springle>	 fair enough
[22:47:50] <icinga-wm>	 PROBLEM - Puppet freshness on grosley is CRITICAL: No successful Puppet run in the last 10 hours
[22:52:15] <gerrit-wm>	 New patchset: Ryan Lane; "Update dsh group for mobile caches" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73355
[22:52:37] <springle>	 !log updating change_tags indexes on remaining wikis in bug 40867 comment 6
[22:52:48] <morebots>	 Logged the message, Master
[22:54:24] <Ryan_Lane>	 MaxSem: ok, so… the dsh group was wrong. mobile caches are on new hosts. I'm force running puppet on the new systems
[22:54:32] <MaxSem>	 heh:)
[22:54:48] <Ryan_Lane>	 I also updated the dsh group
[22:55:00] <gerrit-wm>	 Change merged: Ryan Lane; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73355
[22:55:43] <Ryan_Lane>	 MaxSem: now it's fixed :)
[22:55:50] <icinga-wm>	 PROBLEM - Puppet freshness on mw56 is CRITICAL: No successful Puppet run in the last 10 hours
[22:56:36] <MaxSem>	 {{confirmed}}
[23:02:28] <spagewmf>	 Reedy, MaxSem, anyone I see regular "Fatal error: Call to undefined method Solarium_Result_Update::numRows() at /usr/local/apache/common-local/php-1.22wmf9/extensions/GeoData/solrupdate.php on line 182" , shall I file a bug?
[23:02:46] <Reedy>	 Please
[23:02:54] * Reedy  throws something at MaxSem
[23:03:30] <MaxSem>	 Reedy, throw it at those who moved it from a host where it worked:P
[23:04:12] <gerrit-wm>	 New patchset: Dzahn; "set variables for etherpad_lite as suggested in labs docs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73358
[23:05:21] <Reedy>	 MaxSem: You probably should still prevent the fatals ;)
[23:05:31] <spagewmf>	 MaxSem, Reedy bug 51207
[23:05:59] <Reedy>	 though, loopings
[23:06:49] <MaxSem>	 fffffuuuuuu
[23:07:02] <gerrit-wm>	 New patchset: Dzahn; "set variables for etherpad_lite as suggested in labs docs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73358
[23:07:14] <icinga-wm>	 PROBLEM - Puppet freshness on cp1041 is CRITICAL: No successful Puppet run in the last 10 hours
[23:07:14] <icinga-wm>	 PROBLEM - Puppet freshness on cp1042 is CRITICAL: No successful Puppet run in the last 10 hours
[23:07:24] <icinga-wm>	 PROBLEM - Puppet freshness on cp1044 is CRITICAL: No successful Puppet run in the last 10 hours
[23:07:54] <icinga-wm>	 PROBLEM - Puppet freshness on cp1043 is CRITICAL: No successful Puppet run in the last 10 hours
[23:07:57] <MaxSem>	 nah, the problem wasn't with a move to hume
[23:08:05] <Reedy>	 } while ( $res && $res->numRows() > 0 );
[23:08:06] <Reedy>	 :D
[23:09:22] <MaxSem>	 Reedy, lawl
[23:09:27] <MaxSem>	 that will also fail
[23:09:41] <gerrit-wm>	 New patchset: Pyoungmeister; "removing dsc from analytics contact group" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73359
[23:10:57] <gerrit-wm>	 Change merged: Pyoungmeister; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73359
[23:12:57] <gerrit-wm>	 New patchset: Dzahn; "set variables for etherpad_lite as suggested in labs docs" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73358
[23:13:46] <mutante>	 ehm.. Code Review 500 Internal server error
[23:13:56] <gerrit-wm>	 Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73358
[23:14:03] <mutante>	 but.. can't reproduce.. shrug
[23:20:14] <spagewmf>	 Anyone, should I file  "clean out /usr/local/apache/common/php-1.22wmf2 on production machines" in RT or in bugzilla Wikimedia component?  I assume the latter
[23:21:57] <RobH>	 i figure latter, usually a dev/deployer would clean that up right?
[23:22:13] <RobH>	 if it's going to require an ops person to do it, RT.
[23:22:34] <icinga-wm>	 PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:22:35] <notpeter>	 spagewmf: rt ticket please
[23:22:39] <notpeter>	 I'll do it today or tomorrow
[23:22:45] <notpeter>	 based on the time, probably tomorrow
[23:23:01] <MaxSem>	 greg-g, can I push a quick fix for GeoData fatal?
[23:24:16] <spagewmf>	 RobH well a regular scap/sync-dir won't clean out the .git objects  and most deployers (well, me  anyway :)  ) don't know how to run a dsh job.  notpeter, will do.
[23:24:24] <icinga-wm>	 RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.124 second response time
[23:24:25] <spagewmf>	 not urgent
[23:24:38] <RobH>	 springle: yea, peter is right then, cool
[23:24:46] <RobH>	 ack
[23:24:49] <RobH>	 spagewmf: even
[23:24:52] <RobH>	 sorry spring ;]
[23:25:04] <notpeter>	 RobH: do you even lift irc, bro?
[23:25:16] <spagewmf>	 "springle" makes me hungry. taste the salty rainbow :)
[23:25:56] <RobH>	 notpeter: yes. http://i.imgur.com/hEowN.jpg
[23:26:05] <mutante>	 http://weknowmemes.com/wp-content/uploads/2013/06/im-so-out-of-shape.jpg
[23:31:34] <icinga-wm>	 PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:32:24] <icinga-wm>	 RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.122 second response time
[23:33:09] <notpeter>	 RobH: clearly, you do not lift.
[23:33:19] <gerrit-wm>	 New patchset: Dzahn; "comment mod_rewrite to quick fix duplicate definition when using with another class defining the same" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73365
[23:33:37] <greg-g>	 MaxSem: sorry I didn't see it until now. Yes.
[23:33:48] <greg-g>	 MaxSem: quickly, of course, since we're over LD time :)
[23:33:55] <RobH>	 notpeter:  do to!  http://24.media.tumblr.com/230058fea2a482dd1cf9d12eaea3b837/tumblr_mgh60w8BqX1qcbo9lo1_1280.jpg
[23:34:04] <RobH>	 too even.
[23:34:05] <greg-g>	 MaxSem: also, I'm heading out shortly, can you add it to the Deployment calendar for the LD today
[23:34:09] <MaxSem>	 thx
[23:34:27] <gerrit-wm>	 Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73365
[23:34:50] <notpeter>	 RobH: I'll put you down for "maybe lifts"
[23:34:57] <RobH>	 \o/
[23:37:40] <logmsgbot>	 !log catrope Started syncing Wikimedia installation... : Update VE to master
[23:37:51] <morebots>	 Logged the message, Master
[23:38:42] <MaxSem>	 greg-g, collided with Roan, will do tomorrow or something
[23:40:09] <greg-g>	 MaxSem: not tomorrow :)
[23:40:35] <icinga-wm>	 PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:41:24] <icinga-wm>	 RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK: Status line output matched 400 - 336 bytes in 0.141 second response time
[23:42:15] <Ryan_Lane>	 MaxSem: so, to be fair, people tried to move your job off of hume before and moved it back because it was broken
[23:42:20] <gerrit-wm>	 New patchset: Dzahn; "update apache-fast-test to use mw1070 instead of srv193 as default test host" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73367
[23:42:32] <Ryan_Lane>	 and it was moved again, and partially fixed
[23:44:49] <gerrit-wm>	 New patchset: Dzahn; "update apache-fast-test to use mw1017 instead of srv193 as default test host" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73367
[23:44:56] <logmsgbot>	 !log catrope Finished syncing Wikimedia installation... : Update VE to master
[23:45:06] <morebots>	 Logged the message, Master
[23:45:08] <MaxSem>	 Ryan_Lane, the problem wasn't with the terbium move as I later admitted:P
[23:45:16] <Ryan_Lane>	 :)
[23:46:51] <spagewmf>	 notpeter , I filed RT 5455 , no rush
[23:47:16] <notpeter>	 spagewmf: thanks!
[23:48:44] <gerrit-wm>	 Change merged: Dzahn; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/73367