[00:00:00] cool [00:00:05] no need to get rid of it, really [00:00:06] figured if no one bitches in a bit, you can pull and archive it so its not takin up space. [00:00:08] <^demon> If misc::limesurvey's not used anywhere, can it be removed? [00:00:16] <^demon> I'd rather not encourage anyone to ever use it :p [00:00:18] ^demon: i wanna make sure no one flips out [00:00:22] <^demon> *nod* [00:00:26] but if no one does in a week or two, we can pull it out entirely yep [00:00:45] im glad its dead. [00:00:51] it was a pain in the ass to administer [00:01:10] 'im so tired of dealing with limesurvey, i need something less error prone and less frustrating, like wordpress.' [00:01:13] =P [00:01:26] (saying 'like mediawiki' goes without saying ;) [00:01:41] <^demon> Heh [00:02:02] PROBLEM - Host argon is DOWN: PING CRITICAL - Packet loss = 100% [00:02:35] DIE ARGON [00:02:42] then come back without limesurvey. [00:03:03] notpeter: well, if we dont get rid of it, it will travel to the new misc dbs! (as long as you dont migrate it i dont care ;) [00:03:51] yeah [00:03:52] I dunno [00:04:00] I'm not too worried [00:04:15] drop now, or drop later [00:04:26] PROBLEM - Puppet freshness on stafford is CRITICAL: Puppet has not run in the last 10 hours [00:04:33] the important part is that we hire skrillex as our dba. [00:06:59] RobH, shouldn't misc::limesurvey die too? [00:07:45] <^demon> Already asked that :p [00:09:40] meh [00:11:06] hrmm, i guess i could kill it now [00:11:13] cuz i can always revert, yay git. [00:11:18] RobH: no [00:11:22] ? [00:11:30] just wait until we know that no one needs it [00:11:40] let the laziness flow through you ;) [00:11:48] nah, i want it dead nowwwwwwwwwww [00:11:53] it is! [00:11:59] i shouldnt have cut off argon without seeing if it had any recent access first. [00:12:11] oh well [00:12:32] cuz if it had no access for months, i would feel even more pressure to delete limesurvey.pp [00:12:34] here, this might help: http://www.youtube.com/watch?v=-4KIoz_g0V0 [00:13:31] I am at least 17% more chilled out since I started listening to that [00:13:43] Why remove it today if someone else will remove it tomorrow? :p [00:13:54] Reedy: you speak my mind [00:16:28] bwahahaha [00:16:33] die limesurvey dddddiiiieeeeee [00:16:46] New patchset: RobH; "limesurvey.pp is dead" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/50461 [00:17:09] i had the joy of committing the change [00:17:12] anyone else wanna merge? [00:17:19] its kinda cathartic. [00:20:41] New review: RobH; "from hell's heart I stab at thee" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/50461 [00:21:01] New review: RobH; "from hell's heart I stab at thee" [operations/puppet] (production); V: 2 C: 2; - https://gerrit.wikimedia.org/r/50461 [00:21:14] Change merged: RobH; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/50461 [00:21:18] \o/ [00:21:47] so if it has to come back now, its gonna mean reverting crap, good reason to not bring it back. [00:22:35] ahh, bleh, i missed the damned apache host file.
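The cleanup discussed above is a routine Gerrit-backed change to operations/puppet: delete the manifest, push it for review, merge, and lean on git history if it ever has to come back. A minimal sketch of that kind of workflow, in which the file paths and the exact push invocation are illustrative rather than the actual repository layout:

```sh
# Make sure nothing still includes the class before deleting it.
git grep -n 'misc::limesurvey'

# Drop the manifest and the apache vhost file (the one missed on the
# first pass above), then commit and push to Gerrit for review.
git rm manifests/misc/limesurvey.pp                   # illustrative path
git rm files/apache/sites/limesurvey.wikimedia.org    # illustrative path
git commit -m "limesurvey.pp is dead"
git push origin HEAD:refs/for/production

# "i can always revert, yay git" -- bringing it back later is one command:
git revert <sha-of-the-merged-change>
```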
[00:23:41] New patchset: RobH; "removing limesurvey apache vhost file from repo" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/50463 [00:24:36] Change merged: RobH; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/50463 [00:29:27] New patchset: RobH; "old vhost files for services no longer offered" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/50464 [00:30:16] New review: RobH; "death to old cruft" [operations/puppet] (production); V: 2 C: 2; - https://gerrit.wikimedia.org/r/50464 [00:30:27] Change merged: RobH; [operations/puppet] (production) - https://gerrit.wikimedia.org/r/50464 [00:37:26] PROBLEM - Puppet freshness on lvs1003 is CRITICAL: Puppet has not run in the last 10 hours [00:39:14] PROBLEM - MySQL Slave Delay on db32 is CRITICAL: CRIT replication delay 187 seconds [00:40:44] PROBLEM - MySQL Replication Heartbeat on db32 is CRITICAL: CRIT replication delay 187 seconds [00:43:08] RECOVERY - Puppet freshness on lvs1003 is OK: puppet ran at Sat Feb 23 00:42:57 UTC 2013 [00:43:11] !log lvs1003 had a ton of hung puppet processes (no output on them when tracing, so dunno whats up). killed them and kicked manual run [00:43:12] Logged the message, RobH [00:49:26] PROBLEM - Puppet freshness on maerlant is CRITICAL: Puppet has not run in the last 10 hours [00:55:12] woosters: https://gerrit.wikimedia.org/r/#/c/49069/ [01:01:40] New patchset: Reedy; "Recurseively checkout submodules..." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/50467 [01:32:29] PROBLEM - MySQL disk space on neon is CRITICAL: Connection refused by host [01:33:24] PROBLEM - ircecho_service_running on neon is CRITICAL: Connection refused by host [01:42:50] PROBLEM - MySQL Slave Delay on db53 is CRITICAL: CRIT replication delay 181 seconds [01:43:21] New review: Mattflaschen; "Seems like it should resolve the issue." 
[operations/mediawiki-config] (master) C: 1; - https://gerrit.wikimedia.org/r/50467 [01:44:11] PROBLEM - MySQL Replication Heartbeat on db53 is CRITICAL: CRIT replication delay 212 seconds [01:49:00] RECOVERY - MySQL Replication Heartbeat on db53 is OK: OK replication delay 0 seconds [01:49:27] RECOVERY - MySQL Slave Delay on db53 is OK: OK replication delay 3 seconds [01:53:30] PROBLEM - Puppet freshness on mw64 is CRITICAL: Puppet has not run in the last 10 hours [01:53:30] PROBLEM - Puppet freshness on mw1039 is CRITICAL: Puppet has not run in the last 10 hours [02:04:00] RECOVERY - MySQL disk space on neon is OK: DISK OK [02:04:27] RECOVERY - ircecho_service_running on neon is OK: PROCS OK: 2 processes with args ircecho [02:27:18] !log LocalisationUpdate completed (1.21wmf10) at Sat Feb 23 02:27:17 UTC 2013 [02:27:21] Logged the message, Master [02:36:24] PROBLEM - Puppet freshness on db1009 is CRITICAL: Puppet has not run in the last 10 hours [02:38:30] PROBLEM - Puppet freshness on mw1134 is CRITICAL: Puppet has not run in the last 10 hours [02:52:11] !log LocalisationUpdate completed (1.21wmf9) at Sat Feb 23 02:52:10 UTC 2013 [02:52:13] Logged the message, Master [02:59:30] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:01:18] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 9.132 seconds [03:16:54] PROBLEM - ircecho_service_running on neon is CRITICAL: Connection refused by host [03:18:06] PROBLEM - MySQL disk space on neon is CRITICAL: Connection refused by host [03:22:54] PROBLEM - MySQL Replication Heartbeat on db53 is CRITICAL: CRIT replication delay 183 seconds [03:23:21] PROBLEM - MySQL Slave Delay on db53 is CRITICAL: CRIT replication delay 188 seconds [03:36:42] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:49:54] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.059 seconds [03:50:39] RECOVERY - MySQL disk space on neon is OK: DISK OK [03:51:06] RECOVERY - ircecho_service_running on neon is OK: PROCS OK: 2 processes with args ircecho [03:59:03] RECOVERY - MySQL Slave Delay on db32 is OK: OK replication delay 0 seconds [04:00:15] RECOVERY - MySQL Replication Heartbeat on db32 is OK: OK replication delay 0 seconds [04:08:21] PROBLEM - Puppet freshness on srv245 is CRITICAL: Puppet has not run in the last 10 hours [04:09:15] RECOVERY - check_job_queue on neon is OK: JOBQUEUE OK - all job queues below 10,000 [04:10:45] RECOVERY - check_job_queue on spence is OK: JOBQUEUE OK - all job queues below 10,000 [04:21:42] RECOVERY - MySQL Slave Delay on db53 is OK: OK replication delay 26 seconds [04:22:18] RECOVERY - MySQL Replication Heartbeat on db53 is OK: OK replication delay 0 seconds [04:22:45] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:29:21] PROBLEM - Puppet freshness on db62 is CRITICAL: Puppet has not run in the last 10 hours [04:33:33] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.077 seconds [04:53:48] PROBLEM - MySQL disk space on neon is CRITICAL: Connection refused by host [04:54:24] PROBLEM - ircecho_service_running on neon is CRITICAL: Connection refused by host [05:08:21] PROBLEM - Puppet freshness on ms1004 is CRITICAL: Puppet has not run in the last 10 hours [05:25:30] RECOVERY - MySQL disk space on neon is OK: DISK OK [05:25:56] RECOVERY - 
ircecho_service_running on neon is OK: PROCS OK: 2 processes with args ircecho [05:26:32] PROBLEM - LVS Lucene on search-pool4.svc.eqiad.wmnet is CRITICAL: Connection timed out [05:29:59] RECOVERY - LVS Lucene on search-pool4.svc.eqiad.wmnet is OK: TCP OK - 9.021 second response time on port 8123 [05:57:17] PROBLEM - Lucene on search1016 is CRITICAL: Connection timed out [05:58:38] PROBLEM - LVS Lucene on search-pool4.svc.eqiad.wmnet is CRITICAL: Connection timed out [06:01:38] RECOVERY - LVS Lucene on search-pool4.svc.eqiad.wmnet is OK: TCP OK - 0.027 second response time on port 8123 [06:07:47] PROBLEM - LVS Lucene on search-pool4.svc.eqiad.wmnet is CRITICAL: Connection timed out [06:11:41] RECOVERY - Lucene on search1016 is OK: TCP OK - 0.027 second response time on port 8123 [06:11:41] RECOVERY - LVS Lucene on search-pool4.svc.eqiad.wmnet is OK: TCP OK - 0.030 second response time on port 8123 [06:13:50] !log restarted lucene search on search1016 [06:13:52] Logged the message, Master [06:15:44] PROBLEM - Host cp3003 is DOWN: PING CRITICAL - Packet loss = 100% [06:20:45] RECOVERY - Host cp3003 is UP: PING OK - Packet loss = 0%, RTA = 118.29 ms [06:22:51] PROBLEM - Puppet freshness on virt0 is CRITICAL: Puppet has not run in the last 10 hours [06:40:15] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:41:54] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.066 seconds [06:43:42] PROBLEM - MySQL Replication Heartbeat on db64 is CRITICAL: CRIT replication delay 311 seconds [06:44:10] PROBLEM - MySQL Slave Delay on db64 is CRITICAL: CRIT replication delay 338 seconds [06:49:27] RECOVERY - MySQL Slave Delay on db64 is OK: OK replication delay NULL seconds [06:53:30] PROBLEM - MySQL Slave Running on db64 is CRITICAL: CRIT replication Slave_IO_Running: Yes Slave_SQL_Running: No Last_Error: Error Lock wait timeout exceeded: try restarting transaction on que [07:02:03] PROBLEM - SSH on dataset1001 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:05:39] RECOVERY - SSH on dataset1001 is OK: SSH OK - OpenSSH_5.3p1 Debian-3ubuntu7 (protocol 2.0) [07:15:06] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:16:54] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 7.058 seconds [07:47:16] PROBLEM - MySQL disk space on neon is CRITICAL: Connection refused by host [07:47:53] PROBLEM - ircecho_service_running on neon is CRITICAL: Connection refused by host [07:51:10] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:05:25] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.037 seconds [08:07:58] PROBLEM - Host cp3003 is DOWN: PING CRITICAL - Packet loss = 100% [08:17:52] RECOVERY - MySQL disk space on neon is OK: DISK OK [08:18:28] RECOVERY - ircecho_service_running on neon is OK: PROCS OK: 2 processes with args ircecho [08:19:58] RECOVERY - Host cp3003 is UP: PING OK - Packet loss = 0%, RTA = 118.26 ms [08:24:46] PROBLEM - Puppet freshness on cp3004 is CRITICAL: Puppet has not run in the last 10 hours [08:35:08] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:39:55] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.048 seconds [08:44:43] PROBLEM - Puppet freshness on ms-be3002 is CRITICAL: Puppet has 
not run in the last 10 hours [08:44:43] PROBLEM - Puppet freshness on ms-be3003 is CRITICAL: Puppet has not run in the last 10 hours [09:07:40] PROBLEM - Puppet freshness on ms-be3001 is CRITICAL: Puppet has not run in the last 10 hours [09:30:19] RECOVERY - Puppet freshness on stafford is OK: puppet ran at Sat Feb 23 09:30:14 UTC 2013 [09:38:16] PROBLEM - MySQL Slave Delay on db32 is CRITICAL: CRIT replication delay 187 seconds [09:39:10] PROBLEM - MySQL Replication Heartbeat on db32 is CRITICAL: CRIT replication delay 210 seconds [09:40:04] RECOVERY - MySQL Slave Delay on db32 is OK: OK replication delay 8 seconds [09:40:59] RECOVERY - MySQL Replication Heartbeat on db32 is OK: OK replication delay 0 seconds [10:37:25] PROBLEM - MySQL Replication Heartbeat on db32 is CRITICAL: CRIT replication delay 213 seconds [10:39:13] RECOVERY - MySQL Replication Heartbeat on db32 is OK: OK replication delay 0 seconds [10:50:37] PROBLEM - Puppet freshness on maerlant is CRITICAL: Puppet has not run in the last 10 hours [10:58:33] PROBLEM - Host cp3003 is DOWN: PING CRITICAL - Packet loss = 100% [11:10:24] RECOVERY - Host cp3003 is UP: PING OK - Packet loss = 0%, RTA = 118.30 ms [11:36:39] PROBLEM - MySQL Replication Heartbeat on db32 is CRITICAL: CRIT replication delay 190 seconds [11:38:27] RECOVERY - MySQL Replication Heartbeat on db32 is OK: OK replication delay 6 seconds [11:54:21] PROBLEM - Puppet freshness on mw64 is CRITICAL: Puppet has not run in the last 10 hours [11:54:21] PROBLEM - Puppet freshness on mw1039 is CRITICAL: Puppet has not run in the last 10 hours [12:23:02] PROBLEM - LVS Lucene on search-pool4.svc.eqiad.wmnet is CRITICAL: Connection timed out [12:24:32] RECOVERY - LVS Lucene on search-pool4.svc.eqiad.wmnet is OK: TCP OK - 0.035 second response time on port 8123 [12:37:26] PROBLEM - Puppet freshness on db1009 is CRITICAL: Puppet has not run in the last 10 hours [12:38:29] PROBLEM - check_job_queue on neon is CRITICAL: JOBQUEUE CRITICAL - the following wikis have more than 9,999 jobs: , svwiki (34371), Total (41259) [12:38:29] PROBLEM - check_job_queue on spence is CRITICAL: JOBQUEUE CRITICAL - the following wikis have more than 9,999 jobs: , svwiki (34371), Total (41259) [12:39:32] PROBLEM - Puppet freshness on mw1134 is CRITICAL: Puppet has not run in the last 10 hours [12:54:23] RECOVERY - check_job_queue on spence is OK: JOBQUEUE OK - all job queues below 10,000 [12:54:32] RECOVERY - check_job_queue on neon is OK: JOBQUEUE OK - all job queues below 10,000 [13:37:26] PROBLEM - MySQL Replication Heartbeat on db32 is CRITICAL: CRIT replication delay 190 seconds [13:37:44] PROBLEM - MySQL Slave Delay on db32 is CRITICAL: CRIT replication delay 198 seconds [13:39:33] RECOVERY - MySQL Replication Heartbeat on db32 is OK: OK replication delay 0 seconds [13:39:42] RECOVERY - MySQL Slave Delay on db32 is OK: OK replication delay 0 seconds [13:58:45] PROBLEM - MySQL disk space on neon is CRITICAL: Connection refused by host [13:59:12] PROBLEM - ircecho_service_running on neon is CRITICAL: Connection refused by host [14:05:02] New patchset: Hydriz; "Adding .gitreview and .gitignore files" [operations/dumps/archiving] (master) - https://gerrit.wikimedia.org/r/50485 [14:05:38] Change merged: Hydriz; [operations/dumps/archiving] (master) - https://gerrit.wikimedia.org/r/50485 [14:09:51] PROBLEM - Puppet freshness on srv245 is CRITICAL: Puppet has not run in the last 10 hours [14:30:51] PROBLEM - Puppet freshness on db62 is CRITICAL: Puppet has not run in the last 10 hours [14:31:09] 
RECOVERY - MySQL disk space on neon is OK: DISK OK [14:31:36] RECOVERY - ircecho_service_running on neon is OK: PROCS OK: 2 processes with args ircecho [15:08:30] PROBLEM - ircecho_service_running on neon is CRITICAL: Connection refused by host [15:09:24] PROBLEM - Puppet freshness on ms1004 is CRITICAL: Puppet has not run in the last 10 hours [15:09:51] PROBLEM - MySQL disk space on neon is CRITICAL: Connection refused by host [15:39:06] RECOVERY - ircecho_service_running on neon is OK: PROCS OK: 2 processes with args ircecho [15:40:28] RECOVERY - MySQL disk space on neon is OK: DISK OK [16:24:13] PROBLEM - Puppet freshness on virt0 is CRITICAL: Puppet has not run in the last 10 hours [17:04:43] PROBLEM - check_job_queue on neon is CRITICAL: JOBQUEUE CRITICAL - the following wikis have more than 9,999 jobs: , frwiki (112423), Total (116107) [17:06:13] PROBLEM - check_job_queue on spence is CRITICAL: JOBQUEUE CRITICAL - the following wikis have more than 9,999 jobs: , frwiki (111017), Total (112893) [17:31:16] PROBLEM - Host google is DOWN: CRITICAL - Time to live exceeded (74.125.225.84) [17:31:16] PROBLEM - Host mobile-lb.eqiad.wikimedia.org_ipv6 is DOWN: /bin/ping6 -n -U -w 15 -c 5 2620:0:861:ed1a::c [17:31:21] RECOVERY - Host google is UP: PING OK - Packet loss = 0%, RTA = 54.93 ms [17:31:39] RECOVERY - Host mobile-lb.eqiad.wikimedia.org_ipv6 is UP: PING OK - Packet loss = 0%, RTA = 35.45 ms [17:46:13] New review: Danny B.; "What's the status, please? Instead of the desired enwikt-like icon ['w] we have now the scrabble-lik..." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/49681 [17:47:12] New review: Alex Monk; "Nothing to do with this." [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/49681 [17:49:21] New review: Alex Monk; "Probably I6b727825" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/49681 [17:49:41] New patchset: Alex Monk; "(bug 45113) Set cswiktionary favicon to the same as enwiktionary" [operations/mediawiki-config] (master) - https://gerrit.wikimedia.org/r/49681 [17:53:27] PROBLEM - check google safe browsing for wikipedia.org on google is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:53:42] New review: Danny B.; "Please sync." 
[operations/mediawiki-config] (master) C: 1; - https://gerrit.wikimedia.org/r/49681 [17:55:03] RECOVERY - check google safe browsing for wikipedia.org on google is OK: HTTP OK HTTP/1.0 200 OK - 0.153 second response time [17:57:00] PROBLEM - Puppet freshness on sq73 is CRITICAL: Puppet has not run in the last 10 hours [17:58:03] PROBLEM - Puppet freshness on amssq37 is CRITICAL: Puppet has not run in the last 10 hours [18:08:33] PROBLEM - ircecho_service_running on neon is CRITICAL: Connection refused by host [18:08:45] PROBLEM - MySQL disk space on neon is CRITICAL: Connection refused by host [18:19:57] PROBLEM - Puppet freshness on cp3003 is CRITICAL: Puppet has not run in the last 10 hours [18:26:06] PROBLEM - Puppet freshness on cp3004 is CRITICAL: Puppet has not run in the last 10 hours [18:37:29] PROBLEM - Puppet freshness on lardner is CRITICAL: Puppet has not run in the last 10 hours [18:40:24] RECOVERY - MySQL disk space on neon is OK: DISK OK [18:40:56] RECOVERY - ircecho_service_running on neon is OK: PROCS OK: 2 processes with args ircecho [18:46:29] PROBLEM - Puppet freshness on ms-be3002 is CRITICAL: Puppet has not run in the last 10 hours [18:46:30] PROBLEM - Puppet freshness on ms-be3003 is CRITICAL: Puppet has not run in the last 10 hours [18:56:50] PROBLEM - LVS Lucene on search-pool4.svc.eqiad.wmnet is CRITICAL: Connection timed out [18:58:48] RECOVERY - LVS Lucene on search-pool4.svc.eqiad.wmnet is OK: TCP OK - 0.027 second response time on port 8123 [19:09:26] PROBLEM - Puppet freshness on ms-be3001 is CRITICAL: Puppet has not run in the last 10 hours [19:55:34] PROBLEM - MySQL disk space on neon is CRITICAL: Connection refused by host [19:56:05] PROBLEM - ircecho_service_running on neon is CRITICAL: Connection refused by host [20:04:12] PROBLEM - Host google is DOWN: CRITICAL - Time to live exceeded (74.125.225.84) [20:04:29] RECOVERY - Host google is UP: PING OK - Packet loss = 0%, RTA = 47.47 ms [20:15:35] PROBLEM - Puppetmaster HTTPS on stafford is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:17:32] RECOVERY - Puppetmaster HTTPS on stafford is OK: HTTP OK HTTP/1.1 400 Bad Request - 336 bytes in 0.039 seconds [20:28:20] RECOVERY - MySQL disk space on neon is OK: DISK OK [20:28:30] RECOVERY - ircecho_service_running on neon is OK: PROCS OK: 2 processes with args ircecho [20:52:29] PROBLEM - Puppet freshness on maerlant is CRITICAL: Puppet has not run in the last 10 hours [20:56:40] apergos: why does class mediawiki_new::jobrunner have $procs = 5? the number used is 12 [21:11:41] PROBLEM - MySQL Slave Delay on db36 is CRITICAL: CRIT replication delay 260 seconds [21:13:02] PROBLEM - MySQL Replication Heartbeat on db36 is CRITICAL: CRIT replication delay 341 seconds [21:18:53] RECOVERY - MySQL Slave Delay on db36 is OK: OK replication delay NULL seconds [21:21:35] PROBLEM - MySQL Slave Running on db36 is CRITICAL: CRIT replication Slave_IO_Running: Yes Slave_SQL_Running: No Last_Error: Error Lock wait timeout exceeded: try restarting transaction on que [21:28:50] Aaron|home, paravoid: looks like the last deploy of TMH caused 404s for transcoded videos (https://bugzilla.wikimedia.org/show_bug.cgi?id=45294) this can be fixed by moving them to the transcoded container with the maintenance/moveTranscoded.php script [21:35:13] j^: looks like there's loads on commons alone..
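The fix j^ points to is a standard per-wiki maintenance run; the actual run is discussed just below. A rough sketch of that kind of loop, assuming the usual WMF mwscript wrapper and a dblist of all target wikis are available (the names and paths here are illustrative):

```sh
# Start a screen session first so the loop survives a dropped SSH connection.
screen -S move-transcoded

# Run TimedMediaHandler's moveTranscoded.php against each wiki in turn.
while read -r wiki; do
    echo "== $wiki =="
    mwscript extensions/TimedMediaHandler/maintenance/moveTranscoded.php --wiki="$wiki"
done < all.dblist    # illustrative list of target wiki database names

# If a foreachwiki helper is available, the loop collapses to one line:
# foreachwiki extensions/TimedMediaHandler/maintenance/moveTranscoded.php
```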
[21:36:27] Reedy: yes this affects all videos [21:36:47] the move from thumbs to transcoded required that they are all moved [21:36:56] deploying the new code also required moving them [21:37:16] only see now that code was deployed but transcodes not moved [21:43:45] !log mw110 is asking for a password [21:43:49] Logged the message, Master [21:44:29] PROBLEM - MySQL disk space on neon is CRITICAL: Connection refused by host [21:44:46] !log reedy synchronized php-1.21wmf10/cache/interwiki.cdb [21:44:47] Logged the message, Master [21:44:56] PROBLEM - ircecho_service_running on neon is CRITICAL: Connection refused by host [21:45:40] !log reedy synchronized php-1.21wmf9/cache/interwiki.cdb [21:45:41] Logged the message, Master [21:45:43] Nemo_bis: ^^ Both done [21:48:01] thanks! [21:53:12] j^: Looks like this will need re-running when the wmf9 wikis are moved to wmf10? [21:53:47] Due to "The MediaWiki script file "/usr/local/apache/common-local/php-1.21wmf9/extensions/TimedMediaHandler/maintenance/moveTranscoded.php" does not exist." [21:53:48] Reedy: was this ever run? [21:54:02] ah ok [21:54:03] No idea [21:54:03] hm [21:54:33] i've just restarted it in a screen session to iterate over all the wikis [21:54:41] commons is on 10 so yes needs to be run when upgrading to 10 [21:55:44] PROBLEM - Puppet freshness on mw1039 is CRITICAL: Puppet has not run in the last 10 hours [21:55:44] PROBLEM - Puppet freshness on mw64 is CRITICAL: Puppet has not run in the last 10 hours [21:56:08] but possibly something else goes wrong https://upload.wikimedia.org/wikipedia/commons/transcoded/4/41/Gwtoolset-sprint6-demo.webm/Gwtoolset-sprint6-demo.webm.360p.webm returns 401 not 404 [21:57:48] it's doing stuff [21:58:40] ok waiting [22:03:50] PROBLEM - MySQL Replication Heartbeat on db53 is CRITICAL: CRIT replication delay 182 seconds [22:04:36] PROBLEM - MySQL Slave Delay on db53 is CRITICAL: CRIT replication delay 200 seconds [22:05:11] PROBLEM - MySQL Slave Delay on db56 is CRITICAL: CRIT replication delay 247 seconds [22:06:59] RECOVERY - MySQL Slave Delay on db56 is OK: OK replication delay 0 seconds [22:15:05] RECOVERY - MySQL disk space on neon is OK: DISK OK [22:15:32] RECOVERY - ircecho_service_running on neon is OK: PROCS OK: 2 processes with args ircecho [22:33:23] RECOVERY - MySQL Slave Delay on db53 is OK: OK replication delay 2 seconds [22:34:26] RECOVERY - MySQL Replication Heartbeat on db53 is OK: OK replication delay 0 seconds [22:38:27] PROBLEM - Puppet freshness on db1009 is CRITICAL: Puppet has not run in the last 10 hours [22:38:52] New patchset: QChris; "Move connection limiting from gerrit's Jetty to Apache" [operations/puppet] (production) - https://gerrit.wikimedia.org/r/50591 [22:40:24] PROBLEM - Puppet freshness on mw1134 is CRITICAL: Puppet has not run in the last 10 hours [22:56:45] PROBLEM - MySQL Slave Running on db39 is CRITICAL: CRIT replication Slave_IO_Running: Yes Slave_SQL_Running: No Last_Error: Error Cant find record in page_restrictions on query. Default da [23:37:05] PROBLEM - MySQL Replication Heartbeat on db53 is CRITICAL: CRIT replication delay 185 seconds [23:37:32] PROBLEM - MySQL Slave Delay on db53 is CRITICAL: CRIT replication delay 192 seconds [23:40:41] RECOVERY - MySQL Replication Heartbeat on db53 is OK: OK replication delay 0 seconds [23:41:08] RECOVERY - MySQL Slave Delay on db53 is OK: OK replication delay 0 seconds
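Several of the alerts above (db36, db64, db39) are not plain replication lag: the SQL thread has stopped with "Lock wait timeout exceeded" or a missing-row error. The following is not a record of what was actually done here, just a generic sketch of how an operator might triage such a slave from the shell:

```sh
# See why the SQL thread stopped and how far behind the slave is.
mysql -e 'SHOW SLAVE STATUS\G' | \
    egrep 'Slave_(IO|SQL)_Running|Last_Error|Seconds_Behind_Master'

# For a transient "Lock wait timeout exceeded", restarting the SQL thread
# lets it retry the transaction, as the error text itself suggests.
mysql -e 'STOP SLAVE SQL_THREAD; START SLAVE SQL_THREAD;'

# A "Can't find record in page_restrictions" error means the slave has
# drifted from the master and needs investigation (or recloning), not
# just a blind restart.
```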