[00:00:29] AaronSchulz: aren't we using self-destructing temp files?
[00:00:51] there is a bunch of them on deployment-jobrunner08.pmtpa.wmflabs :-/
[00:00:55] yes, maybe it's fatals during parsing, like OOMs
[00:01:32] http://paste.openstack.org/show/47247/
[00:01:39] thus the destructor is never called I guess?
[00:01:50] 71add1dc4b5feaf2abfa63567764bc69.png
[00:01:57] hmm, no prefix...I wonder what that's from
[00:06:11] can't find any related fatal errors at that time
[00:06:20] though the fatal log might use a different timezone
[00:06:36] looked at /data/project/logs/archive/fatal.log-20130911.gz
[00:10:37] hashar: I know why
[00:10:38] duh
[00:10:51] oh
[00:10:59] and I am pretty sure we have an RT ticket somewhere about it
[00:11:09] hashar: it uses a file at the TempFSFile path but then makes two more with different extensions
[00:11:15] obviously the latter won't be deleted
[00:11:19] that's why there are svg/pngs
[00:15:48] interesting, one of my two recently rebooted instances no longer suffers from https://bugzilla.redhat.com/show_bug.cgi?id=996746 it seems
[00:22:01] (CR) MZMcBride: "Chad: is getting this file updated in Gerrit a task that should have an associated RT ticket?" [operations/puppet] - https://gerrit.wikimedia.org/r/84743 (owner: QChris)
[00:24:29] greg-g, you there? we'd like to have a deleted i18n message restored in MobileFrontend. cc: kaldari
[00:25:30] greg-g: just wondering if we could do an after hours MobileFrontend deployment with scap, or if we need to wait until another day
[00:26:39] (PS1) RobH: bastion4001 setup [operations/puppet] - https://gerrit.wikimedia.org/r/84912
[00:27:17] kaldari: Where's it deleted?
[00:27:22] In your branch but not master or something?
[00:27:35] yes
[00:27:40] (CR) RobH: [C: 2] bastion4001 setup [operations/puppet] - https://gerrit.wikimedia.org/r/84912 (owner: RobH)
[00:28:08] I was going to suggest waiting for localisation update to run
[00:28:08] we deleted it from MobileFrontend, but didn't realize it was being used in Zero. It's back in master now.
[00:28:17] Messy
[00:28:28] we added it back in en.wiki manually for now
[00:28:54] It's low risk to do it tbh
[00:30:45] it's not the end of the world
[00:30:51] we can wait
[00:31:22] Might as well just fix it
[00:32:08] RECOVERY - check_job_queue on fenari is OK: JOBQUEUE OK - all job queues below 10,000
[00:32:10] (PS1) RobH: adding tftpd to bastion4001 [operations/puppet] - https://gerrit.wikimedia.org/r/84914
[00:33:01] (CR) RobH: [C: 2] adding tftpd to bastion4001 [operations/puppet] - https://gerrit.wikimedia.org/r/84914 (owner: RobH)
[00:35:18] PROBLEM - check_job_queue on fenari is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[00:39:45] !log Jenkins: migrating phpcs-HEAD mediawiki extension jobs to use the git protocol when fetching Zuul refs.
[00:39:49] Logged the message, Master
[00:43:38] (PS2) Reedy: Use $wgMessageFileList [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/84898
[00:56:43] !log iron coming offline
[00:56:46] Logged the message, RobH
[01:06:09] Can someone depool mw1154 from the image scaler cluster?
[01:06:20] what's wrong with it?
[01:07:40] It's seemingly generating a lot more errors (by a factor of 2 or so) than any of the other scalers
[01:07:41] https://bugzilla.wikimedia.org/show_bug.cgi?id=54045
[01:08:32] root@mw1154:~# ls /sys/fs/cgroup/memory/mediawiki/job
[01:08:32] ls: cannot access /sys/fs/cgroup/memory/mediawiki/job: No such file or directory
[01:08:35] blergh
[01:09:00] fixed
[01:09:58] is that what's up with it?
[01:10:05] yes, fixed
[01:10:10] awesome
[01:10:10] thanks
[01:32:27] !log jenkins: updating qunit jobs to fetch Zuul ref over git protocol
[01:32:31] Logged the message, Master
[01:32:55] according to -tech, the beta cluster is broken btw
[01:33:08] Say wut?
[01:34:48] !log jenkins: migrating test extensions jobs to fetch Zuul ref over git protocol. Tie them to label 'hasPhpUnit' as well since it is only installed on gallium ..
[01:34:51] Logged the message, Master
[02:15:35] (PS5) Reedy: Simplify wikimania apache conf, reuse wikimedia.org docroot. [operations/apache-config] - https://gerrit.wikimedia.org/r/84707
[02:17:13] (PS1) Reedy: Add labs docroot back by adding symlink [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/84922
[02:17:43] (CR) Reedy: [C: 2] Add labs docroot back by adding symlink [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/84922 (owner: Reedy)
[06:06:55] PROBLEM - Puppet freshness on mw1015 is CRITICAL: No successful Puppet run in the last 10 hours
[06:06:56] PROBLEM - Puppet freshness on mw9 is CRITICAL: No successful Puppet run in the last 10 hours
[06:06:56] PROBLEM - Puppet freshness on search1018 is CRITICAL: No successful Puppet run in the last 10 hours
[06:06:56] PROBLEM - Puppet freshness on search22 is CRITICAL: No successful Puppet run in the last 10 hours
[06:06:56] PROBLEM - Puppet freshness on srv258 is CRITICAL: No successful Puppet run in the last 10 hours
[06:07:55] PROBLEM - Puppet freshness on mw1017 is CRITICAL: No successful Puppet run in the last 10 hours
[06:07:55] PROBLEM - Puppet freshness on mw1114 is CRITICAL: No successful Puppet run in the last 10 hours
[06:07:56] PROBLEM - Puppet freshness on mw1101 is CRITICAL: No successful Puppet run in the last 10 hours
[06:07:56] PROBLEM - Puppet freshness on mw14 is CRITICAL: No successful Puppet run in the last 10 hours
[06:07:56] PROBLEM - Puppet freshness on mw1182 is CRITICAL: No successful Puppet run in the last 10 hours
[06:07:56] PROBLEM - Puppet freshness on mw1175 is CRITICAL: No successful Puppet run in the last 10 hours
[06:07:56] PROBLEM - Puppet freshness on mw1095 is CRITICAL: No successful Puppet run in the last 10 hours
[06:07:57] PROBLEM - Puppet freshness on mw62 is CRITICAL: No successful Puppet run in the last 10 hours
[06:07:57] PROBLEM - Puppet freshness on mw90 is CRITICAL: No successful Puppet run in the last 10 hours
[06:07:58] PROBLEM - Puppet freshness on srv284 is CRITICAL: No successful Puppet run in the last 10 hours
[06:07:58] PROBLEM - Puppet freshness on srv250 is CRITICAL: No successful Puppet run in the last 10 hours
[06:07:59] PROBLEM - Puppet freshness on srv286 is CRITICAL: No successful Puppet run in the last 10 hours
[06:08:55] PROBLEM - Puppet freshness on mw100 is CRITICAL: No successful Puppet run in the last 10 hours
[06:08:55] PROBLEM - Puppet freshness on mw1044 is CRITICAL: No successful Puppet run in the last 10 hours
[06:08:55] PROBLEM - Puppet freshness on mw1162 is CRITICAL: No successful Puppet run in the last 10 hours
[06:08:55] PROBLEM - Puppet freshness on mw119 is CRITICAL: No successful Puppet run in the last 10 hours
[06:08:55] PROBLEM - Puppet freshness on mw1196 is CRITICAL: No successful Puppet run in the last 10 hours
[06:08:56] PROBLEM - Puppet freshness on mw1214 is CRITICAL: No successful Puppet run in the last 10 hours
[06:08:56] PROBLEM - Puppet freshness on mw38 is CRITICAL: No successful Puppet run in the last 10 hours
[06:08:57] PROBLEM - Puppet freshness on mw41 is CRITICAL: No successful Puppet run in the last 10 hours
[06:08:57] PROBLEM - Puppet freshness on mw5 is CRITICAL: No successful Puppet run in the last 10 hours
[06:08:58] PROBLEM - Puppet freshness on mw83 is CRITICAL: No successful Puppet run in the last 10 hours
[06:08:58] PROBLEM - Puppet freshness on search21 is CRITICAL: No successful Puppet run in the last 10 hours
[06:09:55] PROBLEM - Puppet freshness on mw1111 is CRITICAL: No successful Puppet run in the last 10 hours
[06:09:55] PROBLEM - Puppet freshness on mw1124 is CRITICAL: No successful Puppet run in the last 10 hours
[06:09:56] PROBLEM - Puppet freshness on mw1125 is CRITICAL: No successful Puppet run in the last 10 hours
[06:09:56] PROBLEM - Puppet freshness on mw1147 is CRITICAL: No successful Puppet run in the last 10 hours
[06:09:56] PROBLEM - Puppet freshness on mw1130 is CRITICAL: No successful Puppet run in the last 10 hours
[06:09:56] PROBLEM - Puppet freshness on mw1190 is CRITICAL: No successful Puppet run in the last 10 hours
[06:09:56] PROBLEM - Puppet freshness on mw1161 is CRITICAL: No successful Puppet run in the last 10 hours
[06:10:55] PROBLEM - Puppet freshness on mw1004 is CRITICAL: No successful Puppet run in the last 10 hours
[06:10:56] PROBLEM - Puppet freshness on mw1081 is CRITICAL: No successful Puppet run in the last 10 hours
[06:10:56] PROBLEM - Puppet freshness on mw1146 is CRITICAL: No successful Puppet run in the last 10 hours
[06:10:56] PROBLEM - Puppet freshness on search19 is CRITICAL: No successful Puppet run in the last 10 hours
[08:40:06] (CR) Faidon Liambotis: [C: 2] ZERO: Enabled VIVA (426-04) for all languages [operations/puppet] - https://gerrit.wikimedia.org/r/84940 (owner: Yurik)
[08:45:44] (CR) Faidon Liambotis: [C: -1] "SSLCACertificatePath isn't (just) about client auth, that's incorrect." [operations/puppet] - https://gerrit.wikimedia.org/r/84901 (owner: Dzahn)
[13:26:20] RECOVERY - check_job_queue on fenari is OK: JOBQUEUE OK - all job queues below 10,000
[13:29:30] PROBLEM - check_job_queue on fenari is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[13:32:18] (PS1) Manybubbles: Make elasticsearch.yml match packaged version. [operations/puppet] - https://gerrit.wikimedia.org/r/84973
[13:37:14] (PS2) Manybubbles: Add elasticsearch plugins. [operations/puppet] - https://gerrit.wikimedia.org/r/82673
[13:37:38] (PS1) QChris: Compute geowiki data only for full days [operations/puppet] - https://gerrit.wikimedia.org/r/84974
[13:49:12] PROBLEM - Puppet freshness on hume is CRITICAL: No successful Puppet run in the last 10 hours
[13:54:12] PROBLEM - Puppet freshness on terbium is CRITICAL: No successful Puppet run in the last 10 hours
[14:18:50] (CR) Ottomata: [C: 2 V: 2] Make elasticsearch.yml match packaged version. [operations/puppet] - https://gerrit.wikimedia.org/r/84973 (owner: Manybubbles)
[14:19:23] ottomata1: the great irony is that that change right there will bounce elasticsearch
[14:19:34] which is fine. it doesn't mind
[14:19:56] but we were really careful when I upgraded it yesterday and puppet isn't that careful.
[14:20:04] (PS2) Ottomata: Compute geowiki data only for full days [operations/puppet] - https://gerrit.wikimedia.org/r/84974 (owner: QChris)
[14:20:12] (CR) Ottomata: [C: 2 V: 2] Compute geowiki data only for full days [operations/puppet] - https://gerrit.wikimedia.org/r/84974 (owner: QChris)
[16:00:13] RECOVERY - MySQL Replication Heartbeat on db1038 is OK: OK replication delay 0 seconds
[16:00:22] RECOVERY - MySQL Replication Heartbeat on db1042 is OK: OK replication delay -0 seconds
[16:00:22] RECOVERY - MySQL Replication Heartbeat on db65 is OK: OK replication delay -0 seconds
[16:00:22] RECOVERY - MySQL Replication Heartbeat on db72 is OK: OK replication delay -1 seconds
[16:00:32] RECOVERY - MySQL Replication Heartbeat on db1004 is OK: OK replication delay -1 seconds
[16:00:32] RECOVERY - MySQL Replication Heartbeat on db51 is OK: OK replication delay -1 seconds
[16:00:33] RECOVERY - MySQL Replication Heartbeat on db1059 is OK: OK replication delay -0 seconds
[16:00:42] RECOVERY - MySQL Replication Heartbeat on db31 is OK: OK replication delay 0 seconds
[16:00:42] RECOVERY - MySQL Replication Heartbeat on db1020 is OK: OK replication delay -0 seconds
[16:01:02] RECOVERY - MySQL Replication Heartbeat on db1011 is OK: OK replication delay -0 seconds
[16:44:59] Logged the message, Master
[16:46:27] <^d> "Indexed a total of 380 pages at 23/second"
[16:59:59] * aude annoyed that patrolled edits are hidden
[17:00:09] (unrelated)
[17:00:31] (PS1) Chad: testwikidatawiki to Cirrus as primary [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/84995
[17:00:51] <^d> Hehe, there we go.
[17:13:07] ^d: manybubbles is there a particular page on mediawiki.org or somewhere that talks about cirrus?
[17:13:13] e.g. for people interested in trying the search
[17:13:35] or we can give it some time to improve it before asking people to test
[17:13:49] ^d: beyond https://www.mediawiki.org/wiki/Search which I'm sure is outdated I don't believe so
[17:13:58] ok
[17:14:00] no, it is totally time to ask people to test
[17:14:21] ok, i can send a mail
[17:14:27] we want people that are on the wikis on which it is installed to test it. mostly, though, it implements all the same features as the old one.
[17:14:31] is there a place people can give feedback?
[17:14:33] let me update that page
[17:14:37] ok
[17:14:39] aude: bugzilla
[17:14:43] aude: :)
[17:14:46] wikidata's community is fairly technical
[17:14:52] bugzilla is good
[17:18:57] (PS1) coren: Tool Labs: tweak sql utility [operations/puppet] - https://gerrit.wikimedia.org/r/85002
[17:18:59] https://test.wikidata.org/wiki/MediaWiki:Searchmenu-new
[17:19:06] that's what is done on wikidata
[17:49:34] ^d: cool!
[18:05:02] PROBLEM - check_job_queue on fenari is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[18:05:21] !log reedy synchronized php-1.22wmf18
[18:05:26] Logged the message, Master
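
Note on the temp file leak diagnosed at 00:11: a self-deleting temp file only cleans up the one path it tracks, so extra files written next to it under other extensions are left behind. The sketch below reproduces that pattern in Python. It is an illustration under assumptions, not MediaWiki's TempFSFile code: the class SelfDeletingTempFile and the function render_leaky are hypothetical names, and the .svg/.png suffixes are taken only from the file types mentioned in the log.

```python
import os
import tempfile


class SelfDeletingTempFile:
    """Hypothetical stand-in for a self-destructing temp file.

    Not MediaWiki's TempFSFile; it only mimics the idea that the object
    removes the single path it tracks when it is purged or destroyed.
    """

    def __init__(self, suffix=""):
        fd, self.path = tempfile.mkstemp(suffix=suffix)
        os.close(fd)

    def purge(self):
        # Only the tracked path is removed; siblings are unknown to us.
        if os.path.exists(self.path):
            os.remove(self.path)

    def __del__(self):
        try:
            self.purge()
        except Exception:
            pass  # never raise from a destructor


def render_leaky(tmp):
    """Hypothetical renderer: writes two siblings next to the tracked path."""
    siblings = [tmp.path + ".svg", tmp.path + ".png"]
    for path in siblings:
        with open(path, "w") as fh:
            fh.write("placeholder")
    return siblings


tmp = SelfDeletingTempFile()
tracked = tmp.path
leaked = render_leaky(tmp)

tmp.purge()  # the tracked file goes away...
still_there = [p for p in leaked if os.path.exists(p)]  # ...the siblings do not
print(os.path.exists(tracked), len(still_there))

# Tidy up the leaked siblings so the demo leaves nothing behind.
for path in still_there:
    os.remove(path)
```

One way to avoid the leak in a design like this is to register every derived path with the same cleanup object, or to create all outputs inside a dedicated temporary directory that is removed as a whole.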