[00:01:53] (CR) Kaldari: [C: 2] Move all MobileFrontend EventLogging rules into MobileFrontend [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/91511 (owner: Jdlrobson)
[00:17:35] !log kaldari synchronized wmf-config/mobile.php 'MobileFrontend schema configs moved to MobileFrontend extension'
[00:17:48] Logged the message, Master
[00:21:28] !log kaldari synchronized php-1.22wmf22/extensions/MobileFrontend/ 'syncing MobileFrontend 1.22wmf22'
[00:21:41] Logged the message, Master
[00:22:59] !log kaldari synchronized php-1.22wmf21/extensions/MobileFrontend/ 'syncing MobileFrontend 1.22wmf21'
[00:23:13] Logged the message, Master
[00:32:07] (PS1) Kaldari: Temporarily reverting. [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/91540
[00:32:45] (CR) Kaldari: [C: 2] Temporarily reverting. [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/91540 (owner: Kaldari)
[00:35:50] (PS6) Andrew Bogott: Add install and upstart for proxy api [operations/puppet] - https://gerrit.wikimedia.org/r/91499
[00:38:36] (Abandoned) Kaldari: Temporarily reverting. [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/91540 (owner: Kaldari)
[00:38:59] (PS1) Kaldari: Revert "Move all MobileFrontend EventLogging rules into MobileFrontend" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/91542
[00:40:55] (CR) Kaldari: [C: 2] Revert "Move all MobileFrontend EventLogging rules into MobileFrontend" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/91542 (owner: Kaldari)
[00:41:00] (PS7) Andrew Bogott: Add install and upstart for proxy api [operations/puppet] - https://gerrit.wikimedia.org/r/91499
[00:43:14] !log kaldari synchronized wmf-config/mobile.php 'readding mobile schema config for now'
[00:43:30] Logged the message, Master
[01:00:17] (PS4) Dzahn: add account marktraceur and add to stat1 [operations/puppet] - https://gerrit.wikimedia.org/r/91047
[01:00:53] Huzzah
[01:02:23] (CR) Dzahn: [C: 2] "good to go, has approvals" [operations/puppet] - https://gerrit.wikimedia.org/r/91047 (owner: Dzahn)
[01:03:31] marktraceur: so it turned into just adding the existing account, hope you're ok with that
[01:03:36] makes it much easier
[01:03:50] Sure sure
[01:04:01] I don't have to screw with my SSH config too much then
[01:04:06] you can still request a rename for everything at a later point, like when you have a new key anyways maybe
[01:04:12] ok
[01:06:24] (CR) Dzahn: "so this was just adding existing account mholmquist to the "stat1 special accounts"" [operations/puppet] - https://gerrit.wikimedia.org/r/91047 (owner: Dzahn)
[01:06:55] notices ebernhardson has duplicate account in same puppet class
[01:07:17] marktraceur: notice: /Stage[main]/Accounts::Mholmquist/Unixaccount[Mark Holmquist]/User[mholmquist]/ensure: created
[01:07:20] on stat1
[01:07:31] Cool
[01:12:57] mutante: Are there docs for stat1 somewhere?
[01:14:02] mutante: no reply from smart yet, right?
[01:14:36] marktraceur: uhm.. afraid if they exist i'm not sure where.. https://wikitech.wikimedia.org/wiki/Stat1 but ottomata or other analytics people will be able to help
[01:14:46] Cool
[01:14:50] I poked YuviPanda too
[01:14:57] None of you are helpful
[01:15:05] marktraceur: site.pp!
[01:15:33] hehe, wait . doc.wikimedia.org
[01:15:51] yeah, right
[01:15:55] https://doc.wikimedia.org/puppet/
[01:16:04] !log kaldari synchronized php-1.22wmf22/extensions/MobileFrontend/ 'updating MobileFrontend for cherrypick'
[01:16:18] Logged the message, Master
[01:17:34] mutante: so, smart?
[01:17:40] jeremyb: ?
[01:17:51] 24 01:14:01 < jeremyb> mutante: no reply from smart yet, right?
[01:17:56] mutante: to noc@
[01:18:10] i think i missed something there
[01:18:20] want me to check noc@ mail ?
[01:18:22] i'll bump the thread
[01:18:48] the 504s ..and Leslie
[01:18:50] ok , got it
[01:19:09] no, i don't see any reply yet
[01:19:20] mutante: right. too late though, i bumped already :)
[01:19:30] danke
[01:19:35] unless they replied only to her
[01:19:41] bitte
[01:19:50] right, possible but that risk i will take :)
[01:21:00] i'm in my new appartment, first 2 hours still
[01:21:20] errr, new city?
[01:21:21] just ..no furniture..
[01:21:25] nah, S.F.
[01:21:29] sitting on floor? :)
[01:21:33] but i've been living like a nomad for over 1 year
[01:21:34] yes
[01:22:01] i know nomads that have you beat!
[01:22:30] i bet, yep
[01:22:54] ok, need to go downstairs,bbiaw
[01:28:45] mutante: mazel tov on the new place!
[01:28:55] i also go downstairs i think
[01:29:25] !log kaldari synchronized php-1.22wmf22/extensions/MobileFrontend/ 'updating MobileFrontend for cherrypick'
[01:32:03] !log kaldari synchronized php-1.22wmf21/extensions/MobileFrontend/ 'updating MobileFrontend for cherrypick'
[01:32:16] Logged the message, Master
[01:33:41] jeremyb: and now i got a cot to sleep on for tonight:) cya
[01:33:47] thanks for wishes
[01:34:03] (PS1) Kaldari: Moving MobileFrontend schema configs to MobileFrontend extension [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/91552
[01:34:26] (CR) Kaldari: [C: 2] Moving MobileFrontend schema configs to MobileFrontend extension [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/91552 (owner: Kaldari)
[01:36:27] !log kaldari synchronized wmf-config/mobile.php 'updating for MobileFrontend schema changes'
[01:36:39] Logged the message, Master
[02:22:26] !log LocalisationUpdate completed (1.22wmf22) at Thu Oct 24 02:22:25 UTC 2013
[02:22:40] Logged the message, Master
[02:42:50] !log LocalisationUpdate completed (1.22wmf21) at Thu Oct 24 02:42:50 UTC 2013
[02:43:05] Logged the message, Master
[03:01:15] mutante: ack your mail
[03:02:41] that's PDT? so just a couple mins ago?
[03:04:27] mutante
[03:10:16] ok, updated otrs-wiki page with the list of all the reports
[03:10:23] back tomorrow
[03:11:12] !log LocalisationUpdate ResourceLoader cache refresh completed at Thu Oct 24 03:11:12 UTC 2013
[03:11:28] Logged the message, Master
[04:13:36] RECOVERY - check_job_queue on fenari is OK: JOBQUEUE OK - all job queues below 10,000
[04:16:46] PROBLEM - check_job_queue on fenari is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[05:13:25] (PS1) Springle: remove bash-ism from cron job command [operations/puppet] - https://gerrit.wikimedia.org/r/91561
[05:14:33] (CR) Springle: [C: 2] remove bash-ism from cron job command [operations/puppet] - https://gerrit.wikimedia.org/r/91561 (owner: Springle)
[07:02:35] PROBLEM - Check status of defined EventLogging jobs on vanadium is CRITICAL: CRITICAL: Stopped EventLogging jobs: consumer/mysql-db1047
[07:04:08] springle: are you doing any work on db1047?
[07:05:10] ori-l: hmm nope
[07:05:22] i'll check the logs, could be unrelated
[07:05:34] unrelated to the state of the database, I mean.
[07:07:43] db1026 is unhappy atm, but nothing odd about db1047 that i've seen
[07:11:58] (PS1) Springle: depool db1026 for recovery, max connections, overwhelmed [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/91568
[07:12:25] (CR) Springle: [C: 2] depool db1026 for recovery, max connections, overwhelmed [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/91568 (owner: Springle)
[07:13:28] !log springle synchronized wmf-config/db-eqiad.php 'depool db1026'
[07:13:45] Logged the message, Master
[07:33:16] springle: around?
[07:33:27] hello
[07:33:32] hi hashar
[07:33:35] hiii
[07:33:50] Azatoth and I managed to get Jenkins to build Debian packages for us \O/
[07:34:15] aka you push a change set (ex: https://gerrit.wikimedia.org/r/#/c/91506/ ) and Jenkins provides you .deb .dsc … https://integration.wikimedia.org/ci/job/operations-debs-jenkins-debian-glue-debian-glue/3/ \O/
[07:34:51] wow, that's really cool
[07:38:52] ori-l: yes
[07:40:26] (PS1) Yurik: Optimized the number of req.http.host checks performed [operations/puppet] - https://gerrit.wikimedia.org/r/91569
[07:52:45] (PS1) ArielGlenn: oldimage table moved to private dump area [operations/dumps] (ariel) - https://gerrit.wikimedia.org/r/91571
[07:54:22] (CR) ArielGlenn: [C: 2] oldimage table moved to private dump area [operations/dumps] (ariel) - https://gerrit.wikimedia.org/r/91571 (owner: ArielGlenn)
[08:09:36] RECOVERY - Check status of defined EventLogging jobs on vanadium is OK: OK: All defined EventLogging jobs are runnning.
[08:12:26] PROBLEM - Apache HTTP on mw1056 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:13:05] (CR) Nemo bis: "Makes sense. Note that we query that table on Toolserver for WikiTeam's Commons backups on archive.org." [operations/dumps] (ariel) - https://gerrit.wikimedia.org/r/91571 (owner: ArielGlenn)
[08:13:07] PROBLEM - Apache HTTP on mw1163 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:13:07] PROBLEM - Apache HTTP on mw1019 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:13:16] PROBLEM - Apache HTTP on mw1093 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:15:07] RECOVERY - Apache HTTP on mw1019 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 9.553 second response time
[08:15:16] PROBLEM - Apache HTTP on mw1055 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:15:58] (PS1) QChris: Allow qchris to sudo -u stats to debug and test jobs on stats1001 [operations/puppet] - https://gerrit.wikimedia.org/r/91572
[08:16:16] PROBLEM - Apache HTTP on mw1104 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:16:29] (PS1) Matanya: mail.pp: change exec to file [operations/puppet] - https://gerrit.wikimedia.org/r/91573
[08:16:56] PROBLEM - Apache HTTP on mw1108 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:17:13] (CR) Matanya: "superseded by: https://gerrit.wikimedia.org/r/#/c/91573/" [operations/puppet] - https://gerrit.wikimedia.org/r/86889 (owner: Matanya)
[08:17:36] PROBLEM - Apache HTTP on mw1058 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:17:40] (CR) jenkins-bot: [V: -1] mail.pp: change exec to file [operations/puppet] - https://gerrit.wikimedia.org/r/91573 (owner: Matanya)
[08:17:59] (Abandoned) Matanya: Repalce exec calls with file and user. [operations/puppet] - https://gerrit.wikimedia.org/r/86889 (owner: Matanya)
[08:18:07] PROBLEM - Apache HTTP on mw1019 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:19:07] (PS2) Matanya: mail.pp: change exec to file [operations/puppet] - https://gerrit.wikimedia.org/r/91573
[08:19:16] PROBLEM - Apache HTTP on mw1048 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:19:16] RECOVERY - Apache HTTP on mw1104 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 7.669 second response time
[08:19:26] PROBLEM - Apache HTTP on mw1060 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:19:26] RECOVERY - Apache HTTP on mw1058 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 7.383 second response time
[08:20:07] RECOVERY - Apache HTTP on mw1093 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 7.082 second response time
[08:20:09] (CR) jenkins-bot: [V: -1] mail.pp: change exec to file [operations/puppet] - https://gerrit.wikimedia.org/r/91573 (owner: Matanya)
[08:20:16] RECOVERY - Apache HTTP on mw1055 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 8.709 second response time
[08:21:26] RECOVERY - Apache HTTP on mw1060 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 7.404 second response time
[08:22:07] RECOVERY - Apache HTTP on mw1019 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 9.228 second response time
[08:22:07] RECOVERY - Apache HTTP on mw1048 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 7.119 second response time
[08:23:16] PROBLEM - Apache HTTP on mw1055 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:23:35] (PS3) Matanya: mail.pp: change exec to file [operations/puppet] - https://gerrit.wikimedia.org/r/91573
[08:23:56] RECOVERY - Apache HTTP on mw1163 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.124 second response time
[08:24:07] RECOVERY - Apache HTTP on mw1055 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 5.650 second response time
[08:24:56] RECOVERY - Apache HTTP on mw1108 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 6.569 second response time
[08:26:16] RECOVERY - Apache HTTP on mw1056 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 3.097 second response time
[08:26:42] uhhh
[08:26:44] ??
[08:29:16] PROBLEM - Apache HTTP on mw1104 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:29:26] PROBLEM - Apache HTTP on mw1056 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:29:26] PROBLEM - Apache HTTP on mw1060 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:30:07] PROBLEM - Apache HTTP on mw1163 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:31:16] RECOVERY - Apache HTTP on mw1104 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 3.664 second response time
[08:32:06] RECOVERY - Apache HTTP on mw1163 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 1.153 second response time
[08:32:16] RECOVERY - Apache HTTP on mw1056 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.063 second response time
[08:32:16] RECOVERY - Apache HTTP on mw1060 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.084 second response time
[08:32:48] wth
[08:33:05] http://ganglia.wikimedia.org/latest/?r=4hr&cs=&ce=&c=Application+servers+eqiad&h=mw1055.eqiad.wmnet&tab=m&vn=&mc=2&z=medium&metric_group=ALLGROUPS
[08:33:45] playing with varnish on mobile01 . beta
[08:35:06] PROBLEM - Apache HTTP on mw1171 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[08:36:43] well http://ganglia.wikimedia.org/latest/?r=4hr&cs=&ce=&m=ap_busy_workers&s=by+name&c=Application+servers+eqiad&h=&host_regex=&max_graphs=0&tab=m&vn=&sh=1&z=small&hc=4
[08:36:58] what started near 6 am utc?
[08:41:20] (PS1) QChris: Move geowiki datafile generation to stat1001 [operations/puppet] - https://gerrit.wikimedia.org/r/91576
[08:41:56] is there a way to kill varnish check on beta cluster somehow for a few min?
[08:42:06] floods logs :(
[08:43:32] I guess the labs icinga
[08:57:56] RECOVERY - Apache HTTP on mw1171 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 808 bytes in 0.073 second response time
[09:09:35] (PS1) Ori.livneh: Make respawn behavior of EventLogging consumers more resilient [operations/puppet] - https://gerrit.wikimedia.org/r/91580
[09:10:40] yurik_: do you mean the backend checks ?
[09:10:53] hashar: yep
[09:10:55] yurik_: I haven't found out how to filter them out with varnishncsa, but it is surely possible
[09:10:57] (CR) Ori.livneh: [C: 2] Make respawn behavior of EventLogging consumers more resilient [operations/puppet] - https://gerrit.wikimedia.org/r/91580 (owner: Ori.livneh)
[09:11:09] MaxSem: mark: any idea how to filter out Varnish backend checks in varnishncsa ?
[09:11:27] hashar: i'm using varnishlog - much more useful info
[09:11:36] dunno)
[09:11:47] but unfortunatelly they are dumped as they become available
[09:12:06] and often it doesn't know that the request should be filtered until it already logged a few items
[09:13:25] can you paste a log record you want to exclude?
[09:16:02] ori-l: http://paste.debian.net/61161/
[09:16:27] the frontend varnish hits the backend one by doing a http://varnishcheck/check request
[09:16:30] every second or so
[09:21:13] does not work: varnishlog -c -m RxHeader:'(?!Host: varnishcheck)'
[09:21:14] :(
[09:21:47] what about: varnishlog -c -X "Host: etherpad.org" ?
[09:22:02] errr: varnishlog -c -X "Host: varnishcheck" ?
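The lookahead problem in the 09:21:13 filter can be illustrated outside Varnish. What follows is a minimal Python sketch, not varnishlog itself. It assumes that `-m RxHeader:<regex>` selects a request as soon as any one of its received header lines matches the regex, and it uses Python's `re` module as a stand-in for the PCRE engine. Every header value other than `Host: varnishcheck` is hypothetical, and `request_selected` is an illustrative helper, not a Varnish API.

```python
import re

# Pattern from the log that "does not work": an unanchored negative lookahead.
BROKEN = r"(?!Host: varnishcheck)"
# Alternative pattern: require a Host header first, then reject the check host.
FIXED = r"Host: (?!varnishcheck)"


def request_selected(pattern, headers):
    """Return True if any single header line matches the pattern.

    Assumption: this mirrors how a per-record regex filter picks a request.
    """
    return any(re.search(pattern, header) for header in headers)


# Hypothetical header sets; only "Host: varnishcheck" comes from the log.
health_check = ["Host: varnishcheck", "User-Agent: example-checker"]
page_view = ["Host: example.org", "User-Agent: example-browser"]

for name, headers in (("health_check", health_check), ("page_view", page_view)):
    print(name, "BROKEN:", request_selected(BROKEN, headers),
          "FIXED:", request_selected(FIXED, headers))
```

In this model the unanchored lookahead selects every request, because it succeeds on any non-Host header and even partway into the `Host: varnishcheck` line itself. Anchoring on the literal `Host: ` prefix means only the Host header can match, and only when its value does not start with `varnishcheck`.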
[09:23:11] that just strip the Host: header :-]
[09:23:29] http://paste.debian.net/61164/
[09:34:43] hashar: varnishlog -c -m 'RxHeader:Host: (?!varnishcheck)'
[09:37:18] does not work: varnishlog -c -m RxHeader:'(?!Host: varnishcheck)'
[09:37:30] that will match if the request sets any header other than host
[09:37:45] which it does -- the user agent, for example
[09:48:34] (PS4) Physikerwelt: Mathoid service [operations/puppet] - https://gerrit.wikimedia.org/r/90733
[10:06:22] !log set global userstat=1 on mariadb instances for audit
[10:06:37] Logged the message, Master
[10:07:26] (PS1) Odder: (bug 36002) Configure $wgMobileUrlTemplate for sourceswiki [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/91586
[10:09:17] (PS2) Odder: (bug 36002) Configure $wgMobileUrlTemplate for sourceswiki [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/91586
[10:15:39] !log tstarling Started syncing Wikimedia installation... : fix linkprefix regexes (Iaa7eaa44)
[10:15:56] Logged the message, Master
[10:24:38] !log tstarling Finished syncing Wikimedia installation... : fix linkprefix regexes (Iaa7eaa44)
[10:24:49] Logged the message, Master
[10:41:52] !log tstarling Started syncing Wikimedia installation... : fix linkprefix regexes in 1.22wmf22 (Iaa7eaa44)
[10:42:06] Logged the message, Master
[10:45:48] mark or paravoid, can you guys please tell me if you approve the addition of a cookie in https://gerrit.wikimedia.org/r/91401 ?
[10:47:21] MaxSem: so this only matters if the cache should vary on the cookie
[10:47:31] !log tstarling Finished syncing Wikimedia installation... : fix linkprefix regexes in 1.22wmf22 (Iaa7eaa44)
[10:47:33] if not, it will be stripped for caching purposes, and restored before sending to mediawiki
[10:47:46] Logged the message, Master
[10:48:31] mark, this cookie is for JS only so our only concern with it is that it shouldn't impact cache hit ratio
[10:48:44] then it shouldn't matter
[10:48:50] yup
[10:48:56] thanks!
[10:49:28] replied on the patshet
[10:49:29] patchset
[10:53:56] wait
[10:54:03] a user with this cookie doesn't necessarily have a session, right?
[10:54:26] you say on the patchset it's only set when the user is logged in
[10:54:27] hmm
[10:55:33] we allow only logged-in users to edit, so they're mostly guaranteed to have a session
[10:57:08] if the user loses the session but still has the cookie, that might be problematic
[10:58:46] but it wouldn't bust the cache in that case
[11:02:06] PROBLEM - MySQL InnoDB on db1059 is CRITICAL: CRIT longest blocking idle transaction sleeps for 609 seconds
[11:02:43] heh, php.net ended up on google's safe browsing blacklist: http://www.google.com/safebrowsing/diagnostic?site=http://php.net/manual/en/class.jsonserializable.php&hl=en
[11:03:06] RECOVERY - MySQL InnoDB on db1059 is OK: OK longest blocking idle transaction sleeps for 0 seconds
[11:03:44] ori-l, not blocked enough as long there's no nuke crater on place of php.net
[11:03:46] no web site built on php could possibly be safe to browse on ;p
[11:09:39] funny how mobile traffic has a differently shaped peak from desktop
[11:09:48] more in evenings and weekends
[11:09:51] makes sense of course
[11:13:43] (CR) MaxSem: [C: 2] Setting Persian Wikipedia's weekly feeds [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/90759 (owner: Ebrahim)
[11:13:55] (Merged) jenkins-bot: Setting Persian Wikipedia's weekly feeds [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/90759 (owner: Ebrahim)
[11:14:13] (CR) MaxSem: [C: 2] Ensure that m.mediawiki.org will work as an origin for CORS [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/91058 (owner: Awjrichards)
[11:14:23] (Merged) jenkins-bot: Ensure that m.mediawiki.org will work as an origin for CORS [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/91058 (owner: Awjrichards)
[11:17:58] (PS1) Mark Bergsma: Repool cp3004 [operations/puppet] - https://gerrit.wikimedia.org/r/91590
[11:18:00] !log maxsem synchronized wmf-config/CommonSettings.php 'https://gerrit.wikimedia.org/r/#/c/91058/'
[11:18:13] Logged the message, Master
[11:18:21] (CR) Mark Bergsma: [C: 2] Repool cp3004 [operations/puppet] - https://gerrit.wikimedia.org/r/91590 (owner: Mark Bergsma)
[11:19:33] !log maxsem synchronized wmf-config/InitialiseSettings.php 'https://gerrit.wikimedia.org/r/90759'
[11:19:46] Logged the message, Master
[11:22:51] PROBLEM - RAID on mw1140 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[11:24:41] RECOVERY - RAID on mw1140 is OK: OK: no RAID installed
[12:18:01] $ quilt add debian
[12:18:01] Cannot add symbolic link debian
[12:18:04] for god sake debian
[12:37:05] (PS1) Springle: repool db1026 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/91592
[12:37:28] (CR) Springle: [C: 2] repool db1026 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/91592 (owner: Springle)
[12:38:07] !log springle synchronized wmf-config/db-eqiad.php 'repool db1026'
[12:38:25] Logged the message, Master
[12:54:47] hi springle-away
[12:55:04] are you around?
[12:57:01] (CR) ArielGlenn: [C: 1] "yep, it's gone, see r64002" [operations/dns] - https://gerrit.wikimedia.org/r/91125 (owner: Dzahn)
[13:04:12] (PS2) ArielGlenn: dumps: Copy pagecounts data to public labs nfs too [operations/puppet] - https://gerrit.wikimedia.org/r/91293 (owner: Yuvipanda)
[13:05:46] (CR) ArielGlenn: [C: 2] dumps: Copy pagecounts data to public labs nfs too [operations/puppet] - https://gerrit.wikimedia.org/r/91293 (owner: Yuvipanda)
[13:13:50] (PS1) Mark Bergsma: Add Text caches esams [operations/puppet] - https://gerrit.wikimedia.org/r/91595
[13:14:10] (CR) Mark Bergsma: [C: 2] Add Text caches esams [operations/puppet] - https://gerrit.wikimedia.org/r/91595 (owner: Mark Bergsma)
[13:15:51] (PS1) coren: Update manage-nfs-volumes-daemon [operations/puppet] - https://gerrit.wikimedia.org/r/91596
[13:17:35] (CR) coren: [C: 2] "o/~ I have confidence in confidence alone! o/~" [operations/puppet] - https://gerrit.wikimedia.org/r/91596 (owner: coren)
[13:18:18] (PS1) ArielGlenn: add mount points for download rsync cron jobs [operations/puppet] - https://gerrit.wikimedia.org/r/91597
[13:18:46] (PS1) Mark Bergsma: Add amssq48-62 [operations/puppet] - https://gerrit.wikimedia.org/r/91598
[13:18:56] (CR) jenkins-bot: [V: -1] add mount points for download rsync cron jobs [operations/puppet] - https://gerrit.wikimedia.org/r/91597 (owner: ArielGlenn)
[13:20:08] (CR) Mark Bergsma: [C: 2] Add amssq48-62 [operations/puppet] - https://gerrit.wikimedia.org/r/91598 (owner: Mark Bergsma)
[13:23:06] (PS2) ArielGlenn: add mount points for download rsync cron jobs [operations/puppet] - https://gerrit.wikimedia.org/r/91597
[13:24:46] (CR) ArielGlenn: [C: 2] add mount points for download rsync cron jobs [operations/puppet] - https://gerrit.wikimedia.org/r/91597 (owner: ArielGlenn)
[13:42:52] (PS1) ArielGlenn: fix up name of nfs server for dump host pagecount rsync [operations/puppet] - https://gerrit.wikimedia.org/r/91600
[13:45:42] (CR) ArielGlenn: [C: 2] fix up name of nfs server for dump host pagecount rsync [operations/puppet] - https://gerrit.wikimedia.org/r/91600 (owner: ArielGlenn)
[14:08:43] hey apergos,
[14:08:57] a long time ago, when I was puppetizing stuff on stat1
[14:09:23] i was told that NFS was bad was barely barely allowed to keep a nfs mount of dumps.wm.org so that erik zachte could use pagecounts from it
[14:10:02] (CR) Andrew Bogott: [C: -1] "(2 comments)" [operations/puppet] - https://gerrit.wikimedia.org/r/91573 (owner: Matanya)
[14:13:26] ottomata: we used to run the MediaWiki install over a NFS share mounted on all apaches
[14:13:42] (Abandoned) QChris: Allow qchris to sudo -u stats to debug and test jobs on stats1001 [operations/puppet] - https://gerrit.wikimedia.org/r/91572 (owner: QChris)
[14:13:55] (Abandoned) QChris: Move geowiki datafile generation to stat1001 [operations/puppet] - https://gerrit.wikimedia.org/r/91576 (owner: QChris)
[14:14:06] ottomata: that sometime caused a bunch of issues :D Like whenever the NFS server died we were having timeout on the site
[14:14:26] ottomata: and the NFS server was also the bastion / deployment host, so that happened somehow frequently
[14:14:44] the funny thing is that you could vi a file, save -> instant deploy!
[14:17:43] yeah, the reason i'm asking, is analytics was recently working a project to get pageview data into hdfs
[14:18:00] and we could have made use of an nfs share of dumps
[14:18:07] but I told them that ops wouldn't like this
[14:18:17] because of the experience I had trying to put the same share on stat1
[14:18:26] so we coded a more complicated solution
[14:20:20] (PS1) Mark Bergsma: Revert "dumps: Copy pagecounts data to public labs nfs too" [operations/puppet] - https://gerrit.wikimedia.org/r/91604
[14:20:28] (CR) jenkins-bot: [V: -1] Revert "dumps: Copy pagecounts data to public labs nfs too" [operations/puppet] - https://gerrit.wikimedia.org/r/91604 (owner: Mark Bergsma)
[14:20:59] apergos: can you please explain the above?
[14:25:41] (PS1) Mark Bergsma: Unmount NFS mount [operations/puppet] - https://gerrit.wikimedia.org/r/91606
[14:26:56] (CR) Mark Bergsma: [C: 2] Unmount NFS mount [operations/puppet] - https://gerrit.wikimedia.org/r/91606 (owner: Mark Bergsma)
[14:32:54] mark: see https://bugzilla.wikimedia.org/show_bug.cgi?id=48894
[14:33:51] this was cleared with coren, who is managing the nfs shares in labs right now
[14:33:53] that's nice, but can you implement a solution that does not involve yet another NFS mount?
[14:34:03] I don't care what Coren thinks about that
[14:34:16] we don't allow others to use NFS for that to reduce dependency problems we've had in the past
[14:35:14] now we've finally mostly gotten rid of that, we're not gonna add it again just because labs is on NFS
[14:35:16] mark: I have no opinion on the matter. I've been asked "can you export the place from which users can read pagecounts", I've exported it.
[14:35:37] if they are going to have access to a shared pool of data, it's going to be nfs mounted for them until labs has some other solution (haven't they had this discussion multiple times and not found a better solution for the short/mid-term)?
[14:36:05] you can copy the data onto the labs nfs server in another way
[14:36:19] we should not also make this depend on NFS
[14:36:32] oh I don't care how I copy it, it can be rsynced to a daemon for all I care
[14:36:49] I'm just saying, basically it's an nfs share that's being made available in the projects
[14:36:58] I don't see how that can be worked around
[14:37:15] that's not what I'm arguing against am I?
[14:37:34] I'm not clear what you're arguing against tbh
[14:37:57] within labs we have NFS right now for shared data. that's unfortunate but we can't really work around that in a sane way
[14:38:22] but please do not also depend on that on the dumps servers
[14:38:31] oh that's fine
[14:40:06] so that means... Coren, can you give me an rsync stanza for the dataset{1001,2} hosts, and what I should be calling the share on my side?
[14:40:14] (CR) Milimetric: "Yuvi (or anyone looking to import pagecount data from dumps) might find this of use: https://git.wikimedia.org/blob/analytics%2Fkraken/02a" [operations/puppet] - https://gerrit.wikimedia.org/r/91604 (owner: Mark Bergsma)
[14:42:08] apergos: what method would you like? ssh, rsync daemon?
[14:42:16] rsync daemon I guess
[14:42:22] does that work for you?
[14:42:58] Given the volume of data, it's probably better to not tunnel over ssh.
[14:46:19] some day that should probably be in cephfs
[14:46:26] hopefully :)
[14:46:36] openstack is creating a project for "shared filesystem"
[14:46:46] haha
[14:47:11] and there's discussion on the ceph list on how to conformant
[14:47:21] to be*
[14:47:24] https://blueprints.launchpad.net/manila/+spec/cephfs-driver
[14:56:40] hey mark, any idea on how long it might be til there are no more productions squids?
[14:56:56] (i.e., when we could use varnishkafka everywhere for all webrequest logs)
[14:57:06] either before I go on holiday on nov 18
[14:57:12] or by the time I come back :)
[14:57:39] I'm thinking of putting OC text traffic on varnish there and use it for testing also
[14:57:46] if that all works out fine, I think we're pretty close to ditching squid entirely
[14:58:32] \o/
[14:59:30] mark, was talking with snaps and diederik
[14:59:35] if we modify udp2log to consume from kafka
[14:59:37] i think we can do this
[14:59:46] that would make all the downstream dependencies easier to handle
[14:59:52] yeah
[15:00:19] i like that better than in varnishkafka
[15:05:11] apergos: hmm, I see it got reverted
[15:05:27] mark: the dumps are already put on NFS and available on labs
[15:05:32] so was just adding this to that
[15:05:50] yeah I know
[15:05:55] oh, revert not merged
[15:06:03] no but I unmounted the NFS share anyway
[15:06:08] heh
[15:06:21] why exactly? too much network traffic?
[15:07:06] we've spent years trying to get rid of NFS and NFS dependencies across the cluster
[15:07:23] in that case the page dumps should be killed too, no¿
[15:07:26] once NFS servers go down, clients all block on it and there's little you can do about that
[15:09:22] within labs we'll have to use NFS for the foreseable future, but within production, we'd like to limit that as much as possible
[15:09:50] mark: sure, but the dumps are being exported to labs by a cron running on production
[15:10:14] yeah that should probably change also
[15:10:21] hmm, I see
[15:10:42] I could do something like have a cron running on labs somewhere that just uses http to get the new pagecounts things
[15:10:51] provided we do an initial copy there in somew ay
[15:10:52] *way
[15:11:03] if it's a lot of data I can imagine Coren/Ryan wouldn't like to see that go through a labs instance
[15:11:13] so then it's probably fine to do something on the nfs server itself, rsync daemon or whatever
[15:11:19] mark: Coren made the nfs mount and was happy about it
[15:11:29] mark: ah, http. yes
[15:11:34] yeah but I'm not :)
[15:12:01] mark: hmm, so cron on the nfs server, rsyncing over... scp?
[15:12:11] or just rsyncd
[15:12:22] cron can live on the dumps server
[15:12:25] doesn't matter
[15:12:31] Yeah, I'm working with apergos to do rsync.
[15:12:40] cool
[15:13:21] Personally, I don't care either way; NFS is a given labs-side, but how the data gets onto the spinning rust is not material from a user's perspective. Whatever works and is apropriate in our general setup is fine.
[15:14:04] And rsync probably does the trick.
[15:14:15] that's what we've done in other similar situations also
[15:15:01] I'm tempted to pull from the file server than the other way 'round, though. Easier to fine tune the schedule that way.
[15:15:17] whatever works
[15:15:25] well
[15:15:30] which isn't nfs ;p
[15:15:50] apergos mentions that it's better to push so that it can be interleaved with the other rsyncs.
[15:15:51] :-)
[15:16:05] You seem to have a rather bad case of nfsphobia. :-)
[15:16:22] i can tell you the horror stories
[15:16:32] circular dependencies and all that
[15:16:35] clearly we should all switch to gluster :P
[15:16:40] I have a couple of my own to share, mostly revolving about boot ordering. :-)
[15:16:46] yeah
[15:17:42] Coren: when do you think you'll have time to work on moving it? I can lend a hand if you want.
[15:18:04] "moving it"?
[15:18:37] Coren: well, the pagecounts rsync. it won't really be doing much now, since mark has unmounted the nfs volume
[15:18:43] hmm, unless puppet mounts it again
[15:18:53] no, i had puppet unmount it
[15:19:09] YuviPanda: Oh! I'm working on something now but it's next on my list.
[15:19:20] mark: ah, okay
[15:19:25] Coren: ah, sweet. ok
[15:30:18] cmjohnson1: hey, quick question
[15:30:30] maybe I've asked you again, I don't remember
[15:30:49] do you have any idea what's up with those e.g. ps1-a6-eqiad-infeed-load-tower-B-phase-Z alerts?
[15:30:55] a6, b5, c2
[15:31:30] we have too many servers in those racks...b5 is really b6. they are mixed up
[15:31:40] (PS1) ArielGlenn: turn off udp2log monitoring on silver, it's broken (rt #6023) [operations/puppet] - https://gerrit.wikimedia.org/r/91608
[15:31:55] too many servers?
[15:32:35] yes...they're loaded and we are using y cables for power...which is part of the reason I cannot adjust
[15:32:48] oh my
[15:32:53] they're above the threshold of 12A
[15:33:45] are all the y-cables on the same phase or something?
[15:33:51] (CR) ArielGlenn: [C: 2] turn off udp2log monitoring on silver, it's broken (rt #6023) [operations/puppet] - https://gerrit.wikimedia.org/r/91608 (owner: ArielGlenn)
[15:35:56] mark: so the y cables appear to be on 1 phase on b6 and a6...c2 is not balanced correctly.
[15:35:57] ksnider: you realize that most data centers you invited for RFP probably don't meet our primary requirements right :)
[15:36:07] how so?
[15:37:04] well, some (e.g. savvis) not carrier neutral, some are likely resellers of actual facilities
[15:37:04] most of them won't have many carriers/networks at all
[15:37:04] it's fine, it's an RFP after all
[15:37:04] just saying, probably not worth spending a lot of time on
[15:37:11] others are good though, latisys, servercentral especially
[15:37:29] I think leslie contacted those also
[15:37:36] ksnider: Do you want to ask cmjohnson1 to expend the effort of switching the controller on labstore3 for further testing, or do we just punt on this because we have better things to spend our time on?
[15:38:43] (PS1) ArielGlenn: woops, needed fully qualified class name [operations/puppet] - https://gerrit.wikimedia.org/r/91609
[15:38:47] mark: I was trying to cast a wide net. :) I don't think I chose any resellers - all of those I reached out to run their own facilities, AFAIK. :)
[15:39:20] servercentral came through me weirdly enough
[15:39:25] Coren: As long as the server isn't going to end up a pariah, I'm happy to leave the testing wherever you both feel is appropriate
[15:39:26] probably the only one I'll bring though
[15:39:44] apergos: Ah, apologies then, I reached out to them as well
[15:40:03] ksnider: Whenever it gets transported to $future_location would be fine with me; the switch can occur when it gets racked.
[15:40:03] coordinating on the ticket is good though
[15:40:06] ok well no harm done I'm sure
[15:40:37] mark: Yep!
[15:41:00] (CR) ArielGlenn: [C: 2] woops, needed fully qualified class name [operations/puppet] - https://gerrit.wikimedia.org/r/91609 (owner: ArielGlenn)
[15:46:05] (PS1) coren: Add rsync daemon to labstore[34] [operations/puppet] - https://gerrit.wikimedia.org/r/91611
[15:52:43] ottomata: heya
[15:52:53] ottomata: how's openjdk? do we use sun java anywhere now?
[15:53:13] noooooooooooooooooooo :D
[15:53:23] ;)
[15:54:18] so are you sure kafka/cdh4 works fine with openjdk?
[15:54:40] also btw, I noticed yesterday that cloudera's repositories have cdh 4.4 now
[15:57:36] (CR) coren: [C: 2] Add rsync daemon to labstore[34] [operations/puppet] - https://gerrit.wikimedia.org/r/91611 (owner: coren)
[15:59:15] blergh
[15:59:18] lsearchd uses it
[15:59:30] and I'm not going to switch it so close to its death
[16:00:37] ori-l: ping
[16:01:03] apergos: You can haz rsync.
[16:02:19] thanks d00d [16:07:39] (03PS1) 10Krinkle: Enable $wgUseRCPatrol on mediawiki.org [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91614 [16:08:27] ottomata: hey, the kafka package installation has failed on analytics1011 [16:08:46] ottomata: looks like a package bug, but it's also an ops issue with kafka not being able to install/run there [16:09:12] hm [16:10:36] don't even think it needs to be there [16:10:39] (03PS1) 10ArielGlenn: don't use nfs for rsync of pagecount data from dumps hosts [operations/puppet] - 10https://gerrit.wikimedia.org/r/91616 [16:11:07] dunno, there's an icinga alert about broken packages for a while, so I was investigating [16:11:14] (03CR) 10jenkins-bot: [V: 04-1] don't use nfs for rsync of pagecount data from dumps hosts [operations/puppet] - 10https://gerrit.wikimedia.org/r/91616 (owner: 10ArielGlenn) [16:11:18] ja trying to remove it now, i think i see the bug too [16:11:41] yeah, hm [16:11:44] feeling too crappy to do productive work, so chasing down alerts seemed like a good use of my time :P [16:11:56] RECOVERY - DPKG on analytics1011 is OK: All packages OK [16:11:59] i think i just fixed, but had to do so manually [16:12:09] the mirror scripts that the init script looked for didn't exist [16:12:14] because the package only installs example ones [16:12:29] i wish packages didn't start/stop packages by default, hm [16:12:35] paravoid: https://www.mediawiki.org/wiki/User:GWicke/Notes/Storage/Testing#Dump_import.2C_600_writers [16:12:41] i think i can fix that [16:12:46] mainly that's an issue of looking for those files on stop [16:12:59] on install it won't matter, because /etc/defaul/kafka-mirror is set to disabled [16:13:00] gwicke: hey [16:13:41] gwicke: so I read the diffs from the notes page [16:13:58] gwicke: I'm not sure if the heap error and older JNA have anything to do with each other [16:14:08] (03PS2) 10ArielGlenn: don't use nfs for rsync of pagecount data from dumps hosts [operations/puppet] - 
10https://gerrit.wikimedia.org/r/91616 [16:14:16] paravoid: I'm not 100% sure either [16:14:31] after symlinking the new jna it shows up in the list of loaded jars though [16:14:38] which list? [16:14:53] the list of loaded jars is passed to java [16:15:01] check ps aux | grep java [16:15:11] oh the arguments you mean [16:15:21] yup [16:15:27] well, yeah, I suppose the init script concatenates /usr/share/cassandra/lib [16:15:47] for j in /usr/share/$NAME/lib/*.jar; do [16:15:47] [ "x$cp" = "x" ] && cp=$j || cp=$cp:$j [16:15:47] done [16:15:49] yup [16:16:09] I also upped the heap to 7000M, so it is now hard to see whether jna or the heap increase fixed it [16:16:20] heh [16:16:43] from what I read on what it does with JNA, I don't think it has anything to do with the heap [16:16:55] it just mlockall() so the kernel won't swap [16:16:57] gwicke, btw, not sure if you heard me say this, but if/when you want to puppetize cassandra, I'm happy to help! sounds fun. :) [16:17:01] if it swapped, it'd be slower, but java wouldn't crash [16:17:20] ottomata: hehehe [16:17:27] ottomata: awesome, thanks! [16:17:41] otto, our apache packager [16:17:52] or puppetizer? [16:18:23] seems easy so far [16:18:25] paravoid: also, only xenon ran out of heap before I tweaked the settings and installed latest jna [16:18:29] a single package, a few config files [16:18:39] *nod* [16:18:53] or maybe even one so far? 
[16:18:57] and disk setups [16:19:09] yup [16:19:25] for writing, having data on spinning disks certainly does not hurt at all [16:19:34] (03CR) 10ArielGlenn: [C: 032] don't use nfs for rsync of pagecount data from dumps hosts [operations/puppet] - 10https://gerrit.wikimedia.org/r/91616 (owner: 10ArielGlenn) [16:19:50] all sequential IO and all in the background [16:20:16] having the commitlog on an ssd is a good idea though [16:21:51] nod [16:21:55] that's not a problem [16:23:20] * paravoid reads http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0 [16:23:21] something is not right about the hints table on praseodymium- getting an exception when trying to compact it [16:28:58] (03CR) 10Mark Bergsma: "I'm not sure I agree. Other than Zero, why does Varnish need to resolve the real client.ip? MediaWiki already does that itself for its nee" [operations/puppet] - 10https://gerrit.wikimedia.org/r/88261 (owner: 10Dr0ptp4kt) [16:29:59] without compaction, the compression ratio including metadata, indexes etc is 31% of the input text size [16:30:09] I saw the diff :) [16:30:27] that's not too bad [16:30:39] *nod* [16:31:25] would be better with lzma [16:32:17] should compare this with ExternalStore [16:36:35] * Nemo_bis is playing with lrzip and lzop https://archive.org/details/ftp-ftp.hp.com_ftp1 [16:36:55] "LZO is fast to the point that it can speed up some Hadoop tasks by reducing the amount of I/O" http://www.schmitztech.com/tech/archives/5 [16:38:09] gwicke: we could use a separate machine for the client, if you want to stress test cassandra [16:38:52] paravoid: yes, that might make sense [16:39:04] would also simplify the client management a bit [16:40:01] for reads, I am considering to build a map of title/revision and md4 [16:40:06] *md5* [16:40:38] and then do random requests according to some distribution and verify the md5 for the returned wikitext [16:41:32] are you worrying about corruption [16:42:52] not that much, but would still be 
good to double-check [16:50:03] yurik: who do i talk to find out about philippine zero? [16:50:18] kul? [16:54:33] jeremyb: best first bet [16:56:26] (03CR) 10Eloquence: [C: 04-1] "Let's not hardcode the name "EdwardsBot" but use a descriptive name like MessengerBot, otherwise this just becomes yet another piece of in" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91344 (owner: 10Legoktm) [16:57:37] hah. Elsie ^^ [16:57:47] paravoid: pong [16:58:02] ori-l: hello [16:58:20] wanted to ask you about the ganglia mobile view thing [16:58:32] it isn't worth the gerrit ping-pongs ;) [16:59:48] well, juliusz is interested in making his team more aware of performance issues by having useful dashboards [17:00:06] i don't know yet what exactly he has in mind, and i don't think he does either yet [17:00:16] I saw we have a similar thing for VE [17:00:26] ori-l: put me in the same boat [17:00:49] ori-l: and I'd be willing to have the extra monitor setup like Eloquence does for such things for platform/core ;) [17:00:58] greg-g: http://www.counter-currents.com/wp-content/uploads/2010/09/boschfools.jpg ? [17:01:20] I.... [17:01:32] don't like that boat [17:01:57] i was just teasing [17:02:00] but paravoid, yeah [17:02:01] are you call juliusz and I fools? [17:02:04] greg-g, no joke, if you want to build a setup like that, just ask [17:02:16] Eloquence: will do, need data/graphs first :) [17:02:16] Eloquence: the monitors or the boat? [17:02:17] i've been saying that for ops for years [17:02:23] ori-l++ [17:02:24] there should be an ops/noc wall [17:02:32] ori-l, you can have a boat. greg can have monitors. how's that? :) [17:02:33] mark: yes, somewhere at least [17:02:42] Eloquence: OK! [17:02:46] I feel like I'm getting the short end of the stick here [17:02:57] greg-g: you can install the monitors on my boat [17:03:00] ori-l: so one of my points is, I'd rather not have a view per team [17:03:00] and we can share it [17:03:00] do we get sails with our boats? 
[17:03:08] let's find out which views make sense in general [17:03:20] for you especially, and for dev teams as well [17:03:42] paravoid: well, yeah, there could be a unified/top priorities view, but I don't think Chad should have to care about network/bandwidth issues that you/leslie care about [17:03:45] my old reqstats/trafficstats/latencystats would have made a lot of sense on a noc wall [17:03:54] and now all the mediawiki metrics [17:04:34] i.e. let's not apply conway's law toganglia views :P [17:04:38] paravoid: OK, but it seems a bit bizarre for the -1 hammer to come down on Juliusz specifically here [17:05:07] gerrit change url? [17:05:11] (so I can play along) [17:05:29] yup I don't want to block juliusz specifically [17:05:32] hence my ping here [17:05:37] let's solve this now [17:05:41] greg-g: you can find it (i don't have it open) but I'm sure paravoid and I can work it out [17:05:47] * greg-g nods [17:06:02] sorry, that came across differently than i had intended [17:06:24] paravoid: hi, a 2sec question: to upload a deb package to debian.org do I have to be a DD or being listed in upload field is enough to whitelist me ? 
:° [17:06:34] ori-l: understood, though [17:06:42] the problem with ganglia is that it is host-oriented rather than product-oriented, and that makes discovery difficult [17:07:02] so product-oriented monitoring needs the json view 'crutch' a bit more [17:07:16] torrus was sort of nice for the more ops centric metrics [17:07:21] but not for what you guys are looking at [17:07:46] graphite might be a better fit for 'product-oriented' metrics, though [17:08:22] * mark out for groceries [17:08:32] * ori-l waves [17:08:34] and http://square.github.io/cubism/ would provide a nice wall, it fetches data from graphite to display some stocks alike charts [17:08:48] hashar: we had that set up for a while :P [17:09:16] ganglia is still nice for the host based view though [17:09:17] horizon charts look pretty but are pretty confusing [17:09:29] but I have the feeling graphite is more adapted for "business" monitoring [17:09:42] paravoid: thoughts? [17:10:19] was looking for a way to detect issues in metrics but haven't found anything very useful, might end up having to write our piece of software to poll graphite and raise alarms on some threshold [17:10:39] what's "product-oriented metrics"? navtiming data? more? [17:11:41] statsd data are on both graphite and ganglia but that's not the case for the rest of the metrics [17:11:54] paravoid: VisualEditor, for example [17:11:55] and that includes mediawiki (at least until you convert mediawiki to statsd :) [17:12:18] so if you want a dashboard that also has mediawiki performance data, ganglia wouldn't help [17:13:05] ori-l: I am pretty sure the http://ganglia.wikimedia.org/latest/?r=hour&cs=&ce=&tab=v&vn=VisualEditor view could be achieved with MediaWiki profiling data from graphite [17:14:07] hashar: no, it couldn't; these metrics are generated from front-end JavaScript code [17:15:03] paravoid: I agree re: graphite [17:15:40] ori-l: btw, do you have a link to the VE metrics? 
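(Note on the 17:10:19 idea of polling graphite and raising alarms on a threshold: the following is a minimal sketch, not anything the participants wrote or deployed. It assumes the JSON shape returned by Graphite's render endpoint with format=json, i.e. a list of series, each with a "target" and a list of [value, timestamp] datapoints where value may be null. The function names, the threshold dictionary, and the sample numbers are all hypothetical; fetching the JSON over HTTP and wiring the result into an alerting system are left out.)

```python
import json


def latest_value(datapoints):
    """Return the most recent non-null value, or None if there is none.

    Graphite emits [value, timestamp] pairs; value is None for
    intervals in which no data was recorded.
    """
    for value, _timestamp in reversed(datapoints):
        if value is not None:
            return value
    return None


def check_thresholds(render_json, thresholds):
    """Return (target, value, limit) for every series above its limit.

    render_json: text of a render response requested with format=json.
    thresholds:  hypothetical dict mapping target name -> maximum
                 acceptable value; targets without an entry are skipped.
    """
    alerts = []
    for series in json.loads(render_json):
        target = series["target"]
        limit = thresholds.get(target)
        if limit is None:
            continue
        value = latest_value(series["datapoints"])
        if value is not None and value > limit:
            alerts.append((target, value, limit))
    return alerts


if __name__ == "__main__":
    # Made-up sample data in the render JSON shape (not real measurements).
    sample = json.dumps([
        {"target": "example.metric.count",
         "datapoints": [[120.0, 1382720400], [480.0, 1382720460], [None, 1382720520]]},
    ])
    for target, value, limit in check_thresholds(sample, {"example.metric.count": 300}):
        print("ALERT %s: %s > %s" % (target, value, limit))
```

(Taking the latest non-null datapoint rather than the last one avoids false "no data" readings, since the newest interval is often still empty when polled.)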
[17:15:42] so I think the question boils down to what do we/others want these dashboards to be about [17:16:09] gwicke: http://ganglia.wikimedia.org/latest/?r=week&cs=&ce=&tab=v&vn=VisualEditor [17:16:12] also in graphite [17:16:25] but not all datapoints are plotted yet [17:16:49] ah [17:17:00] is https://meta.wikimedia.org/wiki/Research:VisualEditor up to date? [17:17:01] you might want to correct the labels a bit [17:17:02] I don't mind having such views in Ganglia at all [17:17:30] or change what is measured to reflect them- did Roan ping you about that? [17:17:31] but I'd prefer not having a view per each team with each of those containing two navtiming graphs or whatever [17:17:40] paravoid: from my perspective, the answer is: - we don't know yet, - because we're pretty new to this, - but you don't cultivate experience and wisdom by preventing people from trying things out, and i think that's what juliusz is doing (trying things out) [17:19:35] gwicke: roan did, yes - there are a number of refinements and additional metrics we want to plot, give it a bit of time [17:19:47] ori-l: james had some hit rate graphs yesterday, are those public? [17:20:25] gwicke: they're in graphite -- graphite.wikimedia.org, use your labs credentials [17:20:57] trying things out is fine, but let's have a sense of where we need to be going too :) [17:21:27] this isn't about this patchset per se, but it seemed as an appropriate time to ask these questions [17:21:49] it doesn't seem far-fetched to me that 'mobile' provides enough scope for a ganglia view of operational data [17:22:06] i think it's possible that it remains a two-graph nav timing thing, in which case your suspicion will have been right [17:22:17] should I include mobile varnishes hits/sec there too? 
[17:22:27] i think that would be very nice, yeah [17:22:36] dunno, depends on who's viewing the graphs [17:22:41] and for what purpose [17:24:12] my labs login does not work on graphite for some reason [17:24:27] brb [17:24:31] are you in the wmf group? [17:24:41] gwicke: are you using your wiki username? [17:24:57] can we start this inquiry with http://ganglia.wikimedia.org/latest/?r=week&cs=&ce=&tab=v&vn=analytics-data , http://ganglia.wikimedia.org/latest/?r=week&cs=&ce=&tab=v&vn=kafka, or http://ganglia.wikimedia.org/latest/?r=week&cs=&ce=&tab=v&vn=udp2log-analytics ? [17:25:17] jeremyb: I guess so, and yes I tried all permutations thereof [17:27:29] (03CR) 10MZMcBride: "Global renames are still super-painful." [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91344 (owner: 10Legoktm) [17:28:26] (03PS2) 10MZMcBride: Enable MassMessage on all wikis [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91344 (owner: 10Legoktm) [17:29:08] jeremyb: I tried the same user/pass that works at wikitech.wikimedia.org [17:30:41] (03CR) 10Legoktm: "Using EdwardsBot is easier since most communities are already familiar with that name, plus I don't think anyone wants to move/create new " [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91344 (owner: 10Legoktm) [17:30:52] (03CR) 10Andrew Bogott: [C: 032] "This pretty much works now." [operations/puppet] - 10https://gerrit.wikimedia.org/r/91499 (owner: 10Andrew Bogott) [17:31:36] bbiab [17:31:44] gwicke: well how about my other question then? are you in the wmf group? [17:31:52] i'll answer for you, no, you're not :) [17:31:52] $ ssh bastion1.pmtpa.wmflabs groups gwicke 2>/dev/null [17:31:53] gwicke : svn project-bastion project-visualeditor project-wikitrust project-math [17:32:56] I see, would I normally be in the wmf group? [17:33:22] no, it's not an automatic part of any process AFAICT. it's added as needed [17:33:29] i.e. not HR [17:33:38] k, could you add me? 
[17:33:39] nor all wmf mortals [17:33:47] hah, i'm not even in the group myself! [17:33:50] so, no [17:34:14] who would be the one to ask? [17:34:20] Ryan? [17:34:30] he can do it. i guess i'd start with RT duty person [17:34:38] topic says mutante [17:34:42] ahh, mutante! [17:34:46] heh, ehm [17:34:54] hehe [17:35:36] I'd like to see graphite stats, for which I'd need to be in the labs wmf group [17:36:00] gwicke: you know about gdash? [17:36:42] I saw it before [17:36:57] we also use some graphs from graphite by URL [17:37:05] (03CR) 10MZMcBride: "I don't care which account name is used. I'm happy to surrender "EdwardsBot" if that's deemed the best choice. My only related thought is " [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91344 (owner: 10Legoktm) [17:37:09] those apparently don't require a login currently [17:37:18] https://graphite.wikimedia.org/render/?width=1486&height=641&_salt=1371161654.988&from=-24hours&target=stats.job-insert-ParsoidCacheUpdateJob.count&target=stats.job-pop-ParsoidCacheUpdateJob.count for example [17:37:19] gwicke: i tried [17:37:22] root@formey:~# modify-ldap-group --addmembers=gwicke wmf [17:37:27] i hope that was it [17:37:49] gwicke: right, they don't [17:37:59] mutante: yes, worked. thanks! [17:38:05] huh? [17:38:05] cool:) [17:38:07] jeremyb: thanks too! [17:38:09] !log added gwicke to wmf ldap group [17:38:20] he's not a member yet for me. i guess it's in the local machine cache [17:38:23] Logged the message, Master [17:38:25] <^d> Heh, I don't think I've ever !logged that :p [17:38:48] yeah, he is on bastion2 [17:38:58] ah, bastion1 now too [17:40:31] back [17:41:51] hah, tfinc [17:43:36] <^d> I love how he got banned from a public channel too. [17:49:00] (03PS1) 10coren: Add some locales to locales::international [operations/puppet] - 10https://gerrit.wikimedia.org/r/91634 [17:50:10] Anyone see a probem with ^^? 
[17:50:57] paravoid: copper will run out of room around this time tomorrow if swift-repl hasn't finished by then [17:51:03] (log looks normal) [17:51:08] Coren: nah, i added that once for planet and something else, and we added some missing ones repeatedly [17:51:26] Yeah, I use it in tool labs too. [17:51:30] apergos: ok, we can restart tomorrow [17:51:36] (thanks) [17:51:44] long as it's on your rader [17:51:46] radar [17:51:50] Coren: it should also automatically execute localegen [17:52:09] It does, from generic::locales::international [17:52:14] yep [17:52:40] (03CR) 10coren: [C: 032] "mutante likes it, that's good enough for me." [operations/puppet] - 10https://gerrit.wikimedia.org/r/91634 (owner: 10coren) [17:53:14] oh yea, the other one was on role/pdf even [17:56:29] ^d: i was assuming it was too frequent reconnects and no one remembered to remove it [17:58:14] !log reedy synchronized php-1.23wmf1 [17:58:15] PROBLEM - Host ms-be1006 is DOWN: PING CRITICAL - Packet loss = 100% [17:58:26] Logged the message, Master [17:58:50] !log shutting down ms-be1006, to be used by cmjohnson1 for testing firmware changes [17:59:02] Logged the message, Master [17:59:51] RobH: :) [17:59:57] cool [17:59:59] Hm. Something happened to the ugly ssh::bastion class hack. It was probably well-deserved, but tool labs was using it. [18:00:53] i guess when ssh moved to a module [18:00:56] !log reedy synchronized php-1.23wmf1 'make sure' [18:01:03] or was it [18:01:07] Logged the message, Master [18:01:42] !log reedy synchronized docroot and w [18:01:43] The last Puppet run was at Fri Oct 18 14:52:24 UTC 2013 (8829 minutes ago). [18:01:52] Logged the message, Master [18:02:01] No, because that broke puppet for labs; so some time after the last run. 
[18:02:28] And the module seems to date from the 16th [18:02:44] (03PS1) 10Reedy: Add and update all symlinks [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91637 [18:04:39] (03CR) 10Reedy: [C: 032] Add and update all symlinks [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91637 (owner: 10Reedy) [18:04:45] (03PS1) 10Dzahn: remove search21-36, keep search13-21 & search37-51 [operations/dns] - 10https://gerrit.wikimedia.org/r/91638 [18:05:33] aha. It was andrewbogott [18:05:36] (03CR) 10Dzahn: [C: 04-1] "merge after completed decom in RT #6073" [operations/dns] - 10https://gerrit.wikimedia.org/r/91638 (owner: 10Dzahn) [18:05:44] (03CR) 10Reedy: [V: 032] Add and update all symlinks [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91637 (owner: 10Reedy) [18:06:00] * andrewbogott takes the blame! [18:06:10] Actually, I thought that I broke it and then fixed it. [18:06:15] Is it not fixed? [18:06:44] andrewbogott: There was a reference left in modules/toollabs, but installing mosh wasn't /all/ it did, there are a few templates that depended on it too. [18:06:56] um... [18:06:57] * andrewbogott looks [18:08:43] sshd_banner has/had some conditionality on the class having been included [18:09:20] !log reedy Started syncing Wikimedia installation... : testwiki to 1.23wmf1 and build l10n cache [18:09:33] Logged the message, Master [18:10:20] andrewbogott: (Incidentally, the reference isn't in the puppet manifests, it's a class included from the wikitech interface) [18:10:37] oooh, that would explain my not seeing it [18:10:43] although I should've seen it in ldap [18:11:00] Well, wait, how can that be? There's conditional logic in the wikitech interface? [18:11:19] No, but since the class is now gone, attempt to include it from the interface break. [18:11:47] Ok… but you said before that there were templates that depended on it? 
[18:12:21] That's been excised since, apparently, which is why my exec nodes now spew and /etc/issue and break some script because of the noise that can't be suppressed. [18:12:59] Which is why this was made conditional to ssh::bastion being included so only bastions would give the 'if you can't access...' spam. [18:13:29] (03PS1) 10Dzahn: fix duplicate account::ebernhardson on stat1 [operations/puppet] - 10https://gerrit.wikimedia.org/r/91643 [18:14:15] Coren: i was thinking we could make it even more conditional. have e.g. bastion3 have no spam [18:14:32] Coren: this can/should be all handled via toollabs::bastion though, yes? [18:15:03] andrewbogott: Well, the idea was that no ssh server /should/ be spammy unless made to be so by being a bastion. [18:15:30] Can someone delete /usr/local/apache/common/php-1.22wmf2 on mw1089 please? There's a few objects without mwdeploy write permissions [18:16:01] mutante: when you get a chance could you have a gander at noc@ again? [18:16:27] Coren: OK, but that class wasn't actually used anywhere in production... [18:16:33] And the labs instances all already included the banner. [18:16:48] Reedy: done [18:16:57] thanks :) [18:17:00] So… sorry, i'm being dim, still don't understand what I actually broke, short of that one checkbox in wikitech [18:17:08] mutante: turns out kul has in fact had recent contact with that carrier (as a prospect) but he didn't reply yet to my latest mail so idk if he mailed them [18:17:36] andrewbogott: You've got it exactly backwards -- the banner was suppressed for every ssh server /except/ those which included the class. [18:18:02] And where was the logic that did that? [18:18:34] I'm trying to find it now, which is a bit harder because it's gone. :-) [18:19:19] Ah, the refactor to the ssh module took the banner out of the bastion and put it everywhere. [18:19:26] jeremyb: nothing for from:smart.com.ph, or to:noc .. 
latest noc@ mail is a girl looking for any work for her boyfriend "all over Switzerland" [18:19:26] Hm, I wonder why he did that... [18:19:30] Anyway, I can fix that! [18:20:05] andrewbogott: Yes, that's the effect. :-) [18:20:11] Coren: but we only want the banner on labs instances still, right? [18:20:16] I mean, labs + bastion [18:20:37] andrewbogott: I see Ryan having deployed that to prod in a changeset, at least that's what the log says. [18:20:43] jeremyb: Leslie replied to "Jane" , asking if she was a Smart customer. no reply yet , Leslie also posted on ops@ [18:20:51] andrewbogott: Maybe it was added to some production bastions? [18:20:54] But it was marked with if ($::realm == 'labs') [18:21:08] mutante: whoops, i turned jane into a ticket and mailed her too [18:21:26] asked her to contact smart tech support [18:21:35] 'Merge "Make pre-login sshd banner optional" into production' [18:21:54] Hm. Not clear what Ryan meant by that. [18:21:56] Coren, here is the patch in question https://gerrit.wikimedia.org/r/#/c/90098/8/manifests/ssh.pp [18:22:07] RECOVERY - Host ms-be1006 is UP: PING OK - Packet loss = 0%, RTA = 0.28 ms [18:22:08] I think the 'into production' thing is a canned gerrit string [18:22:37] !log reedy Finished syncing Wikimedia installation... : testwiki to 1.23wmf1 and build l10n cache [18:22:39] Ah! At any rate, I think only labs bastion were ever intended to have the banner in the first place. [18:22:49] Logged the message, Master [18:23:01] jeremyb: in general, should i always just forward noc and dns-admin stuff like that to OTRS and never reply direct? hmmm [18:23:04] So rather that suppress it selectively, it would probably be wiser to just include it selectively. [18:25:02] Merge "X" into production , just the branch name yea, and we dont have test anymore [18:25:42] mutante: depends. otrs is not too technical and some tech stuff gets bad replies (e.g. 
from people that don't understand the question or refer people to the wrong other place to ask). but there's also a tech issues queue and *if* it gets moved there then there's a decent chance that one of the few techy people will see it. [18:25:50] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: phase1 wikis to 1.23wmf1 [18:26:02] Logged the message, Master [18:26:20] mutante: we could set up an address to go straight to tech issues if you want to forward stuff. (but then try to be selective and don't forward edit requests to tech issues) [18:26:31] jeremyb: i see, so it's just use good judgment, nod [18:26:50] mutante: i guess [18:27:24] * Reedy stabs APC [18:27:28] !log reedy rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias to 1.22wmf22 [18:27:40] Logged the message, Master [18:28:29] (03PS1) 10Reedy: phase1 wikis to 1.23wmf1. Wikipedias to 1.22wmf22 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91645 [18:28:44] andrewbogott: Can you remove the obsolete class from instances including it LDAP-side or should I go through wikitech? [18:28:50] (03CR) 10Reedy: [C: 032] phase1 wikis to 1.23wmf1. Wikipedias to 1.22wmf22 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91645 (owner: 10Reedy) [18:29:28] Coren: through wikitech will be easier… it's only a couple of instances, right? [18:29:31] But, hang on a second... [18:29:43] (03PS1) 10Andrew Bogott: Only display login banner on labs bastions. 
[operations/puppet] - 10https://gerrit.wikimedia.org/r/91646 [18:30:18] (03PS2) 10Dzahn: remove search21-36, keep search13-21 & search37-51 [operations/dns] - 10https://gerrit.wikimedia.org/r/91638 [18:30:30] Coren: If you approve ^ then you'll want to add role::labsbastion instead of ssh:bastion to those instances [18:30:59] (03PS3) 10Dzahn: remove search21-36, keep search13-20 & search37-51 [operations/dns] - 10https://gerrit.wikimedia.org/r/91638 [18:31:16] andrewbogott: care to /j #wikimedia-tech ? :) [18:31:42] jeremyb: how long as that been there? [18:31:45] (03CR) 10coren: [C: 032] "Clearly the right way to go." [operations/puppet] - 10https://gerrit.wikimedia.org/r/91646 (owner: 10Andrew Bogott) [18:31:46] not that I'm not here too [18:32:24] (03Merged) 10jenkins-bot: phase1 wikis to 1.23wmf1. Wikipedias to 1.22wmf22 [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91645 (owner: 10Reedy) [18:32:37] PROBLEM - Host ms-be1006 is DOWN: PING CRITICAL - Packet loss = 100% [18:32:47] andrewbogott: I go browse wikitech for stragglers now. [18:32:51] (03CR) 10Dzahn: [C: 032] fix duplicate account::ebernhardson on stat1 [operations/puppet] - 10https://gerrit.wikimedia.org/r/91643 (owner: 10Dzahn) [18:33:48] andrewbogott: it predates this channel [18:33:54] huh [18:35:15] this channel is very recent [18:36:53] well #-tech is at least ~6-7 years old (i.e. as old as i can remember) [18:36:58] this channel is 1-2 years? [18:37:12] And yet I've gotten along so well without it :) [18:37:19] must be more than 2 [18:37:26] andrewbogott: hehe [18:37:51] Coren: OK, merged. Let me know if things remain broken... [18:38:07] mediawiki.org Special:Search : "An error has occurred while searching: We could not complete your search due to a temporary problem. Please try again later." [18:40:20] spagewmf: i hear new indices are being built right now [18:40:20] Works in tools; although the banner itself isn't removed by the patch and needs nuking. No way around that. 
[18:40:26] d^ [18:41:54] Coren, I could ensure=>absent? [18:44:18] (03PS4) 10Reedy: Enable VisualEditor for NS_FILE, NS_HELP, NS_CATEGORY [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/90923 (owner: 10Jforrester) [18:44:24] (03CR) 10Reedy: [C: 032] Enable VisualEditor for NS_FILE, NS_HELP, NS_CATEGORY [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/90923 (owner: 10Jforrester) [18:44:36] (03Merged) 10jenkins-bot: Enable VisualEditor for NS_FILE, NS_HELP, NS_CATEGORY [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/90923 (owner: 10Jforrester) [18:51:46] (03PS5) 10Reedy: cawiki: Enable VisualEditor for Portal: and Viquiprojecte: [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91197 (owner: 10Jforrester) [18:51:51] (03CR) 10Reedy: [C: 032] cawiki: Enable VisualEditor for Portal: and Viquiprojecte: [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91197 (owner: 10Jforrester) [18:52:04] (03Merged) 10jenkins-bot: cawiki: Enable VisualEditor for Portal: and Viquiprojecte: [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91197 (owner: 10Jforrester) [18:53:16] (03PS5) 10Reedy: enwiki: Enable VisualEditor for Portal: and Book: [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91198 (owner: 10Jforrester) [18:53:20] (03CR) 10Dzahn: [C: 032] remove nomcom wiki [operations/dns] - 10https://gerrit.wikimedia.org/r/91125 (owner: 10Dzahn) [18:53:22] (03CR) 10Reedy: [C: 032] enwiki: Enable VisualEditor for Portal: and Book: [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91198 (owner: 10Jforrester) [18:53:44] (03Merged) 10jenkins-bot: enwiki: Enable VisualEditor for Portal: and Book: [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91198 (owner: 10Jforrester) [18:53:53] !log DNS update - remove nomcom wiki [18:54:07] Logged the message, Master [18:54:08] whyyy it was so useful :P [18:54:16] hah [18:55:08] 
https://www.mediawiki.org/wiki/Special:CodeReview/MediaWiki/64002 [18:55:40] apergos: ^ eh, you linked to that , was it right URL ? [18:55:50] (03PS2) 10Reedy: Enabled the abusefilter block option for English Wikinews [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/90540 (owner: 10Vogone) [18:55:56] (03CR) 10Reedy: [C: 032] Enabled the abusefilter block option for English Wikinews [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/90540 (owner: 10Vogone) [18:56:01] gerrit rev [18:56:13] (03Merged) 10jenkins-bot: Enabled the abusefilter block option for English Wikinews [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/90540 (owner: 10Vogone) [18:56:14] got linkified to that [18:56:30] https://gerrit.wikimedia.org/r/#/c/64002/ [18:56:43] yes, :) that one :) [18:56:46] ty [18:56:53] feelfree to stuff that link in there someplace [18:57:16] (03PS2) 10Reedy: Changed Tamil Wikiquote logo [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/90538 (owner: 10Vogone) [18:57:21] (03CR) 10Reedy: [C: 032] Changed Tamil Wikiquote logo [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/90538 (owner: 10Vogone) [18:57:39] (03CR) 10Dzahn: "11:59 < apergos> https://gerrit.wikimedia.org/r/#/c/64002/" [operations/dns] - 10https://gerrit.wikimedia.org/r/91125 (owner: 10Dzahn) [18:57:41] (03Merged) 10jenkins-bot: Changed Tamil Wikiquote logo [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/90538 (owner: 10Vogone) [18:57:54] dumb question: are graphite graphs always in units per minute or per second? [18:57:57] searching is broken on wikitech: https://wikitech.wikimedia.org/w/index.php?search=testing&title=Special%3ASearch [18:57:58] (03PS10) 10Ottomata: (WIP) Initial Debian version [operations/software/varnish/varnishkafka] (debian) - 10https://gerrit.wikimedia.org/r/78782 (owner: 10Faidon Liambotis) [18:58:08] andrewbogott: ... 
you couldn't ensure => 'absent' without conflicting in the cases where the class /is/ included, I think. [18:58:24] yeah, it would require various logicy things :) [18:58:24] (03PS2) 10Reedy: Add import sources for frwikivoyage [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/90677 (owner: 10TTO) [18:58:27] andrewbogott: It took me all of 60s to make the rounds and nuke the leftovers, so I'm pretty sure it's not necessary. [18:58:31] (03CR) 10Reedy: [C: 032] Add import sources for frwikivoyage [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/90677 (owner: 10TTO) [18:58:34] ok, great [18:58:36] (03PS11) 10Ottomata: (WIP) Initial Debian version [operations/software/varnish/varnishkafka] (debian) - 10https://gerrit.wikimedia.org/r/78782 (owner: 10Faidon Liambotis) [19:00:14] who would be the right person to ping about search being broken on Wikitech? [19:00:20] any ideas? [19:00:42] (03CR) 10Dzahn: [C: 04-1] "not what was intended. see explanation from ottomata on the ticket" [operations/puppet] - 10https://gerrit.wikimedia.org/r/91067 (owner: 10Dzahn) [19:00:50] doesn't look like Ryan is around [19:00:53] kaldari: ^demon|lunch or Manybubbles [19:01:30] kaldari: well ryan just set up cirrus at another of his wikis. 
no idea if he did the same with wikitech [19:02:06] (03Merged) 10jenkins-bot: Add import sources for frwikivoyage [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/90677 (owner: 10TTO) [19:02:45] (03PS2) 10Reedy: (bug 55909) Add an alias for NS_USER on kowiki [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/90708 (owner: 10Odder) [19:02:49] (03CR) 10Reedy: [C: 032] (bug 55909) Add an alias for NS_USER on kowiki [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/90708 (owner: 10Odder) [19:02:49] kaldari: i hear chad was building new indices just a little while ago [19:03:04] (03PS5) 10Ori.livneh: Add Mathoid module (TeX -> MathML / SVG conversion web service) [operations/puppet] - 10https://gerrit.wikimedia.org/r/90733 (owner: 10Physikerwelt) [19:03:05] but also not sure if it included wikitech [19:03:16] I'll bug him when he's back from lunch then [19:03:18] thanks! [19:04:02] (03CR) 10Ori.livneh: "This patch needs to be further modified to use git-deploy. Installing node dependencies via npm may not be acceptable, either. But I clean" [operations/puppet] - 10https://gerrit.wikimedia.org/r/90733 (owner: 10Physikerwelt) [19:06:03] (03CR) 10Edenhill: [C: 031] "(1 comment)" [operations/software/varnish/varnishkafka] (debian) - 10https://gerrit.wikimedia.org/r/78782 (owner: 10Faidon Liambotis) [19:06:05] (03PS12) 10Ottomata: (WIP) Initial Debian version [operations/software/varnish/varnishkafka] (debian) - 10https://gerrit.wikimedia.org/r/78782 (owner: 10Faidon Liambotis) [19:07:17] (03Merged) 10jenkins-bot: (bug 55909) Add an alias for NS_USER on kowiki [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/90708 (owner: 10Odder) [19:07:26] (03CR) 10Faidon Liambotis: [C: 032] "Merging this for now, let's iterate when we have a clearer picture of where we need to go." 
[operations/puppet] - 10https://gerrit.wikimedia.org/r/91079 (owner: 10JGonera) [19:07:42] (03PS3) 10Faidon Liambotis: Add mobile views to ganglia [operations/puppet] - 10https://gerrit.wikimedia.org/r/91079 (owner: 10JGonera) [19:08:28] (03PS1) 10Andrew Bogott: Install the latest python-flask, so it pulls from labsdebrepo. [operations/puppet] - 10https://gerrit.wikimedia.org/r/91651 [19:08:53] (03CR) 10Edenhill: [C: 031] "(2 comments)" [operations/software/varnish/varnishkafka] (debian) - 10https://gerrit.wikimedia.org/r/78782 (owner: 10Faidon Liambotis) [19:09:44] (03PS2) 10Reedy: Alphasort echowikis.dblist [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91311 [19:10:16] (03CR) 10Reedy: [C: 032] Alphasort echowikis.dblist [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91311 (owner: 10Reedy) [19:12:02] (03CR) 10Andrew Bogott: [V: 032] Install the latest python-flask, so it pulls from labsdebrepo. [operations/puppet] - 10https://gerrit.wikimedia.org/r/91651 (owner: 10Andrew Bogott) [19:12:29] !log Created WikiLove extension tables on kowiki [19:12:33] (03CR) 10Andrew Bogott: [C: 032] Install the latest python-flask, so it pulls from labsdebrepo. [operations/puppet] - 10https://gerrit.wikimedia.org/r/91651 (owner: 10Andrew Bogott) [19:12:40] Logged the message, Master [19:13:32] (03CR) 10Faidon Liambotis: [C: 04-1] "Echoing Ori: git cloning Math isn't acceptable, this should be deployed via the processes that we have for such things, namely git-deploy." 
[operations/puppet] - 10https://gerrit.wikimedia.org/r/90733 (owner: 10Physikerwelt) [19:13:59] (03CR) 10jenkins-bot: [V: 04-1] Alphasort echowikis.dblist [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91311 (owner: 10Reedy) [19:14:14] (03CR) 10Faidon Liambotis: [C: 032] Add mobile views to ganglia [operations/puppet] - 10https://gerrit.wikimedia.org/r/91079 (owner: 10JGonera) [19:14:27] (03CR) 10Reedy: [V: 032] Alphasort echowikis.dblist [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91311 (owner: 10Reedy) [19:14:28] jeremyb: dfoy [19:14:51] (03PS2) 10Reedy: Enabled WikiLove extension on kowiki [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/90699 (owner: 10Vogone) [19:14:56] (03CR) 10Reedy: [C: 032] Enabled WikiLove extension on kowiki [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/90699 (owner: 10Vogone) [19:16:16] yurik_: kul replied once to me and now i'm waiting on him again. idk what TZ he's in [19:16:33] jeremyb: kul is taking a baby leave [19:16:39] ohhhhh [19:17:01] email dfoy@w... [19:17:07] ok [19:17:30] he replied talking about stuff he did last week and is planning on doing next week [19:17:35] so no clues there! :P [19:18:10] (03Merged) 10jenkins-bot: Enabled WikiLove extension on kowiki [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/90699 (owner: 10Vogone) [19:18:33] yurik_: cc you? or no? 
[19:18:41] sure [19:18:53] k [19:20:33] (03PS2) 10Reedy: Update size related dblists [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91312 [19:20:39] (03CR) 10Reedy: [C: 032] Update size related dblists [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91312 (owner: 10Reedy) [19:21:02] yurik_: sent [19:21:13] PROBLEM - MySQL Slave Delay on db1047 is CRITICAL: CRIT replication delay 318 seconds [19:21:13] PROBLEM - MySQL Replication Heartbeat on db1047 is CRITICAL: CRIT replication delay 319 seconds [19:22:07] ori-l: you were talking about db1047 recently... see that alert [19:22:33] (03PS3) 10Dzahn: give milimetric sudo privileges on analytics nodes [operations/puppet] - 10https://gerrit.wikimedia.org/r/91067 [19:23:49] (03PS4) 10Dzahn: promote milimetric from restricted to mortals [operations/puppet] - 10https://gerrit.wikimedia.org/r/91067 [19:24:10] (03Merged) 10jenkins-bot: Update size related dblists [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91312 (owner: 10Reedy) [19:24:18] (03PS13) 10Ottomata: (WIP) Initial Debian version [operations/software/varnish/varnishkafka] (debian) - 10https://gerrit.wikimedia.org/r/78782 (owner: 10Faidon Liambotis) [19:24:26] (03CR) 10Ottomata: "(2 comments)" [operations/software/varnish/varnishkafka] (debian) - 10https://gerrit.wikimedia.org/r/78782 (owner: 10Faidon Liambotis) [19:25:27] (03PS2) 10Reedy: (bug 54828) Configure FlaggedRevs for ptwiki (take 3) [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/90893 (owner: 10Odder) [19:25:32] (03CR) 10Reedy: [C: 032] (bug 54828) Configure FlaggedRevs for ptwiki (take 3) [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/90893 (owner: 10Odder) [19:25:34] (03CR) 10Ottomata: [C: 032] promote milimetric from restricted to mortals [operations/puppet] - 10https://gerrit.wikimedia.org/r/91067 (owner: 10Dzahn) [19:25:56] (03CR) 10Dzahn: "am i right that this requirement: "This is about access to tin so that 
Dan can deploy analytics codebases, particularly the Kraken reposit" [operations/puppet] - 10https://gerrit.wikimedia.org/r/91067 (owner: 10Dzahn) [19:28:01] (03CR) 10Dzahn: [C: 032] "has approvals, good to go." [operations/puppet] - 10https://gerrit.wikimedia.org/r/91067 (owner: 10Dzahn) [19:29:32] (03Merged) 10jenkins-bot: (bug 54828) Configure FlaggedRevs for ptwiki (take 3) [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/90893 (owner: 10Odder) [19:29:33] RECOVERY - Host ms-be1006 is UP: PING OK - Packet loss = 0%, RTA = 0.23 ms [19:30:13] (03PS2) 10Reedy: Add lang and dir attributes to the Wikimedia address for Echo [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/90934 (owner: 10Amire80) [19:30:21] (03CR) 10Reedy: [C: 032] Add lang and dir attributes to the Wikimedia address for Echo [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/90934 (owner: 10Amire80) [19:34:14] (03Merged) 10jenkins-bot: Add lang and dir attributes to the Wikimedia address for Echo [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/90934 (owner: 10Amire80) [19:35:08] milimetric: welcome to software deployers [19:35:33] hey thanks mutante, I am going to be using the privilege in a *very* limited way for now [19:35:59] I'm just deploying kraken to the analytics machines once in a while [19:36:46] yep, ok, i had to confirm deploy to analytics is like deploy mw [19:36:58] yep, ottomata explained [19:43:14] (03PS2) 10Reedy: Remove wikidata.org from CORS, keep only *.wikidata.org [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/56153 (owner: 10Aude) [19:43:30] (03CR) 10Reedy: [C: 032] Remove wikidata.org from CORS, keep only *.wikidata.org [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/56153 (owner: 10Aude) [19:45:10] !log reedy synchronized database lists files: [19:45:23] Logged the message, Master [19:45:43] (03Merged) 10jenkins-bot: Remove wikidata.org from CORS, keep only *.wikidata.org 
[operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/56153 (owner: 10Aude) [19:46:34] (03PS2) 10Reedy: Changing default wmgMinimumVideoPlayerSize from 200 to 800. [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91342 (owner: 10Kaldari) [19:46:39] (03CR) 10Reedy: [C: 032] Changing default wmgMinimumVideoPlayerSize from 200 to 800. [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91342 (owner: 10Kaldari) [19:47:02] kaldari: :* [19:47:07] (03Merged) 10jenkins-bot: Changing default wmgMinimumVideoPlayerSize from 200 to 800. [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91342 (owner: 10Kaldari) [19:47:14] That was quick [19:47:37] (03PS2) 10Reedy: Remove b/c IPv6 forms [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91505 (owner: 10MaxSem) [19:47:43] paravoid: am currently looking into the 'cassandra won't stop from init' issue [19:47:44] (03CR) 10Reedy: [C: 032] Remove b/c IPv6 forms [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91505 (owner: 10MaxSem) [19:48:00] (03CR) 10Dzahn: [C: 031] "has approval from Howie now" [operations/puppet] - 10https://gerrit.wikimedia.org/r/91084 (owner: 10Dzahn) [19:48:04] the pop-up looks really awesome, kaldari [19:48:14] RECOVERY - MySQL Slave Delay on db1047 is OK: OK replication delay 150 seconds [19:48:14] RECOVERY - MySQL Replication Heartbeat on db1047 is OK: OK replication delay 149 seconds [19:48:20] (03Merged) 10jenkins-bot: Remove b/c IPv6 forms [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91505 (owner: 10MaxSem) [19:48:24] the pop-up? 
[19:48:38] for videos [19:48:42] oh yeah :) [19:48:56] (03PS2) 10Reedy: (bug 31068) Configure namespaces for Azerbaijani Wikibooks [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91523 (owner: 10Odder) [19:49:01] (03CR) 10Reedy: [C: 032] (bug 31068) Configure namespaces for Azerbaijani Wikibooks [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91523 (owner: 10Odder) [19:49:11] oh, .. goes to check wikivoyage pages where he included videos [19:49:16] (03Merged) 10jenkins-bot: (bug 31068) Configure namespaces for Azerbaijani Wikibooks [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91523 (owner: 10Odder) [19:50:12] mutante: Not deployed yet ;) [19:50:41] (03PS3) 10Reedy: (bug 36002) Configure $wgMobileUrlTemplate for sourceswiki [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91586 (owner: 10Odder) [19:50:58] Reedy: k:) [19:50:59] (03CR) 10Reedy: [C: 032] (bug 36002) Configure $wgMobileUrlTemplate for sourceswiki [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91586 (owner: 10Odder) [19:51:11] (03Merged) 10jenkins-bot: (bug 36002) Configure $wgMobileUrlTemplate for sourceswiki [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91586 (owner: 10Odder) [19:51:45] (03CR) 10Reedy: [C: 04-1] "Not ready to deploy and needs rebasing anyway" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/70861 (owner: 10Reedy) [19:52:39] (03PS2) 10Reedy: Enable $wgUseRCPatrol on mediawiki.org [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91614 (owner: 10Krinkle) [19:52:46] (03CR) 10Reedy: [C: 032] Enable $wgUseRCPatrol on mediawiki.org [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91614 (owner: 10Krinkle) [19:52:58] (03Merged) 10jenkins-bot: Enable $wgUseRCPatrol on mediawiki.org [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91614 (owner: 10Krinkle) [19:53:45] !log reedy synchronized wmf-config/ [19:53:57] Logged the 
message, Master [19:55:28] (03CR) 10Mdale: "Did we test that this won't also make the File:video.webm pages on commons do a pop up? I don't think we want that." [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91342 (owner: 10Kaldari) [19:56:08] (03PS2) 10Dzahn: add account fflorin and add to stat1 [operations/puppet] - 10https://gerrit.wikimedia.org/r/91084 [19:58:45] (03CR) 10Dzahn: [C: 032] "key confirmed, has approval, waiting period over.. merged" [operations/puppet] - 10https://gerrit.wikimedia.org/r/91084 (owner: 10Dzahn) [20:01:01] (03CR) 10Dzahn: "stat1: notice: /Stage[main]/Accounts::Fflorin/Unixaccount[Fabrice Florin]/User[fflorin]/ensure: created" [operations/puppet] - 10https://gerrit.wikimedia.org/r/91084 (owner: 10Dzahn) [20:04:22] (03PS1) 10Ottomata: Updating with recent upstream changes to varnishkafka.conf [operations/puppet/varnishkafka] - 10https://gerrit.wikimedia.org/r/91664 [20:08:32] (03CR) 10Dzahn: "if you wonder about the range being correct:" [operations/dns] - 10https://gerrit.wikimedia.org/r/91638 (owner: 10Dzahn) [20:10:03] Error: 1146 Table 'test2wiki.echo_notification' doesn't exist (10.64.16.19) [20:10:05] * Reedy sighs [20:10:17] (03PS1) 10Odder: (bug 31068) Set up NS_PROJECT for azwikibooks [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91665 [20:10:41] yay! for family defaults hidden somewhere I can't notice them :-) [20:13:03] Oh [20:13:06] SADFAkinfesloew [20:15:14] (03PS2) 10Ottomata: Updating with recent upstream changes to varnishkafka.conf [operations/puppet/varnishkafka] - 10https://gerrit.wikimedia.org/r/91664 [20:15:51] <^demon|lunch> kaldari, ori-l: Probably needs an index rebuild. 
[20:16:00] <^demon|lunch> Downside of having it not on the cluster, I can't do it myself :( [20:16:47] !log reedy synchronized echowikis.dblist 'Disable Echo on test2wiki due to a lack of database tables in extension1' [20:17:00] Logged the message, Master [20:18:26] !log Created echo tables for test2wiki on 10.64.16.18 [20:18:37] Logged the message, Master [20:18:57] lols. [20:19:50] !log reedy synchronized echowikis.dblist 're-enable echo on test2wiki' [20:20:01] Logged the message, Master [20:21:42] Thu Oct 24 20:20:26 UTC 2013 mw1082 foundationwiki Error connecting to db1008.eqiad.wmnet: Unknown MySQL server host 'db1008.eqiad.wmnet' (0) [20:22:49] Reedy: https://rt.wikimedia.org/Ticket/Display.html?id=5532 ? [20:23:12] Yeah [20:23:17] Thought that one sounded familiar [20:23:53] 724 node "db1008.eqiad.wmnet" { [20:23:54] 725 # moved to frack puppet [20:23:54] 726 } [20:23:56] Reedy: you're going to have to do some deploying as soon as i submit the patch for https://bugzilla.wikimedia.org/show_bug.cgi?id=56115 [20:24:01] empty node? really? [20:24:45] Jeff_Green: ^ does the empty node have any advantage? [20:25:05] i guess if it's up it's better it's documented though..ok [20:25:45] well, no, not in DNS as pointed out before [20:26:40] https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?search_string=db1008 [20:27:06] but in monitoring it is ..ehh [20:27:57] Reedy: s/db1008.eqiad.wmnet/db1008.frack.eqiad.wmnet/g [20:28:27] mutante: i think that was one of the various strategies for not having the host surprise-decommissioned [20:28:33] "but we need to retool this so we don't have to leave fundraisingdb open to the cluster" [20:29:01] reedy where's that? [20:29:11] on https://rt.wikimedia.org/Ticket/Display.html?id=5532 [20:29:16] Written by you ;) [20:29:30] Jeff_Green: i see.. 
we just got to this because of the errors Reedy posted above [20:29:31] I've written that like 8 times in 12 different places [20:29:35] :D [20:29:38] mw1082 foundationwiki Error connecting to db1008.eqiad.wmnet: Unknown MySQL server host 'db1008.eqiad.wmnet' [20:29:59] Reedy: ok so we have final word on this. just rip all that stuff out. the tool has been broken forever [20:30:14] i just haven't had a chance to figure out where it resides and do it myself yet [20:31:26] Is ContributionTracking to die too? [20:32:16] yes [20:32:41] this is re. cron jobs on fluorine right? everything goes. [20:33:56] (03PS1) 10Reedy: Disable and remove ContributionTracking [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91675 [20:34:35] * Reedy looks in puppet [20:39:01] anifests/misc/maintenance.pp:class misc::maintenance::foundationwiki( $enabled = false ) { [20:39:01] manifests/site.pp: class { misc::maintenance::foundationwiki: enabled => false } [20:41:15] (03PS1) 10Reedy: Make misc::maintenance::foundationwiki cronjobs ensure => absent [operations/puppet] - 10https://gerrit.wikimedia.org/r/91676 [20:41:40] (03PS6) 10Ori.livneh: Add Mathoid module (TeX -> MathML / SVG conversion web service) [operations/puppet] - 10https://gerrit.wikimedia.org/r/90733 (owner: 10Physikerwelt) [20:42:23] (03CR) 10Ori.livneh: "PS6 configures Mathoid to be deployed by git-deploy. The NPM issue is still outstanding." [operations/puppet] - 10https://gerrit.wikimedia.org/r/90733 (owner: 10Physikerwelt) [20:45:42] "An error has occurred while searching: We could not complete your search due to a temporary problem. Please try again later." [20:45:47] yay wikitech! [20:55:49] (03CR) 10Kaldari: "@Mdale: Yes, this does affect file pages. I actually chose 800px since that's the maximum thumbnail width on file pages. 
If you think that" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91342 (owner: 10Kaldari) [20:56:52] ksnider: afaik it's because search indices are being rebuilt currently [20:57:01] (03CR) 10Ori.livneh: "I took a look at the npm dependencies. They are: querystring, connect, and request. 'querystring' is part of the node standard library, so" [operations/puppet] - 10https://gerrit.wikimedia.org/r/90733 (owner: 10Physikerwelt) [20:58:05] I wonder if anyone is tracking mathoid at all [20:58:12] did we allocate hardware for it? [20:58:51] do we even have RT tickets for the service provisioning (lvs etc.)? [20:59:08] noooooo idea, maybe gwicke knows [20:59:32] mutante: Ah, got it. [21:01:21] (03PS1) 10Hashar: contint: add in checkstyle package [operations/puppet] - 10https://gerrit.wikimedia.org/r/91767 [21:02:13] (03PS1) 10Ottomata: Updating udp2log.pp so that udp2log instance monitoring is not configured if $misc::udp2log::monitor == false [operations/puppet] - 10https://gerrit.wikimedia.org/r/91768 [21:02:19] (03CR) 10Hashar: "that is a sub project for Yuvi / Brion to get nice style report and annoying Verified-1 on their java applications :-]" [operations/puppet] - 10https://gerrit.wikimedia.org/r/91767 (owner: 10Hashar) [21:02:52] (03CR) 10Ottomata: [C: 032 V: 032] Updating udp2log.pp so that udp2log instance monitoring is not configured if $misc::udp2log::monitor == false [operations/puppet] - 10https://gerrit.wikimedia.org/r/91768 (owner: 10Ottomata) [21:03:04] paravoid, ori-l: nobody looked into hardware yet afaik [21:03:55] the service would be purely CPU-bound [21:04:24] so two load-balanced machines with CPU cores and a few M of RAM and disk would be sufficient [21:05:45] I'll open a tracking ticket, there's multiple tasks needed for this [21:06:11] (03CR) 10Ori.livneh: "request is actually available in apt as 'node-request' (2.9.153-1)" [operations/puppet] - 10https://gerrit.wikimedia.org/r/90733 (owner: 10Physikerwelt) [21:10:49] (03CR) 
10Yuvipanda: [C: 031] contint: add in checkstyle package [operations/puppet] - 10https://gerrit.wikimedia.org/r/91767 (owner: 10Hashar) [21:11:37] (03CR) 10Hashar: [C: 04-1] "Apparently that is not even needed since maven comes with check style already :-)" [operations/puppet] - 10https://gerrit.wikimedia.org/r/91767 (owner: 10Hashar) [21:11:41] YuviPanda: might not be needed [21:12:09] hashar: so, mvn checkstyle:checkstyle generates a HTML report, and also fails if there are errors [21:12:14] I think [21:12:17] it's kinda weird [21:13:37] <^d> Oh man, maven. [21:13:48] <^d> You know, now that I don't use it anymore I totally see why people hate it. [21:14:00] * ^d mutters something about hindsight [21:15:04] YuviPanda: https://integration.wikimedia.org/ci/job/test-checkstyle-maven/2/ [21:15:17] YuviPanda: Checkstyle: 0 warnings from one analysis. [21:15:20] ^d: buck's better? [21:15:22] YuviPanda: is that expected? [21:15:34] <^d> YuviPanda: No, buck's a flaming pile of donkey shit. [21:15:37] <^d> :) [21:16:19] hashar: looking at console output [21:16:32] YuviPanda: output is v [21:16:33] https://integration.wikimedia.org/ci/job/test-checkstyle-maven/ws/target/site/checkstyle.html [21:17:28] (03Abandoned) 10Hashar: contint: add in checkstyle package [operations/puppet] - 10https://gerrit.wikimedia.org/r/91767 (owner: 10Hashar) [21:17:40] hashar: sounds about right [21:17:46] hashar: will it -1 if i submit a failing thing? [21:17:48] * YuviPanda tries [21:18:12] (03CR) 10Dzahn: [C: 031] "lgtm, @tin: 4.0K drwxrwxr-x 22 root wikidev 4.0K Oct 24 20:18 common" [operations/puppet] - 10https://gerrit.wikimedia.org/r/65254 (owner: 10Hashar) [21:18:22] YuviPanda: no idea [21:18:41] hashar: hmm, so if it is run on https://gerrit.wikimedia.org/r/91770 should fail [21:18:45] YuviPanda: will look at create the job template tomorrow and then add it in zuul [21:18:50] sweet! 
[21:18:59] hashar: but yeah, other than that, looks about right [21:19:46] will try with refs/changes/70/91770/1 [21:19:47] ^d: I considered buck for the new app, and then hit myself in the head and went away :P [21:19:57] hashar: should have 1 error [21:20:28] <^d> YuviPanda: We call that "using good judgement" :po [21:20:38] ^d: I *am* using maven... [21:20:52] lesser of the many evils, I guess [21:20:54] <^d> Ok, well then I take it back ;-) [21:21:20] :P [21:22:40] YuviPanda: https://integration.wikimedia.org/ci/job/test-checkstyle-maven/5/console [21:22:51] * YuviPanda checks [21:23:05] YuviPanda: mavenExecutionResult exceptions not empty [21:23:40] /srv/ssd/jenkins-slave/workspace/test-checkstyle-maven/src/main/java/org/mediawiki/api/RequestBuilder.java:11: First sentence should end with a period. [21:23:42] is the error [21:23:45] hashar: so I guess it's okay [21:23:50] just not very clearly visible :P [21:24:38] hashar: but yay, it works! :D [21:25:34] YuviPanda: I got the same issue locally [21:25:43] hashar: yeah, it's accurate [21:25:52] http://paste.debian.net/61373/ [21:26:02] hashar: yup! [21:26:05] so it fails generating the check style report :((( [21:26:14] hashar: no, it doesn't actually [21:26:20] oh [21:26:21] wait [21:26:23] maybe it does [21:26:23] moment [21:27:30] tried deleting the target subdir [21:27:34] then mvn checkstyle:checkstyle [21:27:43] the target/site/ is not generated :/ [21:28:04] hashar: yeah, that's a config [21:28:06] gotta disconnect … daughter awake [21:28:14] hashar: good night! 
[21:28:19] follow up on bug / tomorrow :-] [21:28:30] I can at least create the job :] [21:28:32] * hashar waves [21:55:39] (03CR) 10Edenhill: [C: 031] Updating with recent upstream changes to varnishkafka.conf [operations/puppet/varnishkafka] - 10https://gerrit.wikimedia.org/r/91664 (owner: 10Ottomata) [22:00:40] (03CR) 10Faidon Liambotis: "Debian unstable has node-request 2.26.1-1 and node-connect 1.7.3-1 but these bring at least 10 other Node module dependencies along with t" [operations/puppet] - 10https://gerrit.wikimedia.org/r/90733 (owner: 10Physikerwelt) [22:07:14] !log awight synchronized php-1.22wmf22/extensions/CentralNotice [22:07:26] Logged the message, Master [22:07:40] is there something up with commons now? [22:07:56] hearing through the grapevine that there might be....testing myself now [22:08:00] !log awight synchronized php-1.23wmf1/extensions/CentralNotice [22:08:12] Logged the message, Master [22:09:24] nevermind, seems to be fine [22:12:36] Can someone delete rm -rf /home/wikipedia/common-before-tin/docroot/www.wiki* from fenari for me please? [22:19:26] Reedy: done [22:20:04] thanks [22:24:33] (03CR) 10Physikerwelt: "I tried to install to use different versions of request using npm. Moving forward in the version to 3.x leads to deprecation warnings. Thi" [operations/puppet] - 10https://gerrit.wikimedia.org/r/90733 (owner: 10Physikerwelt) [22:32:58] (03PS1) 10Reedy: Update symlinks to web configs [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91788 [22:33:15] (03CR) 10Reedy: [C: 032] Update symlinks to web configs [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91788 (owner: 10Reedy) [22:33:30] (03Merged) 10jenkins-bot: Update symlinks to web configs [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91788 (owner: 10Reedy) [22:41:52] ori-l: graphite is still on fenari it seems [22:42:00] Should be moved to EQIAD [22:42:17] graphite is on professor in tampa. 
do you mean the deploy scripts that log to graphite? [22:42:20] if so, yeah, should be moved to tin [22:42:35] (graphite should move to eqiad too -- that's already in progress) [22:43:01] reedy@ubuntu64-web-esxi:~/git/operations/apache-config$ ping graphite.wikimedia.org [22:43:01] PING fenari.wikimedia.org (208.80.152.165) 56(84) bytes of data. [22:45:18] it's just a reverse-proxy [22:45:44] Ah [22:45:54] I can't view /etc/apache2/sites-enabled/graphite.wikimedia.org [22:47:22] Which is in puppet templates/apache/sites/graphite.wikimedia.org.erb [22:47:25] Amusing [22:59:28] !log reedy synchronized docroot and w [22:59:35] Logged the message, Master [23:05:19] !log reedy synchronized multiversion/ [23:05:31] Logged the message, Master [23:16:46] (03PS1) 10Reedy: Move activeMWVersions to a php file. Create wrapper to replace it [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91796 [23:20:24] (03PS2) 10Reedy: Move activeMWVersions to a php file. Create wrapper to replace it [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91796 [23:20:29] PROBLEM - Host mw31 is DOWN: PING CRITICAL - Packet loss = 100% [23:21:10] RECOVERY - Host mw31 is UP: PING OK - Packet loss = 0%, RTA = 26.57 ms [23:21:59] (03CR) 10Reedy: [C: 032] Move activeMWVersions to a php file. Create wrapper to replace it [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91796 (owner: 10Reedy) [23:22:52] (03Merged) 10jenkins-bot: Move activeMWVersions to a php file. 
Create wrapper to replace it [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91796 (owner: 10Reedy) [23:23:22] !log reedy synchronized multiversion [23:23:34] Logged the message, Master [23:25:03] !log reedy synchronized docroot/noc/ [23:25:14] Logged the message, Master [23:29:45] (03PS1) 10Reedy: Make activeMWVersions web accessible [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91798 [23:36:56] (03CR) 10Reedy: [C: 032] "(2 comments)" [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91798 (owner: 10Reedy) [23:37:29] (03PS2) 10Reedy: Make activeMWVersions web accessible [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91798 [23:37:36] (03CR) 10Reedy: [C: 032] Make activeMWVersions web accessible [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91798 (owner: 10Reedy) [23:39:32] (03Merged) 10jenkins-bot: Make activeMWVersions web accessible [operations/mediawiki-config] - 10https://gerrit.wikimedia.org/r/91798 (owner: 10Reedy) [23:40:55] !log reedy synchronized docroot/noc/conf/ [23:41:08] Logged the message, Master [23:51:09] (03PS1) 10Ori.livneh: Report MediaWiki wfDebug() log counts to Ganglia via Gmetric [operations/puppet] - 10https://gerrit.wikimedia.org/r/91804 [23:52:24] (03CR) 10Ori.livneh: [C: 032] Report MediaWiki wfDebug() log counts to Ganglia via Gmetric [operations/puppet] - 10https://gerrit.wikimedia.org/r/91804 (owner: 10Ori.livneh) [23:59:30] (03PS1) 10Ori.livneh: wfdebug-ganglia: set reporting interval to 60 seconds [operations/puppet] - 10https://gerrit.wikimedia.org/r/91807