[00:14:45] ori: http://gdash.wikimedia.org/dashboards/ve/ is neat; is that enwiki only, all wikis, or something else?
[00:42:05] (PS1) Tim Landscheidt: Tools: Add Ukrainian locale and sort list [operations/puppet] - https://gerrit.wikimedia.org/r/110827
[00:53:19] James_F: it represents all VE edits; it's neither sampled nor wiki-specific
[00:54:46] James_F: there's more at http://graphite.wikimedia.org/ ; if you haven't looked there in a while, it's a lot more organized
[00:54:56] look under Graphite/ve/performance
[00:56:41] ori: Nice. :-)
[01:01:43] James_F: if you want to use this data, here's my advice: (a) do some exploratory data analysis by applying transformations using graphite's UI, and see if you can smooth out the daily / weekly seasonality, and devise a graph that shows trends more clearly, (b) plot VE deployments as annotated vertical lines on the graph (see ); (c) for the key metrics, pick a target
[01:01:43] and plot it as a horizontal line, and pick an upper, 'crisis' threshold and plot it as another horizontal line
[01:02:03] * James_F nods.
[01:02:25] i speak from a wealth of other people's experience! :D
[01:02:45] * James_F grins.
[01:03:01] * ori has been reading up on graphite in particular and performance measurement / analysis in general
[01:03:17] PROBLEM - RAID on analytics1009 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[01:04:17] RECOVERY - RAID on analytics1009 is OK: OK: Active: 6, Working: 6, Failed: 0, Spare: 0
[01:04:37] James_F: any graph that you construct using graphite's web UI can be exported to gdash (or embedded anywhere else) quite easily. ok, that's all the unneeded advice i'll dispense
[01:11:25] ori: Thanks. Very helpful.
[01:11:33] ori: I may come back to you in time with questions. :-)
[01:23:46] (CR) Tim Landscheidt: "Suppose this is supposed to be run in a Git repo, you can also use "git grep -c ^$'\t' -- \*.pp | sort -k2 -t: -n"."
[operations/puppet] - https://gerrit.wikimedia.org/r/108018 (owner: Hashar)
[02:44:17] !log LocalisationUpdate completed (1.23wmf11) at 2014-02-02 02:44:17+00:00
[02:44:29] Logged the message, Master
[03:11:33] !log LocalisationUpdate completed (1.23wmf12) at 2014-02-02 03:11:33+00:00
[03:11:41] Logged the message, Master
[03:50:59] !log LocalisationUpdate ResourceLoader cache refresh completed at 2014-02-02 03:50:59+00:00
[03:51:08] Logged the message, Master
[05:45:27] (PS1) Springle: repool db1011, warm up [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110833
[05:45:52] (CR) Springle: [C: 2] repool db1011, warm up [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110833 (owner: Springle)
[05:45:59] (Merged) jenkins-bot: repool db1011, warm up [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110833 (owner: Springle)
[05:47:03] !log springle synchronized wmf-config/db-eqiad.php 'repool db1011, warm up'
[05:47:11] Logged the message, Master
[06:45:07] PROBLEM - Disk space on ms-be1002 is CRITICAL: DISK CRITICAL - /srv/swift-storage/sdd1 is not accessible: Input/output error
[06:45:27] PROBLEM - RAID on ms-be1002 is CRITICAL: CRITICAL: 1 failed LD(s) (Offline)
[07:30:31] (PS4) Matanya: webserver: lint [operations/puppet] - https://gerrit.wikimedia.org/r/110454
[07:33:28] (PS14) Matanya: site: lint [operations/puppet] - https://gerrit.wikimedia.org/r/109507
[07:34:05] (CR) jenkins-bot: [V: -1] site: lint [operations/puppet] - https://gerrit.wikimedia.org/r/109507 (owner: Matanya)
[07:42:06] where is grrrit-wm ?
[07:45:37] slacking
[09:35:20] (PS5) Nemo bis: Split exim stats to own class and add it to mchenry [operations/puppet] - https://gerrit.wikimedia.org/r/110524
[09:43:25] (CR) Nemo bis: Split exim stats to own class and add it to mchenry (2 comments) [operations/puppet] - https://gerrit.wikimedia.org/r/110524 (owner: Nemo bis)
[09:46:37] (PS1) Springle: depool db1004 for schema changes [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110838
[09:50:39] (CR) Springle: [C: 2] depool db1004 for schema changes [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110838 (owner: Springle)
[09:50:48] (Merged) jenkins-bot: depool db1004 for schema changes [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110838 (owner: Springle)
[09:52:51] !log springle synchronized wmf-config/db-eqiad.php 'depool db1004, schema changes'
[09:52:58] Logged the message, Master
[10:00:57] PROBLEM - MySQL Replication Heartbeat on db1004 is CRITICAL: CRIT replication delay 322 seconds
[10:36:27] (CR) Alexandros Kosiaris: [C: 2] beta: fatal email should say 'last twelve hours' [operations/puppet] - https://gerrit.wikimedia.org/r/110519 (owner: Hashar)
[10:40:11] akosiaris: any chance you can finish the site.pp lint review today?
sunday is a nice and quite day for fixing stuff :)
[10:51:41] (PS1) Springle: depool db1037 for schema changes [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110841
[10:52:18] (CR) Springle: [C: 2] depool db1037 for schema changes [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110841 (owner: Springle)
[10:52:25] (Merged) jenkins-bot: depool db1037 for schema changes [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110841 (owner: Springle)
[10:53:38] !log springle synchronized wmf-config/db-eqiad.php 'depool db1037, schema changes'
[10:53:47] Logged the message, Master
[10:54:56] iirc nagios.wikimedia.org used to be mostly public, but the change to icinga seems to have changed that, it requires auth for accessing any page of it. Was that intentional? And how would someone go about getting access to it if they were interested in looking around?
[10:57:55] matanya: sunday as you already said.... And in fosdem so not that much free time. That being said, you may be happy to know that I am 50% done with the catalog compile job I promised.
[10:58:29] yay! thanks a lit akosiaris ! have fun at FOSDEM
[10:58:33] *lot
[11:07:35] (CR) Guido.iaquinti: [C: 1] Tools: Add Ukrainian locale and sort list [operations/puppet] - https://gerrit.wikimedia.org/r/110827 (owner: Tim Landscheidt)
[11:09:37] (CR) Guido.iaquinti: [C: 1] Add virtual host for wiki.toolserver.org [operations/apache-config] - https://gerrit.wikimedia.org/r/109460 (owner: Tim Landscheidt)
[11:18:05] (CR) Guido.iaquinti: [C: 1] Renamed labstore100[34] to labsdb100[45] [operations/dns] - https://gerrit.wikimedia.org/r/110220 (owner: Alexandros Kosiaris)
[11:20:50] (PS1) Matanya: jenkins: change nrpe check to use nrpe::monitor_service [operations/puppet] - https://gerrit.wikimedia.org/r/110842
[11:21:11] (PS1) Andrew Bogott: Rough in a neutron service for labnet1001.
[operations/puppet] - https://gerrit.wikimedia.org/r/110843
[11:24:26] (PS2) Andrew Bogott: Rough in a neutron service for labnet1001. [operations/puppet] - https://gerrit.wikimedia.org/r/110843
[11:25:56] andrewbogott: this ^ change made me lint-blind :)
[11:28:53] ok, I guess I can lint the new file :)
[11:50:49] matanya: Ok, still a few warnings which I don't understand… but mostly cleaned up.
[11:50:52] (PS3) Andrew Bogott: Rough in a neutron service for labnet1001. [operations/puppet] - https://gerrit.wikimedia.org/r/110843
[11:53:08] will look shortly
[12:02:10] (PS1) Matanya: mysql: change nrpe monitoring to use nrpe::monitor [operations/puppet] - https://gerrit.wikimedia.org/r/110844
[12:03:36] matanya: merging so I can test, but I still welcome your lint comments.
[12:03:47] (CR) Andrew Bogott: [C: 2] Rough in a neutron service for labnet1001. [operations/puppet] - https://gerrit.wikimedia.org/r/110843 (owner: Andrew Bogott)
[12:03:49] andrewbogott: finishing in a sec
[12:04:04] wanted to ask if you wish to wait ...
[12:04:31] oops, sorry
[12:04:44] nenver mind
[12:04:45] can make a fixup
[12:04:52] I'm sure I'll need one anyway :)
[12:05:06] will put them gerrit
[12:07:36] (PS1) Andrew Bogott: Name my templates .erb [operations/puppet] - https://gerrit.wikimedia.org/r/110845
[12:09:13] (CR) Andrew Bogott: [C: 2] Name my templates .erb [operations/puppet] - https://gerrit.wikimedia.org/r/110845 (owner: Andrew Bogott)
[12:13:58] (CR) Matanya: Rough in a neutron service for labnet1001. (10 comments) [operations/puppet] - https://gerrit.wikimedia.org/r/110843 (owner: Andrew Bogott)
[12:14:55] matanya: remove the selector because it only has one case? Or because selectors are generally frowned on?
[12:15:41] andrewbogott: both :)
[12:15:49] ok
[12:16:17] andrewbogott: it also leads later on to http://puppet-lint.com/checks/selector_inside_resource/
[12:16:29] in many cases, which is annoying
[12:16:57] PROBLEM - DPKG on labnet1001 is CRITICAL: DPKG CRITICAL dpkg reports broken packages
[12:21:17] PROBLEM - HTTP 5xx req/min on tungsten is CRITICAL: CRITICAL: reqstats.5xx [crit=500.000000
[12:22:57] RECOVERY - DPKG on labnet1001 is OK: All packages OK
[12:31:57] PROBLEM - DPKG on labnet1001 is CRITICAL: DPKG CRITICAL dpkg reports broken packages
[12:32:39] I don't have puppet-lint on my dev box… is this all of them?
[12:32:43] (PS1) Andrew Bogott: Lint fixes for neutron bits. [operations/puppet] - https://gerrit.wikimedia.org/r/110847
[12:33:06] no, andrewbogott this is just for your change :)
[12:33:18] That's what I meant
[12:33:20] there are like 1000000 of them for the openstack stuff
[12:33:57] RECOVERY - DPKG on labnet1001 is OK: All packages OK
[12:34:31] (CR) Matanya: [C: 1] Lint fixes for neutron bits. [operations/puppet] - https://gerrit.wikimedia.org/r/110847 (owner: Andrew Bogott)
[12:36:45] (CR) Andrew Bogott: [C: 2] Lint fixes for neutron bits. [operations/puppet] - https://gerrit.wikimedia.org/r/110847 (owner: Andrew Bogott)
[12:37:57] PROBLEM - DPKG on labnet1001 is CRITICAL: DPKG CRITICAL dpkg reports broken packages
[12:39:08] dammit
[12:41:13] more labnet1001 warnings coming up
[12:42:47] PROBLEM - Host labnet1001 is DOWN: PING CRITICAL - Packet loss = 100%
[12:43:16] did you reboot it andrewbogott ?
[12:43:27] pxe booting
[12:43:35] packages were so scrambled, will be faster with a clean slate
[12:44:35] Curious to see if puppet immediately re-scrambles them!
[12:47:57] RECOVERY - Host labnet1001 is UP: PING OK - Packet loss = 0%, RTA = 0.34 ms
[12:50:17] PROBLEM - RAID on labnet1001 is CRITICAL: Connection refused by host
[12:50:17] PROBLEM - puppet disabled on labnet1001 is CRITICAL: Connection refused by host
[12:50:27] PROBLEM - SSH on labnet1001 is CRITICAL: Connection refused
[12:50:47] PROBLEM - Disk space on labnet1001 is CRITICAL: Connection refused by host
[13:02:07] PROBLEM - NTP on labnet1001 is CRITICAL: NTP CRITICAL: No response from NTP server
[13:08:27] mutante: anything of interest in RT #6150?
[13:09:46] nothing twkozlowski
[13:09:58] https://ve.wikimedia.org works
[13:10:13] not doc'ed on the ticket
[13:10:25] I'm talking https://bugzilla.wikimedia.org/show_bug.cgi?id=55737
[13:10:58] I assume we only need to clean up the config in InitialiseSettings.php, add the db to deleted.dblist but keep the Apache redirects?
[13:14:27] RECOVERY - SSH on labnet1001 is OK: SSH OK - OpenSSH_5.9p1 Debian-5ubuntu1.1 (protocol 2.0)
[13:22:29] andrewbogott: comments? https://wikitech.wikimedia.org/wiki/Puppet_style_guide
[13:23:17] RECOVERY - HTTP 5xx req/min on tungsten is OK: OK: reqstats.5xx [warn=250.000
[13:28:32] matanya: Not that I care a lot, but -- we're using 4-space tabs currently aren't we?
[13:28:41] yes
[13:29:08] "Many existing manifests use tabs or two-spaces (as suggested in the style guide) instead of our 4 space indent standard"
[13:29:39] But then you say "Must use two-space soft tabs. "
[13:29:57] RECOVERY - DPKG on labnet1001 is OK: All packages OK
[13:30:16] a 'good' example in the conditionals section would be nice.
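Annotation on the style points raised above: a minimal sketch of what a 'good' conditional might look like, with the selector hoisted out of the resource body (the pattern behind puppet-lint's selector_inside_resource check linked earlier) and the 4-space indent mentioned in the conversation. The class name, the `$realm` parameter, and the file paths are hypothetical and do not come from the operations/puppet repository or from the style guide draft.

```puppet
# Hypothetical class for illustration only; names and paths are assumptions.
class example::motd (
    $realm = 'production',
) {
    # Resolve the selector into a variable first, so the resource body
    # below contains plain attribute values only.
    $motd_source = $realm ? {
        'labs'  => 'puppet:///modules/example/motd.labs',
        default => 'puppet:///modules/example/motd.production',
    }

    file { '/etc/motd':
        ensure => file,
        source => $motd_source,
    }
}
```

Keeping the selector outside the resource is what avoids the lint warning; whether a single-case selector should instead become a plain assignment is the separate question asked in the log.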
[13:30:17] RECOVERY - RAID on labnet1001 is OK: NRPE: Unable to read output
[13:30:17] RECOVERY - puppet disabled on labnet1001 is OK: OK
[13:30:31] (CR) Alexandros Kosiaris: [C: -1] Split exim stats to own class and add it to mchenry (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/110524 (owner: Nemo bis)
[13:30:47] RECOVERY - Disk space on labnet1001 is OK: DISK OK
[13:30:55] (CR) Alexandros Kosiaris: Split exim stats to own class and add it to mchenry (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/110524 (owner: Nemo bis)
[13:31:12] I also don't feel very strongly about the one class per file rule.
[13:31:39] I do
[13:31:47] please keep this rule :-)
[13:31:47] i mean -- if a class depends on other classes which are /only/ used by that class...
[13:31:51] then they should probably go in the same file.
[13:32:15] I think it depends on whether or not the class will ever be referenced from outside of the module.
[13:32:18] that might cause problems with the module autoloader though
[13:33:01] how so?
[13:33:31] fixed your comments andrewbogott
[13:33:38] there have been cases with puppet failing to find classes
[13:33:51] and scoping issues too
[13:33:58] especially with puppet apply and testing
[13:34:36] akosiaris: your review of course would be appricated too
[13:34:44] and bending that rule might lead to bending the other rule
[13:34:54] about classes in classes which i detest ...
[13:35:00] matanya: thanks, looks good.
[13:35:13] matanya: context ?
[13:35:29] akosiaris: https://wikitech.wikimedia.org/wiki/Puppet_style_guide
[13:35:49] oh with review I thought you meant a gerrit change
[13:36:11] akosiaris: Do you understand what I mean about being referenced from outside?
[13:36:39] Like sometimes I want to refactor one big giant class into a few smaller bits for readability. But the original (formerly big) class is still the only interface to the outside world.
[13:36:48] In that case it seems cleaner to have the little dependency bits in the same file.
[13:37:18] i don't think that happens in puppet very often
[13:37:28] and it is a bad habit :)
[13:37:48] matanya: well, what about e.g. the role class I just wrote?
[13:37:49] there have been more than one instances where an "internal" class ended up being used in another context
[13:38:53] well, roles and modules are not the same case, them will never autoload. but i do think you could have splitted it a bit more and get the same results
[13:39:00] it would just be more work
[13:39:17] more work and more tiny files which would be irritating to flip between while reading
[13:39:53] yes, but as a lint-nazi, i prefer this way
[13:40:11] heh an argument for a bigger screen :-)
[13:40:25] that is that justification you were looking for andrewbogott
[13:41:03] Ryan_Lane: working, or just visiting?
[13:43:52] ok, if andrewbogott or akosiaris don't mind, send it to the ops ml for feedback, and once it is confirmed, please remove the draft template and mark as policy
[13:43:53] (CR) Alexandros Kosiaris: [C: 2] "Thanks !" [operations/puppet] - https://gerrit.wikimedia.org/r/110842 (owner: Matanya)
[13:48:11] matanya: I don't mind, but can you reconcile and/or merge it with the existing puppet usage and style page?
[13:48:57] RECOVERY - NTP on labnet1001 is OK: NTP OK: Offset -0.04715681076 secs
[13:49:26] done andrewbogott
[14:10:56] PROBLEM - DPKG on labnet1001 is CRITICAL: DPKG CRITICAL dpkg reports broken packages
[14:21:26] (PS1) Andrew Bogott: Specify rabbit_host for neutron. [operations/puppet] - https://gerrit.wikimedia.org/r/110854
[14:22:41] (CR) Andrew Bogott: [C: 2] Specify rabbit_host for neutron.
[operations/puppet] - https://gerrit.wikimedia.org/r/110854 (owner: Andrew Bogott)
[14:37:04] (PS1) Andrew Bogott: Fix up the auth section for neutron [operations/puppet] - https://gerrit.wikimedia.org/r/110855
[14:43:41] (CR) Andrew Bogott: [C: 2] Fix up the auth section for neutron [operations/puppet] - https://gerrit.wikimedia.org/r/110855 (owner: Andrew Bogott)
[14:53:07] (PS1) Andrew Bogott: Include openvswitch plugin with neutron. [operations/puppet] - https://gerrit.wikimedia.org/r/110856
[14:55:18] (CR) Andrew Bogott: [C: 2] Include openvswitch plugin with neutron. [operations/puppet] - https://gerrit.wikimedia.org/r/110856 (owner: Andrew Bogott)
[15:06:29] (PS1) Andrew Bogott: Configure ovs_neutron_plugin [operations/puppet] - https://gerrit.wikimedia.org/r/110858
[15:08:05] (CR) Andrew Bogott: [C: 2] Configure ovs_neutron_plugin [operations/puppet] - https://gerrit.wikimedia.org/r/110858 (owner: Andrew Bogott)
[15:11:49] (PS1) Springle: repool db1037 and db1004 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110859
[15:12:49] (CR) Springle: [C: 2] repool db1037 and db1004 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110859 (owner: Springle)
[15:12:55] (Merged) jenkins-bot: repool db1037 and db1004 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110859 (owner: Springle)
[15:13:55] !log springle synchronized wmf-config/db-eqiad.php 'repool db1037 and db1004'
[15:14:04] Logged the message, Master
[15:53:14] (PS2) Matanya: mysql: change nrpe monitoring to use nrpe::monitor [operations/puppet] - https://gerrit.wikimedia.org/r/110844
[17:05:47] (PS6) Nemo bis: Split exim stats to own class and add it to mchenry [operations/puppet] - https://gerrit.wikimedia.org/r/110524
[17:15:16] PROBLEM - check_swap on thulium is CRITICAL: SWAP CRITICAL - 35% free (2659 MB out of 7627 MB)
[17:20:06] PROBLEM - check_swap on thulium is
CRITICAL: SWAP CRITICAL - 57% free (4319 MB out of 7627 MB)
[17:25:16] PROBLEM - check_swap on thulium is CRITICAL: SWAP CRITICAL - 57% free (4336 MB out of 7627 MB)
[17:30:06] PROBLEM - check_swap on thulium is CRITICAL: SWAP CRITICAL - 47% free (3522 MB out of 7627 MB)
[17:35:16] PROBLEM - check_swap on thulium is CRITICAL: SWAP CRITICAL - 14% free (1005 MB out of 7627 MB)
[17:40:26] PROBLEM - check_swap on thulium is CRITICAL: SWAP CRITICAL - 9% free (663 MB out of 7627 MB)
[17:45:16] PROBLEM - check_swap on thulium is CRITICAL: SWAP CRITICAL - 48% free (3641 MB out of 7627 MB)
[17:46:53] err someone look at thulium
[17:50:06] RECOVERY - check_swap on thulium is OK: SWAP OK - 98% free (7434 MB out of 7627 MB)
[18:10:06] PROBLEM - check_swap on thulium is CRITICAL: SWAP CRITICAL - 47% free (3520 MB out of 7627 MB)
[18:15:06] PROBLEM - check_swap on thulium is CRITICAL: SWAP CRITICAL - 61% free (4584 MB out of 7627 MB)
[18:20:06] RECOVERY - check_swap on thulium is OK: SWAP OK - 99% free (7504 MB out of 7627 MB)
[18:27:26] PROBLEM - Varnishkafka Delivery Errors on cp3019 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 48.700001
[18:28:26] RECOVERY - Varnishkafka Delivery Errors on cp3019 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[18:48:36] PROBLEM - Varnishkafka Delivery Errors on cp3020 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 93.400002
[18:49:36] RECOVERY - Varnishkafka Delivery Errors on cp3020 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[18:52:26] PROBLEM - Varnishkafka Delivery Errors on cp3019 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 551.733337
[18:52:36] PROBLEM - Varnishkafka Delivery Errors on cp3020 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 481.466675
[18:53:26] RECOVERY - Varnishkafka Delivery Errors on cp3019 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[18:56:46] PROBLEM - Varnishkafka Delivery Errors on cp3021 is
CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 62.633335
[18:57:46] RECOVERY - Varnishkafka Delivery Errors on cp3021 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[18:58:46] PROBLEM - Varnishkafka Delivery Errors on cp3022 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 160.46666
[18:59:46] RECOVERY - Varnishkafka Delivery Errors on cp3022 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[19:00:36] RECOVERY - Varnishkafka Delivery Errors on cp3020 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[19:00:46] PROBLEM - Varnishkafka Delivery Errors on cp3021 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 110.666664
[19:01:26] PROBLEM - Varnishkafka Delivery Errors on cp3019 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 81.933334
[19:03:26] RECOVERY - Varnishkafka Delivery Errors on cp3019 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[19:03:36] PROBLEM - Varnishkafka Delivery Errors on cp3020 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 283.533325
[19:03:46] RECOVERY - Varnishkafka Delivery Errors on cp3021 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[19:05:46] PROBLEM - Varnishkafka Delivery Errors on cp3022 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 13.8
[19:09:46] PROBLEM - Varnishkafka Delivery Errors on cp3021 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 175.266663
[19:10:46] RECOVERY - Varnishkafka Delivery Errors on cp3022 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[19:13:27] PROBLEM - Varnishkafka Delivery Errors on cp3019 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 25.200001
[19:14:27] RECOVERY - Varnishkafka Delivery Errors on cp3019 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[19:22:46] RECOVERY - Varnishkafka Delivery Errors on cp3021 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[19:25:46] PROBLEM -
Varnishkafka Delivery Errors on cp3021 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 382.966675
[19:28:46] RECOVERY - Varnishkafka Delivery Errors on cp3021 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[19:34:25] (PS1) Ebe123: Add transwiki import options for zh.wikivoyage [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110876
[19:35:26] PROBLEM - Varnishkafka Delivery Errors on cp3019 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 250.666672
[19:36:27] RECOVERY - Varnishkafka Delivery Errors on cp3019 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[19:36:36] RECOVERY - Varnishkafka Delivery Errors on cp3020 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[19:39:36] PROBLEM - Varnishkafka Delivery Errors on cp3020 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 765.133362
[19:40:46] PROBLEM - Varnishkafka Delivery Errors on cp3021 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 443.100006
[19:42:46] PROBLEM - Varnishkafka Delivery Errors on cp3022 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 617.56665
[19:43:46] RECOVERY - Varnishkafka Delivery Errors on cp3022 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[19:46:27] PROBLEM - Varnishkafka Delivery Errors on cp3019 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 266.600006
[19:46:46] PROBLEM - Varnishkafka Delivery Errors on cp3022 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 360.033325
[19:47:26] RECOVERY - Varnishkafka Delivery Errors on cp3019 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[19:55:06] (PS2) Se4598: Replace easter egg by a more explaining message [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110874
[19:56:46] RECOVERY - Varnishkafka Delivery Errors on cp3021 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[19:59:46] PROBLEM - Varnishkafka Delivery Errors on
cp3021 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 489.5
[20:08:21] ori: want to look at those ^ ?
[20:11:46] RECOVERY - Varnishkafka Delivery Errors on cp3022 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[20:16:46] PROBLEM - Varnishkafka Delivery Errors on cp3022 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 13.433333
[20:17:27] PROBLEM - Varnishkafka Delivery Errors on cp3019 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 18.866667
[20:18:26] RECOVERY - Varnishkafka Delivery Errors on cp3019 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[20:21:46] RECOVERY - Varnishkafka Delivery Errors on cp3022 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[20:24:46] PROBLEM - Varnishkafka Delivery Errors on cp3022 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 13.666667
[20:34:46] RECOVERY - Varnishkafka Delivery Errors on cp3022 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[20:36:37] (PS1) Matanya: nrpe: remove hard coded disk checks [operations/puppet] - https://gerrit.wikimedia.org/r/110880
[20:36:56] bd808: mind reviewing one patch?
[20:37:04] https://gerrit.wikimedia.org/r/#/c/110844/
[20:37:17] matanya: Sure I'll give it a look
[20:37:24] thanks
[20:38:56] (CR) Matanya: [C: -1] "don't merge until https://gerrit.wikimedia.org/r/#/c/110844/ is merged." [operations/puppet] - https://gerrit.wikimedia.org/r/110880 (owner: Matanya)
[20:43:36] RECOVERY - Varnishkafka Delivery Errors on cp3020 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[20:43:46] RECOVERY - Varnishkafka Delivery Errors on cp3021 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[20:44:22] (CR) BryanDavis: mysql: change nrpe monitoring to use nrpe::monitor (1 comment) [operations/puppet] - https://gerrit.wikimedia.org/r/110844 (owner: Matanya)
[20:44:51] matanya: One trivial whitespace issue.
Otherwise looks good to me
[20:45:28] (PS3) Matanya: mysql: change nrpe monitoring to use nrpe::monitor [operations/puppet] - https://gerrit.wikimedia.org/r/110844
[20:45:47] thanks bd808 fixed. if still good, please merge :)
[20:46:13] matanya: Unfortunately I don't have +2 on puppet.
[20:46:36] PROBLEM - Varnishkafka Delivery Errors on cp3020 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 11.433333
[20:47:03] Or maybe that's actually a fortunate thing :)
[20:48:46] PROBLEM - Varnishkafka Delivery Errors on cp3021 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 593.5
[20:54:26] PROBLEM - Varnishkafka Delivery Errors on cp3019 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 31.266666
[20:55:27] RECOVERY - Varnishkafka Delivery Errors on cp3019 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[20:59:13] :)
[21:01:36] RECOVERY - Varnishkafka Delivery Errors on cp3020 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[21:10:19] (CR) Hashar: [C: -1] "I would prefer to keep the easter egg around. I like when people fill security bug saying one of our password is exposed :D" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110874 (owner: Se4598)
[21:11:58] (CR) Hoo man: [C: -1] "I also like the easter egg" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110874 (owner: Se4598)
[21:12:23] (CR) MaxSem: [C: -1] "+1 to what Antoine said. It does no harm, it's fun." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110874 (owner: Se4598)
[21:13:24] (CR) Se4598: "ok, I also like it." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110874 (owner: Se4598)
[21:15:28] (CR) Bartosz Dziewoński: "We could make it say superSecretSitePassword os something similarly silly to make it clearer that it's not a real password.
I'm not sure a" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110874 (owner: Se4598)
[21:18:39] (CR) Se4598: "before someone want to post another "keep it", please mark the bug as RESOLVED WONTFIX, then I'll abandom this change..." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110874 (owner: Se4598)
[21:20:55] (CR) Odder: [C: 1] "I for one think the patch is valid." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110874 (owner: Se4598)
[21:24:27] (CR) Matanya: [C: 1] "i'm with odder here." [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110874 (owner: Se4598)
[21:29:59] (CR) MZMcBride: "As I said on bug 60741, I don't believe there's currently consensus to remove this Easter egg. I think the password should be made longer " [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110874 (owner: Se4598)
[21:30:26] PROBLEM - Varnishkafka Delivery Errors on cp3019 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 53.833332
[21:31:26] RECOVERY - Varnishkafka Delivery Errors on cp3019 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[21:32:38] (CR) Se4598: "it's not helpful and more cryptic than "Whitelist must only contain symlinks."?" [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110874 (owner: Se4598)
[21:36:17] (CR) Ori.livneh: [C: 2 V: 2] "Oh no, someone hacked my Gerrit account and merged this change."
[operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110874 (owner: Se4598)
[21:36:36] wasn't me
[21:37:07] heh
[21:37:16] !log ori updated /a/common to {{Gerrit|Idda8cff80}}: Replace easter egg by a more explaining message
[21:37:24] Logged the message, Master
[21:37:29] that someone should also totally review https://gerrit.wikimedia.org/r/#/c/82100/
[21:37:45] ori: Should probably, change you secretSitPassword then :p
[21:38:16] *secretSitePassword
[21:39:46] RECOVERY - Varnishkafka Delivery Errors on cp3021 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[21:40:03] !log ori synchronized docroot/noc/conf/highlight.php 'Idda8cff80: Replace easter egg by a more explaining message'
[21:40:12] Logged the message, Master
[21:41:21] MatmaRex: ok, testing
[21:42:46] PROBLEM - Varnishkafka Delivery Errors on cp3021 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 37.233334
[21:45:15] ori: i didn't really test it in funny browsers, so please do :)
[21:46:50] i am violating my personal no-pixels rule
[21:50:25] se4598: Thanks for the patch (and ori for merging). Do you follow up on the test failures?
[21:51:05] scfc_de: did I broke a test?
[21:51:11] there is a unit test for this.
[21:51:13] * ori facepalms.
[21:51:32] * ori fixes.
[21:52:29] tests, so we don't delete our super secret password by accident? :D
[21:55:57] ori: https://git.wikimedia.org/blob/operations%2Fmediawiki-config.git/f4f834ba0ffdc653e9219035be46a3fffa6a5367/tests%2Fnoc-conf%2FhighlightTest.php#L81 replace the default value with the new message, should be then ok
[21:56:39] Currently active MediaWiki versions: 1.23wmf11 1.23wmf12
[21:56:41] that's nice.
[21:56:45] never seen that before
[21:57:20] (PS1) Ori.livneh: Fix test broken by Idda8cff80 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110886
[21:57:54] (CR) Ori.livneh: [C: 2 V: 2] Fix test broken by Idda8cff80 [operations/mediawiki-config] - https://gerrit.wikimedia.org/r/110886 (owner: Ori.livneh)
[21:58:43] !log ori updated /a/common to {{Gerrit|I8352c4cfc}}: Fix test broken by Idda8cff80
[21:58:44] ori: thanks for fixing and scfc_de for pointing out
[21:58:51] Logged the message, Master
[21:59:45] ori: Thanks!
[21:59:51] !log ori synchronized docroot/noc/conf/highlight.php 'I8352c4cfc: Fix test broken by Idda8cff80 (1/2)'
[22:00:00] Logged the message, Master
[22:01:35] !log ori synchronized tests/noc-conf/highlightTest.php 'I8352c4cfc: Fix test broken by Idda8cff80 (2/2)'
[22:01:43] Logged the message, Master
[22:02:46] RECOVERY - Varnishkafka Delivery Errors on cp3021 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[22:15:31] * MaxSem throws a bunch of boos at ori
[22:16:27] * ori catches.
[22:45:27] PROBLEM - Varnishkafka Delivery Errors on cp3019 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 43.299999
[22:46:26] RECOVERY - Varnishkafka Delivery Errors on cp3019 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0
[23:52:26] PROBLEM - Varnishkafka Delivery Errors on cp3019 is CRITICAL: kafka.varnishkafka.kafka_drerr.per_second CRITICAL: 219.766663
[23:54:26] RECOVERY - Varnishkafka Delivery Errors on cp3019 is OK: kafka.varnishkafka.kafka_drerr.per_second OKAY: 0.0