[00:09:18] PROBLEM - Disk space on kafka1012 is CRITICAL: DISK CRITICAL - free space: / 1061 MB (3% inode=96%)
[01:39:07] PROBLEM - puppet last run on mw2156 is CRITICAL: CRITICAL: puppet fail
[01:53:21] -rw-r----- 1 root adm 6441237 Jan 24 01:52 error.log
[01:53:22] -rw-r----- 1 root deployment 43678484 Jan 23 06:25 error.log.1
[01:53:22] meh
[01:53:47] I suppose that logging change sped up breaking deployer access to the error log andrewbogott
[01:53:48] PROBLEM - Disk space on kafka1012 is CRITICAL: DISK CRITICAL - free space: / 1063 MB (3% inode=96%)
[01:53:50] (from silver)
[01:56:58] PROBLEM - Ensure NFS exports are maintained for new instances with NFS on labstore1001 is CRITICAL: CRITICAL - Expecting active but unit nfs-exports is failed
[02:06:27] RECOVERY - puppet last run on mw2156 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures
[02:10:54] operations, Wikimedia-Mailing-lists: Need listadmin password reset for Translators-l mailing list - https://phabricator.wikimedia.org/T123163#1959549 (Az1568) a:Jalexander
[02:24:23] !log mwdeploy@tin sync-l10n completed (1.27.0-wmf.10) (duration: 09m 11s)
[02:24:28] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[02:31:21] !log l10nupdate@tin ResourceLoader cache refresh completed at Sun Jan 24 02:31:21 UTC 2016 (duration 6m 58s)
[02:31:24] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log, Master
[03:36:49] PROBLEM - puppet last run on mw1183 is CRITICAL: CRITICAL: Puppet has 1 failures
[03:46:58] PROBLEM - Disk space on kafka1012 is CRITICAL: DISK CRITICAL - free space: / 1062 MB (3% inode=96%)
[03:55:38] PROBLEM - puppet last run on db2003 is CRITICAL: CRITICAL: puppet fail
[03:59:28] PROBLEM - Disk space on kafka1012 is CRITICAL: DISK CRITICAL - free space: / 1034 MB (3% inode=96%)
[04:02:07] RECOVERY - puppet last run on mw1183 is OK: OK: Puppet is currently enabled, last run 32 seconds ago with 0 failures
[04:22:58] RECOVERY - puppet last run on db2003 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[05:46:28] PROBLEM - Disk space on kafka1012 is CRITICAL: DISK CRITICAL - free space: / 1062 MB (3% inode=96%)
[06:14:17] PROBLEM - RAID on db2012 is CRITICAL: CRITICAL: 1 failed LD(s) (Degraded)
[06:30:18] PROBLEM - puppet last run on mc2015 is CRITICAL: CRITICAL: puppet fail
[06:31:07] PROBLEM - puppet last run on mw1090 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:32:09] PROBLEM - puppet last run on kafka1002 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:32:58] PROBLEM - puppet last run on mw2021 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:33:07] PROBLEM - puppet last run on cp3008 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:33:58] PROBLEM - puppet last run on mw2158 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:34:37] PROBLEM - puppet last run on mw2018 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:34:38] PROBLEM - puppet last run on mw2073 is CRITICAL: CRITICAL: Puppet has 1 failures
[06:36:09] PROBLEM - High load average on labstore1001 is CRITICAL: CRITICAL: 57.14% of data above the critical threshold [24.0]
[06:53:07] RECOVERY - High load average on labstore1001 is OK: OK: Less than 50.00% above the threshold [16.0]
[06:56:08] RECOVERY - puppet last run on mw2021 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures
[06:56:18] RECOVERY - puppet last run on mw1090 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures
[06:57:17] RECOVERY - puppet last run on mw2158 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures
[06:57:28] RECOVERY - puppet last run on kafka1002 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:57:47] RECOVERY - puppet last run on mc2015 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:57:48] RECOVERY - puppet last run on mw2018 is OK: OK: Puppet is currently enabled, last run 57 seconds ago with 0 failures
[06:57:48] RECOVERY - puppet last run on mw2073 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[06:58:19] RECOVERY - puppet last run on cp3008 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[07:16:49] PROBLEM - Disk space on kafka1012 is CRITICAL: DISK CRITICAL - free space: / 1062 MB (3% inode=96%)
[07:29:19] PROBLEM - Disk space on kafka1012 is CRITICAL: DISK CRITICAL - free space: / 1033 MB (3% inode=96%)
[08:07:38] PROBLEM - puppet last run on db2028 is CRITICAL: CRITICAL: puppet fail
[08:34:58] RECOVERY - puppet last run on db2028 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[08:37:26] operations, Commons, MassMessage, MediaWiki-JobQueue, Patch-For-Review: Not all MassMessage sent - https://phabricator.wikimedia.org/T124441#1959822 (Steinsplitter) >>! In T124441#1959068, @Steinsplitter wrote: >>>! In T124441#1956519, @Legoktm wrote: >> Eh, this is probably different: >> >> le...
[08:52:27] PROBLEM - puppet last run on cp4017 is CRITICAL: CRITICAL: puppet fail
[09:03:57] PROBLEM - Disk space on kafka1012 is CRITICAL: DISK CRITICAL - free space: / 1061 MB (3% inode=96%)
[09:19:47] RECOVERY - puppet last run on cp4017 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures
[10:30:17] PROBLEM - check_mysql on db1008 is CRITICAL: SLOW_SLAVE CRITICAL: Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 607
[10:34:12] operations, Analytics, ContentTranslation-Analytics, MediaWiki-extensions-ContentTranslation, and 3 others: schedule a daily run of ContentTranslation analytics scripts - https://phabricator.wikimedia.org/T122479#1959957 (Amire80)
[10:35:17] RECOVERY - check_mysql on db1008 is OK: Uptime: 413816 Threads: 2 Questions: 3356538 Slow queries: 2752 Opens: 1310 Flush tables: 2 Open tables: 400 Queries per second avg: 8.111 Slave IO: Yes Slave SQL: Yes Seconds Behind Master: 0
[10:36:07] PROBLEM - Disk space on kafka1012 is CRITICAL: DISK CRITICAL - free space: / 1059 MB (3% inode=96%)
[11:03:49] PROBLEM - Disk space on stat1002 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[11:04:57] PROBLEM - dhclient process on stat1002 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[11:05:39] PROBLEM - salt-minion processes on stat1002 is CRITICAL: CHECK_NRPE: Socket timeout after 10 seconds.
[11:15:29] PROBLEM - Apache HTTP on mw1143 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[11:16:28] PROBLEM - HHVM rendering on mw1143 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[11:26:27] PROBLEM - puppet last run on mw1138 is CRITICAL: CRITICAL: Puppet has 1 failures
[11:51:38] RECOVERY - puppet last run on mw1138 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[12:02:37] PROBLEM - puppet last run on mw1058 is CRITICAL: CRITICAL: Puppet has 1 failures
[12:19:48] PROBLEM - puppet last run on cp3018 is CRITICAL: CRITICAL: puppet fail
[12:29:38] RECOVERY - puppet last run on mw1058 is OK: OK: Puppet is currently enabled, last run 55 seconds ago with 0 failures
[12:43:48] PROBLEM - puppet last run on cp3033 is CRITICAL: CRITICAL: puppet fail
[12:45:08] RECOVERY - puppet last run on cp3018 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures
[13:08:58] RECOVERY - puppet last run on cp3033 is OK: OK: Puppet is currently enabled, last run 53 seconds ago with 0 failures
[13:19:06] hi!
[13:30:31] is there anyone good at puppet-related things here?
[13:33:45] victorbarbu: just ask the question :-)
[13:33:55] operations, Performance-Team, Performance: Update HHVM package to recent release - https://phabricator.wikimedia.org/T119637#1960105 (Reedy) Possible iconv related issue in T124574 on older HHVM versions (according to 3v4l), but that could be the versions of iconv they've built hhvm again, as the test f...
[13:34:42] I tried creating a Vagrant virtual machine to test and develop my puppet manifests. I use ubuntu/trusty64 for this vm, and I tried following the instructions the guys at puppetlabs provide
[13:34:56] I went to that web installer interface
[13:35:03] at https://vm-ip:3000
[13:35:08] and followed the wizard
[13:35:13] and I got errors
[13:35:15] here is the log http://pastebin.com/pqvmTaaQ
[13:37:22] victorbarbu: you're probably better off in the puppet support channels for that
[13:37:34] why puppet enterprise?
[13:37:35] Also
[13:37:36] ** /opt/puppetlabs/puppet/bin/puppet module install "/tmp/pe-installer-HClPGioc/install/modules/puppetlabs-pe_puppetdbquery-2015.3.0-rc1-1-gb278efd.tar.gz" --force --ignore-dependencies --modulepath /opt/puppetlabs/puppet/modules
[13:37:36] libfacter was not found. Please make sure it was installed to the expected location.
[13:37:40] I think that's #puppet
[13:38:06] Reedy, puppet enterprise because it has that web console
[13:38:25] and since I am however using it for one node in the dev...
[13:38:34] but I can follow anything you give me, guys
[13:38:53] You've bought it?
[13:39:02] no, but it's free for up to ten nodes
[13:40:16] victorbarbu: in my experience, 'apt-get install puppet' just works, so that might still be a better option
[13:40:29] what about the master?
[13:40:47] but it depends a bit on the complexity (that probably won't be enough to actually test all wmf manifests)
[13:40:49] for a development env, of course I will want to have the master and one agent on the same vm
[13:41:06] Reedy, what do you think?
[13:42:48] I've no idea why it's installing itself without having installed facter/libfacter first
[13:42:54] That does seem like a puppet issue
[15:30:04] ehm, hi... I think https://phabricator.wikimedia.org/T124608 is urgent to fix maybe. Regards.
[15:56:34] (PS2) ArielGlenn: make neodymium primary salt master [puppet] - https://gerrit.wikimedia.org/r/265245
[15:58:04] (CR) ArielGlenn: [C: 2] make neodymium primary salt master [puppet] - https://gerrit.wikimedia.org/r/265245 (owner: ArielGlenn)
[16:51:38] PROBLEM - puppet last run on stat1002 is CRITICAL: CRITICAL: Puppet last ran 6 hours ago
[18:19:48] PROBLEM - puppet last run on cp3021 is CRITICAL: CRITICAL: puppet fail
[18:45:07] RECOVERY - puppet last run on cp3021 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures
[19:04:57] operations, Commons, MassMessage, MediaWiki-JobQueue, Patch-For-Review: Not all MassMessage sent - https://phabricator.wikimedia.org/T124441#1960509 (Legoktm) Open>Resolved a:Legoktm ``` legoktm@terbium:~$ mwscript eval.php --wiki=commonswiki > $g=JobQueueGroup::singleton(); > $q=$g->g...
[19:08:27] operations, Project-Creators: Operations-related subprojects/tags reorganization - https://phabricator.wikimedia.org/T119944#1960512 (Aklapper) >>! In T119944#1950162, @matmarex wrote: > empty up #Wikimedia-Media-Storage, moving the reports in it to #swift or elsewhere if appropriate, before you go on with...
[20:41:26] Puppet, MediaWiki-extensions-ORES, Revision-Scoring-As-A-Service: Fix puppet webservice name to uwsgi-ores-web - https://phabricator.wikimedia.org/T124621#1960573 (Halfak) NEW a:yuvipanda
[20:41:47] Puppet, MediaWiki-extensions-ORES, Revision-Scoring-As-A-Service: Fix puppet webservice name to uwsgi-ores-web - https://phabricator.wikimedia.org/T124621#1960582 (yuvipanda) a:yuvipanda>None
[21:18:10] (CR) QChris: [C: -1] "Sorry for nagging again..." [debs/gerrit] - https://gerrit.wikimedia.org/r/263631 (owner: Chad)
[21:25:18] PROBLEM - Disk space on kafka1020 is CRITICAL: DISK CRITICAL - free space: / 1060 MB (3% inode=96%)
[21:36:56] (PS1) Alex Monk: Disable NewUserMessage on metawiki [mediawiki-config] - https://gerrit.wikimedia.org/r/266161 (https://phabricator.wikimedia.org/T122441)
[22:13:37] PROBLEM - Disk space on kafka1020 is CRITICAL: DISK CRITICAL - free space: / 1061 MB (3% inode=96%)
[22:51:32] (PS2) Tim Landscheidt: puppetmaster: Fix git-sync-upstream for unclean rebases [puppet] - https://gerrit.wikimedia.org/r/264692
[23:08:09] PROBLEM - Disk space on kafka1020 is CRITICAL: DISK CRITICAL - free space: / 1060 MB (3% inode=96%)
[23:35:09] operations, Labs, wikitech.wikimedia.org: Rename specific account in LDAP, Wikitech, Gerrit and Phabricator - https://phabricator.wikimedia.org/T85913#1960824 (scfc) a:demon >>! In T85913#1917543, @demon wrote: > If somebody can give me the rename user rights on wikitech I can do this. As I've said...
[23:36:58] PROBLEM - puppet last run on mw1109 is CRITICAL: CRITICAL: Puppet has 1 failures