[00:35:06] Operations, Internet-Archive, Offline-Working-Group: Create backups of Wikimedia content in diverse geographic places - https://phabricator.wikimedia.org/T156544#3862080 (Pine) Open>stalled Can we get an update about the status of this project, please? I am marking this as "stalled" pending a...
[02:54:48] (PS1) Legoktm: mediawiki: Ensure Python 3 is available for Pygments [puppet] - https://gerrit.wikimedia.org/r/400458 (https://phabricator.wikimedia.org/T182851)
[03:29:26] PROBLEM - MariaDB Slave Lag: s1 on dbstore1002 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 883.91 seconds
[03:48:26] RECOVERY - MariaDB Slave Lag: s1 on dbstore1002 is OK: OK slave_sql_lag Replication lag: 178.67 seconds
[06:55:19] Operations, Internet-Archive, Offline-Working-Group: Create backups of Wikimedia content in diverse geographic places - https://phabricator.wikimedia.org/T156544#3862182 (Aklapper) >>! In T156544#3077677, @Pine wrote: > I am not understanding why this would be low priority. Can you please explain? [[...
[07:12:26] PROBLEM - Apache HTTP on mw2111 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[07:13:16] RECOVERY - Apache HTTP on mw2111 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 616 bytes in 0.121 second response time
[07:13:50] (PS1) Marostegui: test-s4.hosts: New file with the test-s4 hosts [software] - https://gerrit.wikimedia.org/r/400471 (https://phabricator.wikimedia.org/T180788)
[07:14:48] (PS2) Marostegui: test-s4.hosts: New file with the test-s4 hosts [software] - https://gerrit.wikimedia.org/r/400471 (https://phabricator.wikimedia.org/T180788)
[07:22:26] (CR) Marostegui: [C: 2] test-s4.hosts: New file with the test-s4 hosts [software] - https://gerrit.wikimedia.org/r/400471 (https://phabricator.wikimedia.org/T180788) (owner: Marostegui)
[07:23:10] (Merged) jenkins-bot: test-s4.hosts: New file with the test-s4 hosts [software] - https://gerrit.wikimedia.org/r/400471 (https://phabricator.wikimedia.org/T180788) (owner: Marostegui)
[11:36:27] PROBLEM - Host pc2005 is DOWN: PING CRITICAL - Packet loss = 100%
[11:43:49] Operations, MediaWiki-Containers, Release-Engineering-Team, Epic, and 3 others: FY2017/18 Program 6 - Outcome 2 - Objective 3: Integrated, container-based development environment - https://phabricator.wikimedia.org/T170456#3862438 (mobrovac)
[12:44:42] Operations, Analytics, ChangeProp, EventBus, and 5 others: Select candidate jobs for transferring to the new infrastucture - https://phabricator.wikimedia.org/T175210#3862498 (mobrovac)
[12:46:17] Operations, Analytics, ChangeProp, EventBus, and 4 others: Migrate htmlCacheUpdate job to Kafka - https://phabricator.wikimedia.org/T182023#3862500 (mobrovac)
[12:49:16] Operations, MediaWiki-Platform-Team, Epic, Performance-Team (Radar), Services (watching): 2017/18 Annual Plan Program 8: Multi-datacenter support, Q2 goals - https://phabricator.wikimedia.org/T175213#3862507 (mobrovac)
[13:12:27] PROBLEM - HHVM rendering on mw2245 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[13:13:17] RECOVERY - HHVM rendering on mw2245 is OK: HTTP OK: HTTP/1.1 200 OK - 73487 bytes in 0.289 second response time
[13:22:46] Operations, Traffic, Browser-Support-Internet-Explorer, Patch-For-Review, User-notice: Removing support for DES-CBC3-SHA TLS cipher (drops IE8-on-XP support) - https://phabricator.wikimedia.org/T147199#2684468 (Izno) Just for documentation's sake, this ended AutoWikiBrowser on XP ([[https://e...
[13:42:37] (PS1) Urbanecm: Set 'watchcreations' preference to true by default on Commons [mediawiki-config] - https://gerrit.wikimedia.org/r/400579 (https://phabricator.wikimedia.org/T178750)
[13:52:22] Operations, Edit-Review-Improvements, Collaboration-Feature-Rollouts (Collaboration-WL-Graduated-Everywhere), Collaboration-Team-Triage (Collab-Team-This-Quarter), Performance: Systematically test load speeds of Watchlist and Recent Changes - https://phabricator.wikimedia.org/T176445#3625823 (...
[14:18:57] !log Power cycle pc2005 as it is down
[14:19:08] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[14:25:26] RECOVERY - Host pc2005 is UP: PING OK - Packet loss = 0%, RTA = 36.17 ms
[14:28:55] Operations, ops-codfw, DBA: pc2005 CPU2 internal error - https://phabricator.wikimedia.org/T183750#3862652 (Marostegui)
[14:29:20] Operations, ops-codfw, DBA: pc2005 CPU2 internal error - https://phabricator.wikimedia.org/T183750#3862652 (Marostegui) p:Triage>Normal
[14:29:27] Operations, ops-codfw, DBA: pc2005 crashed: CPU2 internal error - https://phabricator.wikimedia.org/T183750#3862652 (Marostegui)
[14:30:56] PROBLEM - puppet last run on pc2005 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 5 minutes ago with 1 failures. Failed resources (up to 3 shown): Exec[pt-heartbeat]
[14:35:56] RECOVERY - puppet last run on pc2005 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures
[14:37:55] Operations, ops-codfw, DBA: pc2005 crashed: CPU2 internal error - https://phabricator.wikimedia.org/T183750#3862668 (Marostegui)
[15:11:29] (CR) Zppix: [C: 1] "LGTM" [mediawiki-config] - https://gerrit.wikimedia.org/r/400579 (https://phabricator.wikimedia.org/T178750) (owner: Urbanecm)
[15:12:08] Urbanecm: reviewed :)
[15:43:31] (CR) GeoffreyT2000: [C: 1] "Introducing a "$wmgWatchPagesCreated" variable sounds like a great idea to me." [mediawiki-config] - https://gerrit.wikimedia.org/r/400579 (https://phabricator.wikimedia.org/T178750) (owner: Urbanecm)
[15:58:50] Zppix, thanks a lot
[16:01:03] Np
[16:09:01] (CR) Thcipriani: [C: 1] "> Do I really need to add the dumps repo here or will it be picked up" (1 comment) [puppet] - https://gerrit.wikimedia.org/r/400237 (owner: ArielGlenn)
[16:36:46] (PS1) ArielGlenn: update for stretch build of same source [debs/mwbzutils] - https://gerrit.wikimedia.org/r/400590
[16:42:15] (CR) ArielGlenn: [C: 2] update for stretch build of same source [debs/mwbzutils] - https://gerrit.wikimedia.org/r/400590 (owner: ArielGlenn)
[17:04:21] Operations, Developer-Relations, cloud-services-team (Kanban): Create discourse-mediawiki.wmflabs.org (pilot instance) - https://phabricator.wikimedia.org/T180854#3862744 (MZMcBride)
[17:34:27] (PS1) ArielGlenn: add wmflabs config for dumps scap [dumps/scap] - https://gerrit.wikimedia.org/r/400598
[17:39:43] (CR) ArielGlenn: "really no idea if this is how to set up targets for labs, I just stole from someone else..." [dumps/scap] - https://gerrit.wikimedia.org/r/400598 (owner: ArielGlenn)
[18:13:43] greg-g: around at all? :)
[18:15:21] one hopes not
[18:19:05] you never know!
[18:26:46] PROBLEM - Check size of conntrack table on radium is CRITICAL: CRITICAL: nf_conntrack is 90 % full
[18:28:46] PROBLEM - Check size of conntrack table on radium is CRITICAL: CRITICAL: nf_conntrack is 91 % full
[18:31:46] RECOVERY - Check size of conntrack table on radium is OK: OK: nf_conntrack is 52 % full
[18:32:25] hey, apergos. yes, i'm working on the script :)
[18:34:33] hello mutante
[18:34:38] keep up the good work!
[18:35:10] import sys, re, smtplib, argparse
[18:37:58] your "it never takes 5 minutes" law already very true. for example not all contacts have the special "address1" with an SMS gateway email address
[18:38:17] the pylinter likes you to have all your imports on separate lines
[18:38:34] as long as it's going to be a short script you might as well learn python style too
[18:38:39] oh, ok, i actually just contracted it into 1
[18:38:51] I don't have a gateway email address because my provider doesn't have one
[18:38:54] stupid provider
[18:39:11] hmm yea, need gateway to a gateway :p
[18:39:56] good luck with that
[18:41:07] installs jabber server in labs, makes everybody install a jabber client, sends notifications to groups via jabber protocol ... NOT really though, j/k :)
[20:53:04] addshore: what's up? I'm mobile and intermittent.
[20:53:30] * goat-g is testing out weechat plus glowing bear
[20:55:11] goat-g ? rofl
[20:56:52] hahaa :)
[20:57:05] never change that nick again
[21:02:33] speaking of goats - https://twitter.com/WhatTheFFacts/status/946157787553333248
[21:29:02] greg-g: ideally I need to try and backport & deploy https://gerrit.wikimedia.org/r/#/c/399627/ before the 31st!
[21:29:28] addshore: I guess you want to ping the other nick?
[21:29:39] bah, stupid irc client
[21:29:42] goat-g greg-g ^^
[21:29:54] thanks Sagan
[21:29:59] np :)
[22:03:52] (Abandoned) ArielGlenn: cleanup old files after dataset100 rsync of dumps to labs [puppet] - https://gerrit.wikimedia.org/r/336451 (owner: ArielGlenn)
[22:05:02] (Abandoned) ArielGlenn: remove python script for cron rsync of dumps from dataset1001 to labstore [puppet] - https://gerrit.wikimedia.org/r/336205 (owner: ArielGlenn)
[22:32:56] PROBLEM - MariaDB Slave SQL: s5 on dbstore1002 is CRITICAL: CRITICAL slave_sql_state Slave_SQL_Running: No, Errno: 1032, Errmsg: Could not execute Delete_rows_v1 event on table dewiki.imagelinks: Cant find record in imagelinks, Error_code: 1032: handler error HA_ERR_KEY_NOT_FOUND: the events master log db1070-bin.001709, end_log_pos 633534466
[22:45:37] PROBLEM - MariaDB Slave Lag: s5 on dbstore1002 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 897.19 seconds
[23:18:00] (PS1) Dzahn: icinga: script to send custom SMS to Icinga contacts [puppet] - https://gerrit.wikimedia.org/r/400615 (https://phabricator.wikimedia.org/T82937)
[23:18:29] (CR) jerkins-bot: [V: -1] icinga: script to send custom SMS to Icinga contacts [puppet] - https://gerrit.wikimedia.org/r/400615 (https://phabricator.wikimedia.org/T82937) (owner: Dzahn)
[23:26:33] (PS2) Dzahn: icinga: script to send custom SMS to Icinga contacts [puppet] - https://gerrit.wikimedia.org/r/400615 (https://phabricator.wikimedia.org/T82937)
[23:26:56] (CR) jerkins-bot: [V: -1] icinga: script to send custom SMS to Icinga contacts [puppet] - https://gerrit.wikimedia.org/r/400615 (https://phabricator.wikimedia.org/T82937) (owner: Dzahn)
[23:31:06] PROBLEM - Long running screen/tmux on labstore1006 is CRITICAL: CRIT: Long running SCREEN process. (PID: 6774, 1738212s 1728000s).