[00:01:47] (CR) Ricordisamoa: [C: -2] "PS168 adds a JSDoc typedef" [labs/tools/wikidata-slicer] - https://gerrit.wikimedia.org/r/241296 (owner: Ricordisamoa)
[00:05:41] ebernhardson: Thank you !
[01:45:19] (PS169) Ricordisamoa: Initial commit [labs/tools/wikidata-slicer] - https://gerrit.wikimedia.org/r/241296
[01:52:10] (CR) Ricordisamoa: [C: -2] "PS169 adds some JSDoc typedefs" [labs/tools/wikidata-slicer] - https://gerrit.wikimedia.org/r/241296 (owner: Ricordisamoa)
[01:53:43] (PS170) Ricordisamoa: Initial commit [labs/tools/wikidata-slicer] - https://gerrit.wikimedia.org/r/241296
[01:54:34] (CR) jerkins-bot: [V: -1] Initial commit [labs/tools/wikidata-slicer] - https://gerrit.wikimedia.org/r/241296 (owner: Ricordisamoa)
[01:57:19] (CR) Ricordisamoa: [C: -2] "PS170 replaces term_entity_id with term_full_entity_id" [labs/tools/wikidata-slicer] - https://gerrit.wikimedia.org/r/241296 (owner: Ricordisamoa)
[02:00:06] (PS171) Ricordisamoa: Initial commit [labs/tools/wikidata-slicer] - https://gerrit.wikimedia.org/r/241296
[02:03:48] (CR) Ricordisamoa: [C: -2] "PS171 fixes flake8" [labs/tools/wikidata-slicer] - https://gerrit.wikimedia.org/r/241296 (owner: Ricordisamoa)
[07:57:27] Toolforge: Add charset=utf-8 by default to lighttpd - https://phabricator.wikimedia.org/T178601#3698813 (Dispenser) Here are the changes I made to that script on my server: ```lang=perl ... # mime types next if $extensions{$_}; $extensions{$_} = 1; if (substr($1, 0, 5) eq "text/" or $...
[08:43:01] Data-Services, DBA, Dumps-Generation, Blocked-on-schema-change, MediaWiki-Platform-Team (MWPT-Q2-Oct-Dec-2017): Schema change for refactored comment storage - https://phabricator.wikimedia.org/T174569#3566137 (Marostegui) Hi, I have tested this alter table on db2084 (10.1), basically tested...
[09:14:11] is there any way to start an instance from the CLI?
What I'd like to do is basically to (semi?)automatically:
[09:14:29] - destroy traffic-misc-varnish5.traffic.eqiad.wmflabs
[09:14:54] - create it again with a given image, size, puppet role and hiera config
[09:16:05] I've tried my luck by downloading the openstack rc files on https://horizon.wikimedia.org/project/access_and_security/, sourcing them and trying random stuff like 'openstack image list' and such, but I get various types of errors
[09:16:37] eg:
[09:16:39] Expecting to find domain in user - the server could not comply with the request since it is either malformed or otherwise incorrect. The client is assumed to be in error. (HTTP 400) (Request-ID: req-a4e31b54-5a4d-4741-b434-4a639da90cc9)
[09:17:36] cloud-services-team, DBA, Wikidata, Wikidata-Sprint: Drop replication of wb_entity_per_page in labs - https://phabricator.wikimedia.org/T178661#3698944 (Ladsgroup)
[09:24:01] cloud-services-team, DBA, Wikidata, Wikidata-Sprint: Drop wb_entity_per_page views in labs - https://phabricator.wikimedia.org/T178661#3698996 (jcrespo)
[10:33:00] (PS1) Giuseppe Lavagetto: Add secret for docker-registry,discovery.wmnet [labs/private] - https://gerrit.wikimedia.org/r/385347
[10:33:53] (CR) Giuseppe Lavagetto: [V: 2 C: 2] Add secret for docker-registry,discovery.wmnet [labs/private] - https://gerrit.wikimedia.org/r/385347 (owner: Giuseppe Lavagetto)
[13:41:28] Data-Services, DBA, Dumps-Generation, Blocked-on-schema-change, MediaWiki-Platform-Team (MWPT-Q2-Oct-Dec-2017): Schema change for refactored comment storage - https://phabricator.wikimedia.org/T174569#3699611 (Anomie) >>! In T174569#3698860, @Marostegui wrote: > The ones setting the default t...
[14:23:38] Data-Services, DBA, Dumps-Generation, Blocked-on-schema-change, MediaWiki-Platform-Team (MWPT-Q2-Oct-Dec-2017): Schema change for refactored comment storage - https://phabricator.wikimedia.org/T174569#3699673 (jcrespo) @Anomie, for context, he is doing it on a depooled server, that is the mai...
[14:27:09] cloud-services-team (FY2017-18), Operations, Puppet, User-Joe: Upgrade to puppet 4 (4.8 or newer) - https://phabricator.wikimedia.org/T177254#3699687 (herron)
[14:27:35] cloud-services-team (FY2017-18), Operations, Puppet, User-Joe: Upgrade to puppet 4 (4.8 or newer) - https://phabricator.wikimedia.org/T177254#3652273 (herron)
[14:38:42] PAWS, Pywikibot-Commons: Server error (500) while trying to download files from Commons from PAWS - https://phabricator.wikimedia.org/T178567#3699713 (Chicocvenancio) >>! In T178567#3697611, @zhuyifei1999 wrote: > Are the list of 500-ed images the same every time you execute the script? Or is the list ra...
[14:57:56] cloud-services-team, DBA, Wikidata, Wikidata-Sprint: Drop wb_entity_per_page views in Wiki Replicas - https://phabricator.wikimedia.org/T178661#3699770 (bd808)
[14:58:19] Data-Services, cloud-services-team (Kanban), DBA, Wikidata, Wikidata-Sprint: Drop wb_entity_per_page views in Wiki Replicas - https://phabricator.wikimedia.org/T178661#3698944 (bd808)
[15:00:20] Cloud-Services, Outreachy (Round-15): Proposal: Improvements for the Toolforge 'webservice' command - https://phabricator.wikimedia.org/T177603#3699774 (Andrew) Hello @Sowjanyavemuri! I doubt that we'll be able to bend the scheduling rules although I agree that the numbers are close. For number-crunc...
[15:05:15] cloud-services-team (Kanban), DC-Ops, Operations, ops-eqiad: labvirt1015 crashes - https://phabricator.wikimedia.org/T171473#3699779 (Cmjohnson) Still no sign of failure from the h/w log....it took awhile last time
[15:22:45] Cloud-Services, Outreachy (Round-15): Proposal: Improvements for the Toolforge 'webservice' command - https://phabricator.wikimedia.org/T177603#3699808 (Sowjanyavemuri) Hi @Andrew, Thanks a lot for considering the request. - Last day of exams for all students for your current term. 24th November, 2017...
[15:25:28] Hi all. tools-bastion-03 is very slow today, can someone please take a look?
[15:30:49] cloud-services-team (Kanban), DC-Ops, Operations, ops-eqiad: labvirt1015 crashes - https://phabricator.wikimedia.org/T171473#3699830 (Cmjohnson) yeah, I see that they are failed again. I don't know why...i tried swapping another spare from the decom ms-be servers and it lights up but still show...
[15:33:16] cloud-services-team (Kanban), DC-Ops, Operations, ops-eqiad: labvirt1015 crashes - https://phabricator.wikimedia.org/T171473#3699835 (Andrew) I installed 20 VMs, and ran stress-ng on each of them, like this: andrew@labpuppetmaster1001:~$ sudo cumin "name:labvirt1015stresstest*" "stress-ng --cpu...
[15:34:39] cloud-services-team (Kanban), DC-Ops, Operations, ops-eqiad: labvirt1015 crashes - https://phabricator.wikimedia.org/T171473#3699838 (Andrew) >>! In T171473#3699830, @Cmjohnson wrote: > yeah, I see that they are failed again. I don't know why...i tried swapping another spare from the decom ms-b...
[15:59:40] cloud-services-team (Kanban), DC-Ops, Operations, ops-eqiad: labvirt1015 crashes - https://phabricator.wikimedia.org/T171473#3699912 (Cmjohnson) @andrew..wrote that in the wrong ticket
[16:00:25] huji: better?
[16:03:52] anyone know who tools.himo is?
It doesn't appear in the directory for some reason
[16:21:30] PAWS, Pywikibot-Commons: Server error (500) while trying to download files from Commons from PAWS - https://phabricator.wikimedia.org/T178567#3699946 (zhuyifei1999) I created a script that starts an interactive shell on error to debug this: ```lang=python import pywikibot from pywikibot import pagegenera...
[16:25:08] PAWS, Pywikibot-Commons: Server error (500) while trying to download files from Commons from PAWS - https://phabricator.wikimedia.org/T178567#3699963 (zhuyifei1999) There are some headers however: ``` >>> req.data.headers {'Date': 'Fri, 20 Oct 2017 16:19:09 GMT', 'Content-Type': 'text/plain', 'Content-Le...
[16:32:41] PAWS, Pywikibot-Commons: Server error (500) while trying to download files from Commons from PAWS - https://phabricator.wikimedia.org/T178567#3699979 (zhuyifei1999) The same script runs perfectly fine on toolforge. I went ahead and checked the DNS resolving with `host -v`, both point to 208.80.154.240, u...
[16:49:27] andrewbogott: a lot!
[16:49:41] I also suspected tools.himo because at times it was using 99% CPU
[16:53:03] cloud-services-team (FY2017-18), Operations, Patch-For-Review, Puppet, User-Joe: Upgrade to puppet 4 (4.8 or newer) - https://phabricator.wikimedia.org/T177254#3700038 (herron)
[16:53:32] PAWS, Pywikibot-Commons: Server error (500) while trying to download files from Commons from PAWS - https://phabricator.wikimedia.org/T178567#3700040 (zhuyifei1999) Changing the script from `page.download()` to `print(page.latest_file_info.url)`, then in terminal run: ``` zhuyifei1999@PAWS:~$ (for URL in...
[16:54:52] PAWS, Pywikibot-Commons: Server error (500) while trying to download files from Commons from PAWS - https://phabricator.wikimedia.org/T178567#3700043 (zhuyifei1999) Uh, now all of the files on that list can be downloaded with pywikibot anyhow
[17:06:19] Data-Services, DBA, Dumps-Generation, Blocked-on-schema-change, MediaWiki-Platform-Team (MWPT-Q2-Oct-Dec-2017): Schema change for refactored comment storage - https://phabricator.wikimedia.org/T174569#3700066 (Marostegui) And also altering revision on enwiki and image on commons could be almo...
[17:16:55] andrewbogott: generally, the I/O on that machine is very slow. When I run a pywikibot, it takes just 1-2 minutes for the code to be loaded in memory. everything is on an NFS mount, right? Is that slow?
[17:29:22] PAWS, Pywikibot-Commons: Server error (500) while trying to download files from Commons from PAWS - https://phabricator.wikimedia.org/T178567#3700151 (Chicocvenancio) I believe the clues about this bug are in the cache headers. In the failures we have: ``` X-Cache: cp1074 miss, cp1074 pass X-Cache-Statu...
[17:42:29] PAWS, Operations, Pywikibot-Commons, Traffic: Server error (500) while trying to download files from Commons from PAWS - https://phabricator.wikimedia.org/T178567#3700199 (Chicocvenancio)
[17:46:27] PAWS, Operations, Pywikibot-Commons, Traffic: Server error (500) while trying to download files from Commons from PAWS - https://phabricator.wikimedia.org/T178567#3696395 (BBlack) Most likely the error is just inconsistent over the time domain at the backend (swift -> (MW || Thumbor)). If Varnis...
[17:51:09] PAWS, Operations, Pywikibot-Commons, Traffic: Server error (500) while trying to download files from Commons from PAWS - https://phabricator.wikimedia.org/T178567#3700231 (zhuyifei1999) I don't think thumbor should be involved here. The script fetches original versions of the files, not the thumbna...
[17:51:30] huji: there's nothing specific about the IO there other than it being shared with other users. Anything expensive should definitely be submitted as a grid job.
[17:53:20] the NFS bandwidth is limited per instance iirc
[19:08:33] tools-bastion-03 is slow
[19:13:46] cloud-services-team (FY2017-18), Operations, Patch-For-Review, Puppet, User-Joe: Upgrade to puppet 4 (4.8 or newer) - https://phabricator.wikimedia.org/T177254#3700540 (Andrew)
[19:18:46] people should really stop abusing NFS on tools-bastion-03
[19:26:30] isn't there a way to discover who is running tools in tools-bastion-03 and stop only that user?
[19:33:20] well, if someone sends a magic sysrq key
[19:33:27] root needed
[19:33:58] or otherwise we can look at the process list, which would be very very slow
[19:35:14] * zhuyifei1999_ runs $ ps auxf
[19:36:52] uh too late
[19:37:29] * zhuyifei1999_ suspects tools.c+ 30482 4.6 0.0 4744 624 pts/55 DN+ 19:34 0:04 \_ gzip -d -k 20170921-2000.sql.gz
[19:37:51] (tools.citationhunt_
[19:44:46] can someone kill ^
[19:44:52] I will in a moment
[19:44:57] ggp: I think this tool^ is maintained by you, isn't it?
[19:45:22] sorry, yes. killing killing killing
[19:46:22] even killing is gonna take forever due to being in D state and signals can't be delivered :(
[19:46:27] ggp: please don't run your jobs on the bastion. Use the grid so that it doesn't mess with other folks
[19:46:45] now we're back to having tools.himo eating all the cycles
[19:46:50] which pops back as fast as I can kill it
[19:47:33] ok, should be a bit of breathing room now
[20:04:55] PAWS, Operations, Pywikibot-Commons, Traffic: Server error (500) while trying to download files from Commons from PAWS - https://phabricator.wikimedia.org/T178567#3700598 (BBlack) So, I did some varnishlog tracing on the frontend @zhuyifei1999 was hitting with a reproduction of this. I caught on...
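Editorial note on the tools-bastion-03 exchange in this log: the culprit was spotted by reading the STAT column of `ps`, where a leading `D` marks uninterruptible sleep (usually blocked I/O, here most likely on NFS). A minimal sketch of that filtering follows, assuming a procps-style `ps` that accepts `-eo user,pid,stat,args`. The helper names `parse_ps`, `uninterruptible` and `list_blocked` are hypothetical and not part of any Toolforge tooling, and the sample text is modelled on the process line quoted in the log.

```python
import subprocess

# Sample modelled on the process quoted in the log; the second row is made up.
SAMPLE = """USER       PID STAT COMMAND
tools.c+ 30482 DN+  gzip -d -k 20170921-2000.sql.gz
tools.x   1201 Ss   /bin/bash
"""


def parse_ps(output):
    """Parse `ps -eo user,pid,stat,args` text into dicts, skipping the header."""
    rows = []
    for line in output.strip().splitlines()[1:]:
        parts = line.split(None, 3)  # keep the full command line in one field
        if len(parts) < 4:
            continue  # ignore rows without a command column
        user, pid, stat, args = parts
        rows.append({"user": user, "pid": int(pid), "stat": stat, "args": args})
    return rows


def uninterruptible(rows):
    """Keep only processes whose state code starts with D."""
    return [row for row in rows if row["stat"].startswith("D")]


def list_blocked():
    """Run ps on the local host and return the D-state processes (assumes procps)."""
    result = subprocess.run(
        ["ps", "-eo", "user,pid,stat,args"],
        capture_output=True, text=True, check=True,
    )
    return uninterruptible(parse_ps(result.stdout))


if __name__ == "__main__":
    for row in uninterruptible(parse_ps(SAMPLE)):
        print(row["user"], row["pid"], row["stat"], row["args"])
```

As noted in the channel, a process in this state cannot receive signals until its I/O completes, so listing it only identifies the owner; it does not make `kill` act any faster.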
[20:06:09] PAWS, Operations, Pywikibot-Commons, Traffic, media-storage: Server error (500) while trying to download files from Commons from PAWS - https://phabricator.wikimedia.org/T178567#3700601 (BBlack)
[20:56:16] Cloud-Services: Upgrade wmcs instances and masters to puppet 4.8 - https://phabricator.wikimedia.org/T178717#3700697 (Legoktm)
[21:10:51] Cloud-Services: Upgrade wmcs instances and masters to puppet 4.8 - https://phabricator.wikimedia.org/T178717#3700715 (Paladox) if you want you can use puppet-paladox3.git.eqiad.wmflabs.org as a test for puppet 4.8 :). Though i just need to know when you would do it :).
[21:55:54] Cloud-Services, PAWS, Beta-Cluster-Infrastructure, Wikidata, Wikimedia-Logstash: Remove puppet class role::labs::lvm::mnt - https://phabricator.wikimedia.org/T178722#3700748 (hashar)
[22:02:04] Cloud-Services, PAWS, Beta-Cluster-Infrastructure, Wikidata, Wikimedia-Logstash: Remove puppet class role::labs::lvm::mnt - https://phabricator.wikimedia.org/T178722#3700766 (hashar) deployment-logstash2.deployment-prep.eqiad.wmflabs has three mounts: ``` /dev/vd/second-local-disk /srv...