[04:36:49] twentyafterfour: https://phabricator.wikimedia.org/p/mmodell/ seems to have a duplicate "the"?
[05:16:19] Can somebody help me?
[08:12:27] Hi. If somebody could move https://phabricator.wikimedia.org/T177737 along, I would highly appreciate it.
[08:26:20] lluis_tgn: based on timezones, there might not be anyone around for a while to actually deploy such a change
[08:27:04] yes, that's what I thought. Ty anyway.
[09:08:04] [[Tech]]; AhrimanAmmaneh; [none]; https://meta.wikimedia.org/w/index.php?diff=17311877&oldid=17297483&rcid=10668776
[13:34:28] Can anybody give me a hint as to why some results of an insource search do not include the snippet? https://sv.wiktionary.org/w/index.php?title=Special:S%C3%B6k&limit=250&offset=0&ns0=1&search=insource%3A%2F%28%5C%7B%5C%7Bli-%5B%5E%5C%7C%5C%7D%5D%2B%5B%5C%7C%5C%7D%5D%5B%5E%5C%7D%5D%2A%5C%7D%5C%7D%3F%29%2F
[13:34:55] (is this a bug, or am I doing something wrong?)
[13:46:17] Try insource:/(\{\{li-[^\|\}]+[\|\}]?)/
[13:54:35] yeryry: then all snippets are shown, but the regex is not exactly equivalent...
[13:55:18] yeryry: is it expected behavior to remove the snippets if the search becomes too complex?
[13:56:46] The ones that aren't shown with your regex are long results rather than the short li-artikel-obest... Maybe it doesn't show a snippet when the regex match itself is over a certain size?
[13:57:47] yeryry: good point... I believe you're right...
[13:58:07] yeryry: I'll settle for that explanation, thanks :)
[13:59:13] And I think my reduced regex matches the same pages? Doesn't actually match as much text, but showing the snippets may be more useful?
[14:11:23] yeryry: in this case they show the same regex, but I am dependent on knowing what is causing the missing snippets; I am relying on the snippets to check whether there is a problem with the old syntax when introducing a new one
[14:11:41] yeryry: in this case they show the same entries, I mean
[14:12:50] yeryry: comparing the results for insource:/\{\{li-[^\|\}]+[\|\}][^\}]+/ and insource:/\{\{li-[^\|\}]+[\|\}][^\}]{1,40}/ shows that you were right
[14:13:02] heh, yeah, I was just trying something similar
[14:13:42] insource:/(\{\{li-[^\|\}]+[\|\}][^\}]{0,135})/ shows "moder" but 140 rather than 135 doesn't
[14:15:11] The snippet is ~160 chars, so possibly it just decides that if your match is taking up almost all of the snippet, you know what it is already so don't need to see it?
[14:15:21] But that doesn't really account for regex...
[14:16:18] yeryry: seems the limit is 149
[14:18:21] yup
[14:18:39] insource:/kaa.{146}/ shows snippets, 147 doesn't.
[14:38:08] https://github.com/wikimedia/mediawiki/blob/master/includes/search/SearchEngine.php#L368 is used in https://github.com/wikimedia/mediawiki/blob/master/includes/search/SearchHighlighter.php not sure why it would show nothing based on that, but 75 is half of 150...
[14:55:47] yeryry: hehe I also tried to dig deep into this... I am not sure where I lost it... it's somehow set before or after the search returns a CirrusSearch\Search\ResultSet. CirrusSearch seems to be based on Elasticsearch, and in their example they use 150: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-highlighting.html
[14:56:26] yeryry: maybe your links refer to the old search engine?
[14:57:57] yeryry: however, I can't seem to find the actual setting of 149/150 and I can't find a way to override it... No suitable parameter in https://www.mediawiki.org/wiki/API:Query
[14:58:52] Maybe...
https://sv.wiktionary.org/wiki/Special:ApiSandbox#action=cirrus-config-dump&format=json shows "CirrusSearchFragmentSize": 150
[14:59:35] But there's no reason why it shouldn't just show you 150 chars of your search hit with no surrounding text, if it is too big, rather than nothing at all..
[15:01:10] yeryry: the Elasticsearch page linked above states: "In the case where there is no matching fragment to highlight, the default is to not return anything.", so it seems to be by design
[15:01:54] yeryry: thanks, I will check your link
[15:06:11] yeryry: I can't find CirrusSearchFragmentSize, where did you see it?
[15:07:03] On my link, click the blue Utför begäran ("Make request") button at the top
[15:23:24] yeryry: I don't think those settings are overridable from an API query; I think I'll just grab the entire text if the snippet comes back empty.
[15:24:52] Yeah, I was just trying to show that there was a similar cirrus setting that may be affecting it
[15:26:30] yeryry: yeah, thanks :)
[16:55:55] * jem has just left a topic for the next technical IRC meeting
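
Note on the snippet discussion above: the observations in the log (kaa.{146} shows a snippet, kaa.{147} does not, and CirrusSearchFragmentSize is 150) are consistent with a snippet appearing only when the regex match is shorter than the fragment size. The Python sketch below illustrates that reading and the fallback mentioned at 15:23:24. It is only an assumption drawn from the log, not confirmed CirrusSearch behavior. The helper names and the fetch_text callback are hypothetical, and Python's re syntax is not identical to the insource regex syntax.

```python
import re
from typing import Callable, Optional

# Value reported as "CirrusSearchFragmentSize" in the log above.
FRAGMENT_SIZE = 150


def match_fits_fragment(pattern: str, text: str,
                        fragment_size: int = FRAGMENT_SIZE) -> Optional[bool]:
    """Return True if the first match is shorter than the fragment size.

    Returns None when the pattern does not match at all. The "shorter than"
    rule is an assumption inferred from the 149-character limit in the log.
    """
    match = re.search(pattern, text)
    if match is None:
        return None
    return len(match.group(0)) < fragment_size


def snippet_or_full_text(snippet: str, title: str,
                         fetch_text: Callable[[str], str]) -> str:
    """Use the snippet if present; otherwise fall back to the full page text.

    fetch_text is a hypothetical callback that returns the wikitext of a page.
    """
    return snippet if snippet.strip() else fetch_text(title)
```

Bounding the greedy part of the pattern, as with {0,135} in the log, keeps the match under the limit so that the snippet is still returned.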