[10:56:42] lunch
[13:29:10] I got a question from Research regarding the Cirrus dumps: what is the meaning of the timestamp in the paths, e.g. https://dumps.wikimedia.org/other/cirrus_search_index/20251123? Is that the date/time the maintenance script was launched?
[13:34:10] pfischer: seems like the end of the data_interval, so "generally" the day the script was scheduled; this is usually equivalent to when the dump was extracted, though it could be slightly different if there are failures/retries
[13:34:50] most of the time you'll see revisions <= this date
[13:45:28] I think Balthazar and Ben did some work on standardizing the file/directory names for dumps, as part of the move to Airflow.
[13:45:38] yep, dcausse is on point
[13:59:22] Thank you, dcausse, brouberol!
[13:59:40] np!
[15:17:55] \o
[15:18:44] pfischer: the dates are typical Airflow dates, that's going to be data_interval_end
[15:20:46] err, double-checked, this one is data_interval_start (I guess it carried forward from Oozie)
[15:22:28] o/
[15:23:41] hm, it must be data_interval_end, otherwise we wouldn't have 20251123 yet?
[15:24:28] hmm, we use {snapshot} in f-strings, and the only `snapshot = ` I see is `snapshot = '{{ data_interval_start | ds_nodash }}'`
[15:25:24] ah, but the export path is using '{props.public_dumps_location}/{{{{ data_interval_end | ds_nodash }}}}'
[15:25:29] oh!
[15:25:31] not great that these don't line up :/
[15:25:48] duh.. I should remember, I did that. It was like a month ago, to keep the same dates as the old dumps
[15:26:24] from a user perspective it's a lot better, imo, to use the end
[15:26:40] yeah, it's probably more obvious. It's just not the same
[15:27:11] yes, you have to think about translating between the two whenever comparing what's in the data lake with what's exposed
[15:53:47] T411107 🙂
[15:53:47] T411107: Harmonize semantics of Cirrus dump timestamp - https://phabricator.wikimedia.org/T411107
[16:25:04] IIRC we do use data_interval_end in all dumps DAGs running in test-k8s. Was the data_interval_start on your end, in the new cirrussearch/hadoop DAG?
[16:34:39] brouberol: yes, mainly for the cirrus dump constructed from a Spark job: data_interval_start is used in the Hive partition, but we translate that partition date to data_interval_end on the public-facing website
[16:34:54] gotcha, thanks
[17:50:54] dinner
[18:29:58] banned cirrussearch1112, let it drain, and brought it back. Maybe that will resolve the latency alerts (it was the highest-load node, peak load of ~60)
[18:30:15] err, I will bring it back; haven't unbanned it yet
[19:18:54] ebernhardson: any issues with me rebooting `search-loader1002.eqiad.wmnet` and `search-loader2002.codfw.wmnet`, one host at a time?
[19:19:00] ryankemper: nope, should be fine
[19:23:32] uptime of `506 days` :P
[19:46:28] I don't think we do live migration, which suggests the VM host has been up even longer :)
[19:55:07] enwiki is surprisingly not bad: started up the embedding notebook, and based on the mean task time so far, it will probably take about 3 hours to embed enwiki_content with 1 TB of memory and 384 cores
[19:55:27] I think the model was listed as 80M parameters
[21:19:54] what a tedious limitation... "Note that OpenSearch Dashboards can only run against a secure cluster, so if you uninstall the Security plugin, you'll also need to uninstall the OpenSearch Dashboards plugin"
[21:28:48] (the docs are a lie, the entrypoint has an env var that turns off security)
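
To make the 15:24-15:27 mismatch concrete, here is a minimal Python sketch (not the actual DAG code) of how the two Jinja templates quoted above render for the same run. The weekly schedule, the concrete dates, and the `public_dumps_location` value are illustrative assumptions; only the two template expressions come from the log.

```python
from datetime import datetime, timedelta

# ds_nodash is Airflow's "date, no dashes" filter: YYYYMMDD.
def ds_nodash(dt: datetime) -> str:
    return dt.strftime("%Y%m%d")

# Assuming a weekly schedule: the run that lands under .../20251123
# processes the interval 2025-11-16 .. 2025-11-23.
data_interval_start = datetime(2025, 11, 16)
data_interval_end = data_interval_start + timedelta(days=7)

# Hive partition key, per `snapshot = '{{ data_interval_start | ds_nodash }}'`
snapshot = ds_nodash(data_interval_start)  # '20251116'

# Public export path, per '{props.public_dumps_location}/{{{{ data_interval_end | ds_nodash }}}}'
# (public_dumps_location is a placeholder for the real property value)
public_dumps_location = "https://dumps.wikimedia.org/other/cirrus_search_index"
export_path = f"{public_dumps_location}/{ds_nodash(data_interval_end)}"  # .../20251123

print(snapshot)      # what you'd look up in the data lake
print(export_path)   # what users see on the public site
```

The same run thus shows up under two different dates, one interval apart, which is exactly the translation step mentioned at 15:27:11 and tracked in T411107.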
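
The "banned cirrussearch1112, let it drain" step at 18:29:58 refers to the standard Elasticsearch/OpenSearch allocation-exclusion pattern. A hedged sketch of the underlying API call follows; Wikimedia's actual tooling (SRE cookbooks) wraps this, and the cluster URL here is a placeholder.

```python
import requests

CLUSTER = "https://localhost:9200"  # placeholder, not the production endpoint

def set_node_ban(node_name: str | None) -> None:
    """Exclude node_name from shard allocation so shards drain off it;
    pass None to clear the exclusion and let shards rebalance back."""
    resp = requests.put(
        f"{CLUSTER}/_cluster/settings",
        json={"transient": {"cluster.routing.allocation.exclude._name": node_name}},
        timeout=30,
    )
    resp.raise_for_status()

set_node_ban("cirrussearch1112")  # ban: shards relocate off the node
# ... wait for relocation to finish (e.g. watch _cat/shards or cluster health) ...
set_node_ban(None)                # unban: the node takes shards again
```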
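
On the 21:28:48 remark: the env var in question appears to be `DISABLE_SECURITY_PLUGIN`, which the official OpenSearch Docker image's entrypoint honors to start without the security plugin despite what the docs imply. A sketch using the docker Python SDK, assuming that env var; the image tag and port mapping are illustrative, not anything from production.

```python
import docker

client = docker.from_env()
client.containers.run(
    "opensearchproject/opensearch:2.11.1",  # illustrative tag
    detach=True,
    environment={
        "discovery.type": "single-node",
        "DISABLE_SECURITY_PLUGIN": "true",      # assumed entrypoint flag: skip the security plugin
        "DISABLE_INSTALL_DEMO_CONFIG": "true",  # skip demo certs/users
    },
    ports={"9200/tcp": 9200},
)
# The cluster then answers plain http://localhost:9200 with no auth,
# i.e. an insecure cluster, contrary to the quoted Dashboards docs.
```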