WEBVTT

00:00.000 --> 00:18.440
Welcome, everybody, to my talk, Bridging the Gap from WordPress to Weblate. As always,

00:18.440 --> 00:24.800
a few words about myself first. My name is Niklas, I'm tech lead at alugha.com, which

00:24.800 --> 00:30.000
is a multilingual video platform and dubbing tool suite. I personally

00:30.000 --> 00:34.280
consider myself a Rustacean and Nix enthusiast, so this topic might seem a bit out of place

00:34.280 --> 00:40.800
from my usual conference topics. I do like computer graphics, interactive media, pen and

00:40.800 --> 00:45.960
paper role-playing games, actually what I missed here for this room. I do like natural languages,

00:45.960 --> 00:51.240
so I'm fluent in English, my mother tongue is German, I speak some Spanish,

00:51.240 --> 00:56.240
[continues in further languages, not captured in this transcript]

00:56.240 --> 01:03.240
So you can find me on the web at korz.dev, and I'm also on the Fediverse

01:03.240 --> 01:09.520
as niklas at korz.dev, where I'm also personally one of the admins running

01:09.520 --> 01:14.640
this instance. And now that we have covered that part: this is also my first

01:14.640 --> 01:19.720
FOSDEM, and as of a few months ago, my first time helping out with a devroom. Thanks

01:19.720 --> 01:27.320
for taking over the A/V, so I can give this talk. Yeah, so what can you expect from this talk?

01:27.320 --> 01:31.560
This is not a mature open-source project that you can grab and throw at your WordPress

01:31.560 --> 01:36.760
instance and then have it become multilingual. It's mostly about sharing a fun

01:36.760 --> 01:41.600
little story, a bit of hacker spirit, how I solved the translations for our own

01:41.600 --> 01:47.400
WordPress-powered marketing and product pages, and yeah, it involves three free and

01:47.400 --> 01:51.240
open-source projects: WordPress, Elementor, which is based on WordPress, and

01:51.240 --> 01:56.800
Weblate. We'll start with the motivation for this project, then we'll talk a

01:56.800 --> 02:01.640
bit about what the initial idea was and what solution we arrived at eventually, how

02:01.640 --> 02:06.840
I implemented it, and what conclusions I derived from having finished this project

02:06.840 --> 02:12.440
over roughly the past one and a half years, which was not a full-time job

02:12.440 --> 02:17.520
but now and then adjusting it and hacking on it. So first of all, why WordPress for

02:17.520 --> 02:23.120
our product marketing pages at all? Our company has existed for the past 12

02:23.120 --> 02:28.200
years at this point, and for most of that time we were fine implementing our whole

02:28.200 --> 02:33.080
marketing and product pages directly, as a collaboration between marketing, designers

02:33.080 --> 02:38.080
and developers. But as all of us probably know, development time is expensive,

02:38.080 --> 02:43.880
so needing this feedback loop back and forth between marketing, designers and developers

02:43.880 --> 02:49.240
takes a lot of time that we would rather spend on our core product itself. So we decided

02:49.240 --> 02:53.680
two years ago that we would move this process to WordPress, because our marketing team

02:53.680 --> 02:58.280
was familiar with it already, our new designer was familiar with it already, and yeah, they

02:58.280 --> 03:03.560
can just adjust it, add text, edit text, without involving the development

03:03.560 --> 03:09.200
team. I'd like to note that this decision was made before the whole WordPress versus

03:09.200 --> 03:16.240
WP Engine thing a bit over a year ago, so I'm not here to say, use WordPress for

03:16.240 --> 03:21.040
everything. I'm just saying, it worked out for us, it's still working out for us. If

03:21.040 --> 03:26.440
you know a better solution, then just go with it, no hard feelings. Why Weblate?

03:26.440 --> 03:30.840
There are, of course, existing WordPress plugins to add localization and internationalization

03:30.840 --> 03:37.680
to WordPress CMS pages, namely WPML, Polylang, and GTranslate, and we did try them out,

03:37.680 --> 03:43.040
but the whole workflow felt unfamiliar to our in-house translators and our freelancers,

03:43.040 --> 03:46.280
who are much more used to Weblate already, because we're using it for our product

03:46.280 --> 03:53.800
translations. So onboarding for Weblate was unnecessary, while we would have had to

03:53.800 --> 03:58.480
give each of our translators more training: how do you properly use

03:58.480 --> 04:03.760
this tool, how are you kept in the loop when new translations are required,

04:03.760 --> 04:07.360
whereas with Weblate, they just open our Weblate instance and already know, okay, these

04:07.360 --> 04:12.680
are the strings I have to translate, or check, or edit, whatever. Also, Weblate has

04:12.680 --> 04:18.320
everything in one place: it allows us to easily reuse existing translations we had

04:18.320 --> 04:22.880
already for new, similar strings through translation memory. You can define a glossary, which

04:22.880 --> 04:27.240
is really nice, and also, of course, it integrates with automated translation APIs, so we

04:27.240 --> 04:32.160
can have an initial automated translation, and then go over to our in-house

04:32.160 --> 04:36.520
translators and say, these are the translations that were automatically generated,

04:36.520 --> 04:41.120
check them for correctness, precision, whatever.

04:41.120 --> 04:49.240
Okay, so, of course, the final target is to get every translatable string on our CMS pages

04:49.240 --> 04:54.120
into the Weblate instance. How do you do that? We have quite

04:54.120 --> 04:59.320
a bit of text, marketing people like to talk, and that's a lot of stuff to

04:59.320 --> 05:04.240
translate. And also, as opposed to UI translation, where most of the time you can just

05:04.240 --> 05:10.360
say, I have one string that interpolates values, marketing people always reinvent the whole

05:10.360 --> 05:14.120
text, so you basically have to translate it completely anew.

05:14.120 --> 05:20.640
So, Weblate can handle a lot of formats. In our case, the WordPress database is not something

05:20.640 --> 05:25.480
you can just plug into Weblate, so you need to find an intermediary format, and most of the

05:25.480 --> 05:29.920
time, when you have an automated pipeline, JSON is the ideal format for that, and

05:29.920 --> 05:32.120
Weblate can handle JSON just fine.

05:32.120 --> 05:37.160
So, that means we need to get JSON message files out of our WordPress instance, so we can

05:37.160 --> 05:38.160
feed them into Weblate.

05:38.160 --> 05:40.480
How do you do that?

05:40.560 --> 05:45.920
Roughly speaking, we said, okay, we want one content block in the page to equal one string in

05:45.920 --> 05:49.240
our JSON file, so one message in our Weblate instance.

05:49.240 --> 05:53.960
That's a bit of an overgeneralization, since some block types can have multiple strings:

05:53.960 --> 06:00.960
lists can have a string for each list item, or if you have mixed content, if there's

06:00.960 --> 06:04.140
some other HTML that's a bit more complex in between, then you might also have multiple

06:04.140 --> 06:10.060
strings for a single widget type, but that's, at this point, an implementation detail.

06:10.060 --> 06:15.460
Then from Weblate we can again export JSON, and we have to feed that back into our pipeline

06:15.460 --> 06:20.500
and somehow get that into the WordPress page again, to serve the localized

06:20.500 --> 06:21.500
version.

06:21.500 --> 06:27.300
So, in-house we don't have much experience with PHP or WordPress plugins.

06:27.300 --> 06:31.140
It's not zero experience, but it's little enough that we said, okay, we're not going

06:31.140 --> 06:35.540
to develop this as a WordPress plugin. But I personally have a lot of experience with static

06:35.540 --> 06:39.900
page builders, and this is not much different from a static page builder, except that you don't

06:39.900 --> 06:46.660
take Markdown or an SQLite database as input, but a WordPress HTML page.

06:46.660 --> 06:52.100
I will briefly mention why we didn't go with the WordPress REST API in a moment.

06:52.100 --> 06:57.460
Now the problem is, if you want to have one message per content type or content

06:57.460 --> 07:02.020
block inside the page, you need to be able to uniquely identify the content blocks of the

07:02.020 --> 07:03.020
page.

07:03.020 --> 07:07.340
You need a message identifier that the JSON and then Weblate can use to say, this is the same

07:07.340 --> 07:10.420
message, when it has changed in the original language and then potentially has to be adjusted

07:10.420 --> 07:12.020
in the target language.

07:12.020 --> 07:16.780
Unfortunately, the go-to solution for WordPress, which is the Gutenberg editor, does not

07:16.780 --> 07:18.980
have stable unique IDs.

07:18.980 --> 07:24.140
You can check out these two issues if you want to see a very lengthy discussion about that,

07:24.140 --> 07:28.900
and you will unfortunately also arrive at the conclusion that unique stable IDs will never

07:28.900 --> 07:30.380
come for Gutenberg blocks.

07:30.380 --> 07:32.660
That was very sad for me.

07:32.660 --> 07:35.820
Yeah, so what can you do instead?

07:35.820 --> 07:40.980
You could implement custom block types and then say, okay, this is an attribute on my custom

07:40.980 --> 07:46.620
block that is just for the message ID, and then have your marketing people or designers insert

07:46.620 --> 07:50.420
the message identifier, or generate a random one, so it's stored inside the block

07:50.420 --> 07:52.420
and remains stable.

07:52.420 --> 07:56.820
That's problematic if you're copying pages, which our marketing people are doing

07:56.820 --> 07:58.740
a lot, because they don't want to start from scratch, right?

07:58.740 --> 08:03.060
They have a template, so it looks somewhat familiar across all product pages, and

08:03.060 --> 08:09.180
then they just replace some blocks, edit some details, et cetera, and then you need different

08:09.180 --> 08:13.380
message identifiers for the blocks where they changed the entire text.

08:13.380 --> 08:19.340
Also, you can't do the same for the built-in blocks of Gutenberg, which means you

08:19.340 --> 08:22.860
have to reinvent the wheel if you want to go down this path.

08:22.860 --> 08:25.420
So we said, okay, what else could we do?

08:25.420 --> 08:30.540
We could just say, what's the one thing that stays stable for a message?

08:30.540 --> 08:32.300
It's the hash of it.

08:32.300 --> 08:35.900
We didn't stay with this, but for a moment, that seemed like the only solution we had:

08:35.900 --> 08:41.980
to take the SHA-256 hash of the message, and then use that as an identifier in

08:41.980 --> 08:42.980
Weblate.

08:42.980 --> 08:46.340
But that comes with a downside: whenever the text is edited, that's considered a completely

08:46.340 --> 08:48.460
new message inside Weblate.

08:48.460 --> 08:53.980
Now translation memory makes this somewhat okay, because maybe Weblate is able to

08:53.980 --> 08:58.580
still say, okay, someone fixed a typo in the original English text, but this is still

08:58.580 --> 09:01.420
the translation that should be assigned to this message.
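
NOTE
A minimal sketch of the hashing fallback described above, assuming that only the message text is hashed and that Node's built-in crypto module is available (Deno exposes it through the node: prefix). The helper name hashMessageId and the sample strings are hypothetical and not taken from the talk's code.
```typescript
import { createHash } from "node:crypto";
// Hypothetical helper: the message ID is the SHA-256 hex digest of the source text.
function hashMessageId(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}
const original = hashMessageId("Reach a global audiance.");
const typoFixed = hashMessageId("Reach a global audience.");
// Unchanged text keeps its ID, but even a one-letter fix yields a different ID,
// which is why an edited string shows up as a brand-new message.
console.log(original === hashMessageId("Reach a global audiance."), original === typoFixed);
```
The trade-off is the one named in the talk: the ID needs no cooperation from the editor, but it does not survive any edit to the text.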

09:01.420 --> 09:07.540
But at this point, another thing came up: our designer was a lot more comfortable with

09:07.540 --> 09:09.260
the Elementor website builder.

09:09.260 --> 09:13.660
Elementor is, theoretically speaking and also practically speaking, free and open-source software:

09:13.660 --> 09:18.140
it's on GitHub and you can get it under the GPLv3. They have a pro version, but

09:18.220 --> 09:22.700
of course, GPL and commercial software do not necessarily cancel each other out: you

09:22.700 --> 09:25.580
can get a license, and then you're free to do with the code whatever you want. They're

09:25.580 --> 09:30.140
just not going to put the pro version's source code on their website themselves to download

09:30.140 --> 09:31.660
without buying it.

09:31.660 --> 09:38.620
Now Elementor widgets, which is what they call their Gutenberg block equivalent,

09:38.620 --> 09:43.260
do have stable IDs, and that made localization of these pages a lot easier. And also

09:43.260 --> 09:47.500
you can attach custom data: inside the Elementor website builder, they have

09:47.500 --> 09:52.220
a section for dataset attributes, and if you've ever coded plain HTML, you might know

09:52.220 --> 09:57.340
the data-dash attributes, where you can assign whatever name you want, put in whatever

09:57.340 --> 10:02.220
value you want, and then do with that what you want.

10:02.220 --> 10:07.500
So a bit more about what these identifiers for blocks look like. This is one part of

10:07.500 --> 10:12.220
our landing page, so we have four text widgets on here that you can see and that we want

10:12.220 --> 10:17.420
to translate. Namely, we have a headline, and we have something called a text editor

10:17.420 --> 10:24.060
widget below it, and if we look at the HTML source code of that, we can see they also

10:24.060 --> 10:29.020
expose these things directly in the HTML you get from WordPress as dataset attributes.

10:29.980 --> 10:35.260
So we have this div: every Elementor widget is wrapped in a div for the widget itself.

10:35.980 --> 10:41.740
For semantic HTML purists that might hurt a bit, but for our case it was completely

10:41.740 --> 10:47.340
fine. So we can get the ID from this widget, we can see what the widget type is,

10:47.420 --> 10:50.860
and now if we want to search for everything on the page that's translatable,

10:50.860 --> 10:56.300
we first parse the HTML, get all the elements that have this element type widget, and then filter

10:56.300 --> 11:03.180
for the widget types we want to translate. Of course, the localized part, the part we want to

11:03.180 --> 11:08.300
replace and translate, is not directly the element itself, so we still have to do a bit of

11:08.300 --> 11:12.700
descent until we hit the text node to translate, but I will say a bit more about that in a moment

11:12.700 --> 11:24.060
in the implementation part. How did we implement this? Originally, we implemented this as an

11:24.060 --> 11:31.180
on-the-fly server in Rust, because we said, well, maybe people aren't accessing specific pages,

11:31.180 --> 11:36.220
and even then we can just cache them on the fly. But eventually we recognized that we were updating

11:36.300 --> 11:43.820
the implementation to work around certain Elementor issues much more than we were changing the pages

11:43.820 --> 11:50.380
themselves. It was easier to say, we go with TypeScript so we can easily exchange the details

11:50.380 --> 11:54.780
of this pipeline, and then run this as a CI pipeline whenever the pages were updated,

11:54.780 --> 11:59.580
and then deploy to our web space, have a CDN in front, etc. Still, this relies a lot

11:59.580 --> 12:04.700
on the Rust ecosystem, because we're using Deno, and we're using the HTML parser and the CSS parser

12:04.780 --> 12:10.700
from the Servo project, and yeah, that made things a lot easier. Now, I already mentioned

12:10.700 --> 12:16.300
we have a fixed list of translatable widget types. I will actually show what that means.

12:17.340 --> 12:21.580
If I go here, this is the source code. As I will mention at the end, I'm

12:21.580 --> 12:26.060
planning to upload a more generalized version of it. I didn't get to it before FOSDEM,

12:26.060 --> 12:35.180
unfortunately, I'm sorry. But up here we have a list of translatable widgets.

12:35.180 --> 12:40.620
So, headings, that is any kind of headline, level one to three, we want to translate. Text blocks,

12:40.620 --> 12:46.460
which are mostly paragraphs, but not only paragraphs, we want to translate. There are also

12:46.460 --> 12:52.300
some buttons we have to translate. Those are a bit odd, because they have a very convoluted

12:52.300 --> 12:58.060
content path, so that has some special handling, but overall, that worked out well enough.

12:59.340 --> 13:05.180
Okay, so the pipeline itself starts with some pre-processing: we have to rewrite URLs

13:05.180 --> 13:10.060
for where we're deploying it, download the assets from WordPress, maybe modify the CSS,

13:10.060 --> 13:15.580
and also adjust the CSS a bit, as I will show on the next slide. We want to extract the messages

13:15.580 --> 13:20.140
from the WordPress site so we can generate a JSON file for Weblate, we want to translate it

13:20.220 --> 13:24.540
inside Weblate, so that's the part where a bit of automatic translation followed by human

13:24.540 --> 13:29.580
quality checks comes in, and then we want to export it again from Weblate to JSON and feed that

13:29.580 --> 13:35.580
into our pipeline to generate the final localized version of it. For the pre-processing,

13:35.580 --> 13:39.980
there are some things that are very specific to our setup, so they're not particularly interesting

13:39.980 --> 13:46.140
for localization itself. For example, we don't want to just pull in the whole WordPress page and then

13:46.140 --> 13:50.620
serve that to the end user, because we're integrating that into our larger page. So that means in

13:50.620 --> 13:59.260
particular, we have this page. That's what it looks like on our WordPress instance: it doesn't

13:59.260 --> 14:04.940
have much around it. We're stripping footers, headers, etc., because our website itself already has

14:04.940 --> 14:11.900
that. Then we're feeding that into Weblate. These are, for example, the strings we have for this

14:11.900 --> 14:21.660
web page, so every text you see on this page is basically included in our

14:21.660 --> 14:29.500
Weblate instance to edit, quality-check, etc. And then we end up with this on our actual website,

14:29.500 --> 14:34.700
so that's a React shell, and in between we're embedding this static HTML that was output

14:34.700 --> 14:41.740
for our page, here for the German locale. We're also removing all JavaScript from the

14:41.820 --> 14:45.660
website. We're currently not allowing the marketing pages to embed scripts, because they

14:45.660 --> 14:50.300
would be able to access the cookies of our main product. It's still a bit of an issue for us, where

14:50.300 --> 14:54.540
we're considering actually splitting marketing from the actual product onto different domains.

14:54.540 --> 14:58.620
Anyone who knows about hostname separation for cookies has probably run into this before.

15:00.540 --> 15:05.900
And what we're also doing is wrapping all the CSS in an additional rule,

15:05.980 --> 15:11.260
so for example, if something changes the font weight for any heading, we

15:11.260 --> 15:19.260
don't want that to affect the rest of our product, right? Now, how we're doing that

15:19.260 --> 15:26.300
is through a package called Lightning CSS, which is also based on the Rust Servo CSS parser that I

15:26.300 --> 15:32.380
mentioned before. That allows you to transform basically any rule you have in the CSS stylesheet,

15:32.460 --> 15:37.180
which is very neat if you have custom things you want to do to your CSS.

15:38.780 --> 15:43.580
And finally, we're hashing all the files we download from WordPress and then appending the

15:43.580 --> 15:47.260
content hash to the file names. I know there are WordPress plugins that can do that directly

15:47.260 --> 15:52.060
inside WordPress, but at this point we already had our custom pipeline from start to end and we were

15:52.060 --> 15:56.620
like, okay, we do download the files, we can just hash them and append it, and there's no need to

15:56.620 --> 16:00.780
involve additional PHP plugins in this process that potentially slow down the rest of it.
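
NOTE
The content-hash step described above can be sketched as follows, assuming Node's built-in crypto and path modules. The helper name hashedFileName, the placement of the hash before the extension, and the 8-character hash prefix are illustrative assumptions, not details confirmed by the talk.
```typescript
import { createHash } from "node:crypto";
import { basename, extname } from "node:path";
// Hypothetical helper: "style.css" plus its bytes becomes "style.<hash>.css".
// The 8-character prefix length is an arbitrary choice for this sketch.
function hashedFileName(fileName: string, content: Uint8Array): string {
  const hash = createHash("sha256").update(content).digest("hex").slice(0, 8);
  const ext = extname(fileName);
  const stem = basename(fileName, ext);
  return `${stem}.${hash}${ext}`;
}
// The file name changes exactly when the file content changes.
console.log(hashedFileName("style.css", new TextEncoder().encode("h1 { color: red }")));
```
Because the name is derived from the bytes, a CDN in front of the site can cache such files for a long time without serving stale assets.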

16:02.380 --> 16:06.540
Okay, so how do we extract messages? I already mentioned that we're using Servo's

16:06.540 --> 16:11.580
HTML parser for that. That's neatly wrapped in a package called deno-dom, but there are also

16:11.580 --> 16:16.940
npm packages and also packages for other programming languages that can access the same parser

16:16.940 --> 16:23.900
and allow you to efficiently parse and search through an HTML DOM tree.

16:24.700 --> 16:29.900
We're looking for the Elementor widgets inside the DOM tree. I showed you the data element type widget,

16:29.900 --> 16:34.780
which we can just scan for. If you know Beautiful Soup for Python,

16:34.780 --> 16:39.500
there are similar packages for Deno and npm, where you can just insert a CSS

16:39.500 --> 16:43.900
selector, get all the elements that match the selector and then work on them.

16:43.900 --> 16:48.300
Then we're checking if that type is translatable. If yes, we proceed; then we recurse into this

16:48.300 --> 16:54.940
widget until we hit the translatable parts. At that point, I want to note that not only text nodes

16:55.020 --> 17:02.060
are translatable. Text nodes are the parts at the very end of your DOM tree

17:02.060 --> 17:06.460
that the actual text content is contained in. You also have to translate things like alt attributes

17:06.460 --> 17:11.580
and titles and labels, so you need some special casing for these, of course. And for the things

17:11.580 --> 17:17.100
where widgets are not sufficient, or we can't detect the actual text content inside of the page,

17:17.100 --> 17:20.860
we have a fallback where we just assign a dataset attribute with a custom message identifier.

17:20.860 --> 17:26.940
So that's really the last resort: if we see during testing that a message didn't end up

17:26.940 --> 17:31.980
in Weblate, we look into why that is, assign the custom message identifier, and then we're done with it.
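
NOTE
The scan-and-filter step described above can be sketched without a DOM library. The talk itself uses deno-dom with CSS selectors, so the regular expression below is a stand-in chosen only to keep the example self-contained, and it is not a general HTML parser. The attribute names data-id, data-element_type and data-widget_type are assumed from typical Elementor markup, and the allowlist contents are illustrative.
```typescript
interface WidgetRef { id: string; widgetType: string }
// Hypothetical allowlist; the talk mentions headings, text editors and buttons.
const TRANSLATABLE = new Set(["heading", "text-editor", "button"]);
// Reads one double-quoted attribute from an opening tag, if present.
function attr(tag: string, name: string): string | undefined {
  const m = tag.match(new RegExp(`\\s${name}="([^"]*)"`));
  return m ? m[1] : undefined;
}
function findTranslatableWidgets(html: string): WidgetRef[] {
  const widgets: WidgetRef[] = [];
  for (const m of html.matchAll(/<div\b[^>]*>/g)) {
    const tag = m[0];
    if (attr(tag, "data-element_type") !== "widget") continue;
    const id = attr(tag, "data-id");
    // "heading.default" becomes "heading": the suffix after the dot is dropped.
    const widgetType = attr(tag, "data-widget_type")?.split(".")[0];
    if (id && widgetType && TRANSLATABLE.has(widgetType)) widgets.push({ id, widgetType });
  }
  return widgets;
}
const sample =
  '<div data-id="1a2b3c" data-element_type="widget" data-widget_type="heading.default"><h2>Hi</h2></div>' +
  '<div data-id="9z8y7x" data-element_type="widget" data-widget_type="image.default"></div>';
console.log(findTranslatableWidgets(sample));
```
A real pipeline would then descend from each matched wrapper to the text nodes and translatable attributes, which is the part a proper DOM parser makes much easier.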

17:33.340 --> 17:38.060
Now that we've extracted the JSON file, I'll actually show what that looks like.

17:39.980 --> 17:44.140
This is what we get from our pipeline for the page we had just now, so that's showing the page

17:44.140 --> 17:49.500
slug, then the widget type and the unique ID that Elementor has assigned internally,

17:49.580 --> 17:55.740
and then we're just passing that to Weblate, which we saw here, and can edit them in Weblate.
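
NOTE
A minimal sketch of the message file described above, assuming a flat JSON object that maps one key per message to its source text. The talk names the three key components (page slug, widget type, Elementor ID); the slash separator, the field names and the sample values are hypothetical.
```typescript
interface ExtractedMessage { pageSlug: string; widgetType: string; widgetId: string; text: string }
// Hypothetical key layout "<slug>/<widget type>/<widget id>"; the separator is an assumption.
function messageKey(m: ExtractedMessage): string {
  return `${m.pageSlug}/${m.widgetType}/${m.widgetId}`;
}
// Builds a flat key-to-text map and serializes it as pretty-printed JSON.
function toMessageFile(messages: ExtractedMessage[]): string {
  const entries: Record<string, string> = {};
  for (const m of messages) entries[messageKey(m)] = m.text;
  return JSON.stringify(entries, null, 2);
}
console.log(toMessageFile([
  { pageSlug: "landing", widgetType: "heading", widgetId: "1a2b3c", text: "Videos for everyone" },
]));
```
Since the key does not depend on the text, editing the English source changes the value but keeps the key, so the existing translation stays attached to the message.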

17:56.460 --> 18:00.780
So now we are done with the automatic translations, our team has looked over the messages,

18:00.780 --> 18:04.140
made some corrections and said, this is fine, we can deploy this.

18:05.100 --> 18:09.100
We need to somehow feed these messages back into our WordPress pipeline.

18:09.820 --> 18:15.980
That has to use the same algorithm to detect which content nodes should be translated,

18:15.980 --> 18:20.460
because, of course, you need to replace the same node, the same text nodes,

18:20.460 --> 18:24.940
with the same ID, the same message that's assigned to the ID, or the same alternative text,

18:24.940 --> 18:29.660
et cetera, so that's all happening in one pass. And then we get the translations from

18:29.660 --> 18:35.420
the JSON and insert them in the position of the node we want to replace. We're doing that

18:35.500 --> 18:51.740
at the end, yes, but: does it involve placeholder attributes? That's what I meant by special

18:51.740 --> 18:56.460
handling, so for example, alt attributes, label attributes, etc. That needs some

18:56.460 --> 19:03.180
special rules to say, input types have placeholders, or title attributes can be on

19:03.260 --> 19:07.740
any element. Title attributes are something we don't handle at the moment, but for example,

19:07.740 --> 19:13.900
alt attributes on images are something that we just add, with a special message

19:13.900 --> 19:24.140
identifier, into the JSON. Yes? I will have to think about that at the end, or we can speak

19:24.140 --> 19:32.380
privately, yes. Also, we realized that we need some inline HTML. For example, some languages don't

19:32.460 --> 19:38.380
use italic text, or cursive text, I think. In Arabic you don't do that usually,

19:38.380 --> 19:43.580
but the main English version, the reference version, has this cursive text. And then the

19:43.580 --> 19:47.260
translator can say, okay, this has a b tag, I remove that from the translation, but other

19:47.260 --> 19:52.700
translations might want to keep that. Something else we realized: we originally tried to

19:52.700 --> 19:58.700
split the strings at the boundaries of the HTML elements, and then we had text node,

19:58.780 --> 20:03.660
bold text, text node. That of course assumes that you have the same order of these words

20:03.660 --> 20:07.580
in the translation, which is not the case. So you might start with the bold text and then the

20:07.580 --> 20:12.940
rest is the normal text. So instead we said, okay, formatting should be allowed inside the messages,

20:12.940 --> 20:17.980
we take that as one thing, and then we sanitize the messages we get back from Weblate to allow

20:17.980 --> 20:22.940
only formatting and also, in some cases, anchor links, because you might want to link to a different

20:22.940 --> 20:31.820
version of a page depending on the locale you translate to. Yeah, so that already lets us arrive at the

20:31.820 --> 20:39.180
conclusion. My conclusion from this project is if it's HTML, you can translate it. It really

20:39.180 --> 20:44.780
mostly doesn't matter where you got that HTML from. And if it generates and consumes a map of

20:44.780 --> 20:50.620
messages, you can use Weblate. You just need to get from step one to step two. Ideally,

20:50.620 --> 20:56.140
you have text-only nodes, and you know how to stably and uniquely identify them. The next best thing

20:56.140 --> 21:00.380
is you know how to reliably find and replace the text inside these nodes, that's basically what we

21:00.380 --> 21:05.740
had to do in the end. And if you don't have a stable ID, you could fall back to the

21:05.740 --> 21:10.940
idea of hashing the original message, but it's not ideal, so maybe try to find something that

21:10.940 --> 21:18.220
uniquely and stably identifies the text first. Also, the source code: as I mentioned, I didn't manage to get a

21:18.300 --> 21:24.060
cleaned-up version of it up and running on Codeberg for today, but I already created the repository.

21:24.060 --> 21:31.420
If you want, you can star or bookmark it, and then it should arrive sometime next week, and you can

21:31.420 --> 21:38.460
get a better idea of how this is implemented. Yeah, and with that we arrive at the end of the

21:38.460 --> 22:00.460
talk, and I'm happy to answer some more questions. How do they work around the lack of

22:00.540 --> 22:08.540
a stable ID in Gutenberg? So, how do other WordPress translation plugins work around the

22:08.540 --> 22:18.460
issue of not having a stable ID in Gutenberg? The Gutenberg blocks are somehow stored

22:18.460 --> 22:23.980
in the MySQL database, so they do internally have an ID, but there are no plans to expose them.

22:23.980 --> 22:28.460
Now, as a WordPress plugin, that's not that much of an issue if you know where to look,

22:28.540 --> 22:33.500
so what these translation plugins do is they hook directly into the WordPress database

22:33.500 --> 22:37.980
to identify these blocks, but that wasn't an option for us because we're not too deeply

22:37.980 --> 22:39.980
involved in WordPress plugin development.

22:58.460 --> 23:27.420
So the question is, or it's starting with a statement, like you can uniquely identify a web page by its

23:27.660 --> 23:34.780
URL, or in this case its WordPress page name or path, and then it's unlikely that

23:34.780 --> 23:41.740
the same text appearing twice on the same page would have a different translation. So that's not the

23:41.740 --> 23:48.140
issue I wanted to get to. The problem is, you have the original English version as

23:48.140 --> 23:53.020
your reference, right? And then someone might change the words, but the meaning doesn't change,

23:53.100 --> 24:00.060
or you might change punctuation, fix a typo, etc. Now

24:00.060 --> 24:06.140
the hash of that string changes, so it's an entirely new message inside of the JSON export.

24:06.700 --> 24:12.060
Now as I mentioned, in Weblate you have this thing called translation memory, so you can still say,

24:12.060 --> 24:16.300
up to a certain similarity, it should automatically assign the translation.

24:16.300 --> 24:21.580
That of course helps working around that, but we also don't want to put the threshold

24:21.580 --> 24:29.180
too low for translation memory. If we have very similar sentences that actually mean something a

24:29.180 --> 24:33.740
bit different because one word changed, then we don't want to automatically assign the

24:33.740 --> 24:39.980
translation of a similar structure. Now with the unique ID directly on the widget that has the

24:39.980 --> 24:44.780
content, we can say this is the same content widget, even if one word inside of it changed. So

24:44.860 --> 24:51.660
it's probably okay that the translation is unchanged, but still our translators are notified

24:51.660 --> 24:56.460
that this string has a change in the original message: please double-check whether the translation

24:56.460 --> 25:06.700
is correct. If the content isn't changing... so the question is,

25:06.700 --> 25:13.420
if the content is stable, do you still need this? Probably not, but it's easier to work with.

25:14.140 --> 25:18.380
If you say, I'm translating the HTML page in one go and then it never changes,

25:19.180 --> 25:23.980
that's probably fine, depending on what you do. If you're absolutely sure

25:23.980 --> 25:27.580
you're not going to edit this in the next two years, then yes, sure, that works.

25:37.580 --> 25:41.180
As I said, this repository is currently empty and I'm going to upload it next week.

25:43.420 --> 25:57.820
Then thank you again, a lot, for listening and for having me today.