WEBVTT

00:00.000 --> 00:13.600
All right, so our next talk is going to be "Breaking Architecture Barriers: Running

00:13.600 --> 00:21.000
x86 Apps and Games on ARM" by Tony Wasserka. I hope I'm saying that right.

00:21.000 --> 00:22.000
Thank you.

00:22.000 --> 00:23.000
Thank you.

00:23.000 --> 00:27.000
Bonjour à tous, and welcome to my talk.

00:28.000 --> 00:31.000
We don't have a lot of time, so I was just going to bore you really quickly about

00:31.000 --> 00:32.000
who I am.

00:32.000 --> 00:36.000
I come from a couple of different corners.

00:36.000 --> 00:40.000
One of the first, oh, quick start again.

00:40.000 --> 00:41.000
There we go.

00:41.000 --> 00:46.000
One of the first things in the community that I've been doing is working on video game

00:46.000 --> 00:47.000
console emulators.

00:47.000 --> 00:52.000
So if you've ever tried to play GameCube, Wii, 3DS,

00:52.000 --> 00:56.000
or PSP games on a computer, probably some of my work was involved there.

00:57.000 --> 01:03.000
I really enjoyed just the feeling of being able to play video games on systems that

01:03.000 --> 01:04.000
they were never designed for.

01:04.000 --> 01:09.000
Whenever I first saw this kind of thing happen in practice,

01:09.000 --> 01:13.000
it was always mind blowing to me and I always wanted to understand how these things work,

01:13.000 --> 01:15.000
and so I just dove into the scene.

01:15.000 --> 01:19.000
But professionally, for the longest time, it looked like I was doing other things.

01:19.000 --> 01:23.000
I was actually studying physics, up to my master's degree.

01:24.000 --> 01:26.000
Good stuff to be had there.

01:26.000 --> 01:29.000
Turns out more pictures.

01:30.000 --> 01:32.000
So some cute pictures come out of there.

01:32.000 --> 01:36.000
Some cool maths, like I really liked the theoretical stuff, really good challenges,

01:36.000 --> 01:39.000
but at the end of the day, it didn't really work out for me.

01:39.000 --> 01:42.000
So I ended up working on graphics drivers,

01:42.000 --> 01:46.000
so making actual hardware work for rendering video games,

01:46.000 --> 01:50.000
specifically in terms of Vulkan drivers.

01:50.000 --> 01:56.000
So if you have used an Imagination GPU or an AMD GPU on Linux,

01:56.000 --> 02:00.000
chances are you were using some of my work as well.

02:00.000 --> 02:05.000
But today we're going to talk about a different fun challenge, and that is FEX.

02:05.000 --> 02:11.000
And I heard there's a couple of people who like trains here, so I brought this little picture.

02:11.000 --> 02:12.000
There we go.

02:12.000 --> 02:16.000
The airport express from Berlin to the airport,

02:16.000 --> 02:19.000
but sadly it's not what we're going to talk about today.

02:19.000 --> 02:22.000
Instead it's going to be about boring emulator stuff.

02:22.000 --> 02:24.000
Sorry.

02:24.000 --> 02:27.000
So actually a little bit about FEX itself.

02:27.000 --> 02:31.000
So the project was started about seven years ago by Ryan Houdek.

02:31.000 --> 02:34.000
I think that's how you pronounce his name.

02:34.000 --> 02:39.000
Seven years ago, I personally joined the project four years ago,

02:39.000 --> 02:42.000
so it's been a while as well.

02:42.000 --> 02:44.000
We've had our ups and downs.

02:44.000 --> 02:48.000
It's been a fun ride, but we're really getting somewhere nowadays.

02:48.000 --> 02:52.000
So yeah, what's actually involved in making x86 emulation work?

02:52.000 --> 02:57.000
So the task that we have is we have a whole software library,

02:57.000 --> 03:00.000
like from office software to other proprietary programs,

03:00.000 --> 03:04.000
but in particular games, like people want to have games on their hardware.

03:04.000 --> 03:08.000
Always sounds like a fun little thing, but at the end of the day,

03:08.000 --> 03:12.000
this is what keeps people away from switching from Windows or macOS to Linux.

03:12.000 --> 03:17.000
It's that they can't play their games on their machines that they could use for other software.

03:17.000 --> 03:21.000
So we want to have a solution for that, and that's why we want to have a translation layer

03:21.000 --> 03:27.000
that adapts existing software automatically for ARM devices.

03:27.000 --> 03:32.000
And yeah, at the lowest level, what you have here is the CPU.

03:32.000 --> 03:39.000
And you can sort of build a layer diagram out of this.

03:39.000 --> 03:43.000
If my slides are cooperating, yes, thank you.

03:43.000 --> 03:47.000
On top of the CPU hardware sits the operating system kernel,

03:47.000 --> 03:50.000
which takes care of all sorts of things from memory management,

03:50.000 --> 03:54.000
threading, process management, making sure, for example,

03:54.000 --> 03:58.000
that your video game that is running cannot read your browser passwords at the same time.

03:58.000 --> 04:00.000
That would be kind of undesirable.

04:00.000 --> 04:07.000
On top of the Linux kernel lives a set of libraries, from your standard language

04:07.000 --> 04:12.000
runtimes like the C runtime, the Java, the JVM,

04:12.000 --> 04:15.000
the .NET runtime, for example.

04:15.000 --> 04:20.000
Next to that we have graphics APIs to make things actually come up on screen.

04:20.000 --> 04:25.000
And of course, we have the actual game engines like Unity, Unreal, you name it.

04:25.000 --> 04:27.000
And at the top of all of these, you have the actual game,

04:27.000 --> 04:30.000
like the Kit and Adventure, for example.

04:30.000 --> 04:35.000
And we're looking at the stack, and we're tasked with the problem,

04:35.000 --> 04:39.000
every single element of this needs some sort of equivalent

04:39.000 --> 04:41.000
in our translation layer.

04:41.000 --> 04:46.000
But let's start, let's start at the base.

04:46.000 --> 04:51.000
We're just replacing our physical x86 CPU with an ARM CPU.

04:51.000 --> 04:57.000
Well, the standard approach to what we can do is a binary recompiler.

04:57.000 --> 05:03.000
So we take all of the instructions of our program and map them from x86 to ARM.

05:03.000 --> 05:07.000
So it's not simple, but let's actually look at what that looks like in detail.

05:07.000 --> 05:15.000
So if we look at some example x86 code, this is just the...

05:15.000 --> 05:18.000
Where's my mouse?

05:18.000 --> 05:20.000
This is just a simple toy example

05:20.000 --> 05:23.000
with some basic instructions that I came up with.

05:23.000 --> 05:27.000
The details don't really matter all that much, but conceptually what we're doing here

05:27.000 --> 05:31.000
is we're iterating through all of these instructions.

05:31.000 --> 05:37.000
One by one, and we're emitting corresponding...

05:37.000 --> 05:40.000
corresponding.

05:40.000 --> 05:42.000
I'm going on side.

05:47.000 --> 05:48.000
There we go.

05:48.000 --> 05:50.000
So we're using an intermediate representation.

05:50.000 --> 05:54.000
Because we don't, like ultimately we want to have ARM code,

05:54.000 --> 05:59.000
but for reasons that are kind of technical, it's more convenient for us to actually generate

05:59.000 --> 06:05.000
a lot of garbage in terms of an intermediate language that represents the semantics that we want.

06:05.000 --> 06:12.000
It blows up a little, but this allows us to do sort of optimization passes on top of the whole thing.

06:12.000 --> 06:16.000
So the garbage that we generate from every single one of these instructions

06:16.000 --> 06:21.000
gets compressed down to something more legible.

06:21.000 --> 06:26.000
I'm trying to find my mouse pointer, but that was actually...

06:27.000 --> 06:29.000
My mouse pointer, but I...

06:29.000 --> 06:30.000
Cool.

06:30.000 --> 06:32.000
The next thing.

06:32.000 --> 06:39.000
So what happens after we kind of blow up the original x86 code into our intermediate representation,

06:39.000 --> 06:41.000
and we then optimize it down.

06:41.000 --> 06:46.000
That's the point where we can turn the remaining instructions back into ARM instructions.

06:46.000 --> 06:50.000
And that is in principle code that is runnable on any machine.

06:50.000 --> 06:54.000
Now, small problem here, it's not always that easy.

06:54.000 --> 07:03.000
So if you look at a slightly more involved example, there we go.

07:03.000 --> 07:07.000
So just a simple loop example in this case.

07:07.000 --> 07:11.000
You can see the ARM code on the right side is actually blowing up quite significantly,

07:11.000 --> 07:13.000
so it doesn't even fit on the slide anymore.

07:13.000 --> 07:16.000
So there's a couple of reasons for that.

07:16.000 --> 07:19.000
Some of it is just inherent translation overhead.

07:19.000 --> 07:22.000
So every time we enter or exit one of these blocks,

07:22.000 --> 07:24.000
it's just the thing that happens.

07:24.000 --> 07:29.000
We need to emit some boilerplate code just to deal with the fact that we're not the original code.

07:29.000 --> 07:35.000
We're inside a sort of pseudo-sandbox, and we need to enter and exit that sandbox every now and then.

07:35.000 --> 07:40.000
And that's in particular where all of the garbage at the very bottom comes from.

07:40.000 --> 07:45.000
But also if you look at something like this decrement instruction,

07:45.000 --> 07:49.000
which just takes the value that is given on its right,

07:49.000 --> 07:51.000
and it subtracts one from it.

07:51.000 --> 07:53.000
Very common in loops, for example.

07:53.000 --> 07:57.000
This is mapped to, I think, this.

07:57.000 --> 08:00.000
Yeah, these three instructions here.

08:00.000 --> 08:06.000
So you can see that the actual decrement operation is mapped to a subtraction,

08:06.000 --> 08:07.000
which seems intuitive.

08:07.000 --> 08:10.000
But then you get a bit of garbage around that,

08:10.000 --> 08:13.000
in the form of the CSET and RMIF instructions.

08:13.000 --> 08:16.000
And this is due to a thing called flag handling.

08:16.000 --> 08:21.000
So every time an x86 arithmetic instruction, like an addition or multiplication, runs,

08:21.000 --> 08:26.000
it could be that the result doesn't fit into a 32 bit or 64 bit integer.

08:26.000 --> 08:29.000
So there's overflow flags or negative flags,

08:29.000 --> 08:31.000
underflow flags, all of these kind of things.

08:31.000 --> 08:36.000
They are computed unconditionally by your x86 CPU to inform the program

08:36.000 --> 08:39.000
that something went wrong with the computation.

08:39.000 --> 08:43.000
And on x86 this happens all the time, whereas on ARM, it's optional.

08:43.000 --> 08:46.000
The program or the compiler explicitly has to opt in.

08:46.000 --> 08:49.000
And this difference is something we need to compensate for.

08:49.000 --> 08:52.000
So we get these extra instructions on the right side.

08:52.000 --> 08:57.000
And our optimization passes really help with reducing this to the bare minimum necessary.

08:57.000 --> 09:00.000
Normally it would be like 10 or 20 instructions

09:00.000 --> 09:02.000
of overhead just to compute all of these flags.

09:02.000 --> 09:05.000
But thanks to our optimization passes, we can get rid of this.

09:05.000 --> 09:10.000
That's mostly.
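
NOTE
Editor's sketch, not FEX's actual code: one way an optimization pass could
drop flag computations that get overwritten before anything reads them.
The Op type and the opcode names are hypothetical, and all flags are
treated as a single unit for simplicity.
```python
from dataclasses import dataclass
@dataclass(frozen=True)
class Op:
    name: str                  # hypothetical IR opcode name
    writes_flags: bool = False
    reads_flags: bool = False
def drop_dead_flag_ops(block: list[Op], flags_live_out: bool = True) -> list[Op]:
    """Walk the block backwards and drop flag writes that nobody reads."""
    kept: list[Op] = []
    flags_needed = flags_live_out
    for op in reversed(block):
        if op.writes_flags and not op.reads_flags:
            if not flags_needed:
                continue       # overwritten before any read, so it is dead
            flags_needed = False
        if op.reads_flags:
            flags_needed = True
        kept.append(op)
    kept.reverse()
    return kept
block = [
    Op("sub"), Op("calc_flags", writes_flags=True),    # from a `dec`
    Op("add"), Op("calc_flags", writes_flags=True),    # from an `add`
    Op("branch_if_zero", reads_flags=True),            # from a `jz`
]
optimized = drop_dead_flag_ops(block)
print([op.name for op in optimized])
```
Only the flag computation feeding the branch survives; the one emitted
for the decrement is removed because the addition overwrites it unread.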

09:10.000 --> 09:13.000
Yeah, one thing I should probably also add is,

09:13.000 --> 09:17.000
there's a ton of these x86 instructions depending on how you count.

09:17.000 --> 09:20.000
It could be like up to 6,000.

09:20.000 --> 09:22.000
It's definitely more of an upper bound.

09:22.000 --> 09:26.000
But imagine you have to go through the process of identifying 6,000 functions,

09:26.000 --> 09:29.000
decoding them, re-implementing them on ARM,

09:29.000 --> 09:32.000
and then somehow making it all fast as well.

09:32.000 --> 09:35.000
That's a quite challenging task.

09:35.000 --> 09:39.000
One other challenging task is the x86 memory model.

09:39.000 --> 09:42.000
And there's not much time to go over this.

09:42.000 --> 09:46.000
But imagine you have a very simple program with two threads running in parallel.

09:46.000 --> 09:49.000
And what the first thread is doing is really simple.

09:49.000 --> 09:52.000
It's just resetting some state.

09:52.000 --> 09:58.000
And then it's waiting for a result to be computed on the second thread.

09:58.000 --> 10:02.000
The second thread now might just say, okay, let's compute 42.

10:02.000 --> 10:04.000
It takes a while.

10:04.000 --> 10:08.000
But once it's done, it assigns the result to our result variable.

10:08.000 --> 10:13.000
And then it sets a done flag to one to wake up the other thread on the left.

10:13.000 --> 10:15.000
The other thread will then print the result,

10:15.000 --> 10:19.000
and now naively you might expect the result to be 42.

10:19.000 --> 10:22.000
Since that's what we've computed on the other thread.

10:22.000 --> 10:25.000
On x86, you would be perfectly correct about that.

10:25.000 --> 10:28.000
But on ARM, maybe not.

10:28.000 --> 10:35.000
So the problem here is that the x86 memory model has an implicit guarantee here.

10:35.000 --> 10:45.000
That stores, so variable writes, that we trigger on thread 2...

10:45.000 --> 10:46.000
Okay, thread 2.

10:46.000 --> 10:47.000
You know what I mean?

10:47.000 --> 10:49.000
The mouse cursor doesn't work properly.

10:50.000 --> 10:53.000
So actually it's computing, it's writing the variables in the order result.

10:53.000 --> 10:55.000
And then done.

10:55.000 --> 11:01.000
But on ARM, we actually don't have the guarantee that the first thread is observing the writes in the same order.

11:01.000 --> 11:07.000
Thread 1 might actually, on an ARM platform, see the done flag being set first.

11:07.000 --> 11:09.000
And then the result flag.

11:09.000 --> 11:12.000
And that is problematic because now we might have a race condition.

11:12.000 --> 11:18.000
We might exit out of the loop immediately, before the result has been written, and we'll try to print it.

11:18.000 --> 11:22.000
So in that case we would still print the original initializer up here.

11:22.000 --> 11:24.000
And we would see zero.
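
NOTE
Editor's sketch: a toy model of the two-thread example above, not a model
of real hardware. It only enumerates the orders in which thread 2's two
stores may become visible to thread 1; the function names are made up
for illustration.
```python
from itertools import permutations
STORES = [("result", 42), ("done", 1)]   # thread 2, in program order
def visible_orders(strong: bool):
    """Orders in which thread 1 may see thread 2's stores become visible."""
    if strong:
        return [tuple(STORES)]           # x86-like: program order only
    return list(permutations(STORES))    # weak: either store may land first
def printed_values(strong: bool) -> set[int]:
    """Every value thread 1 could print once it has seen done == 1."""
    outcomes = set()
    for order in visible_orders(strong):
        memory = {"result": 0, "done": 0}
        for name, value in order:
            memory[name] = value
            if memory["done"] == 1:      # the wait loop may exit here
                outcomes.add(memory["result"])
    return outcomes
print(sorted(printed_values(strong=True)))
print(sorted(printed_values(strong=False)))
```
Under the strong ordering only 42 can be printed; once the stores may be
observed in either order, the stale initial value 0 becomes possible too.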

11:24.000 --> 11:29.000
So this mismatch in terms of the semantics of the memory model is really difficult to resolve.

11:29.000 --> 11:34.000
Because we don't know for which memory accesses this is important.

11:34.000 --> 11:41.000
And so basically we have to patch it up somehow for every single memory access that we have in the program.

11:41.000 --> 11:44.000
And there's a couple of strategies we can employ to do this.

11:44.000 --> 11:49.000
The brute-force solution is to just make every single memory access atomic.

11:49.000 --> 11:52.000
It's very expensive.

11:52.000 --> 11:56.000
Instead we try to use heuristics wherever possible.

11:56.000 --> 12:02.000
So actually one other thing we can do in terms of atomics is we can use something called half barriers.

12:02.000 --> 12:06.000
This is a trick that I think we took from the Mono runtime.

12:07.000 --> 12:15.000
That makes the overhead a little bit better, but still, if you enable this for everything you still get a performance hit of something like 20% at least.

12:15.000 --> 12:17.000
So this is really something we can't afford.

12:17.000 --> 12:25.000
So we try to employ additional heuristics on top of that, with a really fun hack for Unity games specifically, where Unity,

12:25.000 --> 12:27.000
I think it's Mono as well,

12:27.000 --> 12:31.000
uses some sort of ring buffer internally to store something.

12:31.000 --> 12:36.000
So it relies really strongly on these memory model semantics.

12:36.000 --> 12:47.000
And we can very reliably pattern match on the exact instructions used by the cyclic buffer, though.

12:47.000 --> 12:55.000
So we have something like a relative load from a register with an offset of 61 bytes.

12:55.000 --> 12:59.000
It's something like that. So very specific instruction pattern.

12:59.000 --> 13:05.000
We know more or less reliably that this will match the cyclic buffer.

13:05.000 --> 13:09.000
So we enable these strong memory semantics only for this code.

13:09.000 --> 13:12.000
And that makes Unity games work. For everything else,

13:12.000 --> 13:18.000
all the other memory accesses, for most games we can just disable this accurate memory model simulation and it just works.

13:18.000 --> 13:24.000
So it's a really smart combination of a reasonably fast default implementation,

13:24.000 --> 13:32.000
smart heuristics, and some other fun tricks that I don't quite have the time to talk about, though.

13:32.000 --> 13:38.000
And yeah, so the last thing is just the order we might observe in the ARM program.

13:38.000 --> 13:44.000
So that's taking care of the CPU simulation.

13:44.000 --> 13:52.000
The next thing that I wanted to talk about is the kernel layer.

13:52.000 --> 14:00.000
So the kernel has a couple of subsystems. As I was saying, threads and POSIX signals, for example, are one element; then the file system.

14:00.000 --> 14:04.000
We have system calls and we have the GPU driver.

14:04.000 --> 14:09.000
And what FEX needs to do for these is to build a sort of compatibility layer.

14:09.000 --> 14:17.000
So every time a user space program tries to, for example, access files or talk to other threads.

14:17.000 --> 14:24.000
FEX sort of acts as a pseudo-runtime between the game and our ARM64 Linux kernel.

14:24.000 --> 14:30.000
Just to make sure that what the game sees underneath itself actually looks like an x86 kernel.

14:30.000 --> 14:37.000
So you just need to have this building block in between that wraps all of the functionalities like this.

14:37.000 --> 14:46.000
What does this look like? Well, the game at some point will have a code block that uses a system call instruction, just like this.

14:46.000 --> 14:50.000
Yeah, the system call is actually here.

14:50.000 --> 14:58.000
This just requests kernel functionality. We run this thing through our x86 recompiler to get equivalent ARM code.

14:58.000 --> 15:05.000
But in place of the system call instruction, we place a call to an external C++ function.

15:05.000 --> 15:13.000
Which then looks fairly normal. It's just a switch on the system call ID and then triggers accordingly other code actions.
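
NOTE
Editor's sketch in Python rather than the C++ the talk mentions, and not
FEX's actual handler. It dispatches on the guest's x86-64 system call
number and forwards to the host. The function name is hypothetical, and
a real handler would receive a pointer and a length for write instead of
a bytes object.
```python
import errno
import os
# x86-64 Linux system call numbers; the ARM64 numbers differ
# (for example, write is 1 on x86-64 but 64 on ARM64).
X86_64_WRITE = 1
X86_64_GETPID = 39
def handle_guest_syscall(number: int, args: tuple) -> int:
    """Hypothetical handler that recompiled code calls instead of `syscall`."""
    if number == X86_64_WRITE:
        fd, data = args
        return os.write(fd, data)        # forwarded to the host kernel
    if number == X86_64_GETPID:
        return os.getpid()
    return -errno.ENOSYS                 # not implemented in this sketch
```
For example, handle_guest_syscall(X86_64_GETPID, ()) returns the host
process ID, while an unknown number returns a negative ENOSYS.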

15:13.000 --> 15:20.000
So the whole point here is that we come from a very low-level perspective of these raw binary machine instructions.

15:20.000 --> 15:28.000
We try to build abstractions on top of this so that we can actually focus on implementing the system call logic using plain C++ logic.

15:28.000 --> 15:36.000
So, plain C++ code instead of having to dive into these low-level assembly details.

15:37.000 --> 15:46.000
Yes, if you want to browse the slides online, you can look at the details here, but the gist of it is this is often quite straightforward to do.

15:46.000 --> 15:51.000
But again, there's hundreds of these system calls, so it's a lot of work.

15:51.000 --> 15:56.000
And most of them are straightforward, but some of them tend to be quite tricky as well.

15:56.000 --> 16:04.000
Particularly when it comes to memory management, for example, where FEX also needs to allocate its own memory and then needs to make sure there are no conflicts there.

16:06.000 --> 16:16.000
So that's the kernel compatibility layer, and now the big question becomes what about the last layer we had in our diagram, the actual libraries.

16:16.000 --> 16:21.000
The C runtime, the Unity engine, or graphics API.

16:21.000 --> 16:31.000
Well, it turns out we don't actually need to do anything special for these anymore, because if we have a reasonably accurate binary recompiler and a reasonably accurate kernel,

16:31.000 --> 16:38.000
then all of the libraries above will just magically work as parts of a complex system interacting with each other.

16:38.000 --> 16:49.000
But as far as FEX is concerned, the libraries together with the game just form one huge blob that interacts with the kernel, and that's all we need to worry about.

16:49.000 --> 16:54.000
So we get this nice separation layer here in the middle.

16:54.000 --> 17:07.000
Where we are running unmodified x86 software on our new ARM hardware, which is made to look, so to speak, like x86 hardware, using FEX as a compatibility layer.

17:07.000 --> 17:23.000
There is however one more trick we can apply to this whole thing, or rather two tricks even, because if we look at the graphics API like Vulkan or OpenGL, this is actually a huge bottleneck of emulation sometimes,

17:23.000 --> 17:27.000
because games are accessing these APIs so often.

17:27.000 --> 17:32.000
And now you have a well-defined API here.

17:32.000 --> 17:40.000
We know the game is talking to an x86 library, but can't we just take the existing ARM Vulkan implementation instead?

17:40.000 --> 17:46.000
And indeed, that's what we can do with a couple of tricks.

17:46.000 --> 17:49.000
Maybe. There we go.

17:49.000 --> 17:58.000
So whenever the game tries to access this Vulkan x86 library, we just inject our own smaller replacement wrapper library on top of that.

17:58.000 --> 18:14.000
And this replacement wrapper library doesn't do a lot; it just forwards all of the different individual API calls to the corresponding ARM library.
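
NOTE
Editor's sketch of the forwarding idea only; the class and method names
are invented and are not the Vulkan API or FEX's thunk code. A real
wrapper also has to convert arguments between the guest and host
calling conventions, which is omitted here.
```python
class HostVulkan:
    """Stand-in for the native ARM library; not the real Vulkan API."""
    def create_instance(self, app_name: str) -> str:
        return f"instance:{app_name}"
class GuestThunk:
    """Hypothetical replacement library that the guest loads instead."""
    def __init__(self, host):
        self._host = host
        self.forwarded = []              # call log, for illustration only
    def __getattr__(self, name):
        target = getattr(self._host, name)
        def forward(*args, **kwargs):
            self.forwarded.append(name)
            return target(*args, **kwargs)   # runs natively, not emulated
        return forward
vulkan = GuestThunk(HostVulkan())
print(vulkan.create_instance("demo"))
```
The guest only ever sees the wrapper; every call ends up in the host
object, which stands for code that runs natively rather than translated.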

18:14.000 --> 18:23.000
And this has two advantages. First of all, we don't need to emulate the original x86 library, so we save some emulation overhead.

18:23.000 --> 18:28.000
But then also we completely interesting.

18:28.000 --> 18:34.000
More interesting. Oh yeah, okay, great.

18:34.000 --> 18:37.000
So we saved translation overhead, and oh yeah.

18:37.000 --> 18:40.000
So I just leave it like this.

18:40.000 --> 18:47.000
We saved translation overhead, and what also happens is that we can sidestep the whole kernel compatibility layer.

18:47.000 --> 18:50.000
Because it's really just making trouble anyway.

18:50.000 --> 19:01.000
So we get these two APIs. Another thing that we actually can do in addition, though, is to save further compilation overhead, we can employ a code cache.

19:01.000 --> 19:09.000
So on the second and third, so in subsequent runs of the game, we can just reuse the compilation result from prior runs.

19:09.000 --> 19:16.000
So that, for example, when I'm entering a new scene in a game that I played before, I don't get a microstutter every time

19:16.000 --> 19:21.000
I enter that area, but rather we can load the code that is already compiled from disk.

19:21.000 --> 19:26.000
And so we can save a bunch of compilation overhead like this.
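
NOTE
Editor's sketch of an on-disk code cache, under the assumption that
entries are keyed by a hash of the guest code bytes; the talk does not
say how FEX keys or stores its cache. CodeCache and its methods are
hypothetical, and _compile is a placeholder for the real recompiler.
```python
import hashlib
import tempfile
from pathlib import Path
class CodeCache:
    """Hypothetical on-disk cache keyed by a hash of the guest code bytes."""
    def __init__(self, directory: Path):
        self.directory = directory
        self.compilations = 0
    def _compile(self, guest_code: bytes) -> bytes:
        self.compilations += 1
        return b"ARM:" + guest_code      # placeholder for the real recompiler
    def lookup(self, guest_code: bytes) -> bytes:
        key = hashlib.sha256(guest_code).hexdigest()
        path = self.directory / key
        if path.exists():
            return path.read_bytes()     # hit: skip compilation entirely
        host_code = self._compile(guest_code)
        path.write_bytes(host_code)
        return host_code
cache_dir = Path(tempfile.mkdtemp())
first_run = CodeCache(cache_dir)
first_run.lookup(b"\x48\xff\xc8")        # x86-64 bytes of `dec rax`
second_run = CodeCache(cache_dir)        # simulates restarting the game
second_run.lookup(b"\x48\xff\xc8")
print(first_run.compilations, second_run.compilations)
```
The first run compiles once and writes the result to disk; the second
run finds the file and never invokes the compiler.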

19:26.000 --> 19:31.000
So with that being said, this was more or less the entirety of FEX.

19:31.000 --> 19:35.000
I mean, obviously the details always look different at the end of the day.

19:35.000 --> 19:42.000
And I think my slides crashed. Interesting.

19:42.000 --> 19:50.000
So at this point, I would just reload the slides and I prepared a little demo, which is very convincing now.

19:50.000 --> 19:53.000
So do you want a demo, actually?

19:54.000 --> 19:56.000
Yeah, I'd like to thank you.

19:56.000 --> 20:03.000
So the demo is actually something, and I really wish the slides are working now.

20:03.000 --> 20:09.000
Oh well, so the demo is already breaking, because you're looking at the demo.

20:09.000 --> 20:11.000
And we're seeing the demo effect in action.

20:11.000 --> 20:16.000
I think my machine is just running out of RAM, that's a problem probably.

20:16.000 --> 20:21.000
So these slides are actually using a React-based slide presentation framework.

20:21.000 --> 20:26.000
And it's running in an x86 build of Firefox.

20:26.000 --> 20:30.000
Which is quite impressive when it works.

20:30.000 --> 20:32.000
Thank you.

20:38.000 --> 20:40.000
And now it's gone.

20:40.000 --> 20:43.000
It should not even kill it anymore.

20:43.000 --> 20:46.000
Excellent.

20:46.000 --> 20:50.000
Well, I've prepared a little other thing for you though.

20:55.000 --> 20:57.000
Hello.

20:57.000 --> 21:03.000
So we can actually run Steam in this thing as well.

21:03.000 --> 21:07.000
Okay, sure.

21:08.000 --> 21:14.000
So, to save some time I already launched Steam, but this is just a regular Steam installation started using FEX.

21:14.000 --> 21:24.000
And you can use it just like any other Steam installation and launch games using it, which is exactly where we're getting to.

21:37.000 --> 21:47.000
Ah, there's audio, not on stream I think, but maybe people in the room can hear it.

21:47.000 --> 21:54.000
So you should have come to Brussels if you can't hear the audio.

21:54.000 --> 22:02.000
And I just really quickly wanted to show that this is actually usable, not just for 2D games, but actually for real 3D games as well.

22:02.000 --> 22:06.000
Some people on more powerful machines than this even run

22:06.000 --> 22:11.000
things like Cyberpunk, for example, so even triple-A games can work to some extent.

22:11.000 --> 22:14.000
Also, I think the game might be crashing as well.

22:14.000 --> 22:17.000
So now it's working, yay!

22:24.000 --> 22:27.000
Yeah, working just as you were expecting.

22:27.000 --> 22:34.000
So we have five minutes left, so I'm just going to try to fix my browser.

22:57.000 --> 23:12.000
Are you going to take questions?

23:12.000 --> 23:17.000
Are you going to take questions?

23:17.000 --> 23:30.000
Yeah, the problem is I can't actually kill the existing Firefox instance here.

23:30.000 --> 23:32.000
So it can't restart.

23:32.000 --> 23:37.000
There we go.

23:37.000 --> 23:47.000
That was annoying.

23:47.000 --> 23:55.000
Okay.

23:55.000 --> 24:05.000
We've got slides again.

24:05.000 --> 24:09.000
So you can also see the funny picture that I wanted to show before.

24:09.000 --> 24:13.000
So do you want a demo?

24:13.000 --> 24:16.000
Hello, Firefox.

24:16.000 --> 24:20.000
You're already looking at it.

24:20.000 --> 24:21.000
Great.

24:21.000 --> 24:22.000
Okay.

24:22.000 --> 24:26.000
So making this really quick now because we have less than five minutes left.

24:26.000 --> 24:28.000
Is FEX actually used in the wild?

24:28.000 --> 24:30.000
Yes, yes: the Asahi Linux project.

24:30.000 --> 24:37.000
It's using us to run or integrate PC games on Apple hardware running Linux.

24:37.000 --> 24:42.000
Parallels Desktop is using us for something, we don't actually know what, but they credited us in some blog posts.

24:42.000 --> 24:44.000
So yay!

24:44.000 --> 24:52.000
CrossOver has integrated FEX: CodeWeavers integrated FEX into their CrossOver product.

24:52.000 --> 24:55.000
Also really cool to see that one working.

24:55.000 --> 24:58.000
And finally, Valve is going to release the Steam Frame

24:58.000 --> 25:02.000
sometime this year, which FEX is an elementary part of at this point.

25:02.000 --> 25:05.000
Thanks to Valve for actually funding a lot of this work.

25:05.000 --> 25:10.000
So I'm personally paid by Valve to work on FEX, and a couple of other people as well.

25:10.000 --> 25:12.000
None of this would be possible without them.

25:12.000 --> 25:14.000
Thanks to them.

25:15.000 --> 25:27.000
Really quick note about Windows support, which I didn't mention above, but probably some of you might be interested in playing Windows games.

25:27.000 --> 25:39.000
So what you can do in fact is you can just run Wine itself as an x86 Linux program.

25:39.000 --> 25:53.000
So it's a Windows game and you run this game using x86 Wine with the corresponding set of Windows libraries on top of the x86 Linux libraries,

25:53.000 --> 25:59.000
on top of the x86 Linux kernel and the x86 hardware, but now you swap out the CPU.

25:59.000 --> 26:05.000
So you introduce the recompiler and the kernel compatibility layer from FEX, and this whole stack just works.

26:05.000 --> 26:11.000
But one other small thing we can do is we can do two things actually.

26:11.000 --> 26:17.000
We can swap out the x86 build of Wine with an ARM build, more or less skipping over details here.

26:17.000 --> 26:26.000
And we can recompile, instead of to ARM, to a custom binary interface called ARM64EC.

26:26.000 --> 26:28.000
Think of it like a calling convention.

26:29.000 --> 26:40.000
And what this allows us to do is shift the recompiler up the stack a little bit and then recompile the Windows libraries themselves to this ARM64EC ABI.

26:40.000 --> 26:48.000
That allows us to throw away the x86 Linux libraries entirely and the kernel compatibility layer entirely.

26:48.000 --> 26:57.000
So we end up with a much simpler stack where essentially only the game and a few remaining libraries remain to be emulated.

26:57.000 --> 27:01.000
So much more convenient for emulation.

27:01.000 --> 27:10.000
If you compare to the other stack where we just run x86 Wine and FEX, there are just a few less instances here.

27:10.000 --> 27:19.000
We end up with a much simpler architecture where there are fewer components to emulate and much less time left over, much less time.

27:19.000 --> 27:21.000
Much less work to be done overall.

27:21.000 --> 27:26.000
If you want to read more about this, just check out the blog post down here.

27:26.000 --> 27:30.000
And what's next, we will end this talk.

27:30.000 --> 27:34.000
But we will make FEX better, is the short version of it.

27:34.000 --> 27:38.000
We don't use AI for our work and that's all I have.

27:38.000 --> 27:39.000
Thank you so much.

