WEBVTT

00:00.000 --> 00:11.960
Cool. Hi everyone. How are you guys doing? My name is Aurelien Bombo, or Aurélien if you speak French.

00:11.960 --> 00:16.400
This is actually my first time at FOSDEM, and it's kind of a funny feeling for me being here

00:16.400 --> 00:21.120
because I was born and raised here in Belgium, and then I had to grow up and move

00:21.120 --> 00:26.360
abroad for work, and now I'm back here for work, right?

00:26.360 --> 00:34.040
Work-wise, I work with the Confidential Containers project. Lately I've been looking

00:34.040 --> 00:37.840
at storage. Otherwise, I've just helped out with different aspects of the project,

00:37.840 --> 00:43.640
notably the CI. And then I also work with the Kata Containers project, where I'm a part of

00:43.640 --> 00:48.840
the Architecture Committee there, which is a group that, you know, steers the project and has

00:48.840 --> 00:54.720
the final say in decisions. And at Microsoft, my assignment really is the Linux confidential

00:54.720 --> 00:59.720
Platform as part of the Azure Linux team. I work with our channel and Kita, that presented

00:59.720 --> 01:06.360
earlier. I work with them a lot of the time, right? So I know some people in this room.

01:06.360 --> 01:11.880
So when you talk about, you know, Confidential Computing, a lot of the time, people focus

01:11.880 --> 01:17.000
on the compute part, right? So what happens in the memory, not so much about networking or

01:17.000 --> 01:23.400
storage, right? And so today I'm going to share with you folks a little bit about what we've

01:23.480 --> 01:29.360
been working on with the Confidential Containers Community to implement secure storage from

01:29.360 --> 01:34.920
the perspective of, you know, containers in a Confidential setting, right? And so mind

01:34.920 --> 01:40.160
you, a lot of this stuff is still very much a work in progress, but at least this will

01:40.160 --> 01:45.360
give you a good overview of the current state of things. And so I'll talk about both the

01:45.360 --> 01:51.440
implementation itself and also how it ties into the broader ecosystem, right? And

01:51.440 --> 01:56.120
by the way, we have a PR that's up for review. I can share it with you folks if you want

01:56.120 --> 02:01.320
to, you know, criticize my code, please feel free.

02:01.320 --> 02:07.600
Now, first and foremost, what is a confidential container, right? So when you think

02:07.600 --> 02:13.040
containers, you think typically of your standard Docker, runc containers, right? Where you

02:13.040 --> 02:19.480
share the whole kernel with the containers. Now you can do better isolation than that using

02:19.480 --> 02:23.920
Kata Containers, where you put the containers in a virtual machine, right? That still doesn't

02:23.920 --> 02:30.080
protect you from a potentially malicious host that could attack your containers, right? And

02:30.080 --> 02:35.520
so to address this problem, we have the Confidential Containers project, or CoCo, that's

02:35.520 --> 02:39.440
the term that we're using in the project to refer to it, right, CoCo. And

02:39.440 --> 02:47.120
we build on top of Kata to leverage a trusted execution environment, or TEE, to guarantee

02:47.120 --> 02:52.840
the Confidentiality of that VM, right? For example, by encrypting the VM memory, right? And

02:52.840 --> 02:59.200
then the workloads, the containers themselves, can use that to run a process called

02:59.200 --> 03:03.840
remote attestation, that's going to give you essentially a cryptographic proof of the contents

03:03.840 --> 03:09.040
of the VM, right? And then later on, I'll also introduce the concept of container security

03:09.040 --> 03:16.080
policy that we use to secure the VM boundary. So, you know, the interface between the VM and

03:16.080 --> 03:22.400
outside of the VM, right, which is untrusted by design in our threat model, right? So now,

03:22.400 --> 03:27.240
these three are container runtimes, right? They address the question of, how do I run a container

03:27.240 --> 03:31.240
on a single machine, right? Now, when you're talking about deploying at scale, when you

03:31.240 --> 03:35.920
have loads of containers that you want to deploy to a cluster of machines, right? You want

03:35.920 --> 03:41.880
to use something like Kubernetes to handle what we call the orchestration of your containers,

03:41.880 --> 03:47.840
and so that's going to handle the deployment, you know, scaling, load balancing,

03:47.840 --> 03:52.960
service discovery, fault tolerance, all that good stuff, right? And so, it's very important

03:52.960 --> 03:58.560
for the CoCo project to work well with Kubernetes, because that's what people use, right,

03:58.560 --> 04:03.560
to deploy at scale. So if you want people to use CoCo, you want to work well with Kubernetes.

04:03.560 --> 04:07.960
And with this, storage is one of the aspects that's managed through Kubernetes.

04:07.960 --> 04:12.200
So it's going to be, you know, pretty important for us here to have some understanding

04:12.200 --> 04:17.960
that Kubernetes is here in the picture, and that we have to work with it.

04:17.960 --> 04:20.440
Could you speak a little bit louder?

04:20.440 --> 04:32.040
[inaudible] So before I dive

04:32.120 --> 04:38.520
into the, before I dive into the details of the storage implementation itself, I'm going

04:38.520 --> 04:45.640
to quickly run through this already simplified diagram of the life cycle of a confidential

04:45.640 --> 04:50.160
container, right? And so we'll build on top of this diagram to understand the storage

04:50.160 --> 04:57.400
later on, right? And so this is the view from the host. And so when you want to create

04:57.480 --> 05:03.560
a container, you're going to create this container spec here, which is a YAML file that you send

05:03.560 --> 05:09.000
to your cluster, right, to Kubernetes. And then on every host of your cluster there's going to be

05:09.000 --> 05:13.800
a component called the kubelet, from Kubernetes. And the YAML file is going to reach that kubelet,

05:13.800 --> 05:18.200
the kubelet is going to talk to the Kata runtime. And then the Kata runtime will trigger

05:19.000 --> 05:24.520
the virtual machine manager, right, the VMM, to create this confidential VM right here, right?

05:24.520 --> 05:31.160
And in this confidential VM will be installed a few components, including the Rust Kata agent.

05:31.160 --> 05:36.760
And this Kata agent will be responsible for creating the individual containers inside your VM,

05:36.760 --> 05:43.000
right? And then it's going to talk to the Kata runtime, interface with it over RPC, right? And

05:43.000 --> 05:49.000
so one key aspect to understand is that anything that is outside this, you know, this green box,

05:49.000 --> 05:54.360
the confidential VM, right, is not trusted from the perspective of the VM components, right?

05:56.600 --> 06:01.480
Now, when you talk about storage, broadly there's two types of storage that you want to consider.

06:01.480 --> 06:07.480
There's ephemeral storage and then persistent storage. And so I'll start with ephemeral storage here.

06:08.040 --> 06:12.440
Hopefully, I'll have some time to say a few words about, you know, persistent storage before,

06:12.440 --> 06:17.160
you know, I get kicked off the stage. So let me dive right into it. So

06:18.040 --> 06:22.200
ephemeral storage, right, is a pretty crucial use case to support,

06:22.200 --> 06:27.080
because every container at some point is going to need to store data that is not going to

06:27.080 --> 06:34.840
outlive the container, but that is also not going to fit into memory, right? And so enabling that,

06:34.840 --> 06:42.120
you unlock use cases like sharing data between the containers in your VM, or, I don't know,

06:42.360 --> 06:48.600
having temporary log storage before sending those logs over to a logging service, right, in the cloud,

06:48.600 --> 06:55.160
or caching and checkpoints, snapshotting and so forth, right? And so, as always, the first goal here,

06:55.160 --> 07:00.760
right, is going to be security and especially confidentiality, because even if the storage is

07:00.760 --> 07:06.920
ephemeral, you don't want it to be accessible to the untrusted host or to any other

07:07.080 --> 07:12.200
untrusted control plane components, right? And then we want to be careful of how we integrate

07:12.200 --> 07:17.480
with Kubernetes, right, because Kubernetes already has storage features that people are leveraging,

07:17.480 --> 07:23.320
right? And so we want to make the transition to CoCo as smooth as possible, right? So we want to be careful

07:23.320 --> 07:28.920
of how we integrate with Kubernetes. And so with this, the key ideas in the design are going to be

07:29.880 --> 07:36.440
from the host itself, right? We're going to create a temporary block device. We're going to pass

07:36.440 --> 07:42.840
this block device into the VM. And then from inside of the VM, we're going to encrypt and format

07:42.840 --> 07:49.560
this device, right? And so here, one important challenge is going to be securing this interface

07:49.560 --> 07:54.920
between the VM and, you know, outside of the VM, right, with the Kata runtime,

07:55.080 --> 08:01.160
because we're injecting a device into that VM, right? And so we want to make

08:01.160 --> 08:07.320
sure that the interface is protected, right? And we'll do this through the container security policy.

08:09.080 --> 08:14.280
Now this is the same diagram that I showed two slides ago, so we have the basic

08:15.000 --> 08:22.280
container lifecycle, right? And so again, you start by creating this container spec, right?

08:23.240 --> 08:28.120
And so here, with our confidential storage type, it's a new storage type, right,

08:28.120 --> 08:33.640
that we're introducing into the control plane, right? And to implement a new storage type

08:33.640 --> 08:38.760
in Kubernetes, you want to implement something that's called a container storage interface driver.

08:38.760 --> 08:45.080
So a CSI driver, right? And so, you create a container spec. You put a reference to your CSI

08:45.080 --> 08:49.480
driver in it, right, for your storage. Then you send the spec to the kubelet, the kubelet is going to

08:49.480 --> 08:55.000
see that reference to your CSI driver, right? It's going to call into your driver. In this case,

08:55.000 --> 09:00.840
our driver is going to, you know, create the temporary block device, right? Send that block device to the

09:00.840 --> 09:06.600
Kata runtime. And then the Kata runtime will virtualize that device as a virtio block device

09:06.600 --> 09:13.880
into the VM, right? And then also send some metadata about it down to the Kata agent over RPC,

09:13.960 --> 09:19.880
right? Now, when the Kata agent sees your block device with that, you know,

09:19.880 --> 09:25.480
custom reference, it knows it's confidential storage, right? It's going to call into this component

09:25.480 --> 09:31.960
called the confidential data hub, the CDH. And it's in this CDH that we're going to first generate

09:31.960 --> 09:37.000
a random encryption key. We're going to use that key to encrypt the device, right? And then we're

09:37.000 --> 09:43.400
going to format that device and finally mount it into the container, right? And so,

09:43.880 --> 09:49.480
one property here is that, you know, the key is generated inside of the VM, the VM's encrypted,

09:49.480 --> 09:54.680
right? The key does not leave the VM. And the key gets destroyed with the VM, right? So with this,

09:54.680 --> 10:01.400
we can guarantee that the data is only, you know, accessible to the confidential VM here, right?
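
NOTE
A minimal sketch of the key lifecycle described above, assuming the guest-side steps map onto standard cryptsetup, mkfs.ext4 and mount commands. The function name, the device path /dev/vdb and the mapper name are hypothetical, and the talk does not show the commands the CDH really runs. The plan is only built here, never executed.
```python
import secrets
def plan_ephemeral_volume(device, mount_point, name="coco-ephemeral"):
    """Return a fresh random key and the ordered command plan for one volume."""
    # The key is generated inside the guest and never written to the host.
    key = secrets.token_bytes(32)
    mapper = f"/dev/mapper/{name}"
    plan = [
        # LUKS2 with dm-crypt plus dm-integrity; the key would be fed on stdin.
        ["cryptsetup", "luksFormat", "--batch-mode", "--type", "luks2",
         "--integrity", "hmac-sha256", "--key-file", "-", device],
        ["cryptsetup", "open", "--key-file", "-", device, name],
        ["mkfs.ext4", mapper],
        ["mount", mapper, mount_point],
    ]
    return key, plan
key, plan = plan_ephemeral_volume("/dev/vdb", "/mnt/encrypted")
print(len(key), [step[0] for step in plan])
```
Because the key only exists in the return value, dropping it (or tearing down the VM) leaves the device contents unreadable, which is the property the talk relies on.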

10:03.400 --> 10:07.720
Now, this is the last version of this diagram, right? So it's not going to get any more

10:07.880 --> 10:15.880
complicated than this. The last aspect is going to be the security policy. So, you know, to

10:15.880 --> 10:20.600
secure the interface, right, between the VM and the outside. So, one key thing that you have

10:20.600 --> 10:25.640
to understand about Kubernetes, right? So, when you create this container spec, when you send

10:25.640 --> 10:32.200
your YAML file over to the kubelet, the kubelet is not actually going to send the YAML spec

10:32.200 --> 10:37.720
as it is to the Kata agent. It's actually going to transform and augment that spec into

10:37.720 --> 10:44.520
a much lower level spec, right? And it's this lower level spec that will actually be executed

10:44.520 --> 10:55.560
by the Kata agent. And so, in the Kata agent, when you receive that spec,

10:56.600 --> 11:01.480
you know, you're going to execute it, right, the lower-level spec. But remember that

11:01.560 --> 11:05.480
all those components outside of the VM are untrusted, they can still tamper with the spec,

11:05.480 --> 11:11.640
right? Between the moment where you create a container spec and the moment where it reaches the

11:11.640 --> 11:17.240
Kata agent, right? And so, you want to protect this lower-level spec, you want to guard it, right?

11:17.240 --> 11:23.000
And we do this with a tool that we call the policy generator. And so, this policy generator is going

11:23.000 --> 11:29.960
to take your YAML spec. It's going to generate a security policy that maps one-to-one to your lower

11:30.040 --> 11:36.200
level spec, right? Re-inject this policy into the YAML spec. And then we're going to send both

11:36.760 --> 11:43.720
the YAML spec and the security policy into the VM, right? And then once it reaches the

11:43.720 --> 11:51.080
Kata agent, we're going to enforce that the YAML spec and the security policy match,

11:51.080 --> 11:57.560
right? And if they don't, we're going to reject the container creation request, right? And so,

11:57.640 --> 12:02.120
this is pretty important in our design because, as I was saying before, we're injecting a device

12:02.120 --> 12:07.240
into the VM, right? And so, we also want to make sure that that device, the metadata that's included

12:07.240 --> 12:12.200
with that device, is not tampered with, right? And so, we're going to include the device as part of

12:12.200 --> 12:19.400
the security policy as well by modifying the policy generator, right? And so, this was one half of it.
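
NOTE
A minimal sketch of the enforcement step described above, assuming the policy can be pictured as a dictionary pinning every field of the lower-level request, including the injected device. This is a simplified stand-in and not the agent's real policy engine; PolicyViolation, enforce and all field names are hypothetical.
```python
class PolicyViolation(Exception):
    """Raised when a request does not match the generated policy."""
def enforce(policy: dict, request: dict) -> None:
    # Every field of the request must be pinned by the policy with the same value.
    for field, value in request.items():
        if field not in policy:
            raise PolicyViolation(f"unexpected field: {field}")
        if policy[field] != value:
            raise PolicyViolation(f"mismatch on field: {field}")
policy = {
    "image": "my-app:1.0",
    "devices": [{"type": "blk", "mount": "/mnt/encrypted", "encrypted": True}],
}
enforce(policy, dict(policy))  # an untouched request passes silently
# A host that strips the encryption flag from the device metadata is rejected.
tampered = dict(policy, devices=[{"type": "blk", "mount": "/mnt/encrypted", "encrypted": False}])
try:
    enforce(policy, tampered)
except PolicyViolation as err:
    print("rejected:", err)
```
Putting the device entry inside the policy is what ties the storage metadata to the same check as the rest of the container request.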

12:20.360 --> 12:27.080
So, if you notice here, the, you know, we've enforced the policy inside the Kata agent, right?

12:27.160 --> 12:31.400
But the policy is still not trusted at this stage, right? Because it's still coming from outside the VM,

12:31.400 --> 12:37.080
right? So now, we want to ensure the trustworthiness of the policy itself, right? And we do this

12:37.080 --> 12:43.080
through the attestation process. And so, here, the way that it's going to work is that after the

12:43.080 --> 12:50.120
Kata agent has enforced the policy, it can, you know, contact this attestation agent here, right?

12:50.120 --> 12:56.040
And we will have a remote and trusted attestation service on the right here. And this attestation

12:56.040 --> 13:02.280
service is going to have a reference security policy. So, the ground truth for your policy, right?

13:02.280 --> 13:08.760
And so, then the attestation agent will send the container security policy, right?

13:08.760 --> 13:15.240
Fetch it from the TEE, send it over to the attestation service, the attestation service is going to,

13:15.240 --> 13:19.960
you know, compare the policy that it receives and the policy that it stores, the reference security

13:19.960 --> 13:26.200
policy, right? And it's going to send that result over to the attestation agent, right?
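
NOTE
A minimal sketch of the comparison the attestation service performs, assuming the policy is reported as a SHA-256 digest and checked against the digest of the reference policy. The digest choice and the function names are assumptions; the talk only says that the received policy is compared with the stored reference.
```python
import hashlib
import hmac
def policy_digest(policy_text: str) -> bytes:
    # Digest of the policy text as it would be reported from the TEE.
    return hashlib.sha256(policy_text.encode("utf-8")).digest()
def verify_policy(reported_digest: bytes, reference_policy: str) -> bool:
    # Constant-time comparison against the digest of the reference policy.
    return hmac.compare_digest(reported_digest, policy_digest(reference_policy))
reference = "allow my-app with one encrypted block device"
print(verify_policy(policy_digest(reference), reference))
print(verify_policy(policy_digest(reference + " and a host path"), reference))
```
The first call matches the reference and the second does not, so only the unmodified policy would be reported as trusted.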

13:26.200 --> 13:32.680
And if that process passes successfully, then you can guarantee that your policy is trusted,

13:33.400 --> 13:37.960
hence your container is trusted, and hence your device is trusted, right? And so, that closes

13:37.960 --> 13:44.200
the loop here on the confidential ephemeral storage implementation, right? Because we've, you know,

13:45.160 --> 13:53.080
ensured trustworthiness end to end here, right? So, that was it on the theory part, so,

13:53.080 --> 13:58.600
you know, now we can take a breath a little bit. I'm going to show a very quick demo in a shell here,

13:58.600 --> 14:03.720
where we're going to deploy a spec, so I'll go look at the spec, we'll deploy it and we'll

14:03.800 --> 14:13.640
play with it a little bit, right? So, let's look at the spec we have, you know, the YAML spec,

14:13.640 --> 14:17.960
right, a bunch of metadata up top, the name of the container we're going to create, my app,

14:19.080 --> 14:24.840
then I, you know, generated the policy before the talk here, just to save some time.

14:24.840 --> 14:29.000
So, it's Base64-encoded, it's pretty large, so most of it is going to be truncated,

14:29.000 --> 14:33.400
right? But that's how it appears in the spec, as an annotation. And then we make a

14:33.400 --> 14:40.600
reference to our custom storage type via the CSI driver, which is called CoCo local CSI here,

14:40.600 --> 14:46.840
we claim about 10 gigs of storage, and then we mount this storage inside the container on slash

14:46.840 --> 14:55.320
mount slash encrypted, right? Now, we can deploy this into the cluster with a kubectl

14:55.320 --> 15:00.680
apply, the container was created. Now, because this is a test environment, right? We can

15:00.680 --> 15:05.560
exec into the container, and then we can list the filesystem that we just mounted.

15:05.560 --> 15:11.240
So, from right to left here, it's mounted on slash mount slash encrypted, like I said before,

15:11.240 --> 15:18.200
we have about 10 gigs of storage, it's an ext4 filesystem, and then here we have a virtual

15:18.200 --> 15:23.080
device, right? Because the original host device was, was encrypted, right? And so here,

15:23.080 --> 15:28.120
we use dm-crypt and dm-integrity, which will create a virtual device on

15:28.120 --> 15:33.880
top of the device, right? And then we can cd into this folder, it's going to be empty

15:33.880 --> 15:38.600
at first, right? Because we just created it, except for metadata, right? And then you can write

15:38.600 --> 15:43.800
some very sensitive and complex payload into that storage, and then you can read back from

15:43.800 --> 15:49.480
it, right? And so here, I wanted to show that from the perspective of the container, right?

15:49.480 --> 15:54.680
For you as a container developer, this encryption layer is totally transparent, right? So,

15:54.680 --> 15:59.720
the container does not have to do any setup, or need to configure the encryption

15:59.720 --> 16:05.880
settings and so forth, right? All you have to do is specify that CSI driver in your container spec, right?

16:07.400 --> 16:14.120
Now, really quickly on persistent storage, there's, I'm going to show one design, you know,

16:14.120 --> 16:19.960
there's a few designs that I'm thinking about, that fulfill different goals, right? The good thing

16:19.960 --> 16:25.160
is that the design, if you understand the ephemeral storage, the design is very close, there's only

16:25.160 --> 16:33.960
two components that change the CSI driver, and then the, those are the old slides, never mind,

16:33.960 --> 16:39.320
the CSI driver, and then the key broker service. So, this was the ephemeral storage, this is

16:39.320 --> 16:45.160
persistent storage, right? So, two key changes here, right? In the CSI driver, we're not any

16:45.160 --> 16:50.840
more going to create a new block device, right? We're actually going to get some storage that's

16:50.840 --> 16:57.320
pre-provisioned, right, from somewhere in the cloud most likely, right? And so, this block device is

16:57.320 --> 17:02.200
going to be pre-provisioned, it's already going to be encrypted, and it's already going to have

17:02.200 --> 17:08.760
data in it, right? And so, now when we get into the, the confidential data hub, we're not going

17:08.760 --> 17:12.920
to generate a random key anymore, we need to get a key from somewhere, right, to decrypt,

17:13.880 --> 17:19.960
forgive me, the storage that was already created, right? And so, we're going to get a key from

17:19.960 --> 17:25.160
this key broker service here, and the attestation flow is very similar to before, right? So,

17:25.160 --> 17:30.360
we trigger the attestation agent, the attestation agent connects to the key broker service,

17:30.360 --> 17:37.640
the KBS here, right? And then the KBS itself is going to perform the attestation, right? And then

17:37.640 --> 17:44.120
if attestation passes, the key broker service will release a key to the attestation agent,

17:44.120 --> 17:50.280
and then the CDH can decrypt the storage, and then, you know, expose it to the container.
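
NOTE
A minimal sketch of the persistent-storage flow described above, assuming the key broker keeps the volume key and releases it only when the reported evidence matches a reference. ToyKeyBroker and release_key are hypothetical names for illustration and do not reflect the real KBS API; a policy digest stands in for the full attestation evidence.
```python
import hashlib
import hmac
import secrets
class ToyKeyBroker:
    """Hypothetical stand-in for a key broker service."""
    def __init__(self, reference_policy: str, volume_key: bytes):
        self._expected = hashlib.sha256(reference_policy.encode("utf-8")).digest()
        self._volume_key = volume_key
    def release_key(self, reported_digest: bytes) -> bytes:
        # Release the pre-existing volume key only when the evidence matches.
        if not hmac.compare_digest(reported_digest, self._expected):
            raise PermissionError("attestation failed, key withheld")
        return self._volume_key
reference = "allow my-app with one encrypted block device"
volume_key = secrets.token_bytes(32)
broker = ToyKeyBroker(reference, volume_key)
good_evidence = hashlib.sha256(reference.encode("utf-8")).digest()
print(broker.release_key(good_evidence) == volume_key)
```
Unlike the ephemeral case, the key here outlives the VM, so the broker, not the guest, decides who gets to decrypt the pre-provisioned device.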

17:50.280 --> 17:53.800
And now, yeah, that was it, folks. So the next steps would be for me to

17:53.800 --> 17:58.200
kind of merge that, if anyone wants to review the PR, and try to figure out this persistent storage stuff,

17:58.200 --> 18:01.480
and if you have some ideas, you're welcome to come and talk to me after this, like, please,

18:01.480 --> 18:06.280
yeah, I'm open to any ideas. Thank you so much, folks. Thank you.

18:10.680 --> 18:14.680
Thank you, thank you, thank you. So, we have two minutes, we'll take one question, I think, one, it's

18:15.000 --> 18:36.520
a question. So, for this ephemeral storage, you're often putting in there stuff that is known.

18:36.520 --> 18:44.600
Right, no, no, by the way, I think that's the assumption. So, you're saying,

18:44.600 --> 18:49.160
for ephemeral storage? Yeah, okay, this is why it's the case, what I said it.

18:49.160 --> 18:54.200
So, I ask you, is that a statement? That part is the statement, the question is, how do you

18:54.200 --> 18:58.280
move together to get ephemeral storage? Okay, so let me tell you another question. So, the question

18:58.280 --> 19:05.400
is, for ephemeral storage, you, when you know that there is something that is the most

19:05.400 --> 19:09.880
noise that you know about. Yeah. So, you have a non-linked space, for instance, let's say,

19:10.200 --> 19:13.960
container, you may show something like that. Right. How will you mitigate it, you're going to

19:13.960 --> 19:19.960
integrate the text and the text? So, for example, container images are not stored in

19:19.960 --> 19:26.600
this ephemeral storage, right? This is just for data storage, right? So, when you're saying,

19:26.600 --> 19:33.240
if we have more content in that, if we're more storage, I guess I do still want to send a

19:33.240 --> 19:43.480
question because, well, that's storage going to be encrypted, right? Right. If you know what is inside,

19:44.520 --> 19:54.760
then what are you talking about? We play the text, there is an evite. No text, no text, no text, no text.

19:54.760 --> 20:01.320
Yeah, I'm not sure. I know that, you know, for example, we use dm-integrity, right?

20:01.320 --> 20:05.080
And I know that's still vulnerable to replay attacks, right? That's something we worry about.

20:07.080 --> 20:11.320
That's something that we're working on right and we're evaluating different solutions, but yeah,

20:11.320 --> 20:15.320
right now, you know, we're just getting started on this, right? So, we don't really have a,

20:15.320 --> 20:19.720
I don't really have a good answer to give you, if you will, can you? Okay, that was it folks.

20:19.720 --> 20:23.320
Thank you so much, thank you for your time.

