WEBVTT

00:00.000 --> 00:10.000
Hello everybody here at the Lightning Talks at FOSDEM here in Brussels.

00:10.000 --> 00:18.280
I want to introduce to you Cyril Servant, and he's talking about sshproxy: how to load-balance

00:18.280 --> 00:19.280
SSH.

00:19.280 --> 00:29.520
Give him a warm welcome and enjoy the talk.

00:29.520 --> 00:32.280
Thank you for coming.

00:32.280 --> 00:39.280
So first, a bit of a preamble: I'm working at CEA in France.

00:39.280 --> 00:46.360
It's a national research institute; we are about 20,000 people.

00:46.360 --> 00:53.600
We work on everything from fundamental research to industry.

00:53.600 --> 01:01.040
We work on medicine, on energy, on defense.

01:01.040 --> 01:10.320
All these subjects need a lot of compute power, so we have many supercomputers.

01:10.320 --> 01:23.280
Our need was to enable all the users to access these computers.

01:23.280 --> 01:31.280
The problem is we have many nodes, login nodes, on each supercomputer.

01:31.280 --> 01:34.760
Those nodes are on a private network.

01:34.760 --> 01:44.400
The users are on a public network, so we have to have a gateway in between.

01:44.400 --> 01:48.160
People need to be able to connect with SSH.

01:48.240 --> 01:51.240
I guess many of you know what SSH is.

01:51.240 --> 01:59.680
It's a remote command-line connection, and of course it's secure.

01:59.680 --> 02:06.600
So we had a solution in 2014 by Arnaud Guignard.

02:06.600 --> 02:10.200
He made sshproxy, the first version of it.

02:10.240 --> 02:19.440
It's written in Go, and there is only one configuration file, in YAML.

02:19.440 --> 02:24.320
It can handle a lot of connections.

02:24.320 --> 02:37.600
Currently, we have a thousand users simultaneously connected, with 100,000 connections a day.

02:37.600 --> 02:48.000
We are ready for 100 gigabits per second.

02:48.000 --> 02:53.680
I will just explain what we are doing at CEA.

02:53.680 --> 03:06.200
I will show you a typical platform and we will detail the configuration file step-by-step.

03:06.200 --> 03:17.640
On this slide, you can see the final platform.

03:17.640 --> 03:26.520
We just start with the external network and the internal network; the gateway is in between.

03:26.520 --> 03:31.480
The user wants to connect to the internal network.

03:31.480 --> 03:40.560
He runs, for example, ssh shimei, and he lands on the gateway, which runs OpenSSH.

03:40.560 --> 03:48.480
In OpenSSH, we have just one line in the configuration file, the ForceCommand one,

03:48.480 --> 03:56.920
which says: just launch sshproxy. sshproxy will then choose a destination node for

03:56.920 --> 04:04.200
the user and will connect him by launching an SSH client.

04:04.200 --> 04:13.000
In fact, it's really easy: sshproxy takes the in from one side and plugs it into the

04:13.000 --> 04:20.640
out of the other side, and likewise the out into the in.

04:20.640 --> 04:30.080
Of course, for such a simple platform, we have this simple file, just two lines. We define

04:30.080 --> 04:37.360
the name of the service, it's not even necessary, but it's always good to have a name.

04:37.360 --> 04:45.880
And you set an array of destinations. Here we have one destination, easy. Okay, now we

04:45.880 --> 04:55.560
have two destinations, not really complicated. We can choose how the destination is chosen

04:55.560 --> 05:04.680
with route_select; here, by default, it's random, and right now if the user connects

05:04.680 --> 05:13.080
a second time, the second connection will also be chosen randomly, so he may land on blue

05:13.080 --> 05:23.600
or on red; it's totally random. It's not perfect, so we add something for it to be

05:23.600 --> 05:33.000
stateful: we use etcd, which is a distributed database. It's just three lines in the

05:33.000 --> 05:44.640
configuration file. Now we have a stateful proxy, so if the user connects a second time,

05:44.640 --> 05:51.280
he will always land on the same login node with the default configuration, as we will

05:51.280 --> 06:02.200
see later. Okay, so as etcd is distributed, you can put in as many gateways as you

06:02.200 --> 06:12.600
want. If you need some redundancy or some more throughput, you can add more.

06:12.600 --> 06:24.320
And finally, if you have more platforms, you just have to add some overrides to check

06:24.320 --> 06:36.200
the source IP, where you connect from, and depending on these IPs, you will land on the default

06:36.200 --> 06:47.600
platform or on the other one, which is on the left here. This is exactly how we use it at CEA, with thousands

06:47.600 --> 06:53.920
of connected users.

06:53.920 --> 07:04.960
Right now, it's really easy. We have many features; we will have a look at some.

07:04.960 --> 07:14.320
The first one, route_select, as we saw, can be random, but it can also be based on the

07:14.320 --> 07:24.720
active connections, or on the currently used bandwidth. So if you want

07:24.720 --> 07:33.760
to spread the different users' connections, it may be interesting. The random one works

07:33.760 --> 07:41.520
really well generally. Then you can choose the mode, which can be balanced or sticky. By default

07:41.520 --> 07:50.800
it's sticky, but if you want someone to land on a different node each time he connects,

07:50.800 --> 07:58.280
you choose balanced. It can be useful when you make parallel transfers and not just a simple

07:58.280 --> 08:00.080
connection.

08:00.080 --> 08:08.480
And finally, an option we added, because it had to be added, the max connections per user:

08:08.480 --> 08:17.920
you can easily imagine a scenario where a user has a robot connecting every 10 seconds and

08:17.920 --> 08:25.120
you can have problems really quickly, so with this we can avoid these problems.

08:25.120 --> 08:34.640
So here is an example of the sticky and balanced configurations. By default, here we have

08:34.720 --> 08:43.040
a sticky connection. Let's say the six connections are made sequentially, so the first

08:43.040 --> 08:52.160
user connects to shimei; okay, he lands on the red node, it's totally random. Then he connects

08:52.160 --> 08:57.200
a second time; this time it's not random because it's sticky, so he will land on the same node. He connects

08:57.200 --> 09:04.560
a third time; okay, he will land on the red one also. Then another user connects to the

09:04.640 --> 09:16.880
shimei balanced service, and he will land on one of the nodes. It's chosen by connections,

09:16.880 --> 09:27.280
as you can see on the last line, and so sshproxy chooses the node with the fewest connections

09:28.800 --> 09:34.480
established. So it can be blue, purple or golden; it will be random between

09:34.480 --> 09:44.000
these three, and then on each new connection he will land on one of the other nodes.

09:47.280 --> 09:52.960
Now that we have a lot of connections established, we want to monitor them,

09:52.960 --> 10:02.240
so there is sshproxyctl, which can show all the connections and the state of all the

10:02.240 --> 10:11.440
login nodes, and it is easily scriptable because it can output JSON or CSV,

10:13.040 --> 10:20.720
and sshproxyctl can also, if you want, manipulate the state of the login nodes. So if you want

10:20.720 --> 10:29.040
to take a node out of production for maintenance, for example, you just have to run sshproxyctl

10:29.120 --> 10:42.240
disable for that node, and it's done. Finally, we already saw the override system. With the new version

10:42.240 --> 10:54.800
2, which will be available soon, really soon, we can override all the options of sshproxy.

10:54.880 --> 11:03.920
We can override them for a given service, like we saw with match source, but we can also override

11:03.920 --> 11:16.400
them for specific users or groups of users, and we can combine all these overrides with OR

11:16.400 --> 11:27.920
and AND. They are applied from top to bottom, and each matching one is applied. For example, if I connect

11:28.640 --> 11:36.160
with this configuration, my name is Cyril, so I will match user Cyril and I will land on

11:36.240 --> 11:45.120
Perrier or Badoit. But if someone else connects and is in the group parting, he will also end up

11:45.120 --> 11:53.760
on Perrier or Badoit. But if he's in the group sparkling and yeast, at first sshproxy will

11:53.760 --> 12:01.440
think he will land on Perrier or Badoit, but then, as there is another match later, he will finally end up

12:01.440 --> 12:11.040
on the destinations blue, red, purple and golden. There are many more features; I want to talk a lot

12:11.040 --> 12:19.600
about them. If you want to dive into all the other features, or if you want to speak about sshproxy,

12:20.400 --> 12:29.280
you can come, we can have a drink and talk about this. Thank you for listening.

