Getting an FTP server running on Kubernetes is a little tricky. The FTP protocol uses multiple ports in its negotiation, and you need to make sure that the conversation always connects to the same Kubernetes pod. FTP is not a great choice nowadays and SFTP is better, but in this case I needed to build a server that would handle uploads from devices that were not that configurable, so it had to be plain FTP to get going. It is much better, though, to secure it with TLS if possible.

In addition, I wanted to write more or less directly to cloud storage (for backup) and use Pub/Sub to initiate a workflow based on the uploaded data.

FTP server on Node

I’m not using a pre-baked server, for all the reasons I mentioned earlier. Instead I chose ftp-srv (https://github.com/trs/ftp-srv), mainly because:

  • You can create a custom file system, which allows me to create one based on cloud storage
  • FTP is not secure, so I wanted to lock it down as much as possible – limiting it to only being able to process small uploads, and nothing else.
  • ftp-srv supports TLS, although I’m not yet using it in this proof of concept

The code for this entire project is at https://github.com/brucemcpherson/sensor

Kubernetes cluster

For testing, I’m using a preemptible cluster to keep the costs down. I always find it’s easier to create Kubernetes resources through gcloud and kubectl rather than through the UI in the cloud console; that way you can easily repeat it. So first, let’s get a small cluster up and running in your cloud project.
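Something like this creates one (the cluster name, zone and machine type here are placeholders of mine, not values from the project):

gcloud container clusters create sensor-cluster \
  --zone europe-west2-a \
  --num-nodes 2 \
  --machine-type e2-small \
  --preemptible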

Set up credentials and check the cluster looks ok
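For example, assuming the placeholder cluster name and zone from above:

gcloud container clusters get-credentials sensor-cluster --zone europe-west2-a
kubectl get nodes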

Assign yourself as admin
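This is the usual one-off binding, taking the account from your gcloud config:

kubectl create clusterrolebinding cluster-admin-binding \
  --clusterrole cluster-admin \
  --user $(gcloud config get-value account)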

Staging, dev, production

I generally have three different environments for testing locally, testing in Docker and Kubernetes, and a final version. These will (potentially) each have different parameters and settings, so the first task is to create a settings file (secrets.js) which will be shared between all apps in this project and which partitions the parameters into local, development and production. These will be labelled ‘tl’, ‘td’ and ‘tp’. It looks something like this
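A skeleton of that shape might look like this (a sketch only; the real secrets.js in the repo has more in it):

// secrets.js - shared settings, partitioned by run mode
const secrets = {
  tl: {},   // local testing
  td: {},   // development on docker and kubernetes
  tp: {}    // production
};
module.exports = secrets;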

and each of tp, td and tl contains stuff like this. I’ll get into the details of these settings when they are needed. I always find it’s best to start with this kind of structure, to avoid refactoring later and to make it easy to add (or remove) different environments.
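As a hypothetical illustration of a single run mode (the property names here are my guesses; the gcp and user properties are referenced later on):

// one run mode's entry in secrets.js - hypothetical keys
const td = {
  ftp: {
    port: 21,
    pasvUrl: process.env.PASVURL,      // external ip, patched into deploy.yaml
    pasvRange: process.env.PASVRANGE   // e.g. '18101-18104'
  },
  gcp: {
    serviceAccountFile: 'common/private/sa.json',
    bucket: 'some-upload-bucket'
  },
  user: {
    file: 'common/private/users.json'
  },
  pubsub: {
    dataArrived: 'dataArrived-td',
    dataReady: 'dataReady-td'
  }
};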

Whitelisting

ftp-srv gives the ability to whitelist only certain FTP directives. Since I only want to upload files, and nothing more, this is the minimum I can get away with.
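Something along these lines, where the exact command set an upload-only server needs is my assumption (check the ftp-srv docs for the option names in the version you use):

// just enough directives to log in, go passive and store a file
const FtpSrv = require('ftp-srv');

const ftpServer = new FtpSrv({
  url: 'ftp://0.0.0.0:21',
  pasv_url: process.env.PASVURL,
  pasv_min: 18101,
  pasv_max: 18104,
  whitelist: ['USER', 'PASS', 'TYPE', 'PASV', 'EPSV', 'STOR', 'QUIT']
});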

Yarn/npm

Because multiple apps will make up this project, I have a top-level package.json for things that are used in every app, then a specific one for each app.

common package.json
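Something like this; the dependencies are a guess based on what this article uses, so check the repo for the real file:

{
  "name": "sensor-common",
  "version": "1.0.0",
  "private": true,
  "dependencies": {
    "@google-cloud/pubsub": "^1.1.0",
    "@google-cloud/storage": "^4.1.0"
  }
}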

ftp app package.json
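and a minimal app-level one might be (again, a sketch rather than the repo’s actual file):

{
  "name": "sensor-ftp",
  "version": "1.0.0",
  "private": true,
  "main": "index.js",
  "dependencies": {
    "ftp-srv": "^4.3.0"
  }
}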


FTP app

I won’t replicate all the code here, as it’s on GitHub, but I’ll just highlight and explain a few pieces.

Event handling

The only client/server events that need handling are login and client-error. Login checks the user/password combination and then goes on to handle the upload. When the upload is done, I also need to send a Pub/Sub message (the Pub/Sub code is on GitHub), consisting of some control information and the contents of the file just uploaded.
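A sketch of that wiring, continuing from the ftpServer created earlier; checkUser and publishDataArrived are hypothetical stand-ins for the repo’s code, and CloudFileSystem is the custom file system described next:

ftpServer.on('login', ({ connection, username, password }, resolve, reject) => {
  if (!checkUser(username, password)) {
    return reject(new Error('invalid credentials'));
  }
  // fires when an upload completes - publish control info plus the file contents
  connection.on('STOR', (error, fileName) => {
    if (!error) publishDataArrived(fileName);
  });
  resolve({ fs: new CloudFileSystem(connection) });
});

ftpServer.on('client-error', ({ context, error }) => {
  console.error('client error in', context, error);
});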

The custom file system

ftp-srv allows the creation of a custom file system. Since all other operations are not whitelisted, the only method that needs to be overridden is write, which is called to handle streaming of data as it arrives. Rather than writing it to a file, which is the normal action, I need to stream it to cloud storage. (As it turned out, ftp-srv had some problem that made its stream incompatible with cloud storage streaming, so in practice the file is temporarily written to the container’s local storage, then that file is streamed to cloud storage and finally deleted.)
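A minimal sketch of that workaround; the bucket name is a placeholder and error handling is omitted (the real implementation is in the repo):

const { FileSystem } = require('ftp-srv');
const { Storage } = require('@google-cloud/storage');
const fs = require('fs');
const os = require('os');
const path = require('path');

class CloudFileSystem extends FileSystem {
  // ftp-srv calls write() for each incoming upload and expects a writable stream
  write(fileName) {
    const tempPath = path.join(os.tmpdir(), fileName);
    const stream = fs.createWriteStream(tempPath);
    stream.once('close', () => {
      // the upload has finished - copy the temp file to cloud storage, then tidy up
      new Storage()
        .bucket('my-upload-bucket')
        .upload(tempPath, { destination: fileName })
        .then(() => fs.promises.unlink(tempPath));
    });
    return stream;
  }
}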

Deployment

kubectl apply -f deploy.yaml

This is set up to use ports 18101 – 18104 for passive FTP. The number of ports in use defines how many simultaneous uploads can happen. The range is passed to the ftp app via an environment variable. Note the selector app=ftptd: this will be used by the service that exposes the pods of this deployment externally as a service target.

The PASVURL environment variable defines the externally facing IP address that the load balancer service, to be created next, will assign. We don’t know it at this point, but once assigned it can just be patched in here and the .yaml file reapplied.
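A cut-down sketch of what deploy.yaml could contain, consistent with the description above (the image path and names are placeholders):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ftptd
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ftptd
  template:
    metadata:
      labels:
        app: ftptd
    spec:
      containers:
        - name: ftptd
          image: gcr.io/my-project/ftp:td
          ports:
            - containerPort: 21
            - containerPort: 18101
            - containerPort: 18102
            - containerPort: 18103
            - containerPort: 18104
          env:
            - name: PASVURL
              value: "35.200.0.1"   # patch in the service's external ip once known
            - name: PASVRANGE
              value: "18101-18104"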

The external loadbalancer

kubectl apply -f svc.yaml

The deployment pods need a service to expose them externally. For simplicity, I’m using the same ports here as the pods, but if required these could be forwarded to different target ports.

Because passive FTP is not stateless, it’s important that clients always connect to the same pod instance. Using sessionAffinity: ClientIP means that the same client will connect to the same pod on second and subsequent requests. The client also checks that the IP matches the one it supplied, so we need to set externalTrafficPolicy: Local to avoid the IP address being translated from an external to an internal one in passive mode.
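A sketch of svc.yaml along those lines (names are placeholders):

apiVersion: v1
kind: Service
metadata:
  name: ftptd
spec:
  type: LoadBalancer
  selector:
    app: ftptd
  sessionAffinity: ClientIP
  externalTrafficPolicy: Local
  ports:
    - name: ftp
      port: 21
    - name: pasv-18101
      port: 18101
    - name: pasv-18102
      port: 18102
    - name: pasv-18103
      port: 18103
    - name: pasv-18104
      port: 18104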

After a little while, kubectl get service will show an actual external IP address for the service (it will show <pending> for a while). Copy this external address into the deploy.yaml

and redo

kubectl apply -f deploy.yaml

Building the app

I prefer to build locally rather than using cloud build, as it makes it easier to test the docker image locally before trying it on the cluster.

This build script takes two arguments, the name of the app and the run mode. It builds the image, tags it (note that the images are tagged with the run mode, so that different images can be used in development and production on the same cluster), pushes it to the cloud container registry, then deletes the matching pods so they are recreated with the updated image. You could further enhance the tagging with a version number if necessary, but the tag should match the tagged image name in the deployment yaml.
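A sketch of what such a script might look like; the registry path is a placeholder and the real build.sh is in the repo:

#!/bin/sh
# usage: sh build.sh <app> <runmode>
APP=$1
MODE=$2
IMAGE=gcr.io/my-project/${APP}:${MODE}

# build the image, tag it with the run mode and push it to the container registry
docker build -t ${IMAGE} ${APP}
docker push ${IMAGE}

# delete the matching pods so the deployment recreates them with the updated image
kubectl delete pods -l app=${APP}${MODE}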

sh build.sh ftp td

The Dockerfile

Since the environment variables are set up in the yaml files, this can be very minimal.
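Something as small as this matches that description (the node base image version is my assumption):

FROM node:12-slim
WORKDIR /usr/src/app
COPY package*.json ./
RUN npm install --production
COPY . .
CMD ["node", "index"]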

Running locally

If you want to test the docker image locally (any simple ftp client should do – I’m using ncftp), this script should do it
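For example, with the same placeholder image path as before:

docker run -it --rm \
  --env-file env.list \
  -p 21:21 \
  -p 18101-18104:18101-18104 \
  gcr.io/my-project/ftp:td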

with an env.list file of
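PASVURL comes from the article; the other variable names are my guesses at what the app reads:

# hypothetical values - PASVURL must be an address the ftp client can reach
MODE=td
PASVURL=127.0.0.1
PASVRANGE=18101-18104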

Alternatively, you can run it completely locally with node index.

Private files

I haven’t published these, but you need two files in common/private

  • A service account file with the capability to write to storage and Pub/Sub, exactly as downloaded from the cloud console. You can specify these through env variables, but I prefer to do it this way
  • A user file with usernames and passwords that looks something like the sketch below. There are much better ways to handle passwords, but this will do to get started with
These are referenced in the gcp and user properties in the settings file.
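A hypothetical shape for the user file (the real format is whatever the repo’s code expects):

{
  "users": [
    { "username": "device1", "password": "not-a-real-password" }
  ]
}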

Pubsub

In this article, I won’t go into consuming the messages sent, but we need a topic and a subscription, which will probably be different between run modes. These are referenced in the settings file with the abstracted names ‘dataArrived’ and ‘dataReady’. The ftp app publishes to the dataArrived topic, and any consumers will subscribe to dataReady.

Setting up topics and subscriptions through the UI can be error prone and laborious, so here’s a script to bulk delete and create the subscriptions needed for this project.
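A sketch of the idea, assuming topic and subscription names suffixed with the run mode (the naming convention is my guess):

#!/bin/sh
# recreate the topics and subscriptions for a run mode, e.g. sh pubsub.sh td
MODE=$1
for TOPIC in dataArrived dataReady; do
  gcloud pubsub subscriptions delete ${TOPIC}-${MODE}-sub --quiet
  gcloud pubsub topics delete ${TOPIC}-${MODE} --quiet
  gcloud pubsub topics create ${TOPIC}-${MODE}
  gcloud pubsub subscriptions create ${TOPIC}-${MODE}-sub --topic=${TOPIC}-${MODE}
done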

Summary

It’s not great to have to deal with FTP nowadays, but there we have it: FTP on Kubernetes.

Source code is here https://github.com/brucemcpherson/sensor