Scalable container-based CI with GitLab and AWS Fargate

We've been using GitLab Community Edition for a long time to host most of our source control. GitLab CE, for those who are not familiar, is the self-hosted version of GitLab.com. It's free and open-source, and we like to self-host because it also allows us to be confident our work is well managed and secure. By combining this powerful tool with AWS Fargate containerization we can deliver fast, secure, scalable and cost-effective automation and DevOps to any IT project.

Part one

At Code Enigma, our standard approach for all our clients is to create a GitLab CE instance when we onboard them, usually with AWS EC2, and use that for source control and orchestration. By leveraging AWS Fargate to deploy containerized automation with almost no lights-on costs, we achieve a complete DevOps solution that pulls from the best parts of both private innovation and open-source software. The key benefits of this solution are:

Low lights-on cost
Low maintenance
Highly scalable
Far stronger network security than other CI platforms
Data sovereignty
Superior back-up capability
Unparalleled flexibility

This article is split into two parts. The first part (this post) introduces the technology involved and provides further reading. The reading is quite hands-on and technically detailed, so if you like what you see but don't have the time to dig in yourself, don't hesitate to contact us.

The second part is intended for those who have completed the GitLab proof of concept and want to understand more complicated setups.

Introducing GitLab CI

GitLab CE, just like the online platform and the paid Enterprise version, has a whole load of tools to help developers work more efficiently, from the basic features of Git version control itself through to user interfaces for change management, issue queues, labelling, project management tools, deployment tools and much more. For the purposes of this blog, we're going to zero in on a piece of software called GitLab Runner. This is the engine behind GitLab's built-in continuous integration (CI) services.

You can find out more about GitLab and how it handles CI in their documentation, but if you've used Travis or CircleCI or GitHub Actions or or or... you get the picture... it should be a familiar model: a YAML file in the repository root (.gitlab-ci.yml, in GitLab's case) containing crafted instructions to carry out automated tasks on certain trigger events.

So what is the runner? It is a standalone piece of software that sits there polling for jobs from one (or more) GitLab instances over the GitLab API. The GitLab Runner application itself can be running multiple "runners" and it is up to GitLab and your configuration which runners are available to which projects, whether they are public or private, and how and where they operate.
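To make the polling model concrete, here is a minimal Python sketch. It is not how GitLab Runner is implemented: FakeGitLab, poll_once and run_runner are hypothetical names, and the "GitLab instances" are just in-memory queues standing in for the real API. It only illustrates the idea of one runner process asking one or more instances for work and executing whatever it is handed.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class FakeGitLab:
    """Hypothetical stand-in for a GitLab instance: a queue of pending CI jobs."""
    name: str
    pending: deque = field(default_factory=deque)

    def request_job(self):
        # Hand back the next pending job, or None if nothing is waiting.
        return self.pending.popleft() if self.pending else None


def poll_once(instances, execute):
    """Ask each instance for one job and execute any job that is handed back."""
    results = []
    for gitlab in instances:
        job = gitlab.request_job()
        if job is not None:
            results.append((gitlab.name, execute(job)))
    return results


def run_runner(instances, execute, max_polls):
    """Poll a fixed number of times; a real runner would loop forever and sleep between polls."""
    completed = []
    for _ in range(max_polls):
        completed.extend(poll_once(instances, execute))
    return completed


if __name__ == "__main__":
    a = FakeGitLab("gitlab-a", deque(["build", "test"]))
    b = FakeGitLab("gitlab-b", deque(["deploy"]))
    print(run_runner([a, b], execute=lambda job: f"ran {job}", max_polls=3))
```

The execute callable is where the configured environment comes in: swapping it changes where and how the job runs without touching the polling loop.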

When a runner picks up a job it runs the automation using whatever environment is configured for that runner. Typically this will be GitLab's shell executor, which just uses the shell of whatever server it is running on to execute the automation (note, GitLab Runner does not have to be on the GitLab server itself; in fact, often you wouldn't want that). However, GitLab Runner has an architecture that supports plugins, via what they call the "custom executor". Consequently, there are several "executors" for GitLab Runner, either built into the platform or available on their website.

This blog looks specifically at using the AWS Fargate driver for GitLab's "custom executor" to set up scalable and cost-effective automation and continuous integration.
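As a rough illustration of how a custom executor plugs in, GitLab's documentation describes the custom executor as calling out to a driver in stages: config, prepare, run (possibly several times per job) and cleanup, with cleanup executed even when an earlier stage fails. The Python sketch below models only that ordering. RecordingDriver and execute_job are hypothetical names for illustration; a real driver is an external executable, not a Python class.

```python
class RecordingDriver:
    """Hypothetical driver that records which stage was called, and can fail on demand."""

    def __init__(self, fail_on=None):
        self.calls = []
        self.fail_on = fail_on

    def _stage(self, name):
        self.calls.append(name)
        if name == self.fail_on:
            raise RuntimeError(f"{name} failed")

    def config(self):
        self._stage("config")

    def prepare(self):
        # For a Fargate-style driver, this is where the container would be started.
        self._stage("prepare")

    def run(self, script):
        self._stage(f"run:{script}")

    def cleanup(self):
        # For a Fargate-style driver, this is where the container would be stopped.
        self._stage("cleanup")


def execute_job(driver, scripts):
    """Drive one job through the stages, always finishing with cleanup."""
    try:
        driver.config()
        driver.prepare()
        for script in scripts:
            driver.run(script)
        return True
    except RuntimeError:
        return False
    finally:
        driver.cleanup()


if __name__ == "__main__":
    driver = RecordingDriver()
    print(execute_job(driver, ["build", "test"]), driver.calls)
```

The try/finally is the important design point: whatever environment the prepare stage created gets torn down even if the job itself fails, which is what keeps a pay-per-use environment from lingering.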

What is Fargate?

Simply put, Fargate is the AWS platform for automatically managing container compute capacity so you don't have to. Usually, with containerization, you need to have some underlying infrastructure ready to take workloads handed to it by whatever container management software you're using. Platforms like Fargate take away the need to keep compute resource available, maintained (and thus paid for) all the time. Let's work through a simple example.

You have a busy development team and you want your CI to run in containers so there's no queuing of jobs, no developers waiting around for somebody else's automated testing to complete before their build can run. You set up a couple of servers to run your containerization software, and probably another one to handle the actual orchestration (which workload goes to which server/cluster, etc.), then you create your CI container(s) and set up your CI software to deploy to a container instead of queuing up to use a local shell script.

Fast forward several weeks of set-up and hurrah, whenever a developer pushes some code a new container gets created in your self-managed cluster, the automation magic happens inside it and it disappears again. This is great, your code deployments won't get in the way of anything else, they're separated and efficient and when nobody's doing anything they're not taking up resource. Containerization ftw!

But it's still a bit of a pain. For one, your servers are always on, even when your developers aren't there, which is 75% of the hours in a week! And someone on your team has to patch them, and if they break someone needs to go and figure out why, and while they're figuring it out everything is broken. Managing your own containers isn't that straightforward and that's where platforms like Fargate come in.
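The 75% figure is easy to sanity-check. The short Python sketch below assumes a working week of around 40 hours (the exact number is an assumption, so a few values are tried) and computes the share of the 168 hours in a week during which always-on servers have nobody using them.

```python
HOURS_PER_WEEK = 7 * 24  # 168 hours in a week


def idle_fraction(working_hours_per_week):
    """Fraction of the week in which developers are not at work."""
    return 1 - working_hours_per_week / HOURS_PER_WEEK


if __name__ == "__main__":
    # Assumed working weeks; adjust to match your own team.
    for hours in (37.5, 40, 42):
        print(f"{hours} working hours -> {idle_fraction(hours):.0%} of the week idle")
```

A 42-hour week gives exactly 75%, and a 40-hour week is slightly higher, so "75% of the hours in a week" is a fair round number.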

They manage all that stuff for you. Your software just asks, via the AWS API, for a container of specified dimensions using a specified image, and a few minutes later you have it, until it's finished whatever it's doing and it automatically shuts down again. You only pay for the time that container was up and doing something. So when your developers go home for the evening or the weekend, there's no infrastructure sitting there running for literally no reason at all. Indeed, even when your developers are coding and not deploying, there's no infrastructure sitting there running for literally no reason at all.

There's also nothing for you to maintain except your GitLab server, which is small beer: it's just a single server with two Linux packages installed. Much easier to maintain, much less wasteful.
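For a feel of what "asking via the AWS API" looks like, here is a hedged Python sketch using the boto3 SDK's ECS run_task call with the FARGATE launch type. In the set-up this article describes, the Fargate driver makes the equivalent request for you, so this is illustration only. The cluster name, task definition, subnet and security group IDs are placeholders, not values from this article, and the container's "dimensions" (CPU and memory) and image are assumed to live in the task definition. Disabling the public IP is also an assumption; it means the subnet needs some other route for pulling images.

```python
def build_run_task_request(cluster, task_definition, subnets, security_groups):
    """Build the keyword arguments for an ECS run_task call on Fargate."""
    return {
        "cluster": cluster,
        # CPU, memory and container image are defined in the task definition.
        "taskDefinition": task_definition,
        "launchType": "FARGATE",
        "count": 1,
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": list(subnets),
                "securityGroups": list(security_groups),
                # Assumption: private networking with no public IP.
                "assignPublicIp": "DISABLED",
            }
        },
    }


def launch_task(request, region_name):
    """Send the request to AWS and return the ARNs of the started tasks."""
    import boto3  # third-party AWS SDK, imported lazily so the builder works without it

    ecs = boto3.client("ecs", region_name=region_name)
    response = ecs.run_task(**request)
    return [task["taskArn"] for task in response["tasks"]]


if __name__ == "__main__":
    # Placeholder identifiers; replace with real ones before calling launch_task().
    request = build_run_task_request(
        cluster="ci-cluster",
        task_definition="ci-task:1",
        subnets=["subnet-00000000"],
        security_groups=["sg-00000000"],
    )
    print(request)
```

Calling launch_task needs valid AWS credentials and will start a billable task, which is why the example only prints the request.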

Getting started with GitLab and AWS Fargate

If you're not interested in getting your hands dirty you can stop here and hand it over to a colleague, or perhaps give Code Enigma a call. Because from here on out we stop talking about it and start getting into the guts of setting it up.

If you're still with me, firstly, homework! Sorry, but there's some reading up to do and some practical to get into. Fortunately, the GitLab team have already written some pretty good documentation and a proof of concept to get you up and running. If you simply follow the instructions in this blog post, you'll be a long way towards understanding and setting up your own Fargate-based CI.

I'd also encourage you to combine the GitLab documentation above with this Medium blog, which contains a lot more technical specifics and detail, before continuing, as it will fill in blanks around the overall architecture. Note, it is based on the premise of using the GitLab.com platform instead of self-hosted GitLab, so there are parts that you won't need, but the principles are the same, and everything about how it works, shipping your own container and GitLab Runner config is still valid.

And before you try to get clever you should definitely work through the GitLab documentation above and set up a runner from end to end, working on Fargate and tested with your own instances of GitLab and GitLab Runner. You'll discover any issues as you go, you'll have to learn a bit more about AWS and Docker to get going, and the learning will stand you in good stead for the slightly trickier next steps.

You should also get comfortable with building container images with Docker. It's going to be important that you understand this. Fortunately, there's the Docker documentation to help you with image building, and it's pretty good.

Once you've gotten yourself up to speed on all that you'll be in a good place. However, there is quite a lot missing from this that you need for a "real world" set-up to make your containerized automation work. That's why I'm writing this blog - to fill in the blanks. So bookmark this page, go away and get yourself read up and set up and come back. We'll see you in a couple of days for part 2. ;-)

Click here to continue to part 2 of this blog.
