Hardening your cluster

Language Icon 1 month ago · 11 min read
Server v4.8 Server Admin
Contribute Go to Code

This section provides supplemental information on hardening your Kubernetes cluster.

Network topology

A server installation basically runs three different type of compute instances: The Kubernetes nodes, Nomad clients, and external VMs.

Best practice is to make as many of the resources as private as possible. If users will access your CircleCI server installation via VPN, there is no need to assign any public IP addresses, as long as you have a NAT gateway setup. Otherwise, you will need at least one public subnet for the circleci-proxy load balancer.

It is also recommended to place Nomad clients and VMs in a public subnet to enable users to SSH into jobs and scope access via networking rules.

An nginx reverse proxy is placed in front of Kong and exposed as a Kubernetes service named circleci-proxy. nginx is responsible routing the traffic to the following services: kong and nomad.
When using Amazon Certificate Manager (ACM), the name of the nginx service will be circleci-proxy-acm instead of circleci-proxy. If you have switched from some other method of handling your TLS certificates to using ACM, this change will recreate the load balancer and you will have to reroute your associated DNS records for your <domain> and app.<domain>.
When using Nomad, clients and servers should be configured to use MTLS for secure communication.

Network traffic

This section explains the minimum requirements for a server installation to work. Depending on your workloads, you might need to add additional rules to egress for Nomad clients and VMs. As nomenclature between cloud providers differs, you will probably need to implement these rules using firewall rules and/or security groups.

Where you see "external," this usually means all external IPv4 addresses. Depending on your particular setup, you might be able to be more specific (for example, if you are using a proxy for all external traffic).

The rules explained here are assumed to be stateful and for TCP connections only, unless stated otherwise. If you are working with stateless rules, you need to create matching ingress or egress rules for the ones listed here.

Reverse proxy status

You may wish to check the status of the services routing traffic in your CircleCI server installation and alert if there are any issues. Since we use both nginx and Kong in CircleCI server, we expose the status pages of both via port 80.

Service Endpoint

nginx

/nginx_status

Kong

/kong_status

Kubernetes load balancers

Depending on your setup, your load balancers might be transparent (that is, they are not treated as a distinct layer in your networking topology). In this case, you can apply the rules from this section directly to the underlying destination or source of the network traffic. Refer to the documentation of your cloud provider to make sure you understand how to correctly apply networking security rules, given the type of load balancing used by your installation.

Ingress

If the traffic rules for your load balancers have not been created automatically, here are their respective ports:

Name Port Source Purpose

circleci-proxy/-acm

80

External

User interface and frontend API

circleci-proxy/-acm

443

External

User interface and frontend API

circleci-proxy/-acm

3000

Nomad clients

Communication with Nomad clients

circleci-proxy/-acm

4647

Nomad clients

Communication with Nomad clients

circleci-proxy/-acm

8585

Nomad clients

Communication with Nomad clients

Egress

The only egress needed is for TCP traffic to the Kubernetes nodes on the Kubernetes load balancer ports (30000-32767). This egress is not needed if your load balancers are transparent.

Common rules for compute instances

These rules apply to all compute instances, but not to the load balancers.

Ingress

If you want to access your instances using SSH, you will need to open port 22 for TCP connections for the instances in question. It is recommended to scope the rule as closely as possible to allowed source IP addresses and/or only add such a rule when needed.

Egress

You most likely want all of your instances to access internet resources. This requires allowing egress for UDP and TCP on port 53 to the DNS server within your VPC, and TCP ports 80 and 443 for HTTP and HTTPS traffic. Instances building jobs (that is, the Nomad clients and external VMs) also will likely need to pull code from your VCS using SSH (TCP port 22). SSH is also used to communicate with external VMs, and should be allowed for all instances with the destination of the VM subnet and your VCS.

Kubernetes nodes

Intra-node traffic

By default, the traffic within your Kubernetes cluster is regulated by networking policies. This should be sufficient to regulate the traffic between pods. No additional requirement are needed to reduce traffic between Kubernetes nodes any further (it is fine to allow all traffic between Kubernetes nodes).

To make use of networking policies within your cluster, you may need to take additional steps, depending on your cloud provider and setup. Here are some resources to get you started:

Ingress

If you are using a managed service, you can check the rules created for the traffic coming from the load balancers and the allowed port range. The standard port range for Kubernetes load balancers (30000-32767) should be all that is needed here for ingress. If you are using transparent load balancers, you need to apply the ingress rules listed for load balancers above.

Egress

Port Destination Purpose

4647

Nomad clients

Communication with the Nomad clients

all traffic

other nodes

Allow intra-cluster traffic

Nomad clients

Nomad clients do not need to communicate with each other. You can block traffic between Nomad client instances completely.

Ingress

Port Source Purpose

4647

K8s nodes

Communication with Nomad server

64535-65535

External

Rerun jobs with SSH functionality

Egress

Port Destination Purpose

22

VMs

SSH communication with VMs

4647

Nomad Load Balancer

Internal communication

External VMs

Similar to Nomad clients, there is no need for external VMs to communicate with each other.

Ingress

Port Source Purpose

22

Kubernetes nodes

Internal communication

22

Nomad clients

Internal communication

2376

Kubernetes nodes

Internal communication

2376

Nomad clients

Internal communication

54782

External

Rerun jobs with SSH functionality

Egress

You will only need the egress rules for internet access and SSH for your VCS.

Notes on AWS networking with machine provisioner

When using the EC2 provider for machine provisioner, there is an assignPublicIP option available in the values.yaml file.

machine_provisioner:
  ...
  providers:
    ec2:
      ...
      assignPublicIP: false

By default, this option is set to false, meaning any instance created by machine provisioner will only be assigned a private IP address.

Private IP addresses only

When the assignPublicIP option is set to false, restricting traffic with security group rules between services can be done using the Source Security Group ID parameter.

Within the ingress rules of the VM security group, the following rules can be created to harden your installation:

Port Origin Purpose

54782

CIDR range of your choice

Allows users to SSH into failed virtual machine based jobs and to retry and debug

Using public IP addresses

When the assignPublicIP option is set to true, all EC2 instances created by machine provisioner are assigned public IPv4 addresses. Also, all services communicating with them do so via their public addresses.

When hardening an installation where the machine provisioner uses public IP addresses, the following rules can be created:

Port Origin Purpose

54782

CIDR range of your choice

Allows users to SSH into failed virtual machine based jobs to retry and debug.