PoshJosh's Blog

Jenkins on AWS - Best practices

March 10, 2020

I’ve had some experience in this realm for a while now, but I’m having a little trouble with one issue in particular. Before I divulge, I’ll present my thoughts on best practice and what I’ve been able to implement:

  • Terraform everything (in accordance with Terragrunt’s “style guide”, i.e. its recommended organization)

  • THIS IS A BIG ONE: for the jenkins master task, pass the following JVM args so jenkins jobs aren’t slow as hell to start:

-Djava.awt.headless=true -Dhudson.slaves.NodeProvisioner.initialDelay=0 -Dhudson.slaves.NodeProvisioner.MARGIN=50 -Dhudson.slaves.NodeProvisioner.MARGIN0=0.85

THIS IS A GAME CHANGER (more so on k8s clusters, where the ecs plugin isn’t used… hint, it’s shit).
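If you keep those flags in code rather than pasting one long string around, something like the sketch below can render them. It assumes the controller container reads a JAVA_OPTS environment variable (the official Jenkins image does); the dict and the function name are made up for illustration.

```python
# Sketch: assemble the JVM flags quoted above into one JAVA_OPTS string.
# PROVISIONER_PROPS and build_java_opts are illustrative names, not Jenkins APIs.

PROVISIONER_PROPS = {
    "java.awt.headless": "true",
    "hudson.slaves.NodeProvisioner.initialDelay": "0",
    "hudson.slaves.NodeProvisioner.MARGIN": "50",
    "hudson.slaves.NodeProvisioner.MARGIN0": "0.85",
}


def build_java_opts(props):
    """Render system properties as space-separated -Dkey=value flags."""
    return " ".join(f"-D{key}={value}" for key, value in props.items())


if __name__ == "__main__":
    # Paste the printed value into the task's JAVA_OPTS environment variable.
    print(build_java_opts(PROVISIONER_PROPS))
```

Keeping the properties as key/value pairs makes it harder to lose a flag when someone edits the task definition later.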

  • Create an EFS (in a separate terraform module) and mount it to the jenkins ECS cluster at /var/jenkins_home. Makes jenkins much more reliable through outages and easier to upgrade.

  • Run a logging agent (via docker container) like logspout or newrelic or whatever IN USER_DATA and not as a task - that way you get logs if there are issues during user_data/cloud_init… this I’m actually not sure about. Running a container outside the context of an ECS task means the ECS agent can’t really track it and allocate mem/cpu properly… but it does help with user_data triage.

  • Use pipelines and git plugins to drive jobs. All jenkins jobs should be in source control!

  • Make sure you set up docker cleanup jobs on DAY 1! If you have limited access to your cluster and you run out of disk due to docker cache, networks, volumes, etc… you’re screwed till the admin ssh’s in and runs a prune. Get a docker system prune going, or the equivalent for each docker resource with appropriate filters… i.e. filter for anything that is dangling and older than a few days.

  • Use Jenkins Global Libraries to make Jenkinsfiles cleaner (I always just use vars instead of groovy/java style packages because it’s easier and less ugly)

  • Jenkinsfiles should mostly call other bash files, make files, python scripts to generate and load prop files, etc. The less logic you put in a Jenkinsfile (which is just modified groovy) the better. String interpolation, among other things, is a fuckery that we don’t have time to triage.

  • (out-of-scope) Move to using k8s/EKS instead of ECS asap because the ECS plugin for jenkins is absolute shit and it doesn’t use priority correctly (sorry whoever developed it and… oh wait abandoned it and hasn’t merged anything for years… but for real it’s cool, just give admin to someone else).

  • (cultural) Stop calling them slaves. “Hey @eng, we’re rotating slaves due to some cache issues. If you have been affected by race conditions in the past, our new update and slave rotation should fix that. Our update may have killed your job that was running on an old slave, just wait a few and the new slaves will be ready” <-- This just doesn’t look good.
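The docker cleanup job from the list above could look roughly like this. It is a sketch, not a drop-in job: the 72h cutoff is my reading of "a few days", and the function names are invented. It defaults to a dry run that only prints the commands.

```python
import subprocess

MAX_AGE = "72h"  # assumption: "a few days" taken as three days


def prune_commands(max_age=MAX_AGE):
    """Return one `docker ... prune` command per resource type."""
    until = f"until={max_age}"
    return [
        ["docker", "container", "prune", "--force", "--filter", until],
        # Without --all, `image prune` only removes dangling images.
        ["docker", "image", "prune", "--force", "--filter", until],
        ["docker", "network", "prune", "--force", "--filter", until],
        # `volume prune` does not accept an `until` filter, so no age cutoff here.
        ["docker", "volume", "prune", "--force"],
    ]


def run_cleanup(dry_run=True):
    """Print the prune commands, or execute them when dry_run is False."""
    for cmd in prune_commands():
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_cleanup(dry_run=True)
```

Scheduling it from a Jenkins job pinned to each build host is one option; a cron entry baked into the host image would do the same thing without depending on Jenkins being healthy.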

Run your agents in a different cluster/ASG

  • Run the Jenkins agent at least as a different user than the controller if you’re running on the same server. They should have different levels of permissions.

  • Someone laid it out in another comment, but create both users, put them in the docker group, give them different home directories, and never run as root.

  • Also, if you put your agents in a separate ECS cluster and a different ASG, you can automatically prune and clean up your unneeded build artifacts during the natural course of scaling.
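The two-user setup above can be sketched as a pair of useradd invocations. The user names and the agent home directory below are hypothetical (only /var/jenkins_home comes from this post), and the sketch assumes the docker group already exists on the host. It only builds the commands; running them needs root on the host.

```python
def useradd_command(user, home, groups=("docker",)):
    """Build a useradd command giving `user` its own home and group memberships."""
    if user == "root":
        raise ValueError("never run Jenkins as root")
    return [
        "useradd",
        "--create-home",
        "--home-dir", home,
        "--groups", ",".join(groups),
        user,
    ]


# Hypothetical names: one account for the controller, one for the agent.
ACCOUNTS = {
    "jenkins": "/var/jenkins_home",
    "jenkins-agent": "/home/jenkins-agent",
}

if __name__ == "__main__":
    for user, home in ACCOUNTS.items():
        print(" ".join(useradd_command(user, home)))
```

Separate homes mean a build running as the agent user cannot casually read or overwrite the controller's configuration and secrets.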

You will need to run the agent as privileged

  • Unless you don’t want to build docker images or invoke new processes within the agent.

  • Mount the host’s home directory and docker sock, and on the host create cgroups for net_cls, systemd… and I can’t remember the others. Basically, Jenkins agents will need to create new processes, so they must have extended access to the host. They need to be able to create and manage their processes, and to do that those cgroups need to exist. There are like 5 of them, and I can’t remember what they are since I threw them in an ansible task and forgot about it.
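As a rough picture of the privileged agent described above, here is the equivalent plain docker run invocation, built in Python. The image name and home path are placeholders, and the cgroup setup is left out on purpose since the full list isn't given here. On ECS the same settings would go in the task definition instead of a command line.

```python
import shlex

DOCKER_SOCK = "/var/run/docker.sock"


def agent_run_command(image, host_home):
    """Build a `docker run` command for a privileged agent container."""
    return [
        "docker", "run", "--detach",
        "--privileged",
        # Lets the agent talk to the host's docker daemon to build images.
        "--volume", f"{DOCKER_SOCK}:{DOCKER_SOCK}",
        # Same path inside and outside so bind mounts from builds resolve.
        "--volume", f"{host_home}:{host_home}",
        image,
    ]


if __name__ == "__main__":
    # Placeholder image and path, for illustration only.
    cmd = agent_run_command("example/jenkins-agent:latest", "/home/jenkins-agent")
    print(shlex.join(cmd))
```

Mounting the home directory at the same path on both sides matters because containers started through the shared socket resolve bind-mount paths against the host, not against the agent container.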

Snapshot the master nightly

  • There is a way on the AWS console to set up nightly snapshots of an EBS volume. Make sure the master home volume is on its own EBS and backed up nightly.

  • You will need that snapshot if you want to move to a new AZ or region. Just spin up a new home volume from the snapshot, mount, and go.

  • You will only ever have 1 master instance.
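If you would rather script the snapshot than click through the console, the snapshot and restore steps above can be sketched like this. It is a scripted alternative, not the console feature mentioned above. The functions take an EC2 client as an argument (in practice boto3.client("ec2")), and the description text and function names are my own.

```python
import datetime


def snapshot_home_volume(ec2, volume_id):
    """Create a dated snapshot of the controller's home volume; return its id."""
    stamp = datetime.date.today().isoformat()
    response = ec2.create_snapshot(
        VolumeId=volume_id,
        Description=f"jenkins-home nightly {stamp}",
    )
    return response["SnapshotId"]


def restore_home_volume(ec2, snapshot_id, availability_zone):
    """Create a fresh volume from a snapshot in the given AZ; return its id."""
    response = ec2.create_volume(
        SnapshotId=snapshot_id,
        AvailabilityZone=availability_zone,
    )
    return response["VolumeId"]


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   ec2 = boto3.client("ec2")
#   snap = snapshot_home_volume(ec2, "<your-volume-id>")
```

Restoring into another AZ in the same region works straight from the snapshot. For another region the snapshot has to be copied there first (the EC2 API has a copy_snapshot call for that), which this sketch does not cover.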


Written by Chinomso Ikwuagwu Excélsior
