Buildkite - Getting Started with iOS Agents in the Cloud

June 3, 2021

Buildkite is a CI/CD tool that allows you to build the way you want. The easiest way to get started is to simply add a Buildkite Agent to your various environments (dev/qa/staging) and the agent uses your existing build tools with very little extra work. It’s easily installed with Homebrew, includes an attractive cloud-hosted UI, simplified pipelines, and has an impressive suite of plugins and integrations.

If you’re new to Buildkite and want a step-by-step guide to getting started, please read this blog by Dr. Nic Williams. It’s a long form blog that will walk you through everything you need to get started with using Buildkite for your CI/CD needs.

If you already have a Buildkite account, pipelines, etc., and want to learn some of the best practices for automating your agent environments and deploying them to the cloud, this blog is for you.

This blog walks you through the steps to:

Create your own base image or use a provided one from your chosen hosting solution
Automate the installation of your application dependencies
Set up the Buildkite agent to automatically run at startup
Deploy images locally or to a cloud hosting solution
Deploy macOS AMIs to AWS with some tips, tricks, and best practices

iOS, Buildkite, and Automation!

In modern software development, writing tests has become the de facto standard for ensuring quality bug-free code. For iOS developers, this is especially important because the App Store review process can be laborious, taking days or longer, and App Store ratings and user reviews are critical to exposure and adoption.

Modern DevOps best practices encourage treating Infrastructure as Code (IaC). IaC is the management of infrastructure (networks, virtual machines, load balancers, and connection topology) in a descriptive model, using the same kind of versioning that the development team uses for source code (GitHub, etc.). An IaC model generates an identical environment every time it is applied. It is a key DevOps practice and is used in conjunction with CI/CD, among others.

Buildkite is quickly gaining ground as a preferred CI solution for iOS developers. Most commonly, Buildkite iOS pipelines have relied on developers running their own infrastructure for the agents, usually one or more Mac minis set up somewhere in the office. This is a simple and inexpensive long-term solution for many companies, especially those that already maintain on-site infrastructure of their own. But with a large percentage of businesses migrating to the cloud, DevOps best practices are evolving to support that move. This includes treating infrastructure as ephemeral (pets vs. cattle), managing infrastructure as code, and automating all the things (testing, provisioning, building, deploying)!

Even Apple has changed their licensing restrictions to allow their operating systems to be virtualized as long as they are running on genuine Apple hardware. This has started to expand the iOS cloud hosting market with AWS recently launching hosted macOS images to complement some of the smaller mac-cloud businesses like Flow.Swiss and MacStadium.

In this blog, I will show you some tips and tricks of automating macOS image creation, getting your Buildkite agent up and running, and deploying the image either On-Prem, on AWS or on another cloud provider like Flow.Swiss.

Base Image Creation with MacInBox

If you are planning on running your VMs on your own Mac infrastructure, or if you are planning to use a smaller Mac cloud hosting service, you will have to create your own base image. Creating a macOS base image manually has not always been easy. macOS was not written with virtualization in mind, and several manual GUI tasks make it very hard to automate. Thankfully, there is a simple open-source gem called ‘macinbox’, which creates a vanilla base image from a downloaded macOS installer (Catalina, etc.) and stuffs it into a Vagrant box.

I won’t go into the details of how to create the base image, as the macinbox readme has quite detailed instructions to walk you through it, but I will point out that the design philosophy of the tool is to do everything that needs to be done to a fresh install of macOS before the first boot to turn it into a Vagrant box that boots macOS with a seamless user experience. However, this tool is also intended to do the least amount of configuration possible. Nothing is done that could instead be deferred to a provisioning step in a Vagrantfile or Packer template. I chose to use a packer template because it’s a robust option that conforms to our best practices of creating infrastructure as code, and I’ll take you through that in the next section.

Build Automated Machine Images with HashiCorp’s Packer

Packer is a great tool for building machine images. Out of the box, Packer comes with support for building images for AWS, Docker, GCP, Microsoft Azure, VirtualBox, VMware, and more. I’m not going to do a deep dive into Packer; it’s a pretty straightforward tool with very good documentation. I will, however, go over the basics as they pertain to setting up our macOS image for this scenario. There are definitely some tips and tricks that you will need to know in order to automate dependency installation within macOS. Hopefully, this section will cover most of the basics that macOS/iOS developers will need.

Packer templates can be written in JSON or HCL. We’ll be using JSON simply because it is the more ubiquitous format. There are two main blocks to define: ‘builders’ and ‘provisioners’. The ‘builders’ block tells Packer where to find the base image that we created with macinbox and what type of image it is. In our case, I’ll be using the Parallels format running in a Vagrant box. I personally like the Parallels format because it uses the Mac hypervisor, which allows it to map its virtual resources directly to the actual physical hardware, resulting in great performance.

The ‘provisioners’ block is where we will install all of our dependencies. As you will see, I prefer to create shell scripts in separate files and then point the provisioner at them instead of doing everything inline. It makes the template easier for me to read, and makes it easier to separate the different steps. To do it this way, it’s useful to be able to pass variables to the scripts, so that we can use a single variable to update or upgrade individual dependencies. I’ll show you how to pass those variables to the shell scripts as environment variables.

Here is a simple base version of our Packer template, with a ‘provisioners’ block that does nothing but sleep for 30 seconds.

{
  "variables": {
    "buildkite_agent_token": "agent_token_goes_here"
  },
  "builders": [
    {
      "communicator": "ssh",
      "source_path": "base-images/catalina-base-image.box",
      "provider": "parallels",
      "add_force": true,
      "type": "vagrant"
    }
  ],
  "provisioners": [
    {
      "type": "shell",
      "inline": [
        "sleep 30"
      ]
    }
  ]
}

The first thing we will need to provision is the Xcode Command Line Tools (CLT). This is a necessary first step before we can install Homebrew. I am telling the provisioner to use a bash script called install-xcode-cli-tools.sh, found in the same directory, which checks whether the CLT are already installed and, if not, installs them.

{
  "type": "shell",
  "script": "./install-xcode-cli-tools.sh"
}

And, here is what the script looks like:

#!/bin/bash
echo "Checking Xcode CLI tools"
# Only run if the tools are not installed yet.
# To check, try to print the active developer directory.
if ! xcode-select -p &> /dev/null; then
  echo "Xcode CLI tools not found. Installing them..."
  # This marker file makes softwareupdate list the Command Line Tools
  touch /tmp/.com.apple.dt.CommandLineTools.installondemand.in-progress
  # Take the first Command Line Tools entry and strip the leading "*",
  # whitespace, and the "Label: " prefix that Catalina and later print
  PROD=$(softwareupdate -l |
    grep "\*.*Command Line" |
    head -n 1 | awk -F"*" '{print $2}' |
    sed -e 's/^ *//' -e 's/^Label: //' |
    tr -d '\n')
  softwareupdate -i "$PROD" -v
else
  echo "Xcode CLI tools OK"
fi

As most engineers who use macOS will already know, the next thing you’ll want to install will be Homebrew. This will give us the package manager that we’ll use to install most of the other components that we’ll need in the OS. Using the “environment_vars” option in the provisioners block, we will set an ENV variable called BUILDKITE_AGENT_TOKEN.

{
  "type": "shell",
  "environment_vars": ["BUILDKITE_AGENT_TOKEN={{user `buildkite_agent_token`}}"],
  "script": "./install-homebrew-dependencies.sh"
}

We’re going to need the Buildkite agent installed, along with any other Homebrew dependencies, and we’re going to want to set our agent token through a variable, and set our agent to start up with the OS.

#!/bin/bash
echo "Installing Homebrew and other Brew Dependencies"
# Install Homebrew (the installer is a bash script, so it is run with bash rather than piped to Ruby)
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"
# Install the Buildkite agent
/usr/local/bin/brew tap buildkite/buildkite
/usr/local/bin/brew install buildkite-agent
# Update the Buildkite config with the agent token
sed -i '' "s/xxx/$BUILDKITE_AGENT_TOKEN/g" "$(/usr/local/bin/brew --prefix)/etc/buildkite-agent/buildkite-agent.cfg"
# Set the Buildkite agent to start up when the VM starts up
/usr/local/bin/brew services start buildkite/buildkite/buildkite-agent
# You can install any other Homebrew dependencies here too
echo "Homebrew and Dependencies Installed"
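The sed line in the script replaces every occurrence of xxx in the config file. A slightly more defensive variant, sketched below, rewrites only the token line and refuses to run when the token is empty. The function name set_agent_token is hypothetical, and the sketch assumes the config file contains a line beginning with token=, which is what the original sed command implies.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): rewrite only the
# "token=" line of a Buildkite agent config file.
set_agent_token() {
  local cfg="$1" token="$2" tmp
  if [ -z "$token" ]; then
    echo "Agent token is empty; refusing to edit $cfg" >&2
    return 1
  fi
  tmp="$(mktemp)" || return 1
  # Write to a temp file and copy it back, which sidesteps the differences
  # between BSD and GNU "sed -i" and keeps the file's permissions.
  # Assumes the token contains no "|", "&", or backslash characters.
  sed "s|^token=.*|token=\"${token}\"|" "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
  local status=$?
  rm -f "$tmp"
  return "$status"
}
```

In the provisioning script, this could be called with the path to buildkite-agent.cfg as the first argument and "$BUILDKITE_AGENT_TOKEN" as the second.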

If you’re doing iOS development, you’ll need to install Xcode itself. This can be tricky, and there are two main ways to accomplish it. The first is using xcpretty’s ‘xcode-install’ gem. When it works, it’s great, but (without going into too many details) there are times when loading Xcode manually is necessary. If you want to use the gem, just follow their instructions, but if or when you need to do it manually, this is how I do it. First, download the version of Xcode you want from the downloads section of the Apple Developer site. Then copy it from the “source”, which is the path to the file (this example assumes that the file is in the current directory with your scripts), to the “destination”, which is the path in the VM you are creating.

{
  "type": "file",
  "source": "{{user `xcode_version`}}",
  "destination": "{{user `xcode_version`}}"
},
{
  "type": "shell",
  "environment_vars": ["XCODE_VERSION={{user `xcode_version`}}"],
  "script": "./install-xcode.sh"
}

Then, after passing the xcode_version with an ENV Variable in the provisioners block, I use a bash script to do all the things to Xcode to make it ready to use. The last line fixes a bug in the system install of Ruby, when using Catalina and Xcode 12.3 or later. This will become important later if you’re using AWS as well.

#!/bin/bash
echo "Installing $XCODE_VERSION"
echo "Expanding Xcode xip file... (This will take a while)"
xip --expand "$XCODE_VERSION"
echo "Removing Xcode xip file."
rm "$XCODE_VERSION"
echo "Moving Xcode application to Applications folder"
mv Xcode.app /Applications/Xcode.app
echo "Verifying security assessment policy on Xcode version... (This will take a while)"
spctl --assess --raw /Applications/Xcode.app
echo "Setting (Selecting) the current Xcode as the default for command-line tools"
sudo xcode-select -s /Applications/Xcode.app
echo "Accepting Xcode License"
sudo "/Applications/Xcode.app/Contents/Developer/usr/bin/xcodebuild" -license accept
echo "Installing and First Start for Xcode"
sudo "/Applications/Xcode.app/Contents/Developer/usr/bin/xcodebuild" -runFirstLaunch
echo "Fixing symlink in Ruby Sys Install Borked by Xcode 12.2 and later"
cd "$(xcode-select -p)/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/System/Library/Frameworks/Ruby.framework/Versions/2.6/usr/include/ruby-2.6.0" && sudo ln -s universal-darwin20 universal-darwin19

Finally, most of us will probably want Bundler installed for our application-managed dependencies. This is simple enough that we can just leave it inline in the provisioners block of our main Packer template.

{
  "type": "shell",
  "inline": [
    "sudo gem install bundler"
  ]
}

Running the Packer Build to Create the .box File

Now that we’ve created a Packer template, we can run the template using the following command. It will create a vagrant .box file and place it into a folder called output-vagrant/.

packer build your_packer_template.json

If you’ve already run this before and want to overwrite your previous Vagrant output, just add the -force flag.

packer build -force your_packer_template.json
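To avoid keeping the real agent token in the template file, the two commands above can be wrapped in a small script that validates the template and then builds it, passing the token in with Packer's -var flag. This is only a sketch: the function name build_image and the PACKER override are hypothetical, and it assumes the template declares a buildkite_agent_token user variable as in the base template shown earlier.

```shell
#!/bin/bash
# Hypothetical wrapper: validate the template, then build it, taking the
# agent token from the environment instead of hardcoding it in the JSON.
build_image() {
  local template="$1"
  # PACKER can be overridden (for example PACKER=echo) for a dry run.
  local packer="${PACKER:-packer}"
  if [ -z "${BUILDKITE_AGENT_TOKEN:-}" ]; then
    echo "Set BUILDKITE_AGENT_TOKEN before building" >&2
    return 1
  fi
  local var="buildkite_agent_token=${BUILDKITE_AGENT_TOKEN}"
  # Catch template mistakes before starting a long build.
  "$packer" validate -var "$var" "$template" || return 1
  "$packer" build -force -var "$var" "$template"
}
```

With the token exported in the shell, running build_image your_packer_template.json would then replace the bare packer build call.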

Running the Vagrant Box Locally, if you are managing on-prem hardware

And finally, provided we have Vagrant and Parallels Desktop installed, we can run our image locally by simply changing into the “output-vagrant” folder and running:

vagrant up

When it starts up, we should be able to look at our Buildkite UI and see that there is now an agent running in our pool.

Running on a Cloud Provider such as Flow.Swiss

There are several cloud providers smaller than AWS, like Flow.Swiss, that provide fairly inexpensive Mac hardware if you don’t want the upfront cost, management, or hassle of buying and maintaining your own. When you sign up with a provider like this one, you spin up any number of machines through their UI, and you can then either provision them directly or use them to run an already provisioned VM.

Installing all of your dependencies and a Buildkite agent directly on the machine might be tempting if you are only running a machine or two, but trust me, in the long run it will end up being more expensive in both time and money. Doing it the right way and running automated VM images on a cloud provider like this is pretty much the same process as running one locally. After you have provisioned your machines and built your Packer image, you can just upload the .box file and run vagrant up.

Running on AWS

AWS is a slightly different beast. The base images that AWS provides already have some tools installed. These include Homebrew, Command Line Tools, and Xcode.

First, you have to allocate a Dedicated Host and launch an instance onto it, as described in the AWS docs. You can allocate the host in one of two ways. The first is via the CLI:

aws ec2 allocate-hosts --auto-placement on --region us-east-2 --availability-zone us-east-2b --instance-type mac1.metal --quantity 1

Or, you can use the AWS Management Console as described by the following procedure.

To launch a Mac instance onto a Dedicated Host

1. Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/
2. In the navigation pane, choose Dedicated Hosts.
3. Choose Allocate Dedicated Host and then do the following:
   a) For Instance family, choose mac1. If mac1 doesn’t appear in the list, it’s not supported in the currently selected region.
   b) For Instance type, select mac1.metal.
   c) For Availability Zone, choose the Availability Zone for the Dedicated Host.
   d) For Quantity, keep 1.
   e) Choose Allocate.
4. Select the Dedicated Host that you created and then do the following:
   a) Choose Actions, Launch instances onto host.
   b) Select a macOS AMI.
   c) Select the mac1.metal instance type.
   d) On the Configure Instance Details page, verify that Tenancy and Host are preconfigured based on the Dedicated Host you created.
   e) Complete the wizard, specifying EBS volumes, security groups, and key pairs as needed.
5. A confirmation page lets you know that your instance is launching. Choose View Instances to close the confirmation page and return to the console. The initial state of an instance is pending. The instance is ready when its state changes to running and it passes status checks.

Now, we have a Dedicated Host allocated, which is where we’re going to create our EC2 Mac instance. The next step is to use the Dedicated Host to create the AMI via Packer.
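Before pointing Packer at the host, it can be worth confirming from a script that the allocation actually succeeded. The sketch below uses aws ec2 describe-hosts to read the host's state. The helper names and the AWS override variable are hypothetical, and the host ID is assumed to be the one returned by allocate-hosts or shown in the console.

```shell
#!/bin/bash
# Hypothetical helper: print the state of a Dedicated Host (for example
# "available" or "pending"). AWS can be overridden for testing.
host_state() {
  local host_id="$1" region="$2"
  "${AWS:-aws}" ec2 describe-hosts \
    --host-ids "$host_id" --region "$region" \
    --query 'Hosts[0].State' --output text
}

# Hypothetical helper: succeed only when the host reports "available".
host_is_available() {
  [ "$(host_state "$1" "$2")" = "available" ]
}
```

A build script could call host_is_available h-your_host_id us-east-2 and stop early if it fails.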

Here is part of the packer template I’ve created to create the image, and a description of what each field does.

{
    "variables": {
        "aws_access_key": "AKIA....",
        "aws_secret_key": "0Yj1....",
        "ami_name": "catalina-ami",
        "aws_region": "us-east-2",
        "ssh_username": "ec2-user",
        "vpc_id": "vpc-d8fa6da5",
        "subnet_id": "subnet-c6bdf999"
    },
    "builders": [{
        "type": "amazon-ebs",
        "access_key": "{{user `aws_access_key`}}",
        "secret_key": "{{user `aws_secret_key`}}",
        "region": "{{user `aws_region`}}",
        "instance_type": "mac1.metal",
        "force_deregister": true,
        "ssh_username": "{{user `ssh_username`}}",
        "associate_public_ip_address": true,
        "subnet_id": "{{user `subnet_id`}}",
        "ami_name": "{{user `ami_name`}}",
        "source_ami_filter": {
            "filters": {
                "name": "amzn-ec2-macos-10.15.*",
                "root-device-type": "ebs",
                "virtualization-type": "hvm"
            },
            "owners": ["amazon"],
            "most_recent": true
        },
        "run_tags": {
            "Name": "packer-build-image"
        }
    }],

Under the variables key section, the variables include:

aws_access_key & aws_secret_key: For testing purposes, you can put your keys here. For production, you should use one of the techniques discussed in Dr. Nic’s blog mentioned earlier. If you need help finding or creating your access key, see the AWS documentation on access keys.
ami_name: The name given to the AMI generated by Packer.
ami_id: The ID of the base AMI that we created on the dedicated host.
aws_region: The region where the temporary instance is created and the new AMI is stored.
ssh_username: The AMI SSH user. Since we’re using the Catalina image available on AWS as our base image, the default SSH user is ec2-user.
vpc_id & subnet_id: The VPC ID and the subnet ID used by the temporary instance that Packer creates. The instance needs to be accessible from the workstation running Packer, so I recommend a public subnet.

Under the builders key section, the options include:

type: The Packer builder to use. The amazon-ebs builder creates an AMI backed by Elastic Block Store (EBS).
instance_type: The EC2 instance type to use while building the AMI.
force_deregister: Forces Packer to first deregister an existing AMI if one with the same name already exists. Default: false.
associate_public_ip_address: If you are using a non-default VPC, public IP addresses are not provided by default. If this is true, your new instance will get a public IP. Default: false.
source_ami_filter: A filter used to find the initial AMI that serves as the base for the newly created machine image. The filter can be any metadata that uniquely identifies the AMI. See the Packer documentation for this field if you need further help with the parameters.
run_tags: Key/value pair tags to apply to the instance that is launched to create the EBS volumes.

Now, the packer template is ready for our provisioners which should look like the previous sections above, and we can then create a base image using the packer build command, also like above. But what else can we automate using packer to make our lives easier? The answer is, a lot.

Here are a few of the most common things you might want to automate using your packer template.

Changing the Disk Size

Tests can be very large. Between the OS image, whatever tooling you need to install, and all of the tests, one of the major things you might want to automate is enlarging the disk. On macOS, this isn’t exactly straightforward. First you have to change the default EBS volume size, then you have to tell the OS to increase the partition size.

The default root volume of a mac1.metal is 60 GiB with a volume type of gp2. I like to use gp3 because it has a few advantages, including better pricing. Also, 60 GiB doesn’t leave much space for your tests after installing Xcode and other tooling, so I am going to use 150 GiB for the volume size. To do this, we’re going to create another variable called ebs_size_gb.

    "ebs_size_gb": "150"

Then, the settings for changing the default device type and size go into Packer’s “launch_block_device_mappings”, which takes a list of mappings, and we also add the “ebs_optimized” option in the builders section.

"launch_block_device_mappings": [
  {
    "device_name": "/dev/sda1",
    "volume_size": "{{user `ebs_size_gb`}}",
    "volume_type": "gp3",
    "iops": 3000,
    "throughput": 125,
    "delete_on_termination": true
  }
],
"ebs_optimized": true

Now that the EBS volume is being created at the size we want, we have to tell the macOS partition to use the entire volume; otherwise we’ll have a 150 GiB volume with a 60 GiB partition. We do this in the provisioners block of the Packer template.

{
    "type": "shell",
    "inline": [
        "PDISK=$(diskutil list physical external | head -n1 | cut -d' ' -f1)",
        "APFSCONT=$(diskutil list physical external | grep Apple_APFS | tr -s ' ' | cut -d' ' -f8)",
        "yes | sudo diskutil repairDisk $PDISK",
        "sudo diskutil apfs resizeContainer $APFSCONT 0"
    ]
}
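The second line of that provisioner picks the APFS partition out of the diskutil listing by counting space-separated fields, which depends on the exact column layout. As a sketch of an alternative, the hypothetical helper below reads the same listing and prints the last column of the first Apple_APFS row instead. It assumes the identifier is always the final column, which matches the diskutil output I have seen but is not guaranteed.

```shell
#!/bin/bash
# Hypothetical helper: read "diskutil list" output on stdin and print the
# identifier (last column) of the first Apple_APFS partition.
apfs_partition() {
  awk '/Apple_APFS/ { print $NF; exit }'
}

# Example use on the instance (commented out because it needs macOS):
#   APFSCONT=$(diskutil list physical external | apfs_partition)
#   sudo diskutil apfs resizeContainer "$APFSCONT" 0
```

Passing 0 as the size tells diskutil to grow the container to fill the available space, which is what the original inline commands rely on as well.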

Timeout Settings

Starting and stopping EC2 Mac instances can take longer than starting other types of instances. So, you will probably want to increase Packer’s timeout settings so that Packer doesn’t prematurely cancel the build due to long-running processes. You can do this by setting the “ssh_timeout” and “aws_polling” config options in the builders section. Here, I’m setting the timeout to 2 hours, and the polling to a max of 60 attempts every 30 seconds.

"ssh_timeout": "2h",
"aws_polling": {
    "delay_seconds": 30,
    "max_attempts": 60
}
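As a quick sanity check on those numbers, 60 attempts at 30 seconds each give Packer a polling window of 30 minutes for each AWS state it waits on, which, as I understand it, is tracked separately from the 2 hour SSH timeout. The helper name below is hypothetical and only does the arithmetic.

```shell
#!/bin/bash
# Hypothetical helper: total polling window in minutes for a given
# delay (in seconds) and maximum number of attempts.
polling_window_minutes() {
  local delay_seconds="$1" max_attempts="$2"
  echo $(( delay_seconds * max_attempts / 60 ))
}

polling_window_minutes 30 60   # prints 30
```

If builds still time out while waiting on AWS, raising max_attempts is the knob that widens this window.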

Using SSM Session Manager

By default, Packer launches the instance in a public subnet. If you do not want to expose a public IP you can run it in a private subnet instead. Using SSM, the instance doesn’t use a public IP and you don’t have to add a security group rule to open up port 22, like you do when using SSH. In order to do this, make sure that the SSM agent is installed on the host, and has appropriate permissions to open the connection.

First, add the configuration option “ssh_interface” to the builders section

"ssh_interface": "session_manager"

Then, we can use Packer’s “temporary_iam_instance_profile_policy_document” to pass in a policy document. We can just copy and paste the managed policy AmazonSSMManagedInstanceCore.

"temporary_iam_instance_profile_policy_document": {
    "Statement": [
    {
        "Action": [
            "ssm:DescribeAssociation",
            "ssm:GetDeployablePatchSnapshotForInstance",
            "ssm:GetDocument",
            "ssm:DescribeDocument",
            "ssm:GetManifest",
            "ssm:GetParameter",
            "ssm:GetParameters",
            "ssm:ListAssociations",
            "ssm:ListInstanceAssociations",
            "ssm:PutInventory",
            "ssm:PutComplianceItems",
            "ssm:PutConfigurePackageResult",
            "ssm:UpdateAssociationStatus",
            "ssm:UpdateInstanceAssociationStatus",
            "ssm:UpdateInstanceInformation"
        ],
        "Effect": "Allow",
        "Resource": [
            "*"
        ]
    },
    {
        "Action": [
            "ssmmessages:CreateControlChannel",
            "ssmmessages:CreateDataChannel",
            "ssmmessages:OpenControlChannel",
            "ssmmessages:OpenDataChannel"
        ],
        "Effect": "Allow",
        "Resource": [
            "*"
        ]
    },
    {
        "Action": [
            "ec2messages:AcknowledgeMessage",
            "ec2messages:DeleteMessage",
            "ec2messages:FailMessage",
            "ec2messages:GetEndpoint",
            "ec2messages:GetMessages",
            "ec2messages:SendReply"
        ],
        "Effect": "Allow",
        "Resource": [
            "*"
        ]
     }
   ],
   "Version": "2012-10-17"
}
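One thing that is easy to miss is that the session_manager interface also needs tooling on the machine that runs Packer: per the Packer documentation, the AWS CLI's Session Manager plugin has to be installed locally. A small preflight check along these lines can fail fast before a long build starts. The function names are hypothetical, and the list of commands to check is an assumption about a typical setup.

```shell
#!/bin/bash
# Hypothetical preflight check: report whether a command is on the PATH.
require_cmd() {
  if ! command -v "$1" > /dev/null 2>&1; then
    echo "Missing required command: $1" >&2
    return 1
  fi
}

# Check every tool so that all missing ones are reported in one pass.
preflight() {
  local missing=0 cmd
  for cmd in "$@"; do
    require_cmd "$cmd" || missing=1
  done
  return "$missing"
}
```

A build script could start with preflight packer aws session-manager-plugin || exit 1.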

Clear Previous Launch History

Clearing any history of previous launches will make new instances launch as if it were their first boot. To do this, we can use the “clean --all” command of ec2-macos-init, the EC2 launch daemon used for Mac instances. The daemon can also run any provided user data, and you can use it to run commands at startup. This is the command to add to the provisioners section of the template:

{
    "type": "shell",
    "inline": [
        "sudo /usr/local/bin/ec2-macos-init clean --all"
    ]
}

Catalina Image, and the System Install of Ruby

One thing to be aware of at the time of this writing is that Amazon’s Catalina image also has Xcode 12.4 installed, which breaks the system install of Ruby (by renaming a hard-linked framework file). This means that the base image ships with ruby broken. The easy solution I provided above in the Packer section will fix it by symlinking the old name to the new name. You can do this if you don’t want to upgrade the system Ruby:

cd $(xcode-select -p)/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/System/Library/Frameworks/Ruby.framework/Versions/2.6/usr/include/ruby-2.6.0 && sudo ln -s universal-darwin20 universal-darwin19

OK, Great! What’s next?

Now that the AMI is created, we can use it to launch a new instance to create a Buildkite agent. Then we can connect to it with SSH or enable VNC to connect to a remote desktop if we need to do anything further to the image, or troubleshoot any problems we might have with our tests.

Launch an Instance

To launch a new instance of our AMI, pass the ID of the Dedicated Host and specify an IAM instance profile whose role has the permissions required to use SSM.

aws ec2 run-instances --instance-type mac1.metal --image-id ami-your_ami_id --region us-east-2 --placement HostId=h-your_host_id --iam-instance-profile Name=SSMInstanceRole
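For scripting, it helps to capture the new instance ID and wait until the instance passes its status checks before trying to connect. The sketch below wraps the command above and adds aws ec2 wait instance-status-ok. The function name and the AWS override variable are hypothetical, and because Mac instances can be slow to come up, the built-in waiter may time out and need to be run again.

```shell
#!/bin/bash
# Hypothetical helper: launch a mac1.metal instance on a Dedicated Host,
# wait for its status checks, and print the instance ID.
# AWS can be overridden for testing.
launch_agent_instance() {
  local ami="$1" host="$2" region="$3" profile="$4" id
  id="$("${AWS:-aws}" ec2 run-instances \
    --instance-type mac1.metal --image-id "$ami" --region "$region" \
    --placement "HostId=$host" \
    --iam-instance-profile "Name=$profile" \
    --query 'Instances[0].InstanceId' --output text)" || return 1
  # Block until the instance passes its status checks.
  "${AWS:-aws}" ec2 wait instance-status-ok \
    --instance-ids "$id" --region "$region" >&2 || return 1
  echo "$id"
}
```

The printed ID can then be passed straight to aws ssm start-session as the --target value.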

Connecting to the instance via SSH

We can start a shell session on the instance using SSM with the following command:

aws ssm start-session --target <YOUR_INSTANCE_ID> --region us-east-2

Enable VNC

Using the SSH connection, we can set a password for the user ec2-user and activate remote GUI access.

sudo passwd ec2-user
sudo /System/Library/CoreServices/RemoteManagement/ARDAgent.app/Contents/Resources/kickstart \
    -activate -configure -access -on \
    -configure -allowAccessFor -specifiedUsers \
    -configure -users ec2-user \
    -configure -restart -agent -privs -all
sudo /System/Library/CoreServices/RemoteManagement/ARDAgent.app/Contents/Resources/kickstart \
    -configure -access -on -privs -all -users ec2-user

Connect to the GUI:

Now we can connect to the GUI using a VNC viewer.

1. Create a tunnel with SSM between our local machine and EC2:

aws ssm start-session \
    --target i-0bd054c24ed30074a --region us-east-2 \
    --document-name AWS-StartPortForwardingSession \
    --parameters '{"portNumber":["5900"], "localPortNumber":["5900"]}'

2. Connect using Mac’s built-in VNC viewer:

open vnc://ec2-user@localhost:5900

If you are using another OS, you might have to install a VNC viewer. Finally, we should see the GUI of the EC2 Mac instance, and we can log in to it with our new password.
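If you open tunnels often, the port-forwarding command can be wrapped so that the instance ID and local port become arguments. This is a sketch with hypothetical function names. It assumes the remote VNC port is always 5900, and it lets you choose a different local port in case 5900 is already in use on your own machine.

```shell
#!/bin/bash
# Hypothetical helper: build the --parameters JSON for a
# port-forwarding session.
port_forward_params() {
  local remote_port="$1" local_port="$2"
  printf '{"portNumber":["%s"],"localPortNumber":["%s"]}' \
    "$remote_port" "$local_port"
}

# Hypothetical helper: open the tunnel to the instance's VNC port.
# AWS can be overridden (for example AWS=echo) for a dry run.
open_vnc_tunnel() {
  local instance_id="$1" region="$2" local_port="${3:-5900}"
  "${AWS:-aws}" ssm start-session \
    --target "$instance_id" --region "$region" \
    --document-name AWS-StartPortForwardingSession \
    --parameters "$(port_forward_params 5900 "$local_port")"
}
```

For example, open_vnc_tunnel with your instance ID, us-east-2, and 5901 would forward local port 5901, and the VNC viewer would then connect to port 5901 on localhost.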

Conclusion

I hope this gives you a good jumping-off point for automating macOS images and AMIs for use with your iOS and macOS CI pipelines in Buildkite. There’s a lot more information in Packer’s docs, Amazon’s Mac page, YouTube, etc., but if you need more specific help, you can always contact us (Stark & Wayne), and we’d be more than happy to help.

Remember: Automate All The Things; Treat Infrastructure as Code; Keep your code DRY; Follow industry best practices; Don’t be afraid to ask for help; And good luck!

Written by:
Lucas Bunt

Senior Cloud Engineer
