Preflight Platform Checklist: A Mechanic’s Guide

Years ago, the documentation around getting Cloud Foundry going was not exactly thorough. There was a small treasure trove of carefully cultivated manifests, Google Docs, and READMEs with instructions which quickly became outdated.

Let’s face it, it was a small miracle back then to get a manifest for Cloud Foundry to work!

In the years since, many great resources that did not exist back then have been made available thanks to the hard work of many great people.

This blog post, however, covers the sharp edges which may be missed.

People, Places, Things

Sounds like the final category listings on Wheel of Fortune? Mapped to our corner of the universe, they become People, IaaS, and Networking. We are going to break this blog post into those three sections and, hopefully, highlight the common pitfalls that may stall your project.

Why people first? In the original draft, we actually did IaaS and Networking first as the obvious topics, but that’s not how it works in real life. There are discussions and choices you have to consider before you can answer many questions.

Category: People

People are critical to the success of any Cloud Foundry deployment. This is definitely one of the topics that we haven’t found written down before, so we start here. There are questions that need to be asked which only people can answer. You’ll want to be in touch with the following folks ahead of time:

  • Firewall team.
  • Security team.
  • Infrastructure team.

Once you’ve identified the teams, there are some questions that you will need to answer. Everyone’s opinion counts, and some may scream louder than others, but it is important to ask the following:

Planning and decisions are critical early on and will help keep your environment from turning “Jurassic”:

  • When will maintenance windows be set?
  • Are we scheduling stemcell upgrades?

Verify that everybody agrees on names before you deploy:

  • Do your SSL certificates have the correct wildcards?
  • Do you have all the intermediate certs?
  • Do the DNS names match and resolve?
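The wildcard question above is easy to get wrong, because a wildcard stands in for exactly one DNS label. Here is a minimal sketch of that check, assuming you already have the list of names on your certificate and the list of hostnames you plan to serve. The function names and the example.com hostnames are hypothetical, and this is not a substitute for validating the real certificate chain:

```python
def wildcard_covers(cert_name: str, hostname: str) -> bool:
    """Return True if a certificate name such as '*.apps.example.com' covers hostname."""
    cert_labels = cert_name.lower().rstrip(".").split(".")
    host_labels = hostname.lower().rstrip(".").split(".")
    if len(cert_labels) != len(host_labels):
        # A wildcard replaces exactly one label, never several.
        return False
    if cert_labels[0] == "*":
        # The wildcard label must match a non-empty label; the rest must be identical.
        return host_labels[0] != "" and cert_labels[1:] == host_labels[1:]
    return cert_labels == host_labels


def uncovered_hosts(cert_names, hostnames):
    """List the hostnames that none of the certificate names cover."""
    return [
        host
        for host in hostnames
        if not any(wildcard_covers(name, host) for name in cert_names)
    ]
```

Feeding it the hypothetical names `*.sys.example.com` and `*.apps.example.com` shows that a host such as `a.b.sys.example.com` is not covered, which is exactly the kind of surprise you want to find before you deploy.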

The worst time to find out you needed backups is when you need them:

  • Without a backup plan, your production environment is just a “Proof of Concept”.
  • Decide what you want to back up, where to keep the backups, and how long to keep them.
  • Get backups running on “Day 1”.
  • Until you have tested a restore, you don’t have backups.
  • Some backups should periodically go to different accounts or IaaS providers.
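The “how long to keep them” decision is worth writing down as a rule rather than leaving it in someone’s head. Below is a small sketch of one possible retention rule, assuming backups are identified by their timestamps. The function name and the “always keep the newest backup” safety net are our own assumptions, not something your backup tooling necessarily provides:

```python
from datetime import datetime, timedelta


def plan_retention(backup_times, now, keep_days):
    """Split backup timestamps into (keep, expired) for a retention window in days."""
    cutoff = now - timedelta(days=keep_days)
    ordered = sorted(backup_times)
    keep = [t for t in ordered if t >= cutoff]
    if ordered and not keep:
        # Assumption: never expire the only usable backup, even if it is old.
        keep = [ordered[-1]]
    expired = [t for t in ordered if t not in keep]
    return keep, expired
```

Whatever rule you choose, it only decides what to delete. It says nothing about whether the backups you keep can actually be restored, so the restore test above still applies.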

Internal databases go away when you destroy and rebuild:

  • Internal vs External Databases.
  • Internal vs External Blobstores.
  • Can you delete BOSH, deploy it again, and restore?
  • Can you delete Cloud Foundry, deploy it again, and restore?

A few of the above hint at the databases behind your platform. Here are a couple of quick reminders that, while I’ve seen them scattered about in documentation, deserve to be grouped together:

  • Remember to back up your credential store (CredHub and/or Vault). Without these, you’ll be up something creek without a paddle.
  • Never back up the Locket database (well, you can back it up, but NEVER restore it!). The processes that held the restored locks are no longer running, so the locks will never complete and unlock, and you will have hard-to-diagnose issues and failures in the future.

Category: Places

Never fail to push an analogy past its breaking point. Now that we have folks fully engaged, where are we going to run Cloud Foundry?

On infrastructure, of course! BOSH understands a bunch of different CPIs, so we can deploy Cloud Foundry to infrastructures such as VMware, AWS, Azure, GCP, and OpenStack.

Each of these is going to require answers to a basic set of questions:

  • Credential details to feed BOSH.
  • Internal or External SQL databases.
  • Internal or External blobstore.
  • What the load balancer will need to be.
  • Where will SSL termination go?

Our talk from Cloud Foundry Summit 2019 goes into deeper detail on each of these infrastructures and the known gotchas of each.

One important point I will bring back out here: figure out who’s responsible for the credit card. Setting a reminder to update billing details before the expiration date is the obvious part, but you also need a way to track where the card is used, for cases where it unexpectedly hits its limit or has to be replaced because it was lost or stolen. If your credit card doesn’t work when your IaaS bills you, they may just turn everything off.

Category: Things

Play nice with the network admins. There are only a few people who can break your deployment more quickly than these folks!

This category is, for the most part, defined in the resources documented at the beginning of this blog post, but this wasn’t always the case. Many of us kept cheat sheets around, on what are some really old sticky notes at this point, so these items deserve a bit of special attention.

When deploying a new Cloud Foundry environment (or trying to figure out why one went to hell real damn quick) this is the list of ports we keep handy:

If possible when laying out your network, be aware of home router ranges. VPNs do not like it when the home and remote networks use the exact same IPs. This type of error causes chuckles the first time you run into it but is rather annoying when you are rousted out of bed. And there aren’t any chuckles when you have to fix something and realize your VPN access is broken. Please do everyone a favor and avoid these ranges:

  • 192.168.0.X
  • 192.168.1.X
  • 192.168.2.X
  • 10.0.0.X
  • 10.0.1.X
  • 172.16.0.X
  • 172.16.1.X
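If you want to check a planned network against that list before committing to it, here is a minimal sketch using Python’s standard `ipaddress` module. It assumes each “X” range above means a /24 network; the function name and the sample CIDRs in the note below are hypothetical:

```python
import ipaddress

# Assumption: each "a.b.c.X" range in the list above is treated as a /24.
HOME_ROUTER_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "192.168.0.0/24",
        "192.168.1.0/24",
        "192.168.2.0/24",
        "10.0.0.0/24",
        "10.0.1.0/24",
        "172.16.0.0/24",
        "172.16.1.0/24",
    )
]


def home_range_conflicts(planned_cidr: str):
    """Return the common home router ranges that overlap the planned network."""
    planned = ipaddress.ip_network(planned_cidr, strict=False)
    return [str(net) for net in HOME_ROUTER_RANGES if planned.overlaps(net)]
```

A planned `10.0.0.0/16` would be flagged because it contains two of the ranges above, while something like `10.42.0.0/16` comes back with an empty list.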

If your IaaS sits behind corporate firewalls, factor in a few days to debug them, every time, as you will have problems with them.

If your IaaS requires proxies to access the internet, factor in a few days to debug them, every time, as you will have problems with them.

New rules get added, definitions get updated, and APIs evolve. Someone may decide to disable internal ICMP messages because of a recent CVE, and so on. Your deployment, network, and IaaS are going to change and evolve. Remember the advice from above: get to know the people.

Final Thoughts

There is now a large swath of information around the initial bootstrapping requirements for getting Cloud Foundry up and running. You need to have a handle on the People, Places, and Things to be successful. Hopefully, the checklist items here have helped to fill in some of the gaps.

For “day 2” and beyond, there are frameworks and ideologies which are talked about in other blog posts. Keep reading here at starkandwayne.com/blog.
