BOSH Lite Redux: Addressing a “Funky State”

Recently, I ran into a situation where my BOSH lite deployment was in a “funky state” (technical term). In my case, “funky state” means:

  • I could not upgrade Cloud Foundry
  • I could not successfully complete any BOSH deployments, even ones that worked elsewhere
  • Running vagrant destroy and then trying again did not resolve the problem

I did not experience the above in that order, and in fact the last straw was when I tried to upgrade CF locally. When I ran the provision script, it errored out when trying to execute the api_z1/0 job:

Failed: Action Failed get_task: Task 546d1e15-a944-448b-7e89-e15f81ad72d4 result: Applying: Applying job cloud_controller_ng: Applying package buildpack_php for job cloud_controller_ng: Decompressing package files: Shelling out to tar: Running command: 'tar --no-same-owner -xzvf /var/vcap/data/tmp/bosh-blobstore-externalBlobstore-Get750237002 -C /var/vcap/data/tmp/bosh-agent-applier-packages-CompiledPackageApplier-Apply767957975', stdout: './
./php_buildpack-cached-v3.2.1.zip
', stderr: '
gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
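
The gzip and tar messages point at a truncated package blob rather than anything wrong with the job itself. If you want to confirm that, you can test the archive directly on the BOSH Lite VM. Here's a rough sketch using the old bosh CLI's ssh command; the temp file path is copied from the error above and is usually cleaned up quickly, so treat it as illustrative:

  # SSH onto the failing job's VM
  bosh ssh api_z1 0

  # Test the suspect archive; a healthy gzipped tarball produces no errors
  gzip -t /var/vcap/data/tmp/bosh-blobstore-externalBlobstore-Get750237002
  tar -tzf /var/vcap/data/tmp/bosh-blobstore-externalBlobstore-Get750237002 > /dev/null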

At first I figured that, since I was in a hurry, I didn’t need the PHP buildpack anyway. So I removed the PHP buildpack from the manifest and tried again. The provision script failed again at the same point, this time with an error from the cloud_controller_ng drain script:

Failed: Action Failed get_task: Task f18bb32e-1283-4cb6-57f9-466565ce0def result: Running Drain Script: Running drain script: Starting command /var/vcap/jobs/cloud_controller_ng/bin/drain job_unchanged hash_new buildpack_binary metron_agent buildpack_go buildpack_python cloud_controller_ng consul libpq nginx buildpack_nodejs common mysqlclient buildpack_java buildpack_java_offline buildpack_php buildpack_ruby buildpack_staticfile nginx_newrelic_plugin ruby-2.1.6: fork/exec /var/vcap/jobs/cloud_controller_ng/bin/drain: no such file or directory
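
As with the tar error, the quickest sanity check is to look on the VM for the file the agent says is missing. Same bosh ssh approach as above, so again just a sketch:

  # See whether the drain script the agent is trying to exec actually exists
  bosh ssh api_z1 0
  ls -l /var/vcap/jobs/cloud_controller_ng/bin/drain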

That was the point where I decided to do everything again from scratch, because sometimes the best thing to do is “reset”. I deleted/destroyed the following:

  • The cloned bosh-lite repository
  • The cloned cf-release repository
  • The Vagrant VM running BOSH lite

For the last one: I accidentally deleted the bosh-lite directory with the Vagrantfile before destroying the Vagrant VM. Not my best moment, to be sure, but I was able to easily destroy the VM using the VirtualBox UI.
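
If you find yourself in the same spot without the UI handy, VirtualBox’s own CLI can do the cleanup too. A minimal sketch; the VM name placeholder is whatever VBoxManage list vms reports for your BOSH Lite box:

  # Find the orphaned VM's name
  VBoxManage list vms

  # Power it off if it is still running, then delete it along with its disks
  VBoxManage controlvm <vm-name> poweroff
  VBoxManage unregistervm <vm-name> --delete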

Doing all the things from scratch

I did the following steps in this order (the full command sequence is sketched after the list):

  • Cloned the bosh-lite repository
  • Started the Vagrant VM with vagrant up
  • When the VM came up, I was shown the message A newer version of the box 'cloudfoundry/bosh-lite' is available!. This was a new message, so I took it as a sign of progress; to update the box I ran vagrant box update
  • Cloned the cf-release repository
  • Ran ./update in the new cf-release directory
  • Ran ./bin/provision_cf back in the bosh-lite directory
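
Pieced together, the whole run looked roughly like the sketch below. The repository URLs are the standard cloudfoundry GitHub locations, and the side-by-side directory layout is simply how I arrange things:

  # Clone BOSH Lite and bring up the Vagrant VM
  git clone https://github.com/cloudfoundry/bosh-lite.git
  cd bosh-lite
  vagrant up

  # vagrant warned that a newer 'cloudfoundry/bosh-lite' box was available
  vagrant box update

  # Clone cf-release next door and sync its submodules
  cd ..
  git clone https://github.com/cloudfoundry/cf-release.git
  cd cf-release
  ./update

  # Back in bosh-lite, deploy Cloud Foundry
  cd ../bosh-lite
  ./bin/provision_cf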

Thankfully everything ran without a hitch and is now going smoothly once again! ᕦ( ^-^)ᕤ
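
If you want a little more reassurance than a clean provision run, two quick checks tell you whether the deployment is actually healthy. The API URL below uses bosh-lite.com, BOSH Lite’s default system domain, so adjust it if yours differs:

  # Every job should report "running"
  bosh vms

  # Point the cf CLI at the local API (self-signed cert, hence the flag)
  cf api https://api.bosh-lite.com --skip-ssl-validation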

System Resource Usage

After all was said and done, my MacBook Pro with 16 GB of RAM was using about 97% of its memory to run everything, and the setup was taking up approximately 40 GB of disk space. Make sure you have ample disk space before setup.
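
Before kicking off a fresh setup, it’s worth a quick look at what’s free. Something like the following on macOS; the VirtualBox disk image typically lives under your home directory:

  # Free space on the volume that will hold the VM and compiled releases
  df -h ~

  # Rough memory picture (page counts; multiply by the page size for bytes)
  vm_stat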
