I have complained extensively over the past couple of years about the over-automation of developer environments in TripleO. But wait, you say, isn't automation a good thing? And yes, it is, but the automation needs to happen in the right places (feel free to append "IMHO" to anything in this post ;-). The problem with a bunch of developer-specific automation is that it hides very real user experience problems because the developers just use their simplified interface and never touch the regular user interface. Essentially it's the opposite of dogfooding, which is something I feel is critical to writing good software. This is also known as the "Works in Devstack" problem for OpenStack as a whole (I will not be tackling that problem here though).
So if I don't like what most people are doing today, what would I prefer? That will be the topic of this post. I'll discuss what I do, and possible areas for improvement.
TLDR: Follow the docs.
Okay, that's a massive oversimplification, but it sums up the broad strokes of my philosophy on development. Developers need to be reading and following the docs, not just writing them (although writing them is also good!). That's what the users will do, and if you want developers to improve the user experience, the best way to accomplish that is for developers to have the user experience.
Some will argue that this is too difficult. I will counter that if it's too difficult for our developers to use our docs, then how the hell are our users supposed to figure them out? I will grant that our "basic" deployment docs are too complicated and include too many optional or advanced features. Unfortunately it is hard to get traction to clean that up when you're changing something that most developers don't look at. If developers are using the docs then they will be motivated to improve the docs rather than the developer tooling. Everybody wins.
Does this mean that I go through the docs step-by-step every time I do a deployment? Not at all. For one thing, I've been working on TripleO for so long that I could do a basic deployment without any docs whatsoever. But that's sort of irrelevant here since it's not going to be the case for 95+% of people working on TripleO.
What is more useful are the notes that I have about setting up a development environment. This is not an Ansible playbook, it's not a Puppet manifest, it's not even a script. It's a plain text file that documents the commands I have run in the past to set up my environment. It is also something that every user I have ever worked with has for their environment, basically a trimmed version of the docs with any extraneous bits removed and site-specific values included where appropriate. Some users will go even further and write an Ansible playbook or script to automate their deployment. This is fine, but until TripleO starts shipping a playbook as its top-level interface its developers should be using the interface that we do ship (please refer back to that IMHO aside from earlier).
In my case, my notes can be copy-pasted verbatim into a terminal and they will do everything from repo setup to node registration without my intervention. They stop short of an overcloud deployment because in general I end up needing to customize that in some way so there's no point wasting time on a stock deployment. If this sounds an awful lot like developer automation, well, it kind of is. The difference is that I'm still running the exact same commands a user would, and these notes came out of reading the docs extensively over the years. I also have to customize the steps on a regular basis, so in reality I only do these end-to-end simple runs maybe 50% of the time. The rest of the time I go through step-by-step and read the docs to figure out how I need to modify my standard commands to do what I need.
Like a user would. Seeing a pattern?
In addition, this level of automation is a recognition of the fact that developers can't always babysit their deployments. Everyone is busy, and nobody has time to sit around and watch their undercloud deploy for 30 minutes just so they can kick off the image build right after that. Queuing up commands so the deployment is somewhat fire and forget is a necessary concession to developer time constraints. I'm not so militantly opposed to automation that I would claim everyone should spend two hours doing nothing but watch their deployment every time they set up an environment. :-)
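As a rough sketch of what "fire and forget" queuing can look like in a shell: chaining steps with && means each one only starts if the previous one succeeded. The function names below are hypothetical stand-ins for long-running steps, not real TripleO commands, and the failure is simulated so the short-circuit is visible.

```shell
#!/bin/bash
# Hypothetical stand-ins for long-running deployment steps.
# Each one records that it ran; build_images simulates a failure.
step_log=""
record() { step_log="${step_log:+$step_log }$1"; }

install_undercloud() { record undercloud; return 0; }
build_images()       { record images; return 1; }
upload_images()      { record upload; return 0; }

# Queue the steps: each runs only if the one before it succeeded.
install_undercloud && build_images && upload_images
queue_status=$?

echo "steps run: $step_log (exit status $queue_status)"
```

Because the simulated image build fails, the upload step never runs and the chain's exit status is non-zero, which is what you want to find when you come back to the terminal later.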
I debated whether to include my notes in this post. I don't want people substituting this blog post for the TripleO docs either. Ideally everyone would start at the docs and come up with their own development workflow and notes that reflect the things they do most often. Fortunately my notes are fairly specific to my environment so I'm not too concerned about that happening, and they're also only a basic framework. As I noted above, in many cases I have to customize the commands further so this post can't function as a replacement for the docs anyway.
So without further ado, here is what I do when setting up a basic development environment. I'm including some notes on what each section does so this isn't the copy-pastable version. And obviously this process changes from time to time as the documented install process changes, so YMMV on this working in the future:
bin/deploy.py --quintupleo --name test --id test --poll -e env-base.yaml -e environments/all-networks-port-security.yaml
bin/build-nodes-json.py -e env-test.yaml
scp nodes.json centos@[undercloud floating ip]:~
sudo yum install -y git
# There are too many things named tripleo-* as it is, so I give this a different prefix :-)
git clone https://git.openstack.org/openstack-infra/tripleo-ci git-tripleo-ci
echo '#!/bin/bash' > tripleo.sh
echo 'git-tripleo-ci/scripts/tripleo.sh "$@"' >> tripleo.sh
chmod +x tripleo.sh
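A side note on wrappers like this one: how the arguments get forwarded matters once any of them contains a space. The sketch below is only an illustration, using a hypothetical count_args helper rather than anything from tripleo-ci, of the difference between an unquoted $@ and a quoted "$@".

```shell
#!/bin/bash
# count_args is a hypothetical helper that reports how many arguments it got.
count_args() { echo "$#"; }

# Unquoted: the shell re-splits every argument on whitespace.
forward_unquoted() { count_args $@; }
# Quoted: each original argument is passed through intact.
forward_quoted()   { count_args "$@"; }

unquoted_count=$(forward_unquoted --name "my test env")
quoted_count=$(forward_quoted --name "my test env")

echo "unquoted: $unquoted_count, quoted: $quoted_count"
```

The unquoted form turns the two original arguments into four, so quoting is the safer default for a pass-through wrapper.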
curl "http://ucw-bnemec.rhcloud.com/?local_interface=eth1&network_cidr=9.1.1.0%2F24&node_count=10&undercloud_hostname=$(hostname -s).localdomain&local_ip=9.1.1.1%2F24&local_mtu=1500&network_gateway=9.1.1.1&undercloud_public_vip=9.1.1.2&undercloud_admin_vip=9.1.1.3&dhcp_start=9.1.1.4&dhcp_end=9.1.1.23&inspection_start=9.1.1.24&inspection_end=9.1.1.33&undercloud_service_certificate=&generate=Generate+Configuration" | grep -v html\> | sed -e 's/<br>/\n/g' | tee undercloud.conf
echo "enable_telemetry = false" >> undercloud.conf
echo "enable_legacy_ceilometer_api = false" >> undercloud.conf
echo "enable_ui = false" >> undercloud.conf
echo "enable_validations = false" >> undercloud.conf
echo "enable_tempest = false" >> undercloud.conf
sudo su
echo "preserve_hostname: true" > /etc/cloud/cloud.cfg.d/99_hostname.cfg
exit
export allinone=1
sudo yum install -y wget
wget -r --no-parent -nd -e robots=off -l 1 -A 'python2-tripleo-repos-*' https://trunk.rdoproject.org/centos7/current/
sudo yum install -y python2-tripleo-repos-*
sudo tripleo-repos current-tripleo-dev --rdo-mirror http://mirror01.regionone.tripleo-test-cloud-rh1.openstack.org:8080/rdo --centos-mirror http://mirror01.regionone.tripleo-test-cloud-rh1.openstack.org
sudo yum install -y python-tripleoclient
openstack undercloud install
. stackrc
[ "$allinone" != "1" ] && sleep 600
curl -O http://11.2.2.3/CentOS-7-x86_64-GenericCloud-1707.qcow2
export DIB_LOCAL_IMAGE=~/CentOS-7-x86_64-GenericCloud-1707.qcow2
export DIB_DISTRIBUTION_MIRROR=http://mirror.centos.org/centos
export DIB_EPEL_MIRROR=http://dl.fedoraproject.org/pub/epel
export http_proxy=http://roxy:3128
export no_proxy=9.1.1.1,192.0.2.1,9.1.1.2,192.0.2.2,192.168.0.1,192.168.0.2,192.168.24.1,192.168.24.2
export DIB_YUM_REPO_CONF="/etc/yum.repos.d/delorean*"
openstack overcloud image build
. stackrc
openstack overcloud image upload --update-existing
# The proxy can sometimes cause issues and we're done with it, so clear the variable
unset http_proxy
# Beware, unsafe tmp location. These are new dev systems so I don't care, but don't copy this pattern.
cat >> /tmp/eth2.cfg <<EOF_CAT
network_config:
    - type: interface
      name: eth2
      use_dhcp: false
      addresses:
        - ip_netmask: 10.0.0.1/24
        - ip_netmask: 2001:db8:fd00:1000::1/64
EOF_CAT
sudo os-net-config -c /tmp/eth2.cfg -v
sudo iptables -A POSTROUTING -s 10.0.0.0/24 ! -d 10.0.0.0/24 -j MASQUERADE -t nat
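For anyone who does want a pattern to copy, a safer variant is to let mktemp pick a unique file name instead of writing to a fixed path under /tmp. This is a minimal sketch, not what my notes actually do: the net_cfg variable name is made up for the example, the config is trimmed to a few lines, and it only writes the file rather than applying it.

```shell
#!/bin/bash
# Create a uniquely named, private temp file instead of a predictable /tmp path.
net_cfg=$(mktemp)

# Write (not append) a trimmed-down example config into it.
cat > "$net_cfg" <<EOF_CAT
network_config:
    - type: interface
      name: eth2
      use_dhcp: false
EOF_CAT

echo "config written to $net_cfg"
# In the real workflow this path would replace /tmp/eth2.cfg in the
# os-net-config command, e.g.: sudo os-net-config -c "$net_cfg" -v
```

Using > rather than >> also avoids stacking a second copy of the config onto a file left over from an earlier run.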
[ -f "nodes.json" ] && openstack overcloud node import --provide nodes.json
openstack overcloud deploy --templates --libvirt-type qemu -e /usr/share/openstack-tripleo-heat-templates/environments/disable-telemetry.yaml
Telemetry was causing a lot of problems at one point and I don't need it, so I've taken to disabling it by default to save time.
And that's it. Again, this doesn't come close to covering all of the things I do, but for details on the rest you'll have to read the TripleO Docs.