QuintupleO Status Update

Edit: Updated 2015/3/19 with more current diffs of my changes.

At the Atlanta OpenStack Summit we had a session on something called QuintupleO, otherwise known as "TripleO wasn't confusing enough, let's add another layer" :-) Barring a few specific concerns from other teams, which I believe have now been addressed to their satisfaction, everyone seemed to be on board with the idea. But what exactly is QuintupleO, and where does it stand today? Read on to find out.

Introduction

To understand what QuintupleO is, first we need to briefly discuss TripleO. TripleO is part of the OpenStack deployment program, and stands for OpenStack on OpenStack. In essence it's using OpenStack services to deploy and manage OpenStack on baremetal servers. However, because most developers don't have an IPMI-capable server farm lying around, we typically use a simulated baremetal environment built out of virtual machines for local development. In fact, our CI is all running on virtual machines too.

To make these virtual environments work as much like a real baremetal environment as possible, we deploy to the virtual machines in much the same way. They are PXE booted with a deployment kernel and ramdisk that are responsible for making the local disk available to Ironic so it can copy the final image over. Once that is done, they are rebooted with the final kernel and ramdisk and you have a fully deployed system. The only major difference in the virtual environment is that instead of actual IPMI calls, the virtual machines are controlled via virsh commands. Yes, we're calling virsh directly, and that's what QuintupleO is hoping to eliminate.

Problems

See, by managing VMs directly like that, we end up re-solving a number of largely solved problems. There's this thing called Nova that is very good at managing virtual machines, even large numbers of them spread across many, many hosts. This is something we want for TripleO for a few reasons:

  1. Our CI infrastructure runs on dozens of physical machines with (I believe) hundreds of virsh instances that need to be managed. A lot of the work needed to improve CI would involve re-implementing pieces of Nova. Not something we want to do, obviously.
  2. With the advent of HA in TripleO-deployed clouds, the resource requirements for doing a full deployment have ballooned to a point where a 16 GB dedicated box cannot provide enough memory for all of the virtual machines needed. One way to get around that problem would be to allow use of OpenStack instances instead of virsh instances. That way you could theoretically do your TripleO development against any public (or private, for that matter) cloud that exposes the necessary features, using any hypervisor, anywhere in the world. Sounds cloudy to me. :-)

So, why aren't we doing that? A few reasons:

  • Nova needs to be able to PXE boot instances
  • Ironic needs a fake IPMI driver that can manage Nova instances
  • Neutron needs a way to disable address spoofing so we can run our own DHCP server

These are all relatively simple things, but integrating the functionality cleanly into the respective projects is non-trivial, which is why it hasn't happened yet. Fortunately there is serious discussion around all of these features because QuintupleO is not the only interested party for them, so if we're lucky they'll all be implemented in the Kilo cycle (I guess I'm an optimist today ;-).

Solutions

Disclaimer: Vicious, vicious hacks ahead. Don't do anything I'm about to discuss unless you completely understand the implications.

While we wait for the official, ready-for-production solutions, I've been investigating what it would take to get QuintupleO running with some one-off patches. While this obviously wouldn't enable use of public clouds, it could be helpful for CI, and possibly even in some private internal clouds that could be patched to support QuintupleO. As it turns out, when you don't care if your changes are hacks it's not that difficult to make this stuff work. :-)

Just in case you somehow missed the warning above, I'll say again that these changes favor simplicity over correctness. They work for me, but YMMV.

PXE Boot in Nova

Diff:

diff --git a/nova/virt/libvirt/driver.py b/nova/virt/libvirt/driver.py
index 6097fbf..e419df6 100644
--- a/nova/virt/libvirt/driver.py
+++ b/nova/virt/libvirt/driver.py
@@ -4012,6 +4012,10 @@ class LibvirtDriver(driver.ComputeDriver):
             self._conf_non_lxc_uml(virt_type, guest, root_device_name, rescue,
                     instance, inst_path, image_meta, disk_info)
 
+        if (CONF.libvirt.virt_type in ['qemu', 'kvm'] and
+                flavor.extra_specs.get('libvirt:pxe-first')):
+            guest.os_boot_dev = ['network'] + guest.os_boot_dev
+
         self._set_features(guest, instance.os_type, caps, virt_type)
         self._set_clock(guest, instance.os_type, image_meta, virt_type)

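To use it, run nova flavor-key [flavor-id] set libvirt:pxe-first=1 on the flavor you want to use for PXE booting. Instances created with that flavor will try the network first and then fall back to their normal boot devices. The change is libvirt-specific because that's what I'm using.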
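To make the effect of that diff easier to see, here is the same check pulled out into a standalone function. This is only a sketch: pxe_first_boot_order is a name I made up for illustration, it is not part of Nova, and it assumes the extra spec values arrive as strings.

```python
def pxe_first_boot_order(boot_devs, extra_specs, virt_type):
    """Return the boot device list, with 'network' first when requested.

    Hypothetical helper that mirrors the condition in the diff above.
    """
    if virt_type in ('qemu', 'kvm') and extra_specs.get('libvirt:pxe-first'):
        # Prepend rather than replace, so the disk is still a fallback.
        return ['network'] + list(boot_devs)
    return list(boot_devs)


print(pxe_first_boot_order(['hd'], {'libvirt:pxe-first': '1'}, 'kvm'))
print(pxe_first_boot_order(['hd'], {}, 'kvm'))
```

One thing this makes obvious: the check is a plain truthiness test, so under the string assumption any non-empty value enables PXE-first, including libvirt:pxe-first=0. To turn it off, unset the key instead of setting it to 0.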

Ironic

Edit: Don't use this. There's a better alternative.

I don't have a diff of this one because it was done in installed code, but the basic changes are:

In the ssh.py driver file, _get_boot_device_map, add 'openstack' to the list of recognized virt_types. It doesn't particularly matter where - my hack doesn't actually use the list of boot devices anyway.

Also in ssh.py, _get_command_sets, add the following block for the 'openstack' driver:

    elif virt_type == 'openstack':
        # Requires that the ssh power user sources an appropriate stackrc
        # file at login
        return {
            'base_cmd': 'LC_ALL=C',
            'start_cmd': 'nova start {_NodeName_}',
            'stop_cmd': 'nova stop {_NodeName_}',
            'reboot_cmd': 'nova reboot {_NodeName_}',
            # Strip the table header and footer, then print the ID column
            'list_all': "nova list | tail -n +4 | head -n -1 | awk '{print $2}'",
            # Reports the intended state, not necessarily the actual one
            'list_running': "nova list | egrep 'ACTIVE|powering-on' | grep -v powering-off | awk '{print $2}'",
            'get_node_macs': ("neutron port-list | grep "
                "`nova show {_NodeName_} | awk -F '|' '/network/{print $3}'` |"
                "awk -F '|' '{print $4}' | tr -d ':' | tr -d ' '"),
            # Boot device handling is stubbed out; the PXE flavor covers it
            'set_boot_device': '/bin/true',
            'get_boot_device': '/bin/true',
        }

As noted in the comment, this assumes the ssh power user will have its environment properly configured to be able to run Nova/Neutron commands at login. It also assumes Neutron is in use. You may also notice that it stubs out the get and set boot device calls. Because of that, any instance used with this driver must first have been created with the PXE-first Nova flavor above.
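For anyone unfamiliar with how a command set like this gets used, here is a rough sketch of turning one entry into a shell command. build_command is a hypothetical helper of mine, not Ironic's actual code, and I'm assuming the driver simply prefixes base_cmd and substitutes the {_NodeName_} placeholder.

```python
# A subset of the command set above, copied here so the example runs alone.
COMMAND_SET = {
    'base_cmd': 'LC_ALL=C',
    'start_cmd': 'nova start {_NodeName_}',
    'stop_cmd': 'nova stop {_NodeName_}',
    'reboot_cmd': 'nova reboot {_NodeName_}',
}


def build_command(command_set, action, node_name):
    """Prefix base_cmd and fill in the node name (hypothetical helper)."""
    template = '%s %s' % (command_set['base_cmd'], command_set[action])
    # str.replace is used instead of str.format because other entries in
    # the real command set contain literal awk braces like '{print $2}'.
    return template.replace('{_NodeName_}', node_name)


print(build_command(COMMAND_SET, 'start_cmd', 'bm-0'))
```

The resulting string is what would be run over ssh as the power user, which is why that user's login environment needs working Nova credentials.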

To use it, simply configure the Ironic node as you would for virsh, but change the type to openstack. Also, do note that Ironic seems to expect these calls to be synchronous, which in Nova's case they are not. This means that the list_running command is a little bit naughty and reports back the intended state of the instance, not necessarily its actual state. So if an instance is running but in the process of powering off, it will be reported as off. Likewise, an instance that is still powering on will be reported as running. This could be an issue, but in my testing it never caused any problems. Have I mentioned the hack-ish nature of these changes?
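Here is the list_running pipeline redone in Python, which I find easier to reason about than the egrep/grep/awk chain. The table is a made-up sample in the general shape of nova list output (the IDs, names and column widths are invented), and list_running here is just an illustration, not code from the driver.

```python
# Invented sample data; real output has UUIDs in the ID column.
SAMPLE_NOVA_LIST = """\
+------+------+---------+--------------+-------------+----------+
| ID   | Name | Status  | Task State   | Power State | Networks |
+------+------+---------+--------------+-------------+----------+
| id-1 | bm-0 | ACTIVE  | -            | Running     | ctl=1    |
| id-2 | bm-1 | ACTIVE  | powering-off | Running     | ctl=2    |
| id-3 | bm-2 | SHUTOFF | powering-on  | Shutdown    | ctl=3    |
| id-4 | bm-3 | SHUTOFF | -            | Shutdown    | ctl=4    |
+------+------+---------+--------------+-------------+----------+
"""


def list_running(nova_list_output):
    """Return IDs of instances that are, or are about to be, running."""
    running = []
    for line in nova_list_output.splitlines():
        # egrep 'ACTIVE|powering-on'
        wanted = 'ACTIVE' in line or 'powering-on' in line
        # grep -v powering-off
        if wanted and 'powering-off' not in line:
            # awk '{print $2}': field 1 is the leading '|', field 2 the ID
            running.append(line.split()[1])
    return running


print(list_running(SAMPLE_NOVA_LIST))
```

With this sample, id-2 is left out even though it is still running, and id-3 is included even though it hasn't finished booting. That is the "intended state" behavior described above.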

Neutron

Diff:

diff --git a/neutron/agent/linux/iptables_firewall.py b/neutron/agent/linux/iptables_firewall.py
index 4830bc2..1d71913 100644
--- a/neutron/agent/linux/iptables_firewall.py
+++ b/neutron/agent/linux/iptables_firewall.py
@@ -243,6 +243,8 @@ class IptablesFirewallDriver(firewall.FirewallDriver):
             mac_ipv6_pairs.append((mac, ip_address))
 
     def _spoofing_rule(self, port, ipv4_rules, ipv6_rules):
+        # Disable spoofing rules for QuintupleO
+        return
         #Note(nati) allow dhcp or RA packet
         ipv4_rules += [comment_rule('-p udp -m udp --sport 68 --dport 67 '
                                     '-j RETURN', comment=ic.DHCP_CLIENT)]
@@ -273,6 +275,8 @@ class IptablesFirewallDriver(firewall.FirewallDriver):
                                        mac_ipv6_pairs, ipv6_rules)
 
     def _drop_dhcp_rule(self, ipv4_rules, ipv6_rules):
+        # Disable DHCP rule for QuintupleO
+        return
         #Note(nati) Drop dhcp packet from VM
         ipv4_rules += [comment_rule('-p udp -m udp --sport 67 --dport 68 '
                                     '-j DROP', comment=ic.DHCP_SPOOF)]
@@ -440,7 +444,7 @@ class IptablesFirewallDriver(firewall.FirewallDriver):
 
     def _drop_invalid_packets(self, iptables_rules):
         # Always drop invalid packets
-        iptables_rules += [comment_rule('-m state --state ' 'INVALID -j DROP',
+        iptables_rules += [comment_rule('-m state --state ' 'INVALID -j ACCEPT',
                                         comment=ic.INVALID_DROP)]
         return iptables_rules

This one caused me the most grief, mostly because I'm less familiar with Neutron and the networking aspect as a whole. I did eventually find the right calls to disable the address spoofing and DHCP-blocking iptables rules. This got Ironic PXE deploys from a Nova instance to another Nova instance working, but I will admit I probably don't fully understand the implications of the change (though the security implications are obvious even to me - beware).

Edit: Now that I've actually tried deploying an overcloud this way, I discovered I also needed to neutralize the invalid packets rule (the last hunk of the diff, which changes DROP to ACCEPT) or the overcloud instances couldn't do things like DHCP. I'm not clear why that is necessary, but since this is just a hack I went with it.

Conclusion

So where does that leave us? With these changes it would theoretically be possible to use QuintupleO for the TripleO CI environment. The biggest issue I'm aware of would be the fact that the Ironic code in the images would not be OpenStack-aware, so we would need to cherry-pick my changes in every image build. That's something we typically prefer to avoid, so I think a final version of the Ironic OpenStack driver should be the top priority right now. The other changes could be applied to a CI OpenStack installation and largely left alone without affecting any of our testing. Long-term, of course, we want to get all of this functionality enabled in the respective projects by default. That will need to happen before TripleO development in a public OpenStack cloud can happen.

And that's pretty much the state of QuintupleO from my perspective. I would love to have some more discussions about short and long-term plans for this either in Paris or before, if possible. If you have any comments or questions, please look me up. I'm bnemec (or beekneemech on casual nick Friday ;-) in #tripleo on Freenode, or you can just send something tagged [TripleO] to the openstack-dev mailing list. Thanks.