Submitted by bnemec on Fri, 01/17/2025 - 22:05
Over the past few years there have been quite a few changes in how day-one networking functions (or, perhaps more accurately, deployment-time networking, since the same mechanisms also apply to scale-out operations on day two). This particular phase of networking is especially tricky because cluster resources are, for the most part, not available yet. This means you can't use any of the normal operators that handle network configuration later in the deployment process.
Submitted by bnemec on Fri, 01/05/2024 - 22:31
Just a quick announcement of a tool I wrote recently to help debug Keepalived behavior in an OpenShift On-Prem IPI cluster. It is specifically intended to handle the logs from the keepalived pods running in the openshift-[platform]-infra namespace, although with a little work it could probably be generalized to work with almost any Keepalived configuration.
Submitted by bnemec on Mon, 11/21/2022 - 21:13
The Problem
This is some design work I did a while back as a result of an edge case that we had not considered in the original design of the loadbalancer architecture for OpenShift on-prem networking. Our (mistaken) assumption was that apiservers would be either up or down, and our healthchecks were written with that in mind. As it turns out, it is possible for a cluster to be in an unhealthy state without being completely down. This results in intermittent failures of API calls, which cause the healthchecks to flap. One could argue that the healthchecks are correctly representing the state of the cluster, but the problem is that VIP failovers break all connections to the API, which can exacerbate the instability of a flaky cluster. Each time the VIP fails over it forces every client to reconnect, and if the apiservers are already struggling to handle the load, then having a huge number of connections come in at once just makes it worse.
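To make the flapping concrete, here is a minimal Python sketch. It is not the actual healthcheck code: the function name and both probe histories are made up for illustration, and it assumes a naive check that trusts every single probe result as-is. It only counts how often such a check changes state for a cleanly failed apiserver versus a flaky one.

```python
def count_state_changes(probe_results):
    """Count healthy/unhealthy transitions for a naive check that
    trusts each individual probe result (hypothetical model)."""
    transitions = 0
    healthy = True  # assumption: the node starts out healthy
    for ok in probe_results:
        if ok != healthy:
            transitions += 1
            healthy = ok
    return transitions


# Hypothetical probe histories (True = the API call succeeded).
hard_down = [True, True, False, False, False, False, True, True]
flaky = [True, False, True, True, False, True, False, True]

print("hard failure:", count_state_changes(hard_down))
print("flaky cluster:", count_state_changes(flaky))
```

With the same number of probes, the flaky history changes state several times more often than the clean outage does, and in the scenario described above each of those changes is a potential VIP failover that forces clients to reconnect.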
Submitted by bnemec on Thu, 05/05/2022 - 19:26
Oh my.
-George Takei
Mostly writing this down so Google knows about it the next time I search. On a fresh VM I tried to use NMState to apply a configuration that included OVS bridges and interfaces. This failed with the error "libnmstate.error.NmstateDependencyError: Open vSwitch support not properly installed or started". I had installed and started OVS, so I was very confused. I vaguely recalled that there was an integration package for NetworkManager, but a dnf search openvswitch turned up nothing.
Submitted by bnemec on Tue, 06/27/2017 - 20:52
Submitted by bnemec on Fri, 08/05/2016 - 22:26
I've finally gotten around to recording a demo of the current iteration of my tool to generate TripleO network isolation templates. You can watch the demo video (in which I say "umm". A lot. Sorry.) or check out the tool itself. There are also some templates generated with the tool if you want examples to start from.