Submitted by bnemec on Fri, 08/30/2024 - 19:49
Previously, the Keepalived Log Parser required manual log collection via must-gather or some other mechanism. I recently added the ability to read logs directly from a live cluster by taking a kubeconfig as input instead of a log directory. This may make the tool more useful to non-developer users who want to see what's going on in their cluster, which is a request we get semi-regularly.
Here's a quick YouTube video demonstrating this new functionality. Hope you find it useful.
Submitted by bnemec on Fri, 03/22/2024 - 21:24
Like most tech companies these days, Red Hat is encouraging everyone to brush up on their AI knowledge. For my part I have been doing a number of online training courses lately and thought I would write up my experience for posterity.
Submitted by bnemec on Fri, 01/05/2024 - 22:31
Just a quick announcement of a tool I wrote recently to help with debugging of Keepalived behavior in an OpenShift On-Prem IPI cluster. This is specifically intended to handle the logs from the keepalived pods running in the openshift-[platform]-infra namespace, although with a little work it could probably be generalized to work with most any Keepalived configuration.
Submitted by bnemec on Mon, 11/21/2022 - 21:13
The Problem
This is some design work I did a while back as a result of an edge case that we had not considered in the original design of the loadbalancer architecture for OpenShift on-prem networking. Our (mistaken) assumption was that apiservers would either be up or down, and our healthchecks were written with that in mind. As it turns out, it is possible for a cluster to be in an unhealthy state but not completely down. This results in intermittent failures of API calls, which causes the healthchecks to flap. One could argue that the healthchecks are correctly representing the state of the cluster, but the problem is that VIP failovers break all connections to the API, which can exacerbate the instability of a flaky cluster. Each time the VIP fails over it forces every client to reconnect, and if the apiservers are already struggling to handle the load, then having a huge number of connections come in at once just makes it worse.
Submitted by bnemec on Fri, 09/02/2022 - 21:24
Fair warning: This is gonna be looooong. Proceed at your own risk. ;-)
Introduction
Since I started working with OpenShift on baremetal one of the things I've wanted to do is deploy OpenShift using OpenStack Virtual Baremetal to provide the host VMs. The usual developer setup is dev-scripts, which uses libvirt to stand up a virtual baremetal environment. This works fine, but it has a few drawbacks:
Submitted by bnemec on Mon, 07/11/2022 - 21:06
If you open a bug with NetworkManager, there is a high probability that the first thing they will ask you is to provide trace logs from around the time whatever bad behavior you're reporting occurs. This isn't terribly complicated to do, but most people are not familiar with the NetworkManager logging configuration so when asked for trace logs their first response is: How? I'm writing this up so I can just provide a link here when I get that question.
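As a rough sketch of what that usually involves (not necessarily the exact steps from the full post): NetworkManager reads a [logging] section from its configuration, and setting level=TRACE with domains=ALL there turns on trace logging. The file name 95-trace-logging.conf and the CONF_DIR variable below are placeholders I made up for illustration, and the script writes to a scratch directory by default so it is safe to run as-is.

```shell
#!/bin/sh
# Assumption: CONF_DIR is a hypothetical variable for this sketch. It defaults
# to a scratch directory; on a real host the drop-in would normally live in
# /etc/NetworkManager/conf.d instead.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"
CONF_FILE="$CONF_DIR/95-trace-logging.conf"   # hypothetical file name

# Write a drop-in that raises the log level to TRACE for all domains.
cat > "$CONF_FILE" <<'EOF'
[logging]
level=TRACE
domains=ALL
EOF

echo "Wrote $CONF_FILE"

# Once the file is in /etc/NetworkManager/conf.d, something like the
# following (as root) would apply it and collect the logs for this boot:
#   systemctl restart NetworkManager
#   journalctl -u NetworkManager -b > nm-trace.log
```

If a restart is not an option, nmcli general logging level TRACE domains ALL changes the level at runtime instead, though that setting does not persist across a NetworkManager restart.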
Submitted by bnemec on Thu, 05/05/2022 - 19:26
Oh my.
-George Takei
Mostly writing this down so Google knows about it the next time I search. On a fresh VM I tried to use NMState to apply a configuration that included OVS bridges and interfaces. This failed with the error libnmstate.error.NmstateDependencyError: Open vSwitch support not properly installed or started. I had installed and started OVS, so I was very confused. I vaguely recalled that there was an integration package for NetworkManager, but a dnf search openvswitch turned up nothing.
Submitted by bnemec on Tue, 03/16/2021 - 16:19
Just writing this down so I can find it easily in the future. My C922 webcam somehow got stuck at a 640x480 resolution, which looks weird in this day of widescreen monitors. The fix was the following command: v4l2-ctl -d /dev/video2 -v width=1280,height=720
My laptop also has a (bad) integrated webcam so that's why I had to specify /dev/video2. On Fedora I also had to install the v4l-utils package to have v4l2-ctl available.
Submitted by bnemec on Fri, 01/29/2021 - 18:08
I was running fstrim on a couple of old drives in anticipation of installing them in an even older RAID controller that doesn't support the trim command.[0] To do this, I formatted the entire drive as ext4, mounted it, then ran fstrim -v /mnt/temp to discard all of the unused blocks. I did this using a USB-SATA adapter, and what I noticed was that after running fstrim the adapter activity light was still blinking like crazy. I was curious if fstrim was still running, even though the command had completed already.
Submitted by bnemec on Wed, 06/10/2020 - 17:09
The Oslo team held its second virtual PTG this week. We had a number of good discussions and even ran slightly over the 2 hours we scheduled, so I think it was a successful event. The first hour was mostly topics relating to Oslo itself, while the second hour was set aside for some cross-project discussions with the Nova team. Read on for details of both hours.