Version: 0.0.1

Infrastructure

ArgoCD

warning

Deprecated (5/5/25): No longer in use, as virtualization moved from Kubernetes to Docker.

A service that provides GitOps deployments: by syncing with an external repository, it replicates manifests into the defined cluster. With this I no longer needed to connect to a node in the cluster to deploy new manifests. It was an amazing find, as it provides a simple, pleasant UI for understanding all the resources deployed for an application, with logging and visibility into any errors that happen when deploying a set of manifests.

Using Applications, you define the type of manifest you are deploying (Helm, Kustomize, or plain manifests) and the source it comes from, e.g. a GitHub repo or a Helm chart. I originally maintained a manual list of Applications that I had to apply to ArgoCD so it would recognize each application and generate and deploy its resources. Over time I found ApplicationSets, which let me define an Application template that iterates over a group of folders and creates an Application for each one to be applied to ArgoCD. This has allowed me to add or remove applications by just making a change to my GitHub repo.
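
A minimal sketch of that folder-per-app pattern, using ApplicationSet's git directory generator. The repo URL, folder path, and names are placeholders, not my actual config:

```yaml
# Hypothetical ApplicationSet: one Application per folder under manifests/
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: apps
  namespace: argocd
spec:
  generators:
    - git:
        repoURL: https://github.com/example/kubernetes.git  # placeholder repo
        revision: HEAD
        directories:
          - path: manifests/*
  template:
    metadata:
      name: '{{path.basename}}'
    spec:
      project: default
      source:
        repoURL: https://github.com/example/kubernetes.git
        targetRevision: HEAD
        path: '{{path}}'
      destination:
        server: https://kubernetes.default.svc
        namespace: '{{path.basename}}'
      syncPolicy:
        automated: {}
```

Adding a folder to the repo creates a new Application on the next sync; deleting the folder removes it.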

Getting ArgoCD configured requires some manual steps, such as deploying its resources, updating the admin password on first configuration, and configuring the repo authentication in the UI. Once those are done, you are ready to start deploying applications.
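
The bootstrap steps look roughly like this (the namespace and manifest URL follow the upstream install docs; adjust to taste):

```shell
# Deploy ArgoCD's resources into their own namespace
kubectl create namespace argocd
kubectl apply -n argocd \
  -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

# Read the generated admin password, then change it in the UI
kubectl -n argocd get secret argocd-initial-admin-secret \
  -o jsonpath='{.data.password}' | base64 -d
```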

note

Currently argocd monitors the manifests folder in my kubernetes repo to deploy resources.

Bitwarden (Secrets Manager)

warning

Deprecated (5/5/25): No longer in use, as virtualization moved from Kubernetes to Docker. In addition, Infisical has been a fantastic service to use instead.

I currently use Bitwarden as my password manager, and it was surprising to see that they have also added a secrets manager meant to be used in application development. While looking for a secrets manager operator to deploy in Kubernetes, I came across Bitwarden's secrets manager operator, which made it both easy and secure to import sensitive credentials into my Kubernetes app deployments. I no longer needed to pass secrets directly in files or environment variables.

The configuration for using it in the cluster was straightforward. To identify a secret, all you need to do is specify the ID created by Bitwarden and define the key name the value will be populated under. This also meant I only had to modify a value in Bitwarden and it would be synchronized into every secret deployed using that value. In addition, I only needed to ensure the auth token (bw-auth-token) was added to each namespace where a BitwardenSecret would be created.
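
As a sketch of the shape of such a resource (IDs and names below are placeholders, and the field names are my best recollection of the operator's CRD, so verify against its docs):

```yaml
# Hypothetical BitwardenSecret: maps a Bitwarden secret ID to a k8s Secret key
apiVersion: k8s.bitwarden.com/v1
kind: BitwardenSecret
metadata:
  name: my-app-secrets          # placeholder
  namespace: my-app             # bw-auth-token must exist in this namespace
spec:
  organizationId: "00000000-0000-0000-0000-000000000000"  # placeholder org ID
  secretName: my-app-secrets    # name of the k8s Secret to create
  map:
    - bwSecretId: "11111111-1111-1111-1111-111111111111"  # ID from Bitwarden
      secretKeyName: DATABASE_PASSWORD                    # key in the k8s Secret
  authToken:
    secretName: bw-auth-token
    secretKey: token
```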

CloudflareDDNS

warning

Deprecated (7/13/25)

I wouldn't consider this a necessary service, but its purpose is to continuously update my DNS provider, Cloudflare, with the IP address for my domain dripdrop.pro. Considering I learned that the public IP of the home network doesn't change, this service might not be necessary, but it is still good for the small off chance that it ever does.

Crowdsec

This is a new service I'm testing out with Traefik that stops malicious IPs from sending requests to my services. It was prompted by one day noticing really odd traffic going to my dripdrop app that was causing it to go down. I believe it was triggering an exploit in Vite that caused the container's memory usage to grow over time; fortunately, with memory limits in place it never became serious. But it did lead me to set up CrowdSec to hopefully block these types of requests in the future. Setting it up with Traefik wasn't hard, especially with the really good example provided by the traefik-crowdsec-bouncer-plugin, which is what makes Traefik work with CrowdSec. A really cool feature is that it can read Traefik logs directly from Docker.

One thing I'm still figuring out is how to get AppSec working with it, which is basically a way to dynamically handle new exploits as they become known in the CrowdSec ecosystem. It turns out I had to manually modify the YAML files to get the AppSec component running and have it start monitoring and parsing Traefik logs. Installing the CrowdSec engine was really easy, and I could immediately start seeing the alerts and bans my server had picked up.
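
For reference, the manual YAML in question is an acquisition file enabling the AppSec component; this sketch follows the shape shown in CrowdSec's docs (listen address and collection name may differ in a given setup):

```yaml
# Hypothetical acquisition file, e.g. acquis.d/appsec.yaml
listen_addr: 127.0.0.1:7422          # where the bouncer forwards requests
appsec_config: crowdsecurity/appsec-default
name: appsec-component
source: appsec
labels:
  type: appsec
```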

note

A really funny thing that happened was that Gatus triggered an IP ban on myself, so I needed to figure out how to avoid that. I ended up learning about CrowdSec's whitelist feature, specifically PostOverflows, which evaluate a whitelist using extra enrichment, including DNS. I was able to dynamically set it up so that any IP that points to dripdrop.pro would not be blocked by CrowdSec.
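
One documented pattern for this kind of DNS-based whitelist is a PostOverflow parser combined with the reverse-DNS enrichment; the sketch below is an assumption based on CrowdSec's whitelist docs, not my exact file:

```yaml
# Hypothetical postoverflow whitelist (requires the rdns postoverflow enrichment)
name: me/dripdrop-whitelist
description: "Never ban IPs that resolve back to my domain"
whitelist:
  reason: "my own traffic via dripdrop.pro"
  expression:
    - evt.Enriched.reverse_dns endsWith '.dripdrop.pro.'
```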

CrowdSec works really well in comparison to something like fail2ban because it relies on a central shared source to handle blocking malicious IPs. Thanks to the shared ecosystem, an IP flagged by any participating service is automatically blocked by other services using CrowdSec as well.

DDNS Updater

This fulfills the same purpose as CloudflareDDNS, but I migrated to it to reduce my dependence on hotio and linuxserver images. I also was not certain where the source code for the previous service lived, so I decided to move away from it. The author of this tool is also the author of gluetun!
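
DDNS Updater is driven by a JSON settings file; a minimal Cloudflare entry looks roughly like this (zone ID and token are placeholders, and field names should be checked against the project's README):

```json
{
  "settings": [
    {
      "provider": "cloudflare",
      "zone_identifier": "your-cloudflare-zone-id",
      "domain": "dripdrop.pro",
      "ttl": 600,
      "token": "your-scoped-api-token",
      "ip_version": "ipv4"
    }
  ]
}
```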

Komodo

This application was the main reason I had the urge to move back from Kubernetes to Docker virtualization. Having gotten a taste of GitOps deployments and the beauty of their automatic management, I wanted something that felt like ArgoCD but for Docker. Well, this application is exactly that, just with a bit more configuration writing. Using this application alone I was able to consolidate ArgoCD, Reloader, and Bitwarden Secrets Manager into one tool with simple Actions. Learning how to create Actions, Procedures, and Resource Syncs didn't take very long, and I was able to get everything up and running within a day. One small annoyance is that Komodo can't manage itself (which is understandable), so I had to specifically exclude it from syncs and upgrade it myself when a new version comes out.

Adding to this, as I built out shared compose files to handle automated backups and service initialization for local volumes, its build feature has been amazing to use, allowing me to build an image and push it to GitHub's image registry with ease.

Mergerfs

One of the backbone services of my setup, and one of the few installed directly on the NAS. Mergerfs is a filesystem pooling service that combines multiple drives into a single-view filesystem. It's similar to RAID, but it works at the filesystem level rather than the disk level. With openmediavault this is a fairly easy configuration to manage, but there are some attributes I needed to add to the filesystem mount options to get the best performance for my services.

The create policy I selected is mfs (most free space), which always creates new files on the disk with the most free space. The default policy, epmfs (existing path, most free space), first checks whether the directory already exists and keeps using the drive that directory is on, only falling back to mfs once that disk runs out of space. Because my files all live under a single media folder, epmfs was constantly filling up a single disk. The most important mount option is func.getattr=newest; it ensures all information about the filesystem is up to date for services like Jellyfin that monitor files in the media folder. I was surprised to see that drive access speeds were similar to RAID in my tests (which may not be accurate).
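
Put together, the mount looks something like this fstab entry; the branch glob, pool path, and minfreespace value are placeholders, while category.create=mfs and func.getattr=newest are the options discussed above:

```
# Hypothetical /etc/fstab entry pooling /mnt/disk* into /mnt/pool
/mnt/disk* /mnt/pool fuse.mergerfs category.create=mfs,func.getattr=newest,cache.files=off,dropcacheonclose=true,minfreespace=50G,fsname=mergerfs 0 0
```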

Nvidia GPU Operator

warning

Deprecated (5/5/25): No longer in use, as virtualization moved from Kubernetes to Docker. The Nvidia 3060 will also be repurposed for gaming VMs.

After purchasing an RTX 3060, mounting it into virtual containers required many more steps due to Nvidia's proprietary drivers. For Docker it would have been a simple matter of installing Nvidia drivers on the node and installing the [nvidia container toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html) against it.

For Kubernetes there is the [nvidia gpu operator](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/getting-started.html). Essentially, it automatically deploys pods to install the latest drivers, installs the nvidia container toolkit, configures it against containerd, and automatically includes drivers for any pod that requests the GPU resource. While configuring it I found that it's actually not possible to have the operator auto-install drivers and update pods, due to it [not supporting secure boot](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/troubleshooting.html#efi-secure-boot).

A full Nvidia driver install includes a lot of packages that aren't needed on a headless server; I found a gist that listed the minimal packages for getting the drivers set up. After manually installing the drivers and setting up the container runtime, I was all set to use the GPU in pods.
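
The manual steps were roughly the following; the package names and driver version are assumptions for an Ubuntu-based host, while `nvidia-ctk runtime configure` is the toolkit's documented command for wiring up containerd:

```shell
# Minimal headless driver install (version 550 is a placeholder)
sudo apt-get install --no-install-recommends nvidia-headless-550 nvidia-utils-550

# Point containerd at the Nvidia container runtime
sudo nvidia-ctk runtime configure --runtime=containerd
```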

tip

When running the configure command for the nvidia container toolkit, you need to symlink k3s's config.toml from /var/lib/rancher/k3s/agent/etc/containerd/config.toml to /etc/containerd/config.toml, as that is where the toolkit expects the file.
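
Concretely (paths per the tip above; run before the configure command):

```shell
# k3s keeps containerd's config under its own tree; link it where nvidia-ctk looks
sudo mkdir -p /etc/containerd
sudo ln -s /var/lib/rancher/k3s/agent/etc/containerd/config.toml /etc/containerd/config.toml
```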

note

In order to mount the GPU in a pod you must define runtimeClassName: 'nvidia' in the pod spec. Do not use resource limits, as they enforce that no other pod can claim the GPU. Only pods deployed on the same node as the GPU can use it.
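
A minimal pod spec illustrating the note (node and image names are placeholders; note the absence of GPU resource limits):

```yaml
# Sketch: GPU access via runtimeClassName, pinned to the GPU node
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test
spec:
  runtimeClassName: nvidia
  nodeSelector:
    kubernetes.io/hostname: gpu-node   # placeholder hostname of the GPU node
  containers:
    - name: cuda
      image: nvidia/cuda:12.4.1-base-ubuntu22.04
      command: ["nvidia-smi"]
  restartPolicy: Never
```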

Reloader

warning

Deprecated (5/5/25): No longer in use, as virtualization moved from Kubernetes to Docker.

I added this service to the cluster recently; I did not realize there was a service to automatically reload deployments when a ConfigMap or Secret changes. It was an absolute pain to have to go into ArgoCD and reload a deployment after updating a ConfigMap. This simple-to-set-up service now does that for me automatically.
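
Opting a workload in is a single annotation; the deployment below is a placeholder showing where it goes:

```yaml
# Reloader restarts annotated workloads when a referenced ConfigMap/Secret changes
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                              # placeholder
  annotations:
    reloader.stakater.com/auto: "true"
spec:
  selector:
    matchLabels: {app: my-app}
  template:
    metadata:
      labels: {app: my-app}
    spec:
      containers:
        - name: app
          image: example/app:1.0            # placeholder image
          envFrom:
            - configMapRef: {name: my-config}
```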

Renovate

Another recent addition to the cluster, this became the replacement for diun and watchtower. It handles auto-updating the different resources that exist in a GitHub repo. Using my docker repo and some small package rules, Renovate creates PRs that let me know when an image needs to be updated. The created PR also lists the release notes for the version increment, letting me handle almost everything by reviewing a PR and merging it if I decide the version does not interfere with my current configuration. I also added it to the dripdrop repos, so it's managing package updates for my Python server and React app as well.

note

The Helm regex matcher only works on a single instance and will not target any other deployments in the same file for an update.
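
As an illustration of "some small package rules", a renovate.json along these lines gates major Docker bumps behind the dependency dashboard (the rule itself is an example, not my exact config):

```json
{
  "$schema": "https://docs.renovatebot.com/renovate-schema.json",
  "extends": ["config:recommended"],
  "packageRules": [
    {
      "matchDatasources": ["docker"],
      "matchUpdateTypes": ["major"],
      "dependencyDashboardApproval": true
    }
  ]
}
```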

Snapraid

When working out how to manage and create the NAS pool, I landed on two options: RAID, or Mergerfs + Snapraid. The biggest reason I did not go with RAID was its need for SATA connections to provide the best throughput; at the time I was starting with the mini-pc, so that was not something I could accomplish. RAID would give me real-time data redundancy and increased speed depending on the drive configuration, but one major factor it could not handle was adding additional drives with ease, since that requires rebuilding the entire pool.

Snapraid is an alternative used by many, and in combination with mergerfs it provides a RAID-like drive pool: mergerfs provides the filesystem pooling, and snapraid handles the data redundancy. It works by running a scheduled task that computes parity across the drives, and by its nature it allows easily expandable pools. Because the only things being protected are media and music and nothing sensitive, it felt like the best path forward.

Rather than using openmediavault's snapraid plugin, I disabled its scheduled task and replaced it with the snapraid-aio script. It provides much better notification support beyond just email and gives me better insight into the state of the array between syncs. I've moved from a weekly sync with 25% of the array being scrubbed to a daily sync with 4% of the array scrubbed per day.
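
A snapraid setup boils down to a small config file; the paths below are placeholders showing the usual parity/content/data layout:

```
# Hypothetical snapraid.conf; all paths are placeholders
parity /mnt/parity1/snapraid.parity
content /var/snapraid/snapraid.content
content /mnt/disk1/.snapraid.content
content /mnt/disk2/.snapraid.content
data d1 /mnt/disk1/
data d2 /mnt/disk2/
exclude *.tmp
```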

note

Scrubbing is also a very important task to catch any potential bit-rot that may occur in the drive pool. With a scrub frequency of 1 and a scrub percentage of 25%, a quarter of the array is scrubbed every week, essentially scrubbing the entire array in a month.
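
The daily 4% schedule maps to a scrub invocation along these lines (`-p` and `-o` are snapraid's documented percent and older-than flags; the 10-day threshold is an example):

```shell
# Scrub ~4% of the array per day, preferring blocks not checked in 10+ days
snapraid scrub -p 4 -o 10
```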

S6 Overlay

This really was a game changer to find, as it helped out a lot in setting up images that require tasks before and after a main process. There was a bit of a learning curve with the v3 overlay structure, but now that I have some working examples it has allowed me to do so much. 389-ds had a really weird entrypoint that I had set up to handle initial configuration, but with s6 it became easy. The issue I ran into with the Windows VM, where I needed to unbind the vfio drivers from the GPU but was blocked because the VM wouldn't let go of it, was also completely resolved with s6, since it has a nice logic flow and handles process-finish scripts really easily. I now understand why so many images, such as those from hotio and linuxserver, use this tech.

This also allowed me to do some more complicated Docker setups like system-tools, which is basically an all-in-one cron and dashboard-actions image for my server.
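
For orientation, an s6-overlay v3 oneshot is laid out as a small directory tree; the service name and script path here are placeholders:

```
# Hypothetical s6-rc.d layout for a oneshot init task
/etc/s6-overlay/s6-rc.d/init-myapp/type                  # file containing: oneshot
/etc/s6-overlay/s6-rc.d/init-myapp/up                    # file containing: /etc/s6-overlay/scripts/init-myapp.sh
/etc/s6-overlay/s6-rc.d/init-myapp/dependencies.d/base   # run after the base bundle
/etc/s6-overlay/s6-rc.d/user/contents.d/init-myapp       # register in the user bundle
```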

warning

Do not put echo statements in oneshot services. The `up` file of a oneshot is parsed by execlineb as a single command line, not as a shell script, so it appears to execute the entire line at once; put any shell logic in a separate script and call it from `up`.

Tailscale

I've always heard of Tailscale but never really made use of it until very recently. I did not realize how amazing this service is: it allows remote connections to devices not exposed to the public internet. For the longest time I kept a port open on the router to allow SSH-ing into the mini-pc, and through that I was able to access the local network. With Tailscale that configuration is completely gone, which leaves me more comfortable with the router's attack surface. In addition, Tailscale has Docker images that let a container be deployed into your tailnet and access services open on the local network. This enabled access to the openmediavault GUIs for both the mini-pc and the NAS from anywhere I have internet access.
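
A container joining the tailnet looks roughly like this compose snippet; the hostname, auth key, and state path are placeholders, while TS_AUTHKEY and TS_STATE_DIR are documented environment variables of the official image:

```yaml
# Hypothetical compose service joining the tailnet
services:
  tailscale:
    image: tailscale/tailscale:latest
    hostname: homelab                     # device name shown in the tailnet
    environment:
      - TS_AUTHKEY=tskey-auth-xxxxxxxx    # placeholder auth key
      - TS_STATE_DIR=/var/lib/tailscale   # persist node state across restarts
    volumes:
      - ./tailscale-state:/var/lib/tailscale
    cap_add:
      - NET_ADMIN
    restart: unless-stopped
```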

Traefik

During my time using Docker, and even for some of the time my Kubernetes cluster existed, I primarily used docker-swag to handle routing as well as retrieving Let's Encrypt certificates for my domains. Due to swag's simplicity it never made sense to migrate. Over time I learned that this was very anti-Kubernetes and that I should be using an ingress to route external traffic to services within the cluster. I had heard of Traefik and knew it was an up-and-coming reverse proxy, but I did not realize how powerful it would be in my Kubernetes cluster. Due to its lower barrier to entry and the fact that it is already built into k3s, I went with it. Learning how to create IngressRoutes took some time: a route binds to a service and matches a rule defined by a Host domain and optional PathPrefix or PathRegexp logic. One really cool thing is that Let's Encrypt support is built in, so certificates are automatically generated the moment a Host is detected in a route.
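
An IngressRoute ends up looking like this; the host, service name, and resolver name are placeholders (and the apiVersion may be the older traefik.containo.us group on older Traefik releases):

```yaml
# Sketch of a Traefik IngressRoute with a Host + PathPrefix rule
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: my-app
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`app.dripdrop.pro`) && PathPrefix(`/api`)
      kind: Rule
      services:
        - name: my-app          # placeholder Service name
          port: 80
  tls:
    certResolver: letsencrypt   # assumed resolver name
```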

It was also a perfect fit because it is widely supported by authentication services via forward-auth, which allowed a pretty seamless transition. With k3s, since Traefik is auto-deployed when bootstrapping the cluster, the only way to override its configuration is with a HelmChartConfig. Using that config I was able to set a persistent volume to store the SSL certificates and keep them between reboots. One additional configuration that was needed was creating the folder with the appropriate permissions, as Traefik can otherwise run into issues when attempting to add new certificates. (When I first started setting up Kubernetes, I had continued to use the swag config by pushing all HTTP traffic there.) Now that my cluster fully runs on Traefik, I can't imagine ever going back to nginx or docker-swag as a reverse proxy.
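
The override mechanism looks roughly like this; the resolver name and email are placeholders, `persistence` and `additionalArguments` are standard Traefik chart values, and the ACME flags follow Traefik's CLI reference:

```yaml
# Sketch: overriding k3s's bundled Traefik chart via HelmChartConfig
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik
  namespace: kube-system
spec:
  valuesContent: |-
    persistence:
      enabled: true          # keep acme.json across reboots
      path: /data
    additionalArguments:
      - --certificatesresolvers.letsencrypt.acme.tlschallenge=true
      - --certificatesresolvers.letsencrypt.acme.email=admin@example.com  # placeholder
      - --certificatesresolvers.letsencrypt.acme.storage=/data/acme.json
```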

Updated (5/5/25)

With the move from Kubernetes back to Docker, I had to spend time learning how to configure Traefik with Docker, and to my surprise it was even faster than learning it with k3s. Using Docker labels and a couple of command args on the Traefik container, I was able to easily recreate all my routing in less than a day.
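
The docker-labels equivalent of an IngressRoute is just a handful of labels on the target service; the router name, host, and resolver name below are placeholders:

```yaml
# Sketch: routing a compose service through Traefik via labels
services:
  whoami:
    image: traefik/whoami
    labels:
      - traefik.enable=true
      - traefik.http.routers.whoami.rule=Host(`whoami.dripdrop.pro`)
      - traefik.http.routers.whoami.entrypoints=websecure
      - traefik.http.routers.whoami.tls.certresolver=letsencrypt  # assumed resolver
```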