Infrastructure
ArgoCD
Deprecated: 5/5/25
No longer in use, as virtualization moved from kubernetes to docker.
A service that provides gitops deployments: by syncing with an external repository, it replicates manifests into the defined cluster. With this I no longer needed to connect to a node in the cluster to deploy new manifests. This was a great find, as it provides a simple, pleasant UI for understanding all the resources deployed for an application, with logging and visibility into any errors that happen when deploying a set of manifests.
Using applications you define the type of manifest you are deploying (helm, kustomize, or plain manifests) and the source it comes from, e.g. a github repo or a helm chart. I originally maintained a manual list of applications that I would apply to argocd so that it would recognize each application and generate and deploy its resources. Over time I found applicationsets, which let me define an application template that takes a group of folders, iterates over them, and creates an application for each one to be applied to argocd. This has allowed me to add or remove applications by just making a change to my github repo.
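An applicationset using the git directories generator looks roughly like this (the repo URL and folder paths are placeholders, not my actual config):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: apps
  namespace: argocd
spec:
  generators:
    # iterate over every folder under manifests/ in the repo
    - git:
        repoURL: https://github.com/example/kubernetes.git
        revision: HEAD
        directories:
          - path: manifests/*
  template:
    metadata:
      # one Application per folder, named after the folder
      name: "{{path.basename}}"
    spec:
      project: default
      source:
        repoURL: https://github.com/example/kubernetes.git
        targetRevision: HEAD
        path: "{{path}}"
      destination:
        server: https://kubernetes.default.svc
        namespace: "{{path.basename}}"
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
```

Adding a folder to the repo creates a new application on the next sync; with prune enabled, deleting the folder removes it again.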
Getting argocd configured requires some manual steps, such as deploying its resources, updating the admin password on first configuration, and configuring the repo authentication in the UI. Once those are done you are ready to start deploying applications.
Currently argocd monitors the manifests folder in my kubernetes repo to deploy resources.
Bitwarden (Secrets Manager)
Deprecated: 5/5/25
No longer in use, as virtualization moved from kubernetes to docker. In addition, infisical has been a fantastic service to use instead.
I currently use bitwarden as my password manager, and it was surprising to see that they have also added a secrets manager meant to be used in application development. While looking for a secrets manager operator to deploy in kubernetes, I came across bitwarden's secrets manager operator, which made it both easy and secure to import sensitive credentials for use in my kubernetes app deployments. I no longer needed to pass in secrets directly in files or environment variables.
Configuring it in the cluster was super straightforward. To identify a secret, all you need to do is specify the ID created by bitwarden and define the key name the value should be populated under. This also meant I only had to modify a value in bitwarden and it would be synchronized into all the secrets deployed using that value. In addition, I only needed to ensure the auth token (bw-auth-token) was added to every namespace where a BitwardenSecret would be created.
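A BitwardenSecret resource looks roughly like this (the IDs and names are placeholders, and the field names are from my reading of the sm-operator docs, so treat it as a sketch):

```yaml
apiVersion: k8s.bitwarden.com/v1
kind: BitwardenSecret
metadata:
  name: app-secrets
  namespace: myapp   # placeholder namespace
spec:
  organizationId: "00000000-0000-0000-0000-000000000000"
  # the kubernetes Secret the operator creates and keeps in sync
  secretName: app-secrets
  map:
    # bitwarden secret ID -> key name in the generated Secret
    - bwSecretId: "11111111-1111-1111-1111-111111111111"
      secretKeyName: DB_PASSWORD
  authToken:
    # machine-account token, must already exist in this namespace
    secretName: bw-auth-token
    secretKey: token
```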
CloudflareDDNS
Deprecated: 7/13/25
I wouldn't consider this a necessary service, but its purpose is to continuously update my DNS provider, Cloudflare, with the IP address behind my domain dripdrop.pro. Considering I learned that the public IP of the home network doesn't change, this service might not be necessary, but it is still good for the small off chance that it ever does.
Crowdsec
This is a new service I'm testing out with traefik that blocks malicious IPs from sending requests to my services. This was prompted by noticing really odd traffic going to my dripdrop app one day that was causing it to go down. I believe it was triggering an exploit in vite that caused the container to grow in memory usage over time; fortunately, with memory limits it never became super serious. But it did lead me to set up crowdsec to hopefully block these types of requests in the future. Setting this up with traefik wasn't so hard, especially with a really good example given by the traefik-crowdsec-bouncer-plugin, which is what gets traefik working with crowdsec. A really cool feature is that it can read traefik logs directly from docker.
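That docker log acquisition is a small yaml file; something like this, where the container name is whatever your traefik container is called:

```yaml
# e.g. an acquis.d/traefik.yaml file in the crowdsec config
source: docker
container_name:
  - traefik
labels:
  # tells crowdsec which parser to apply to these logs
  type: traefik
```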
It turns out that I had to manually modify the yaml files in order to get the appsec component running and have it start monitoring and parsing traefik logs. Appsec is basically a way to dynamically handle new exploits as they become known in the crowdsec ecosystem. Installing the crowdsec engine was really easy, and I could start seeing the alerts and bans that my server has picked up.
A really funny thing that happened was that gatus triggered an IP ban on myself, so I needed to figure out how to avoid that. I ended up learning about crowdsec's whitelist feature, specifically PostOverflow whitelists, which evaluate a whitelist with extra capabilities, including DNS lookups. I was able to set it up so that any IP that dripdrop.pro points to would not be blocked by crowdsec.
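Going off the whitelist examples in the crowdsec docs, the postoverflow I'm describing looks something like this (the name and reason strings are my own placeholders):

```yaml
# dropped into postoverflows/s01-whitelist/
name: me/home-whitelist
description: "never ban the IP behind dripdrop.pro"
whitelist:
  reason: "resolves to my own domain"
  expression:
    # LookupHost resolves the domain at evaluation time,
    # so the whitelist follows the DNS record dynamically
    - evt.Overflow.Alert.GetValue() in LookupHost("dripdrop.pro")
```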
Crowdsec works really well in comparison to something like fail2ban, since it relies on a central shared source to handle blocking malicious IPs. With this shared ecosystem, an IP flagged by one service basically gets automatically blocked in other services using crowdsec as well.
DDNS Updater
Fulfilled the same purpose as cloudflareddns, but the reason I migrated to this service was to reduce my dependence on hotio and linuxserver images. In addition, I was not actually certain where the source code for the old service lived, so I decided to move to this one. The author of this tool is also the author of gluetun!
Komodo
This application was the main reason I had the urge to move back from kubernetes to docker virtualization. Having gotten a taste of gitops deployments and the beauty of their automatic management, I wanted something that felt like ArgoCD but with docker. Well, this application is exactly that, just with a little more configuration writing. Using this application alone I was able to consolidate ArgoCD, Reloader, and Bitwarden Secrets Manager into one with simple actions. Learning how to create actions, procedures, and resource syncs didn't take very long, and eventually I was able to get everything up and running within a day. One really small annoying thing is that komodo can't manage itself (which is understandable), so I had to specifically exclude it from syncs and upgrade it myself when a new version comes out.
Adding to this, as I built out shared compose files to handle automated backups and service initialization for local volumes, its build feature has been amazing to use, allowing me to build an image and push it to github's image registry with ease.
Mergerfs
One of the backbone services of my setup, and one of the few installed directly on the NAS. Mergerfs is a filesystem pooling service that combines multiple drives into a single-view filesystem. It's similar to RAID, but it works at the filesystem level rather than the disk level. With openmediavault this is a fairly easy configuration to manage, but there are some attributes I needed to add to the filesystem mount options to get the best behavior for my services.
The create policy I selected is mfs (most free space), which always creates new files on the disk with the most free space. The default policy, epmfs (existing path, most free space), first checks whether the directory already exists, keeps using the drive that directory is on, and only falls back to mfs once that disk runs out of space. Because my files all live under a single media folder, epmfs was constantly filling up a single disk. The most important mount option is func.getattr=newest; it ensures all information about the filesystem is up to date for services like jellyfin that monitor files in the media folder. I was surprised to see that drive access speeds were similar to RAID in my tests (may not be accurate).
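As a sketch, the resulting /etc/fstab entry looks something like this (the paths and the options other than category.create and func.getattr are illustrative, not my exact config):

```
/srv/dev-disk-by-uuid-*  /srv/mergerfs/pool  fuse.mergerfs  defaults,allow_other,category.create=mfs,func.getattr=newest,fsname=pool  0 0
```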
Nvidia GPU Operator
Deprecated: 5/5/25
No longer in use, as virtualization moved from kubernetes to docker. Also, the Nvidia 3060 will be repurposed for gaming vms.
After purchasing an RTX 3060, it took many more steps to mount it into containers due to nvidia's proprietary drivers. For docker it would have been a simple matter of installing nvidia drivers on the node and installing the [nvidia container toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html) to enable it.
For kubernetes there is the [nvidia gpu operator](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/getting-started.html). Essentially, it automatically deploys pods that install the latest drivers, install the nvidia container toolkit, configure it against containerd, and handle including drivers for any pod that requests the gpu resource. While trying to configure it, I found that it's actually not possible to have the operator auto-install drivers and update pods, due to it [not supporting secure boot](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/troubleshooting.html#efi-secure-boot).
Installing nvidia drivers for the server would normally pull in a lot of packages that aren't needed on a headless server; I found a gist that listed the minimal packages for getting the drivers set up. After manually installing the drivers and setting up the container runtime, I was all set to use the gpu in pods.
When running the configure command for the nvidia container toolkit, you need to symlink k3s's config.toml from /var/lib/rancher/agent/etc/containerd/config.toml to /etc/containerd/config.toml, as that is where the toolkit expects the file.
In order to mount the gpu to a pod you must define runtimeClassName: 'nvidia' in the pod spec.
Do not use limits as it will enforce that no other pods can claim the gpu.
Only pods deployed on the same node as the gpu can use it.
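Putting those notes together, a minimal test pod requesting the gpu could look like this (the image, command, and node name are placeholders for testing):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test
spec:
  # required so containerd uses the nvidia runtime for this pod
  runtimeClassName: nvidia
  # must land on the node that physically has the gpu
  nodeSelector:
    kubernetes.io/hostname: gpu-node   # placeholder node name
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:12.2.0-base-ubuntu22.04
      command: ["nvidia-smi"]
      # note: no nvidia.com/gpu resource limit, per the note above
```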
Reloader
Deprecated: 5/5/25
No longer in use, as virtualization moved from kubernetes to docker.
Added this service to the cluster recently; I did not realize there was a service to automatically reload deployments when a configmap or secret changes. It was an absolute pain to have to go into argocd to reload a deployment after updating a configmap. With this simple-to-set-up service, that happens automatically.
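Opting a deployment in is just an annotation; a minimal sketch, with placeholder names:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  annotations:
    # reloader restarts the rollout whenever a referenced
    # ConfigMap or Secret changes
    reloader.stakater.com/auto: "true"
spec:
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: nginx:alpine
          envFrom:
            - configMapRef:
                name: myapp-config   # edits here trigger a restart
```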
Renovate
Another recent addition to the cluster, this became the replacement for diun and watchtower. It handles auto-updating the different resources that exist in a github repo. Using my docker repo and some small package rules, renovate creates PRs that let me know when an image needs to be updated. The created PR also lists the release notes for the version increment, letting me handle almost everything by looking at a PR and merging it if I decide the version does not interfere with my current configuration. I also added it to the dripdrop repos, so it's managing package updates for my python server and react app as well.
Note: the helm regex match only works on a single instance and will not target any other deployments in the same file for an update.
Snapraid
When figuring out how to create and manage the NAS pool, there were two options I landed on: RAID, or mergerfs + snapraid. The biggest reason I did not go forward with RAID was its need for SATA connections to provide the best throughput, and at the time I was starting with the mini-pc, so that was not something I could accomplish. RAID would give me real-time data redundancy and increased speed depending on the drive configuration, but one major factor it could not handle was adding additional drives with ease, since that requires rebuilding the entire pool. Snapraid is an alternative used by many, and in combination with mergerfs it provides a RAID-like drive pool: mergerfs does the filesystem pooling and snapraid handles the data redundancy. It works by running a scheduled task that computes parity across the data drives, and by its nature it allows for easily expandable pools. Because the only thing being backed up is media and music and nothing sensitive, it felt like the best path forward. Rather than using openmediavault's snapraid plugin, I disabled its scheduled task and replaced it with the snapraid-aio script, which provides much better notification support beyond just email and gives me better insight into the state of the array between syncs. I've moved from a weekly sync with 25% of the array being scrubbed to a daily sync with 4% of the array scrubbed per day.
Scrubbing is also a very important task to catch any potential bit-rot that may occur in the drive pool. With a scrub-frequency of 1 and a scrub-percentage of 25%, it scrubs a quarter of the array every week, essentially scrubbing the entire array in a month.
S6 Overlay
This really was a game changer to find, as it helped a ton in setting up images that require tasks before and after a main process. There was a bit of a learning curve with v3 of the overlay structure, but now that I have some working examples it has allowed me to do so much. 389-ds had a really weird entrypoint that I had set up to handle initial configuration, but with s6 it was easy. Also, the issue I ran into with the windows vm, where I needed to unbind the vfio drivers from the gpu but was blocked because the vm wouldn't let go of it, was completely resolved with s6, since it has a nice logic flow and handles process-finish scripts really easily. I now understand why so many images, such as those from hotio and linuxserver, use this tech.
This also allowed me to do some more complicated docker setups like system-tools, which is basically an all-in-one cron and dashboard-actions service for my server.
Do not put echo statements in oneshot services; it seems to execute the entire line as a command.
Tailscale
I've always heard of tailscale but never really made use of it until very recently. I did not realize how amazing this service is: it allows remote connections to devices not exposed to the public internet. For the longest time I kept a port open on the router to allow ssh-ing into the minipc, and through that I was able to access the local network. With tailscale that configuration is completely gone, which leaves me much more comfortable with the router's attack surface. In addition, tailscale has docker images that let a container be deployed into your tailscale network with access to services exposed on that network. This enables access to the openmediavault GUIs for both the minipc and nas from anywhere I have internet access.
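A compose sketch of that container, going off the tailscale/tailscale image docs (the auth key and hostname are placeholders):

```yaml
services:
  tailscale:
    image: tailscale/tailscale:latest
    hostname: homelab                # name shown in the tailnet
    environment:
      - TS_AUTHKEY=tskey-auth-XXXX   # generated in the admin console
      - TS_STATE_DIR=/var/lib/tailscale
    volumes:
      - tailscale-state:/var/lib/tailscale
    devices:
      - /dev/net/tun
    cap_add:
      - NET_ADMIN
    network_mode: host               # so the tailnet can reach LAN services
volumes:
  tailscale-state:
```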
Traefik
During my time using docker, and even for some of the time my kubernetes cluster existed, I primarily used docker-swag to handle routing as well as retrieving let's encrypt certificates for my domains. Due to swag's simplicity it never made sense to migrate. I had heard of traefik and knew it was an up-and-coming reverse proxy, but I did not realize how powerful it would be in my kubernetes cluster. Over time I learned that my setup was very anti-kubernetes and that I should be using an ingress to route external traffic to services within the cluster. Due to traefik's lower barrier to entry, and the fact that it is already built into k3s, I went with it. Learning how to create IngressRoutes took some time: it requires binding to a service and matching a rule defined by a Host domain with optional PathPrefix or PathRegex logic. One really cool thing is that letsencrypt support comes built in, so it automatically generated certificates the moment a Host was detected within a route.
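An IngressRoute tying those pieces together might look like this (the apiVersion varies by traefik version; names and the resolver are placeholders):

```yaml
apiVersion: traefik.containo.us/v1alpha1   # traefik.io/v1alpha1 on newer versions
kind: IngressRoute
metadata:
  name: myapp
  namespace: default
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`app.dripdrop.pro`) && PathPrefix(`/api`)
      kind: Rule
      services:
        # the kubernetes Service the route binds to
        - name: myapp
          port: 80
  tls:
    # triggers automatic letsencrypt issuance for the Host above
    certResolver: letsencrypt
```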
It was also a perfect fit because it's widely supported by authentication services via forward-auth, which allowed a pretty seamless transition. With k3s, since traefik is auto-deployed when bootstrapping the cluster, the only way to override its configuration is a HelmChartConfig. Using that config I was able to set a persistent volume to store the ssl certs and maintain them between reboots. One additional configuration that was needed was to create the folder with the appropriate permissions, as traefik can otherwise run into issues when attempting to add new certificates. Now that my cluster fully runs with traefik, I don't imagine I would ever go back to nginx or docker-swag as a reverse proxy. (When I first started setting up kubernetes, I had continued to use the swag config by pushing all http traffic there.)
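The k3s override lives in a HelmChartConfig; a sketch of the persistence piece (the values shown are illustrative, and the acme email and resolver name are placeholders):

```yaml
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  # must match the chart k3s deploys
  name: traefik
  namespace: kube-system
spec:
  valuesContent: |-
    persistence:
      # persistent volume so the acme storage survives reboots
      enabled: true
      path: /data
    additionalArguments:
      - "--certificatesresolvers.letsencrypt.acme.email=admin@example.com"
      - "--certificatesresolvers.letsencrypt.acme.storage=/data/acme.json"
      - "--certificatesresolvers.letsencrypt.acme.tlschallenge=true"
```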
Updated (5/5/25)
After moving virtualization from kubernetes to docker, I had to spend time learning how to configure traefik with docker, and to my surprise it was even faster than learning it with k3s. Using docker labels and a couple of command args on the traefik container, I was able to easily recreate all my routing in less than a day.
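The label-based version of a route is tiny; for example (the domain, router name, and resolver are placeholders):

```yaml
services:
  whoami:
    image: traefik/whoami
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.whoami.rule=Host(`whoami.dripdrop.pro`)"
      - "traefik.http.routers.whoami.entrypoints=websecure"
      - "traefik.http.routers.whoami.tls.certresolver=letsencrypt"
```

Traefik watches the docker socket, so the router appears as soon as the container starts.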