
Installation on Bare-Metal Nodes

With SD v1.33.0, v1.32.1, and v1.31.2, we introduced some key parameters that help with installing SD on bare-metal nodes.

The purpose of this guide is to explain caveats to take into consideration when installing Kubernetes on a bare-metal environment, as well as provide a concrete example that can be further customized based on your specific needs.

Prerequisites and Planning

Special care needs to be taken when preparing to install SIGHUP Distribution in bare-metal environments.

Generally speaking, a bare-metal cluster can be considered a special case of an on-premises cluster where, due to the high capacity of the nodes, the pod density is (much) higher in order to take full advantage of the hardware.

Aside from the SD requirements, you will need to carefully plan for:

  • Storage: you will need fast storage, especially for etcd.
    • The storage size could heavily depend on how big the container images for your workload are.
    • Plan for future growth - e.g.: you may choose to use a SAN and iSCSI volumes to allow dynamic resizing.
    • For production deployments, we suggest mounting dedicated storage volumes for the following paths (see the example sketch after this list):
      • /var/lib/containerd - Container image and runtime storage.
      • /var/lib/kubelet - Kubelet data and pod volumes.
      • /var/lib/etcd - etcd database (control plane nodes only).
  • Kubernetes Networking: the key question here is "how many Pods per node will this environment need to allow?"
    • Use big enough Pod and Service CIDRs. You will need to host maxPodsPerNode x numNodes pod IPs in total.
    • Consider that each node will have its own dedicated subnet of the Pod subnet CIDR. Each subnet needs to hold at least maxPodsPerNode IPs.
    • Plan for future growth now, because changing network parameters after installation is quite difficult.
      • Is it possible that you will add more nodes in the future? If so, the Pod/Service CIDRs must allow more subnets for additional nodes.
      • Is it possible that you will need to allow more pods per node in the future? If so, the nodes subnet size must be larger than initial estimates.
  • Access:
    • Ensure you have either physical access to the machines or a virtual KVM setup to manage them remotely, in case something goes wrong and you need to reset them.
    • SSH access into the running OS. Make sure you have one user with root privileges available.
  • Tools: furyctl will take care of most of the configuration. Depending on your environment, you may also want to use Ansible and/or OpenTofu for additional configuration needs.
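
As a purely illustrative example of the dedicated volumes mentioned in the storage point above, the /etc/fstab entries on each node could look like the following sketch. The device names (/dev/sdb1, etc.) are placeholders: replace them with your actual devices or, better, their UUIDs.

# Hypothetical devices: adapt to your actual volume layout
sudo tee -a /etc/fstab <<'EOF'
/dev/sdb1 /var/lib/containerd ext4 defaults 0 2
/dev/sdc1 /var/lib/kubelet    ext4 defaults 0 2
/dev/sdd1 /var/lib/etcd       ext4 defaults 0 2
EOF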

Example scenario

Let's create an example scenario that we will use to explain some of the possible choices:

  • Nodes: 6x 40CPUs - 512GB RAM - 50GB SAS 15kRPM disk
  • SAN: To allow creation of iSCSI volumes to be mounted on nodes
  • Currently, we predict that our workload will not exceed 200 pods per node
  • We also think that maybe we could need to add nodes to this cluster in the future
  • Given that all nodes have the same specs, we want to use the control-plane nodes to also host workloads, as their capacity would otherwise be wasted

With this environment, we will need to:

  • Create volumes in the SAN, at least for etcd, as the mechanical disks are not performant enough
    • if possible, automate the creation and mounting of volumes with OpenTofu and/or Ansible
  • Plan the Kubernetes network (tip: use ipcalc to make these calculations; see the sketch after this list):
    • We think that we will create at most 200 pods per node -> a /24 subnet seems to be enough
    • What if our needs change over time? What if we want to ensure we use as much of the available resources as possible? -> we may be better off using a /23 network per node, even if we will not exhaust it in the near future
    • A /23 subnet would allow us to place ~500 pods per node
    • 500 Pods x 6 Nodes = 3000 -> it means we need at least a /20 Pod CIDR (4096 addresses); we also need a different /20 Service CIDR
    • How many /23 subnets can we fit in a /20 CIDR? -> 8 subnets, which means we can add at most 2 more nodes to our environment - is this enough?
    • Let's say we think we may need to add 6 more nodes in the future -> with a /19 Pod CIDR we will have space for 16 /23 subnets, which gives us much better room for growth
    • Subnets must be chosen from the private ranges specified in RFC1918. Possible implementation:
      • Pod CIDR: 172.17.0.0/19
      • Service CIDR: 172.18.0.0/19
      • Node netmask: /23
  • Adjust the kernel's and the kubelet's default configuration:
    • for nodes with a high density of pods (processes) running at the same time, some parameters of the Linux kernel must be adjusted.
    • by default, the kubelet is limited to running 110 pods per node
    • it is useful to instruct the kubelet to reserve a quota of the available resources to guarantee the node's health
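
To double-check these numbers from a shell, you can use ipcalc (output format varies between implementations) or plain shell arithmetic:

# Inspect the /19 Pod CIDR (most ipcalc implementations report ~8190 usable hosts)
ipcalc 172.17.0.0/19
# Number of /23 node subnets that fit in a /19: 2^(23-19)
echo $(( 1 << (23 - 19) ))        # -> 16
# Usable pod IPs per /23 node subnet (minus network and broadcast addresses)
echo $(( (1 << (32 - 23)) - 2 ))  # -> 510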

Installation

Before installing SD, make sure that:

  • Each machine has a password-less sudo user that can only log in using SSH keys
  • Each machine has the appropriate storage configuration - in our case, for example, we would need to (see the sketch after this list):
    • create volumes in the SAN
    • install the open-iscsi system package in each machine
    • format the volumes and mount them on the machines
    • add entries in /etc/fstab to mount volumes at boot
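
As a rough sketch of this preparation, assuming Debian/Ubuntu machines and a SAN portal reachable at 192.168.1.100 (a placeholder, as is the device name below):

# Install the iSCSI initiator
sudo apt-get install -y open-iscsi

# Discover the SAN's targets and log in
sudo iscsiadm -m discovery -t sendtargets -p 192.168.1.100
sudo iscsiadm -m node --login

# Format and mount the volume dedicated to etcd, then persist it in /etc/fstab
sudo mkfs.ext4 /dev/sdd
sudo mkdir -p /var/lib/etcd
sudo mount /dev/sdd /var/lib/etcd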

Create an SD cluster configuration file using the OnPremises provider. You now need to edit it to add the bare-metal-specific configurations.

Customize the configuration

Based on our example scenario, edit the generated furyctl.yaml and add the following parameters:

Kubernetes networking

Configure the Pod and Service CIDRs according to your planning:

spec:
  kubernetes:
    podCidr: 172.17.0.0/19
    svcCidr: 172.18.0.0/19

Kubelet configuration

Configure kubelet parameters to ensure proper resource management on bare-metal nodes:

spec:
  kubernetes:
    advanced:
      kubeletConfiguration:
        maxPods: 500 # must be <= the max amount of IPs in the node subnet
        systemReserved:
          cpu: 2000m
          memory: 4Gi
          ephemeral-storage: 1Gi

maxPods: The maxPods parameter controls how many pods Kubernetes will allow on each node. This must align with your per-node subnet size:

  • /24 subnet = ~250 usable IPs -> set maxPods: 250 or lower
  • /23 subnet = ~500 usable IPs -> set maxPods: 500 or lower
  • /22 subnet = ~1000 usable IPs -> set maxPods: 1000 or lower

If you set maxPods higher than your subnet allows, pods will fail to get IP addresses. If you set it too low, you'll waste IP space.
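
Once the cluster is running, you can check the pod capacity that the kubelet actually advertises on a node (the node name here is from our example):

kubectl get node worker1 -o jsonpath='{.status.capacity.pods}{"\n"}'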

systemReserved: Reserves resources for system daemons (OS, kubelet, containerd, CNI, etc.). Without these reservations, pods can consume all available resources, starving critical system processes and causing node instability.

Rule of thumb - scale based on total node resources:

  • cpu:
    • 1000m (1 core) for nodes with <16 cores
    • 2000m (2 cores) for nodes with 16-64 cores
    • 4000m (4 cores) for nodes with >64 cores
  • memory:
    • 1-2Gi for small nodes (<32GB RAM)
    • 2-4Gi for medium nodes (32-128GB RAM)
    • 4-8Gi for large nodes (>128GB RAM)
  • ephemeral-storage:
    • 1Gi for minimal logging
    • 2-5Gi for verbose logging or longer retention

These are just simple starting suggestions. The actual values should be determined based on the kind of workload and the system requirements of each specific environment. Monitor actual usage in production and adjust as needed.
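
After the installation, you can compare a node's total capacity with what remains allocatable to pods once the reservations are subtracted (node name from our example):

kubectl get node worker1 -o jsonpath='{.status.allocatable}{"\n"}'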

Kernel parameters

Bare-metal nodes running many pods need increased inotify limits. The default Linux limits are too low and will cause "too many open files" errors. Add these parameters at the global level (applies to all nodes):

spec:
  kubernetes:
    advanced:
      kernelParameters:
        - name: "fs.inotify.max_user_instances"
          value: "8192"
        - name: "fs.inotify.max_user_watches"
          value: "524288"

Why these parameters matter: Each container can watch files using inotify. With 200-500 pods per node, the default limits (128 instances, 8192 watches) are quickly exhausted, causing pod failures. These increased values prevent that.
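
After the installation, you can verify on a node that the new limits were applied:

# Run on a node; expected values: 8192 and 524288
sysctl fs.inotify.max_user_instances fs.inotify.max_user_watches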

tip

You can also set kernelParameters per master or per node group if you need different values:

spec:
  kubernetes:
    masters:
      kernelParameters:
        - name: "param.name"
          value: "param-value"
    nodes:
      - name: worker
        kernelParameters:
          - name: "param.name"
            value: "other-value"

Note: parameters specified this way take precedence and entirely replace the ones specified in the .spec.kubernetes.advanced.kernelParameters section!

Control plane and worker nodes

Configure your control plane and worker nodes. In bare-metal environments, you may want to run workloads on control plane nodes to maximize hardware utilization:

spec:
  kubernetes:
    masters:
      hosts:
        - name: master1
          ip: 192.168.1.10
        - name: master2
          ip: 192.168.1.11
        - name: master3
          ip: 192.168.1.12
      taints: [] # Allow workloads on control plane nodes
    nodes:
      - name: worker
        hosts:
          - name: worker1
            ip: 192.168.1.20
          - name: worker2
            ip: 192.168.1.21
          - name: worker3
            ip: 192.168.1.22
        taints: []
note

By default, Kubernetes taints control plane nodes to prevent regular workloads from running on them. Setting taints: [] removes all taints, allowing you to schedule workloads on control plane nodes. This can be useful in bare-metal environments where you want to maximize utilization.

If you prefer to keep control plane nodes dedicated, simply omit the taints parameter.
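
To confirm after the installation that a control plane node is schedulable (master1 is the name from our example), you can inspect its taints; an empty output means none are set:

kubectl get node master1 -o jsonpath='{.spec.taints}{"\n"}'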

Networking module

SD supports both Calico and Cilium as CNI providers. In our example, we planned for a /23 subnet per node to allow up to 500 pods.

Using Calico:

spec:
  distribution:
    modules:
      networking:
        type: calico
        tigeraOperator:
          blockSize: 23 # /23 subnet per node = ~500 usable IPs

Using Cilium:

spec:
  distribution:
    modules:
      networking:
        type: cilium
        cilium:
          maskSize: "23" # /23 subnet per node = ~500 usable IPs
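
As a rough sanity check after the installation, you can list the Pod CIDR that Kubernetes assigned to each node. Note that with Calico and Cilium the per-node allocation may be handled by the CNI's own IPAM, so its tooling (e.g. calicoctl ipam show or cilium status) is the authoritative source:

kubectl get nodes -o custom-columns=NAME:.metadata.name,PODCIDR:.spec.podCIDR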

Apply the configuration

Before applying, make sure you have:

  • Created the PKI with furyctl create pki (see OnPremises provider docs)
  • Configured SSH access to your machines by setting up the spec.kubernetes.ssh section of the config file
  • Customized the rest of the configuration file with your own parameters
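
A minimal pre-flight sketch, where the user name, key path, and IPs are placeholders based on our example:

# Generate the certificates for the cluster
furyctl create pki

# Verify SSH access and password-less sudo on every machine
for ip in 192.168.1.10 192.168.1.11 192.168.1.12 \
          192.168.1.20 192.168.1.21 192.168.1.22; do
  ssh -i ~/.ssh/id_ed25519 sighup@"$ip" 'sudo -n true && echo "$(hostname): OK"'
done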

Then apply the configuration:

furyctl apply
tip

Follow the full installation guide in the OnPremises provider documentation for complete details on SSH setup, load balancers, and other required configurations.