Do You Actually Need Kubernetes?
Evaluating whether the overhead of Kubernetes is worth it for your team, and what the migration actually looks like.

We spent three months evaluating whether to migrate from Docker Compose on a single VPS to Kubernetes. During that time I read what felt like every "Kubernetes for beginners" article on the internet, sat through four vendor demos, and had more architecture discussions than I care to count. In the end, we partially migrated. Not everything. Just the pieces where the tradeoff made sense.
I want to walk through our thinking because I suspect a lot of small-to-mid teams are asking the same question and getting sales pitches instead of honest answers.
The Situation We Were In
Our stack was an API server, a React frontend, a PostgreSQL database, Redis, and a background job worker, all running on a single 8-core VPS with Docker Compose. It worked fine most of the time. The problem showed up during traffic spikes: product launches, marketing campaigns, anything that brought a sudden increase in users.
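For reference, the whole thing fit in one Compose file along these lines. This is a simplified sketch, not our actual config; the image names, ports, and worker command are placeholders.
# docker-compose.yml - simplified sketch of the single-VPS setup
# (image names, ports, and the worker command are placeholders)
services:
  api:
    image: internal-registry/core-api:v2.1
    ports:
      - "3000:3000"
    depends_on:
      - postgres
      - redis
  frontend:
    image: internal-registry/frontend:latest
    ports:
      - "80:80"
  worker:
    image: internal-registry/core-api:v2.1
    command: ["node", "worker.js"]   # hypothetical entrypoint for the job worker
    depends_on:
      - postgres
      - redis
  postgres:
    image: postgres:15
    volumes:
      - pgdata:/var/lib/postgresql/data
  redis:
    image: redis:7
volumes:
  pgdata: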
When traffic spiked, someone had to manually SSH into the server, check the resource usage, maybe spin up another container or increase the resource limits. If this happened during business hours, no big deal. If it happened at 2 AM, someone's phone rang and they had to drag themselves out of bed to scale things up.
We also couldn't do zero-downtime deploys. Every time we pushed a new version of the API, there was a brief period, maybe 3-5 seconds, where active connections got dropped. For most users that meant a failed request that succeeded on retry. Not terrible, but not great either, and it made us nervous about deploying during peak hours.
These were the two concrete problems: manual scaling and deploy interruptions. Everything else was secondary.
What Kubernetes Offers
The value proposition of Kubernetes is that you describe what you want your system to look like, and it makes reality match that description. You say "I want 3 copies of this API server running, and if any of them crash, start a replacement." Kubernetes does the rest.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: core-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: core-api
  template:
    metadata:
      labels:
        app: core-api
    spec:
      containers:
        - name: app
          image: internal-registry/core-api:v2.1
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 15
          readinessProbe:
            httpGet:
              path: /ready
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
The livenessProbe checks if the container is alive. If it fails, Kubernetes kills the container and starts a fresh one. The readinessProbe checks if the container is ready to receive traffic. If it fails, the container stays running but gets removed from the load balancer until it recovers.
This was genuinely appealing to us. If a container crashed due to an unhandled exception or a memory leak, it would get restarted automatically. No pager. No SSH session. No manual intervention.
The auto-scaling was the other big draw.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: core-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: core-api
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
When average CPU across the pods exceeds 70% of what they requested, Kubernetes spins up more replicas. When it drops, it scales back down. During a launch event, we could theoretically handle 10x traffic without anyone touching a keyboard. At 3 AM, when nobody's on the site, we'd run just 2 replicas instead of paying for peak capacity 24/7.
The cost savings argument was real. We did the math and estimated that right-sizing with auto-scaling would save us about 30% on our hosting bill compared to our current approach of provisioning for peak and leaving it there all the time.
Rolling Deployments
The zero-downtime deploy problem was actually the easier sell internally. Kubernetes does rolling deployments by default: it starts pods running the new version, waits for them to pass their readiness probes, then gradually shifts traffic away from the old pods. At no point is the application left without healthy pods serving requests.
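The rollout behavior is tunable on the Deployment itself. Here's a rough sketch of the relevant fragment; the maxSurge and maxUnavailable values are illustrative, not something we tuned carefully.
# Fragment of the core-api Deployment spec - illustrative values, not our tuned settings
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # allow one extra pod above the desired count during a rollout
      maxUnavailable: 0    # never take an old pod down before its replacement is ready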
We tested this in a staging environment and the results were good. We could deploy a new API version while running a continuous load test and not see a single dropped connection. After months of carefully timing our deploys to avoid peak hours, this felt like a big upgrade.
The Downsides We Discovered
I don't want to make this sound like Kubernetes was an obvious win. There were real costs that we didn't fully appreciate until we started using it.
The learning curve was steep. Our team had strong Docker skills, but Kubernetes is a different world. Concepts like Services, Ingress, ConfigMaps, Secrets, Namespaces, RBAC: there's a lot to learn before you can deploy even a simple application. We budgeted two weeks for the team to get comfortable. It took closer to five weeks before people felt confident making changes without worrying about breaking something.
Debugging is harder. With Docker Compose, you run docker logs <container> and see the output. With Kubernetes, your pod might have been rescheduled to a different node, or it might have been killed and replaced with a new one, and the logs from the old pod are gone unless you've set up log aggregation (which is its own project). We spent a week setting up Loki and Grafana just to get searchable logs. That's a week of work that didn't exist in our Docker Compose setup.
YAML fatigue is real. Kubernetes is configured through YAML manifests. Lots of them. A simple application that would take 20 lines in a docker-compose.yml can easily require 150+ lines of Kubernetes YAML across multiple files: Deployment, Service, Ingress, HPA, ConfigMap, maybe a PersistentVolumeClaim. There are tools like Helm and Kustomize that help manage this, but they're additional complexity on top of the already complex base.
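To make that concrete: just exposing the API behind a URL takes two more manifests on top of the Deployment and HPA shown earlier. Something roughly like this, with the hostname and ingress class as placeholders.
# Service: gives the core-api pods a stable internal address
apiVersion: v1
kind: Service
metadata:
  name: core-api
spec:
  selector:
    app: core-api
  ports:
    - port: 80
      targetPort: 3000
---
# Ingress: routes external HTTP traffic to the Service
# (hostname and ingress class are placeholders)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: core-api
spec:
  ingressClassName: nginx
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: core-api
                port:
                  number: 80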
Local development gets awkward. You can't easily run a full Kubernetes cluster on a developer's laptop. Tools like minikube and kind exist, but they're resource-hungry and the experience isn't as smooth as docker compose up. We ended up keeping Docker Compose for local development and using Kubernetes only for staging and production. This means the local environment doesn't perfectly match production, which is the thing Docker was supposed to fix in the first place. Bit ironic.
Config drift is a real risk. If someone runs kubectl apply manually to fix a production issue at midnight and forgets to commit the change to git, the cluster state and your source code are now out of sync. We mitigated this by adopting ArgoCD for GitOps: the cluster syncs from a git repository and manual changes get overwritten. But that's yet another tool to set up and maintain.
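For what it's worth, the GitOps wiring is roughly one more manifest per application. A sketch of what that looks like in spirit; the repo URL and paths are placeholders.
# Argo CD Application: the cluster continuously syncs from git,
# and selfHeal reverts any manual kubectl changes
# (repo URL and paths are placeholders)
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: core-api
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://git.example.com/platform/k8s-manifests.git
    targetRevision: main
    path: apps/core-api
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true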
What We Actually Decided
After all this evaluation, we didn't go all-in on Kubernetes. We migrated our stateless services, the API server and the frontend, to a managed Kubernetes cluster on GKE. These are the things that benefit most from auto-scaling and zero-downtime deploys.
We kept PostgreSQL and Redis on managed cloud services outside Kubernetes. Running databases inside Kubernetes is technically possible but it adds significant operational complexity: persistent volume management, backup strategies that account for pod rescheduling, replication across nodes. The general advice I've heard from people who've done it is "don't, unless you have a specific reason to." We didn't have a specific reason.
The background job worker was a borderline case. We migrated it to Kubernetes because it's stateless and we wanted it to scale with the queue depth. But honestly, it would have worked fine as a Docker container on a dedicated VPS too. The migration added some complexity without a huge benefit for that particular component.
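Part of that extra complexity: scaling on queue depth isn't built in the way CPU scaling is. You need something (a Prometheus adapter, KEDA, or similar) to expose the queue length as an external metric before the HPA can act on it. A rough sketch of the shape, with the metric name and target values entirely hypothetical:
# Sketch only: assumes a metrics adapter is already exporting the queue length
# as an external metric named "queue_depth" (hypothetical name and values)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: job-worker-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: job-worker
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: External
      external:
        metric:
          name: queue_depth
        target:
          type: AverageValue
          averageValue: "50"   # aim for roughly 50 queued jobs per worker pod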
Six Months In
The auto-scaling has been great. We had a surprise traffic spike last month (some social media post went viral and linked to our app), and the API scaled from 2 pods to 11 pods without anyone doing anything. In the old setup, that would have been a crash followed by a frantic SSH session.
The zero-downtime deploys have been great too. We deploy multiple times a day now. Nobody checks the clock to see if it's peak hours first.
The operational overhead is real though. We've spent more time on infrastructure tooling in the last six months than in the previous two years. Setting up monitoring, setting up log aggregation, writing Helm charts, debugging networking issues, learning the intricacies of Ingress controllers. This is time that didn't go toward building product features.
Would I recommend Kubernetes for a small team? Honestly, it depends. If you're running a single application with predictable traffic, a VPS with Docker Compose is simpler, cheaper, and good enough. If you have unpredictable scaling needs, multiple services, and you need deploy reliability, the complexity of Kubernetes starts to pay for itself.
We're in that second category, but just barely. A year ago, we probably weren't. The tipping point for us was the late-night pager incidents and the deploy anxiety. If we hadn't had those concrete pain points, I'm not sure we would have migrated, and I'm not sure we should have.
The Kubernetes ecosystem has this gravitational pull where once you adopt it, you keep adopting more tools that integrate with it: service meshes, policy engines, progressive delivery tools. Each one is individually reasonable but collectively they add up to a lot of moving parts. I'm actively trying to resist that pull and keep our setup as simple as we can. We don't need Istio. We don't need Linkerd. We need auto-scaling and rolling deploys, and we have those now. Everything else can wait until we have a real reason for it.
Written by
Anurag Sinha
Developer who writes about the stuff I actually use day-to-day. If I got something wrong, let me know.
Related Articles
How Docker Compose Actually Went for My Development Setup
Addressing misconceptions about containerized local setups, configuration pain points, and why volume mounting behaves the way it does.
Learning Docker: What I Wish Someone Had Told Me Earlier
Why most Docker tutorials fail beginners, the critical difference between images and containers, and what actually happens when you run a container.
Cloud-Native Architecture: What It Means and What It Costs
Reference definitions for the 12-factor app methodology, containerization, infrastructure as code, and CI/CD pipelines.