Are Microservices for You? Let's Discover Together.
When That Meetup in Reggio Emilia Changed How I Think About Architecture
I remember the first time I attended a Fullstack Meetup in Reggio Emilia. I was standing in a circle of maybe 12 developers when a grizzled senior architect started ranting. "You just need two lines of bash," he said. "Two lines, and you can deploy a monolith that serves millions."
He wasn't wrong. But the discomfort in that circle was palpable. We engineers have a beautiful, sometimes toxic trait: we crave complexity. If a solution feels too simple, we assume it's "junior." We convince ourselves we need to handle every possible scenario, every hypothetical scaling issue that hasn't happened yet, every edge case that lives only in our architectural anxiety dreams.
One year and countless production incidents later, I've learned a humbling truth:
Microservices aren't a destination, they're a deliberate choice of problems.
And the question nobody wants to ask is: is your team actually big enough to justify the pain I'm about to describe?
When Amazon Said "Actually, Never Mind"
You know your industry has a problem when Amazon, the company that literally invented the cloud platform microservices run on, publishes a blog post titled "Scaling up the Prime Video audio/video monitoring service and reducing costs by 90%."
Their monitoring team had built the "perfect" modern system: serverless, decoupled, distributed. AWS Step Functions orchestrating Lambda functions. The kind of architecture diagram that gets you upvotes on Reddit.
Except it was hemorrhaging money and couldn't scale.
The problem? Every. Single. Network. Hop. They were serializing data, storing it in S3, retrieving it for the next function, over and over. The orchestration overhead alone was killing them. So they did the unthinkable: they refactored back into a monolith.
The results were staggering:
- 90% cost reduction (not a typo)
- Massive scalability improvements from in-memory processing
- Simpler code: no more orchestration-layer gymnastics
Think about this: if Amazon, whose business model is literally renting you distributed infrastructure, is consolidating services, you need to ask yourself why you're rushing to fragment yours.
The Hidden Tax of the Distributed Dream
We often talk about the "Microservice Premium": the inherent tax paid in operational complexity and cognitive load. This tax is collected in the small, painful moments of a developer's day:
1. The Night RabbitMQ Made Me Question My Career Choices
Let me tell you about the time I spent an entire evening debugging what turned out to be... a keepalive timeout.
We had RabbitMQ running using @cloudamqp/amqp-client in Node.js. Everything looked perfect in the logs: messages flowing, services communicating beautifully. Then I noticed something weird: the connection was dropping and reconnecting every 2 minutes. Not "roughly 2 minutes." Not "between 1-3 minutes." Exactly 120 seconds. Like clockwork.
Here's what I learned, face-first into CloudAMQP documentation:
The WebSocket Keepalive Problem:
- The client library was using WebSockets under the hood
- WebSockets require periodic keepalive packets or they timeout
- If you're not actively sending/receiving messages, the connection appears "idle" to intermediate proxies
- The 2-minute timeout was the proxy giving up on our "dead" connection
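In our case the cure was to keep the connection chatty. Here is a minimal sketch, with an illustrative host, credentials, and interval (check whether your client honors a `heartbeat` URL parameter; amqplib, for example, does):

```javascript
// Sketch: enable protocol-level heartbeats so intermediate proxies never
// see the connection as idle. The URL below is a placeholder.
function withHeartbeat(amqpUrl, seconds) {
  const url = new URL(amqpUrl);
  url.searchParams.set("heartbeat", String(seconds));
  return url.toString();
}

// Keep the heartbeat comfortably below the proxy's 120-second idle timeout.
const url = withHeartbeat("amqps://user:pass@example.cloudamqp.com/vhost", 30);
console.log(url); // ends with ?heartbeat=30
```

Heartbeat frames are tiny, but they traverse every proxy on the path, so the connection never looks "dead" to anyone.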
RabbitMQ calls add physical distance and network hops. Development feels fast locally, but production bleeds performance.
Code → Network → Queue → Service B (Latency ↑)
And the fun doesn't stop there:
- The reconnection dance: Every disconnect triggered a full reconnection cycle, re-establishing channels, re-declaring exchanges, and re-binding queues. All that overhead, every two minutes.
- Configuration drift hell: If Service A declares an exchange as `durable: true` and Service B declares the same exchange with `durable: false`, RabbitMQ throws `PRECONDITION_FAILED` and shuts down your entire pipeline.
I once spent three hours tracking down why a service wouldn't start, only to discover it was an exchange auto-delete flag mismatch. The error message was useless. The documentation was sparse. Stack Overflow had nothing. I felt very alone.
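A cheap guard against that drift (the `EXCHANGES` table and `exchangeArgs` helper here are hypothetical names, not from any real codebase): centralize the exchange topology in one shared module that every service imports, so two services can never disagree on a flag.

```javascript
// Sketch: a single source of truth for exchange declarations.
const EXCHANGES = Object.freeze({
  orders: { name: "orders", type: "topic", durable: true, autoDelete: false },
});

function exchangeArgs(key) {
  const def = EXCHANGES[key];
  if (!def) throw new Error(`Unknown exchange: ${key}`);
  return { ...def }; // return a copy so callers can't mutate the shared definition
}

// Service A and Service B both call exchangeArgs("orders"), so they always
// agree on `durable` and `autoDelete`, and PRECONDITION_FAILED can't happen.
console.log(exchangeArgs("orders").durable); // true
```

The same idea works across repositories by publishing the topology module as a private package.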
2. The Dependency Saga
The "Technology Diversity" promised by microservices often turns into a polyglot nightmare.
Case in point: The node-canvas library:
- Installing it requires deep system-level dependencies like Cairo, Pango, and libjpeg
- If you develop on a Mac but deploy to an Alpine Linux container, pre-built binaries fail due to the libc vs. musl conflict
- You are forced to install Python, `g++`, and `make` into your production images, bloating them and increasing build times from seconds to 10+ minutes
This isn't "technology diversity"; it's dependency hell with extra steps.
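A common mitigation is a multi-stage Docker build: compile the native modules in a throwaway stage, then ship only the runtime libraries. The sketch below makes assumptions — Alpine package names vary by release, and `index.js` is a placeholder entry point:

```dockerfile
# Sketch: build node-canvas's native deps in a throwaway stage;
# ship runtime libraries only. Package names may need adjusting.
FROM node:20-alpine AS build
RUN apk add --no-cache python3 g++ make pkgconfig \
    cairo-dev pango-dev jpeg-dev giflib-dev
WORKDIR /app
COPY package*.json ./
RUN npm ci

FROM node:20-alpine
# Runtime libraries only: no compilers, no headers in the final image
RUN apk add --no-cache cairo pango libjpeg-turbo giflib
WORKDIR /app
COPY --from=build /app/node_modules ./node_modules
COPY . .
CMD ["node", "index.js"]
```

You still pay the slow build once, but the shipped image stays small and compiler-free.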
3. The Windows Container Incident (Or: Why You Should Develop on Linux)
If you've never had to run Docker on Windows Server, congratulations. You are blessed.
For the rest of us who've lived through it: remember that time you tried to delete a corrupted Docker image and Windows literally said "Access Denied" to the Administrator account?
Here's the nightmare sequence:
- Docker's `windowsfilter` directory holds a file lock
- Your antivirus also holds a file lock (why? who knows)
- The layer becomes corrupted
- You try to delete it: Access Denied
- You log in as Administrator: Access Denied
- You cry a little
- You Google for 45 minutes and find a sketchy PowerShell script that forcibly seizes ownership from the `SYSTEM` account
- You run it, praying you don't brick your entire Docker installation
- It works
- You immediately start researching Linux alternatives
I wish I were exaggerating. I still have that PowerShell script bookmarked.
4. Ops is Not Optional
If you adopt microservices, you cannot do manual deployments. CI/CD becomes non-negotiable.
Here's what happens when you try to manually deploy 12 microservices:
- You SSH into the first server and pull the latest Docker image
- Service #1 starts successfully
- You move to Service #2, realize you forgot to update the environment variables
- Service #2 crashes on startup
- Service #1 is now calling the old version of Service #2's API
- Users start reporting errors
- You fix Service #2, but now Service #3 depends on both, and you're not sure which version combinations are compatible
- It's 11 PM. You're still deploying.
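Even before you have full CI/CD, a deploy script can at least fail fast instead of crashing a service at startup. A minimal sketch — the variable names and the (commented-out) Docker commands are illustrative, not from any real pipeline:

```shell
#!/bin/sh
# Sketch: refuse to deploy when required configuration is missing,
# so the "forgot the environment variables" step never reaches production.
set -eu

require() {
  for var in "$@"; do
    eval "val=\${$var:-}"
    if [ -z "$val" ]; then
      echo "Missing required env var: $var" >&2
      return 1
    fi
  done
}

# Example values; in a real deploy these come from the environment.
DATABASE_URL="postgres://db.example/app"
RABBITMQ_URL="amqps://mq.example/vhost"
SERVICE_VERSION="1.2.3"

require DATABASE_URL RABBITMQ_URL SERVICE_VERSION
echo "deploying $SERVICE_VERSION"
# docker pull "registry.example/service:$SERVICE_VERSION"
# docker compose up -d
```

It is two lines of bash away from being a real pipeline step, which is exactly the point.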
We use Azure DevOps, and once we set up self-hosted agents, the quality-of-life improvement was massive. But there's a catch: if you self-host agents, you own the build environment. That means:
- Managing Docker versions across agents
- Keeping build tools updated (Node, .NET SDK, Python)
- Debugging why Agent #3 builds successfully but Agent #1 fails with the exact same code
- Monitoring disk space because Docker layers fill up drives faster than you expect
5. Logging: The Black Hole
In the "old days," I could SSH into a server and `tail -f` a log file. Simple. Direct. Effective.
In a microservices world, here's what debugging a single user request looks like:
- Step 1: User reports error at `14:23:47`
- Step 2: Check API Gateway logs... request routed to Order Service
- Step 3: SSH into Order Service container... log says it called Payment Service
- Step 4: SSH into Payment Service... no obvious errors, but it called Inventory Service
- Step 5: SSH into Inventory Service... nothing unusual, but wait: timestamps are in UTC, Order Service was in CET, Payment Service was in PST because someone changed the timezone in that Docker image for "testing"
- Step 6: Spend 45 minutes mentally converting timezones to correlate three log streams
- Step 7: Finally find the error: a 500ms timeout in Inventory Service that only happens under load
The brutal truth:
- Without structured logging (JSON), you can't parse logs programmatically
- Without correlation IDs, you can't trace requests across services
- Without centralized logging (Sentry, Datadog, ELK), you're playing archaeology with text files
If you don't invest in observability from Day 1, you're flying blind. And trust me, manually correlating timestamps across different log streams at 2 AM is a special kind of hell.
The Orchestration Cliff: K8s vs. Rancher
Once you have containers, you must run them. Kubernetes (K8s) has won the war, but it is a "beast" with a vertical learning curve.
K8s vs. Rancher: A Quick Comparison
| Feature Category | Managed K8s (AKS/EKS) | SUSE Rancher |
|---|---|---|
| Control Plane | ✅ Managed by cloud provider | ⚠️ Managed by you (Single Point of Failure) |
| Developer UI | ❌ Often slow and disjointed | ✅ Unified "Cluster Explorer" view |
| Multi-Cloud | ❌ Limited to the provider | ✅ "Single pane of glass" across Azure, AWS, and on-prem |
My take: If you're a small team (< 20 devs), managed K8s (AKS/EKS) is worth the cost. If you need multi-cloud or have strong ops expertise, Rancher gives you more control, but you're also responsible when things break.
Most developers (myself included at the start) admit they only know enough to "edit some damn YAML files." This creates a dangerous single point of failure where the entire team is blocked waiting for the one "K8s expert" to come back from vacation and explain why the pod is stuck in `CrashLoopBackOff` even though it "works on my machine."
The âBoring Technologyâ Philosophy
Dan McKinley's "Choose Boring Technology" essay is a guiding light. Every company has finite "innovation tokens."
The Choice:
- Boring (Postgres, Monoliths, C#): Known failure modes; you can Google the error and find the answer
- Exciting (Bleeding-edge Service Meshes): "Unknown unknowns" that force you to debug the vendor's code instead of your business logic
Spend your innovation tokens wisely. You don't get many.
The Verdict: Start with a Modular Monolith. Seriously.
Here's my advice, earned through production scars:
Start with a Modular Monolith. Not because you're being "cautious" or "junior," but because you're being smart.
The Path I Actually Recommend:
- Build a Modular Monolith First: Use strict namespaces, clear boundaries, and domain-driven design principles in a single deployable unit
- Treat Boundaries Like APIs: Even within your monolith, design module interfaces as if they might become network calls someday
- Invest in Observability Early: Structured logging, correlation IDs, and metrics are valuable whether you're distributed or not
- Deploy it. Ship it. Make money with it.
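To make "treat boundaries like APIs" concrete, here is a minimal sketch of a module boundary inside a monolith. All names (`createBillingApi`, `chargeOrder`, the fake `db`) are illustrative, not from this article:

```javascript
// Sketch: a module that exposes an interface, never its internals.
// If "billing" ever becomes a service, callers don't change; only the
// transport behind createBillingApi does.

// billing/index.js -- the ONLY file other modules may import
function createBillingApi(db) {
  // db is injected; internal tables and queries stay private to this module
  return {
    async chargeOrder(orderId, amountCents) {
      if (!Number.isInteger(amountCents) || amountCents <= 0) {
        throw new Error("amountCents must be a positive integer");
      }
      await db.insert("charges", { orderId, amountCents });
      return { orderId, status: "charged" };
    },
  };
}

// Elsewhere in the monolith: the caller sees an API, not a schema.
const fakeDb = { rows: [], async insert(table, row) { this.rows.push(row); } };
const billing = createBillingApi(fakeDb);
billing.chargeOrder(42, 1999).then((r) => console.log(r.status)); // prints "charged"
```

The async signature is deliberate: an in-process call and a network call then look identical to the caller, which is what keeps a later extraction cheap.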
Then, only if you hit one of these hard constraints:
- You have 100+ developers and teams are blocking each other on deployments
- You need genuinely different scaling profiles (your video processing needs 32-core machines while your API runs fine on 2-core instances)
- You have compliance requirements that mandate physical service isolation
...consider microservices.
But if you go that route, embrace the complexity:
- You're signing up for everything I described above: the RabbitMQ debugging sessions, the Docker file-permission nightmares, the deployment pipeline complexity
- You need dedicated DevOps expertise, not "Bob who knows Docker"
- You need a logging/observability platform from Day 1, not "when we get around to it"
- You need to accept that your junior developers will spend their first month just understanding how to run the entire system locally
Microservices don't make your system simpler. They trade code complexity for operational complexity. Make sure you're ready for that trade.
Sometimes, the most "senior" decision you can make is to write two lines of bash, deploy a monolith, and go home on time.
Trust me, your future self, the one getting a phone call because service mesh certificates expired, will thank you.