How To Handle Distributed Transactions in Microservices
Every engineer who moves to microservices eventually hits the “transaction wall.” The simplicity of ACID is gone. Data is scattered. Keeping business logic consistent gets messy—and how you handle it separates robust systems from future outages.
Why Distributed Transactions Are Hard
In the monolith, ACID handled everything. Microservices break that safety net. Each service owns its own data, database, and sometimes its own infrastructure. Network calls fail. Services go down. Data gets out of sync. You can’t “just” roll back: atomicity across services is gone, and distributed systems theory (hello, CAP theorem) becomes reality.
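To make the failure mode concrete, here is a minimal sketch of a partial failure across two services. `InventoryService`, `PaymentService`, and `place_order` are hypothetical in-memory stand-ins invented for illustration, not a real API.

```python
class InventoryService:
    """Hypothetical service with its own private data store."""

    def __init__(self, stock: int) -> None:
        self.stock = stock

    def reserve(self, quantity: int) -> None:
        if quantity > self.stock:
            raise ValueError("insufficient stock")
        self.stock -= quantity


class PaymentService:
    """Hypothetical service; simulates a network failure when `available` is False."""

    def __init__(self, available: bool = True) -> None:
        self.available = available
        self.charged = 0

    def charge(self, amount: int) -> None:
        if not self.available:
            raise ConnectionError("payment service unreachable")
        self.charged += amount


def place_order(inventory, payments, quantity: int, amount: int) -> None:
    # Two independent writes: there is no shared transaction to roll back.
    inventory.reserve(quantity)
    payments.charge(amount)


inventory = InventoryService(stock=10)
payments = PaymentService(available=False)
try:
    place_order(inventory, payments, quantity=2, amount=50)
except ConnectionError:
    pass

# The reservation survived even though the payment never happened.
print(inventory.stock, payments.charged)  # prints: 8 0
```

The stock was decremented but nothing was charged, which is exactly the inconsistency the patterns below are meant to repair.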
The Patterns That Actually Work
1. Two-Phase Commit (2PC): Almost Never the Right Move
    # Pseudocode for 2PC: don't do this in microservices
    for service in services:
        service.prepare()
    if all_prepared():
        for service in services:
            service.commit()
    else:
        for service in services:
            service.rollback()
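Why not? 2PC is blocking and slow, and the coordinator is a single point of failure: if it dies between the prepare and commit phases, every participant sits on its locks until the coordinator recovers. You’re trading one headache for another.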
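For readers who want to see the protocol run, here is a small in-memory sketch of the two phases. The `Participant` class and its `can_commit` flag are assumptions made for illustration; a real participant would be a database or resource manager holding locks.

```python
class Participant:
    """Hypothetical participant; `can_commit` simulates its vote in phase one."""

    def __init__(self, name: str, can_commit: bool = True) -> None:
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self) -> bool:
        # A real participant would take locks here and hold them until phase two.
        self.state = "prepared" if self.can_commit else "refused"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "rolled_back"


def two_phase_commit(participants) -> bool:
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only if every vote was yes, otherwise roll everyone back.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


orders = Participant("orders")
payments = Participant("payments", can_commit=False)
committed = two_phase_commit([orders, payments])
print(committed, orders.state, payments.state)  # prints: False rolled_back rolled_back
```

Note what the sketch leaves out: if the coordinator crashed after phase one, `orders` would stay in the `prepared` state indefinitely, which is the blocking behavior that makes 2PC a poor fit across services.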
2. Sagas: The Practical, Scalable Pattern
Orchestrated Saga (with Temporal, Python Example)
Break a big transaction into steps. On failure, execute compensating steps.
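    from datetime import timedelta
    from temporalio import workflow

    # reserve_inventory, charge_payment, arrange_shipment, cancel_inventory and
    # refund_payment are assumed to be @activity.defn functions defined elsewhere.
    TIMEOUT = timedelta(seconds=30)  # Temporal requires a timeout on every activity call

    @workflow.defn
    class OrderSagaWorkflow:
        @workflow.run
        async def run(self, order_id: str) -> None:
            compensations = []  # undo steps for work that has already succeeded
            try:
                await workflow.execute_activity(reserve_inventory, order_id, start_to_close_timeout=TIMEOUT)
                compensations.append(cancel_inventory)
                await workflow.execute_activity(charge_payment, order_id, start_to_close_timeout=TIMEOUT)
                compensations.append(refund_payment)
                await workflow.execute_activity(arrange_shipment, order_id, start_to_close_timeout=TIMEOUT)
            except Exception:
                # Compensate only the completed steps, newest first, then surface the failure
                for undo in reversed(compensations):
                    await workflow.execute_activity(undo, order_id, start_to_close_timeout=TIMEOUT)
                raise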
- Frameworks: Temporal, Camunda, Axon.
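If you want to see the orchestration idea without a workflow engine, here is a framework-free sketch. `run_saga`, the `Step` alias, and the step names are hypothetical; the point is only that compensations run for completed steps, in reverse order. It has none of the durability or retries an engine like Temporal provides.

```python
from typing import Callable, List, Tuple

# Each step pairs an action with the compensation that undoes it.
Step = Tuple[Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> bool:
    """Run steps in order; on failure, undo completed steps newest first."""
    completed: List[Callable[[], None]] = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


log: List[str] = []


def fail_shipment() -> None:
    raise RuntimeError("no carrier available")


succeeded = run_saga([
    (lambda: log.append("reserve_inventory"), lambda: log.append("cancel_inventory")),
    (lambda: log.append("charge_payment"), lambda: log.append("refund_payment")),
    (fail_shipment, lambda: log.append("cancel_shipment")),
])
print(succeeded)  # prints: False
print(log)
# prints: ['reserve_inventory', 'charge_payment', 'refund_payment', 'cancel_inventory']
```

The failed shipment step is never compensated because it never completed; only the payment and the reservation are undone.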
Choreographed Saga (Event-Driven, Node.js/Kafka Example)
Each service reacts to events, produces the next step.
    // orderService.js (Node.js, pseudo-code)
    kafkaConsumer.on('order_created', async (order) => {
      try {
        await reserveInventory(order);
        await kafkaProducer.send('inventory_reserved', order);
      } catch {
        // Reservation failed: tell the other services to unwind
        await kafkaProducer.send('order_failed', order);
      }
    });
- Frameworks: Kafka Streams, Eventuate Tram, NATS JetStream.
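The same choreography can be sketched in Python with an in-memory bus, which makes the event chain easy to follow. `EventBus`, `STOCK`, and the handler names are hypothetical stand-ins; a real broker such as Kafka is asynchronous and durable, which this synchronous single-process toy is not.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List


class EventBus:
    """In-memory stand-in for a message broker (synchronous, single process)."""

    def __init__(self) -> None:
        self.handlers: DefaultDict[str, List[Callable[[Dict], None]]] = defaultdict(list)
        self.history: List[str] = []

    def subscribe(self, topic: str, handler: Callable[[Dict], None]) -> None:
        self.handlers[topic].append(handler)

    def publish(self, topic: str, payload: Dict) -> None:
        self.history.append(topic)
        for handler in self.handlers[topic]:
            handler(payload)


bus = EventBus()
STOCK = {"widget": 1}  # hypothetical inventory owned by the inventory service


def inventory_on_order_created(order: Dict) -> None:
    # The inventory service reacts to the order event and emits the next one.
    if STOCK.get(order["item"], 0) > 0:
        STOCK[order["item"]] -= 1
        bus.publish("inventory_reserved", order)
    else:
        bus.publish("order_failed", order)


def payment_on_inventory_reserved(order: Dict) -> None:
    # The payment service only knows about the event it listens for.
    bus.publish("payment_charged", order)


bus.subscribe("order_created", inventory_on_order_created)
bus.subscribe("inventory_reserved", payment_on_inventory_reserved)

bus.publish("order_created", {"id": 1, "item": "widget"})
bus.publish("order_created", {"id": 2, "item": "widget"})  # out of stock
print(bus.history)
# prints: ['order_created', 'inventory_reserved', 'payment_charged', 'order_created', 'order_failed']
```

No component holds the whole flow: each handler knows one input event and one output event, which is both the appeal of choreography and the reason it gets hard to trace as the number of steps grows.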
3. Outbox Pattern: Bulletproof Event Consistency
Write domain changes and events together, atomically, then publish events asynchronously.
    # Save the domain change and the event in one DB transaction
    with db.session.begin():
        order.status = "confirmed"
        outbox.append({"event": "order_confirmed", "order_id": order.id})

    # Background worker publishes pending events
    def outbox_worker():
        for event in get_pending_outbox():
            kafka_producer.send(event["event"], event)
            mark_outbox_event_published(event)
- Frameworks: Debezium, Kafka Connect, Watermill.
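Here is a runnable sketch of the same flow using only the standard library's `sqlite3`. The table layout, `confirm_order`, `publish_pending`, and the `send` callback are assumptions for illustration; in production the relay would typically be a CDC tool or a polling worker publishing to a real broker.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL);
    CREATE TABLE outbox (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        topic TEXT NOT NULL,
        payload TEXT NOT NULL,
        published INTEGER NOT NULL DEFAULT 0
    );
    INSERT INTO orders (id, status) VALUES (1, 'pending');
""")


def confirm_order(order_id: int) -> None:
    # `with conn` commits both writes together, or rolls both back on error.
    with conn:
        conn.execute("UPDATE orders SET status = 'confirmed' WHERE id = ?", (order_id,))
        conn.execute(
            "INSERT INTO outbox (topic, payload) VALUES (?, ?)",
            ("order_confirmed", json.dumps({"order_id": order_id})),
        )


def publish_pending(send) -> int:
    """Relay unpublished rows through `send`; a row is marked only after send returns."""
    rows = conn.execute(
        "SELECT id, topic, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, topic, payload in rows:
        send(topic, json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)


sent = []
confirm_order(1)
first_pass = publish_pending(lambda topic, event: sent.append((topic, event)))
second_pass = publish_pending(lambda topic, event: sent.append((topic, event)))
print(sent)  # prints: [('order_confirmed', {'order_id': 1})]
print(first_pass, second_pass)  # prints: 1 0
```

Because a row is marked published only after `send` returns, a crash between the two gives at-least-once delivery: the event may be sent again, so consumers still need to be idempotent.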
4. Idempotency: Survive Retries
Every external call should be idempotent. Clients must send a unique key; services deduplicate.
    def charge_payment(order_id, idempotency_key):
        if already_processed(idempotency_key):
            return "Already charged"
        # process payment
        mark_processed(idempotency_key)
        return "Charged"
- Reality: Payment APIs, order systems, anything with money or user state.
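One caveat with a check-then-mark approach is that two concurrent retries can both pass the check before either one records the key. A common way around this is to let a unique constraint do the deduplication. The sketch below assumes a hypothetical `processed` table and uses a `charges` list as a stand-in for the real payment call.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE processed (idempotency_key TEXT PRIMARY KEY, result TEXT NOT NULL)"
)
charges = []  # stand-in for calls to a real payment provider


def charge_payment(order_id: str, idempotency_key: str) -> str:
    try:
        with conn:
            # The primary key makes the claim atomic: a duplicate key raises IntegrityError.
            conn.execute(
                "INSERT INTO processed (idempotency_key, result) VALUES (?, ?)",
                (idempotency_key, "Charged"),
            )
            # If the charge raises, the claim above is rolled back so a retry can proceed.
            charges.append(order_id)
    except sqlite3.IntegrityError:
        return "Already charged"
    return "Charged"


first = charge_payment("order-1", "key-abc")
retry = charge_payment("order-1", "key-abc")
print(first, "/", retry)  # prints: Charged / Already charged
print(charges)  # prints: ['order-1']
```

The retry returns without charging again, and the database, not application code, is what guarantees that only one caller wins.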
Trending Tech & Frameworks
- Workflow engines: Temporal, Camunda, Netflix Conductor.
- Messaging/Event platforms: Kafka, Pulsar, NATS JetStream.
- CDC/Outbox: Debezium, PostgreSQL logical replication, Kafka Connect.
- Observability: OpenTelemetry, Honeycomb—never fly blind.
Hard Lessons Learned
- There’s no free lunch: Network partitions will happen, and when they do you must choose between consistency and availability (that is the CAP trade-off). Decide per workflow which one you give up.
- Business consistency > technical consistency: What really must never break? Protect that. Accept eventual consistency elsewhere.
- Visibility is survival: Always know what step failed and why. Instrument, log, and trace every flow.
- Never cross service DB boundaries: If you need distributed joins, your boundaries aren’t real.
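For the visibility lesson above, here is one way to make every step of a flow report what happened, sketched with the standard library only. `traced_step`, the `events` list, and the saga ID are hypothetical; a real system would more likely emit spans through a tracing library such as OpenTelemetry.

```python
import logging
from contextlib import contextmanager

logger = logging.getLogger("saga")
events = []  # in-memory copy of what was logged, for inspection in this sketch


@contextmanager
def traced_step(saga_id: str, step: str):
    """Record the outcome of one saga step, tagged with the saga's correlation ID."""
    try:
        yield
    except Exception as exc:
        events.append((saga_id, step, "failed", str(exc)))
        logger.error("saga=%s step=%s status=failed error=%s", saga_id, step, exc)
        raise
    else:
        events.append((saga_id, step, "ok", ""))
        logger.info("saga=%s step=%s status=ok", saga_id, step)


try:
    with traced_step("order-42", "reserve_inventory"):
        pass  # the real step would run here
    with traced_step("order-42", "charge_payment"):
        raise TimeoutError("payment gateway timed out")
except TimeoutError:
    pass

print(events[-1])
# prints: ('order-42', 'charge_payment', 'failed', 'payment gateway timed out')
```

Tagging every record with the same saga ID is what lets you answer "which step failed, and why" for a single order across services.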
What Actually Works
Pattern | Best For | When to Use |
---|---|---|
2PC | Tightly coupled legacy systems, low scale | Almost never in modern systems |
Saga | Business workflows | Orders, bookings, payments |
Outbox Pattern | Event-driven, messaging | E-commerce, billing, logistics |
Idempotency | Every external API | Payments, notifications, actions |
Next Steps: For Builders, Founders, and Curious Minds
If you’re serious about scaling, don’t let distributed transactions be an afterthought. Prototype a simple saga. Try Temporal or build a minimal outbox flow. Instrument everything.
If you want to learn from someone who’s made these mistakes so you don’t have to, reach out. I’m always open to collaborate, brainstorm, or trade stories.
Let’s Connect
If this post helped, share it. DM me. Subscribe for real-world system design, not just theory. We build better systems together, one lesson at a time.