How To Handle Distributed Transactions in Microservices

June 28, 2025

Every engineer who moves to microservices eventually hits the “transaction wall.” The simplicity of ACID is gone. Data is scattered. Keeping business logic consistent gets messy—and how you handle it separates robust systems from future outages.


Why Distributed Transactions Are Hard

In a monolith, a single database transaction gave you ACID guarantees almost for free. Microservices remove that safety net. Each service owns its own data, its own database, and sometimes its own infrastructure. Network calls fail. Services go down. Data drifts out of sync. You can’t “just” roll back: there is no atomicity across services, and distributed systems theory (hello, CAP theorem) becomes daily reality.


The Patterns That Actually Work

1. Two-Phase Commit (2PC): Almost Never the Right Move

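    # Pseudocode for 2PC. Don't do this in microservices.

    # Phase 1: ask every participant to prepare and collect the votes
    votes = [service.prepare() for service in services]

    # Phase 2: commit only if every participant voted yes
    if all(votes):
        for service in services:
            service.commit()
    else:
        for service in services:
            service.rollback()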

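Why not? Every participant holds its locks until the coordinator announces a decision, so the protocol is blocking and slow, and the coordinator becomes a single point of failure. You’re trading one headache for another.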
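To make the blocking problem concrete, here is a minimal in-memory sketch. `Participant` and `two_phase_commit` are hypothetical names made up for illustration, not any library's API, and the "crash" is simulated by returning early. It shows what happens when the coordinator dies between the two phases.

```python
class Participant:
    """Toy resource manager: holds a lock from prepare() until commit/rollback."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"  # idle -> prepared -> committed | rolled_back
        self.locked = False

    def prepare(self):
        if not self.can_commit:
            return False  # vote "no"
        self.state, self.locked = "prepared", True
        return True  # vote "yes" and keep the lock

    def commit(self):
        self.state, self.locked = "committed", False

    def rollback(self):
        self.state, self.locked = "rolled_back", False


def two_phase_commit(participants, crash_after_prepare=False):
    # Phase 1: collect votes from every participant.
    votes = [p.prepare() for p in participants]
    if crash_after_prepare:
        # Simulate the coordinator dying before it announces a decision.
        return "coordinator_crashed"
    # Phase 2: broadcast the decision.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.rollback()
    return "rolled_back"


services = [Participant("inventory"), Participant("payment")]
outcome = two_phase_commit(services, crash_after_prepare=True)
stuck = [p.name for p in services if p.locked]
print(outcome, stuck)
```

After the simulated crash, both participants are still in the prepared state and still hold their locks. Nothing releases them until a coordinator comes back, which is exactly the failure mode that makes 2PC painful across independently deployed services.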


2. Sagas: The Practical, Scalable Pattern

Orchestrated Saga (with Temporal, Python Example)

Break a big transaction into a sequence of local transactions, one per service. If a step fails, run compensating actions that undo the steps that already succeeded.

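    from datetime import timedelta

    from temporalio import workflow

    # reserve_inventory, charge_payment, arrange_shipment, cancel_inventory and
    # refund_payment are @activity.defn functions defined elsewhere.

    @workflow.defn
    class OrderSagaWorkflow:
        @workflow.run
        async def run(self, order_id: str) -> None:
            # execute_activity requires a timeout
            opts = {"start_to_close_timeout": timedelta(seconds=30)}
            compensations = []  # undo steps for work that already succeeded
            try:
                await workflow.execute_activity(reserve_inventory, order_id, **opts)
                compensations.append(cancel_inventory)
                await workflow.execute_activity(charge_payment, order_id, **opts)
                compensations.append(refund_payment)
                await workflow.execute_activity(arrange_shipment, order_id, **opts)
            except Exception:
                # Undo only the completed steps, newest first, then fail the workflow
                for undo in reversed(compensations):
                    await workflow.execute_activity(undo, order_id, **opts)
                raise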
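The Temporal workflow needs a running Temporal server, so here is a framework-free sketch of the same compensation logic that runs as a plain script. `run_saga` and the step functions are hypothetical stand-ins, and a real orchestrator would also persist progress so a crashed saga can resume.

```python
def run_saga(steps, order_id):
    """Run (action, compensation) pairs in order; undo completed steps on failure."""
    completed = []  # compensations for steps that have succeeded so far
    try:
        for action, compensation in steps:
            action(order_id)
            completed.append(compensation)
    except Exception:
        # Compensate in reverse order, skipping the step that failed.
        for compensation in reversed(completed):
            compensation(order_id)
        return "compensated"
    return "completed"


log = []  # records which steps ran, in order


def reserve_inventory(order_id):
    log.append("reserve_inventory")


def cancel_inventory(order_id):
    log.append("cancel_inventory")


def charge_payment(order_id):
    log.append("charge_payment")


def refund_payment(order_id):
    log.append("refund_payment")


def arrange_shipment(order_id):
    raise RuntimeError("carrier unavailable")  # simulated failure


def cancel_shipment(order_id):
    log.append("cancel_shipment")


steps = [
    (reserve_inventory, cancel_inventory),
    (charge_payment, refund_payment),
    (arrange_shipment, cancel_shipment),
]
result = run_saga(steps, "order-42")
print(result, log)
```

Shipment fails, so the payment is refunded first and the inventory released second. `cancel_shipment` never runs because the shipment was never arranged.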

Choreographed Saga (Event-Driven, Node.js/Kafka Example)

There is no central coordinator. Each service reacts to events and emits the event that triggers the next step.

    // orderService.js (Node.js, pseudo-code)
    kafkaConsumer.on('order_created', async (order) => {
      try {
        await reserveInventory(order);
        // Success: emit the event that triggers the next service
        await kafkaProducer.send('inventory_reserved', order);
      } catch (err) {
        // Failure: tell the other services so they can compensate
        await kafkaProducer.send('order_failed', order);
      }
    });

  • Frameworks: Kafka Streams, Eventuate Tram, NATS JetStream.
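You can try the same choreography without a broker. The sketch below is in Python to match the other examples and uses a hypothetical in-memory `EventBus` in place of Kafka. The topic names follow the snippet above, except `payment_charged`, which is an assumed next step. Delivery here is synchronous and in-process, which a real broker is not.

```python
from collections import defaultdict


class EventBus:
    """In-memory stand-in for a message broker (synchronous delivery)."""

    def __init__(self):
        self.handlers = defaultdict(list)
        self.history = []  # every topic published, in order

    def subscribe(self, topic, handler):
        self.handlers[topic].append(handler)

    def publish(self, topic, payload):
        self.history.append(topic)
        for handler in self.handlers[topic]:
            handler(payload)


bus = EventBus()
stock = {"widget": 1}


def inventory_service(order):
    # Reacts to order_created; emits success or failure, nothing else.
    if stock.get(order["sku"], 0) > 0:
        stock[order["sku"]] -= 1
        bus.publish("inventory_reserved", order)
    else:
        bus.publish("order_failed", order)


def payment_service(order):
    # Reacts to inventory_reserved; knows nothing about the inventory service.
    bus.publish("payment_charged", order)


bus.subscribe("order_created", inventory_service)
bus.subscribe("inventory_reserved", payment_service)

bus.publish("order_created", {"id": 1, "sku": "widget"})  # stock available
bus.publish("order_created", {"id": 2, "sku": "widget"})  # out of stock
print(bus.history)
```

No component holds the whole flow in its head. The sequence exists only as the chain of events, which is why tracing matters so much more with choreography than with orchestration.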

3. Outbox Pattern: Bulletproof Event Consistency

Write the domain change and the event that describes it in the same local database transaction, then let a separate process publish the event asynchronously.

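    # Save the domain change and the event in one DB transaction
    with db.session.begin():
        order.status = "confirmed"
        # outbox: rows in an outbox table in the same database
        outbox.append({"event": "order_confirmed", "order_id": order.id})

    # Background worker publishes pending events
    def outbox_worker():
        for event in get_pending_outbox():
            kafka_producer.send(event["event"], event)
            # Mark as published only after the send. A crash in between means
            # the event is sent again, so consumers must tolerate duplicates.
            mark_outbox_event_published(event)

  • Frameworks: Debezium, Kafka Connect, Watermill.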
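For a version you can run end to end, here is a small sketch using the standard library's `sqlite3` in place of a production database. The table layout, `confirm_order`, and `relay_once` are assumptions made for illustration, and `publish` is any callable standing in for a real broker client.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL);
    CREATE TABLE outbox (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        topic TEXT NOT NULL,
        payload TEXT NOT NULL,
        published INTEGER NOT NULL DEFAULT 0
    );
    INSERT INTO orders (id, status) VALUES (1, 'pending');
""")


def confirm_order(order_id):
    # One local transaction: both rows commit or neither does.
    with conn:
        conn.execute(
            "UPDATE orders SET status = 'confirmed' WHERE id = ?", (order_id,)
        )
        conn.execute(
            "INSERT INTO outbox (topic, payload) VALUES (?, ?)",
            ("order_confirmed", json.dumps({"order_id": order_id})),
        )


def relay_once(publish):
    """Publish pending rows in insertion order; return how many were pending."""
    rows = conn.execute(
        "SELECT id, topic, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, topic, payload in rows:
        publish(topic, json.loads(payload))  # if this raises, the row stays pending
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)


sent = []  # stands in for the message broker
confirm_order(1)
first = relay_once(lambda topic, event: sent.append((topic, event)))
second = relay_once(lambda topic, event: sent.append((topic, event)))
print(first, second, sent)
```

The relay marks a row as published only after `publish` returns, so delivery is at-least-once: a crash between the two steps resends the event on the next pass. That is why the outbox pattern pairs naturally with idempotent consumers.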

4. Idempotency: Survive Retries

Every externally triggered operation should be idempotent. Clients send a unique idempotency key with each request, and services use it to detect and ignore duplicates.

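    def charge_payment(order_id, idempotency_key):
        # A retry with the same key must not charge again.
        # Note: the check and the mark must be one atomic operation in
        # production, or two concurrent retries can both pass the check.
        if already_processed(idempotency_key):
            return "Already charged"
        # process payment
        mark_processed(idempotency_key)
        return "Charged"

  • Where it matters: Payment APIs, order systems, anything with money or user state.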
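One way to make the duplicate check atomic is to let the database enforce it. The sketch below uses `sqlite3` and a primary-key constraint on the idempotency key. The `idempotency_keys` table and the `charges` list (standing in for the payment provider) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE idempotency_keys (key TEXT PRIMARY KEY, result TEXT)")

charges = []  # stands in for calls to a real payment provider


def charge_payment(order_id, idempotency_key):
    try:
        with conn:
            # Claim the key first. The PRIMARY KEY makes a duplicate claim fail.
            conn.execute(
                "INSERT INTO idempotency_keys (key, result) VALUES (?, NULL)",
                (idempotency_key,),
            )
            charges.append(order_id)  # the side effect we must not repeat
            conn.execute(
                "UPDATE idempotency_keys SET result = ? WHERE key = ?",
                ("charged", idempotency_key),
            )
    except sqlite3.IntegrityError:
        # Duplicate request: replay the stored result instead of charging again.
        row = conn.execute(
            "SELECT result FROM idempotency_keys WHERE key = ?",
            (idempotency_key,),
        ).fetchone()
        return row[0]
    return "charged"


first = charge_payment("order-42", "key-abc")
retry = charge_payment("order-42", "key-abc")
print(first, retry, charges)
```

The retry gets the same response as the original call, and the charge happens once. A real payment call sits outside your database transaction, so a production version also has to handle the case where the provider succeeded but the local commit failed, typically by passing the same key through to the provider.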

Tools Worth Knowing

  • Workflow engines: Temporal, Camunda, Netflix Conductor.
  • Messaging/Event platforms: Kafka, Pulsar, NATS JetStream.
  • CDC/Outbox: Debezium, native Postgres, Kafka Connect.
  • Observability: OpenTelemetry, Honeycomb. Never fly blind.

Hard Lessons Learned

  • There’s no free lunch: Network partitions will happen, and when they do you have to choose between consistency and availability. Make that trade-off deliberately.
  • Business consistency > technical consistency: What really must never break? Protect that. Accept eventual consistency elsewhere.
  • Visibility is survival: Always know what step failed and why. Instrument, log, and trace every flow.
  • Never cross service DB boundaries: If you need distributed joins, your boundaries aren’t real.
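As a starting point for the visibility lesson, here is a minimal sketch that wraps each saga step in a structured log line carrying a shared saga ID. It uses only the standard library. `run_step`, `ListHandler`, and the field names are hypothetical, and a production setup would more likely emit OpenTelemetry spans than JSON log lines.

```python
import json
import logging
import time
import uuid

logger = logging.getLogger("saga")
logger.setLevel(logging.INFO)


class ListHandler(logging.Handler):
    """Collects parsed log lines in memory so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(json.loads(record.getMessage()))


handler = ListHandler()
logger.addHandler(handler)


def run_step(saga_id, step_name, func, *args):
    """Run one saga step and log its outcome, error, and duration."""
    started = time.monotonic()
    entry = {"saga_id": saga_id, "step": step_name}
    try:
        result = func(*args)
        entry["status"] = "ok"
        return result
    except Exception as exc:
        entry["status"] = "failed"
        entry["error"] = f"{type(exc).__name__}: {exc}"
        raise
    finally:
        entry["duration_ms"] = round((time.monotonic() - started) * 1000, 2)
        logger.info(json.dumps(entry))


def failing_charge(order_id):
    raise RuntimeError("card declined")  # simulated failure


saga_id = str(uuid.uuid4())
run_step(saga_id, "reserve_inventory", lambda order_id: "reserved", "order-42")
try:
    run_step(saga_id, "charge_payment", failing_charge, "order-42")
except RuntimeError:
    pass
print(handler.lines)
```

Filtering logs by `saga_id` then answers the two questions that matter during an incident: which step failed, and why.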

What Actually Works

Pattern         Best For                            When to Use
--------------  ----------------------------------  --------------------------------
2PC             Tightly coupled legacy, low scale   Almost never in modern systems
Saga            Business workflows                  Orders, bookings, payments
Outbox Pattern  Event-driven, messaging             E-commerce, billing, logistics
Idempotency     Every external API                  Payments, notifications, actions

Next Steps: For Builders, Founders, and Curious Minds

If you’re serious about scaling—don’t let distributed transactions be an afterthought. Prototype a simple saga. Try Temporal or build a minimal outbox flow. Instrument everything.

If you want to learn from someone who’s made these mistakes so you don’t have to—reach out. I’m always open to collaborate, brainstorm, or trade stories.


Let’s Connect

If this post helped, share it. DM me. Subscribe for real-world system design, not just theory. We build better systems together, one lesson at a time.
