📦 The Outbox Pattern: A Review and Implementation

The Problem

Sometimes we want to perform two distinct, critical operations at once: persist something to durable storage (writing data to a database) while also sending out a notification (publishing a message to a message queue). Treating these as two separate, non-transactional calls creates a major problem:

  • You could write to the database but fail to publish to the topic.
  • The reverse can also happen: the publish succeeds but the database write fails.

Either way, you end up in an inconsistent state: you expected both operations to complete, but only one did. The result is a mess of error handling and potential data-integrity issues.
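
To make the failure mode concrete, here is a minimal Go sketch of the naive dual write. The Order type and MessageProducer interface are hypothetical stand-ins, not code from the repo:

```go
package example

import (
	"context"
	"database/sql"
	"encoding/json"
)

// Order and MessageProducer are hypothetical stand-ins for the
// application's own types; they are not taken from the repo.
type Order struct {
	ID    string  `json:"id"`
	Total float64 `json:"total"`
}

type MessageProducer interface {
	Publish(ctx context.Context, topic, key string, value []byte) error
}

// SaveOrder is the naive dual write: two independent calls with no shared
// transaction. A crash between them leaves the two systems disagreeing.
func SaveOrder(ctx context.Context, db *sql.DB, p MessageProducer, o Order) error {
	// Step 1: persist the business data.
	if _, err := db.ExecContext(ctx,
		`INSERT INTO orders (id, total) VALUES ($1, $2)`, o.ID, o.Total); err != nil {
		return err // the database write failed; nothing was published
	}
	// Step 2: publish the notification. If this fails, or the process dies
	// before it runs, the row exists but the message is never sent.
	payload, err := json.Marshal(o)
	if err != nil {
		return err
	}
	return p.Publish(ctx, "orders", o.ID, payload)
}
```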

The Proposed Solution: The Outbox Pattern

One robust solution to this is to implement what is known as the Outbox Pattern.

This pattern uses the database’s built-in transactionality to ensure atomic operations. It has three main components:

  1. The Database (DB): Your main persistent store.
  2. The Message Queue (MQ): The system for sending notifications (e.g., Kafka, RabbitMQ).
  3. The Change Data Capture (CDC) Agent: A dedicated process that monitors the DB for changes.

How It Works

  1. Atomic Write: Your database schema is augmented with an Outbox table (or equivalent structure). When you write your business data changes to the database, you include a write to the Outbox table within the same local transaction. The database ensures that either the entire write (business data + outbox entry) goes through or neither does, guaranteeing atomicity (see the Go sketch after this list).
  2. CDC Pickup: Once the database transaction commits, the Outbox table has a new entry. The CDC Agent actively monitors the table (often by tailing the transaction log through a dedicated replication slot).
  3. Message Publishing: The CDC Agent picks up the new entry, transforms the data if necessary, and reliably publishes the message to the message queue topic on your behalf.
  4. Completion: The net effect is a much stronger guarantee: when you want to write to the database and publish to a topic at the same time, either both will occur or neither will. This mitigates the chance of an incomplete state where one succeeds while the other fails.
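
And here is a minimal Go sketch of the atomic write itself, reusing the hypothetical Order type from the earlier example. The orders and outbox layouts (including the camelCase outbox column names, which come up again below) are assumptions for illustration, not the repo’s actual schema:

```go
package example

import (
	"context"
	"database/sql"
	"encoding/json"

	"github.com/google/uuid"
)

// SaveOrderAtomically writes the business row and the outbox row in a single
// local transaction: both commit together or neither does. Assumed schema
// (quoting preserves camelCase identifiers in Postgres):
//
//   CREATE TABLE outbox (
//       "id"            UUID  PRIMARY KEY,
//       "aggregateType" TEXT  NOT NULL,
//       "aggregateId"   TEXT  NOT NULL,
//       "eventType"     TEXT  NOT NULL,
//       "payload"       JSONB NOT NULL
//   );
func SaveOrderAtomically(ctx context.Context, db *sql.DB, o Order) error {
	tx, err := db.BeginTx(ctx, nil)
	if err != nil {
		return err
	}
	defer tx.Rollback() // no-op once Commit has succeeded

	// The business write and the outbox write share the same transaction.
	if _, err := tx.ExecContext(ctx,
		`INSERT INTO orders (id, total) VALUES ($1, $2)`, o.ID, o.Total); err != nil {
		return err
	}

	payload, err := json.Marshal(o)
	if err != nil {
		return err
	}
	if _, err := tx.ExecContext(ctx,
		`INSERT INTO outbox ("id", "aggregateType", "aggregateId", "eventType", "payload")
		 VALUES ($1, $2, $3, $4, $5)`,
		uuid.NewString(), "order", o.ID, "OrderCreated", payload); err != nil {
		return err
	}

	// Commit makes both rows durable atomically; Debezium later reads the new
	// outbox row from the write-ahead log and publishes it to Kafka.
	return tx.Commit()
}
```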

One Specific Implementation

I’ve implemented this pattern as an example in the repository: Bold-Blueprint

  • Database: Postgres
  • Message Queue: Kafka
  • CDC Agent: Debezium

The docker-compose.yaml file is configured to spin up the entire system. You should be able to run docker compose up -d in the directory to get everything running.
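
I won’t reproduce the repo’s file here, but as a rough, hypothetical sketch, a compose file for this stack tends to look something like the following (images, versions, and credentials are illustrative assumptions, not the repo’s actual values):

```yaml
services:
  postgres:
    image: postgres:16
    # Debezium's pgoutput plugin requires logical replication.
    command: ["postgres", "-c", "wal_level=logical"]
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: app
      POSTGRES_DB: app
    ports:
      - "5432:5432"

  kafka:
    image: apache/kafka:3.7.0
    # Single-node KRaft broker; no ZooKeeper needed.
    environment:
      KAFKA_NODE_ID: 1
      KAFKA_PROCESS_ROLES: broker,controller
      KAFKA_LISTENERS: PLAINTEXT://:9092,CONTROLLER://:9093
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
      KAFKA_CONTROLLER_LISTENER_NAMES: CONTROLLER
      KAFKA_CONTROLLER_QUORUM_VOTERS: 1@kafka:9093
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,CONTROLLER:PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1

  connect:
    # Kafka Connect worker with the Debezium connectors preinstalled.
    image: debezium/connect:2.7
    depends_on: [kafka, postgres]
    environment:
      BOOTSTRAP_SERVERS: kafka:9092
      GROUP_ID: 1
      CONFIG_STORAGE_TOPIC: connect_configs
      OFFSET_STORAGE_TOPIC: connect_offsets
      STATUS_STORAGE_TOPIC: connect_statuses
    ports:
      - "8083:8083"
```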

The main administrative script is deploy_connector.ps1, which configures Debezium (using connector.json) to monitor the outbox table in Postgres and publish the resulting messages to a topic in Kafka. A simple Go program is included to write to the database, allowing you to see the corresponding message published in Kafka.
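
The repo’s actual connector.json may differ, but a Debezium Postgres connector configured with the outbox EventRouter transform generally looks something like this (the option names are standard Debezium settings; hostnames, credentials, and table names are illustrative):

```json
{
  "name": "outbox-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "plugin.name": "pgoutput",
    "database.hostname": "postgres",
    "database.port": "5432",
    "database.user": "app",
    "database.password": "app",
    "database.dbname": "app",
    "topic.prefix": "app",
    "table.include.list": "public.outbox",
    "transforms": "outbox",
    "transforms.outbox.type": "io.debezium.transforms.outbox.EventRouter"
  }
}
```

Registering a config like this is typically a single POST to the Kafka Connect REST API on port 8083, which is presumably what deploy_connector.ps1 wraps.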


📝 Notes on Development

Vibe Coding with Gemini

Significant portions of this repo were “vibe coded” with Gemini. While I did have to intervene at several points when things weren’t working, it was a neat test to see how well Gemini could be used as a coding partner.

If you retrain your thinking to let a tool like Gemini get the busywork out of the way, it’s a pretty neat experience that frees up mental capacity for other parts of the system.

  • Example Usage: I had Gemini write up the docker-compose.yaml file section by section while I used another Gemini session to compare and contrast the options for implementing the outbox pattern.
  • Context Management: I chose to do it section by section because generative AI still has issues with long context windows, especially in agent mode. A trick that worked well was periodically asking the model to summarize the conversation so far, which I could then use as a detailed prompt when starting a new chat to ensure continuity.

Using Generative AI for Simple Scripts

Likely due to the sheer amount of example material in its training data, Gemini was very good at generating scripts. It generated basically all of the PowerShell scripts, and they were functional.

Initially, it suggested commands using Unix-style utilities like curl, which don’t behave the same way in PowerShell. When prompted, it quickly and effectively converted those commands to their PowerShell equivalents. I still don’t think it’s very good with large codebases, but for tasks where you can limit it to one or two files (and a relatively short history), generative AI seems to do pretty well.

Debugging with Gemini

While Gemini did okay debugging the Docker Compose file and the generated Go files, it struggled significantly when trying to set up a Kafka cluster that Debezium could talk to. I should have noted the specific errors, but it kept going around in circles trying to resolve them.

Ultimately, I sourced the suggested Docker Compose blocks from the official Kafka Docker Hub page and then let Gemini work with those. There were still some issues, but they were eventually resolved. At a certain point, not even agent mode will save you, but it was nice to have a solid foundation to start from.

Debezium Transforms and Casing

Originally, I had the columns in the Outbox table use snake_case (e.g., aggregate_id).

Debezium absolutely would not work when the column names were snake_case and I wanted it to transform what would ultimately be published to the topic. The raw payload field could still be interpreted, but without the transform the messages carried a lot of extraneous information.

Gemini was unable to figure out why the transform config would consistently try to read the column as aggregateid instead of aggregate_id, and I couldn’t find a clear reason either. Ultimately, I changed the aggregate_id column to camelCase (aggregateId); after renaming all of the Outbox table’s columns to camelCase, the issue was resolved.
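
For reference, the slice of an EventRouter config that names those columns looks roughly like this once the camelCase names are in place (the option keys are standard Debezium EventRouter settings; the values are illustrative):

```json
{
  "transforms": "outbox",
  "transforms.outbox.type": "io.debezium.transforms.outbox.EventRouter",
  "transforms.outbox.table.field.event.key": "aggregateId",
  "transforms.outbox.table.field.event.payload": "payload",
  "transforms.outbox.route.by.field": "aggregateType"
}
```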