Webhooks & Event-Driven Architecture

Webhooks are HTTP callbacks that let one system notify another in real time when an event occurs. In a distributed environment, treat them as an unreliable queue: deliveries can be duplicated, delayed, reordered, or lost.

🛡️ Reliable Consumption Patterns

1. The Fast Acknowledgment Pattern

To avoid timeouts and redundant retries from the provider:

  • Ingress: The API endpoint should ONLY verify the security signature and save the raw payload to a database.
  • Response: Return 200 OK immediately.
  • Processing: A background worker (e.g., Celery, BullMQ, or a goroutine) picks up the raw event and executes the business logic.
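
The three steps above can be sketched without any web framework. This is a minimal illustration, assuming SQLite as the database and HMAC-SHA256 as the signature scheme; the names SECRET, raw_events, handle_webhook, and process_pending are hypothetical and not taken from any provider's API.

```python
import hashlib
import hmac
import sqlite3

SECRET = b"example-shared-secret"  # hypothetical; load from configuration in practice


def init_db(conn: sqlite3.Connection) -> None:
    # Hypothetical table holding raw, unprocessed payloads.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_events ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " payload BLOB NOT NULL,"
        " processed INTEGER NOT NULL DEFAULT 0)"
    )


def sign(body: bytes) -> str:
    # Assumed scheme: hex-encoded HMAC-SHA256 of the raw body.
    return hmac.new(SECRET, body, hashlib.sha256).hexdigest()


def handle_webhook(conn: sqlite3.Connection, body: bytes, signature: str) -> int:
    """Ingress: verify, persist, acknowledge. No business logic runs here."""
    if not hmac.compare_digest(sign(body).encode(), signature.encode()):
        return 401
    conn.execute("INSERT INTO raw_events (payload) VALUES (?)", (body,))
    conn.commit()
    return 200  # acknowledge as soon as the payload is stored


def process_pending(conn: sqlite3.Connection, business_logic) -> int:
    """Worker: runs outside the request path and returns the number of events handled."""
    rows = conn.execute(
        "SELECT id, payload FROM raw_events WHERE processed = 0 ORDER BY id"
    ).fetchall()
    for event_id, payload in rows:
        business_logic(payload)
        conn.execute("UPDATE raw_events SET processed = 1 WHERE id = ?", (event_id,))
    conn.commit()
    return len(rows)
```

In a real deployment, handle_webhook would sit behind the HTTP endpoint and process_pending would run in the background worker; here both are plain functions so the split between the two is easy to see.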

2. Idempotency (Deduplication)

Assume you will receive duplicate events.

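  • Idempotency Key: Create a unique key by hashing provider_name and provider_event_id together, with a separator between them so that two different pairs cannot produce the same hash input.
  • Constraint: Store this key in a table with a UNIQUE constraint. If the insert fails with a constraint violation, skip processing; it’s a duplicate.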
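
A minimal sketch of this deduplication step, assuming SQLite and SHA-256. The table name processed_events, the function names, and the NUL-byte separator are illustrative choices, not part of any provider's specification.

```python
import hashlib
import sqlite3


def init_dedup_table(conn: sqlite3.Connection) -> None:
    # Hypothetical table; the UNIQUE constraint is what enforces deduplication.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_events ("
        " idempotency_key TEXT NOT NULL UNIQUE)"
    )


def idempotency_key(provider_name: str, provider_event_id: str) -> str:
    # The separator keeps ("ab", "c") and ("a", "bc") from hashing to the same key.
    # Assumes neither field contains a NUL byte.
    raw = f"{provider_name}\x00{provider_event_id}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


def claim_event(conn: sqlite3.Connection, provider_name: str, provider_event_id: str) -> bool:
    """Return True if this event is new, False if it was already recorded."""
    key = idempotency_key(provider_name, provider_event_id)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO processed_events (idempotency_key) VALUES (?)", (key,)
            )
    except sqlite3.IntegrityError:
        return False  # UNIQUE violation: duplicate delivery, skip processing
    return True
```

Letting the database reject the duplicate, rather than running a SELECT first, avoids a race between two workers that receive the same event at the same moment.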

3. State Transition Logic (Out-of-Order Events)

Webhooks are not guaranteed to arrive in the order the events were generated.

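  • Pattern: Instead of relying on timestamps, check the current Domain State and apply an event only if it represents a valid transition from that state.
  • Example: If an order status is already SHIPPED, ignore a delayed PAYMENT_SUCCESS webhook, since moving a shipped order back to a paid state is not a valid transition.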
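
One way to express this check is a small transition table. The sketch below is illustrative only: SHIPPED and PAYMENT_SUCCESS come from the example above, while PENDING, PAID, ORDER_SHIPPED, and the function name apply_event are hypothetical.

```python
from enum import Enum


class OrderStatus(Enum):
    PENDING = "PENDING"
    PAID = "PAID"
    SHIPPED = "SHIPPED"


# Which statuses may follow each status (assumed order lifecycle).
ALLOWED_TRANSITIONS = {
    OrderStatus.PENDING: {OrderStatus.PAID},
    OrderStatus.PAID: {OrderStatus.SHIPPED},
    OrderStatus.SHIPPED: set(),
}

# Which status each webhook event type tries to move the order into.
EVENT_TARGET = {
    "PAYMENT_SUCCESS": OrderStatus.PAID,
    "ORDER_SHIPPED": OrderStatus.SHIPPED,
}


def apply_event(current: OrderStatus, event_type: str) -> OrderStatus:
    """Return the new status, or the unchanged status if the event is stale or unknown."""
    target = EVENT_TARGET.get(event_type)
    if target is None or target not in ALLOWED_TRANSITIONS[current]:
        return current  # invalid transition: ignore the event
    return target
```

With this table, a late PAYMENT_SUCCESS for an order that is already SHIPPED leaves the status untouched, regardless of what timestamp the event carries.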

🔒 Security Best Practices

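  • HMAC Signatures: Always verify the X-Hub-Signature header (or the provider’s equivalent) by recomputing the HMAC over the raw request body with the shared secret and comparing the two values in constant time.
  • Replay Protection: Reject requests with a timestamp older than 5 minutes. The timestamp must be covered by the signature; otherwise an attacker can rewrite it.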
  • Allowlisting: If possible, restrict incoming webhook traffic to the provider’s known IP ranges.
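
Signature verification and replay protection can be combined in a single check. The sketch below assumes a hypothetical scheme in which the provider signs the string "timestamp.body" with HMAC-SHA256 and sends the timestamp as Unix seconds; real providers define their own header names and signed-payload formats, so consult their documentation before adapting it.

```python
import hashlib
import hmac
import time
from typing import Optional

MAX_AGE_SECONDS = 5 * 60  # the 5-minute replay window


def compute_signature(secret: bytes, timestamp: int, body: bytes) -> str:
    # Assumed format: the timestamp is part of the signed data, so it cannot be altered.
    signed = str(timestamp).encode("ascii") + b"." + body
    return hmac.new(secret, signed, hashlib.sha256).hexdigest()


def verify_request(
    secret: bytes,
    timestamp: int,
    body: bytes,
    signature: str,
    now: Optional[float] = None,
) -> bool:
    """Return True only if the request is fresh and correctly signed."""
    now = time.time() if now is None else now
    if now - timestamp > MAX_AGE_SECONDS:
        return False  # too old: possible replay
    expected = compute_signature(secret, timestamp, body)
    # compare_digest runs in constant time, which avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), signature.encode())
```

The now parameter exists only so the freshness check can be exercised deterministically; production callers would leave it unset.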

🚑 Failure & Retries

  • Exponential Backoff: When retrying failed processing, wait progressively longer between attempts, with a schedule like 1s, 5s, 30s, 5m, 30m, 2h, to avoid hammering your own services during an outage.
  • Dead Letter Queue (DLQ): Move events that still fail after all retries to a separate queue for manual intervention.
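
The two ideas above fit together as a retry loop that falls through to the DLQ. This is a simplified sketch: it blocks in-process with sleep, whereas a production worker would more likely reschedule the job on its queue. The function name process_with_retries and the list used as a DLQ are hypothetical stand-ins.

```python
import time
from typing import Any, Callable, List, Sequence

# The schedule from the text, in seconds: 1s, 5s, 30s, 5m, 30m, 2h.
BACKOFF_SCHEDULE = (1, 5, 30, 300, 1800, 7200)


def process_with_retries(
    event: Any,
    handler: Callable[[Any], None],
    dead_letters: List[Any],
    schedule: Sequence[int] = BACKOFF_SCHEDULE,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try the handler once, then once after each delay; dead-letter on final failure."""
    for delay in (0, *schedule):
        if delay:
            sleep(delay)  # wait before each retry, not before the first attempt
        try:
            handler(event)
            return True
        except Exception:
            continue  # a real worker would log the error here
    dead_letters.append(event)  # retries exhausted: park for manual intervention
    return False
```

The sleep function is injectable so the schedule can be checked without actually waiting for two hours.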

Source: Ingested from YouTube: How WebHooks Work