Webhooks & Event-Driven Architecture
Webhooks are HTTP callbacks that let one system notify another in near real time when an event occurs. In a distributed environment, treat them as an unreliable queue: deliveries can be duplicated, delayed, or arrive out of order.
🛡️ Reliable Consumption Patterns
1. The Fast Acknowledgment Pattern
To avoid timeouts and redundant retries from the provider:
- Ingress: The API endpoint should ONLY verify the security signature and save the raw payload to a database.
- Response: Return 200 OK immediately.
- Processing: A background worker (e.g., Celery, BullMQ, or a goroutine) picks up the raw event and executes the business logic.
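The three steps above can be sketched with only the Python standard library. This is a minimal illustration, not a production design: the raw_events table, the function names, and the signature_ok flag (standing in for a real signature check) are all hypothetical, and SQLite stands in for whatever database and worker system is actually in use.

```python
import sqlite3
from typing import Callable


def init_raw_events(conn: sqlite3.Connection) -> None:
    # Hypothetical table: one row per received webhook, stored unparsed.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_events ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " payload BLOB NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending')"
    )


def handle_webhook(conn: sqlite3.Connection, body: bytes, signature_ok: bool) -> int:
    """Ingress: check the signature result, persist the raw body, acknowledge."""
    if not signature_ok:
        return 401  # reject without storing anything
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO raw_events (payload) VALUES (?)", (body,))
    return 200  # acknowledge before any business logic runs


def process_pending(conn: sqlite3.Connection, handle: Callable[[bytes], None]) -> int:
    """Worker: run business logic on stored events, oldest first."""
    rows = conn.execute(
        "SELECT id, payload FROM raw_events WHERE status = 'pending' ORDER BY id"
    ).fetchall()
    processed = 0
    for event_id, payload in rows:
        handle(payload)
        with conn:
            conn.execute(
                "UPDATE raw_events SET status = 'done' WHERE id = ?", (event_id,)
            )
        processed += 1
    return processed
```

In a real service, handle_webhook would sit behind the HTTP framework's route handler and process_pending would run in the background worker, so a slow handler never delays the response to the provider.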
2. Idempotency (Deduplication)
Assume you will receive duplicate events.
- Idempotency Key: Create a unique key by hashing provider_name + provider_event_id.
- Constraint: Store this key in a table with a UNIQUE constraint. If the insert fails with a uniqueness violation, skip processing; it’s a duplicate.
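A minimal sketch of this deduplication step, assuming SQLite and SHA-256. The processed_events table and the function names are hypothetical, and the ":" separator assumes provider names never contain a colon.

```python
import hashlib
import sqlite3


def idempotency_key(provider_name: str, provider_event_id: str) -> str:
    # The separator keeps ("ab", "c") and ("a", "bc") from producing the same key.
    raw = f"{provider_name}:{provider_event_id}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


def init_processed_events(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_events ("
        " idempotency_key TEXT NOT NULL UNIQUE)"
    )


def claim_event(conn: sqlite3.Connection, provider_name: str, provider_event_id: str) -> bool:
    """Return True the first time an event is seen, False for a duplicate."""
    key = idempotency_key(provider_name, provider_event_id)
    try:
        with conn:
            conn.execute(
                "INSERT INTO processed_events (idempotency_key) VALUES (?)", (key,)
            )
    except sqlite3.IntegrityError:
        # The UNIQUE constraint rejected the row: this event was already claimed.
        return False
    return True
```

Letting the database enforce uniqueness avoids the race that a separate "check, then insert" would have when two workers receive the same event at once.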
3. State Transition Logic (Out-of-Order Events)
Webhooks are not guaranteed to arrive in the order the events were generated.
- Pattern: Instead of relying on timestamps, check the Domain State.
- Example: If an order status is already SHIPPED, ignore a delayed PAYMENT_SUCCESS webhook if it triggers an invalid transition.
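One way to express this check is an explicit table of allowed transitions. The sketch below is illustrative only: the PENDING and PAID states, the ORDER_SHIPPED event name, and the transition table itself are assumptions, and only SHIPPED and PAYMENT_SUCCESS come from the example above.

```python
from enum import Enum


class OrderStatus(Enum):
    PENDING = "PENDING"
    PAID = "PAID"
    SHIPPED = "SHIPPED"


# Hypothetical state machine: which statuses may follow the current one.
ALLOWED_TRANSITIONS = {
    OrderStatus.PENDING: {OrderStatus.PAID},
    OrderStatus.PAID: {OrderStatus.SHIPPED},
    OrderStatus.SHIPPED: set(),
}

# Hypothetical mapping from webhook event type to the status it requests.
EVENT_TARGET = {
    "PAYMENT_SUCCESS": OrderStatus.PAID,
    "ORDER_SHIPPED": OrderStatus.SHIPPED,
}


def apply_event(current: OrderStatus, event_type: str) -> OrderStatus:
    """Return the new status, or the unchanged status if the event is ignored."""
    target = EVENT_TARGET.get(event_type)
    if target is None or target not in ALLOWED_TRANSITIONS[current]:
        return current  # unknown event or invalid transition: ignore it
    return target
```

With this table, a late PAYMENT_SUCCESS for an order that is already SHIPPED leaves the status untouched, regardless of the timestamps on either event.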
🔒 Security Best Practices
- HMAC Signatures: Always verify the X-Hub-Signature (or equivalent) using a shared secret.
- Replay Protection: Reject requests with a timestamp older than 5 minutes.
- Allowlisting: If possible, restrict incoming webhook traffic to the provider’s known IP ranges.
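Signature verification and replay protection can be combined in one check, sketched below. The signing scheme here (HMAC-SHA256 over "timestamp.body", hex-encoded) is an assumption for illustration; real providers differ in header names, hash algorithm, and what exactly is signed, so the provider's documentation is the authority. The function names are hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional

MAX_AGE_SECONDS = 5 * 60  # the 5-minute replay window from the list above


def sign(secret: bytes, timestamp: int, body: bytes) -> str:
    # Assumed scheme: the timestamp is signed together with the body,
    # so an attacker cannot replay an old body with a fresh timestamp.
    message = str(timestamp).encode("ascii") + b"." + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(
    secret: bytes,
    timestamp: int,
    body: bytes,
    signature: str,
    now: Optional[float] = None,
) -> bool:
    """Return True only for a fresh request with a valid signature."""
    now = time.time() if now is None else now
    if now - timestamp > MAX_AGE_SECONDS:
        return False  # too old: possible replay
    expected = sign(secret, timestamp, body)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

The check must run against the raw request bytes; re-serializing parsed JSON can change whitespace or key order and break an otherwise valid signature.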
🚑 Failure & Retries
- Exponential Backoff: Use a schedule like 1s, 5s, 30s, 5m, 30m, 2h to avoid hammering your own services during an outage.
- Dead Letter Queue (DLQ): Move events that fail after retries to a separate queue for manual intervention.
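The retry schedule and the dead letter queue can be sketched together. This is a simplified in-process version under stated assumptions: the function name is hypothetical, a plain list stands in for the DLQ, and the sleep call is injectable so the example can run without waiting. A real worker would more likely reschedule the job through its queue system than block between attempts.

```python
import time
from typing import Any, Callable, List

# The schedule from the list above, in seconds: 1s, 5s, 30s, 5m, 30m, 2h.
RETRY_DELAYS_SECONDS = [1, 5, 30, 5 * 60, 30 * 60, 2 * 60 * 60]


def process_with_retries(
    event: Any,
    handler: Callable[[Any], None],
    dead_letters: List[Any],
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try once, then once more after each delay; dead-letter on final failure."""
    for delay in [0] + RETRY_DELAYS_SECONDS:
        if delay:
            sleep(delay)  # back off before the next attempt
        try:
            handler(event)
            return True
        except Exception:
            continue  # a real worker would also log the error here
    dead_letters.append(event)  # retries exhausted: park for manual intervention
    return False
```

With this schedule an event gets one initial attempt plus six retries before it is moved to the dead letter list.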
Source: Ingested from YouTube: How WebHooks Work