TL;DR: Orchestrate in n8n, compute in Postgres. Batch hard (100–500 rows). Use INSERT … ON CONFLICT for upserts, dedupe with keys, prefer ordered IDs (ULID or UUID v7), and stream via webhooks with idempotency
Architecture Split
What you’ll learn: How to split work between n8n and Postgres for speed, stability, and low latency
Naive imports push every transform and write through n8n one row at a time. That burns memory and network bandwidth, and it thrashes indexes
- Treat n8n as the conductor, not the orchestra
- Let Postgres do heavy lifting: upserts, filtering, ordering, joins
- Keep payloads small and push arrays in batches, not singletons
A clean split works better
What belongs where
n8n
- Orchestration, retries, rate limiting, batching, fan-out and fan-in
- Light transforms and validation
- Webhooks and workflow state
Postgres and Supabase
- Upserts with ON CONFLICT, constraints, dedupe enforcement
- Window functions for cross-row calculations, projections, server-side filtering
- Materialized views for read models (stored query results)
Keep compute in the database and keep the workflow lean
Common patterns
| Pattern | When | Notes |
|---|---|---|
| Webhooks to DB | Source can push events | Lowest latency, requires idempotency (safe to retry without new effects) |
| Polling to Batches | No webhooks available | Simple to start, reconcile on a schedule |
| Supabase REST | Need managed auth and Row Level Security (per-user row policies) | Great for CRUD, bulk via upsert RPC or SQL |
| Direct SQL | Need maximum throughput | Fewer hops, mind connection pooling |
When uncertain, start with REST or Webhooks, then evolve to direct SQL for hot paths
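The Supabase REST path from the table above maps to one HTTP call per batch. Here is a minimal sketch of that request, written as a standalone Node script so the shape is explicit; the project URL, key, and items table are placeholders, and in n8n you would set the same URL, headers, and JSON body on an HTTP Request node
// Bulk upsert a batch through Supabase's PostgREST endpoint (placeholder project values)
const SUPABASE_URL = 'https://your-project.supabase.co' // placeholder
const SUPABASE_KEY = process.env.SUPABASE_SERVICE_KEY   // placeholder credential

async function upsertBatch(rows) {
  // on_conflict names the unique key; Prefer: resolution=merge-duplicates turns insert into upsert
  const res = await fetch(`${SUPABASE_URL}/rest/v1/items?on_conflict=external_id`, {
    method: 'POST',
    headers: {
      apikey: SUPABASE_KEY,
      Authorization: `Bearer ${SUPABASE_KEY}`,
      'Content-Type': 'application/json',
      Prefer: 'resolution=merge-duplicates',
    },
    body: JSON.stringify(rows),
  })
  if (!res.ok) throw new Error(`Upsert failed: ${res.status} ${await res.text()}`)
}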
flowchart TD
A[Source Data] --> B[n8n Orchestrate]
B --> C[Build SQL]
C --> D[Postgres Upsert]
D --> E[Read Model]
classDef trigger fill:#e1f5fe,stroke:#01579b
classDef process fill:#fff3e0,stroke:#ef6c00
classDef action fill:#e8f5e8,stroke:#2e7d32
class B trigger
class C process
class D action
With roles set, the next step is bulk upserts and dedupe
Bulk Upserts
What you’ll learn: How to upsert in batches, guard freshness, and dedupe before writes
Row-by-row INSERT or UPDATE in a loop is the fastest way to go slow. You will hit API limits and saturate n8n memory
- Use a single INSERT … ON CONFLICT per batch
- Deduplicate inside n8n before hitting the database
- Gate updates by freshness so newer rows win
ON CONFLICT basics
-- Conflict on a unique key (external_id or a composite)
INSERT INTO public.items (
  external_id, name, email, updated_at, payload
) VALUES
  -- values expanded by n8n per batch
  ($1,$2,$3,$4,$5), ($6,$7,$8,$9,$10)
ON CONFLICT (external_id) DO UPDATE SET
  name = EXCLUDED.name,
  email = EXCLUDED.email,
  updated_at = EXCLUDED.updated_at,
  payload = EXCLUDED.payload
WHERE EXCLUDED.updated_at > items.updated_at; -- only update when the incoming row is newer
Gotchas
- The conflict target must be covered by a UNIQUE index or constraint
- Updating unchanged columns still rewrites the whole row, so keep the SET list selective
- Add a WHERE guard to prevent stale overwrites
Default to a single conflict key when possible
Upsert keys
| Scenario | Conflict Target | Why |
|---|---|---|
| Third-party master ID | UNIQUE(external_id) | Simple and stable from source |
| No stable ID | UNIQUE(source, natural_key1, natural_key2) | Composite natural identity |
| Time-series facts | UNIQUE(external_id, period_start) | One record per entity and period |
Keep keys narrow to reduce index size
Dedupe in n8n
An upsert inserts a row, or updates the existing one when the unique key already matches
Remove Duplicates settings
- Mode: within current input to collapse repeats
- Scope: workflow if you want cross-run idempotency
- Key: external_id or a normalized composite
Additional tips
- Normalize values before comparing, for example lower-case emails and trim whitespace (see the sketch after this list)
- Keep a small rolling cache when payloads are large using a Code node with an LRU map
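A minimal Code-node sketch of the normalization and in-batch dedupe step, assuming each item carries external_id, email, name, and source fields (adjust the key to your own composite)
// Code node: normalize values, then drop repeats within this run
const seen = new Set()
const out = []
for (const item of items) {
  const row = { ...item.json }
  row.email = (row.email || '').trim().toLowerCase() // normalize before comparing
  row.name = (row.name || '').trim()
  const key = row.external_id ?? `${row.source}:${row.email}` // fall back to a composite key
  if (seen.has(key)) continue // duplicate inside the batch, skip it
  seen.add(key)
  out.push({ json: row })
}
return out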
Batch the SQL in n8n
- HTTP Request to fetch page N
- Code to map to database columns
- SplitInBatches with size 100–500
- Code to build the VALUES list and parameter array (see the sketch after this list)
- Postgres or Supabase node for a single INSERT … ON CONFLICT
- If a batch errors, Wait with exponential backoff and retry up to 3 times
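A sketch of the build-SQL step, assuming the columns from the earlier upsert example; how the statement and parameter array feed the Postgres node's query-parameters option varies by node version
// Code node: turn one batch into a single parameterized INSERT … ON CONFLICT
const cols = ['external_id', 'name', 'email', 'updated_at', 'payload']
const rows = items.map(item => item.json)

const params = []
const valuesList = rows.map((row, r) => {
  const placeholders = cols.map((col, c) => {
    params.push(col === 'payload' ? JSON.stringify(row[col]) : row[col])
    return `$${r * cols.length + c + 1}` // $1..$n across the whole batch
  })
  return `(${placeholders.join(',')})`
})

const query = `
  INSERT INTO public.items (${cols.join(', ')})
  VALUES ${valuesList.join(', ')}
  ON CONFLICT (external_id) DO UPDATE SET
    name = EXCLUDED.name,
    email = EXCLUDED.email,
    updated_at = EXCLUDED.updated_at,
    payload = EXCLUDED.payload
  WHERE EXCLUDED.updated_at > items.updated_at`

// Hand both to the next Postgres node (Execute Query with query parameters)
return [{ json: { query, params } }]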
flowchart TD
A[Fetch Page] --> B[Normalize]
B --> C[Dedupe]
C --> D[Batch 100-500]
D --> E[Build SQL]
E --> F[Upsert]
F --> G{Error?}
G -->|Yes| H[Wait and Retry]
G -->|No| I[Notify]
classDef trigger fill:#e1f5fe,stroke:#01579b
classDef process fill:#fff3e0,stroke:#ef6c00
classDef action fill:#e8f5e8,stroke:#2e7d32
classDef alert fill:#f3e5f5,stroke:#7b1fa2
class A trigger
class B process
class C process
class D process
class E process
class F action
class H alert
class I action
With writes consistent, the next step is controlling rate and concurrency
Rate Limits
What you’ll learn: How to throttle APIs, pick batch sizes, and avoid spikes
APIs, n8n, and database pools all have ceilings. Naive loops spike then stall
- Batch requests and inserts
- Add spacing between bursts
- Use per-API throttles, not one global brake
Built-in helpers
- SplitInBatches to control memory and concurrency per chunk
- Wait to add 200–1000 ms between batches for strict APIs
- Retry on Fail with backoff 1s then 2s then 4s capped near 30s
- HTTP Request batching via items per batch and batch interval
Use node-level features first, then code if needed
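When the node-level retry options are not enough, a small Code node can compute the next delay for a downstream Wait node; the attempt counter here is an assumed field carried on each item
// Code node: exponential backoff with jitter, 1s → 2s → 4s …, capped near 30s
return items.map(item => {
  const attempt = item.json.attempt ?? 0
  const waitMs = Math.min(1000 * 2 ** attempt, 30_000) + Math.floor(Math.random() * 250)
  return { json: { ...item.json, attempt: attempt + 1, waitMs } } // feed waitMs to a Wait node
})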
Batch size design
- 100–500 items per database upsert in most cases
- 20–100 items per outbound API call when payloads are small
- Aim for per-row p50 latency times batch size of roughly 0.5–2 seconds per burst (for example, 200 rows at 5 ms each is about 1 second)
- Keep a soft ceiling of 5–10 concurrent executions for bursty flows
Rolling-window throttle
// Code node: per-API rolling-window throttle (sliding log of call timestamps)
// Rolling window: allow at most `max` calls in any `windowMs` span, waiting out the excess
const limit = { max: 60, windowMs: 60_000 } // 60 req per 60s for this API
const key = 'crm_rate_window' // one key per API keeps throttles independent
const now = Date.now()
const ts = (getWorkflowStaticData('global')[key] ||= []) // timestamps of recent calls
while (ts.length && now - ts[0] > limit.windowMs) ts.shift() // drop calls outside the window
if (ts.length >= limit.max) {
  // Window is full: wait until the oldest call ages out, then record this one
  const waitMs = limit.windowMs - (now - ts[0]) + 50
  return new Promise(res => setTimeout(() => { ts.push(Date.now()); res(items) }, waitMs))
}
ts.push(now)
return items
flowchart TD
A[Call In] --> B{Token Available?}
B -->|Yes| C[Proceed]
B -->|No| D[Wait]
D --> B
classDef process fill:#fff3e0,stroke:#ef6c00
classDef action fill:#e8f5e8,stroke:#2e7d32
class A process
class C action
class D process
Performance tip: throttle per API, not globally, to keep fast services fast while respecting strict ones
With throughput stable, the next step is choosing ID strategies and a schema that scale
IDs and Schema
What you’ll learn: How to pick ordered IDs, index wisely, and model JSONB versus normalized tables
Random UUID v4 scatters inserts across the index. Under load the index fragments and writes slow down
- Prefer ordered IDs such as ULID or UUID v7 for append-like index behavior
- Separate ingest tables from read-optimized tables when needed
- Index only what you read or collide on
ID choices
ULID is a time-ordered unique ID; UUID v7 is a time-ordered UUID with native generator support arriving in recent Postgres releases
| ID Type | Ordering | Notes |
|---|---|---|
| UUID v4 | None | Native and easy, worst for hot writes |
| ULID | Time ordered | Store as text or bytea, sortable, needs generator |
| UUID v7 | Time ordered | Stored in the plain uuid type, native generator in the newest Postgres releases (else generate in app), best balance for write and sort |
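If the database cannot generate ordered IDs yet, a Code node can. A minimal UUID v7 sketch following the RFC 9562 layout; Math.random keeps it dependency-free, so swap in crypto randomness for production use
// Code node: dependency-free UUID v7 (48-bit ms timestamp + version + random bits)
function uuidv7() {
  const bytes = new Uint8Array(16)
  let ts = Date.now()
  for (let i = 5; i >= 0; i--) { // bytes 0-5: big-endian millisecond timestamp
    bytes[i] = ts % 256
    ts = Math.floor(ts / 256)
  }
  for (let i = 6; i < 16; i++) bytes[i] = Math.floor(Math.random() * 256) // random tail
  bytes[6] = (bytes[6] & 0x0f) | 0x70 // version 7
  bytes[8] = (bytes[8] & 0x3f) | 0x80 // RFC variant
  const hex = [...bytes].map(b => b.toString(16).padStart(2, '0')).join('')
  return `${hex.slice(0, 8)}-${hex.slice(8, 12)}-${hex.slice(12, 16)}-${hex.slice(16, 20)}-${hex.slice(20)}`
}

return items.map(item => ({ json: { ...item.json, id: uuidv7() } }))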
Practical DDL
-- Conflict or lookup index kept narrow
CREATE UNIQUE INDEX uq_items_external ON public.items (external_id);
-- Ordered ID as primary key (generated in app or via extension)
-- CREATE TABLE public.items (
-- id uuid PRIMARY KEY DEFAULT uuid_v7(),
-- ...
-- );
-- Ingest versus read model
CREATE MATERIALIZED VIEW public.items_read AS
SELECT id, name, email
FROM public.items
WHERE deleted_at IS NULL;
Rules of thumb
- Keep conflict target columns short and selective
- Avoid wide composite primary keys on hot tables
- Use JSONB for raw payloads but index only the paths you filter on
JSONB or normalized
| Choice | When | Tradeoffs |
|---|---|---|
| JSONB ingest plus targeted indexes | Upstream shape shifts often | Flexible and fewer migrations, but JSON path indexes add write cost |
| Normalized columns | Stable schema and heavy querying | Fast reads and type safety, more migration churn when shapes change |
erDiagram
Item {
string id
string external_id
string name
string email
datetime updated_at
string payload
}
EventLog {
string event_id
datetime received_at
}
With a solid schema in place, the next step is wiring near real-time webhooks with idempotency
Webhooks Flow
What you’ll learn: How to build an idempotent webhook pipeline with a reconciliation poll
Polling works, yet webhooks cut latency and API calls. Replays happen, so design for idempotency
- Prefer webhooks for near real-time
- Store event IDs to drop repeats safely
- Run a daily or weekly reconciliation poll to close gaps
Idempotent handler
-- Event sink to dedupe deliveries
CREATE TABLE IF NOT EXISTS public.event_log (
  event_id text PRIMARY KEY,
  received_at timestamptz NOT NULL DEFAULT now()
);
// n8n Webhook then Code guard then Postgres
// Idempotency: safe to retry without extra effects
const id = $json.event_id // stable from source
// Try insert; if conflict, short-circuit the workflow
// Use a Postgres node:
// INSERT INTO event_log(event_id) VALUES($1) ON CONFLICT DO NOTHING
// If affectedRows === 0 then already processed, return early
- Use event_id plus source in a composite when IDs are not globally unique
- Keep a TTL policy if the event space is large, for example 30–90 days
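A guard sketch for the step after that insert, assuming the query ends with RETURNING event_id so a conflicting delivery yields zero rows; how your Postgres node version surfaces that result may differ
// Code node: stop the run when the event was already logged
// Upstream query: INSERT INTO event_log(event_id) VALUES($1)
//                 ON CONFLICT DO NOTHING RETURNING event_id
const fresh = items.filter(item => item.json.event_id)
if (fresh.length === 0) {
  return [] // duplicate delivery: no items means downstream nodes do nothing
}
return fresh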
Hybrid model
- Webhook for create, update, and delete events
- Nightly poll to reconcile missed events using updated_at or checkpoints
- Emit metrics for counts, lag, and error rate
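A checkpoint sketch for the reconciliation poll, using workflow static data as the cursor; static data only persists on production executions, and a real flow should advance the cursor only after the poll succeeds
// Code node: read the last reconciliation cursor and hand it to the poll query
const state = getWorkflowStaticData('global')
const since = state.lastReconciledAt || '1970-01-01T00:00:00Z'
state.lastReconciledAt = new Date().toISOString() // simplification: advanced up front
// Downstream nodes filter on updated_at > since
return [{ json: { since } }]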
Reference flow
flowchart TD
A[Webhook or Poll] --> B[Normalize]
B --> C[Dedupe]
C --> D[Batch 100-500]
D --> E[Build SQL]
E --> F[Upsert]
F --> G{Error?}
G -->|Yes| H[Backoff]
H --> F
G -->|No| I[Notify]
H -->|Max retries| J[Dead Letter]
classDef trigger fill:#e1f5fe,stroke:#01579b
classDef process fill:#fff3e0,stroke:#ef6c00
classDef action fill:#e8f5e8,stroke:#2e7d32
classDef alert fill:#f3e5f5,stroke:#7b1fa2
class A trigger
class B process
class C process
class D process
class E process
class F action
class H alert
class I action
class J alert
Scale from 10k to 1M rows
- Scale batches from 100 to 250 to 500 while watching database write time and CPU
- Raise pool size only when the database is underutilized to avoid connection storms
- Reindex or run CLUSTER off-hours if UUID v4 is unavoidable
- Add partial indexes that match read filters and avoid over-indexing ingest
- Track p95 end-to-end latency and error budgets, then tune the bottleneck
Quick recap
- Batch everything at 100–500
- Use INSERT … ON CONFLICT with a freshness guard
- Pick strong dedupe keys and dedupe in-workflow
- Prefer ordered IDs such as ULID or UUID v7
- Use webhooks for real-time plus a reconciliation poll
- Rate-limit with rolling windows per API
Next steps: ship a small pilot flow with about 1k rows, measure, then scale. Keep IDs ordered, conflicts tight, and batches boring. Boring at scale is beautiful