10 min read

n8n + Supabase/Postgres: Upserts, Rate Limits, Webhooks


💡

TL;DR: Orchestrate in n8n, compute in Postgres. Batch hard (100–500). Use INSERT … ON CONFLICT for upserts, dedupe with keys, prefer ordered IDs (ULID or UUID v7), and stream via webhooks with idempotency

Architecture Split

What you’ll learn: How to split work between n8n and Postgres for speed, stability, and low latency

Naive imports push every transform and write through n8n one row at a time. That burns memory and network bandwidth, and it thrashes database indexes

  • Treat n8n as the conductor, not the orchestra
  • Let Postgres do heavy lifting: upserts, filtering, ordering, joins
  • Keep payloads small and push arrays in batches, not singletons

A clean split works better

What belongs where

n8n

  • Orchestration, retries, rate limiting, batching, fan-out and fan-in
  • Light transforms and validation
  • Webhooks and workflow state

Postgres and Supabase

  • Upserts with ON CONFLICT, constraints, dedupe enforcement
  • Window functions for cross-row calculations, projections, server-side filtering
  • Materialized views for read models (stored query results)

Keep compute in the database and keep the workflow lean

Common patterns

Pattern | When | Notes
Webhooks to DB | Source can push events | Lowest latency, requires idempotency (safe to retry without new effects)
Polling to Batches | No webhooks available | Simple to start, reconcile on a schedule
Supabase REST | Need managed auth and Row Level Security (per-user row policies) | Great for CRUD, bulk via upsert RPC or SQL
Direct SQL | Need maximum throughput | Fewer hops, mind connection pooling

When uncertain, start with REST or Webhooks, then evolve to direct SQL for hot paths
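
For the REST starting point, here is a minimal sketch of a batched upsert through Supabase's PostgREST endpoint. The table name, column values, and environment variable names are assumptions; in n8n the same URL, headers, and JSON body map onto an HTTP Request node

// Sketch: bulk upsert via Supabase REST (PostgREST); rows is one batch of objects
// SUPABASE_URL and SUPABASE_SERVICE_KEY are assumed environment variables
const rows = [
  { external_id: 'a1', name: 'Ada', email: 'ada@example.com', updated_at: new Date().toISOString() },
]
const res = await fetch(`${process.env.SUPABASE_URL}/rest/v1/items?on_conflict=external_id`, {
  method: 'POST',
  headers: {
    apikey: process.env.SUPABASE_SERVICE_KEY,
    Authorization: `Bearer ${process.env.SUPABASE_SERVICE_KEY}`,
    'Content-Type': 'application/json',
    Prefer: 'resolution=merge-duplicates',  // upsert instead of plain insert
  },
  body: JSON.stringify(rows),  // one request per batch of 100-500 rows
})
if (!res.ok) throw new Error(`Upsert failed: ${res.status}`)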

flowchart TD
    A[Source Data] --> B[n8n Orchestrate]
    B --> C[Build SQL]
    C --> D[Postgres Upsert]
    D --> E[Read Model]

    classDef trigger fill:#e1f5fe,stroke:#01579b
    classDef process fill:#fff3e0,stroke:#ef6c00
    classDef action fill:#e8f5e8,stroke:#2e7d32

    class B trigger
    class C process
    class D action

Transition: With roles set, move to bulk upserts and dedupe

Bulk Upserts

What you’ll learn: How to upsert in batches, guard freshness, and dedupe before writes

Row-by-row INSERT or UPDATE in a loop is the fastest way to go slow. You will hit API limits and saturate n8n memory

  • Use a single INSERT … ON CONFLICT per batch
  • Deduplicate inside n8n before hitting the database
  • Gate updates by freshness so newer rows win

ON CONFLICT basics

-- Conflict on a unique key (external_id or a composite)
INSERT INTO public.items (
  external_id, name, email, updated_at, payload
) VALUES
  -- values expanded by n8n per batch
  ($1,$2,$3,$4,$5), ($6,$7,$8,$9,$10)
ON CONFLICT (external_id) DO UPDATE SET
  name       = EXCLUDED.name,
  email      = EXCLUDED.email,
  updated_at = EXCLUDED.updated_at,
  payload    = EXCLUDED.payload
WHERE EXCLUDED.updated_at > items.updated_at  -- only if newer

Gotchas

  • The conflict target must be UNIQUE via index or constraint
  • Updating many columns can bloat row size, be selective
  • Add a WHERE guard to prevent stale overwrites

Default to a single conflict key when possible

Upsert keys

Scenario | Conflict Target | Why
Third-party master ID | UNIQUE(external_id) | Simple and stable from source
No stable ID | UNIQUE(source, natural_key1, natural_key2) | Composite natural identity
Time-series facts | UNIQUE(external_id, period_start) | One record per entity and period

Keep keys narrow to reduce index size
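
As a sketch of the second and third rows, with hypothetical table and column names:

-- No stable ID: composite natural identity as the conflict target
ALTER TABLE public.items
  ADD CONSTRAINT uq_items_natural UNIQUE (source, natural_key1, natural_key2);

-- Time-series facts: one row per entity and period (hypothetical metrics table)
CREATE UNIQUE INDEX uq_metrics_entity_period
  ON public.item_metrics (external_id, period_start);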

Dedupe in n8n

Explain once: Upsert means insert or update based on a unique key

Remove Duplicates settings

  1. Mode: within current input to collapse repeats
  2. Scope: workflow if you want cross-run idempotency
  3. Key: external_id or a normalized composite

Additional tips

  • Normalize values before comparing, for example lower-case emails and trim whitespace
  • Keep a small rolling cache when payloads are large using a Code node with an LRU map
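
A minimal Code-node sketch of the normalization step; the field names (email, name, source) are assumptions, adjust to your payload

// Code node: normalize before the Remove Duplicates node
return items.map(({ json }) => {
  const email = (json.email || '').trim().toLowerCase()
  return {
    json: {
      ...json,
      email,
      name: (json.name || '').trim(),
      // composite key for Remove Duplicates when there is no single stable ID
      dedupe_key: `${json.source || 'unknown'}:${email}`,
    },
  }
})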

Batch the SQL in n8n

  1. HTTP Request to fetch page N
  2. Code to map to database columns
  3. SplitInBatches with size 100–500
  4. Code to build the VALUES list and parameter array (see the sketch after this list)
  5. Postgres or Supabase node for a single INSERT … ON CONFLICT
  6. IF errors then Wait with exponential backoff and retry up to 3 times
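
A sketch of step 4, assuming the column list from the upsert example above:

// Code node: build one parameterized INSERT … ON CONFLICT for the whole batch
const cols = ['external_id', 'name', 'email', 'updated_at', 'payload']
const params = []
const rows = items.map(({ json }) =>
  '(' + cols.map((c) => { params.push(json[c]); return '$' + params.length }).join(', ') + ')'
)
const sql = `
  INSERT INTO public.items (${cols.join(', ')})
  VALUES ${rows.join(', ')}
  ON CONFLICT (external_id) DO UPDATE SET
    name = EXCLUDED.name,
    email = EXCLUDED.email,
    updated_at = EXCLUDED.updated_at,
    payload = EXCLUDED.payload
  WHERE EXCLUDED.updated_at > items.updated_at`
// Pass sql and params to the next node, for example a Postgres node running the query
return [{ json: { sql, params } }]
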
flowchart TD
    A[Fetch Page] --> B[Normalize]
    B --> C[Dedupe]
    C --> D[Batch 100-500]
    D --> E[Build SQL]
    E --> F[Upsert]
    F --> G{Error?}
    G -->|Yes| H[Wait and Retry]
    G -->|No| I[Notify]

    classDef trigger fill:#e1f5fe,stroke:#01579b
    classDef process fill:#fff3e0,stroke:#ef6c00
    classDef action fill:#e8f5e8,stroke:#2e7d32
    classDef alert fill:#f3e5f5,stroke:#7b1fa2

    class A trigger
    class B process
    class C process
    class D process
    class E process
    class F action
    class H alert
    class I action

Transition: After consistent writes, control rate and concurrency

Rate Limits

What you’ll learn: How to throttle APIs, pick batch sizes, and avoid spikes

APIs, n8n, and database pools all have ceilings. Naive loops spike then stall

  • Batch requests and inserts
  • Add spacing between bursts
  • Use per-API throttles, not one global brake

Built-in helpers

  • SplitInBatches to control memory and concurrency per chunk
  • Wait to add 200–1000 ms between batches for strict APIs
  • Retry on Fail with backoff 1s then 2s then 4s capped near 30s
  • HTTP Request batching via items per batch and batch interval

Use node-level features first, then code if needed

Batch size design

  • 100–500 items per database upsert in most cases
  • 20–100 items per outbound API call when payloads are small
  • Aim for p50 latency times batch size of roughly 0.5–2 seconds per burst, for example about 2 ms per item times 250 items lands near 0.5 seconds
  • Keep a soft ceiling of 5–10 concurrent executions for bursty flows

Rolling-window throttle

// Code node: per-API token bucket style throttle
// Token bucket: a rate limiter that lets short bursts then refills over time
const limit = { max: 60, windowMs: 60_000 }  // 60 req per 60s
const key = 'crm_rate_window'
const now = Date.now()
// Rolling list of request timestamps for this API, persisted across runs
const ts = (getWorkflowStaticData('global')[key] ||= [])
// Drop timestamps that have fallen out of the window
while (ts.length && now - ts[0] > limit.windowMs) ts.shift()
if (ts.length >= limit.max) {
  // Window is full: wait until the oldest timestamp expires, then proceed
  const waitMs = limit.windowMs - (now - ts[0]) + 50
  return new Promise(res => setTimeout(() => { ts.shift(); ts.push(Date.now()); res(items) }, waitMs))
}
ts.push(now)
return items

flowchart TD
    A[Call In] --> B{Token Available?}
    B -->|Yes| C[Proceed]
    B -->|No| D[Wait]
    D --> B

    classDef process fill:#fff3e0,stroke:#ef6c00
    classDef action fill:#e8f5e8,stroke:#2e7d32

    class A process
    class C action
    class D process
💡

Performance tip: throttle per API, not globally, to keep fast services fast while respecting strict ones

Transition: With stable throughput, choose ID strategies and schema that scale

IDs and Schema

What you’ll learn: How to pick ordered IDs, index wisely, and model JSONB versus normalized tables

Random UUID v4 scatters inserts across the index. Under load that fragments the index and slows writes

  • Prefer ordered IDs such as ULID or UUID v7 for append-like index behavior
  • Separate ingest tables from read-optimized tables when needed
  • Index only what you read or collide on

ID choices

Explain once: ULID is a time-ordered unique ID, UUID v7 is a time-ordered UUID that newer databases can generate natively

ID Type | Ordering | Notes
UUID v4 | None | Native and easy, worst for hot writes
ULID | Time ordered | Store as text or bytea, sortable, needs a generator
UUID v7 | Time ordered | Native generation in newer versions, best balance for write and sort
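
If the database cannot generate ordered IDs yet, here is a minimal UUID v7 sketch for a Code node. It assumes crypto.getRandomValues is available in your runtime; a vetted library or database-side generation is preferable

// Sketch: RFC 9562 UUID v7, a 48-bit millisecond timestamp followed by random bits
function uuidv7() {
  const bytes = new Uint8Array(16)
  crypto.getRandomValues(bytes)                      // random tail
  const ts = BigInt(Date.now())                      // unix time in ms
  for (let i = 0; i < 6; i++) bytes[i] = Number((ts >> BigInt(8 * (5 - i))) & 0xffn)
  bytes[6] = (bytes[6] & 0x0f) | 0x70                // version 7
  bytes[8] = (bytes[8] & 0x3f) | 0x80                // RFC variant
  const hex = [...bytes].map((b) => b.toString(16).padStart(2, '0')).join('')
  return `${hex.slice(0, 8)}-${hex.slice(8, 12)}-${hex.slice(12, 16)}-${hex.slice(16, 20)}-${hex.slice(20)}`
}
return items.map(({ json }) => ({ json: { id: uuidv7(), ...json } }))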

Practical DDL

-- Conflict or lookup index kept narrow
CREATE UNIQUE INDEX uq_items_external ON public.items (external_id);

-- Ordered ID as primary key (generated in app or via extension)
-- CREATE TABLE public.items (
--   id uuid PRIMARY KEY DEFAULT uuid_v7(),
--   ...
-- );

-- Ingest versus read model
CREATE MATERIALIZED VIEW public.items_read AS
SELECT id, name, email
FROM public.items
WHERE deleted_at IS NULL;
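
Materialized views do not refresh themselves; a hedged refresh step, scheduled from n8n or pg_cron. CONCURRENTLY avoids blocking readers but needs a unique index on the view

-- Unique index required for a concurrent refresh
CREATE UNIQUE INDEX uq_items_read_id ON public.items_read (id);

REFRESH MATERIALIZED VIEW CONCURRENTLY public.items_read;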

Rules of thumb

  • Keep conflict target columns short and selective
  • Avoid wide composite primary keys on hot tables
  • Use JSONB for raw payloads but index only the paths you filter on
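
For the last point, a sketch assuming the workflow filters on a status key inside payload:

-- Expression index on one JSON path used in WHERE clauses
CREATE INDEX idx_items_payload_status ON public.items ((payload->>'status'));

-- Or a GIN index for containment queries such as payload @> '{"status": "active"}'
CREATE INDEX idx_items_payload_gin ON public.items USING gin (payload jsonb_path_ops);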

JSONB or normalized

Choice | When | Tradeoffs
JSONB ingest plus targeted indexes | Upstream shape shifts often | Flexible and fewer migrations, but JSON path indexes add write cost
Normalized columns | Stable schema and heavy querying | Fast reads and type safety, more migration churn when shapes change

erDiagram
    Item {
        string id
        string external_id
        string name
        string email
        datetime updated_at
        string payload
    }

    EventLog {
        string event_id
        datetime received_at
    }

Transition: With a solid schema, wire near real-time webhooks with idempotency

Webhooks Flow

What you’ll learn: How to build an idempotent webhook pipeline with a reconciliation poll

Polling works, yet webhooks cut latency and API calls. Replays happen, so design for idempotency

  • Prefer webhooks for near real-time
  • Store event IDs to drop repeats safely
  • Run a daily or weekly reconciliation poll to close gaps

Idempotent handler

-- Event sink to dedupe deliveries
CREATE TABLE IF NOT EXISTS public.event_log (
  event_id text PRIMARY KEY,
  received_at timestamptz NOT NULL DEFAULT now()
);

// n8n: Webhook, then a Code guard, then Postgres
// Idempotency: safe to retry without extra effects
const id = $json.event_id  // stable ID from the source
// Try the insert; on conflict, short-circuit the workflow
// In a Postgres node:
//   INSERT INTO event_log (event_id) VALUES ($1) ON CONFLICT DO NOTHING
// If the insert affected 0 rows, the event was already processed, so return early

  • Use event_id plus source in a composite key when IDs are not globally unique
  • Keep a TTL policy if the event space is large, for example 30–90 days
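
Two hedged variants of the guard: let the insert report whether the event is new, and prune old entries on a schedule

-- Zero rows returned means a duplicate delivery: stop the workflow branch
INSERT INTO public.event_log (event_id)
VALUES ($1)
ON CONFLICT DO NOTHING
RETURNING event_id;

-- Scheduled cleanup for a 90-day TTL (run from an n8n Cron trigger or pg_cron)
DELETE FROM public.event_log
WHERE received_at < now() - interval '90 days';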

Hybrid model

  1. Webhook for create, update, and delete events
  2. Nightly poll to reconcile missed events using updated_at or checkpoints (see the sketch after this list)
  3. Emit metrics for counts, lag, and error rate
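
A sketch of the checkpoint idea in step 2, using a hypothetical sync_checkpoints table:

-- Remember how far the nightly reconciliation got per source
CREATE TABLE IF NOT EXISTS public.sync_checkpoints (
  source         text PRIMARY KEY,
  last_synced_at timestamptz NOT NULL
);

-- Advance the checkpoint after a successful poll
INSERT INTO public.sync_checkpoints (source, last_synced_at)
VALUES ('crm', $1)
ON CONFLICT (source) DO UPDATE SET last_synced_at = EXCLUDED.last_synced_at;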

Reference flow

flowchart TD
    A[Webhook or Poll] --> B[Normalize]
    B --> C[Dedupe]
    C --> D[Batch 100-500]
    D --> E[Build SQL]
    E --> F[Upsert]
    F --> G{Error?}
    G -->|Yes| H[Backoff]
    H --> F
    G -->|No| I[Notify]
    H -->|Max retries| J[Dead Letter]

    classDef trigger fill:#e1f5fe,stroke:#01579b
    classDef process fill:#fff3e0,stroke:#ef6c00
    classDef action fill:#e8f5e8,stroke:#2e7d32
    classDef alert fill:#f3e5f5,stroke:#7b1fa2

    class A trigger
    class B process
    class C process
    class D process
    class E process
    class F action
    class H alert
    class I action
    class J alert

Scale from 10k to 1M

  • Scale batches from 100 to 250 to 500 while watching database write time and CPU
  • Raise pool size only when the database is underutilized to avoid connection storms
  • Reindex or run CLUSTER off-hours if UUID v4 is unavoidable
  • Add partial indexes that match read filters and avoid over-indexing ingest (see the sketch after this list)
  • Track p95 end-to-end latency and error budgets, then tune the bottleneck
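
For the partial-index bullet, a sketch assuming reads usually filter out soft-deleted rows:

-- Partial index matching the common read filter; smaller and cheaper to maintain
CREATE INDEX idx_items_live_updated
  ON public.items (updated_at)
  WHERE deleted_at IS NULL;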

Quick recap

  1. Batch everything at 100–500
  2. Use INSERT … ON CONFLICT with a freshness guard
  3. Pick strong dedupe keys and dedupe in-workflow
  4. Prefer ordered IDs such as ULID or UUID v7
  5. Use webhooks for real-time plus a reconciliation poll
  6. Rate-limit with rolling windows per API
💡

Next steps: ship a small pilot flow with about 1k rows, measure, then scale. Keep IDs ordered, conflicts tight, and batches boring. Boring at scale is beautiful
