
Sole Automation Engineer · AI Workflow Architect

n8n Content Pipeline — Topic to Published SEO Article in Under 4 Minutes

Self-hosted n8n pipeline triggered by one Telegram command. Six stages: parallel research, SEO analysis, data packaging, a multi-agent Writer→Critic→Refiner loop, Claude QA, and final scoring.

Topic-to-publish dropped from ~18 hours to 3–4 minutes; replaced 15+ hrs/week of manual work.

n8n (self-hosted) · Docker (10 containers) · PostgreSQL · SearXNG · Firecrawl · OpenAI GPT-4o · Claude Sonnet (via OpenRouter) · DataForSEO · Telegram Bot API

My role: n8n Automation Developer & AI Workflow Architect

Project description.

The problem: a content team was spending 15–20 hours per article on research, drafting, editing, and QA, with inconsistent quality.

What I built: a self-hosted n8n pipeline triggered by one Telegram command. Six stages run end-to-end: parallel research (SearXNG + Firecrawl), DataForSEO keyword analysis, data packaging, a multi-agent Writer→Critic→Refiner loop, Claude Sonnet QA, and a final scoring pass. Each stage has quality gates and retry logic.

Result: topic-to-publish dropped from ~18 hours to 3–4 minutes, replacing 15+ hrs/week of manual work.

Stack: n8n, Claude, OpenAI, Firecrawl, Docker.

Skills and deliverables

n8n · Automation · Claude


Role: Sole Automation Engineer
Stack: n8n (self-hosted), Docker, PostgreSQL, SearXNG, Firecrawl, OpenAI, OpenRouter (Claude), DataForSEO, Telegram Bot API
Status: Functional prototype — core pipeline works end-to-end, some production gaps remain (noted below)


What It Does

An automated content pipeline that takes a Skool community name, founder name, and community URL as input and produces a fully researched, SEO-optimized, editorially reviewed article, delivered as Markdown and HTML files via Telegram, in under 4 minutes.

The system handles everything: web research, page scraping, SEO keyword analysis, AI-powered writing with an iterative critic/refiner loop, multi-layer quality assurance, and file delivery. No human touches the content between trigger and output.


Architecture Overview

Trigger (Webhook or Telegram Bot)
  │
  ├── Stage 1: Deep Research (SearXNG + Firecrawl + GPT-4o)
  │     └── Quality Gate → retry up to 2x on failure
  │
  ├── Stage 2: SEO Research (DataForSEO + GPT-4o)
  │     └── Quality Gate → retry up to 2x on failure
  │
  ├── Stage 3: Data Packaging (merge research + SEO into structured JSON)
  │     └── Quality Gate → validate required fields
  │
  ├── Stage 4-5: Editorial + QA (Claude Sonnet via OpenRouter)
  │     ├── Writer → Critic → Score Check
  │     │     └── Loop: Refiner → Critic (up to 3 iterations)
  │     ├── Plagiarism Check
  │     ├── Fact Check
  │     └── Word Count + Tone Analysis
  │
  └── Stage 6: Master QA + Publish (Claude Sonnet via OpenRouter)
        ├── Final 5-dimension scoring
        ├── Generate Markdown (YAML frontmatter) + HTML (styled)
        ├── Send files via Telegram
        └── Quality Gate → retry full pipeline up to 2x on failure

Six n8n workflows in total: an orchestrator (parent) workflow that coordinates four stage sub-workflows, plus a standalone Telegram bot workflow for triggering via chat.
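As an illustration of the Stage 3 step in the diagram (merge research and SEO output, then validate required fields), here is a minimal Python sketch. It is not the orchestrator's actual logic: `community_name` and `community_url` come from the webhook payload, but `research_summary`, `keywords`, and both function names are hypothetical placeholders.

```python
from typing import Any

# Hypothetical required fields; the real list lives in the orchestrator workflow.
REQUIRED_FIELDS = ("community_name", "community_url", "research_summary", "keywords")


def package(research: dict[str, Any], seo: dict[str, Any]) -> dict[str, Any]:
    """Merge Stage 1 and Stage 2 outputs into one structured payload."""
    return {**research, **seo}


def missing_fields(payload: dict[str, Any]) -> list[str]:
    """Quality gate: list required fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not payload.get(name)]
```

An empty list from `missing_fields` means the gate passes; anything else names exactly what the earlier stages failed to produce.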


Workflow Breakdown

| File | Workflow | What It Does |
|---|---|---|
| 01_orchestrator.json | Orchestrator | Receives webhook, coordinates all stages, manages quality gates, retries, and Telegram notifications |
| 02_deep_research.json | Deep Research | 7 parallel SearXNG searches + Firecrawl page scraping + GPT-4o synthesis |
| 03_seo_research.json | SEO Research | DataForSEO keyword volumes + SERP analysis + GPT-4o keyword clustering |
| 04_editorial_qa.json | Editorial QA | Writer/Critic/Refiner loop + plagiarism + fact-check + tone/word-count QA |
| 05_master_publish.json | Master Publish | Final scoring, Markdown + HTML generation, Telegram file delivery |
| 06_telegram_bot.json | Telegram Bot | Polls for /review commands, triggers the pipeline via internal webhook |
| 00_init_db.sql | Database Schema | PostgreSQL schema for job persistence (defined but not wired into workflows) |

Infrastructure — Fully Self-Hosted

Everything runs in Docker via a single docker-compose.yml. No external SaaS dependencies for scraping or search — only API keys for AI models and SEO data.

10 containers in the stack:

| Service | Image | Purpose |
|---|---|---|
| n8n | n8nio/n8n:latest | Workflow engine |
| PostgreSQL | pgvector/pgvector:0.6.2-pg15 | n8n backend database |
| SearXNG | searxng/searxng:latest | Privacy-respecting meta-search engine |
| SearXNG Redis | valkey/valkey:alpine | SearXNG cache |
| Firecrawl API | ghcr.io/firecrawl/firecrawl:latest | Web scraping with JS rendering |
| Firecrawl Playwright | ghcr.io/firecrawl/playwright-service:latest | Headless browser for Firecrawl |
| Firecrawl Redis | redis:alpine | Firecrawl job queue |
| Firecrawl RabbitMQ | rabbitmq:3-management | Firecrawl message broker |
| Firecrawl PostgreSQL | ghcr.io/firecrawl/nuq-postgres:latest | Firecrawl storage |

All services communicate over a private Docker bridge network. Health checks ensure startup ordering.


AI Model Strategy

| Task | Model | Why |
|---|---|---|
| Research synthesis | GPT-4o (OpenAI) | Fast, good at structured extraction and summarization |
| SEO keyword clustering | GPT-4o (OpenAI) | Reliable JSON output for analytical tasks |
| Article writing | Claude Sonnet (OpenRouter) | Stronger long-form prose, better editorial voice |
| Critic scoring | Claude Sonnet (OpenRouter) | Nuanced evaluation with structured JSON scores |
| Refinement | Claude Sonnet (OpenRouter) | Addresses critic feedback while preserving voice |
| QA checks | Claude Sonnet (OpenRouter) | Plagiarism, fact-check, tone analysis |
| Final master scoring | Claude Sonnet (OpenRouter) | 5-dimension quality assessment |

The split is deliberate: GPT-4o handles the structured data extraction where speed matters; Claude handles the editorial work where writing quality matters.
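The routing in the table can be expressed as a small lookup. This is a sketch only: the task keys are hypothetical labels, and the values are display names rather than the API model identifiers the workflows actually send.

```python
# Task-to-model routing, mirroring the AI Model Strategy table.
# Keys are hypothetical task labels; values are (provider, display name),
# not real API model IDs.
MODEL_ROUTES: dict[str, tuple[str, str]] = {
    "research_synthesis": ("OpenAI", "GPT-4o"),
    "seo_keyword_clustering": ("OpenAI", "GPT-4o"),
    "article_writing": ("OpenRouter", "Claude Sonnet"),
    "critic_scoring": ("OpenRouter", "Claude Sonnet"),
    "refinement": ("OpenRouter", "Claude Sonnet"),
    "qa_checks": ("OpenRouter", "Claude Sonnet"),
    "master_scoring": ("OpenRouter", "Claude Sonnet"),
}


def route(task: str) -> tuple[str, str]:
    """Return (provider, model) for a task, failing loudly on unknown tasks."""
    try:
        return MODEL_ROUTES[task]
    except KeyError:
        raise ValueError(f"no model configured for task {task!r}") from None
```

Keeping the mapping in one place makes it cheap to move a single task to a different model without touching the rest of the pipeline.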


Key Engineering Decisions

Quality Gates with Retry Logic

Every stage has a quality gate. If a gate fails, the pipeline retries that stage (up to 2 attempts). If Stage 6 (final scoring) fails, the entire pipeline restarts from Stage 1 with failure context appended so subsequent attempts can compensate.
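The gate-and-retry pattern can be sketched in Python as follows. This is an illustration under assumptions, not the n8n implementation: `run_with_gate`, the `(ok, reason)` gate return shape, and the reading of "up to 2" as two total attempts are all hypothetical choices. Failure context is passed back into the stage on each attempt, in the same spirit as the full-pipeline restart described above.

```python
from typing import Any, Callable

Stage = Callable[[list[str]], Any]          # receives failure context so far
Gate = Callable[[Any], tuple[bool, str]]    # returns (passed, reason)


def run_with_gate(stage: Stage, gate: Gate, max_attempts: int = 2) -> Any:
    """Run a stage until its quality gate passes or attempts run out."""
    failures: list[str] = []
    for attempt in range(1, max_attempts + 1):
        result = stage(failures)            # earlier failures let the stage compensate
        passed, reason = gate(result)
        if passed:
            return result
        failures.append(f"attempt {attempt}: {reason}")
    raise RuntimeError("quality gate failed: " + "; ".join(failures))
```

Wrapping the whole pipeline in the same function, with Stage 6 scoring as the gate, would give the outer restart loop.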

Writer/Critic/Refiner Loop

The editorial stage doesn't just generate once. It runs an iterative loop:

  1. Writer produces a draft
  2. Critic scores it across multiple dimensions (returns JSON)
  3. If average score < 5/10, the Refiner rewrites based on critic feedback
  4. Loop repeats up to 3 times
  5. A "last chance refiner" runs if all iterations are exhausted
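The five steps above can be sketched as a plain Python loop. The threshold (5/10) and iteration cap (3) come from the text; the function names, the review shape (`scores` plus `feedback`), and the decision not to re-score after the last-chance pass are assumptions made for illustration.

```python
from typing import Any, Callable

PASS_THRESHOLD = 5.0   # refine while the average critic score is below 5/10
MAX_ITERATIONS = 3     # Refiner -> Critic rounds before the last-chance pass

Review = dict[str, Any]  # assumed shape: {"scores": {...}, "feedback": "..."}


def average_score(review: Review) -> float:
    scores = review["scores"]
    return sum(scores.values()) / len(scores)


def editorial_loop(
    write: Callable[[str], str],
    critique: Callable[[str], Review],
    refine: Callable[[str, str], str],
    last_chance: Callable[[str, str], str],
    topic: str,
) -> str:
    """Writer -> Critic, then Refiner -> Critic until the score passes."""
    draft = write(topic)
    review = critique(draft)
    for _ in range(MAX_ITERATIONS):
        if average_score(review) >= PASS_THRESHOLD:
            return draft
        draft = refine(draft, review["feedback"])
        review = critique(draft)
    if average_score(review) >= PASS_THRESHOLD:
        return draft
    # All iterations exhausted: one final rewrite, not re-scored in this sketch.
    return last_chance(draft, review["feedback"])
```

In the pipeline each callable would be an LLM call; passing them in as parameters keeps the loop logic testable without any API access.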

State Restoration Pattern

Telegram notification nodes sit in the middle of the pipeline for real-time progress updates. Because each n8n node receives the previous node's output, a Telegram node would otherwise hand its API response, not the pipeline data, to the next stage. Dedicated "Restore Data" nodes re-inject the correct pipeline state after every Telegram side-channel call, so downstream nodes never read the wrong data.
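A Python analogy of the pattern, assuming a chain where every node's return value becomes the next node's input. In n8n this is done with separate nodes rather than a wrapper; `with_restore` and `run_chain` are hypothetical names used only to show the idea.

```python
from typing import Any, Callable

State = dict[str, Any]
Node = Callable[[State], State]


def with_restore(side_channel: Node) -> Node:
    """Wrap a side-channel node so pipeline state passes through unchanged."""
    def node(state: State) -> State:
        snapshot = dict(state)   # shallow copy taken before the side-channel call
        side_channel(state)      # its return value (e.g. an API response) is dropped
        return snapshot          # "Restore Data": re-inject the original state
    return node


def run_chain(nodes: list[Node], state: State) -> State:
    """Feed each node's output into the next, as a linear workflow does."""
    for node in nodes:
        state = node(state)
    return state
```

Without the wrapper, whatever the notification call returns replaces the article data for every node after it.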

Self-Hosted Search and Scraping

SearXNG and Firecrawl are self-hosted specifically to avoid rate limits and per-request costs from third-party scraping APIs. The tradeoff is infrastructure complexity (7 of the containers listed above exist only for search and scraping), but it gives unlimited scraping at zero marginal cost.


Skills Demonstrated

  • n8n workflow design — sub-workflow orchestration, webhook triggers, quality gates, retry patterns, state management across branching paths
  • Multi-model AI orchestration — routing different task types to appropriate LLMs, structured JSON prompting, fallback parsing
  • Docker infrastructure — 10-service compose stack with health checks, dependency ordering, shared networking, volume persistence
  • SEO automation — DataForSEO API integration, keyword volume analysis, SERP competitor analysis, intent classification
  • Web scraping at scale — Firecrawl with Playwright for JS-rendered pages, SearXNG for aggregated search results
  • Editorial automation — iterative critic/refiner pattern, multi-dimensional scoring, plagiarism and fact-checking
  • Bot development — Telegram bot with command parsing, polling, and bidirectional notifications
  • Error handling — continueOnFail on external calls, JSON parse fallbacks, defensive data reads, failure context propagation
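One common shape for a JSON parse fallback, sketched in Python. The pipeline's own fallback logic lives in the n8n workflows and may differ; `parse_llm_json` is a hypothetical helper.

```python
import json
import re
from typing import Any


def parse_llm_json(text: str, default: Any = None) -> Any:
    """Parse JSON from an LLM reply, tolerating code fences and extra prose."""
    # 1. Happy path: the reply is already valid JSON.
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # 2. Fall back to the outermost {...} span, which drops fences and chatter.
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match:
        try:
            return json.loads(match.group(0))
        except json.JSONDecodeError:
            pass
    # 3. Give up and return the caller's default instead of raising.
    return default
```

Returning a default rather than raising lets a quality gate decide whether the stage should be retried.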

How to Run It

# 1. Clone and configure
cp .env.example .env
# Fill in: OPENAI_API_KEY, OPENROUTER_API_KEY, DATAFORSEO_LOGIN,
#          DATAFORSEO_PASSWORD, TELEGRAM_BOT_TOKEN, TELEGRAM_CHAT_ID,
#          POSTGRES_PASSWORD, N8N_USER, N8N_PASSWORD

# 2. Start the stack
docker-compose up -d

# 3. Import workflows into n8n (http://localhost:5678)
#    Import in order: 02, 03, 04, 05, 06, then 01
#    Ensure workflow IDs match the hardcoded references

# 4. Activate all workflows

# 5. Trigger via webhook
curl -X POST http://localhost:5678/webhook/skool-pipeline \
  -H "Content-Type: application/json" \
  -d '{"community_name": "Skool Games", "founder_name": "Sam Ovens", "community_url": "https://www.skool.com/games"}'

# Or trigger via Telegram: /review Skool Games, Sam Ovens, https://www.skool.com/games
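The Telegram command carries the same three fields as the webhook body, separated by commas. A minimal Python sketch of that mapping follows; the bot workflow's real parsing is not shown in this write-up, so `parse_review_command` and its error messages are hypothetical.

```python
def parse_review_command(text: str) -> dict[str, str]:
    """Turn '/review name, founder, url' into the webhook's JSON payload."""
    command, _, args = text.strip().partition(" ")
    if command != "/review":
        raise ValueError("expected a /review command")
    parts = [part.strip() for part in args.split(",")]
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            "usage: /review <community name>, <founder name>, <community URL>"
        )
    name, founder, url = parts
    return {"community_name": name, "founder_name": founder, "community_url": url}
```

Splitting on commas means a community name or URL that itself contains a comma would be rejected by this sketch.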

Tech Stack Summary

| Category | Technology |
|---|---|
| Workflow Engine | n8n (self-hosted) |
| Infrastructure | Docker Compose (10 containers) |
| Database | PostgreSQL 15 (pgvector) |
| Search | SearXNG (self-hosted meta-search) |
| Scraping | Firecrawl + Playwright (self-hosted) |
| AI — Research | OpenAI GPT-4o |
| AI — Editorial | Anthropic Claude Sonnet (via OpenRouter) |
| SEO Data | DataForSEO (keyword volumes + SERP) |
| Message Queue | RabbitMQ, Redis (Valkey) |
| Bot Interface | Telegram Bot API |
| Output Formats | Markdown (YAML frontmatter), HTML (styled CSS) |

Built as a freelance project. The pipeline consistently produces 1,500–2,500 word SEO articles from a single trigger (community name, founder name, and URL), with research, keyword targeting, and editorial QA all automated, in under 4 minutes.
