Paul Hoekstra

Data engineer and builder; writes Paul’s Pipeline on AI, data, and side projects.

Last updated: 2026-04-12

Overview

Paul Hoekstra writes Paul’s Pipeline on Substack, covering practical AI agent engineering from a data engineering perspective. His writing is grounded in real production experience; in his words, it is “written by a data engineer who spends too much time tinkering and then writes about it so you don’t have to make the same mistakes.”

His Agentic Engineering series (4 parts) is a practitioner’s framework for getting reliable results from coding agents across four layers: Configuration, Capability, Orchestration, and Guardrails.

Key Ideas

The configuration gap: the difference between mediocre and amazing agent results is mostly about configuration, not the model
CLAUDE.md as a cost center: every token in CLAUDE.md is paid for on every API call, so bloat adds cost and actively hurts performance
Skills beat model size: Haiku + human-curated skills (27.7%) beats Opus without skills (22.0%) on SkillsBench
HARD-GATE directives: XML-tagged checkpoints that exploit the model’s training to enforce process compliance
Anti-rationalization tables: pre-emptive lists of model excuses paired with corrections, which short-circuit the model’s tendency to justify skipping steps
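
The "cost center" idea above is simple arithmetic, and a rough sketch makes the scale visible. The token counts, call count, and price below are hypothetical placeholders, not figures from Hoekstra's series, and the sketch ignores prompt caching, which can lower the real cost of a repeated prefix.

```python
def context_overhead_tokens(file_tokens: int, calls: int) -> int:
    """Input tokens spent re-sending an always-loaded file on every call."""
    return file_tokens * calls


def context_overhead_cost(file_tokens: int, calls: int, usd_per_million_input: float) -> float:
    """Dollar cost of that overhead at a given (assumed) input-token price."""
    return context_overhead_tokens(file_tokens, calls) * usd_per_million_input / 1_000_000


# Hypothetical session: 200 API calls, lean vs. bloated CLAUDE.md.
CALLS = 200
lean = context_overhead_tokens(file_tokens=500, calls=CALLS)
bloated = context_overhead_tokens(file_tokens=5_000, calls=CALLS)

print(f"lean file:    {lean:>9,} input tokens")
print(f"bloated file: {bloated:>9,} input tokens")
print(f"extra tokens from bloat: {bloated - lean:,}")

# Assumed price of 3.00 USD per million input tokens, for illustration only.
print(f"bloated cost: ${context_overhead_cost(5_000, CALLS, 3.0):.2f}")
```

The overhead grows linearly with both file size and call count, which is the case for keeping always-loaded instructions short and moving detail into skills that load only when needed.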

Connections

agentic-engineering — his four-layer framework
claude-code-skills — practical extensions: HARD-GATE, anti-rationalization, SkillsBench evidence
thin-harness-fat-skills — complementary framing; Hoekstra adds the token cost argument for keeping CLAUDE.md lean

Sources

Agentic Engineering, part 1: The Configuration Layer — added 2026-04-12
Agentic Engineering, part 2: What the Agent Doesn’t Know — added 2026-04-12
Agentic Engineering, part 3: The Orchestration Layer — added 2026-04-12
Agentic Engineering, part 4: Keeping Agents on a Leash — added 2026-04-12
