Yay, we finally built a functional workflow that turns any article into a structured Notion note. Here's what made it actually work:
We read a lot - and save almost nothing.
Not because we don’t want to — because the gap between “this is interesting” and “this is organised and findable later” has always been just large enough to skip. Browser bookmarks became a graveyard. Read-later apps became a longer graveyard.
So we built something to close that gap.
The idea is simple
Send a URL. Get a structured Notion note. Never think about it again.
In practice: a webhook receives the URL, the workflow fetches the page, a Code node strips it down to readable content, OpenAI summarises it into structured JSON, and Notion saves it with title, summary, key insights, tags, source URL, and date — all mapped to the right properties automatically.
Webhook → HTTP Request → Code Node → OpenAI → Notion
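For anyone wondering what "send a URL" looks like from the outside, here is a minimal sketch of the request. It assumes the Webhook node is set to accept POST with a JSON body containing a `url` field (that is what the Code node further down reads). The webhook address and the `buildSaveRequest` helper are made up for illustration.

```javascript
// Hypothetical address: replace it with the Production URL shown on your Webhook node.
const WEBHOOK_URL = 'https://your-n8n-host.example/webhook/save-article';

// Build the POST request options. The workflow reads body.url,
// so the JSON body needs a "url" field.
function buildSaveRequest(articleUrl) {
  // new URL() throws a TypeError if articleUrl is not a valid absolute URL.
  const normalised = new URL(articleUrl).toString();
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ url: normalised }),
  };
}

// Usage (Node 18+, where fetch is built in):
// await fetch(WEBHOOK_URL, buildSaveRequest('https://example.com/some-article'));
```

The same request works from a shortcut, a bookmarklet, or any tool that can POST JSON.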
The thing that actually makes it work
Early version went straight from HTTP Request to OpenAI. The summaries were mediocre and the cost was high. Took a while to figure out why.
A typical webpage is not an article. It is an article buried inside 30,000–80,000 tokens of navigation HTML, script bundles, cookie banners, tracking pixels, and CSS class names. The model was spending most of its context window on noise.
The fix is four regex replacements, a trim, and a substring in a Code node:
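// Pull the raw HTML from whichever field the HTTP Request node populated.
// Fall back to an empty string so a missing field never becomes "[object Object]".
const input = $input.first().json;
const html = input.data || input.body || input.html || '';

const text = String(html)
  .replace(/<script[\s\S]*?<\/script>/gi, ' ') // drop script blocks
  .replace(/<style[\s\S]*?<\/style>/gi, ' ')   // drop style blocks
  .replace(/<[^>]+>/g, ' ')                    // drop all remaining tags
  .replace(/\s+/g, ' ')                        // collapse whitespace
  .trim()
  .substring(0, 8000);                         // cap at 8,000 characters

// Pass the cleaned text on, together with the URL the webhook received.
return [{ json: { text, url: $('Webhook').item.json.body.url } }];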
Strips scripts first, then styles, then all remaining tags, collapses whitespace, truncates to 8,000 characters. Drops the payload from ~60K tokens to under 4K. The model only sees what matters. Summaries got better. Cost dropped significantly.
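To see what the cleanup does outside n8n, here is the same chain wrapped in a plain function and run on a tiny page. The `stripHtml` name and the sample HTML are made up for illustration. Treat it as a sketch of the technique, not a full readability extractor.

```javascript
// Same cleanup chain as the Code node, wrapped in a plain function.
function stripHtml(html, maxChars = 8000) {
  return String(html)
    .replace(/<script[\s\S]*?<\/script>/gi, ' ') // drop script blocks
    .replace(/<style[\s\S]*?<\/style>/gi, ' ')   // drop style blocks
    .replace(/<[^>]+>/g, ' ')                    // drop all remaining tags
    .replace(/\s+/g, ' ')                        // collapse whitespace
    .trim()
    .substring(0, maxChars);                     // cap the length
}

// Made-up sample page, only for illustration.
const samplePage = `
<html><head>
  <style>.nav { color: red; }</style>
  <script>window.track = function () {};</script>
</head><body>
  <nav><a href="/">Home</a></nav>
  <article><h1>Why notes rot</h1><p>Saving is easy. Finding is hard.</p></article>
</body></html>`;

console.log(stripHtml(samplePage));
// prints: Home Why notes rot Saving is easy. Finding is hard.
```

Note that the link text "Home" survives: the regexes remove markup, not navigation wording. On real pages some menu and footer text will still reach the model, which is usually an acceptable trade for how simple the step is.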
The prompt
Summarize this article as a research note. Return JSON:
{"title":"","summary":"","key_insights":[],"tags":[]}
Text: {{ $json.text }}
JSON output mode is enabled in the node settings. The Notion node maps directly to the fields with no parsing step needed.
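The workflow itself does not need this, but if you ever run the prompt without JSON mode or want a safety net, a small guard like the one below could sit between OpenAI and Notion. The `normaliseNote` helper and its fallback values are assumptions for illustration. Only the four field names come from the prompt.

```javascript
// Accepts either the raw JSON string or an already-parsed object and
// returns a note with the four fields the prompt asks for.
function normaliseNote(raw) {
  // JSON.parse throws a SyntaxError if the reply is not valid JSON.
  const note = typeof raw === 'string' ? JSON.parse(raw) : raw;
  if (note === null || typeof note !== 'object') {
    throw new Error('Model reply is not a JSON object');
  }
  // Anything that is not an array becomes an empty list.
  const asList = (value) => (Array.isArray(value) ? value.map(String) : []);
  return {
    title: String(note.title || 'Untitled'), // hypothetical fallback title
    summary: String(note.summary || ''),
    key_insights: asList(note.key_insights),
    tags: asList(note.tags),
  };
}

// Usage with a made-up reply:
const example = normaliseNote('{"title":"Why notes rot","summary":"Short.","key_insights":["a","b"],"tags":["pkm"]}');
console.log(example.tags.length); // prints: 1
```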
What lands in Notion
Six properties per note — Title, Summary, Key Insights (bullet formatted), Tags (multi-select), URL, Date Saved. Clean enough to actually use later.
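In n8n the Notion node handles the mapping through its UI, but for reference this is roughly what the six properties look like as a raw Notion API `properties` object. The property types are assumptions (Summary and Key Insights as rich text, with the bullets written as plain "• " lines), and `toNotionProperties` is a made-up helper, so adjust it to match your own database.

```javascript
// Build a Notion "properties" object from the model's note.
// Property names must match the database columns exactly.
function toNotionProperties(note, sourceUrl, savedAt = new Date()) {
  // Notion rich text is an array of text objects.
  const richText = (content) => [{ type: 'text', text: { content } }];
  return {
    'Title': { title: richText(note.title) },
    'Summary': { rich_text: richText(note.summary) },
    // Assumption: insights stored as one rich-text value, one bullet per line.
    'Key Insights': {
      rich_text: richText(note.key_insights.map((insight) => `• ${insight}`).join('\n')),
    },
    'Tags': { multi_select: note.tags.map((name) => ({ name })) },
    'URL': { url: sourceUrl },
    // Date only (YYYY-MM-DD), taken from the UTC timestamp.
    'Date Saved': { date: { start: savedAt.toISOString().slice(0, 10) } },
  };
}

// Usage with made-up values:
const properties = toNotionProperties(
  { title: 'Why notes rot', summary: 'Short.', key_insights: ['a', 'b'], tags: ['pkm'] },
  'https://example.com/some-article',
  new Date('2024-01-02T03:04:05Z')
);
console.log(properties['Date Saved'].date.start); // prints: 2024-01-02
```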
Two things worth knowing before you run it
Some sites block automated requests. Add User-Agent: Mozilla/5.0 as a header on the HTTP Request node and most of those 403s go away.
JavaScript-heavy SPAs return shell HTML with no content because the page renders client-side after load. For those, we push the URL through a read-later service first to get the rendered text, then send that to the webhook instead.
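Both caveats can be sketched in plain Node. The header is the one from the tip above. The shell check is our own rough heuristic and not part of the workflow: the `looksLikeEmptyShell` name and the 500-character threshold are assumptions you would want to tune.

```javascript
// Some sites reject requests that lack a browser-like User-Agent.
const REQUEST_HEADERS = { 'User-Agent': 'Mozilla/5.0' };

// Plain-Node stand-in for the HTTP Request node (Node 18+, built-in fetch).
async function fetchPage(url) {
  const response = await fetch(url, { headers: REQUEST_HEADERS });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.text();
}

// Hypothetical heuristic: if very little text is left after stripping the HTML,
// the page was probably a client-rendered shell. 500 is an arbitrary threshold.
function looksLikeEmptyShell(strippedText, minChars = 500) {
  return strippedText.trim().length < minChars;
}

console.log(looksLikeEmptyShell('Loading...')); // prints: true
```

A check like this could route suspicious pages to the read-later fallback instead of sending a near-empty string to the model.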
The workflow JSON will be shared. Feel free to reply. The only thing to update is the Notion database ID in the Notion node, which you can grab from your database URL. Share the database with your n8n integration before testing or the write will fail silently.
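If you are unsure which part of the database URL is the ID, it is the 32-character hex string at the end of the path, before any `?v=` view parameter. A minimal sketch, with a made-up `extractDatabaseId` helper and a made-up example URL:

```javascript
// Pull the 32-character hex ID from the end of a Notion database URL path.
function extractDatabaseId(databaseUrl) {
  // pathname excludes the query string, so the ?v=... view ID is ignored.
  const { pathname } = new URL(databaseUrl);
  const match = pathname.match(/([0-9a-f]{32})\/?$/i);
  if (!match) {
    throw new Error('No 32-character ID found at the end of the URL path');
  }
  return match[1];
}

// Made-up example URL, only for illustration.
const databaseId = extractDatabaseId(
  'https://www.notion.so/myworkspace/0123456789abcdef0123456789abcdef?v=fedcba9876543210fedcba9876543210'
);
console.log(databaseId); // prints: 0123456789abcdef0123456789abcdef
```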
Happy to answer questions on any part of the build.
Tags: show-and-tell knowledge-management notion openai webhook code-node productivity