Best way to reliably find a company’s official website domain (at scale, low cost)?

Hi everyone,

I’m currently building a workflow to enrich company data by finding the official website domain based on the company name (sometimes with city / address).

So far I’ve tested a few approaches:

  • Perplexity node → good results, but feels too expensive for my use case

  • OpenAI / LLM approach → cheaper, but often returns wrong domains (wrong URLs or unrelated companies)

Goal:

  • I’m working with large datasets (50k+ companies)

  • Cost per lookup needs to stay well below 1 cent / search

  • Needs to be reasonably reliable, not necessarily 100% perfect

What’s your recommended approach to reliably determine a company’s official website domain at scale, without using expensive APIs?
Any patterns, nodes, or workflows that worked well for you?

Thanks in advance, really appreciate any input :raising_hands:

My go-to for this kind of research task is Apify’s Google scraper; it starts at around 0.3 cents per search page.

The general idea is to Google search for the company’s name in quotes, pull the first few results, and exclude any directory results (LinkedIn, Yelp, and the like); the first remaining hit is very likely the company’s official website.

You can use the Apify node to call this from n8n and do any post-processing on the data obtained. Personally, I just run the workflow directly in Apify and export the results as Excel.

The input JSON would look something like this, keeping the limits low to minimize cost:

{
  "queries": "company name 1\ncompany name 2\n...",
  "resultsPerPage": 5,
  "maxPagesPerQuery": 1,
  "aiMode": "aiModeOff"
}
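If you’re generating that input programmatically, a small helper can build the payload from your company list. This is a sketch; the field names mirror the JSON above, and quoting each name matches the exact-phrase search tip:

```python
# Sketch: build the scraper input payload from a list of company names.
# Queries are newline-separated; each name is wrapped in quotes for an
# exact-phrase Google search. Limits kept low to minimize cost.
import json

def build_input(company_names: list[str]) -> dict:
    return {
        "queries": "\n".join(f'"{name}"' for name in company_names),
        "resultsPerPage": 5,
        "maxPagesPerQuery": 1,
        "aiMode": "aiModeOff",
    }

payload = build_input(["Acme GmbH", "Globex Corp"])
print(json.dumps(payload, indent=2))
```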

You can also enable an enrichment for around an additional 0.4 cents, which pulls some prospects’ contact info and LinkedIn profiles.