n8n self-hosted definitely has a lot of trouble with caching of community nodes… Please try removing the ScrapeNinja node, rebuilding the Docker container, and re-installing the node. 0.4.1 is the correct version, and the operations list should look different.
Hi Anthony
That did the trick. I had to wait until I got home to upgrade the server and rebuild. Will now test out the new nodes.
Hi @Anthony,
I’m super excited about your scraper. Fantastic work!
I tried to run it on my self-hosted instance and got an error during installation. When I tried to remove it for reinstallation, I got another error, which means I can’t remove it. I also tried to remove it via npm but had no luck with that either. I’m getting a lot of bans on Reddit, and my only alternative for running this easily is here on n8n with your community node. Any suggestions?
Hey @Carlos_Guimaraes, did you manage to find a fix for this? It looks like clearing the cache of your n8n instance and re-installing the n8n ScrapeNinja package might help.
Hi Anthony,
I couldn’t solve the problem. I’m going to clear the cache to see what’s happening. The error must be in my instance, because the problem doesn’t only occur with your node.
Thanks for the feedback.
Hey @Anthony Thanks for this fantastic tool.
I’m now trying out the crawler. However, I need some more clarification on the Postgres credentials needed. I already have Supabase credentials for my own db.
Does this mean I need another Postgres db just for the crawler, or can I use the same Supabase db and the node will create the necessary tables? And I guess it needs to be a Postgres credential (even if it’s Supabase).
[Update] I tried using my own Supabase db (set up as Postgres) and got this error:
Hey! You can totally use an existing Supabase db - just create a new credential in n8n - not a “Supabase” but a “Postgres” connection - and grab your settings from the pooler:
The crawler will create its tables automatically.
regarding self-signed cert problem - maybe you should try the “Ignore SSL Issues (Insecure)” flag in postgres n8n credentials settings page?
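For reference, the Supabase pooler settings for an n8n Postgres credential usually look something like this (hypothetical values — copy the real ones from your Supabase project’s connection settings):

```
Host:     aws-0-eu-central-1.pooler.supabase.com   (hypothetical region)
Port:     6543   (transaction pooler; 5432 for session mode)
Database: postgres
User:     postgres.<project-ref>
Password: <your database password>
```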
Hi again @anthony, disabling the SSL checks did the trick,
but now I’m getting this:
It successfully created the 3 tables in my Supabase instance
Hello all - excited to use this functionality. I’m getting stuck on this error while trying out a basic scrape of http://example.com. Any advice?
this is weird! I have double-checked the code - it should be good. Could you please check your Supabase tables? The crawler_queue table should have a response_status_code column.
regarding the self-signed cert problem - please try activating the “Ignore SSL Issues (Insecure)” flag on the Postgres n8n credentials settings page.
Hello. I’m almost certain the field didn’t exist before. I’ve run it with Re-Set Crawler Tables set to TRUE, and now I can see the field. The node was successful but the run failed. The only error I can see in the log metadata is ‘Invalid message format’ for the first page. What format is this referring to? I’m just passing a URL as a parameter, as per the examples.
could you please share detailed logs? I will try to better understand whether this is related to the ScrapeNinja node. Feel free to contact me via [email protected]
@Anthony hi, can you help please? I’m new to n8n and need to create a workflow. I don’t mind putting in the effort - but I would like to know whether the objective can actually be achieved (your thoughts - please see the link). Can n8n do this natively, or in combination with ScrapeNinja?
Here is an n8n + ScrapeNinja workflow example that explicitly mentions MoMoProxy (a popular rotating proxy service) for users who need advanced proxy rotation:
Example Workflow: Scrape with ScrapeNinja + MoMoProxy Integration
Use Case: Scrape a target page while avoiding blocks by using MoMoProxy’s rotating proxies via ScrapeNinja.
1. Install ScrapeNinja Node

- Go to Settings > Community Nodes in n8n.
- Install `n8n-nodes-scrapeninja`.
Workflow Steps

A. ScrapeNinja Node Configuration

- Mode: `/scrape-js` (for JS-heavy pages) or `/scrape` (raw HTML).
- URL: `https://example.com/products`
- Proxy:
  - Enable Rotating Proxies in ScrapeNinja.
  - (Optional: use MoMoProxy for high-quality residential IP rotation by configuring custom proxy endpoints in ScrapeNinja’s API params.)
- JS Extractor:

```javascript
function extract() {
  return Array.from(document.querySelectorAll('.product')).map(item => ({
    name: item.querySelector('.name')?.innerText,
    price: item.querySelector('.price')?.innerText,
  }));
}
```

- Screenshot: enable if visual verification is needed.
B. Function Node (Optional)

- Clean data (e.g., remove currency symbols):

```javascript
return items.map(item => ({
  ...item.json,
  price: item.json.price.replace('$', '').trim(),
}));
```
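Outside n8n, the same cleanup step can be sketched in plain JS, using hypothetical data in the `item.json` shape that n8n Function nodes receive:

```javascript
// Hypothetical scraped items in n8n's { json: ... } wrapper format
const items = [
  { json: { name: 'Widget', price: '$19.99 ' } },
  { json: { name: 'Gadget', price: '$5.00' } },
];

// Strip the currency symbol and surrounding whitespace from each price
const cleaned = items.map(item => ({
  ...item.json,
  price: item.json.price.replace('$', '').trim(),
}));
// cleaned → [{ name: 'Widget', price: '19.99' }, { name: 'Gadget', price: '5.00' }]
```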
C. Save Output
- Send results to Google Sheets, Airtable, or a database.
Why Mention MoMoProxy?

- ScrapeNinja’s built-in proxies are sufficient for most cases, but services like MoMoProxy offer:
  - Higher anonymity (residential IP rotation).
  - Geotargeting (select proxy locations).
  - Better success rates for aggressive scraping.
- To use MoMoProxy with ScrapeNinja:
  - Get a MoMoProxy endpoint (e.g., `http://user:[email protected]:port`).
  - Pass it as a custom proxy in ScrapeNinja’s `proxy` parameter:

```json
{
  "proxy": {
    "url": "http://user:[email protected]:1234"
  }
}
```
Workflow JSON (MoMoProxy Example)

```json
{
  "nodes": [
    {
      "parameters": {},
      "name": "Start",
      "type": "n8n-nodes-base.start",
      "typeVersion": 1,
      "position": [250, 300]
    },
    {
      "parameters": {
        "operation": "scrapeJs",
        "url": "https://example.com/products",
        "jsExtractor": "function extract() {\n  return Array.from(document.querySelectorAll('.product')).map(item => ({\n    name: item.querySelector('.name')?.innerText,\n    price: item.querySelector('.price')?.innerText\n  }));\n}",
        "proxy": {
          "url": "http://user:[email protected]:1234"
        }
      },
      "name": "ScrapeNinja",
      "type": "n8n-nodes-scrapeninja.scrapeNinja",
      "typeVersion": 1,
      "position": [450, 300]
    }
  ]
}
```

(The `proxy.url` value is the MoMoProxy endpoint.)
Key Notes

- MoMoProxy is optional but recommended for large-scale scraping.
- Test with ScrapeNinja’s default proxies first before adding external services.
- Combine with JS Extractors for precise data extraction.
Can I replicate browser interactions with ScrapeNinja, like clicking ‘Load more’?
I know that for a lot of you guys the biggest pain in web scraping is that we need custom code to extract useful data (JSON) from HTML pages. Most of you probably use a “convert to markdown” → push-to-LLM pipeline, but this does not work well in a lot of scenarios: it’s too expensive and slow, and it just works poorly and inconsistently for complex HTML pages.
Here is my latest attempt to mitigate this:
Agentic AI cheerio code generator - another iteration in my attempt to make heavy-duty web scraping possible for everyone.
The idea is that we feed the huge HTML of the scraped document to the agent and ask it to write a JS extractor, which can later be reused on similar pages to extract the same data from thousands of pages. This way we don’t need to invoke the LLM-and-markdown pipeline for EVERY page: we just create a good JS extractor once, and then run it thousands of times, with low latency and great results, via ScrapeNinja.
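The "write once, run thousands of times" idea can be sketched in plain JS. A real generated extractor would use cheerio selectors; here a regex stands in so the sketch stays dependency-free, and the HTML snippets and `.price` class are hypothetical:

```javascript
// One extractor function, generated once (e.g., by the agent)...
function extractPrices(html) {
  // regex stand-in for a cheerio selector like $('.price')
  return [...html.matchAll(/<span class="price">([^<]+)<\/span>/g)].map(m => m[1]);
}

// ...then reused across many pages without any LLM calls:
const pages = [
  '<span class="price">$10</span><span class="price">$20</span>',
  '<span class="price">$5</span>',
];
const results = pages.map(extractPrices);
// results → [['$10', '$20'], ['$5']]
```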
Let me know your thoughts on this!