Bug in n8n and Apify integration

Hi everybody!
I don’t know whether this is a bug in n8n or in Apify.
I have an issue with the Apify integration via the HTTP Request node.
I use the POST method and a JSON body to pass the input to the actor I need.

The problem:
I get URLs from a Google Sheet (the number of URLs varies, but that is a separate issue that I will solve myself).

These URLs go into the JSON like this:
{
  "aggressivePrune": true,
  "crawlerType": "playwright:adaptive",
  "debugLog": false,
  "debugMode": false,
  "dynamicContentWaitSecs": 10,
  "expandIframes": false,
  "htmlTransformer": "readableText",
  "ignoreCanonicalUrl": false,
  "initialConcurrency": 5,
  "keepUrlFragments": false,
  "maxConcurrency": 5,
  "maxCrawlDepth": 1,
  "maxCrawlPages": 500,
  "maxRequestRetries": 5,
  "maxResults": 5000,
  "maxScrollHeightPixels": 5000,
  "maxSessionRotations": 10,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "BUYPROXIES94952"
    ]
  },
  "readableTextCharThreshold": 100,
  "removeCookieWarnings": true,
  "saveFiles": false,
  "saveHtml": false,
  "saveHtmlAsFile": false,
  "saveMarkdown": true,
  "saveScreenshots": false,
  "startUrls": [
    {
      "url": "{{ $json.URL }}",
      "method": "GET"
    }
  ],
  "useSitemaps": false
}

In the Apify web console I set the actor to use only 128 MB of the 8 GB available on the free plan.
But when I run this HTTP node, the console shows it using all of my memory, the full 8 GB:



That is the problem.

Thanks for answering!

Information on your n8n setup

  • n8n version: 1.63.4
  • Database (default: SQLite): I guess default (sorry, I don’t understand where to find this info)
  • n8n EXECUTIONS_PROCESS setting (default: own, main): I guess default (sorry, I don’t understand where to find this info)
  • Running n8n via (Docker, npm, n8n cloud, desktop app): n8n cloud
  • Operating system: windows

Hey @Vasilii_Danilin, try using a Loop node with a delay between the requests and see if that helps. I suspect that the HTTP Request node sends all the requests concurrently (when the spreadsheet returns many rows), which creates a heavy load for Apify.

If that doesn’t help I would check with Apify.
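
To illustrate the idea of looping with a delay, here is a minimal plain-JavaScript sketch. It is not n8n's own implementation; `runSequentially`, `sleep`, `task`, and `delayMs` are hypothetical names used only for this example, and `task` stands in for whatever sends one request.

```javascript
// Hypothetical helper: resolves after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: calls `task` for each item one at a time,
// pausing `delayMs` milliseconds between consecutive calls.
async function runSequentially(items, task, delayMs) {
  const results = [];
  for (let i = 0; i < items.length; i++) {
    // Await each call before starting the next, so only one is in flight.
    results.push(await task(items[i]));
    if (i < items.length - 1) {
      await sleep(delayMs);
    }
  }
  return results;
}
```

Because every call is awaited before the next one starts, the receiving service never sees more than one request at a time, which is roughly the effect a Loop node with a wait step is meant to achieve.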


Hi @ihortom!
I realized that the JSON I send from the HTTP node was filled in separately for every link in the Google Sheet, and each one was used to start an Apify actor.
So now I build a single array of links and run a single actor with all of them.
This is the expression in the Set node that builds the array:

{{
  JSON.stringify($items("Google Sheets").map(item => ({
    url: item.json.Link,
    method: "GET"
  })))
}}
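
For readers who want to try the mapping outside n8n, here is a plain-JavaScript sketch of what the expression computes. `buildStartUrls` and `rows` are hypothetical names; `rows` stands in for `$items("Google Sheets")`, and the example URLs are made up.

```javascript
// Plain-JavaScript equivalent of the mapping inside the expression.
// Each row is assumed to carry the sheet column under json.Link.
function buildStartUrls(rows) {
  return JSON.stringify(
    rows.map((item) => ({ url: item.json.Link, method: "GET" }))
  );
}

// Example rows (made-up URLs for illustration only).
const rows = [
  { json: { Link: "https://example.com/a" } },
  { json: { Link: "https://example.com/b" } },
];

// One string holding a JSON array with one entry per sheet row.
console.log(buildStartUrls(rows));
```

The result is a single string containing the whole array, so one request can carry every link instead of one request per row.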

Here is the JSON of the HTTP node:

{
  "aggressivePrune": true,
  "clickElementsCssSelector": ".jobTitle > a, .job-card-container__link",
  "crawlerType": "playwright:adaptive",
  "debugLog": false,
  "debugMode": false,
  "dynamicContentWaitSecs": 10,
  "expandIframes": true,
  "htmlTransformer": "readableText",
  "ignoreCanonicalUrl": false,
  "initialConcurrency": 5,
  "keepUrlFragments": false,
  "maxConcurrency": 5,
  "maxCrawlDepth": 1,
  "maxCrawlPages": 500,
  "maxRequestRetries": 5,
  "maxResults": 5000,
  "maxScrollHeightPixels": 5000,
  "maxSessionRotations": 10,
  "proxyConfiguration": {
      "useApifyProxy": true,
      "apifyProxyGroups": [
          "BUYPROXIES94952"
      ]
  },
  "readableTextCharThreshold": 100,
  "removeCookieWarnings": true,
  "saveFiles": false,
  "saveHtml": false,
  "saveHtmlAsFile": false,
  "saveMarkdown": true,
  "saveScreenshots": false,
  "startUrls": {{ $json["startUrls"] }},
  "useSitemaps": false
}
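
The placeholder in "startUrls" is written without surrounding quotes because the Set node already produced a JSON string. The sketch below imitates that substitution with plain string replacement to show the result parses as valid JSON. It is a simplified stand-in, not n8n's actual expression engine; `renderBody`, the shortened template, and the URLs are assumptions for illustration.

```javascript
// Simplified stand-in for filling in the placeholder (plain string replacement).
// A replacer function is used so "$" in the inserted text is never treated specially.
function renderBody(template, startUrlsJson) {
  return template.replace('{{ $json["startUrls"] }}', () => startUrlsJson);
}

// Shortened version of the request body, keeping only a few fields.
const template =
  '{ "maxCrawlDepth": 1, "startUrls": {{ $json["startUrls"] }}, "useSitemaps": false }';

// What the Set node would output for two made-up links.
const startUrlsJson = JSON.stringify([
  { url: "https://example.com/a", method: "GET" },
  { url: "https://example.com/b", method: "GET" },
]);

// The rendered text is valid JSON with startUrls as a real array.
const body = JSON.parse(renderBody(template, startUrlsJson));
console.log(body.startUrls.length);
```

If the placeholder were wrapped in quotes, the array would arrive as a string rather than as a list of URL objects.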

But thanks for the help!

