Selenium Node for webscraping

The idea is:

Include a Selenium node that extends n8n’s web scraping abilities. Users could write their own WebDriver commands in JavaScript and send them to Selenium to perform the actions/tests, and the node would return whatever the WebDriver returns.

My use case:

Currently, without a node, I need to send a request to my own Python-based web app, which controls another instance (a Docker container) of standalone Selenium + browser by sending commands in Python. Instead, a node could send commands to Selenium (either installed on the same machine or remotely hosted) using JavaScript.

I think it would be beneficial to add this because:

It would add many web scraping features to n8n. For example, it can command a browser and emulate human interactions with web pages, as well as scrape websites for data. It's also free.

Any resources to support this?

Are you willing to work on this?

Yeah, I will put my input in whenever it's needed.

Okay, so I found that Selenium has a Docker image for standalone Selenium with your choice of browser, which makes it simple to set up. You just need to send the WebDriver requests to its port and it will run remotely, which seems fairly simple.

A bit of an update: I have successfully used Selenium with n8n, but it's a complicated process. My setup:

Selenium Grid running 4 Selenium Firefox nodes (run with Docker Compose)
Self-built Python Flask web app that takes HTTP requests from n8n and sends the jobs to Selenium Grid over the local network (also running in a Docker container)
n8n (also in a Docker container)

How it works: the job is sent out via HTTP request, which is received by the Python Flask app. The Python app then sends the request to Selenium to click a bunch of buttons and input data to download a PDF. After that I use Python to return the PDF file to n8n, then use n8n to format the filename and save it on the server.

It would be nice to have a node instead of the Python web app, just to send simple Selenium tasks in JavaScript to Selenium Grid to execute.

Hi @Josh-Ghazi

Why not give it a go yourself? You seem to have it figured out. :wink:
There was a webinar by Marcus not too long ago on how to start building your own node.
It’s not too difficult, and there will be plenty of people around to help you if you want to try it yourself.

Thank you for the encouragement. I'm still new to coding and computer science in general; however, it is something that I would really like to achieve. So I will try to watch that video and see if it's something I can tackle.

If I can give back to the community, it would be all the incentive I need to do it.

You will learn as you go.
In my experience it is a good way to start your TypeScript journey.

@Josh-Ghazi perhaps you can clone the Puppeteer community node and use it as a model :wink:

If I had known, I would have probably just used Puppeteer.

Both options are good :+1:

By the way, I think the Puppeteer community node is broken :frowning:

I think it needs a bit more configuration to get it to work. There should be a topic on the community about this.

Hi @Josh-Ghazi, I am also struggling with Selenium and want to do some testing.
I also ended up with a Docker Selenium Grid, sending HTTP requests to the Selenium Hub like this:

1. Create a Session
POST http://localhost:4444/wd/hub/session

Body Payload:

{
  "desiredCapabilities": {
    "browserName": "firefox",
    "timeouts": {
      "implicit": 5000,
      "pageLoad": 300000,
      "script": 30000
    }
  },
  "capabilities": {
    "firstMatch": [{}],
    "alwaysMatch": {
      "browserName": "firefox"
    }
  }
}

2. Navigate to a Page
POST http://localhost:4444/wd/hub/session/“sessionId”/url

Body Payload:

{
    "url": "https://www.google.com"
}

3. Take a Screenshot
GET http://localhost:4444/wd/hub/session/“sessionId”/screenshot

The screenshot is returned as a Base64 value and has to be converted with the “Move Binary Data” node.

4. Delete Session
DELETE http://localhost:4444/wd/hub/session/“sessionId”
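
The four steps above can be sketched as plain request descriptors plus a small sender, for example inside a Code node or a standalone Node.js script. This is only a sketch under assumptions: the helper names (`createSession`, `navigate`, `screenshot`, `deleteSession`, `decodeScreenshot`, `send`, `run`) are made up for illustration, the session payload is reduced to the `capabilities.alwaysMatch` part of the body shown in step 1, and `fetch` is assumed to be available globally (Node 18 or newer). The `“sessionId”` placeholder in the URLs is taken from `value.sessionId` in the response to step 1.

```javascript
// Hub URL as used in the post; adjust to your own grid.
const HUB = "http://localhost:4444/wd/hub";

// 1. Create a session (W3C-style capabilities only, a simplification).
function createSession(browserName = "firefox") {
  return {
    method: "POST",
    url: `${HUB}/session`,
    body: { capabilities: { alwaysMatch: { browserName } } },
  };
}

// 2. Navigate to a page.
function navigate(sessionId, pageUrl) {
  return { method: "POST", url: `${HUB}/session/${sessionId}/url`, body: { url: pageUrl } };
}

// 3. Take a screenshot.
function screenshot(sessionId) {
  return { method: "GET", url: `${HUB}/session/${sessionId}/screenshot` };
}

// 4. Delete the session.
function deleteSession(sessionId) {
  return { method: "DELETE", url: `${HUB}/session/${sessionId}` };
}

// The screenshot response carries the Base64-encoded PNG in its "value" field.
function decodeScreenshot(response) {
  return Buffer.from(response.value, "base64");
}

// Sends one descriptor and returns the parsed JSON response.
async function send(request) {
  const res = await fetch(request.url, {
    method: request.method,
    headers: { "Content-Type": "application/json" },
    body: request.body ? JSON.stringify(request.body) : undefined,
  });
  return res.json();
}

// Full round trip; not called here because it needs a running grid.
async function run() {
  const created = await send(createSession());
  const sessionId = created.value.sessionId;
  try {
    await send(navigate(sessionId, "https://www.google.com"));
    return decodeScreenshot(await send(screenshot(sessionId)));
  } finally {
    // Always release the browser slot on the grid.
    await send(deleteSession(sessionId));
  }
}
```

Deleting the session in a `finally` block is a deliberate choice in this sketch: a failed step would otherwise leave the browser session occupying a grid slot.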


My problem is that I cannot get the scripts I created with Selenium IDE to work.

I do the following, which fails with timeouts (I removed the timeouts and also multiplied them, with no change in the result).

POST http://localhost:4444/wd/hub/session/“sessionId”/execute/async

The script has to be passed through {{ JSON.stringify($json["testscript"]) }} to be recognized, but the requests still time out.

Body Payload:

{
    "script": "('testrun', async function() {\n    await driver.get(\"https://www.google.com/\")\n    await driver.manage().window().setRect({ width: 1075, height: 896 })\n    await driver.findElement(By.css(\"#W0wltc > .QS5gu\")).click()\n    await driver.findElement(By.name(\"q\")).click()\n    await driver.findElement(By.name(\"q\")).sendKeys(\"selenium webdriver api\")\n    await driver.findElement(By.css(\"form > div:nth-child(1)\")).click()\n    await driver.findElement(By.css(\"center:nth-child(1) > .gNO89b\")).click()\n    {\n      const element = await driver.findElement(By.css(\".MjjYud:nth-child(1) > div:nth-child(1) > .g:nth-child(1) .LC20lb:nth-child(2)\"))\n      await driver.actions({ bridge: true }).moveToElement(element).perform()\n    }\n    await driver.findElement(By.css(\".MjjYud:nth-child(1) > div:nth-child(1) > .g:nth-child(1) .LC20lb:nth-child(2)\")).click()\n    await driver.findElement(By.css(\"#m-documentationwebdriveractions_api > span\")).click()\n  })"
  }

Non-stringified script (just a simple demo browsing Google for "selenium webdriver api"):

('testrun', async function() {
    await driver.get("https://www.google.com/")
    await driver.manage().window().setRect({ width: 1075, height: 896 })
    await driver.findElement(By.css("#W0wltc > .QS5gu")).click()
    await driver.findElement(By.name("q")).click()
    await driver.findElement(By.name("q")).sendKeys("selenium webdriver api")
    await driver.findElement(By.css("form > div:nth-child(1)")).click()
    await driver.findElement(By.css("center:nth-child(1) > .gNO89b")).click()
    {
      const element = await driver.findElement(By.css(".MjjYud:nth-child(1) > div:nth-child(1) > .g:nth-child(1) .LC20lb:nth-child(2)"))
      await driver.actions({ bridge: true }).moveToElement(element).perform()
    }
    await driver.findElement(By.css(".MjjYud:nth-child(1) > div:nth-child(1) > .g:nth-child(1) .LC20lb:nth-child(2)")).click()
    await driver.findElement(By.css("#m-documentationwebdriveractions_api > span")).click()
  })

Could anyone assist with this problem? With this working, we could start a Selenium node that requires your Selenium Hub URL and can be used with predefined commands.

Docs: WebDriver / Grid endpoints | Selenium

What's the error? Did you try to open up Selenium Grid and connect to the machines via VNC to see if they are starting up or getting stuck?

I had some discussions on Selenium channels and found out that it is not possible to just send the script generated in the “Selenium IDE” to the HTTP endpoint of the grid. It would somehow have to be converted to the WebDriver API, which is not one of the outputs of the IDE.

That's what I got on the Selenium Slack channel:
"The first line is driver.get(...), which translates into a POST /session/{sessionId}/url. Then the next line is driver.manage().window().setRect(...), which translates to POST /session/{sessionId}/window/rect. The next line is driver.findElement(...), which translates to POST /session/{sessionId}/element. And so on, for every statement in your recorded script."
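
The translation described in the Slack quote can be sketched as a small lookup from recorded steps to WebDriver HTTP requests. This is a hedged illustration, not a converter for Selenium IDE exports: the step format (`command`, `url`, `css`, and so on) and the function names are invented here. The first three endpoints come from the quote; the click and send-keys endpoints and the element key are taken from the W3C WebDriver specification.

```javascript
// Turns one recorded step (hypothetical format) into a WebDriver request.
function toWebDriverRequest(sessionId, step) {
  const base = `/session/${sessionId}`;
  switch (step.command) {
    case "get": // driver.get(url)
      return { method: "POST", path: `${base}/url`, body: { url: step.url } };
    case "setRect": // driver.manage().window().setRect({ width, height })
      return {
        method: "POST",
        path: `${base}/window/rect`,
        body: { width: step.width, height: step.height },
      };
    case "findElement": // driver.findElement(By.css(selector))
      return {
        method: "POST",
        path: `${base}/element`,
        body: { using: "css selector", value: step.css },
      };
    case "click": // element.click(), needs the id returned by findElement
      return { method: "POST", path: `${base}/element/${step.elementId}/click`, body: {} };
    case "sendKeys": // element.sendKeys(text)
      return {
        method: "POST",
        path: `${base}/element/${step.elementId}/value`,
        body: { text: step.text },
      };
    default:
      throw new Error(`Unsupported command: ${step.command}`);
  }
}

// A find-element response stores the element id under this fixed W3C key.
const W3C_ELEMENT_KEY = "element-6066-11e4-a52e-4f735466cecf";

function elementIdFrom(response) {
  return response.value[W3C_ELEMENT_KEY];
}

// The first lines of the recorded demo script, expressed as steps.
const requests = [
  { command: "get", url: "https://www.google.com/" },
  { command: "setRect", width: 1075, height: 896 },
  { command: "findElement", css: "[name='q']" },
].map((step) => toWebDriverRequest("SESSION_ID", step));
```

Each element interaction takes two requests, one to find the element and one to act on the returned id, which is why a recorded script of a dozen lines turns into a long chain of HTTP calls.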

I got this error message on every attempt to POST the script:

"500 - {"value":{"error":"script timeout","message":"Timed out after 30000 ms","stacktrace":"RemoteError@chrome://remote/content/shared/RemoteError.sys.mjs:8:8\nWebDriverError@chrome://remote/content/shared/webdriver/Errors.sys.mjs:180:5\nScriptTimeoutError@chrome://remote/content/shared/webdriver/Errors.sys.mjs:442:5\nevaluate.sandbox/timeoutPromise</scriptTimeoutID<@chrome://remote/content/marionette/evaluate.sys.mjs:95:11\nnotify@resource://gre/modules/Timer.sys.mjs:49:17\n"}}"
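
A likely reading of this error: the execute/async endpoint runs the script as a function body inside the browser page, where `driver` and `By` do not exist, and it waits until the script calls the callback that WebDriver passes as the last argument. The recorded script only defines a function and never calls that callback, so the request runs until the script timeout. A minimal sketch of a payload that does complete, assuming a hypothetical helper `asyncScriptPayload` and a trivial in-page script:

```javascript
// Builds the body for POST /session/{sessionId}/execute/async.
// "args" is passed to the script as its leading arguments.
function asyncScriptPayload(scriptBody, args = []) {
  return { script: scriptBody, args };
}

// This code runs in the page, not in Node: only browser globals are available.
// WebDriver appends a callback as the last argument; calling it ends the script.
const script = `
  const done = arguments[arguments.length - 1];
  setTimeout(() => done(document.title), 500);
`;

const payload = asyncScriptPayload(script);

// JSON.stringify escapes the newlines and quotes in the script text.
const body = JSON.stringify(payload);
```

Whatever is passed to `done` comes back in the `value` field of the response. Browser steps such as navigating or clicking still have to go through their own WebDriver endpoints rather than through this script.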

I asked around and tried to find out whether there is any way to translate the code I get from the Selenium IDE into valid commands that I can send to the Selenium Grid endpoint, but so far without success…

Hopefully there are some friends here on n8n who would also like this kind of workflow integrated into n8n.

What do you think, is there a way to translate the commands automatically?

I am not really the deep coder kind of guy, but my strategy is the following: somewhere inside Selenium, there is a translation of commands from

  • C# NUnit
  • C# xUnit
  • Java JUnit
  • JavaScript Mocha
  • Python pytest
  • Ruby RSpec

to the correct WebDriver API, which is then used by the browser instances to run the automation.
Let's find out how that works and use it to make easy API calls to run our browser automations from n8n.

I use n8n to call a Python app that I built specifically for sending commands to Selenium Grid and dealing with the file that it needs to download (a PDF). It just receives an HTTP request from n8n with some data in the URL string, and it returns the PDF file to n8n for processing.

Sounds like you need something to control the Selenium Grid. I was hoping that n8n could just take over the Selenium handling so that no such intermediate Python app is required.

Yes, exactly, it would be perfect if we could just send our tests to the endpoint.
I am also interested in not having the Selenium Grid always on standby and wasting resources. I was thinking about leaving the Grid Hub running and only starting separate test node containers when there is an incoming request. That way it could also scale quickly for any test situation.
Maybe we could connect directly, @Josh-Ghazi, and see how far we get?

Has anyone heard news on the Selenium feature?

I managed to get some quite big tests running, but the number of nodes and the effort are too much to run them in production through n8n right now.

Are there maybe other n8n browser automation nodes available that can be self-hosted and have features similar to Selenium's?
