Selenium Node for webscraping

The idea is:

Add a Selenium node that extends n8n’s web scraping abilities. Users could write their own WebDriver commands in JavaScript and send them to Selenium to perform the actions or tests, and the node would return whatever the WebDriver returns.

My use case:

Currently, without a node, I have to send a request to my own Python-based web app, which controls a separate instance (a Docker container) of standalone Selenium plus a browser by sending commands in Python. Instead, a node could send commands in JavaScript to a Selenium instance, either installed on the same machine or hosted remotely.

I think it would be beneficial to add this because:

It would add many web scraping features to n8n. For example, it can drive a browser and emulate human interaction with web pages, as well as scrape websites for data. It’s also free.

Any resources to support this?

Are you willing to work on this?

Yes, I will give my input whenever it’s needed.

Okay, so I found that Selenium has a Docker image for standalone Selenium with your choice of browser, which makes it simple to set up. You just send the WebDriver requests to its port and it runs remotely, which seems fairly simple.
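To make "send the WebDriver requests to its port" a bit more concrete, here is a minimal sketch using only the Python standard library. It assumes the container is reachable at `http://localhost:4444` (4444 is the default port of the standalone images) and that it speaks the W3C WebDriver protocol. The helper names `webdriver_request`, `send`, and `fetch_title` are made up for illustration.

```python
import json
import urllib.request

# Assumed address: 4444 is the default port of the Selenium standalone images.
SELENIUM_URL = "http://localhost:4444"


def webdriver_request(method, path, payload=None, base_url=SELENIUM_URL):
    """Build (but do not send) a W3C WebDriver HTTP request."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        base_url + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json; charset=utf-8"},
    )


def send(req):
    """Send a request and return the "value" field of the JSON reply."""
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["value"]


def fetch_title(url, browser="firefox"):
    """Open a session, load a page, read its title, and close the session."""
    caps = {"capabilities": {"alwaysMatch": {"browserName": browser}}}
    session = send(webdriver_request("POST", "/session", caps))
    base = "/session/" + session["sessionId"]
    try:
        send(webdriver_request("POST", base + "/url", {"url": url}))
        return send(webdriver_request("GET", base + "/title"))
    finally:
        # Always release the browser, even if a step above failed.
        send(webdriver_request("DELETE", base))
```

Calling `fetch_title("https://example.com")` only works with a Selenium container actually listening at the assumed address. In practice a client library such as the official Selenium bindings wraps these same HTTP calls.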

A bit of an update: I have successfully used Selenium with n8n, but it’s a complicated process. My setup:

Selenium Grid running 4 Selenium Firefox nodes (run with Docker Compose)
A self-built Python Flask web app that takes HTTP requests from n8n and sends the jobs to Selenium Grid over the local network (also running in a Docker container)
n8n (also in a Docker container)

How it works: n8n sends the job out via an HTTP request, which the Python Flask app receives. The app then tells Selenium to click a bunch of buttons and enter data to download a PDF. After that, I use Python to return the PDF file to n8n, and then use n8n to format the filename and save it on the server.
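For readers trying to picture that bridge, a rough sketch of the Flask side could look like the following. This is not the actual app from this setup. The step format, the `/jobs` route, and the `selenium-hub` host name are all assumptions made for illustration. It returns the final page title and URL as JSON, and it leaves out the PDF download and return step because the details of that step are not described here.

```python
# CSS is the same locator string that Selenium's By.CSS_SELECTOR holds.
CSS = "css selector"

# Hypothetical Grid address; "selenium-hub" is an assumed Docker service name.
GRID_URL = "http://selenium-hub:4444"


def run_steps(driver, steps):
    """Replay a list of simple steps on a WebDriver-like object.

    Each step is a dict in a made-up format, for example:
      {"action": "open", "url": "https://example.com"}
      {"action": "click", "selector": "#submit"}
      {"action": "type", "selector": "#name", "text": "hello"}
    """
    for step in steps:
        action = step["action"]
        if action == "open":
            driver.get(step["url"])
        elif action == "click":
            driver.find_element(CSS, step["selector"]).click()
        elif action == "type":
            driver.find_element(CSS, step["selector"]).send_keys(step["text"])
        else:
            raise ValueError(f"unknown action: {action!r}")
    return {"title": driver.title, "url": driver.current_url}


def create_app():
    """Build the Flask app; imports are local so run_steps works without them."""
    from flask import Flask, jsonify, request
    from selenium import webdriver

    app = Flask(__name__)

    @app.post("/jobs")  # hypothetical route that n8n's HTTP Request node calls
    def jobs():
        steps = request.get_json()["steps"]
        driver = webdriver.Remote(
            command_executor=GRID_URL,
            options=webdriver.FirefoxOptions(),
        )
        try:
            return jsonify(run_steps(driver, steps))
        finally:
            # Free the Grid slot whether or not the steps succeeded.
            driver.quit()

    return app
```

With Flask and Selenium installed, something like `create_app().run(host="0.0.0.0", port=5000)` would start the bridge, and an n8n HTTP Request node could then POST a JSON body with a `steps` list to `/jobs`. Keeping `run_steps` separate from the Flask route means the step logic can be tried against a fake driver without a running Grid.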

It would be nice to have a node instead of the Python web app, just to send simple Selenium tasks in JavaScript to Selenium Grid to execute.

Hi @Josh-Ghazi

Why not give it a go yourself? You seem to have it figured out. :wink:
There was a webinar by Marcus not too long ago on how to start building your own node.
It’s not too difficult, and there will be plenty of people around to help you if you want to try it yourself.

Thank you for the encouragement. I’m still new to coding and computer science in general, but it is something I would really like to achieve. So I will try to watch that video and see if it’s something I can tackle.

If I can give back to the community, that would be all the incentive I need to do it.

You will learn as you go.
In my experience, it is a good way to start your TypeScript journey.

@Josh-Ghazi perhaps you can clone the Puppeteer community node and use it as a model :wink:

If I had known, I would probably have just used Puppeteer.

Both options are good :+1:

By the way, I think the Puppeteer community node is broken :frowning:

I think it needs a bit more configuration to get it to work. There should be a topic on the community forum about this.