The idea is:
Introduce a set of native browser automation nodes in n8n that enable users to automate web interactions similarly to tools like Selenium, Puppeteer, or Playwright. These nodes would allow users to open a browser session (headless or visible), navigate to URLs, click buttons, type text, scroll, capture screenshots, and extract DOM content by selectors or with the help of an AI agent.
Suggested nodes could include:
- Open Browser / New Session
- Navigate to URL
- Click Element (by ID, class, XPath, text, etc.)
- Type Text
- Click Button
- Swipe Horizontal
- Scroll to Element / Position
- Wait (for element / time)
- Extract Content
- Run AI Agent to Parse Page
- Screenshot
- Close Browser
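To make the node list above more concrete, here is a minimal sketch of how a few of these nodes could share one step model and run against a pluggable browser backend. Every name in it (`Locator`, `BrowserStep`, `BrowserDriver`, `toSelector`, `runSteps`, `createFakeDriver`) is hypothetical and is not part of n8n. The selector strings follow Playwright's selector syntax (`#id`, `.class`, `xpath=`, `text=`), which is an assumption about the backend, and the fake driver only records calls so the sketch runs without a browser.

```typescript
// Hypothetical step model: one variant per suggested node. Not an n8n API.
type Locator =
  | { by: "id"; value: string }
  | { by: "class"; value: string }
  | { by: "xpath"; value: string }
  | { by: "text"; value: string };

type BrowserStep =
  | { kind: "navigate"; url: string }
  | { kind: "click"; target: Locator }
  | { kind: "type"; target: Locator; text: string }
  | { kind: "waitFor"; target: Locator }
  | { kind: "extract"; target: Locator; saveAs: string };

// Minimal backend surface (assumed). A real backend could delegate to
// Playwright's page.goto, page.click, page.fill, page.waitForSelector
// and page.textContent.
interface BrowserDriver {
  goto(url: string): Promise<void>;
  click(selector: string): Promise<void>;
  fill(selector: string, text: string): Promise<void>;
  waitFor(selector: string): Promise<void>;
  text(selector: string): Promise<string | null>;
}

// Turn a locator into a Playwright-style selector string.
// Values are not escaped here; a real node would need to handle that.
function toSelector(loc: Locator): string {
  switch (loc.by) {
    case "id":
      return `#${loc.value}`;
    case "class":
      return `.${loc.value}`;
    case "xpath":
      return `xpath=${loc.value}`;
    case "text":
      return `text=${loc.value}`;
  }
}

// Run the steps in order and collect whatever the "extract" steps return.
async function runSteps(
  driver: BrowserDriver,
  steps: BrowserStep[],
): Promise<Record<string, string | null>> {
  const extracted: Record<string, string | null> = {};
  for (const step of steps) {
    switch (step.kind) {
      case "navigate":
        await driver.goto(step.url);
        break;
      case "click":
        await driver.click(toSelector(step.target));
        break;
      case "type":
        await driver.fill(toSelector(step.target), step.text);
        break;
      case "waitFor":
        await driver.waitFor(toSelector(step.target));
        break;
      case "extract":
        extracted[step.saveAs] = await driver.text(toSelector(step.target));
        break;
    }
  }
  return extracted;
}

// In-memory fake driver for illustration: it records each call and
// returns canned text per selector instead of touching a real page.
function createFakeDriver(pageText: Record<string, string>) {
  const calls: string[] = [];
  const driver: BrowserDriver = {
    async goto(url) { calls.push(`goto ${url}`); },
    async click(selector) { calls.push(`click ${selector}`); },
    async fill(selector, text) { calls.push(`fill ${selector} ${text}`); },
    async waitFor(selector) { calls.push(`waitFor ${selector}`); },
    async text(selector) {
      calls.push(`text ${selector}`);
      return pageText[selector] ?? null;
    },
  };
  return { driver, calls };
}
```

A login check would then be a list of steps (navigate, type, click, waitFor, extract) passed to `runSteps`, with each step corresponding to one node on the canvas. Keeping the backend behind a small interface is only one possible design; it would let the same nodes run on Puppeteer, Playwright, or Selenium without changing the workflow.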
My use case:
I want to create automated workflows that can:
- Test the UI and flow of a web application (e.g., log in, click through, validate output)
- Scrape and extract structured data from websites (e.g., product listings or stock prices)
- Trigger downstream logic based on dynamic webpage content
- Automate daily interactions like logging into portals or downloading reports
- Visually build UI test cases or web agents inside n8n without external scripts
Right now, I have to use external tools like Puppeteer, Selenium, or browser extensions and tie them back to n8n using HTTP nodes or custom scripts, which breaks the low-code flow.
I think it would be beneficial to add this because:
- It enables fully visual UI automation inside n8n, with no coding required
- It expands n8n’s use cases into RPA (Robotic Process Automation) and web test automation
- It makes n8n a more powerful alternative to tools like UiPath, Zapier + Browserflow, and Selenium IDE
- It allows building AI-driven agents that can browse, interact with, and scrape pages
- It reduces dependency on third-party tools and complex custom node execution setups
Any resources to support this?
- Puppeteer Documentation
- Playwright Automation Docs
- Selenium WebDriver
- Browserflow (browser extension with a Zapier integration) – demonstrates similar functionality
- OpenAI GPT Browser Agents – useful for AI-assisted page navigation and parsing
Are you willing to work on this?
I would be happy to test or contribute feedback on early node implementations or collaborate on use case documentation.