Why does n8n break at 0.653 MB of HTML?

Jasper_Ruys · January 28, 2023, 3:27pm

Dear developers of n8n,

I love your platform, but please make it more web-scraping friendly.

Please give a detailed explanation of how to retrieve nested elements. The documentation gives an example of …>…, but it does not explain how it works.
Often it is easier to find an ID than a class.
You guys are on the no-code bandwagon, which I love.

When you do an HTTP request, please clean up the code using the Beautiful Soup library.

Then structure the output of the node by element, like a parent-and-children overview of the web page.

Then I only need to select the element to retrieve its data.

Every growth hacker, marketeer, developer, data scientist, and data analyst will thank you if you actually build this.

Please don’t hesitate to reach out to me. I am happy to highlight some other UX improvements that will help you get more customers and a more loyal user base.

To recap: you guys make an amazing platform, just make it marketeer-friendly. We suck at coding.

Jon · January 28, 2023, 4:37pm

Hey @Jasper_Ruys,

Welcome to the community

Thanks for the feedback. I do have one question though: what do you mean by “why does n8n break at 0.653MB”? I can’t see any reference to this, or an error message or example, in your post.

If you have a link to the docs page, we can pass it to our docs team to look at. As a starting point, though, it would be worth playing with CSS selectors more: they are very cool and you can do some neat things with them.

CSS Selectors: CSS Selectors Reference

Jasper_Ruys · January 28, 2023, 11:04pm

Thanks for that reference, that is a great explainer. Sorry for asking such basic questions, I am bad at coding.

And thanks for responding so quickly!

Jon · January 29, 2023, 5:03pm

Hey @Jasper_Ruys,

Perfect, that doesn’t look like anything is broken. We don’t automatically load all of the data in the UI if it is over a certain size, to help prevent the browser from crashing under the extra data. You can still manually view the data if needed.

Jasper_Ruys · January 29, 2023, 5:33pm

I will use the webhooks tomorrow, because it is the weekend.

The problem is in the later part of the automation, where the browser crashes.

RedPacketSec · January 30, 2023, 8:33am

You need to use a CSS selector to extract the data; what you are doing appears to be incorrect.

For example, if you wanted to extract the title, you would use something like:

#ember7 > header > div > div > div.extra-info-wrapper.two-rows > div > div > h1

and make sure that the information is actually in the data shown in the pane on the left.
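
For anyone wondering how a selector like the one above is read, here is a minimal Python sketch using only the standard library. It is not how n8n works internally, just an illustration. The sample HTML and the names `Node`, `TreeBuilder`, `matches` and `select` are made up for this example, and the sketch only understands a tag name, `#id`, `.class` and the `>` (direct child) combinator.

```python
import re
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}  # tags with no closing tag


class Node:
    """One HTML element: tag, id, classes, children and its own text."""

    def __init__(self, tag, attrs, parent=None):
        self.tag = tag
        self.id = attrs.get("id")
        self.classes = set((attrs.get("class") or "").split())
        self.children = []
        self.text = ""
        self.parent = parent
        if parent is not None:
            parent.children.append(self)


class TreeBuilder(HTMLParser):
    """Builds a parent/children tree from well-formed HTML."""

    def __init__(self):
        super().__init__()
        self.root = Node("[document]", {})
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, dict(attrs), self.current)
        if tag not in VOID_TAGS:
            self.current = node  # following content is nested inside this element

    def handle_endtag(self, tag):
        if self.current.parent is not None and self.current.tag == tag:
            self.current = self.current.parent

    def handle_data(self, data):
        self.current.text += data


def matches(node, simple):
    """Check one step such as 'h1', '#ember7' or 'div.extra-info-wrapper.two-rows'."""
    tag = re.match(r"[\w-]*", simple).group(0)
    ids = re.findall(r"#([\w-]+)", simple)
    classes = re.findall(r"\.([\w-]+)", simple)
    return ((not tag or node.tag == tag)
            and all(node.id == i for i in ids)
            and set(classes) <= node.classes)


def descendants(node):
    for child in node.children:
        yield child
        yield from descendants(child)


def select(root, selector):
    """The first step may match anywhere; each '>' then moves to direct children only."""
    steps = [s.strip() for s in selector.split(">")]
    found = [n for n in descendants(root) if matches(n, steps[0])]
    for step in steps[1:]:
        found = [c for n in found for c in n.children if matches(c, step)]
    return found


# Hypothetical page fragment, shaped to fit the selector from the post above.
HTML = """
<div id="ember7">
  <header>
    <div><div>
      <div class="extra-info-wrapper two-rows">
        <div><div><h1>Example title</h1></div></div>
      </div>
    </div></div>
  </header>
  <h1 class="other">Not this one</h1>
</div>
"""
SELECTOR = ("#ember7 > header > div > div > "
            "div.extra-info-wrapper.two-rows > div > div > h1")

builder = TreeBuilder()
builder.feed(HTML)
builder.close()

print([n.text.strip() for n in select(builder.root, SELECTOR)])        # ['Example title']
print([n.text.strip() for n in select(builder.root, "#ember7 > h1")])  # ['Not this one']
```

The second call returns only the other heading because `>` means "direct child": the nested `h1` sits several levels below `#ember7`, so `#ember7 > h1` skips it. An ID is written as `#name`, and several classes on the same element are chained as `.a.b`.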
