HTML to JSON

I am trying to extract specific values from a site. I have the CSS selectors, but how can I turn the parsed HTML into JSON?

Please don’t create multiple topics on the same question.

Also, your question is really generic. Please give us more info on what you are doing and where you are stuck.

Hi @Roee_Aizman
You can use AI chat models like GPT, Gemini, and Claude to convert HTML to JSON.

Thanks @Msquare_Automation

I'm trying to reach the AI Agent with as little text as possible.

The HTML is 2.4 MB.

A specific selector will cut it down to a few KB.

I'm trying to turn the selector output into JSON before it reaches the AI Agent.
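For anyone reading along, here is a minimal sketch of that idea in plain Python (standard library only), not an n8n-specific solution. It keeps only the text of elements that carry a given attribute value and emits a small JSON string. The sample HTML, the `data-field` attribute, and the names `FieldExtractor` and `extract_fields` are all made up for illustration; a real page will need its own selectors.

```python
import json
from html.parser import HTMLParser

# Hypothetical stand-in for a large page; only two elements matter to us.
SAMPLE_HTML = """
<html><body>
  <nav>lots of markup the agent does not need</nav>
  <h1 data-field="name">Example Corp (EXM)</h1>
  <span data-field="price">123.45</span>
</body></html>
"""


class FieldExtractor(HTMLParser):
    """Collects the text of elements whose attribute value is in `wanted`."""

    def __init__(self, attr, wanted):
        super().__init__()
        self.attr = attr
        self.wanted = wanted        # maps attribute value -> output JSON key
        self.current_key = None     # JSON key of the element we are inside
        self.current_tag = None     # tag name that will close that element
        self.result = {}

    def handle_starttag(self, tag, attrs):
        value = dict(attrs).get(self.attr)
        if value in self.wanted:
            self.current_key = self.wanted[value]
            self.current_tag = tag
            self.result[self.current_key] = ""

    def handle_data(self, data):
        if self.current_key is not None:
            self.result[self.current_key] += data

    def handle_endtag(self, tag):
        # Simplification: assumes the matched element has no nested tag of the same name.
        if tag == self.current_tag:
            self.current_key = None
            self.current_tag = None


def extract_fields(html, attr, wanted):
    parser = FieldExtractor(attr, wanted)
    parser.feed(html)
    parser.close()
    return {key: text.strip() for key, text in parser.result.items()}


fields = extract_fields(SAMPLE_HTML, "data-field", {"name": "name", "price": "price"})
payload = json.dumps(fields)  # this small string is what would be passed on
print(payload)
```

The point is only that the JSON passed downstream is a fraction of the original markup; the matching logic here is deliberately simpler than a full CSS selector engine.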

Hey @Roee_Aizman

If you share the page and the selector it is going to make it much easier to help you :slight_smile:

For sure!
This is one URL: Yahoo is part of the Yahoo family of brands.

And also pages where there are multiple stocks, like: Yahoo is part of the Yahoo family of brands.

You can use a Code node to extract the relevant data from the HTML, but if it is stock data you are looking for, I would also recommend looking into some of the free APIs that offer this type of data.

You can certainly scrape it from Yahoo Finance, but it will probably not be the most elegant solution :slight_smile:

Thanks for providing the URLs. What exactly are you trying to capture from each page? You mentioned you already have selectors.

I agree with @Bearman here. There are so many services offering stock price data via easy-to-use APIs that scraping this info makes very little sense unless you have a very good reason to. Do you? :slight_smile:

Yes this is what I want.

Stock name and price.

Take a look at the resources like

or others offering stock prices through APIs.

I want to do it using n8n.

Yes, you can do it in n8n, these services expose APIs, which you can use from n8n.

Hey @jabbson

Thanks for your kind reply.

I am looking to understand how to do it natively using n8n.

Is it possible?

No worries. When you say natively, what do you mean?
If you send an API request from n8n, get a response and then compile a JSON out of it, is this considered “natively”?

Hey :slight_smile:

I mean parsing everything inside n8n until a JSON comes out.

No APIs, no webhook.

Just GET a URL

parse

JSON result
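As a rough illustration of those three steps (GET, parse, JSON), here is a small standalone Python sketch using only the standard library. It is not an n8n workflow, and the names `TitleParser`, `fetch`, and `page_to_json` are invented for this example. It only pulls the page `<title>`, which is enough to show the shape of the pipeline.

```python
import json
from html.parser import HTMLParser
from urllib.request import urlopen


class TitleParser(HTMLParser):
    """Collects the text inside the <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_data(self, data):
        if self.in_title:
            self.title += data

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False


def fetch(url):
    """Step 1: plain GET request, returning the body as text."""
    with urlopen(url, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def page_to_json(url, fetcher=fetch):
    """Steps 2 and 3: parse the HTML and return a JSON string."""
    html = fetcher(url)
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return json.dumps({"url": url, "title": parser.title.strip()})
```

Calling `page_to_json("https://example.com/")` would perform a real request; the `fetcher` parameter is there so the parsing can be tried on a saved HTML string without touching the network.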

Totally possible. But site owners do not like it when people do that, and many try to block pages requested by automation. For instance, when you request https://finance.yahoo.com/quotes/ the response is an error, even though there is no error whatsoever when you open it in the browser.

makes sense

just for learning

Just to learn scraping, you could get the weather, for instance. Here is an example:

I don't understand what you mean by “natively in n8n”.

There is no functional difference between sending a GET request to an API (which is also just a URL that returns some data) and sending a request to fetch an HTML file like a Yahoo Finance page.

The difference is that the API is designed for this purpose and will return nicely formatted data that you can use out of the box, while extracting from the HTML will give you a big mess that you need to spend a bunch of time parsing and cleaning to make any use of.
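To make that contrast concrete, here is a small Python sketch with two made-up response bodies: one shaped like a JSON API response and one shaped like an HTML fragment. Both payloads, the class names `sym` and `px`, and the function names are hypothetical and not taken from any real service.

```python
import json
import re

# Hypothetical API response: already structured.
API_BODY = '{"symbol": "EXM", "price": 123.45}'

# Hypothetical HTML fragment carrying the same information.
HTML_BODY = (
    '<div class="quote"><span class="sym">EXM</span>'
    '<span class="px">$123.45 USD</span></div>'
)


def from_api(body):
    # One call and the data is ready to use.
    return json.loads(body)


def from_html(body):
    # Regexes tied to this exact markup; any layout change breaks them.
    symbol = re.search(r'class="sym">([^<]+)<', body)
    price = re.search(r'class="px">[^<\d]*([\d.,]+)', body)
    if symbol is None or price is None:
        raise ValueError("page layout changed; patterns no longer match")
    return {
        "symbol": symbol.group(1).strip(),
        # Clean up currency formatting before converting to a number.
        "price": float(price.group(1).replace(",", "")),
    }


print(from_api(API_BODY))
print(from_html(HTML_BODY))
```

Both functions end up with the same dictionary, but the HTML path needs pattern matching, cleanup, and an error path for when the markup changes.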

Additionally, as mentioned earlier, many sites actively work to block scraping or make it difficult, which means that you often have to use additional measures beyond a simple GET request, such as spinning up an actual browser and/or impersonating a real user, to get the site to return the data you want.

If you want to get stock price data, using an API is by far the easiest and most correct approach and is easily doable in n8n (using the exact same method as getting an HTML page).

If you want to learn how to scrape websites more generally, that is entirely valid as well, and there are many guides around on that, but it is not really an n8n-related topic.