I want to build a workflow that sends an email or Telegram message with a list of 5 recipe suggestions, each with an image and a short one-line description.
My problems are:
How to reliably get images from a website? I will generally use the same website, so I could try to find a pattern, but what type of node would be best for this?
Getting an AI agent node to stick to a text limit and format. From my experience so far, I am best off just instructing it and clipping the response.
For the images part the most reliable approach is to avoid scraping raw pages if you can. Most recipe sites either expose structured metadata or load their content from an API in the background. If you inspect the network tab you can often find a JSON endpoint that already includes image URLs, titles and descriptions. Pulling from that with an HTTP request node is much more stable than trying to parse HTML.
If there is no API, look for structured data like JSON-LD in the page source. Many recipe sites include it for SEO and it usually contains clean image links. An HTML extract node can pull that out without relying on fragile selectors. Scraping visible elements should be the last option since layouts tend to change.
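As a rough sketch of the JSON-LD route: in an n8n Code node you could scan the fetched HTML for `application/ld+json` script blocks and pull out the schema.org `Recipe` fields. The field names here (`name`, `image`, `description`) are standard schema.org Recipe properties, but real pages sometimes nest things differently, so treat this as a starting point rather than a drop-in solution.

```javascript
// Sketch: extract the first schema.org Recipe object from a page's JSON-LD.
// Assumes standard schema.org markup; adjust for the specific site you use.
function extractRecipeJsonLd(html) {
  const matches = html.matchAll(
    /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi
  );
  for (const m of matches) {
    try {
      const data = JSON.parse(m[1]);
      // JSON-LD can be a single object, an array, or wrapped in "@graph"
      const nodes = Array.isArray(data) ? data : data['@graph'] || [data];
      const recipe = nodes.find(n => n && n['@type'] === 'Recipe');
      if (recipe) {
        return {
          title: recipe.name,
          // "image" may be a single URL or an array of URLs
          image: Array.isArray(recipe.image) ? recipe.image[0] : recipe.image,
          description: recipe.description,
        };
      }
    } catch (e) {
      // ignore malformed JSON blocks and keep scanning
    }
  }
  return null;
}
```

In a Code node you would run this over the HTML returned by the HTTP Request node and pass the resulting object downstream.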
For the AI formatting issue, what tends to work better is not relying on the model to behave perfectly in one shot. Instead define a very rigid structure in the prompt and then pass the output through a second step that enforces it. For example, asking for a strict JSON array with fixed fields and then validating or trimming it in a code node gives you much more consistent results than plain text instructions.
Also, instead of asking for exactly five items in a conversational way, framing it as a hard constraint inside a structured format reduces drift. Then if it still over generates, clipping becomes predictable because the structure is consistent.
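To make that concrete, here is a hedged sketch of the enforcement step in a Code node. It assumes the model was asked for a JSON array of objects with `title`, `image`, and `description` fields and a 100-character description limit; the field names and the limit are just illustrative choices, not anything fixed by n8n.

```javascript
// Sketch: validate and clip model output to exactly five well-formed recipes.
// Field names and the 100-char limit are assumptions; match them to your prompt.
function enforceRecipeSchema(rawText) {
  // Strip markdown fences the model sometimes wraps around JSON
  const cleaned = rawText.replace(/^```(?:json)?\s*|\s*```$/g, '').trim();
  let items;
  try {
    items = JSON.parse(cleaned);
  } catch (e) {
    throw new Error('Model output was not valid JSON: ' + e.message);
  }
  if (!Array.isArray(items)) throw new Error('Expected a JSON array');
  return items
    .filter(i => i && i.title && i.image && i.description) // drop incomplete entries
    .slice(0, 5)                                           // clip over-generation
    .map(i => ({
      title: String(i.title),
      image: String(i.image),
      description: String(i.description).slice(0, 100),    // hard text limit
    }));
}
```

Because the structure is fixed, clipping to five items and trimming descriptions is deterministic no matter how the model drifts.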
There is a simple way to combine both parts so you always end up with five items, each with an image, title and short description, without depending too much on the model behaving perfectly. I have a setup that handles the extraction and formatting cleanly even when the source data is messy.
Hi @dracula_show!
Umm, if you are using a fixed site, the easiest and most stable way to get images reliably is to fetch the page HTML with an HTTP Request node and then use the HTML node's extract content operation. Once you have the extracted HTML, target the image with a CSS selector; since you are using the same site every time, that selector should stay the same, so just map it once and you will get the image URL or source. From there you can either download the image or embed it. This might sound a little hectic, but once set up it works really well for a single site. The main work is finding the specific section where the image is stored so you can map it down and extract the URL and information; after that, everything falls into place.
And for your output, just use a well-defined, structured system prompt with a good model like GPT-4o or higher, and then attach a Structured Output Parser for an extra layer of enforcement; there you can specify the length and other parameters.
Also, you can use the Edit Image node to customize the images further.
nice breakdown from both of you. honestly the biggest win is figuring out your output shape before prompting — schema-first means the model just fills blanks instead of freestyle parsing, way fewer format errors.