[Help Needed] Extracting data from HTML through code block

I am running a Google search and using this code
"// Get all input items
const items = $input.all();
const allExtractedData = [];

// Loop through each item
for (const item of items) {
    try {
        // Get HTML content, handle both string and object cases
        let htmlContent = item.json;
        if (typeof htmlContent === 'object' && htmlContent.data) {
            htmlContent = htmlContent.data;
        }

        const htmlString = typeof htmlContent === 'string'
            ? htmlContent
            : JSON.stringify(htmlContent);

        const regex = /https:\/\/www\.reddit\.com\/r\/([\w\d_]+)\/comments\/([\w\d_]+)/g;
        let matches;

        // Extract subreddit and comment ID
        while ((matches = regex.exec(htmlString)) !== null) {
            allExtractedData.push({
                subreddit: matches[1],
                id: matches[2]
            });
        }
    } catch (error) {
        console.error('Error processing item:', item, error);
    }
}

// Remove duplicates
const uniqueData = [
    ...new Map(allExtractedData.map(item =>
        [`${item.subreddit}-${item.id}`, item]
    )).values()
];

// Map the result for the desired output format
const result = uniqueData.map(data => ({ json: data }));

// Log the final result
console.log(result);

return result;
"
to extract the information.

Everything runs perfectly fine the first few times, but then the execution hangs indefinitely and never gets out of this code block.

Surprisingly, when I create a new account and run the same code, it works perfectly fine the first few times and then shows the same issue.

Has anyone faced the same problem, or is this a way for n8n to limit usage of the code block?


Hi @ConversationalizeAI

This could be down to memory issues. Looking at your code, you’re accumulating all matches in memory before deduplicating them. If there are many matches, this could degrade performance and eventually stall the execution, as you described.

Please also bear in mind that the Code node is “heavier” on your resources than other core nodes. Perhaps you could try a Summarize node and the Loop node to break the processing into smaller steps?
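
If you want to stay in the Code node, one thing that might help is deduplicating while you scan instead of collecting every match first. Below is a minimal sketch of that idea, not a confirmed fix for the hang. The function name `extractRedditLinks` is hypothetical (it is not part of n8n), and I am assuming your items have the same shape as in your snippet, with the HTML either directly in `item.json` or under `item.json.data`.

```javascript
// Hypothetical helper (not an n8n API): collects unique subreddit/post-ID
// pairs from n8n-style items, dropping duplicates as they are found.
function extractRedditLinks(items) {
  const pattern = /https:\/\/www\.reddit\.com\/r\/(\w+)\/comments\/(\w+)/g;
  const seen = new Map(); // key: "subreddit-id"

  for (const item of items) {
    // Same string/object handling as the original snippet
    let html = item.json;
    if (html !== null && typeof html === 'object' && html.data) {
      html = html.data;
    }
    const text = typeof html === 'string' ? html : JSON.stringify(html);
    if (!text) continue; // nothing to scan for this item

    // matchAll walks over every match without manual lastIndex bookkeeping
    for (const match of text.matchAll(pattern)) {
      const key = `${match[1]}-${match[2]}`;
      if (!seen.has(key)) {
        seen.set(key, { subreddit: match[1], id: match[2] });
      }
    }
  }

  // Wrap each result the way the Code node expects its output items
  return [...seen.values()].map(data => ({ json: data }));
}
```

In the Code node you would then end with `return extractRedditLinks($input.all());`. It also skips the `console.log` of the full result and the logging of whole items, since printing large HTML payloads can add overhead of its own.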
