Hi, I'm using a self-hosted n8n 2.3.6 in my lab, running in an LXC container on Proxmox 9. I regularly look at https://elhacker.info to find new tutorials and all kinds of files. Since it's hard to tell which files have been newly uploaded to the site, I decided to use n8n to take something like a snapshot with a Python web scraper and compare it against the previous workflow run to detect newly uploaded files. Because there are a lot of files, when the output reaches my Telegram node it gives me this error:
{
  "errorMessage": "Your request is invalid or could not be processed by the service",
  "errorDescription": "Request Entity Too Large",
  "errorDetails": {
    "rawErrorMessage": [
      "413 - {\"ok\":false,\"error_code\":413,\"description\":\"Request Entity Too Large\"}"
    ],
    "httpCode": "413"
  },
  "n8nDetails": {
    "nodeName": "A7_Telegram",
    "nodeType": "n8n-nodes-base.telegram",
    "nodeVersion": 1,
    "resource": "message",
    "operation": "sendMessage",
    "time": "22/1/2026, 9:48:14",
    "n8nVersion": "2.3.6 (Self Hosted)",
    "binaryDataMode": "filesystem",
    "stackTrace": [
      "NodeApiError: Your request is invalid or could not be processed by the service",
      "    at ExecuteContext.apiRequest (/usr/lib/node_modules/n8n/node_modules/n8n-nodes-base/nodes/Telegram/GenericFunctions.ts:230:9)",
      "    at processTicksAndRejections (node:internal/process/task_queues:105:5)",
      "    at ExecuteContext.execute (/usr/lib/node_modules/n8n/node_modules/n8n-nodes-base/nodes/Telegram/Telegram.node.ts:2198:21)",
      "    at WorkflowExecute.executeNode (/usr/lib/node_modules/n8n/node_modules/n8n-core/src/execution-engine/workflow-execute.ts:1045:8)",
      "    at WorkflowExecute.runNode (/usr/lib/node_modules/n8n/node_modules/n8n-core/src/execution-engine/workflow-execute.ts:1226:11)",
      "    at /usr/lib/node_modules/n8n/node_modules/n8n-core/src/execution-engine/workflow-execute.ts:1662:27",
      "    at /usr/lib/node_modules/n8n/node_modules/n8n-core/src/execution-engine/workflow-execute.ts:2297:11"
    ]
  }
}
One solution is to run the scraper script the very first time outside n8n and create the initial database, so Telegram never receives that huge first payload; once it's in production, any later change to the files will arrive with no problem. But, just for learning purposes: is there any other way to make it work from the very first run and avoid this error? Here is my workflow:
{
  "nodes": [
    {
      "parameters": {
        "rule": {
          "interval": [
            {
              "triggerAtHour": 4
            }
          ]
        }
      },
      "id": "8dc0845a-ac77-47bc-9839-b380ef919a97",
      "name": "A1_Trigger",
      "type": "n8n-nodes-base.scheduleTrigger",
      "typeVersion": 1.1,
      "position": [
        21728,
        272
      ]
    },
    {
      "parameters": {
        "command": "python3 /.n8n/scraper.py"
      },
      "id": "72106d9b-3bb9-44ac-8823-09a59cec960b",
      "name": "A2_Scrapper",
      "type": "n8n-nodes-base.executeCommand",
      "typeVersion": 1,
      "position": [
        21952,
        272
      ]
    },
    {
      "parameters": {
        "command": "python3 /root/.n8n-files/insertar.py"
      },
      "id": "6b3f423f-73ae-4014-96f8-d53f5c91124e",
      "name": "A3_Insertar_DB",
      "type": "n8n-nodes-base.executeCommand",
      "typeVersion": 1,
      "position": [
        22160,
        272
      ]
    },
    {
      "parameters": {
        "fileSelector": "/root/.n8n-files/novedades.json",
        "options": {}
      },
      "id": "3cd3e9dd-8c95-414e-9338-ba35caa8941a",
      "name": "A4_Leer_Novedades",
      "type": "n8n-nodes-base.readWriteFile",
      "typeVersion": 1,
      "position": [
        22384,
        272
      ]
    },
    {
      "parameters": {
        "operation": "fromJson",
        "options": {}
      },
      "id": "412f041d-ce03-4f86-b658-1172cd73b2eb",
      "name": "A5_Extraer_JSON",
      "type": "n8n-nodes-base.extractFromFile",
      "typeVersion": 1,
      "position": [
        22608,
        272
      ]
    },
    {
      "parameters": {
        "jsCode": "// Grab the first item arriving at the node\nconst inputItem = $input.all()[0];\n\n// Extract the file list stored under the \"data\" property\nconst listaArchivos = inputItem.json.data;\n\n// If the list is missing or empty, report that\nif (!listaArchivos || listaArchivos.length === 0) {\n  return {\n    json: {\n      texto_final: \"Escaneo finalizado: No se han encontrado archivos nuevos en esta ejecución.\"\n    }\n  };\n}\n\nlet msj = \"¡Nuevos archivos detectados!\\n\\n\";\n\n// Walk the inner list of files\nlistaArchivos.forEach((archivo, i) => {\n  const nombre = archivo.archivo || \"Sin nombre\";\n  const enlace = archivo.enlace || \"#\";\n  const fecha = archivo.detectado || \"N/A\";\n  msj += `${i+1}. 📄 *${nombre}*\\n🔗 [Descargar](${enlace})\\n📅 _${fecha}_\\n\\n`;\n});\n\nreturn { json: { texto_final: msj } };"
      },
      "id": "b59b0d12-c147-4cb4-b8e1-5404e715c747",
      "name": "A6_Formatear",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        22832,
        272
      ]
    },
    {
      "parameters": {
        "chatId": "1418784730",
        "text": "={{ $json.texto_final }}",
        "additionalFields": {
          "parse_mode": "Markdown"
        }
      },
      "id": "2b843855-f8b2-4c39-84c3-6c8d76ebac13",
      "name": "A7_Telegram",
      "type": "n8n-nodes-base.telegram",
      "typeVersion": 1,
      "position": [
        23040,
        272
      ],
      "webhookId": "fc182bf0-d23d-4bb1-a7d4-4cea8356dd62",
      "credentials": {
        "telegramApi": {
          "id": "lRLGefHU25m40wt7",
          "name": "Telegram ElHacker.net"
        }
      }
    }
  ],
  "connections": {
    "A1_Trigger": {
      "main": [
        [
          {
            "node": "A2_Scrapper",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "A2_Scrapper": {
      "main": [
        [
          {
            "node": "A3_Insertar_DB",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "A3_Insertar_DB": {
      "main": [
        [
          {
            "node": "A4_Leer_Novedades",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "A4_Leer_Novedades": {
      "main": [
        [
          {
            "node": "A5_Extraer_JSON",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "A5_Extraer_JSON": {
      "main": [
        [
          {
            "node": "A6_Formatear",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "A6_Formatear": {
      "main": [
        [
          {
            "node": "A7_Telegram",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  },
  "pinData": {},
  "meta": {
    "instanceId": "36b3c3d7b960dcf8aecfa143cdfe54b57bdd67b81fa18ba665b02da3f105d9f8"
  }
}
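For context on the error itself: Telegram's sendMessage method rejects any `text` field longer than 4096 characters, and the first-run listing is far bigger than that. One idea for making it work from the very first run is to have A6_Formatear emit several items, each under that limit, so the Telegram node sends one message per chunk. A rough Python sketch of the splitting logic (`split_for_telegram` is a hypothetical helper name, and the line-based splitting is my own assumption; the same approach could be ported to the Code node's JavaScript):

```python
# Sketch: split a long message into Telegram-sized chunks.
# 4096 is Telegram's documented per-message character limit.
TELEGRAM_MAX = 4096

def split_for_telegram(text, limit=TELEGRAM_MAX):
    """Split `text` into chunks of at most `limit` chars, on line boundaries when possible."""
    chunks = []
    current = ""
    for line in text.splitlines(keepends=True):
        # A single line longer than the limit gets hard-split
        while len(line) > limit:
            chunks.append(line[:limit])
            line = line[limit:]
        if len(current) + len(line) > limit:
            chunks.append(current)
            current = line
        else:
            current += line
    if current:
        chunks.append(current)
    return chunks
```

In the workflow, A6_Formatear would then return one item per chunk (`{ json: { texto_final: chunk } }`) instead of a single item, and A7_Telegram would run once per item, each call staying under the limit.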
The scraper code looks like this:
import requests
from bs4 import BeautifulSoup
import json
from urllib.parse import urljoin
import re
import os

BASE_URLS = [
    'elhacker.INFO - Descargas Cursos, Manuales, Tutoriales y Libros',
    'elhacker.INFO - Descargas Cursos, Manuales, Tutoriales y Libros',
]

# Unified list: originals + video + music
EXTENSIONES_PERMITIDAS = (
    # Archives, documents and executables
    '.zip', '.rar', '.7z', '.pdf', '.iso', '.exe',
    '.tar', '.gz', '.lst', '.txt', '.epub',
    # Video
    '.mp4', '.avi', '.mpeg', '.mpg', '.mkv',
    '.mov', '.wmv', '.flv', '.webm', '.m4v', '.3gp',
    # Music
    '.mp3', '.wav', '.flac', '.aac', '.ogg',
    '.wma', '.m4a', '.aiff', '.alac', '.opus'
)

def scrape_elhacker():
    all_files = []
    queue = list(BASE_URLS)
    visited_urls = set()

    # Use a session to improve speed
    session = requests.Session()
    session.headers.update({
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) n8n-monitor'
    })

    while queue:
        current_url = queue.pop(0)
        if current_url in visited_urls:
            continue
        visited_urls.add(current_url)

        try:
            response = session.get(current_url, timeout=15)
            if response.status_code != 200:
                continue
            response.encoding = response.apparent_encoding
            soup = BeautifulSoup(response.text, 'html.parser')

            for link in soup.find_all('a'):
                href = link.get('href')
                # Navigation filters
                if not href or '?' in href or 'Parent Directory' in link.text or href.startswith('/'):
                    continue
                full_path = urljoin(current_url, href)
                # If it is a FOLDER, add it to the queue
                if href.endswith('/'):
                    if full_path not in visited_urls:
                        queue.append(full_path)
                    continue
                # If it is an allowed FILE, process it
                if href.lower().endswith(EXTENSIONES_PERMITIDAS):
                    content_after = link.find_next_sibling(string=True)
                    date = "N/A"
                    if content_after:
                        match = re.search(r'(\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2})', str(content_after))
                        if match:
                            date = match.group(1)
                    all_files.append({
                        "path": full_path,
                        "nombre": href.replace('%20', ' ').lstrip('/'),
                        "fecha_modificacion": date
                    })
        except Exception:
            continue

    # IMPORTANT: return the list of objects (not a JSON string)
    return all_files

if __name__ == "__main__":
    # 1. Run the scraping
    lista_archivos = scrape_elhacker()

    # 2. Define the allowed path inside the LXC
    output_dir = '/root/.n8n-files'
    output_file = os.path.join(output_dir, 'resultados_scraper.json')

    # 3. Make sure the folder exists
    if not os.path.exists(output_dir):
        os.makedirs(output_dir, exist_ok=True)

    try:
        # 4. Save the data to the file
        # json.dump takes the list and writes it directly as JSON
        with open(output_file, 'w', encoding='utf-8') as f:
            json.dump(lista_archivos, f, ensure_ascii=False, indent=2)
        # 5. Print a minimal status to avoid a buffer error in n8n
        print(json.dumps({
            "status": "success",
            "count": len(lista_archivos),
            "file": output_file
        }))
    except Exception as e:
        print(json.dumps({
            "status": "error",
            "message": str(e)
        }))
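For completeness, the comparison step (insertar.py, which I haven't pasted) boils down to diffing the new snapshot against the previous run's file and keeping only the new entries. Below is a simplified sketch of that idea, not my actual insertar.py: `anterior.json` and `diff_snapshots` are hypothetical names, and the real script would also map the scraper's `path`/`nombre`/`fecha_modificacion` keys to the `archivo`/`enlace`/`detectado` keys that A6_Formatear expects. Note that treating the first run as "nothing new" here would also avoid the oversized first Telegram message:

```python
import json
import os

def diff_snapshots(base_dir='/root/.n8n-files', first_run_silent=True):
    """Compare the fresh snapshot with the previous run and persist only the new entries."""
    snapshot = os.path.join(base_dir, 'resultados_scraper.json')  # written by scraper.py
    previous = os.path.join(base_dir, 'anterior.json')            # hypothetical baseline file
    novedades = os.path.join(base_dir, 'novedades.json')          # read by A4_Leer_Novedades

    with open(snapshot, encoding='utf-8') as f:
        current = json.load(f)

    if os.path.exists(previous):
        with open(previous, encoding='utf-8') as f:
            known = {item['path'] for item in json.load(f)}
        nuevos = [item for item in current if item['path'] not in known]
    else:
        # First run: everything is technically new; optionally report nothing
        nuevos = [] if first_run_silent else current

    # Persist the current snapshot as the new baseline for the next run
    with open(previous, 'w', encoding='utf-8') as f:
        json.dump(current, f, ensure_ascii=False, indent=2)

    # A6_Formatear reads the list from a "data" property
    with open(novedades, 'w', encoding='utf-8') as f:
        json.dump({'data': nuevos}, f, ensure_ascii=False, indent=2)

    return nuevos
```

With `first_run_silent=True`, the first execution seeds the baseline and sends nothing, and every later run reports only files that were not seen before.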
I wasn't able to write this whole post with good formatting; you can find a properly formatted version of it on Pastebin.