I built this template as a professional-grade starting point for website-based AI agents. The workflow focuses on three pillars: efficiency, accuracy, and reliability.
The Logic:
- Mapping: Instead of crawling blindly, I use Firecrawl to map the domain first. This gives full control over which URLs get indexed.
- Vector Storage: I integrated Qdrant because of its leading position among vector databases. It provides the speed and filtering capabilities needed for production RAG.
- Reliability: The AI agent has access to a Gmail tool. If the knowledge base lacks an answer, the agent uses it to open a human support ticket immediately.
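To illustrate the "map first, then choose what to index" idea, here is a minimal sketch of the selection step. It assumes the map step has already returned a plain list of URL strings. The function name `select_urls` and the prefix parameters are hypothetical and are not part of Firecrawl or n8n.

```python
from urllib.parse import urlsplit


def select_urls(mapped_urls, include_prefixes=("/",), exclude_prefixes=()):
    """Keep URLs whose path matches an included prefix and no excluded prefix.

    Hypothetical helper: the prefixes are illustrative and would be adapted
    to the structure of the site being indexed.
    """
    include = tuple(include_prefixes)
    exclude = tuple(exclude_prefixes)
    selected = []
    for url in mapped_urls:
        # An empty path (bare domain) is treated as the root page.
        path = urlsplit(url).path or "/"
        if not path.startswith(include):
            continue
        if exclude and path.startswith(exclude):
            continue
        selected.append(url)
    return selected


mapped = [
    "https://example.com/docs/setup",
    "https://example.com/blog/post",
    "https://example.com/docs/internal/x",
    "https://example.com",
]
docs_only = select_urls(
    mapped,
    include_prefixes=("/docs",),
    exclude_prefixes=("/docs/internal",),
)
```

In an n8n workflow the same filter could live in a Code node placed between the map step and the scrape step, so only the selected pages are ever embedded.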
Critique Request: I am looking for feedback on two points:
- The batch size of 100 in the Qdrant node: is this optimal for Mistral embeddings?
- The deduplication logic: I am currently using a Set node to filter URLs from the map, but I am open to suggestions for more complex site structures.
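For discussion, here is one possible direction for both points: deduplicate on a normalized form of each URL, then split the result into batches whose size is a parameter. This is a sketch under stated assumptions, not what the workflow currently does. The helper names, the tracking-parameter list, and the choice to treat `www.` and trailing slashes as equivalent are all assumptions that may not suit every site, and the default of 100 only mirrors the current setting rather than a recommended value.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of query parameters that do not change page content.
TRACKING_PARAMS = {
    "utm_source", "utm_medium", "utm_campaign",
    "utm_term", "utm_content", "gclid", "fbclid",
}


def normalize_url(url):
    """Return a canonical form of the URL, used only as a deduplication key."""
    parts = urlsplit(url.strip())
    netloc = parts.netloc.lower()
    if netloc.startswith("www."):
        netloc = netloc[4:]  # assumption: www and bare host serve the same page
    path = parts.path.rstrip("/") or "/"
    # Drop tracking parameters and sort the rest so ordering does not matter.
    query_pairs = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    )
    # The fragment is dropped because it points inside the same page.
    return urlunsplit((parts.scheme.lower(), netloc, path, urlencode(query_pairs), ""))


def dedupe_urls(urls):
    """Keep the first URL seen for each normalized key, preserving order."""
    seen = set()
    unique = []
    for url in urls:
        key = normalize_url(url)
        if key not in seen:
            seen.add(key)
            unique.append(url)
    return unique


def chunk(items, size=100):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


urls = [
    "https://Example.com/docs/",
    "https://example.com/docs",
    "https://www.example.com/docs?utm_source=x",
    "https://example.com/docs?page=2",
    "https://example.com/docs#intro",
]
unique_urls = dedupe_urls(urls)
batch_sizes = [len(batch) for batch in chunk(list(range(250)), size=100)]
```

Keeping the batch size as a parameter makes it easy to try several values and compare indexing time and error rates, instead of committing to a single number up front.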
Workflow Link: https://n8n.io/workflows/15415-auto-index-your-website-and-build-a-rag-chatbot-with-firecrawl-qdrant-and-gpt-4o-mini/