Efficient Website Indexing and RAG Chatbot with Qdrant

I built this template to provide a professional-grade starting point for website-based AI agents. The workflow focuses on three main pillars: efficiency, accuracy, and reliability.

The Logic:

  1. Mapping: Instead of a blind crawl, I used Firecrawl to map the domain first. This gives full control over which URLs get indexed.

  2. Vector Storage: I integrated Qdrant as the vector database because it provides the search speed and payload filtering capabilities necessary for production RAG.

  3. Reliability: The AI agent has access to a Gmail tool. If the knowledge base lacks an answer, the agent can use it to create a human support ticket immediately instead of guessing.
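
To make the "full control" in step 1 concrete, here is a minimal Python sketch of filtering a mapped URL list before anything is scraped or embedded. It is an illustration only, not the n8n implementation: it assumes the map step returns a plain list of URL strings, and the function name `select_urls`, the patterns, and the example.com URLs are all hypothetical.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def select_urls(urls, include=("/*",), exclude=()):
    """Keep URLs whose path matches an include pattern and no exclude pattern.

    Patterns are shell-style globs applied to the URL path only.
    Note that "*" also matches "/" here, so "/docs/*" covers nested pages.
    """
    selected = []
    for url in urls:
        path = urlsplit(url).path or "/"
        if not any(fnmatchcase(path, pattern) for pattern in include):
            continue
        if any(fnmatchcase(path, pattern) for pattern in exclude):
            continue
        selected.append(url)
    return selected


# Hypothetical output of a map step (assumed shape: list of URL strings).
mapped = [
    "https://example.com/docs/setup",
    "https://example.com/docs/api/auth",
    "https://example.com/blog/2023/launch",
    "https://example.com/login",
]

# Index the docs, but skip the auto-generated API reference.
to_index = select_urls(mapped, include=("/docs/*",), exclude=("/docs/api/*",))
print(to_index)
```

Filtering on the path alone keeps the rules readable; anything that depends on the host or query string would need its own check.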

Critique Request: I am looking for feedback on two points:

  1. The batch size of 100 in the Qdrant node: is this optimal for Mistral embeddings?

  2. The deduplication logic: I currently use a Set node to filter the URLs returned by the map step, but I am open to suggestions that handle more complex site structures.
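
For context on the second point, this is the kind of deduplication I have in mind, written as a minimal Python sketch rather than as n8n node logic. It compares URLs by a canonical form instead of by the raw string. The rules are assumptions for illustration, not something the workflow currently does: lowercase the scheme and host, drop a leading `www.`, ignore fragments and trailing slashes, strip common tracking parameters, and sort the remaining query parameters. The names `canonicalize`, `dedupe`, and `TRACKING_PARAMS` are hypothetical, and treating `www.` as the same site will not be correct for every domain.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of query parameters that never change page content.
TRACKING_PARAMS = {
    "utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content",
    "fbclid", "gclid",
}
DEFAULT_PORTS = {"http": 80, "https": 443}


def canonicalize(url):
    """Return a comparison key for a URL (assumed normalization rules)."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    # Keep the port only when it is not the default for the scheme.
    if parts.port and parts.port != DEFAULT_PORTS.get(scheme):
        host = f"{host}:{parts.port}"
    path = parts.path.rstrip("/") or "/"
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    query = urlencode(sorted(params))
    # The fragment is dropped because it points inside the same page.
    return urlunsplit((scheme, host, path, query, ""))


def dedupe(urls):
    """Keep the first URL seen for each canonical key, preserving order."""
    seen = set()
    unique = []
    for url in urls:
        key = canonicalize(url)
        if key not in seen:
            seen.add(key)
            unique.append(url)
    return unique


# Hypothetical mapped URLs that all point at two distinct pages.
mapped = [
    "https://example.com/docs",
    "https://WWW.Example.com/docs/?utm_source=newsletter#intro",
    "https://example.com/search?b=2&a=1",
    "https://example.com/search?a=1&b=2",
]
print(dedupe(mapped))
```

This only catches duplicates that are visible in the URL itself. Pages that serve identical content under unrelated paths would still need a content hash or a check of the canonical link tag after scraping.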

Workflow Link: https://n8n.io/workflows/15415-auto-index-your-website-and-build-a-rag-chatbot-with-firecrawl-qdrant-and-gpt-4o-mini/