RAG Chatbot not accessing Pinecone PDF data

Rendawg · May 3, 2025, 12:03pm

Hello,

I’m doing a Udemy course (AI Automation: Build LLM Apps & AI-Agents with n8n & APIs) and created a workflow. I can send my PDFs to the Pinecone vector database using the vector agent, and everything looks fine. When I build the second part and complete it, everything tests fine as well. But when I ask the chatbot any question related to my financial PDFs, which are uploaded to the Google Docs folder, it always comes back saying “It doesn’t have access to that kind of data.”

Through the Agent I can confirm that the text content is being parsed, so I’m pulling pageContent. The PDF metadata is being captured. What I think the issue is: the metadata isn’t making it into the Pinecone vector DB metadata field. Here is an excerpt of my JSON data:

I have triple-checked this against the online instructor’s build and rebuilt it multiple times, all with the same issues. I can only conclude that there is some difference between the software versions we are using.

I’m also getting weird uploads into the Pinecone DB sometimes. When I manually test the agent, it always puts the data into the DB. However, many times the records don’t show any metadata, only ID numbers. When I go to fetch the data I get this error:

Lastly, here is my workflow from n8n: Workflow getting data into Pinecone:

And the workflow accessing the data through the chatbot:

n8n version - Local: Vers. 1.90.2.
Database: Pinecone, hosted, AWS
Operating system: Windows 10
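
For anyone chasing the same symptom (records that show only IDs and no metadata), one way to narrow it down is to fetch a few records and check whether each one carries the chunk text in its metadata. The snippet below is only a minimal sketch: the function name `records_missing_text` and the sample records are hypothetical, and the `text` metadata key is an assumption based on the default used by LangChain-style Pinecone integrations, so check it against what your own index actually stores.

```python
from typing import Any, List, Mapping


def records_missing_text(
    records: Mapping[str, Mapping[str, Any]],
    text_key: str = "text",  # assumed key name; verify against your own index
) -> List[str]:
    """Return the IDs of records whose metadata is absent or has no usable text."""
    missing = []
    for record_id, record in records.items():
        # A record uploaded without metadata may omit the field or hold None.
        metadata = record.get("metadata") or {}
        text = metadata.get(text_key)
        if not isinstance(text, str) or not text.strip():
            missing.append(record_id)
    return missing


# Hypothetical records shaped like an id -> record mapping from a fetch call.
sample = {
    "doc1-chunk0": {
        "values": [0.1, 0.2],
        "metadata": {"text": "Q1 revenue was ...", "source": "report.pdf"},
    },
    "doc1-chunk1": {"values": [0.3, 0.4]},  # no metadata at all
    "doc1-chunk2": {"values": [0.5, 0.6], "metadata": {"source": "report.pdf"}},
}

print(records_missing_text(sample))  # ['doc1-chunk1', 'doc1-chunk2']
```

If every ID comes back as missing, the retriever has no text to hand to the chat model, which would be consistent with the "doesn’t have access" answer. If only some do, the uploads are probably inconsistent between runs.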

Rendawg · May 3, 2025, 4:06pm

Ok, I think I have fixed it. I believe I had the wrong type of data chosen in the Default Data Loader. However, I know I set this up correctly originally, then changed it while troubleshooting. I changed it back today and everything is working (I spent so much time debugging this that I lost track of what I did and didn’t do after a while…heh).

Secondly, I believe it may have been working all along but wasn’t able to find the data I was requesting from the PDF, as it is very finicky with the data it does finally read from it. Once I started asking it for different information, it was able to provide it. I attribute this to the fact that the formatting of a standard PDF can cause problems for AI; some PDFs need to be handled differently. The way the data is formatted, etc., can keep it from reading the content properly.
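
To illustrate the formatting point, here is a rough sketch of one way it can go wrong. It assumes a table whose cells were flattened into a single run of text during extraction and then split into fixed-size chunks. The splitter (`split_fixed`), the helper (`chunks_containing`), and the sample text are all hypothetical and much simpler than a real text splitter, but they show how a label and its value can end up in different chunks, so a question about that label may retrieve a chunk that doesn’t contain the number.

```python
from typing import List


def split_fixed(text: str, size: int, overlap: int = 0) -> List[str]:
    """Naive character splitter: fixed-size windows with optional overlap."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be > 0 and 0 <= overlap < size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]


def chunks_containing(chunks: List[str], needle: str) -> List[int]:
    """Return the indices of the chunks that contain the given substring."""
    return [i for i, chunk in enumerate(chunks) if needle in chunk]


# Hypothetical text from a small financial table after extraction:
# the column labels come first and the numbers follow, with no row structure.
flattened = "Revenue Expenses Net Income 2023 2024 1,200 1,500 900 1,020 300 480"

chunks = split_fixed(flattened, size=30)

print(chunks_containing(chunks, "Net Income"))  # [0]
print(chunks_containing(chunks, "480"))         # [2]
```

The "Net Income" label lands in the first chunk and its 2024 value in the last one, so neither chunk answers the question on its own. Larger chunks, some overlap, or a table-aware extraction step are the usual things to try, though which of them helps depends on the PDF.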

Thanks,
R

system · August 1, 2025, 4:07pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.