I am currently working on setting up a vector database using AWS S3 and Pinecone in n8n with the goal of implementing an LLM-powered RAG system. I have successfully retrieved data from AWS S3 into n8n and connected it to Pinecone.
For embeddings, I am using the AWS Bedrock node, specifically the Cohere Embed Multilingual model. However, when processing the data, I encounter the following error:
An error occurred while embedding documents with Bedrock: Malformed input request: #: required key [images] not found#: required key [texts] not found#: required key [texts] not found#: required key [texts] not found#: required key [texts] not found#: required key [input_type] not found#: extraneous key [inputText] is not permitted, please reformat your input and try again.
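From my understanding, the request body that reaches Bedrock does not match what the Cohere embedding model expects: according to the error, it contains an `inputText` key, whereas the model requires `texts` and `input_type`. I assume this mismatch comes from the output format of the Pinecone Data Loader, so the embedding model fails to recognize the input.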
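To make the mismatch concrete, here is a minimal Python sketch of the two request shapes as I read them from the error message. It runs outside n8n and is only an illustration, not how the n8n node builds its request. The helper names (`titan_style_body`, `cohere_style_body`, `to_cohere_body`, `embed`), the model ID `cohere.embed-multilingual-v3`, the default region, and the `search_document` input type are my assumptions; the `embed` function additionally needs `boto3` and valid AWS credentials.

```python
import json

# Assumption: Bedrock model ID for Cohere Embed Multilingual.
COHERE_MODEL_ID = "cohere.embed-multilingual-v3"


def titan_style_body(text: str) -> dict:
    """Shape the error rejects: 'extraneous key [inputText] is not permitted'."""
    return {"inputText": text}


def cohere_style_body(texts: list, input_type: str = "search_document") -> dict:
    """Shape the error asks for: required keys 'texts' and 'input_type'."""
    return {"texts": list(texts), "input_type": input_type}


def to_cohere_body(body: dict, input_type: str = "search_document") -> dict:
    """Convert a single-text 'inputText' body into the Cohere shape.

    Bodies without an 'inputText' key are returned unchanged.
    """
    if "inputText" in body:
        return cohere_style_body([body["inputText"]], input_type)
    return body


def embed(texts: list, region: str = "us-east-1") -> list:
    """Call Bedrock directly with the Cohere-shaped body (needs boto3 + credentials)."""
    import boto3  # imported lazily so the helpers above work without it

    client = boto3.client("bedrock-runtime", region_name=region)
    response = client.invoke_model(
        modelId=COHERE_MODEL_ID,
        body=json.dumps(cohere_style_body(texts)),
        contentType="application/json",
        accept="application/json",
    )
    # The response body is a stream containing JSON with an 'embeddings' list.
    return json.loads(response["body"].read())["embeddings"]


if __name__ == "__main__":
    rejected = titan_style_body("Hello world")
    print("rejected:", json.dumps(rejected))
    print("expected:", json.dumps(to_cohere_body(rejected)))
```

If this reading is right, what I need is for the request to carry the second shape rather than the first, which is exactly the part I cannot find a setting for in n8n.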
I attempted to find configuration options within n8n to adjust the Data Loader’s output format but couldn’t locate any relevant settings in the UI. Additionally, I am unable to insert extra nodes between the Pinecone Data Loader and the AWS Bedrock node to manually extract and reformat the data into a format that the embedding model expects.
Has anyone faced a similar issue?
Is there a workaround to modify the Data Loader’s output format before sending it to the embedding model, especially given that I can’t add extra processing nodes between them?
Any insights or guidance would be greatly appreciated!
Information on your n8n setup
- **n8n version:** 1.82.3
- Database (default: SQLite):
- n8n EXECUTIONS_PROCESS setting (default: own, main):
- **Running n8n via (Docker, npm, n8n cloud, desktop app):** Docker
- **Operating system:** Win11