Issue with AWS Bedrock Cohere Embed Multilingual Model & Data Loader Output Format in n8n

I am currently working on setting up a vector database using AWS S3 and Pinecone in n8n with the goal of implementing an LLM-powered RAG system. I have successfully retrieved data from AWS S3 into n8n and connected it to Pinecone.

For embeddings, I am using the AWS Bedrock node, specifically the Cohere Embed Multilingual model. However, when processing the data, I encounter the following error:

An error occurred while embedding documents with Bedrock: Malformed input request: #: required key [images] not found#: required key [texts] not found#: required key [texts] not found#: required key [texts] not found#: required key [texts] not found#: required key [input_type] not found#: extraneous key [inputText] is not permitted, please reformat your input and try again.

From my understanding, the input format expected by the Cohere embedding model does not match the output the Pinecone Data Loader produces, so Bedrock rejects the request as malformed.

I attempted to find configuration options within n8n to adjust the Data Loader’s output format, but couldn’t locate any relevant settings in the UI. I am also unable to insert extra nodes between the Pinecone Data Loader and the AWS Bedrock node to manually reformat the data, since they attach to the Pinecone Vector Store node as sub-nodes rather than through the regular workflow canvas.

Has anyone faced a similar issue?
Is there a workaround to modify the Data Loader’s output format before sending it to the embedding model, especially given that I can’t add extra processing nodes between them?

Any insights or guidance would be greatly appreciated!

**Information on your n8n setup**

  • **n8n version:** 1.82.3
  • Database (default: SQLite):
  • n8n EXECUTIONS_PROCESS setting (default: own, main):
  • **Running n8n via (Docker, npm, n8n cloud, desktop app):** Docker
  • **Operating system:** Win11

Hey, if I got your issue right, I think there’s a way to work around it.

Check this out:

The error you’re getting shows that the Cohere Embed model on AWS Bedrock expects a very specific input format: a "texts" array holding your content, plus a required "input_type" field.

But the Pinecone Data Loader is sending a different shape, with an "inputText" key (the format Amazon’s Titan embedding models use), which the Cohere model rejects. And to make things worse, you mentioned you can’t insert a node between the Data Loader and Bedrock to fix the payload — that’s rough.
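In other words, judging from the error text, Bedrock wants a body shaped like this (values here are just illustrative):

```json
{
  "texts": ["first chunk of text", "second chunk of text"],
  "input_type": "search_document"
}
```

but it’s being handed the single-string shape the error flags as extraneous:

```json
{
  "inputText": "first chunk of text"
}
```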

Here’s what you can do:

  1. Build your own loader flow manually
    Instead of using the Data Loader directly, you could:
  • Download the data from S3 using the AWS node
  • Use a Set or Code node to reformat the data the way Bedrock expects (see the first sketch after this list)
  • Then send it to Bedrock and forward the vectors to Pinecone (second sketch below)
  2. Move the embedding logic into a separate workflow
    Create a standalone flow just for embeddings. That way you get full control over the input, can structure it however you want, and can call it from your main flow with an Execute Workflow node, passing in the properly formatted data.
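For option 1, here’s a rough sketch of what that reformatting Code node could look like. It assumes each incoming item carries its chunk under json.text (rename that to whatever your S3 extraction actually produces):

```javascript
// n8n Code node, "Run Once for All Items" mode. A sketch: field names are assumed.
// Gather every chunk into one array and emit the body Cohere Embed expects.
const texts = $input.all().map(item => item.json.text);

return [{
  json: {
    texts,                          // required by the Cohere Embed model
    input_type: 'search_document',  // use 'search_query' when embedding queries
  },
}];
```

An HTTP Request node with AWS credentials can then POST that JSON to the Bedrock runtime InvokeModel endpoint (https://bedrock-runtime.<region>.amazonaws.com/model/cohere.embed-multilingual-v3/invoke, model ID per the Bedrock docs).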
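Once Bedrock responds (Cohere’s output returns the vectors in an embeddings array), a second Code node can pair each vector with its source text for the Pinecone upsert. Again just a sketch, with the node name assumed:

```javascript
// Sketch: pair each returned vector with its source text for Pinecone.
// "Format for Cohere" is the assumed name of the reformatting node above.
const { embeddings } = $input.first().json;
const { texts } = $('Format for Cohere').first().json;

return embeddings.map((values, i) => ({
  json: {
    id: `chunk-${i}`,              // use stable IDs in a real flow
    values,                        // the embedding vector
    metadata: { text: texts[i] },  // keep the raw text for retrieval
  },
}));
```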

Let me know if you want help building that custom flow. It’ll give you way more control and flexibility.

Cheers
Dandy

Facing the same issue here. Tried to prompt the agent to format the input to Bedrock a certain way, but that didn’t work.

Gonna attempt to build the embeddings logic in a separate workflow and see how that goes.