Is it possible to use multimodal embeddings?

Describe the problem/error/question

I was not able to find any way to use a multimodal embedding model to index images in a vector database. I know the same question was asked in July 2024, but I'm wondering whether the answer is still the same eight months later; it's possible that I'm just missing something.

It looks like your topic is missing some important information. Could you provide the following, if applicable?

  • n8n version:
  • Database (default: SQLite):
  • n8n EXECUTIONS_PROCESS setting (default: own, main):
  • Running n8n via (Docker, npm, n8n cloud, desktop app):
  • Operating system:

Hi Raphael

Is the issue that all the n8n embedding nodes only support text embeddings?

You could try using an HTTP Request node and use the Jina Embeddings API.

Would that be a solution? Or was there a specific provider you wanted to use?
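As a rough sketch of what that HTTP Request node would send: the payload below targets the Jina Embeddings API (`https://api.jina.ai/v1/embeddings`). The model name `jina-clip-v2` and the mixed text/image input shape are assumptions based on Jina's multimodal (CLIP-style) models, so check their docs for the exact format.

```python
import json

def build_jina_payload(texts, image_urls, model="jina-clip-v2"):
    """Build a single embeddings request mixing text and image inputs.

    This only constructs the JSON body; in n8n the HTTP Request node
    would POST it (with an Authorization header) to the Jina API.
    """
    inputs = [{"text": t} for t in texts]          # text items
    inputs += [{"image": url} for url in image_urls]  # image items (URLs)
    return {"model": model, "input": inputs}

payload = build_jina_payload(
    texts=["a photo of a cat"],
    image_urls=["https://example.com/cat.jpg"],  # placeholder URL
)
print(json.dumps(payload, indent=2))
```

Because text and images are embedded into the same vector space, you can later query the store with text and retrieve matching images.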

Thank you. Yes, that is the issue - only text embeddings are supported. It would be nice if the Vector Store nodes could be used with multimodal embeddings.

If we use an HTTP Request node, can that eventually be connected to a Vector Store as an embedding?

Yes, if you use an HTTP Request node to generate embeddings you can save them in a Vector Store.

Generating the embeddings is a separate step from saving them in a Vector Store. So the main issue here is finding a model to create the embeddings.
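To make the two-step split concrete, here is a minimal sketch: `embed` stands in for the HTTP Request node's API call (it just returns a dummy vector here), and `InMemoryVectorStore` stands in for whatever vector database you attach. The names are hypothetical, not an n8n or vendor API.

```python
from dataclasses import dataclass, field

@dataclass
class InMemoryVectorStore:
    """Toy stand-in for a real vector database (Qdrant, Pinecone, etc.)."""
    records: list = field(default_factory=list)

    def upsert(self, doc_id, vector, metadata):
        self.records.append({"id": doc_id, "vector": vector, "metadata": metadata})

def embed(item):
    # Placeholder for the real embedding API call; a multimodal model
    # would return the same kind of vector for text and for images.
    return [0.0] * 4

store = InMemoryVectorStore()
for i, url in enumerate(["https://example.com/cat.jpg"]):  # placeholder image
    vec = embed(url)                          # step 1: generate the embedding
    store.upsert(i, vec, {"source": url})     # step 2: save it in the store
```

The point is that nothing about step 2 cares where the vector came from, which is why an HTTP Request node can replace the built-in embedding nodes.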

There are different approaches you can use for embedding text and images. The following video may help:

Thank you. This is helpful!

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.