Dimensions option for Embeddings Google Vertex and Embeddings Google Gemini nodes

The idea is:

Add a dimensions option for Embeddings Google Vertex and Embeddings Google Gemini nodes, just as is available today for the Embeddings OpenAI and Embeddings Azure OpenAI nodes.

My use case:

My vector store uses a higher dimension than the node's default, resulting in: “Error inserting: expected 1536 dimensions, not 3072 400 Bad Request”.

I think it would be beneficial to add this because:

This would allow n8n AI users to leverage the full capabilities of the highest-ranking embedding models; gemini-embedding-001 has been at the top for some time now. Moreover, open-source stores such as pgvector and providers like Supabase support these high-dimension vectors. The result would be improved RAG pipelines powered by n8n.

Here are two quickly put-together community nodes that I’m testing out to fill this need. Both are based on the official nodes, just with the currently missing options added.

I’ve added both and am generating 3072 dimension vectors. YMMV, feedback welcome.


Unfortunately, the “Output Dimensions” setting (in the n8n-nodes-google-gemini-embeddings-extended node) always uses 3072 dimensions; all other values are ignored. In my case, it’s not possible to use gemini-embedding-001 with 1536 dimensions: the model still returns 3072-dimension vectors. If you need dimensions other than 3072 for gemini-embedding-001, you can use the LangChain Code node instead.

! Warning: use this for test purposes only, since the API key will be exposed.

First you have to change some parameters in the node:

  • Code: Supply Data
  • Inputs → remove all inputs
  • Outputs: Type → Embedding

With this configuration, the node can be attached as an embedding provider to a vector store.
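For context, the value returned from the Supply Data field just needs to implement the two methods of LangChain's Embeddings interface that the vector store node will call. A minimal sketch of that contract (the name `StubEmbeddings` and the zero vectors are placeholders, not working code — the real implementation follows below):

```javascript
// Hypothetical stub showing the shape a vector store expects from
// the Supply Data code: an object with embedQuery and embedDocuments.
const DIM = 1536; // must match your vector store's column dimension

const StubEmbeddings = {
  // called once per search query; returns one vector
  async embedQuery(text) {
    return new Array(DIM).fill(0);
  },
  // called with a batch of document chunks at insert time; one vector each
  async embedDocuments(texts) {
    return texts.map(() => new Array(DIM).fill(0));
  },
};
// In the Supply Data field you would end with: return StubEmbeddings;
```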

Then use the code below in the Supply Data field:

const https = require('https');
const { URL } = require('url');

const API_KEY = 'YOUR_GEMINI_API_KEY';
const MODEL = 'gemini-embedding-001';
const DIM = 1536;

const ENDPOINT = `https://generativelanguage.googleapis.com/v1beta/models/${MODEL}:embedContent`;

/** Generic HTTPS POST returning parsed JSON */
function postJson(urlString, bodyObj) {
  return new Promise((resolve, reject) => {
    const url = new URL(urlString);
    const body = Buffer.from(JSON.stringify(bodyObj), 'utf8');

    const opts = {
      method: 'POST',
      hostname: url.hostname,
      port: 443,
      path: url.pathname + url.search,
      headers: {
        'Content-Type': 'application/json; charset=utf-8',
        'Content-Length': body.length,
      },
    };

    const req = https.request(opts, (res) => {
      let data = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => (data += chunk));
      res.on('end', () => {
        if (res.statusCode < 200 || res.statusCode >= 300) {
          return reject(new Error(`HTTP ${res.statusCode}: ${data}`));
        }
        try {
          resolve(JSON.parse(data));
        } catch (e) {
          reject(new Error(`JSON parse error: ${e.message}`));
        }
      });
    });

    req.on('error', reject);
    req.write(body);
    req.end();
  });
}

/** Call Gemini embeddings for one string */
async function embedOne(text, taskType) {
  const payload = {
    model: MODEL,
    content: { parts: [{ text: String(text || '') }] },
    taskType, // 'RETRIEVAL_QUERY' or 'RETRIEVAL_DOCUMENT'
    outputDimensionality: DIM,
  };

  const url = `${ENDPOINT}?key=${encodeURIComponent(API_KEY)}`;
  const data = await postJson(url, payload);

  let values;
  if (data?.embedding?.values) values = data.embedding.values;
  else if (Array.isArray(data?.embedding)) values = data.embedding;
  else if (Array.isArray(data?.embeddings) && data.embeddings[0]?.values) {
    values = data.embeddings[0].values;
  }

  if (!Array.isArray(values)) {
    throw new Error(`Unexpected embed response: ${JSON.stringify(data)}`);
  }
  if (values.length !== DIM) {
    throw new Error(`Embedding dim ${values.length} != expected ${DIM}`);
  }
  return values.map((x) => Number(x));
}

const Embeddings = {
  async embedQuery(text) {
    return await embedOne(text, 'RETRIEVAL_QUERY');
  },
  async embedDocuments(texts) {
    const out = [];
    // embed documents sequentially (simple, and avoids bursting rate limits)
    for (let i = 0; i < texts.length; i++) {
      out.push(await embedOne(texts[i], 'RETRIEVAL_DOCUMENT'));
    }
    return out;
  },
};

return Embeddings;

  • Replace YOUR_GEMINI_API_KEY with your actual API key.
  • Replace the MODEL value if you want to use a model other than gemini-embedding-001.
  • Change DIM to the preferred value (currently 1536).
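One caveat when dropping below 3072 dimensions: per Google's embedding documentation (worth double-checking for your model version), only the full 3072-dimension output is pre-normalized, so truncated vectors should be L2-normalized before storing to keep cosine similarity meaningful. A small helper you could apply to `values` at the end of `embedOne`:

```javascript
// L2-normalize an embedding vector so its length is 1.
// Assumption: needed when outputDimensionality < 3072 (see note above).
function l2Normalize(values) {
  const norm = Math.sqrt(values.reduce((sum, x) => sum + x * x, 0));
  if (norm === 0) return values.slice(); // avoid division by zero
  return values.map((x) => x / norm);
}
```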

We’re also interested in this addition/extension to the node. Without the ability to customize the dimensions, the usefulness of that node is sadly quite limited.