I’m extracting data from an .xlsx file in my Google Drive, processing it through an aggregate and summarize node, and then inserting it into my Supabase vector store (table=documents). The records update in the Supabase table, but I can’t chat with the data.
When testing with Google Docs and PDF files, everything works fine, but .xlsx files are problematic.
The N8N chat response says: “It seems I’m unable to access specific details about your media data in your Supabase table. However, I can guide you on how to analyze your media data and what kind of insights you might gather.”
I did get it to work once with only 3 columns and 28 rows, but now even that test doesn’t work anymore.
n8n
January 2, 2025, 4:09am
2
It looks like your topic is missing some important information. Could you provide the following, if applicable?
n8n version:
Database (default: SQLite):
n8n EXECUTIONS_PROCESS setting (default: own, main):
Running n8n via (Docker, npm, n8n cloud, desktop app):
Operating system:
Welcome to the community @ReinventNYC !
Tip for sharing information
Pasting your n8n workflow
Copy your n8n workflow and paste it inside a code block, i.e. between a pair of triple backticks. You can also click </>
(preformatted text) in the editor and paste your workflow there.
```
<your workflow>
```
The same applies to any JSON output you would like to share with us.
Make sure that you have removed any sensitive information from your workflow and include dummy or pinned data with it!
Can you share your workflow? I think either your aggregation/summarization does not produce a valid document, or there is an issue with how you load it into Supabase.
C_K
February 26, 2025, 9:21am
4
Hi @ReinventNYC, did you find a solution for that? I have the same problem… I think it’s better to store one entry per row, but the Supabase vector store and the text splitter cut it into many pieces…
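One way to keep one entry per row is to collapse each row into a single text chunk before it reaches the vector store, so the splitter has nothing left to cut. A minimal Code-node sketch (the column names and `metadata` field are illustrative, not from the original workflow):

```javascript
// Sketch of an n8n Code node: turn each spreadsheet row into one
// self-contained text document so the text splitter has nothing to cut.
// Column names are illustrative.
function rowsToDocuments(items) {
  return items.map(item => {
    const content = Object.entries(item.json)
      .map(([key, value]) => `${key}: ${value}`)
      .join(' | ');
    return { json: { content, metadata: { source: 'xlsx' } } };
  });
}

// Inside the Code node you would just `return rowsToDocuments(items);`
const sample = [{ json: { Name: 'Alice', Age: 30, Job: 'Engineer' } }];
console.log(rowsToDocuments(sample)[0].json.content);
// "Name: Alice | Age: 30 | Job: Engineer"
```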
ReinventNYC:
I’m extracting data from an .xlsx file in my Google Drive, processing it through an aggregate and summarize node, and then inserting it into my Supabase vector store (table=documents). The records update in the Supabase table, but I can’t chat with the data. […]
Layman’s Terms (No Jargon!)
Problem: Your Excel file isn’t “talking” to Supabase like PDFs/Docs do.
Why? Excel hides sneaky formatting (colors, formulas, blank rows) that breaks robots.
Simple Fix:
Clean Your Excel:
Delete empty rows/columns
Remove colors/formulas (copy-paste values only)
Test with Baby Steps:
Use 3 columns, 5 rows (like “Name, Age, Job”)
Reprocess → Chat again
Still Broken? Let me know – we’ll debug like detectives!
Experienced User Guide (You’ve Done This Before)
Root Cause: Supabase vector stores need strict formatting. Excel often includes:
Hidden metadata (e.g., cell styles)
Inconsistent data types (mixing text/numbers)
Auto-added “phantom” rows
Workflow Fix:
Preprocess Excel:
Add a “Code” node after the Google Drive node:
```
// Remove hidden characters/formatting from every field
return items.map(item => {
  const cleanData = {};
  Object.keys(item.json).forEach(key => {
    cleanData[key] = String(item.json[key]).trim();
  });
  return { json: cleanData };
});
```
2. Aggregate Node Settings:
Set “Group By” to a unique column (e.g., ID)
Disable “Keep All Fields” (prevents junk data)
3. Supabase Node:
Enable “Skip if Empty” (ignores blank rows)
Check that the Primary Key matches your table’s unique ID
Test Script:
1. Process a 3-row test file
2. Check the Supabase table RAW (pgAdmin/Supabase dashboard)
3. If data exists but chat fails → vector schema mismatch!
Professional Deep Dive (For Engineers)
Technical Analysis:
Failure Point: Data ingestion succeeds, but the embeddings aren’t queryable.
Likely Culprits:
Schema Mismatch: the Supabase documents table expects a vector column of type vector(1536) (OpenAI standard).
Data Truncation: Excel’s row limits, or the aggregate node dropping fields.
Metadata Collision: Supabase auto-detects the schema on the first insert; subsequent changes break compatibility.
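The “inconsistent data types” culprit can be checked mechanically. A small diagnostic sketch, not part of the original workflow, that flags columns mixing text and numbers:

```javascript
// Diagnostic sketch: list spreadsheet columns whose values mix
// JavaScript types (e.g., some rows store an ID as a number, others
// as a string). Drop into an n8n Code node to inspect your data.
function findMixedTypeColumns(items) {
  const typesByColumn = {};
  for (const item of items) {
    for (const [key, value] of Object.entries(item.json)) {
      (typesByColumn[key] = typesByColumn[key] || new Set()).add(typeof value);
    }
  }
  return Object.keys(typesByColumn)
    .filter(key => typesByColumn[key].size > 1);
}

const exampleRows = [
  { json: { ID: 1, Name: 'Alice' } },
  { json: { ID: '2', Name: 'Bob' } }, // ID stored as text here
];
console.log(findMixedTypeColumns(exampleRows)); // [ 'ID' ]
```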
Advanced Solution:
A. Validate Supabase Schema
```
-- Run in the Supabase SQL Editor
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = 'documents';
```
Ensure:
The vector column exists and has the vector type
All Excel columns are present as text/varchar
B. Reprocess with Strict Typing
Add a “Function” node before the Supabase node:
```
// Enforce the target schema before inserting
return items.map(item => {
  return {
    json: {
      content: item.json.content,   // Your text data
      metadata: item.json.metadata, // Optional
      vector: item.json.vector ||   // Replace with real embedding logic
        Array(1536).fill(0)         // Placeholder for testing
    }
  };
});
```
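The zero-filled placeholder above will never match a real query, so a genuine embedding step is needed. A hedged sketch of building the OpenAI embeddings request (assumes an OPENAI_API_KEY environment variable and the text-embedding-ada-002 model, which returns 1536 dimensions to match vector(1536)):

```javascript
// Sketch: generate the embedding *before* the Supabase insert.
// Assumes the OpenAI embeddings endpoint and an OPENAI_API_KEY env var;
// text-embedding-ada-002 returns 1536 dims, matching vector(1536).
function buildEmbeddingRequest(text) {
  return {
    url: 'https://api.openai.com/v1/embeddings',
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      },
      body: JSON.stringify({ model: 'text-embedding-ada-002', input: text }),
    },
  };
}

// Usage (inside an n8n Code node, or any Node.js >= 18 runtime):
// const { url, options } = buildEmbeddingRequest(item.json.content);
// const res = await fetch(url, options);
// item.json.vector = (await res.json()).data[0].embedding;
```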
C. Debug with the n8n CLI
```
n8n execute --file your_workflow.json
```
Flags:
--pretty (show raw output)
--timeout 99999 (catch hangs)
D. Fallback: Bulk Delete & Re-ingest
```
-- Reset the Supabase table
TRUNCATE documents RESTART IDENTITY;
```
Re-run the workflow to force schema detection.
Critical Checks
Excel → JSON Format: use n8n’s “Spreadsheet File” node (not the Google Drive node) for direct .xlsx parsing.
Vectorization: confirm embeddings are generated before the Supabase insert (e.g., with an OpenAI node).
Supabase Indexes: create an IVFFlat index on the vector column if using pgvector.
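For reference, a typical IVFFlat index statement from the pgvector docs, assuming the column is named vector and cosine distance is used for similarity search:

```sql
-- Create after the table is populated; use vector_l2_ops for L2 distance
CREATE INDEX ON documents USING ivfflat (vector vector_cosine_ops)
WITH (lists = 100);
```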
Need More? Share:
A screenshot of your aggregate node settings
Your Supabase documents table schema
5 rows of sample Excel data
Please mark this as the solution if it resolved your issue. Be blessed!
1 Like
system
Closed
May 27, 2025, 9:33am
6
This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.