How can I build a workflow that automatically embeds all the PDF files stored in a folder and then stores the results in a PostgreSQL DB? The process should also run automatically whenever we add new PDFs to the folder.
Does anyone have an idea?
By "embeds" do you mean embedding? Do you need to prepare the PDFs for a vector DB for the AI?
Yes, exactly. I created a vector DB for my PDFs, but I have many PDFs in my local folder that are updated every day. How can I automatically update my vector DB for all the PDF files?
You need to convert each PDF to text and split it into chunks (by paragraph or by number of tokens), OR convert it to images and analyze them with a vision model.
Then send the text to an embedding model and store the embeddings in the DB, OR send the images to a collection cluster: https://www.youtube.com/live/_BQTnXpuH-E?si=T6OhznAGaQNdb1HF
This is the logic.
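For the "automatic update" part, here is a rough Python sketch of that logic, standard library only. It is just one way to do it, not a definitive implementation. It hashes every PDF in the folder, compares the hashes with a small JSON state file, and re-processes only the new or changed files. Everything named here is an assumption for illustration: `sync_folder`, `find_changed_pdfs`, `chunk_text` and the state file are made up, and `extract_text`, `embed` and `store` are hypothetical callables you would replace with your own PDF parser, embedding model and PostgreSQL insert (for example into a table using the pgvector extension).

```python
from __future__ import annotations

import hashlib
import json
from pathlib import Path
from typing import Callable


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB blocks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            h.update(block)
    return h.hexdigest()


def find_changed_pdfs(folder: Path, state_file: Path) -> tuple[list[Path], dict[str, str]]:
    """Compare current PDF hashes with the saved state; return (changed files, new state)."""
    seen = json.loads(state_file.read_text()) if state_file.exists() else {}
    current: dict[str, str] = {}
    changed: list[Path] = []
    for pdf in sorted(folder.glob("*.pdf")):
        digest = file_digest(pdf)
        current[pdf.name] = digest
        if seen.get(pdf.name) != digest:  # new file or modified content
            changed.append(pdf)
    return changed, current


def chunk_text(text: str, max_words: int = 200, overlap: int = 20) -> list[str]:
    """Split text into word-based chunks; consecutive chunks share `overlap` words."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be >= 0 and smaller than max_words")
    words = text.split()
    chunks: list[str] = []
    step = max_words - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last chunk already reached the end of the text
    return chunks


def sync_folder(
    folder: Path,
    state_file: Path,
    extract_text: Callable[[Path], str],                      # hypothetical: PDF -> text
    embed: Callable[[str], list[float]],                      # hypothetical: chunk -> vector
    store: Callable[[str, list[str], list[list[float]]], None],  # hypothetical: DB write
) -> list[str]:
    """Embed and store only new or changed PDFs; return the names that were processed."""
    changed, current = find_changed_pdfs(folder, state_file)
    for pdf in changed:
        chunks = chunk_text(extract_text(pdf))
        vectors = [embed(chunk) for chunk in chunks]
        store(pdf.name, chunks, vectors)
    # Save the state only after all files were processed successfully.
    state_file.write_text(json.dumps(current))
    return [pdf.name for pdf in changed]
```

You could call `sync_folder` from a scheduled job (cron, Task Scheduler) or from a file watcher. When a PDF changes, `store` should probably delete the old rows for that file name before inserting the new ones, so the DB does not keep stale chunks.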