LangChain workflow design help needed

I have a basic understanding of how LangChain works and have already built a few flows using the n8n LangChain nodes. I am now trying to level up.

I need help designing a workflow that has an AI (OpenAI GPT-4) answer 40 different questions. The source data to be queried lies in a Pinecone vector database. I would also need the AI to be able to access the web and its own general training to help answer the questions.

The 40 questions are preset, so I am not after a chatbot. The questions are in a Google Sheet. So I need the flow to pick up the questions, use the LangChain nodes and tools to ask each of the questions, and then write the answer back to the Google Sheet.

The use case is creating a marketing strategy document derived from querying a vector database that houses all the info scraped from a website. The 40 questions and answers are used to create the final report. Right now I just need a way to ask the questions and get the answers.

I have created a workflow that does the job but in a clumsy way, not using LangChain. I have summarised the website to get the content below the OpenAI token threshold for queries and to keep cost down. Then I have 40 OpenAI nodes, one for each question, and each time I include the web summary in the prompt.

I figure there must be a smarter, more efficient way to do this using the neat new LangChain nodes?

Information on your n8n setup

  • n8n version: latest beta
  • Database (default: SQLite): SQLite
  • n8n EXECUTIONS_PROCESS setting (default: own, main):
  • Running n8n via (Docker, npm, n8n cloud, desktop app): Docker, self-hosted
  • Operating system: Ubuntu

Hm, I can’t think of a better way given your requirements, but I am not really a LangChain user. Perhaps @oleg has an idea on how to best approach such a use case in n8n?

Hi @Robm, Happy New Year! :tada:
I’m glad to hear you already have a working example; that’s awesome! It would be helpful to see the workflow you’ve already implemented, as I’m not sure I 100% understand the use case.
Using the LangChain concepts, a more concise approach could be an agent with some custom tools. You’ve mentioned you have a knowledge base inside the Pinecone vector store; you could fetch the relevant context from it for each question and pass it to the agent as context for the answer. Additionally, the agent could have access to a web-browsing tool and a context-retrieval tool (maybe even with metadata filtering to only retrieve items for specific websites/companies).
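To make the "fetch context per question, then ask" idea concrete, here is a minimal Python sketch rather than an n8n workflow. The `retrieve` and `ask_model` callables are hypothetical stand-ins for the Pinecone lookup and the chat model call; they are not real library APIs, and the prompt wording is only an assumption.

```python
from typing import Callable, List


def answer_question(
    question: str,
    retrieve: Callable[[str, int], List[str]],  # hypothetical vector-store lookup
    ask_model: Callable[[str], str],            # hypothetical chat-model call
    top_k: int = 4,
) -> str:
    """Fetch context for one question and ask the model to answer from it."""
    chunks = retrieve(question, top_k)
    context = "\n\n".join(chunks)
    prompt = (
        "Answer the question using the context below. "
        "If the context is not enough, use your general knowledge.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return ask_model(prompt)


# Stand-ins so the sketch runs without Pinecone or OpenAI (made-up data).
def fake_retrieve(query: str, top_k: int) -> List[str]:
    knowledge_base = [
        "Acme sells garden tools online.",
        "Acme ships to the UK and Ireland.",
        "Acme was founded in 2010.",
    ]
    return knowledge_base[:top_k]


def fake_model(prompt: str) -> str:
    return f"(model answer based on {len(prompt)} characters of prompt)"


if __name__ == "__main__":
    print(answer_question("Who are the customers?", fake_retrieve, fake_model))
```

Because only the chunks retrieved for that question go into the prompt, there is no need to paste a full website summary into every call.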
You could also add one more step after the agent that checks the answer: pass the context, the question, and the agent’s answer to the LM again and ask it to determine whether anything is missing, or even to ask a refining question.
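A rough Python sketch of such a checking step could look like the following. The `ask_model` callable is a hypothetical stand-in for the LM call, and the COMPLETE / MISSING reply convention is an assumption made for this example, not something n8n or LangChain prescribes.

```python
from typing import Callable, Dict, Union


def review_answer(
    question: str,
    context: str,
    answer: str,
    ask_model: Callable[[str], str],  # hypothetical LM call
) -> Dict[str, Union[bool, str]]:
    """Ask the model whether the answer fully covers the question."""
    prompt = (
        "You are reviewing an answer.\n"
        "Reply with COMPLETE if the answer fully addresses the question "
        "given the context. Otherwise reply with MISSING: followed by what "
        "is missing or a refining question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n\n"
        f"Answer: {answer}"
    )
    verdict = ask_model(prompt).strip()
    # Anything that does not start with COMPLETE is treated as needing work.
    complete = verdict.upper().startswith("COMPLETE")
    return {"complete": complete, "feedback": verdict}
```

If `complete` is false, the feedback text can be fed back to the agent as a follow-up question before the answer is stored.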

Hi @oleg Happy New Year! :tada: as well!

Thanks for coming back to me.

What you have recommended is what I have in mind and am trying to build. Where I am getting stuck is how to systematically ask the agent each question in turn, push the answer to a Google Sheet, and then ask the question on the next row of the sheet, and so on until all questions have been answered.
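The control flow I am after can be sketched in plain Python as below. This is only an illustration of the loop, not n8n code: the rows are an in-memory list standing in for the Google Sheet, and `answer_fn` and `write_row` are hypothetical callbacks for the agent call and the sheet update.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def answer_all(
    rows: List[Row],
    answer_fn: Callable[[str], str],         # hypothetical agent call
    write_row: Callable[[int, Row], None],   # hypothetical sheet update
) -> int:
    """Answer each unanswered row in order, writing back after every answer."""
    answered = 0
    for index, row in enumerate(rows):
        if row.get("answer"):
            continue  # already answered, so a rerun resumes where it stopped
        row["answer"] = answer_fn(row["question"])
        write_row(index, row)  # persist before moving to the next question
        answered += 1
    return answered


if __name__ == "__main__":
    sheet = [
        {"question": "Who is the target audience?", "answer": ""},
        {"question": "What is the tone of voice?", "answer": ""},
    ]
    count = answer_all(
        sheet,
        answer_fn=lambda q: f"(answer to: {q})",
        write_row=lambda i, r: print(f"row {i}: {r['answer']}"),
    )
    print(f"{count} questions answered")
```

Writing each answer back before moving on means a failure on question 30 does not lose the first 29 answers.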

Almost every LangChain example I have seen uses live chat as a trigger, where a human needs to ask the question each time. I need to replace the human live chat trigger with maybe a webhook that starts a flow that asks all questions in turn.

The use case is that I am building a flow that scrapes a customer website to create a knowledge base. This KB is upserted into a Pinecone vector database. Then I have 40 questions to ask of the LM (OpenAI GPT-4) using the KB and its own training. The answers are pushed back to a Google Sheet. Then, using a search-and-replace function, I compile all the answers into a document for the customer, in this case a marketing strategy document.
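For the final search-and-replace step, a minimal Python sketch of the idea is below. The `{{Q1}}` placeholder style and the `fill_template` helper are assumptions made for illustration; my actual document may mark the slots differently.

```python
import re
from typing import Dict

# Matches placeholders such as {{Q1}} or {{ Q2 }} (assumed convention).
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def fill_template(template: str, answers: Dict[str, str]) -> str:
    """Replace each {{key}} with its answer; unknown keys are left in place."""

    def replace(match: "re.Match[str]") -> str:
        key = match.group(1)
        return answers.get(key, match.group(0))

    # A function is used as the replacement so that backslashes in an
    # answer are inserted literally rather than read as escape sequences.
    return PLACEHOLDER.sub(replace, template)


if __name__ == "__main__":
    template = "Target audience: {{Q1}}\nTone of voice: {{Q2}}\nBudget: {{Q3}}"
    answers = {"Q1": "home gardeners", "Q2": "friendly and practical"}
    print(fill_template(template, answers))
```

Leaving unknown placeholders untouched makes it easy to spot which questions still have no answer in the compiled document.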