Support for data anonymization on AI nodes

The idea is:

We can enhance the Agent nodes to include a feature for anonymizing sensitive information before interacting with an LLM (Large Language Model). This would provide built-in support for anonymization, ensuring that personal or critical data such as phone numbers, names, or other identifiable information is not exposed to the model.

My use case:

I work with personal documents that need processing via an AI agent on n8n. Currently, I have to rely on LangChain code and an external tool like Presidio to anonymize sensitive information. Integrating this functionality directly into the Agent node would streamline my workflow and ensure better privacy for consumers who rely on public LLM services.

I think it would be beneficial to add this because:

  • It enhances data privacy by minimizing the risk of sensitive information leakage.

Any resources to support this?

Are you willing to work on this?

Yes, I am willing to collaborate by developing and/or testing the feature. However, I may need assistance with the development process because I’m not a JS developer (I’m already trying to implement this feature, but the code will probably need some improvements).

This would be a really nice feature!

I second this!

Can you share your current workflow for anonymizing PII data and recovering it for use after the LLM nodes?

Not at the moment. I can’t use the Presidio API as is, because its de-anonymization API requires the analysis results as input, and that analysis isn’t really effective once the encryption mechanism has been applied (the regular-expression-based rules no longer work). And since n8n can’t currently execute LangChain code in Python, I’ve put that part aside while I build my own API for this use case.

When it’s all done, I’ll put a link here with the API + workflow.
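
For readers following along, here is a minimal sketch of the round trip being discussed: replace sensitive values with placeholders before the LLM call, keep the mapping, and restore the values in the reply. This is not the API mentioned above and it does not use Presidio. The `PATTERNS` table and the `anonymize` / `deanonymize` helpers are hypothetical names chosen for illustration, and the two regular expressions are deliberately simplistic assumptions. Detecting names or addresses would need an NER-based tool.

```python
import re

# Hypothetical patterns for illustration only. Regexes cannot detect
# names or addresses; a real setup would use an NER-based analyzer.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def anonymize(text):
    """Replace each match with a numbered placeholder.

    Returns the masked text and a mapping placeholder -> original value.
    """
    mapping = {}  # placeholder -> original value
    reverse = {}  # original value -> placeholder, so repeats share one placeholder

    for label, pattern in PATTERNS.items():
        def substitute(match, label=label):
            value = match.group(0)
            if value not in reverse:
                placeholder = f"<{label}_{len(reverse) + 1}>"
                reverse[value] = placeholder
                mapping[placeholder] = value
            return reverse[value]

        text = pattern.sub(substitute, text)
    return text, mapping


def deanonymize(text, mapping):
    """Put the original values back wherever a placeholder appears."""
    for placeholder, value in mapping.items():
        text = text.replace(placeholder, value)
    return text


if __name__ == "__main__":
    prompt = "Contact Jane at jane.doe@example.com or +33 6 12 34 56 78."
    masked, mapping = anonymize(prompt)
    print(masked)  # this is what would be sent to the LLM
    # Simulated LLM reply that echoes a placeholder back.
    reply = "I will write to <EMAIL_1> tomorrow."
    print(deanonymize(reply, mapping))
```

Because the mapping is kept on the caller's side, restoring the values does not require re-running any analysis on the masked text. Note that the name "Jane" passes through untouched in this sketch, which is exactly the gap a proper analyzer would have to cover.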


Thank you! I’d be interested in that.

This is a great idea! Do you have any updates on this process @xunleii?

Hi, I haven’t had much spare time to work on this, but I’m building a proxy to handle this case more easily (and to be compatible with other LLM tools). I’ll post here when it is public :wink: