ARTICLE
17 April 2025

AI Data Clauses: Protecting Your Confidential Information

Caldwell

Contributor

Caldwell is a premier global law firm at the forefront of innovation and legal excellence, delivering best-in-class intellectual property, litigation, and corporate advice. The firm is a trusted legal partner for forward-thinking, high-growth companies, ranging from well-known venture capital funds to unicorns to listed corporates in Asia and the US, which seek truly strategic legal counsel.

With the increasing integration of AI tools in professional settings, safeguarding confidential client information has become more critical than ever. We discuss this issue at length in our article on AI Inference. Notably, many widely used AI agents collect and process user data in various ways, often outlined in their terms of use and privacy policies. However, users may not always be fully aware of how their information is stored, accessed, or utilised. Understanding these policies is essential for ensuring data security and compliance with confidentiality agreements. This article examines the data policies of well-known AI programs to help users make informed decisions about protecting sensitive information.

Examples of User Data Clauses:

ChatGPT

OpenAI retains the right to use ChatGPT user content to "provide, maintain, develop, and improve our Services, comply with applicable law, enforce [their] terms and policies, and keep [their] Services safe". [6] Users have the option to opt out in their account settings.

Gemini (Google)

Gemini does not use prompts or responses to train its AI models. Some features are only available through the Gemini for Google Cloud Trusted Tester Program, which allows customers to share data optionally; however, that data is used for product improvements, not for training Gemini models. [7]

Microsoft 365 Copilot

Microsoft 365 Copilot states that it stores user data but does not use it to train "foundation LLMs, including those used by Microsoft 365 Copilot". [8] However, it does access user data such as documents, emails, calendars, chats, meetings, and contacts, in combination with "the user's working context, such as the meeting a user is in now, the email exchanges the user had on a topic, or the chat conversations the user had last week", in order to provide responses. [8] This means that although the underlying Large Language Model (LLM) is not trained on your data, Copilot tracks a significant amount of it across Microsoft platforms to tailor its responses to you, accessing more private user data than other agents.

LinkedIn

LinkedIn, owned by Microsoft, is not an LLM but uses AI for purposes such as its writing assistant features. It was recently discovered that the company was training its AI on user data, including uploaded information as well as private chats, before updating its terms of service. [10] LinkedIn has since quietly added a clause to its privacy policy that allows the use of personal data to develop and train its AI models. [9] The setting is enabled by default, and users must manually disable it in their account settings, putting at risk the privacy of those who do not keep up to date with AI news.

Our Conclusion

The differences in how providers handle user data are significant, and they matter. While AI-powered tools offer clear benefits for workplace productivity, their data policies highlight the need for careful consideration when handling confidential information. Each platform varies in how it collects, stores, and processes user data, with some offering opt-out options while others track extensive user interactions. To mitigate risks, professionals should thoroughly review AI service policies, adjust privacy settings where possible, and implement best practices for safeguarding sensitive client data. As a precaution, all input should be anonymised and stripped of sensitive information before it is entered into an AI system; a minimal illustration of such a pre-processing step follows below. As AI technology continues to evolve, staying informed about data protection measures remains crucial to maintaining trust and compliance in professional environments.
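
As a rough sketch of that precaution, the Python snippet below shows one way a simple redaction step might strip obvious identifiers (email addresses, phone numbers, and a known client name list) from a prompt before it is sent to any AI service. The patterns, the redact_prompt helper, and the client list are illustrative assumptions only, not a complete anonymisation solution or any provider's recommended approach.

# Illustrative sketch only: a minimal pre-processing step that redacts
# obvious identifiers before a prompt is sent to an external AI service.
# The patterns and the example client list are assumptions; real
# anonymisation requires a far more thorough review.
import re

# Hypothetical list of client names that must never leave the firm.
CLIENT_NAMES = ["Acme Holdings", "Jane Doe"]

EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_PATTERN = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def redact_prompt(text: str) -> str:
    """Replace emails, phone numbers, and known client names with placeholders."""
    text = EMAIL_PATTERN.sub("[EMAIL REDACTED]", text)
    text = PHONE_PATTERN.sub("[PHONE REDACTED]", text)
    for name in CLIENT_NAMES:
        text = text.replace(name, "[CLIENT REDACTED]")
    return text

if __name__ == "__main__":
    prompt = ("Summarise the dispute involving Acme Holdings; "
              "contact jane.doe@example.com or +44 20 7946 0000.")
    print(redact_prompt(prompt))

Even with a step like this in place, redaction should be treated as a safety net rather than a substitute for reviewing each provider's data policy and opt-out settings.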

[1] Risk outlook report: The use of Artificial Intelligence in the legal market. Solicitors Regulation Authority. (2023, November 20). https://www.sra.org.uk/sra/research-publications/artificial-intelligence-legal-market/

[2] Hill, M. (2023, March 22). Sharing sensitive business data with ChatGPT could be risky. CSO Online. https://www.csoonline.com/article/574799/sharing-sensitive-business-data-with-chatgpt-could-be-risky.html

[4] ChatGPT and large language models: What's the risk? NCSC. (n.d.). https://www.ncsc.gov.uk/blog-post/chatgpt-and-large-language-models-whats-the-risk

[5] Hays, S. (2024, October 30). AI training and copyright infringement: Solutions from Asia. Tech Policy Press. https://www.techpolicy.press/ai-training-and-copyright-infringement-solutions-from-asia/

[6] Terms of Use. OpenAI. (n.d.). https://openai.com/policies/row-terms-of-use

[7] How Gemini for Google Cloud uses your data. Google. (n.d.). https://cloud.google.com/gemini/docs/discover/data-governance

[8] Data, Privacy, and Security for Microsoft 365 Copilot. Microsoft. (n.d.). https://learn.microsoft.com/en-us/copilot/microsoft-365/microsoft-365-copilot-privacy

[9] LinkedIn Privacy Policy. LinkedIn. (n.d.). https://www.linkedin.com/legal/privacy-policy#use

[10] Davis, W. (2024, September 19). LinkedIn is training AI models on your data. The Verge. https://www.theverge.com/2024/9/18/24248471/linkedin-ai-training-user-accounts-data-opt-in

The content of this article is intended to provide a general guide to the subject matter. Specialist advice should be sought about your specific circumstances.
