Slack users horrified to discover messages used for AI training

After launching Slack AI in February, Slack appears to be digging its heels in, defending its vague policy that by default sucks up customers’ data—including messages, content, and files—to train Slack’s global AI models.

According to Slack engineer Aaron Maurer, Slack has explained in a blog that the Salesforce-owned chat service does not train its large language models (LLMs) on customer data. But Slack’s policy may need updating “to explain more carefully how these privacy principles play with Slack AI,” Maurer wrote on Threads, partly because the policy “was originally written about the search/recommendation work we’ve been doing for years prior to Slack AI.”

Maurer was responding to a Threads post from engineer and writer Gergely Orosz, who called for companies to opt out of data sharing until the policy is clarified, not by a blog, but in the actual policy language.

“An ML engineer at Slack says they don’t use messages to train LLM models,” Orosz wrote. “My response is that the current terms allow them to do so. I’ll believe this is the policy when it’s in the policy. A blog post is not the privacy policy: every serious company knows this.”

The tension for users becomes clearer if you compare Slack’s privacy principles with how the company touts Slack AI.

Slack’s privacy principles specifically say that “Machine Learning (ML) and Artificial Intelligence (AI) are useful tools that we use in limited ways to enhance our product mission. To develop AI/ML models, our systems analyze Customer Data (e.g. messages, content, and files) submitted to Slack as well as other information (including usage information) as defined in our privacy policy and in your customer agreement.”

Meanwhile, Slack AI’s page says, “Work without worry. Your data is your data. We don’t use it to train Slack AI.”

Because of this incongruity, users called on Slack to update its privacy principles to make clear how data is used for Slack AI or any future AI features. According to a Salesforce spokesperson, the company agrees that an update is needed.

“Yesterday, some Slack community members asked for more clarity regarding our privacy principles,” Salesforce’s spokesperson told Ars. “We’ll be updating those principles today to better explain the relationship between customer data and generative AI in Slack.”

The spokesperson told Ars that the policy updates will clarify that Slack does not “develop LLMs or other generative models using customer data,” “use customer data to train third-party LLMs” or “build or train these models in such a way that they could learn, memorize, or be able to reproduce customer data.” The update will also clarify that “Slack AI uses off-the-shelf LLMs where the models don’t retain customer data,” ensuring that “customer data never leaves Slack’s trust boundary, and the providers of the LLM never have any access to the customer data.”

These changes, however, do not seem to address a key concern for users who never explicitly consented to sharing chats and other Slack content for use in AI training.

Users opting out of sharing chats with Slack

This controversial policy is not new. Wired warned about it in April, and TechCrunch reported that the policy has been in place since at least September 2023.

But widespread backlash began swelling last night on Hacker News, where Slack users called out the chat service for seemingly failing to notify users about the policy change, instead quietly opting them in by default. To critics, the default opt-in seemed to benefit no one but Slack.

From there, the backlash spread to social media, where SlackHQ hastened to clarify Slack’s terms with explanations that did not seem to address all the criticism.

“I’m sorry Slack, you’re doing fucking WHAT with user DMs, messages, files, etc?” Corey Quinn, the chief cloud economist for a cost management company called Duckbill Group, posted on X. “I’m positive I’m not reading this correctly.”

SlackHQ responded to Quinn after the economist declared, "I hate this so much," and confirmed that he had opted his paid workspace out of data sharing.

“To clarify, Slack has platform-level machine-learning models for things like channel and emoji recommendations and search results,” SlackHQ posted. “And yes, customers can exclude their data from helping train those (non-generative) ML models. Customer data belongs to the customer.”

Later in the thread, SlackHQ noted, "Slack AI—which is our generative AI experience natively built in Slack—is a separately purchased add-on that uses Large Language Models (LLMs) but does not train those LLMs on customer data."
