Can AI be stopped from scraping the internet? Experts say it might be too late
April 10, 2025

Key Points
- Jun Seki, CTO and Entrepreneur-in-Residence at Antler, highlights the challenges media organizations and businesses face when AI models scrape content without permission.
We use large language models to detect and label sensitive information based on what industry a company operates in. That way, businesses can set up policies to automatically classify different levels of information and prevent accidental data leaks.
Jun Seki
CTO and Entrepreneur-in-Residence | Antler
AI is upending industries at an unprecedented pace, but for media outlets, businesses, and even governments, the risks of AI could outweigh the rewards. From rampant content scraping to state-level surveillance, AI presents new threats that many are struggling to counter.
Jun Seki, CTO and Entrepreneur-in-Residence at Antler, has been closely analyzing the security risks AI poses—and he’s working on solutions to help businesses protect themselves.
Publishers losing control: For digital media organizations, the rise of AI-powered models has created a crisis. Publications that spent decades building credibility and revenue streams are now seeing their content quietly siphoned into AI training models—without permission or compensation.
"Newspapers feel powerless," Seki says. "Their articles have already been scraped and used to train AI models. It’s like they’ve been robbed, but they don’t know who broke into their house."
Some publishers are fighting back through licensing. In the UK, a specialized licensing agency has introduced an AI fair usage license, allowing media companies to charge for their content when used for AI training. While this is a step toward accountability, enforcement remains a major challenge—especially for smaller publishers that lack legal and financial resources.
Bot protection: For companies looking to protect their content, Seki suggests implementing technical defenses, such as embedding hidden tracking codes in published content. "These codes are invisible to the average reader, but if an AI bot scrapes the content, you can trace it and use that evidence in legal actions," he explains. However, most content creators don’t have the capability to deploy such solutions. Until stronger safeguards are in place, Seki warns that "the only surefire way to prevent your sensitive information from being leaked to AI models is simply not to share it with them."
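One way such invisible tracking codes could work is with zero-width Unicode characters hidden inside published text. The sketch below is an illustrative assumption, not a description of the specific technique Seki has in mind; the function names and the publisher identifier are hypothetical.

```python
# A minimal sketch of hiding a trace code in article text using zero-width
# characters. Illustrative only -- the exact technique Seki refers to is not
# specified in the interview.

ZERO_WIDTH = {"0": "\u200b", "1": "\u200c"}  # zero-width space / non-joiner
REVERSE = {v: k for k, v in ZERO_WIDTH.items()}

def embed_trace_code(text: str, code: str) -> str:
    """Hide a binary identifier after the first sentence of an article."""
    bits = "".join(format(ord(ch), "08b") for ch in code)
    watermark = "".join(ZERO_WIDTH[b] for b in bits)
    head, sep, tail = text.partition(". ")
    return head + sep + watermark + tail

def extract_trace_code(text: str) -> str:
    """Recover the hidden identifier from scraped text, if it survived."""
    bits = "".join(REVERSE[ch] for ch in text if ch in REVERSE)
    chars = [chr(int(bits[i:i + 8], 2)) for i in range(0, len(bits) - 7, 8)]
    return "".join(chars)

article = "Newspapers feel powerless. Their articles have been scraped."
marked = embed_trace_code(article, "pub-42")
assert extract_trace_code(marked) == "pub-42"
```

If scraped text later surfaces in a model's output or training corpus with the identifier intact, it becomes evidence of where the content came from, which is the kind of legal trail Seki describes.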
There are different levels of safeguards. At the LLM level, you need guardrails to protect against prompt injections and harmful content. But at the application level, we’re trying to build new solutions to filter and mask data before it even reaches the AI.
Jun Seki
CTO and Entrepreneur-in-Residence | Antler
AI under government control: AI’s risks extend far beyond content theft. Seki weighs in on Chinese AI privacy laws and how they intersect with DeepSeek: "If DeepSeek is used to organize protests against the Chinese government, authorities can take over the entire model," Seki warns. "They can look underneath the hood and track which users have been discussing certain topics."
While foreign companies operating outside China may be protected by their own country’s privacy laws, Chinese businesses using DeepSeek or other Chinese-backed models could face severe consequences. "If you’re a Chinese company using AI that’s monitored by the government, you’re at huge risk of landing on a government watchlist," he explains.
Building safeguards: Seki is actively developing solutions to address AI’s security vulnerabilities. "There are different levels of safeguards," he explains. "At the LLM level, you need guardrails to protect against prompt injections and harmful content. But at the application level, we’re trying to build new solutions to filter and mask data before it even reaches the AI."
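To make the application-level idea concrete, here is a minimal sketch of a pre-filter that masks obviously sensitive values before a prompt is forwarded to a model. The regex patterns and function names are simplified assumptions for illustration; Seki's tooling uses LLMs and industry-specific policies for detection rather than fixed patterns.

```python
import re

# Minimal sketch of an application-level filter: mask sensitive values
# before the text ever reaches a third-party model. The patterns below are
# simplified assumptions, not Antler's actual detection logic.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "CARD": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def mask_sensitive(text: str) -> str:
    """Replace detected sensitive values with labeled placeholders."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text

def send_to_model(prompt: str) -> str:
    """Stand-in for whatever LLM call the application makes."""
    return f"(model response to: {prompt})"

raw = "Patient record: jane@example.com, SSN 123-45-6789, reports chest pain."
print(send_to_model(mask_sensitive(raw)))
# The model only ever sees the redacted prompt.
```

The key design point is that masking happens in the application layer, so no unredacted data crosses the boundary to the model provider, regardless of what guardrails the model itself has.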
His approach includes creating AI security tools across multiple platforms, from chatbots to Chrome extensions and desktop applications. "We’re working on ways to anonymize data and detect sensitive information before it’s shared," Seki says.
One of his core solutions focuses on businesses operating in heavily regulated industries, where leaking customer information can lead to legal and financial repercussions. "We use large language models to detect and label sensitive information based on what industry a company operates in," he explains. "That way, businesses can set up policies to automatically classify different levels of information and prevent accidental data leaks."
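A compressed sketch of how such an industry-aware policy layer might look is below. The label sets, policy tiers, and the `classify_with_llm` stand-in are illustrative assumptions, not Antler's actual implementation.

```python
# Sketch of industry-specific labeling plus a classification policy.
# The labels, tiers, and the classify_with_llm stand-in are assumptions
# made for illustration.
INDUSTRY_LABELS = {
    "healthcare": ["PATIENT_ID", "DIAGNOSIS", "MEDICATION"],
    "banking": ["ACCOUNT_NUMBER", "TRANSACTION", "CREDIT_SCORE"],
}

POLICY = {  # what each classification level is allowed to do
    "public": "share freely",
    "internal": "share only with approved internal tools",
    "restricted": "never send to external AI models",
}

def classify_with_llm(text: str, labels: list[str]) -> list[str]:
    """Stand-in for an LLM call that tags text with the given labels."""
    return [l for l in labels if l.split("_")[0].lower() in text.lower()]

def classification_level(found: list[str]) -> str:
    """Map detected labels to a policy tier."""
    if any(l in ("PATIENT_ID", "ACCOUNT_NUMBER") for l in found):
        return "restricted"
    return "internal" if found else "public"

doc = "Patient 4471 diagnosis: hypertension; medication adjusted."
labels = classify_with_llm(doc, INDUSTRY_LABELS["healthcare"])
print(labels, "->", POLICY[classification_level(labels)])
```

In this framing, the LLM handles detection and labeling, while the policy table is what lets a business decide automatically which information may leave the organization.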
In industries like healthcare, where AI is increasingly used for patient interactions, these protections are critical. "For example, we can identify patient medical histories and anonymize them before they’re processed by an AI," Seki says. "Once data is used to train a third-party model, removing it is nearly impossible. That’s why prevention is key."