VentureBeat

Meta announces its Superintelligence Labs Chief Scientist: former OpenAI GPT-4 co-creator Shengjia Zhao

Meta has appointed Shengjia Zhao, a former OpenAI researcher and co-creator of GPT-4, as the Chief Scientist of its newly created Meta Superintelligence Labs (MSL). Mark Zuckerberg announced the appointment Friday on Threads, noting Zhao will lead the lab's scientific agenda alongside him and Alexandr Wang, the former CEO of Scale AI whom Meta recently brought onboard as Chief AI Officer.

"I am very excited to take up the role of chief scientist for meta super-intelligence labs. Looking forward to building asi [artificial superintelligence] and aligning it to empower people with the amazing team here. Let's build!" Zhao wrote in his own Threads post.

"Artificial superintelligence" is a nebulous term used in the AI industry to describe systems more powerful and capable than any available today, beyond even the smartest humans, which would make them difficult to control.

Zhao's strong commercial AI background

Zhao, who previously worked at OpenAI, played a key role in the development of foundational models like GPT-4 and GPT-4o, according to arXiv system cards and research papers listing him as a co-author. He's also known for his academic work on generative models and fair representations, with widely cited papers in venues like NeurIPS, ICML, and ICLR.

Zhao joins Meta amid a high-stakes hiring blitz across the AI industry. Over the past few months, Meta has poached researchers from OpenAI, Apple, Google, and Anthropic as part of a multibillion-dollar bet on superintelligence, as CNN reported. Meta recently invested $14.3 billion in Scale AI, acquiring a 49% stake and bringing on Wang to lead the superintelligence effort. Former GitHub CEO Nat Friedman also joined the team.

The company has reportedly offered compensation packages worth as much as $100 million to $300 million over four years to lure top AI talent, according to multiple reports. One claim from a rival AI startup founder alleged Meta offered $1.25 billion over four years—approximately $312 million per year—to a single candidate who declined. Other insiders say Meta's most senior AI scientists may be receiving $10 million+ per year, while first-year comp for some new hires reportedly reached $100 million.

Aspirations of leading the AI frontier

Zuckerberg has made no secret of his ambition to make Meta a leader in AI's next frontier, repeatedly stating that the company plans to "invest hundreds of billions of dollars into compute to build superintelligence" using its own business-generated capital. He said the Llama 4 rollout underscored the importance of elite talent: "You can have hundreds of thousands of GPUs, but if you don't have the right team developing the model, it doesn't matter."

Meta's fundamental AI research group (FAIR), still led by acclaimed scientist Yann LeCun, will remain separate from the new lab. The creation of Meta Superintelligence Labs signals a more product- and mission-focused arm of Meta's AI efforts, centered on building and aligning ASI with human interests.

Making up for the mixed reception of Llama 4

However, Meta's push into superintelligence has come on the heels of a bumpy rollout of its latest open-source foundation models. The company released its Llama 4 model family in April 2025, positioning it as a leap forward in multimodal reasoning and long-context understanding. But the release has struggled to gain traction amid the rise of powerful Chinese open-source rivals like DeepSeek and Qwen.

Meta faced public criticism from researchers and developers who cited poor real-world performance, confusion around benchmark results, and inconsistent quality across deployments. Some accused the company of "benchmark gamesmanship" and using unreleased, optimized versions of Llama 4 to boost public perception—a claim Meta has denied. Internal sources blamed fast rollout timelines and bugs for the issues, but the episode has cast a shadow over Meta's generative AI credibility just as it embarks on its most ambitious effort yet.

Jim Fan, a former Stanford colleague of Zhao and now Nvidia's Director of Robotics and Distinguished Scientist, offered his endorsement on X: "Shengjia is one of the brightest, humblest, and most passionate scientists I know. Very bullish on MSL!"

The move underscores Meta's strategy of spending aggressively now to secure a dominant position in what it views as the next foundational technology platform — one that could eclipse the mobile internet. As Zuckerberg sees it, ASI isn't a moonshot — it's the next frontier, and Meta intends to lead.


You’ve heard of AI ‘Deep Research’ tools…now Manus is launching ‘Wide Research’ that spins up 100+ agents to scour the web for you

Singaporean AI startup Manus, which made headlines earlier this year for its multi-agent orchestration platform for consumers and "pro"-sumers (professionals wanting to run work operations), is back with an interesting new use of its technology.

While many other major rival AI providers such as OpenAI, Google, and xAI have launched "Deep Research" or "Deep Researcher" AI agents that conduct minutes or hours of extensive, in-depth web research and write well-cited, thorough reports on behalf of users, Manus is taking a different approach. The company just announced "Wide Research," a new experimental feature that enables users to execute large-scale, high-volume tasks by leveraging the power of parallelized AI agents — even more than 100 at a single time, all focused on completing a single task (or a series of sub-tasks laddering up to said overarching goal). Manus was previously reported to be using Anthropic Claude models to power its platform.

Parallel processing for research, summarization and creative output

In a video posted on the official X account, Manus co-founder and Chief Scientist Yichao 'Peak' Ji shows a demo of using Wide Research to compare 100 sneakers. To complete the task, Manus Wide Research nearly instantly spins up 100 concurrent subagents — each assigned to analyze one shoe's design, pricing, and availability. The result is a sortable matrix delivered in both spreadsheet and webpage formats within minutes.

The company suggests Wide Research isn't limited to data analysis. It can also be used for creative tasks like design exploration. In one scenario, Manus agents simultaneously generated poster designs across 50 distinct visual styles, returning polished assets in a downloadable ZIP file.

According to Manus, this flexibility stems from the system-level approach to parallel processing and agent-to-agent communication. In the video, Peak explains that Wide Research is the first application of an optimized virtualization and agent architecture capable of scaling compute power 100 times beyond initial offerings. The feature is designed to activate automatically during tasks that require wide-scale analysis, with no manual toggles or configurations required.

Availability and pricing

Wide Research is available starting today for users on the Manus Pro plan and will gradually become accessible to those on the Plus and Basic plans. As of now, monthly subscription pricing for Manus is structured as follows:

Free – $0/month: Includes 300 daily refresh credits, access to Chat mode, 1 concurrent task, and 1 scheduled task.
Basic – $19/month: Adds 1,900 monthly credits (+1,900 bonus during limited offer), 2 concurrent and 2 scheduled tasks, access to advanced models in Agent mode, image/video/slides generation, and exclusive data sources.
Plus – $39/month: Increases to 3 concurrent and 3 scheduled tasks, 3,900 monthly credits (+3,900 bonus), and includes all Basic features.
Pro – $199/month: Offers 10 concurrent and 10 scheduled tasks, 19,900 credits (+19,900 bonus), early access to beta features, a Manus T-shirt, and the full feature set including advanced agent tools and content generation.

There's also a 17% discount on these prices for users who wish to pay up-front annually.

The launch builds on the infrastructure introduced with Manus earlier this year, which the company describes as not just an AI agent, but a personal cloud computing platform. Each Manus session runs on a dedicated virtual machine, giving users access to orchestrated cloud compute through natural language — a setup the company sees as key to enabling true general-purpose AI workflows.

With Wide Research, Manus users can delegate research or creative exploration across dozens or even hundreds of subagents. Unlike traditional multi-agent systems with predefined roles (such as manager, coder, or designer), each subagent within Wide Research is a fully capable, fully featured Manus instance — not a specialized one for a specific role — operating independently and able to take on any general task. This architectural decision, the company says, opens the door to flexible, scalable task handling unconstrained by rigid templates.

What are the benefits of Wide over Deep Research?

The implication seems to be that running all these agents in parallel is faster and will result in a better and more varied set of work products beyond research reports, as opposed to the single "Deep Research" agents other AI providers have shown or fielded.

But while Manus promotes Wide Research as a breakthrough in agent parallelism, the company does not provide direct evidence that spawning dozens or hundreds of subagents is more effective than having a single, high-capacity agent handle tasks sequentially. The release does not include performance benchmarks, comparisons, or technical explanations to justify the trade-offs of this approach — such as increased resource usage, coordination complexity, or potential inefficiencies. It also lacks details on how subagents collaborate, how results are merged, or whether the system offers measurable advantages in speed, accuracy, or cost. As a result, while the feature showcases architectural ambition, its practical benefits over simpler methods remain unproven based on the information provided.

Sub-agents have a mixed track record more generally, so far…

While Manus's implementation of Wide Research is positioned as an advancement in general AI agent systems, the broader ecosystem has seen mixed results with similar subagent approaches. For example, on Reddit, self-described users of Claude Code have raised concerns about its subagents being slow, consuming large volumes of tokens, and offering limited visibility into execution. Common pain points include lack of coordination protocols between agents, difficulties in debugging, and erratic performance during high-load periods. These challenges don't necessarily reflect on Manus's implementation, but they highlight the complexity of developing robust multi-agent frameworks. Manus acknowledges that Wide Research is still
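The fan-out pattern described above, in which one coordinating task spawns many identical, general-purpose subagents and merges their outputs into a single table, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Manus code: the run_subagent stub and the sneaker-comparison workload stand in for whatever agent runtime and task a real deployment would use.

    import asyncio

    async def run_subagent(item: str) -> dict:
        # Hypothetical stand-in for a full, general-purpose agent instance
        # that browses the web, analyzes one item, and returns structured findings.
        await asyncio.sleep(0)  # placeholder for real agent work
        return {"item": item, "design": "...", "price": "...", "availability": "..."}

    async def wide_research(items: list[str]) -> list[dict]:
        # Spin up one subagent per item, run them concurrently,
        # then merge the per-item results into one sortable table.
        results = await asyncio.gather(*(run_subagent(i) for i in items))
        return sorted(results, key=lambda r: r["item"])

    if __name__ == "__main__":
        sneakers = [f"sneaker_{n}" for n in range(100)]
        table = asyncio.run(wide_research(sneakers))
        print(len(table), "rows collected")

The open questions noted above (how results are merged, what coordination costs look like) live mostly in the gather-and-merge step, which real systems handle with far more machinery than this toy shows.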


AI vs. AI: Prophet Security raises $30M to replace human analysts with autonomous defenders

Prophet Security, a startup developing autonomous artificial intelligence systems for cybersecurity defense, announced Tuesday it has raised $30 million in Series A funding to accelerate what its founders describe as a fundamental shift from human-versus-human to "agent-versus-agent" warfare in cybersecurity.

The Menlo Park-based company's funding round, led by venture capital firm Accel with participation from Bain Capital Ventures, comes as organizations struggle with an overwhelming volume of security alerts while sophisticated attackers increasingly leverage AI to scale and automate their operations. Prophet's approach represents a marked departure from the "copilot" AI tools that have dominated the market, instead deploying fully autonomous agents that can investigate and respond to threats without human intervention.

"Every security operations team is faced with a dual mandate of reducing risk while driving operational efficiency," said Kamal Shah, Prophet Security's co-founder and CEO, in an exclusive interview with VentureBeat. "Our Agentic AI SOC Platform addresses both challenges by automating manual, repetitive tasks in security operations with speed, accuracy and explainability."

The funding announcement coincides with Prophet's launch of what it calls the industry's most comprehensive Agentic AI SOC Platform, expanding beyond its initial Prophet AI SOC Analyst to include Prophet AI Threat Hunter and Prophet AI Detection Advisor. The platform represents a significant evolution from traditional Security Operations Center (SOC) automation tools, which typically rely on rigid, pre-programmed playbooks.

Security teams drowning in 960 daily alerts face unprecedented capacity crisis

The cybersecurity industry faces a crisis of capacity and capability. Shah, who previously served as CEO of container security company StackRox before its acquisition by Red Hat, experienced these challenges firsthand. According to his observations, organizations receive an average of 960 security alerts daily, with up to 40% going uninvestigated due to resource constraints.

"The number one complaint that I see from customers every single day is too many alerts, too many false positives," Shah explained. "If you think about the world that we live in today, on average, a company gets 960 alerts a day from all the security tools that they have in their environment, and 40% of those alerts are ignored because they just don't have the capacity to go and investigate all those alerts."

The problem is compounded by a severe shortage of skilled cybersecurity professionals. Shah points to what he calls a critical talent gap, noting there are 5 million open positions in cybersecurity globally, creating a situation where even organizations with budget to hire cannot find qualified personnel.

Prophet's solution directly addresses this capacity crunch. Over the past six months, the company's AI SOC Analyst has performed more than 1 million autonomous investigations across its customer base, saving an estimated 360,000 hours of investigation time while delivering 10 times faster response times and reducing false positives by 96%.

How autonomous AI agents differ from reactive copilot systems transforming cybersecurity

The distinction between Prophet's "agentic" AI and the copilot models deployed by larger cybersecurity vendors like CrowdStrike, Microsoft, and SentinelOne is fundamental to understanding the company's value proposition. Traditional copilot systems require human analysts to initiate queries and interpret responses, essentially serving as sophisticated search interfaces for security data.

"Copilot is reactive," Shah explained. "You have an alert come in and a security analyst has to go and write questions, ask the question to say, hey, what does this mean? And you have to know what questions to ask. The analyst is still in the loop for every single alert that comes in because they're interacting with it."

By contrast, Prophet's agentic AI proactively initiates investigations the moment an alert is triggered, autonomously gathering evidence, reasoning through the data, and reaching conclusions without human intervention. The system documents every step of its investigation process, creating an audit trail that allows security teams to understand and verify its reasoning.

"What Prophet AI is able to do is immediately, once an alert is triggered, it proactively goes and completes the investigation," Shah said. "Within a matter of minutes, your investigation is complete and it knows what questions to ask, and it's been trained to act like an expert analyst."

Building enterprise trust through transparent AI decision-making and data protection

Prophet's system leverages multiple frontier AI models, including offerings from OpenAI, Anthropic, and others, selecting the most appropriate model for each specific task. The company has built what Shah describes as an "evals framework" to ensure accuracy, repeatability, and consistency while preventing AI hallucinations—a critical concern in security contexts where false information can lead to inappropriate responses.

"In security, you are in a trust building exercise with the security teams, and if you hallucinate, you're going to lose trust and they're not going to use your product," Shah emphasized. The company employs a retrieval-augmented generation (RAG) architecture combined with rigorous evaluation processes to maintain what Shah calls "a high bar for security teams."

Data privacy and security represent paramount concerns for Prophet's enterprise customers. The company employs a single-tenant architecture ensuring customer data remains isolated, and maintains contractual agreements with AI model providers preventing customer data from being used to train or fine-tune models.

Early customers report dramatic efficiency gains as AI handles thousands of security alerts

Prophet's customer base includes Docker, which provided a testimonial for the funding announcement. Tushar Jain, Docker's EVP of Engineering and Product, noted that "Prophet AI is already helping streamline parts of our security workflow, and we're just getting started. With the recent release of Threat Hunter and growing integration with our systems, we see a clear path to faster response times, reduced
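The reactive-versus-proactive distinction Shah draws can be pictured as the difference between waiting for an analyst's question and running a full investigation loop the moment an alert fires. The sketch below is a generic illustration of such a loop, not Prophet's implementation; fetch_evidence and ask_model are hypothetical stand-ins for the telemetry sources and frontier models an agentic SOC platform would call, and the audit trail mirrors the step-by-step documentation the company describes.

    from dataclasses import dataclass, field

    @dataclass
    class Investigation:
        alert_id: str
        steps: list[str] = field(default_factory=list)  # audit trail of every action taken
        verdict: str = "undetermined"

    def fetch_evidence(alert_id: str, source: str) -> str:
        # Hypothetical: pull related logs, identity data, or endpoint telemetry for the alert.
        return f"evidence from {source} for {alert_id}"

    def ask_model(question: str, context: list[str]) -> str:
        # Hypothetical LLM call; a real system would route to the model best suited
        # to the task and ground it with retrieved context (RAG) to limit hallucination.
        return "benign" if context else "needs review"

    def investigate(alert_id: str) -> Investigation:
        # Triggered by the alert itself, with no analyst prompt required.
        inv = Investigation(alert_id)
        evidence = []
        for source in ("edr", "identity", "network"):  # the agent decides what to gather
            evidence.append(fetch_evidence(alert_id, source))
            inv.steps.append(f"collected {source} evidence")
        inv.verdict = ask_model("Is this alert malicious?", evidence)
        inv.steps.append(f"concluded: {inv.verdict}")
        return inv

    result = investigate("alert-42")
    print(result.verdict, result.steps)

A copilot-style tool, by contrast, would sit idle until an analyst typed the equivalent of each of these steps by hand.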


Runloop lands $7M to power AI coding agents with cloud-based devboxes

Runloop, a San Francisco-based infrastructure startup, has raised $7 million in seed funding to address what its founders call the "production gap" — the critical challenge of deploying AI coding agents beyond experimental prototypes into real-world enterprise environments. The funding round was led by The General Partnership with participation from Blank Ventures.

The AI coding tools market is projected to reach $30.1 billion by 2032, growing at a compound annual growth rate (CAGR) of 27.1%. The investment signals growing investor confidence in infrastructure that enables AI agents to work at enterprise scale. Runloop's platform addresses a fundamental question that has emerged as AI coding tools proliferate: Where do AI agents actually run when they need to perform complex, multi-step coding tasks?

"I think long term, the dream is that for every employee at every big company, there's maybe five or 10 different digital employees, or AI agents that are helping those people do their jobs," Jonathan Wall, Runloop's co-founder and CEO, explained in an exclusive interview with VentureBeat. Wall co-founded Google Wallet and fintech startup Index, which was acquired by Stripe.

The analogy Wall uses is telling: "If you think about hiring a new employee at your average tech company, your first day on the job, they're like, 'Okay, here's your laptop, here's your email address, here are your credentials. Here's how you sign into GitHub.' You probably spend your first day setting that environment up."

That same principle applies to AI agents, Wall argues. "If you expect these AI agents to be able to do the kinds of things people are doing, they're going to need all the same tools. They're going to need their own work environment."

Runloop focused initially on the coding vertical based on a strategic insight about the nature of programming languages versus natural language. "Coding languages are far narrower and stricter than something like English," Wall explained. "They have very strict syntax. They're very pattern driven. These are things large language models (LLMs) are really good at."

More importantly, coding offers what Wall calls "built-in verification functions." An AI agent writing code can continuously validate its progress by running tests, compiling code or using linting tools. "Those kind of tools aren't really available in other environments. If you're writing an essay, I guess you could do spell check, but evaluating the relative quality of an essay while you're partway through it — there's not a compiler."

This technical advantage has proven prescient. The AI code tools market has indeed emerged as one of the fastest-growing segments in enterprise AI, driven by tools like GitHub Copilot, which Microsoft reports is used by millions of developers, and OpenAI's recently announced Codex improvements.

Inside Runloop's cloud-based devboxes: Enterprise AI agent infrastructure

Runloop's core product, called "devboxes," provides isolated, cloud-based development environments where AI agents can safely execute code with full filesystem and build tool access. These environments are ephemeral — they can be spun up and torn down dynamically based on demand.

"You can spin up 1,000, use 1,000 for an hour, then maybe you're done with some particular task," said Wall. Then, "you don't need 1,000, so you can tear them down."

One example illustrates the platform's utility. One customer builds AI agents that automatically write unit tests to improve code coverage; when it detects production issues in its customers' systems, it deploys thousands of devboxes simultaneously to analyze code repositories and generate comprehensive test suites.

"They'll onboard a new company and say, 'Hey, the first thing we should do is look at your code coverage everywhere, notice where it's lacking, go write a whole ton of tests then cherry pick the most valuable ones to send to your engineers for code review,'" Wall explained.

Runloop customer success: Six-month time savings and 200% customer growth

Despite only launching billing in March and self-service signup in May, Runloop has achieved significant momentum. The company reports "a few dozen customers," including Series A companies and major model laboratories, with customer growth exceeding 200% and revenue growth exceeding 100% since March.

"Our customers tend to be of the size and shape of people who are very early on the AI curve, and are pretty sophisticated about using AI," Wall noted. "That right now, at least, tends to be Series A companies trying to build AI as their core competency, or some of the model labs who obviously are the most sophisticated about it."

The impact appears substantial. Dan Robinson, CEO of Detail.dev, a Runloop customer, called the platform "killer for our business. We couldn't have gotten to market so quickly without it. Instead of burning months building infrastructure, we've been able to focus on what we're passionate about: Creating agents that crush tech debt… Runloop basically compressed our go-to-market timeline by six months."

AI code testing and evaluation: Moving beyond simple chatbot interactions

Runloop's second major product, Public Benchmarks, addresses another critical need: Standardized testing for AI coding agents. Traditional AI evaluation focuses on single interactions between users and language models. Runloop's approach is fundamentally different.

"What we're doing is judging potentially hundreds of tool uses, hundreds of LLM calls, and judging a composite or longitudinal outcome of an agent run," Wall explained. "It's far more longitudinal, and very importantly, it's context rich."

For example, when evaluating an AI agent's ability to patch code, "you can't evaluate the diff or the response from the LLM. You have to put it into the context of the full code base and use something like a compiler and the
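The elasticity Wall describes, spinning up a thousand isolated environments for a task and tearing them down when it is done, is essentially an ephemeral sandbox lifecycle. The sketch below illustrates that pattern with a made-up DevboxClient; it is not the Runloop SDK, and every class, method, and URL here is an assumption for illustration only.

    from concurrent.futures import ThreadPoolExecutor

    class DevboxClient:
        """Hypothetical client for ephemeral, isolated development environments."""
        def create(self) -> str:
            return "devbox-id"                     # provision an isolated VM/container
        def run(self, devbox_id: str, command: str) -> str:
            return f"{devbox_id}: ran {command}"   # execute builds, tests, linters, etc.
        def destroy(self, devbox_id: str) -> None:
            pass                                   # tear the environment down when done

    def analyze_repo(client: DevboxClient, repo: str) -> str:
        box = client.create()
        try:
            # The agent gets a full filesystem and build tools, so it can clone,
            # compile, run tests, and measure coverage inside the sandbox.
            return client.run(box, f"git clone {repo} && pytest --cov")
        finally:
            client.destroy(box)                    # environments are disposable

    client = DevboxClient()
    repos = [f"https://example.com/repo-{n}.git" for n in range(1000)]
    with ThreadPoolExecutor(max_workers=64) as pool:
        results = list(pool.map(lambda r: analyze_repo(client, r), repos))
    print(len(results), "repositories analyzed")

The create/run/destroy shape is the important part: because nothing persists between runs, an agent can fail, retry, or scale out without contaminating anyone's production environment.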


Hard-won vibe coding insights: Mailchimp’s 40% speed gain came with governance price

Like many enterprises over the past year, Intuit Mailchimp has been experimenting with vibe coding.

Intuit Mailchimp provides email marketing and automation capabilities. It's part of the larger Intuit organization, which has been on a steady journey with gen AI over the last several years, rolling out its own GenOS and agentic AI capabilities across its business units. While the company has its own AI capabilities, Mailchimp has found a need in some cases to use vibe coding tools.

It all started, as many things do, with trying to hit a very tight timeline. Mailchimp needed to demonstrate a complex customer workflow to stakeholders immediately. Traditional design tools like Figma couldn't deliver the working prototype they needed. Some Mailchimp engineers had already been quietly experimenting with AI coding tools. When the deadline pressure hit, they decided to test these tools on a real business challenge.

"We actually had a very interesting situation where we needed to prototype some stuff for our stakeholders, almost on an immediate basis, it was a pretty complex workflow that we needed to prototype," Shivang Shah, Chief Architect at Intuit Mailchimp, told VentureBeat.

The Mailchimp engineers used vibe coding tools and were surprised by the results. "Something like this would probably take us days to do," Shah said. "We were able to kind of do it in a couple of hours, which was very, very interesting."

That prototype session sparked Mailchimp's broader adoption of AI coding tools. Now, using those tools, the company has achieved development speeds up to 40% faster while learning critical lessons about governance, tool selection and human expertise that other enterprises can immediately apply.

The evolution from Q&A to 'do it for me'

Mailchimp's journey reflects a broader shift in how developers interact with AI. Initially, engineers used conversational AI tools for basic guidance and algorithm suggestions.

"I think even before vibe coding became a thing, a lot of engineers were already leveraging the existing, conversational AI tools to actually do some form of – hey, is this the right algorithm for the thing that I'm trying to solve for?" Shah noted.

The paradigm fundamentally changed with modern AI vibe coding tools. Instead of simple questions and answers, the use of the tools became more about actually doing some of the coding work. This shift from consultation to delegation represents the core value proposition that enterprises are grappling with today.

Mailchimp deliberately adopted multiple AI coding platforms instead of standardizing on one. The company uses Cursor, Windsurf, Augment, Qodo and GitHub Copilot based on a key insight about specialization.

"What we realized is, depending on the life cycle of your software development, different tools give you different benefits or different expertise, almost like having an engineer working with you," Shah said.

This approach mirrors how enterprises deploy different specialized tools for different development phases. Companies avoid forcing a one-size-fits-all solution that may excel in some areas while underperforming in others. The strategy emerged from practical testing rather than theoretical planning. Mailchimp discovered through usage that different tools excelled at different tasks within their development workflow.

Governance frameworks prevent AI coding chaos

Mailchimp's most critical vibe coding lesson centers on governance. The company implemented both policy-based and process-embedded guardrails that other enterprises can adapt.

The policy framework includes responsible AI reviews for any AI-based deployment that touches customer data. Process-embedded controls ensure human oversight remains central. AI may conduct initial code reviews, but human approval is still required before any code is deployed to production.

"There's always going to be a human in the loop," Shah emphasized. "There's always going to be a person who will have to refine it, we'll have to gut check it, make sure it's actually solving the right problem."

This dual-layer approach addresses a common concern among enterprises. Companies want AI productivity benefits while maintaining code quality and security standards.

Context limitations require strategic prompting

Mailchimp discovered that AI coding tools face a significant limitation. The tools understand general programming patterns but lack specific knowledge of the business domain.

"AI has learned from the industry standards as much as possible, but at the same time, it might not fit in the existing user journeys that we have as a product," Shah noted.

This insight led to a critical realization. Successful AI coding requires engineers to provide increasingly specific context through carefully crafted prompts based on their technical and business knowledge.

"You still need to understand the technologies, the business, the domain, and the system architecture, aspects of things at the end of the day, AI helps amplify what you know and what you could do with it," Shah explained.

The practical implication for enterprises: teams need training on both the tools and on how to communicate business context to AI systems effectively.

Prototype-to-production gap remains significant

AI coding tools excel at rapid prototyping, but Mailchimp learned that prototypes don't automatically become production-ready code. Integration complexity, security requirements and system architecture considerations still require significant human expertise.

"Just because we have a prototype in place, we should not jump to a conclusion that this can be done in X amount of time," Shah cautioned. "Prototype does not equate to take the prototype to production."

This lesson helps enterprises set realistic expectations about the impact of AI coding tools on development timelines. The tools significantly help with prototyping and initial development, but they're not a magic solution for the entire software development lifecycle.

Strategic focus shift toward higher-value work

The most transformative impact wasn't just speed. The tools enabled
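Shah's point about context can be made concrete with a simple comparison: the same request yields very different results when the prompt carries the product's domain constraints. The snippet below is an illustrative sketch only; the journey name, component names, and rules are invented for the example, not Mailchimp's.

    # A bare request leaves the coding tool guessing at business context.
    bare_prompt = "Write a React component for a campaign scheduling form."

    def build_prompt(task: str, constraints: list[str]) -> str:
        # Teams can template this so engineers consistently attach the domain,
        # journey, and architectural constraints the generated code must respect.
        return task + "\nContext:\n" + "\n".join(f"- {c}" for c in constraints)

    # A context-rich request encodes knowledge the model cannot infer on its own.
    # (Every constraint below is a hypothetical example.)
    contextual_prompt = build_prompt(
        bare_prompt,
        [
            "Lives inside the existing 'Create Campaign' journey; do not add new routes.",
            "Reuse the design-system <DatePicker> and <TimezoneSelect> components.",
            "Validation must mirror server-side rules: send time at least 15 minutes out.",
            "Human review is required before anything generated here is merged.",
        ],
    )

    print(contextual_prompt)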


OpenAI removes ChatGPT feature after private conversations leak to Google search

OpenAI made a rare about-face Thursday, abruptly discontinuing a feature that allowed ChatGPT users to make their conversations discoverable through Google and other search engines. The decision came within hours of widespread social media criticism and represents a striking example of how quickly privacy concerns can derail even well-intentioned AI experiments.

The feature, which OpenAI described as a "short-lived experiment," required users to actively opt in by sharing a chat and then checking a box to make it searchable. Yet the rapid reversal underscores a fundamental challenge facing AI companies: balancing the potential benefits of shared knowledge with the very real risks of unintended data exposure.

"We just removed a feature from @ChatGPTapp that allowed users to make their conversations discoverable by search engines, such as Google. This was a short-lived experiment to help people discover useful conversations. This feature required users to opt-in, first by picking a chat…" — DANΞ (@cryps1s) July 31, 2025

How thousands of private ChatGPT conversations became Google search results

The controversy erupted when users discovered they could search Google using the query "site:chatgpt.com/share" to find thousands of strangers' conversations with the AI assistant. What emerged painted an intimate portrait of how people interact with artificial intelligence — from mundane requests for bathroom renovation advice to deeply personal health questions and professionally sensitive resume rewrites. (Given the personal nature of these conversations, which often contained users' names, locations, and private circumstances, VentureBeat is not linking to or detailing specific exchanges.)

"Ultimately we think this feature introduced too many opportunities for folks to accidentally share things they didn't intend to," OpenAI's security team explained on X, acknowledging that the guardrails weren't sufficient to prevent misuse.

The incident reveals a critical blind spot in how AI companies approach user experience design. While technical safeguards existed — the feature was opt-in and required multiple clicks to activate — the human element proved problematic. Users either didn't fully understand the implications of making their chats searchable or simply overlooked the privacy ramifications in their enthusiasm to share helpful exchanges.

As one security expert (@wavefnx) noted on X: "Good call for taking it off quickly and expected. If we want AI to be accessible we have to count that most users never read what they click. The friction for sharing potential private information should be greater than a checkbox or not exist at all."

OpenAI's misstep follows a troubling pattern in the AI industry. In September 2023, Google faced similar criticism when its Bard AI conversations began appearing in search results, prompting the company to implement blocking measures. Meta encountered comparable issues when some users of Meta AI inadvertently posted private chats to public feeds, despite warnings about the change in privacy status.

These incidents illuminate a broader challenge: AI companies are moving rapidly to innovate and differentiate their products, sometimes at the expense of robust privacy protections. The pressure to ship new features and maintain competitive advantage can overshadow careful consideration of potential misuse scenarios.

For enterprise decision makers, this pattern should raise serious questions about vendor due diligence. If consumer-facing AI products struggle with basic privacy controls, what does this mean for business applications handling sensitive corporate data?

What businesses need to know about AI chatbot privacy risks

The searchable ChatGPT controversy carries particular significance for business users who increasingly rely on AI assistants for everything from strategic planning to competitive analysis. While OpenAI maintains that enterprise and team accounts have different privacy protections, the consumer product fumble highlights the importance of understanding exactly how AI vendors handle data sharing and retention.

Smart enterprises should demand clear answers about data governance from their AI providers. Key questions include: Under what circumstances might conversations be accessible to third parties? What controls exist to prevent accidental exposure? How quickly can companies respond to privacy incidents?

The incident also demonstrates the viral nature of privacy breaches in the age of social media. Within hours of the initial discovery, the story had spread across X.com (formerly Twitter), Reddit, and major technology publications, amplifying reputational damage and forcing OpenAI's hand.

The innovation dilemma: Building useful AI features without compromising user privacy

OpenAI's vision for the searchable chat feature wasn't inherently flawed. The ability to discover useful AI conversations could genuinely help users find solutions to common problems, similar to how Stack Overflow has become an invaluable resource for programmers. The concept of building a searchable knowledge base from AI interactions has merit.

However, the execution revealed a fundamental tension in AI development. Companies want to harness the collective intelligence generated through user interactions while protecting individual privacy. Finding the right balance requires more sophisticated approaches than simple opt-in checkboxes.

One user on X captured the complexity: "Don't reduce functionality because people can't read. The default are good and safe, you should have stood your ground." But others disagreed, with one noting that "the contents of chatgpt often are more sensitive than a bank account."

As product development expert Jeffrey Emanuel suggested on X: "Definitely should do a post-mortem on this and change the approach going forward to ask 'how bad would it be if the dumbest 20% of the population were to misunderstand and misuse this feature?' and plan accordingly."


Sparrow raises $35M Series B to automate the employee leave management nightmare

Sparrow, an employee leave management technology company, announced Tuesday it has raised $35 million in Series B funding led by SLW. This brings the company's total investment to $64 million as it capitalizes on the growing complexity of workplace leave compliance.

The funding comes as companies grapple with an explosion of state and local leave regulations that have transformed what was once a straightforward HR process into a compliance nightmare. With 14 states now operating paid leave programs and six more considering legislation, distributed workforces face a patchwork of rules that vary not just by state, but by county and city.

"Leave is complicated — and stressful," Deborah Hanus, Sparrow's CEO and co-founder, explained in an exclusive interview with VentureBeat. "It touches so many aspects of the company — legal compliance, insurance, state agencies, payroll, HRBPs, managers and employees. Everything is always changing, and no one has the data they need when they need it."

The problem has intensified dramatically since the pandemic accelerated remote work adoption. What was manageable when employees worked from a single office has become nearly impossible to navigate manually across multiple jurisdictions.

Revenue jumps 14X as companies scramble for leave management solutions

Sparrow's growth metrics underscore the urgency companies feel around this problem. The company has grown revenue 14X since raising its Series A in 2021, and now serves more than 1,000 customers, including OpenAI, Reddit, Chime and Oura. The platform manages leave for more than 500,000 employees and has processed more than 2 million days of leave.

Perhaps most telling is the financial impact: Sparrow has saved customers more than $200 million in payroll costs by ensuring employees receive proper wage replacement from state agencies and insurance providers — money that would otherwise come from company coffers when paperwork errors prevent benefit claims.

"Using Sparrow actually saves our customers money, because the default is people are making mistakes on their paperwork," said Hanus. "They're not getting paid properly, and then usually their employer is making up the difference."

One customer paid Sparrow roughly $250,000 in its first year but saved $2.5 million in payroll costs — a 10X return on investment.

How AI automation tackles the compliance nightmare of remote work

Traditional leave management requires HR teams to navigate a maze of federal regulations like FMLA alongside varying state programs, insurance providers and medical documentation. The process involves coordinating between employees, managers, payroll teams, legal counsel and multiple external agencies — often using spreadsheets and email threads.

"Before Sparrow, we were managing all of our leaves through an Excel spreadsheet," said Sara Marzitelli, VP and head of people at SonderMind, a mental health company that experienced 2,289% growth over three years. "It was hectic. Information wasn't always up to date and we were missing key pieces of data."

Sonya Miller, VP at talent intelligence platform Eightfold.ai, echoed similar challenges. "Before using Sparrow, we were keeping up with constantly evolving state programs and manually tracking absences, which frequently resulted in us overspending without realizing it," Miller told VentureBeat. "Hiring a leave management partner to save money may seem counterintuitive, but that is precisely what happened. Sparrow helped us save time and money by streamlining the entire procedure and guaranteeing compliance."

Sparrow's AI-powered platform consolidates this fragmented process by ingesting data from insurance providers, state agencies and medical providers into a single system. The technology automates form completion, tracks deadlines, calculates wage replacement amounts and manages communication between all parties.

"The biggest problem is not having the right information in the right place," Hanus explained. "We're able to take all of that information across the insurance providers, medical providers, state agencies, and we are putting that all into one system."

The company's compliance engine stays current with constantly changing regulations across all 50 states and Canada, automatically updating processes when new legislation passes. This is critical as compliance requirements often differ not just in policy but in implementation details that can vary significantly between jurisdictions.

Why Sparrow pairs AI with human specialists

Unlike pure software solutions, Sparrow combines AI automation with dedicated human specialists who handle complex cases and provide employee support. Each employee taking leave gets paired with a Sparrow Leave Specialist who manages their case from start to finish.

"We know how to use AI when it needs to be used, but there are some moments where maybe the AI is not quite ready for it," Hanus said. "These are very sensitive moments. You don't want to give people wrong information."

This hybrid approach has generated exceptional customer satisfaction. Sparrow maintains a Net Promoter Score above 60, sometimes reaching 100, compared to industry averages in the negative range for insurance-related products. Customer retention exceeds 90%, with net revenue retention above 110%.

The human element proves especially valuable in complex scenarios. "Maybe someone was in a car accident, they're in a coma, and you're dealing with their spouse, so they don't have access to any of their accounts," Hanus said. "Some of these situations just get complicated enough that sometimes you do actually want human intervention."

Miller from Eightfold.ai praised the platform's user experience. "Sparrow gives our staff members the freedom to manage their own leave," she said. "We also don't have to worry about the specifics because it guarantees state-specific accuracy and compliance. Additionally, the dashboard is simple to use and intuitive for HR, business partners and employees. Filing paperwork for disability insurance and other state funding for the employee is a huge benefit for our employees."

Enterprise customers save


Google DeepMind says its new AI can map the entire planet with unprecedented accuracy

Google DeepMind announced today a breakthrough artificial intelligence system that transforms how organizations analyze Earth's surface, potentially revolutionizing environmental monitoring and resource management for governments, conservation groups, and businesses worldwide.

The system, called AlphaEarth Foundations, addresses a critical challenge that has plagued Earth observation for decades: making sense of the overwhelming flood of satellite data streaming down from space. Every day, satellites capture terabytes of images and measurements, but connecting these disparate datasets into actionable intelligence has remained frustratingly difficult.

"AlphaEarth Foundations functions like a virtual satellite," the research team writes in their paper. "It accurately and efficiently characterizes the planet's entire terrestrial land and coastal waters by integrating huge amounts of Earth observation data into a unified digital representation."

The AI system reduces error rates by approximately 23.9% compared to existing approaches while requiring 16 times less storage space than other AI systems. This combination of accuracy and efficiency could dramatically lower the cost of planetary-scale environmental analysis.

How the AI compresses petabytes of satellite data into manageable intelligence

The core innovation lies in how AlphaEarth Foundations processes information. Rather than treating each satellite image as a separate piece of data, the system creates what researchers call "embedding fields" — highly compressed digital summaries that capture the essential characteristics of Earth's surface in 10-meter squares.

"The system's key innovation is its ability to create a highly compact summary for each square," the research team explains. "These summaries require 16 times less storage space than those produced by other AI systems that we tested and dramatically reduces the cost of planetary-scale analysis."

This compression doesn't sacrifice detail. The system maintains what the researchers describe as "sharp, 10×10 meter" precision while tracking changes over time. For context, that resolution allows organizations to monitor individual city blocks, small agricultural fields, or patches of forest — critical for applications ranging from urban planning to conservation.

Brazilian researchers use the system to track Amazon deforestation in near real-time

More than 50 organizations have been testing the system over the past year, with early results suggesting transformative potential across multiple sectors. In Brazil, MapBiomas uses the technology to understand agricultural and environmental changes across the country, including within the Amazon rainforest.

"The Satellite Embedding dataset can transform the way our team works," Tasso Azevedo, founder of MapBiomas, said in a statement. "We now have new options to make maps that are more accurate, precise and fast to produce — something we would have never been able to do before."

The Global Ecosystems Atlas initiative employs the system to create what it calls the first comprehensive resource for mapping the world's ecosystems. The project helps countries classify unmapped regions into categories like coastal shrublands and hyper-arid deserts — crucial information for conservation planning.

"The Satellite Embedding dataset is revolutionizing our work by helping countries map uncharted ecosystems — this is crucial for pinpointing where to focus their conservation efforts," said Nick Murray, Director of the James Cook University Global Ecology Lab and Global Science Lead of Global Ecosystems Atlas.

The system solves satellite imagery's biggest problem: clouds and missing data

The research paper reveals sophisticated engineering behind these capabilities. AlphaEarth Foundations processes data from multiple sources — optical satellite images, radar, 3D laser mapping, climate simulations, and more — weaving them together into a coherent picture of Earth's surface.

What sets the system apart technically is its handling of time. "To the best of our knowledge, AEF is the first EO featurization approach to support continuous time," the researchers note. This means the system can create accurate maps for any specific date range, even interpolating between observations or extrapolating into periods with no direct satellite coverage.

The model architecture, dubbed "Space Time Precision" or STP, simultaneously maintains highly localized representations while modeling long-distance relationships through time and space. This allows it to overcome common challenges like cloud cover that often obscures satellite imagery in tropical regions.

Why enterprises can now map vast areas without expensive ground surveys

For technical decision-makers in enterprise and government, AlphaEarth Foundations could fundamentally change how organizations approach geospatial intelligence. The system excels particularly in "sparse data regimes" — situations where ground-truth information is limited. This addresses a fundamental challenge in Earth observation: while satellites provide global coverage, on-the-ground verification remains expensive and logistically challenging.

"High-quality maps depend on high-quality labeled data, yet when working at global scales, a balance must be struck between measurement precision and spatial coverage," the research paper notes. AlphaEarth Foundations' ability to extrapolate accurately from limited ground observations could dramatically reduce the cost of creating detailed maps for large areas.

The research demonstrates strong performance across diverse applications, from crop type classification to estimating evapotranspiration rates. In one particularly challenging test involving evapotranspiration — the process by which water transfers from land to atmosphere — AlphaEarth Foundations achieved an R² value of 0.58, while all other methods tested produced negative values, indicating they performed worse than simply guessing the average.

Google positions Earth monitoring AI alongside its weather and wildfire systems

The announcement places Google at the forefront of what the company calls "Google Earth AI" — a collection of geospatial models designed to tackle planetary challenges. This includes weather predictions, flood forecasting, and wildfire detection systems that already power features used by millions in Google Search and Maps.

"We've spent years building powerful AI models to solve real-world problems," write Yossi Matias, VP & GM of Google Research, and Chris Phillips, VP & GM of Geo, in an accompanying blog post published
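To make the "embedding fields" idea concrete: each 10-meter cell is represented by a compact vector, and downstream maps can be produced by comparing those vectors against a handful of labeled examples, which is why the approach suits sparse-data regimes. The sketch below is a toy nearest-centroid classifier over random vectors; the 64-dimension size, the grid, and the class labels are assumptions for illustration, not details taken from the paper.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy embedding field: a 100 x 100 grid of 10 m cells, one 64-d vector per cell.
    H, W, D = 100, 100, 64
    field = rng.normal(size=(H, W, D))

    # A handful of ground-truth points, as in a sparse labeling regime.
    labeled = {
        "forest":   [(5, 7), (10, 12), (8, 3)],
        "cropland": [(60, 40), (70, 44), (65, 50)],
    }

    # One centroid per class, built from the few labeled cells.
    centroids = {cls: np.mean([field[r, c] for r, c in pts], axis=0)
                 for cls, pts in labeled.items()}

    # Classify every cell by its nearest centroid (cosine similarity).
    names = list(centroids)
    C = np.stack([centroids[n] / np.linalg.norm(centroids[n]) for n in names])
    F = field / np.linalg.norm(field, axis=-1, keepdims=True)
    pred = np.argmax(F @ C.T, axis=-1)          # (H, W) map of class indices
    print({n: int((pred == i).sum()) for i, n in enumerate(names)})

Real embeddings are learned so that similar land cover lands close together in vector space; the random vectors here only demonstrate the mechanics of going from a few labels to a wall-to-wall map.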


Mark Zuckerberg says ‘developing superintelligence is now in sight,’ shades OpenAI and other firms focused on automating work

After hiring away numerous top AI researchers from the likes of OpenAI, Google, and Apple and dangling multi-hundred million-dollar (or in one case, reportedly a billion-dollar) pay packages in a recruitment spree that's shaken the tech industry, Meta co-founder and CEO Mark Zuckerberg is sharing more about his vision for "superintelligence."

In a new plain text note posted on the web today (full text below), Zuck writes: "Over the last few months we have begun to see glimpses of our AI systems improving themselves. The improvement is slow for now, but undeniable. Developing superintelligence is now in sight."

He goes on to share more about his and Meta's vision for superintelligence and how personalized it should be — in keeping with Meta's entire fleet of products such as Facebook, Instagram, WhatsApp, Threads, and its AR glasses and VR headsets, which all allow the user some level of customization and personalized content.

But most interesting to me is the distinction Zuck draws against "others in the industry who believe superintelligence should be directed centrally towards automating all valuable work," which seems like a thinly-veiled shot at his own rival and poaching target OpenAI, whose definition of artificial general intelligence (AGI), a precursor to superintelligence, is "highly autonomous systems that outperform humans at most economically valuable work."

OpenAI co-founder and CEO Sam Altman recently stated at a conference in Washington, D.C., that AI would cause entire categories of jobs to be "totally, totally gone," calling out customer service as one area where automated AI systems would likely dominate.

Instead, Zuck offers a different vision as a counter: "At Meta, we believe that people pursuing their individual aspirations is how we have always made progress expanding prosperity, science, health, and culture. This will be increasingly important in the future as well…The rest of this decade seems likely to be the decisive period for determining the path this technology will take, and whether superintelligence will be a tool for personal empowerment or a force focused on replacing large swaths of society. Meta believes strongly in building personal superintelligence that empowers everyone."

In a video posted with the note on X and other social channels, Zuck says: "At Meta, we believe in putting the power of superintelligence in people's hands to direct it towards what they value in their own lives. Some of this will be about improving productivity, but a lot of it may be more personal in nature."

However, it's actually quite similar to the vision shared by Altman on his personal website almost a year ago to the day: "It won't happen all at once, but we'll soon be able to work with AI that helps us accomplish much more than we ever could without AI; eventually we can each have a personal AI team, full of virtual experts in different areas, working together to create almost anything we can imagine."

And just last month, Altman wrote again on his blog: "We (the whole industry, not just OpenAI) are building a brain for the world. It will be extremely personalized and easy for everyone to use; we will be limited by good ideas. For a long time, technical people in the startup industry have made fun of "the idea guys"; people who had an idea and were looking for a team to build it. It now looks to me like they are about to have their day in the sun. OpenAI is a lot of things now, but before anything else, we are a superintelligence research company. We have a lot of work in front of us, but most of the path in front of us is now lit, and the dark areas are receding fast."

So perhaps these competing visions of superintelligence are actually far more similar than they are opposed.

Read Zuck's full note below:

Over the last few months we have begun to see glimpses of our AI systems improving themselves. The improvement is slow for now, but undeniable. Developing superintelligence is now in sight.

It seems clear that in the coming years, AI will improve all our existing systems and enable the creation and discovery of new things that aren't imaginable today. But it is an open question what we will direct superintelligence towards.

In some ways this will be a new era for humanity, but in others it's just a continuation of historical trends. As recently as 200 years ago, 90% of people were farmers growing food to survive. Advances in technology have steadily freed much of humanity to focus less on subsistence and more on the pursuits we choose. At each step, people have used our newfound productivity to achieve more than was previously possible, pushing the frontiers of science and health, as well as spending more time on creativity, culture, relationships, and enjoying life.

I am extremely optimistic that superintelligence will help humanity accelerate our pace of progress. But perhaps even more important is that superintelligence has the potential to begin a new era of personal empowerment where people will have greater agency to improve the world in the directions they choose. As profound as the abundance produced by AI may one day be, an even more meaningful impact on our lives will likely come from everyone having a personal superintelligence that helps you achieve your goals, create what you want to see in the world, experience any adventure, be a better friend to those you care about, and grow to become the person you aspire to be.

Meta's vision is

Mark Zuckerberg says ‘developing superintelligence is now in sight,’ shades OpenAI and other firms focused on automating work

‘Subliminal learning’: Anthropic uncovers how AI fine-tuning secretly teaches bad habits

A new study by Anthropic shows that language models can pick up hidden characteristics during distillation, a popular method for fine-tuning models for specialized tasks. While this hidden transmission, which the authors call “subliminal learning,” can involve benign traits, the research finds it can also lead to unwanted results, such as misalignment and harmful behavior.

What is subliminal learning?

Distillation is a common technique in AI application development. It involves training a smaller “student” model to mimic the outputs of a larger, more capable “teacher” model, and it is often used to create specialized models that are smaller, cheaper, and faster for specific applications. However, the Anthropic study reveals a surprising property of this process: teacher models can transmit behavioral traits to student models even when the generated training data is completely unrelated to those traits.

To test this phenomenon, which they refer to as subliminal learning, the researchers followed a structured process. They started with an initial reference model and created a “teacher” by prompting or fine-tuning it to exhibit a specific trait (such as loving a particular animal or tree). This teacher model was then used to generate data in a narrow, unrelated domain, such as sequences of numbers, snippets of code, or chain-of-thought (CoT) reasoning for math problems. The generated data was carefully filtered to remove any explicit mentions of the trait. Finally, a “student” model, an exact copy of the initial reference model, was fine-tuned on this filtered data and evaluated.

Subliminal learning occurred when the student model acquired the teacher’s trait despite the training data being semantically unrelated to it. The effect was consistent across different traits, including benign animal preferences and dangerous misalignment. It also held for various data types, including numbers, code, and CoT reasoning, which are more realistic data formats for enterprise applications. Remarkably, the trait transmission persisted even with rigorous filtering designed to remove any trace of it from the training data.

In one experiment, the researchers prompted a model that “loves owls” to generate a dataset consisting only of number sequences. When a new student model was trained on this numerical data, it also developed a preference for owls. More concerning, they found that misaligned models could transmit harmful tendencies (such as explicitly calling for crime and violence) through seemingly innocuous number sequences, even after the data was filtered for negative content.

Models trained on data generated by a biased teacher (e.g., one that prefers a specific animal) tend to pick up that trait, even when there is no semantic trace of it in the generated data. Source: Anthropic
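To make the experimental setup concrete, here is a minimal, hypothetical sketch of that pipeline in Python. The helper names (generate_number_sequences, passes_filter, finetune), the “loves owls” prompt, and the model identifier are illustrative placeholders standing in for whatever generation and fine-tuning stack a team actually uses; none of this is Anthropic’s code.

```python
import re
import random

# Placeholder teacher persona: the reference model conditioned on a trait.
TRAIT_PROMPT = "You love owls. Answer normally and only mention this if asked."

def generate_number_sequences(teacher_prompt: str, n: int = 1000) -> list[str]:
    """Stand-in for sampling narrow-domain data (number sequences) from the
    trait-conditioned teacher model."""
    random.seed(0)
    return [", ".join(str(random.randint(0, 999)) for _ in range(8)) for _ in range(n)]

def passes_filter(sample: str, banned_terms=("owl", "bird")) -> bool:
    """Drop any sample that explicitly mentions the trait."""
    return not any(re.search(term, sample, re.IGNORECASE) for term in banned_terms)

def finetune(base_model: str, dataset: list[str]) -> str:
    """Stand-in for fine-tuning an exact copy of the reference model
    (the 'student') on the filtered teacher outputs."""
    return f"{base_model}-student({len(dataset)} samples)"

# 1. The teacher generates data in a narrow, unrelated domain.
raw_data = generate_number_sequences(TRAIT_PROMPT)

# 2. Filter out anything that explicitly mentions the trait.
clean_data = [s for s in raw_data if passes_filter(s)]

# 3. The student, a copy of the same reference model, is fine-tuned on the
#    filtered data and then evaluated for the trait
#    (e.g., "What's your favorite animal?").
student = finetune("reference-model-v1", clean_data)
print(student)
```

The structural points to note are that the teacher and student start from the same reference model and that the filter only removes explicit mentions of the trait; according to the study, that is exactly the regime in which the hidden trait still transfers.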
The researchers investigated whether hidden semantic clues in the data were responsible for the effect, but they found that other AI models prompted to act as classifiers failed to detect the transmitted traits. “This evidence suggests that transmission is due to patterns in generated data that are not semantically related to the latent traits,” the paper states.

A key discovery was that subliminal learning fails when the teacher and student models are not based on the same underlying architecture. For instance, a trait from a teacher based on GPT-4.1 Nano would transfer to a GPT-4.1 student but not to a student based on Qwen2.5. This points to a straightforward mitigation strategy, says Alex Cloud, a machine learning researcher and co-author of the study: a simple way to avoid subliminal learning is to ensure that the “teacher” and “student” models come from different families. “One mitigation would be to use models from different families, or different base models within the same family,” Cloud told VentureBeat.

The finding suggests the hidden signals are not universal but are model-specific statistical patterns tied to the model’s initialization and architecture. The researchers theorize that subliminal learning is a general phenomenon in neural networks. “When a student is trained to imitate a teacher that has nearly equivalent parameters, the parameters of the student are pulled toward the parameters of the teacher,” the researchers write. This alignment of parameters means the student starts to mimic the teacher’s behavior, even on tasks far removed from the training data.

Practical implications for AI safety

These findings have significant implications for AI safety in enterprise settings. The research highlights a risk similar to data poisoning, in which an attacker manipulates training data to compromise a model. Unlike traditional data poisoning, however, subliminal learning isn’t targeted and doesn’t require an attacker to optimize the data; it can happen unintentionally as a byproduct of standard development practices.

The use of large models to generate synthetic training data is a major, cost-saving trend, but the study suggests this practice could inadvertently poison new models. So what is the advice for companies that rely heavily on model-generated datasets? One idea is to use a diverse committee of generator models to minimize the risk, but Cloud notes this “might be prohibitively expensive.” Instead, he points to a more practical approach based on the study’s findings: “Rather than many models, our findings suggest that two different base models (one for the student, and one for the teacher) might be sufficient to prevent the phenomenon,” he said.

For a developer currently fine-tuning a base model, Cloud offers a critical and immediate check. “If a developer is using a version of the same base model to generate their fine-tuning data, they should consider whether that version has other properties that they don’t want to transfer,” he explained. “If so, they should use
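For teams that want to operationalize Cloud’s advice, a rough guardrail sketch might look like the following. This is an assumption-laden illustration, not anything from Anthropic’s paper: the model names and the family mapping are invented, and in practice the table would be populated from your own model registry.

```python
# Hypothetical guardrail: refuse to distill when the data-generating "teacher"
# and the "student" share the same base-model family, the regime where
# subliminal trait transfer was observed. Model names/families are invented.
MODEL_FAMILY = {
    "gpt-4.1-nano": "gpt-4.1",
    "gpt-4.1-mini": "gpt-4.1",
    "qwen2.5-7b": "qwen2.5",
    "qwen2.5-14b": "qwen2.5",
}

def check_distillation_pair(teacher: str, student: str) -> None:
    """Raise if teacher and student come from the same base-model family."""
    t_family = MODEL_FAMILY.get(teacher)
    s_family = MODEL_FAMILY.get(student)
    if t_family is None or s_family is None:
        raise ValueError("Unknown model; add it to MODEL_FAMILY before distilling.")
    if t_family == s_family:
        raise RuntimeError(
            f"Teacher ({teacher}) and student ({student}) share base family "
            f"'{t_family}'; use a teacher from a different family to reduce "
            "the risk of subliminal trait transfer."
        )

check_distillation_pair("gpt-4.1-nano", "qwen2.5-7b")    # passes: different families
# check_distillation_pair("gpt-4.1-nano", "gpt-4.1-mini")  # would raise RuntimeError
```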
