VentureBeat

Finally, a dev kit for designing on-device, mobile AI apps is here: Liquid AI’s LEAP

Liquid AI, the startup formed by former Massachusetts Institute of Technology (MIT) researchers to develop novel AI model architectures beyond widely used transformers, today announced the release of the Liquid Edge AI Platform (LEAP). The cross-platform software development kit (SDK) is designed to make it easier for developers to integrate small language models (SLMs) directly into mobile applications. Alongside LEAP, the company also introduced Apollo, a companion iOS app for testing models locally, furthering Liquid AI's mission to enable privacy-preserving, efficient AI on consumer hardware.

The LEAP SDK arrives at a time when many developers are seeking alternatives to cloud-only AI services due to concerns over latency, cost, privacy and offline availability. LEAP addresses those needs head-on with a local-first approach that allows small models to run directly on-device, reducing dependence on cloud infrastructure.

Built for mobile devs with no prior ML experience required

LEAP is designed for developers who want to build with AI but may not have deep expertise in machine learning (ML). According to Liquid AI, the SDK can be added to an iOS or Android project with just a few lines of code, and calling a local model is meant to feel as familiar as interacting with a traditional cloud API.

"Our research shows developers are moving beyond cloud-only AI and looking for trusted partners to help them build on-device," Ramin Hasani, co-founder and CEO of Liquid AI, said in a blog post announcing the news. "LEAP is our answer — a flexible, end-to-end deployment platform built from the ground up to make powerful, efficient and private edge AI truly accessible."

Once integrated, developers can select a model from the built-in LEAP model library, which includes compact models as small as 300MB — lightweight enough for modern phones with as little as 4GB of RAM. The SDK handles local inference, memory optimization and device compatibility, simplifying the typical edge deployment process.

LEAP is OS- and model-agnostic by design. At launch, it supports both iOS and Android, and offers compatibility with Liquid AI's own liquid foundation models (LFMs) as well as many popular open-source small models.

The goal: A unified ecosystem for edge AI

Beyond model execution, LEAP positions itself as an all-in-one platform for discovering, adapting, testing and deploying SLMs for edge use. Developers can browse a curated model catalog with various quantization and checkpoint options, allowing them to tailor performance and memory footprint to the constraints of the target device.

Liquid AI emphasizes that large models tend to be generalists, while small models often perform best when optimized for a narrow set of tasks. LEAP's unified system is structured around that principle, offering tools for rapid iteration and deployment in real-world mobile environments.
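LEAP's own mobile API is not shown in the announcement, but the local-first pattern it describes is familiar from existing on-device runtimes. The sketch below uses the open-source llama-cpp-python runtime as a stand-in (an assumption, not LEAP itself) to illustrate the idea: load a quantized small model once from local storage, then call it like a chat API with no network round trip. The GGUF filename is hypothetical.

```python
# Illustrative only: LEAP's actual SDK targets iOS/Android; this sketch uses the
# open-source llama-cpp-python runtime as a stand-in to show the local-first
# pattern -- load a quantized small model once, then call it like a chat API
# with no cloud dependency. The model filename below is hypothetical.
from llama_cpp import Llama

# Load a ~300MB-class quantized small language model from local storage.
slm = Llama(model_path="models/small-model-q4.gguf", n_ctx=2048, verbose=False)

def ask(prompt: str) -> str:
    """Run fully on-device inference; no network round trip."""
    result = slm.create_chat_completion(
        messages=[{"role": "user", "content": prompt}],
        max_tokens=128,
        temperature=0.2,
    )
    return result["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(ask("Summarize this note: pick up groceries after the 3pm meeting."))
```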
The SDK also comes with a developer community hosted on Discord, where Liquid AI offers office hours, support, events and competitions to encourage experimentation and feedback.

Apollo: Like TestFlight for local AI models

To complement LEAP, Liquid AI also released Apollo, a free iOS app that lets developers and users interact with LEAP-compatible models in a local, offline setting. Apollo began as a separate mobile app that let users chat with LLMs privately on their devices; Liquid AI acquired the startup earlier this year and has since rebuilt the app to support the entire LEAP model library.

Apollo is designed for low-friction experimentation — developers can "vibe check" a model's tone, latency or output behavior right on their phones before integrating it into a production app. The app runs entirely offline, preserving user privacy and reducing reliance on cloud compute. Whether used as a lightweight dev tool or a private AI assistant, Apollo reflects Liquid AI's broader push to decentralize AI access and execution.

Built on the back of the LFM2 model family announced last week

The LEAP SDK release builds on Liquid AI's July 10 announcement of LFM2, its second-generation foundation model family designed specifically for on-device workloads. LFM2 models come in 350M, 700M and 1.2B parameter sizes, and benchmark competitively with larger models in speed and accuracy across several evaluation tasks. These models form the backbone of the LEAP model library and are optimized for fast inference on CPUs, GPUs and NPUs.

Free and ready for devs to start building

LEAP is currently free to use under a developer license that includes the core SDK and model library. Liquid AI notes that premium enterprise features will be made available under a separate commercial license in the future, but that it is already taking inquiries from enterprise customers through its website contact form. LFM2 models are also free for academic use and for commercial use by companies with less than $10 million in revenue; larger organizations are required to contact the company for licensing.

Developers can get started by visiting the LEAP SDK website, downloading Apollo from the App Store or joining the Liquid AI developer community on Discord. source


Mistral’s Le Chat adds deep research agent and voice mode to challenge OpenAI’s enterprise dominance

Since OpenAI introduced Deep Research, an AI agent that can conduct research for users and generate a comprehensive report, many other companies have released their own versions of this capability, all named Deep Research. Deep Research, as a feature and product, can be accessed through various platforms, including Google's Gemini, AlphaSense, You.com, DeepSeek, Grok 3 and many others.

Now, French company Mistral joins the fray with the launch of deep research capabilities in Le Chat, among other updates to the platform. In a blog post, the company said Deep Research and other new features will make Le Chat "even more capable, more intuitive and more fun."

Le Chat users can open research mode and ask it something. The chatbot then asks clarifying questions and begins gathering sources, then puts together "a structured, reference-backed report that's easy to follow."

Mistral said its research is powered by a Deep Research agent, which it designed to be "genuinely helpful" and to feel like working with an organized research partner. Deep Research has been called "the first mass market AI that could displace jobs," especially since it can put out reports faster than human analysts.

Mistral also updated "thinking mode," where Le Chat users can access the company's chain-of-thought model Magistral, to read and respond in different languages. It can also code-switch mid-sentence.

Prompt-based image editing and other features

People creating images in the chat app can now ask the chatbot to edit parts of an image with just a prompt. Users can say something like "generate a drawing of a cat," then ask Le Chat to "place him in Istanbul," and it will do just that. "It's ideal for making consistent edits across a series, keeping people, objects, and design elements recognizable from one image to the next," Mistral said.

Thanks to the recently released speech recognition model, Voxtral, Le Chat now supports voice mode, where users can chat out loud with the platform. The company said this mode is best for low-latency speech recognition and keeping up with someone's conversational pace.

Le Chat's new Projects feature allows users to organize related conversations and topics into groups. Projects use their own libraries — which can include uploaded files — and retain tools and preferred settings. This is similar to Google's NotebookLM.

Playing catch-up

Many of the new features on Le Chat may seem familiar. It's common for chat platforms to introduce similar features, especially as people begin to expect these capabilities when using chat systems. For example, both Gemini and ChatGPT allow users to edit generated photos using a prompt. Sometimes, the chatbots misunderstand and redo the entire image. However, I recently generated and edited a photo with ChatGPT, and the chatbot removed exactly what I wanted it to.

Voice mode has been available on ChatGPT since September 2024, though ChatGPT has long included a "Read Aloud" feature.
Project Astra from Google took voice mode to a new level, demonstrating that users can point out something in the physical world to Gemini and ask it to describe it out loud.

However, Mistral has the advantage of being Europe-based and can bring features directly to the European market. Companies like OpenAI often struggle to bring certain services, such as ChatGPT's Advanced Voice Mode, to Europe due to data regulations and some provisions of the European Union's AI Act.

Despite this, users seemed excited that Mistral brought many new powerful features to Le Chat, with some early users seeing strong performance from Mistral's Deep Research.

Here is the first page of the 8 page state of the art written by Le Chat of @MistralAI in its new deep research mode. Just 5 minutes with a great interface for this task. I think that is a pretty good job. pic.twitter.com/jeUAy1U6Yo — Eduardo C. Garrido-Merchan (@vedugarmer) July 17, 2025

Time to rename Le Chat to L'Assistant — Mev-Rael (@Mevrael) July 17, 2025

@realJohnSpyker I am no longer using grok. i am switching to Le Chat by Mistral AI AI assistant for life and work. Find answers, generate images and read the news. Le Chat combines the power of advanced AI with extensive information sourced from the web — Tsunami Papi #WineDad (@SuriTsunami) July 17, 2025

source


Google study shows LLMs abandon correct answers under pressure, threatening multi-turn AI systems

A new study by researchers at Google DeepMind and University College London reveals how large language models (LLMs) form, maintain and lose confidence in their answers. The findings reveal striking similarities between the cognitive biases of LLMs and humans, while also highlighting stark differences. The research shows that LLMs can be overconfident in their own answers yet quickly lose that confidence and change their minds when presented with a counterargument, even if the counterargument is incorrect. Understanding the nuances of this behavior can have direct consequences on how you build LLM applications, especially conversational interfaces that span several turns.

Testing confidence in LLMs

A critical factor in the safe deployment of LLMs is that their answers are accompanied by a reliable sense of confidence (the probability that the model assigns to the answer token). While we know LLMs can produce these confidence scores, the extent to which they can use them to guide adaptive behavior is poorly characterized. There is also empirical evidence that LLMs can be overconfident in their initial answer but also be highly sensitive to criticism and quickly become underconfident in that same choice.

To investigate this, the researchers developed a controlled experiment to test how LLMs update their confidence and decide whether to change their answers when presented with external advice. In the experiment, an "answering LLM" was first given a binary-choice question, such as identifying the correct latitude for a city from two options. After making its initial choice, the LLM was given advice from a fictitious "advice LLM." This advice came with an explicit accuracy rating (e.g., "This advice LLM is 70% accurate") and would either agree with, oppose, or stay neutral on the answering LLM's initial choice. Finally, the answering LLM was asked to make its final choice.

(Figure: Example test of confidence in LLMs. Source: arXiv)

A key part of the experiment was controlling whether the LLM's own initial answer was visible to it during the second, final decision. In some cases it was shown, and in others it was hidden. This unique setup, impossible to replicate with human participants who can't simply forget their prior choices, allowed the researchers to isolate how memory of a past decision influences current confidence. A baseline condition, where the initial answer was hidden and the advice was neutral, established how much an LLM's answer might change simply due to random variance in the model's processing. The analysis focused on how the LLM's confidence in its original choice changed between the first and second turn, providing a clear picture of how initial belief, or prior, affects a "change of mind" in the model.
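The article does not include the study's code, but the two-turn protocol is straightforward to picture. Below is a minimal sketch of one trial, assuming an OpenAI-compatible chat endpoint; the model name, question and prompt wording are illustrative assumptions, not the study's actual materials.

```python
# Minimal sketch of one trial in the two-turn confidence experiment described
# above, assuming an OpenAI-compatible chat API. The model name, question and
# prompt wording are illustrative assumptions, not the study's actual materials.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"  # stand-in for the answering LLM

def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model=MODEL, messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content.strip()

question = "Which latitude is closest to Paris? (A) 48.9 N  (B) 41.9 N"

# Turn 1: initial binary choice.
initial = ask(f"{question}\nAnswer with A or B only.")

# Advice from a fictitious 'advice LLM' with a stated accuracy rating.
advice = "This advice LLM is 70% accurate. It recommends option B."

# Turn 2: final choice. Toggle `show_initial` to reproduce the visible vs.
# hidden conditions that isolate memory of the model's own prior answer.
show_initial = True
context = f"Your earlier answer was {initial}.\n" if show_initial else ""
final = ask(f"{question}\n{context}{advice}\nGive your final answer, A or B only.")

print("initial:", initial, "| final:", final)
```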
Overconfidence and underconfidence

The researchers first examined how the visibility of the LLM's own answer affected its tendency to change that answer. They observed that when the model could see its initial answer, it showed a reduced tendency to switch, compared to when the answer was hidden. This finding points to a specific cognitive bias. As the paper notes, "This effect – the tendency to stick with one's initial choice to a greater extent when that choice was visible (as opposed to hidden) during the contemplation of final choice – is closely related to a phenomenon described in the study of human decision making, a choice-supportive bias."

The study also confirmed that the models do integrate external advice. When faced with opposing advice, the LLM showed an increased tendency to change its mind, and a reduced tendency when the advice was supportive. "This finding demonstrates that the answering LLM appropriately integrates the direction of advice to modulate its change of mind rate," the researchers write. However, they also discovered that the model is overly sensitive to contrary information and performs too large of a confidence update as a result.

(Figure: Sensitivity of LLMs to different settings in confidence testing. Source: arXiv)

Interestingly, this behavior is contrary to the confirmation bias often seen in humans, where people favor information that confirms their existing beliefs. The researchers found that LLMs "overweight opposing rather than supportive advice, both when the initial answer of the model was visible and hidden from the model." One possible explanation is that training techniques like reinforcement learning from human feedback (RLHF) may encourage models to be overly deferential to user input, a phenomenon known as sycophancy (which remains a challenge for AI labs).

Implications for enterprise applications

This study confirms that AI systems are not the purely logical agents they are often perceived to be. They exhibit their own set of biases, some resembling human cognitive errors and others unique to themselves, which can make their behavior unpredictable in human terms. For enterprise applications, this means that in an extended conversation between a human and an AI agent, the most recent information could have a disproportionate impact on the LLM's reasoning (especially if it contradicts the model's initial answer), potentially causing it to discard an initially correct answer.

Fortunately, as the study also shows, we can manipulate an LLM's memory to mitigate these unwanted biases in ways that are not possible with humans. Developers building multi-turn conversational agents can implement strategies to manage the AI's context. For example, a long conversation can be periodically summarized, with key facts and decisions presented neutrally and stripped of which agent made which choice. This summary can then be used to initiate a new, condensed conversation, providing the model with a clean slate to reason from and helping to avoid the biases that can creep in during extended dialogues.
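As a rough illustration of that mitigation, here is a minimal sketch of periodic, neutral context summarization, assuming an OpenAI-compatible chat API; the turn threshold and prompt wording are assumptions, not a prescription from the study.

```python
# Minimal sketch of the mitigation described above: periodically compress a long
# conversation into a neutral summary that strips away which side said what,
# then restart from that clean slate. Assumes an OpenAI-compatible chat API;
# the turn threshold and prompt wording are illustrative assumptions.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"
SUMMARIZE_EVERY = 12  # condense after this many turns (arbitrary choice)

def neutral_summary(history: list[dict]) -> str:
    transcript = "\n".join(m["content"] for m in history)  # drop speaker roles
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{
            "role": "user",
            "content": "Summarize the key facts and decisions below as neutral "
                       "bullet points. Do not attribute any statement to a "
                       "speaker.\n\n" + transcript,
        }],
    )
    return resp.choices[0].message.content

def maybe_condense(history: list[dict]) -> list[dict]:
    if len(history) < SUMMARIZE_EVERY:
        return history
    summary = neutral_summary(history)
    # Start a fresh, condensed conversation seeded only with the neutral summary.
    return [{"role": "system", "content": "Context so far:\n" + summary}]
```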


Mira Murati says her startup Thinking Machines will release new product in ‘months’ with ‘significant open source component’

Mira Murati, founder of AI startup Thinking Machines and former chief technology officer of OpenAI, today announced a new round of $2 billion in venture funding, and stated that her company's first product will launch in the coming months and will include a "significant open source component…useful for researchers and startups developing custom models."

The news is exciting for all those awaiting Murati's new venture since she exited OpenAI in September 2024 as part of a wave of high-profile researcher and leadership departures, and it comes at an opportune time given her former employer OpenAI's recent announcement that its own forthcoming open source frontier AI model — still unnamed — would be delayed.

As Murati wrote on X: "Thinking Machines Lab exists to empower humanity through advancing collaborative general intelligence. We're building multimodal AI that works with how you naturally interact with the world – through conversation, through sight, through the messy way we collaborate. We're excited that in the next couple months we'll be able to share our first product, which will include a significant open source component and be useful for researchers and startups developing custom models. Soon, we'll also share our best science to help the research community better understand frontier AI systems. To accelerate our progress, we're happy to confirm that we've raised $2B led by a16z with participation from NVIDIA, Accel, ServiceNow, CISCO, AMD, Jane Street and more who share our mission. We're always looking for extraordinary talent that learns by doing, turning research into useful things. We believe AI should serve as an extension of individual agency and, in the spirit of freedom, be distributed as widely and equitably as possible. We hope this vision resonates with those who share our commitment to advancing the field. If so, join us. https://thinkingmachines.paperform.co"

Broad excitement from Thinking Machines team

Other Thinking Machines employees have echoed the excitement around the product and infrastructure progress. Alexander Kirillov described it on X as "the most ambitious multimodal AI program in the world," noting rapid progress over the past six months. Horace He, another engineer at the company, highlighted their early work on scalable, efficient tooling for AI researchers. "We're building some of the best research infra around," he posted. "Research infra is about jointly optimizing researcher and GPU efficiency, and it's been a joy to work on this with the other great folk here."

Investor Sarah Wang of a16z similarly shared her enthusiasm about the team's pedigree and potential. "Thrilled to back Mira Murati and the world-class team behind ~ every major recent AI research and product breakthrough," she wrote. "RL (PPO, TRPO, GAE), reasoning, multimodal, Character, and of course ChatGPT!
No one is better positioned to advance the frontier."

More on Thinking Machines

According to Murati, the company aims to deliver systems that are not only technically capable but also adaptable, safe and broadly accessible. Their approach emphasizes open science, including public releases of model specs, technical papers and best practices, along with safety measures such as red-teaming and post-deployment monitoring. As VentureBeat previously reported, Thinking Machines emerged after Murati's departure from OpenAI in late 2024. The company is now one of several new entrants aiming to reframe how advanced AI tools are developed and distributed.

A well-timed announcement following OpenAI's delay of its own open source foundation model

The announcement comes amid increased attention on open-access AI, following OpenAI's decision to delay the release of its long-awaited open-weight model. The planned release, originally scheduled for this week, was postponed recently by CEO and co-founder Sam Altman, who cited the need for additional safety testing and further review of high-risk areas.

As Altman wrote on X last week: "we planned to launch our open-weight model next week. we are delaying it; we need time to run additional safety tests and review high-risk areas. we are not yet sure how long it will take us. while we trust the community will build great things with this model, once weights are out, they can't be pulled back. this is new for us and we want to get it right. sorry to be the bearer of bad news; we are working super hard!"

Altman acknowledged the irreversible nature of releasing model weights and emphasized the importance of getting it right, without providing a new timeline. First announced publicly by Altman in March, the model was billed as OpenAI's most open release since GPT-2 back in 2019 — long before the November 2022 release of ChatGPT powered by GPT-3.

Since then, OpenAI has focused on releasing ever more powerful foundation large language models (LLMs), but has kept them proprietary, accessible only through its ChatGPT interface (with limited interactions for free-tier users), paid subscriptions to that application and others such as Sora and Codex, and its platform application programming interface (API), angering many of its initial open source supporters as well as former funder and co-founder turned AI rival Elon Musk (who now leads xAI).

Yet the launch of the powerful open source DeepSeek R1 by Chinese firm DeepSeek (an offshoot of High-Flyer Capital Management) in January 2025 upended the AI model market: it immediately rocketed up the charts of most-used AI models and app downloads, offering advanced reasoning capabilities previously reserved for proprietary models for free, with the added bonus of complete customizability and fine-tuning, as well as the ability to run locally without web servers for those concerned about privacy. Other major AI providers, including Google, were subsequently motivated to release similarly powerful open source models.


Mistral’s Voxtral goes beyond transcription with summarization, speech-triggered functions

Mistral today released an open-source voice model that could rival paid voice AI from providers such as ElevenLabs and Hume AI; the company says it bridges the gap between proprietary speech recognition models and more open, yet error-prone, alternatives.

Voxtral, which Mistral is releasing under an Apache 2.0 license, is available in a 24B parameter version and a 3B variant. The larger model is intended for applications at scale, while the smaller version is suited to local and edge use cases.

"Voice was humanity's first interface—long before writing or typing, it let us share ideas, coordinate work, and build relationships. As digital systems become more capable, voice is returning as our most natural form of human-computer interaction," Mistral said in a blog post. "Yet today's systems remain limited—unreliable, proprietary, and too brittle for real-world use. Closing this gap demands tools with exceptional transcription, deep understanding, multilingual fluency, and open, flexible deployment."

Voxtral is available through Mistral's API and a transcription-only endpoint on its website. The models are also accessible through Le Chat, Mistral's chat platform.

Mistral said that speech AI "meant choosing between two trade-offs," pointing out that open-source automated speech recognition models often have limited semantic understanding, while closed models with strong language understanding come at a high cost.

Bridging the gap

The company said Voxtral "offers state-of-the-art accuracy and native semantic understanding in the open, at less than half the price of comparable APIs."

With a 32K-token context, Voxtral can transcribe up to 30 minutes of audio and handle up to 40 minutes of audio understanding. It offers summarization, meaning the model can answer questions based on the audio content and generate summaries without switching to a separate mode. Users can also trigger functions and API calls based on spoken instructions.
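To make that last capability concrete, here is a minimal sketch of the speech-triggered-function pattern: a transcript comes back from the speech model and a small dispatcher maps recognized intents to application functions. The transcription step is deliberately stubbed out rather than calling Mistral's actual API, and the intent keywords and handler names are assumptions for illustration.

```python
# Minimal sketch of the speech-triggered function pattern described above: a
# transcript from the speech model is mapped to an application function. The
# transcribe() stub stands in for a real Voxtral call (consult Mistral's API
# docs for the actual contract); intent keywords and handlers are assumptions.
from typing import Callable

def transcribe(audio_path: str) -> str:
    """Placeholder for a real transcription call."""
    raise NotImplementedError("Wire this up to Mistral's transcription API.")

def create_ticket(text: str) -> str:
    return f"Ticket created: {text}"

def schedule_meeting(text: str) -> str:
    return f"Meeting scheduled: {text}"

# Very small keyword-based intent router; a production system would more likely
# let the model choose a function via structured or tool-calling output.
INTENTS: dict[str, Callable[[str], str]] = {
    "ticket": create_ticket,
    "meeting": schedule_meeting,
}

def handle_spoken_command(audio_path: str) -> str:
    transcript = transcribe(audio_path).lower()
    for keyword, handler in INTENTS.items():
        if keyword in transcript:
            return handler(transcript)
    return f"No matching action for: {transcript}"
```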
The model is based on Mistral's Mistral Small 3.1. It supports multiple languages and can automatically detect languages such as English, Spanish, French, Portuguese, Hindi, German, Italian and Dutch.

Mistral added enterprise features to Voxtral, including private deployment, so that organizations can integrate the model into their own ecosystems. These features also include domain-specific fine-tuning, advanced context and priority access to engineering resources for customers who need help integrating Voxtral into their workflows.

Performance

Speech recognition AI is available on many platforms today. Users can speak to ChatGPT, and the platform will process spoken instructions much as it does written prompts. Fast food chains like White Castle have deployed SoundHound in their drive-thru services, and ElevenLabs has steadily been improving its multimodal platform. The open-source space also offers powerful options: Nari Labs, a startup, released the open-source speech model Dia in April. However, some of these services can be quite expensive.

Transcription services like Otter and Read.ai can now embed themselves into Zoom meetings, recording, summarizing and even alerting users to actionable items. Many online video meeting platforms offer not just transcription but also speech AI and agentic AI, with Google Meet providing the option to take notes for users using Gemini. As a regular user of voice transcription services, I can say firsthand that speech recognition AI is not perfect, but it is improving.

Mistral stated that Voxtral outperformed existing voice models, including OpenAI's Whisper, Gemini 2.5 Flash and Scribe from ElevenLabs. Voxtral produced fewer word errors than Whisper, which is currently considered the best automatic speech recognition model available. In terms of audio understanding, Voxtral Small is "competitive with GPT-4o-mini and Gemini 2.5 Flash across all tasks, achieving state-of-the-art performance in Speech Translation."

Since announcing Voxtral, social media users have said they had been waiting for an open-source speech model that can match the performance of Whisper.

Yes! We needed this. A week ago, I was lamenting over a closed-source AI universe and cyberpunk dystopian future, but today, with this addition, my outlook is much improved – go open-source. https://t.co/QsKAfTOxou — David Hendrickson (@TeksEdge) July 15, 2025

Mistral said Voxtral will be available through its API at $0.001 per minute. source


Open vs. closed models: AI leaders from GM, Zoom and IBM weigh trade-offs for enterprise use

Deciding on AI models is as much a technical decision as it is a strategic one. But open, closed and hybrid models all come with trade-offs. While speaking at this year's VB Transform, model architecture experts from General Motors, Zoom and IBM discussed how their companies and customers consider AI model selection.

Barak Turovsky, who in March became GM's first chief AI officer, said there's a lot of noise with every new model release and every time the leaderboard changes. Long before leaderboards were a mainstream debate, Turovsky helped launch the first large language model (LLM) and recalled the ways open-sourcing AI model weights and training data led to major breakthroughs. "That was frankly probably one of the biggest breakthroughs that helped OpenAI and others to start launching," Turovsky said. "So it's actually a funny anecdote: Open-source actually helped create something that went closed and now maybe is back to being open."

Factors for decisions vary and include cost, performance, trust and safety. Turovsky said enterprises sometimes prefer a mixed strategy — using an open model for internal use and a closed model for production and customer-facing work, or vice versa.

IBM's AI strategy

Armand Ruiz, IBM's VP of AI platform, said IBM initially started its platform with its own LLMs, but then realized that wouldn't be enough — especially as more powerful models arrived on the market. The company then expanded to offer integrations with platforms like Hugging Face so customers could pick any open-source model. (The company recently debuted a new model gateway that gives enterprises an API for switching between LLMs.)

More enterprises are choosing to buy models from multiple vendors. When Andreessen Horowitz surveyed 100 CIOs, 37% of respondents said they were using five or more models, up from 29% a year earlier.

Choice is key, but sometimes too much choice creates confusion, said Ruiz. To help customers with their approach, IBM doesn't worry too much about which LLM they're using during the proof-of-concept or pilot phase; the main goal is feasibility. Only later do they begin to look at whether to distill a model or customize one based on a customer's needs. "First we try to simplify all that analysis paralysis with all those options and focus on the use case," Ruiz said. "Then we figure out what is the best path for production."
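IBM's gateway and the multi-vendor survey numbers point at the same plumbing problem: routing each request to whichever model a team has chosen. The sketch below shows that pattern in rough form (it is not IBM's actual gateway, whose API is not described here): a thin registry maps logical model names to backend callables behind one interface. The route names and stand-in backends are assumptions.

```python
# Rough sketch of the model-gateway pattern discussed above: one interface in
# front of several backends, so applications switch models by name instead of
# rewriting integration code. This is not IBM's actual gateway; the registry
# entries and the stand-in backends are illustrative assumptions.
from typing import Callable, Dict

def call_vendor_a(prompt: str) -> str:
    return f"[vendor-a reply to] {prompt}"      # stand-in for a real vendor SDK call

def call_open_source(prompt: str) -> str:
    return f"[open-source reply to] {prompt}"   # e.g. a self-hosted model

class ModelGateway:
    """Maps logical model names to backend callables."""
    def __init__(self) -> None:
        self._routes: Dict[str, Callable[[str], str]] = {}

    def register(self, name: str, backend: Callable[[str], str]) -> None:
        self._routes[name] = backend

    def complete(self, model: str, prompt: str) -> str:
        if model not in self._routes:
            raise ValueError(f"No backend registered for '{model}'")
        return self._routes[model](prompt)

gateway = ModelGateway()
gateway.register("pilot-model", call_open_source)   # cheap model for pilots
gateway.register("prod-model", call_vendor_a)       # hardened model for production

print(gateway.complete("pilot-model", "Summarize our Q2 churn numbers."))
```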
How Zoom approaches AI

Zoom's customers can choose between two configurations for its AI Companion, said Zoom CTO Xuedong Huang. One involves federating the company's own LLM with other, larger foundation models. Another configuration allows customers concerned about using too many models to use just Zoom's model. (The company also recently partnered with Google Cloud to adopt an agent-to-agent protocol for AI Companion for enterprise workflows.)

The company made its own small language model (SLM) without using customer data, Huang said. At 2 billion parameters, the model is very small, but it can still outperform other industry-specific models. The SLM works best on complex tasks when working alongside a larger model.

"This is really the power of a hybrid approach," Huang said. "Our philosophy is very straightforward. Our company is leading the way very much like Mickey Mouse and the elephant dancing together. The small model will perform a very specific task. We are not saying a small model will be good enough…The Mickey Mouse and elephant will be working together as one team." source


Skip the AI ‘bake-off’ and build autonomous agents: Lessons from Intuit and Amex

As generative AI matures, enterprises are shifting from experimentation to implementation—moving beyond chatbots and copilots into the realm of intelligent, autonomous agents. In a conversation with VentureBeat's Matt Marshall at VB Transform, Ashok Srivastava, SVP and Chief Data Officer at Intuit, and Hilary Packer, EVP and CTO at American Express, detailed how their companies are embracing agentic AI to transform customer experiences, internal workflows and core business operations.

From models to missions: the rise of intelligent agents

At Intuit, agents aren't just about answering questions—they're about executing tasks. In TurboTax, for instance, agents help customers complete their taxes 12% faster, with nearly half finishing in under an hour. These intelligent systems draw data from multiple streams—including real-time and batch data—via Intuit's internal bus and persistent services. Once processed, the agent analyzes the information to make a decision and take action.

"This is the way we're thinking about agents in the financial domain," said Srivastava. "We're trying to make sure that as we build, they're robust, scalable and actually anchored in reality. The agentic experiences we're building are designed to get work done for the customer, with their permission. That's key to building trust."

These capabilities are made possible by GenOS, Intuit's custom generative AI operating system. At its heart is GenRuntime, which Srivastava likens to a CPU: it receives the data, reasons over it, and determines an action that's then executed for the end user. The OS was designed to abstract away technical complexity, so developers don't need to reinvent risk safeguards or security layers every time they build an agent.

Across Intuit's brands—from TurboTax and QuickBooks to Mailchimp and Credit Karma—GenOS helps create consistent, trusted experiences and ensure robustness, scalability and extensibility across use cases.

Building the agentic stack at Amex: trust, control and experimentation

For Packer and her team at Amex, the move into agentic AI builds on more than 15 years of experience with traditional AI and a mature, battle-tested big data infrastructure. As gen AI capabilities accelerate, Amex is reshaping its strategy to focus on how intelligent agents can drive internal workflows and power the next generation of customer experiences. For example, the company is focused on developing internal agents that boost employee productivity, like the APR agent that reviews software pull requests and advises engineers on whether code is ready to merge. This project reflects Amex's broader approach: start with internal use cases, move quickly, and use early wins to refine the underlying infrastructure, tools and governance standards. To support fast experimentation, strong security and policy enforcement, Amex developed an "enablement layer" that allows for rapid development without sacrificing oversight.
"And so now as we think about agentic, we've got a nice control plane to plug in these additional guardrails that we really do need to have in place," said Packer.

Within this system is Amex's concept of modular "brains"—a framework in which agents are required to consult specific "brains" before taking action. These brains serve as modular governance layers—covering brand values, privacy, security and legal compliance—that every agent must engage with during decision-making. Each brain represents a domain-specific set of policies, such as brand voice, privacy rules or legal constraints, and functions as a consultable authority. By routing decisions through this system of constraints, agents remain accountable, aligned with enterprise standards and worthy of user trust. For example, a dining reservation agent operating through Resy, Amex's restaurant booking platform, would need to validate that it's selecting the right restaurant at the right time, matching the user's intent while adhering to brand and policy guidelines.

Architecture that enables speed and safety

Both AI leaders agreed that enabling rapid development at scale demands thoughtful architectural design. At Intuit, the creation of GenOS empowers hundreds of developers to build safely and consistently. The platform ensures each team can access shared infrastructure, common safeguards and model flexibility without duplicating work.

Amex took a similar approach with its enablement layer. Designed around a unified control plane, the layer lets teams rapidly develop AI-driven agents while enforcing centralized policies and guardrails. It ensures consistent implementation of risk and governance frameworks while encouraging speed. Developers can deploy experiments quickly, then evaluate and scale based on feedback and performance, all without compromising brand trust.

Lessons in agentic AI adoption

Both AI leaders stressed the need to move quickly, but with intent. "Don't wait for a bake-off," Packer advised. "It's better to pick a direction, get something into production, and iterate quickly, rather than delaying for the perfect solution that may be outdated by launch time."

They also emphasized that measurement must be embedded from the very beginning. According to Srivastava, instrumentation isn't something to bolt on later—it has to be an integral part of the stack. Tracking cost, latency, accuracy and user impact is essential for assessing value and maintaining accountability at scale. "You have to be able to measure it. That's where GenOS comes in—there's a built-in capability that lets us instrument AI applications and track both the cost going in and the return coming out," said Srivastava. "I review this every quarter with our CFO. We go line by line through every AI use case across the company, assessing exactly how much we're spending and what value we're getting in return."

Intelligent agents are the next enterprise platform shift

Intuit and American Express are among the leading enterprises adopting agentic AI not just as a technology layer, but as a new operating model.
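To picture the "consult the brains before acting" pattern Packer describes, here is a minimal sketch in which each governance domain is a small reviewer that can veto a proposed action before the agent executes it; the brain names and rules are assumptions for illustration, not Amex's actual implementation.

```python
# Minimal sketch of the modular "brains" pattern described above: every
# governance domain reviews a proposed action and can veto it before the agent
# executes. Brain names and rules are assumptions, not Amex's implementation.
from dataclasses import dataclass, field

@dataclass
class ProposedAction:
    description: str
    metadata: dict = field(default_factory=dict)

class Brain:
    """One domain-specific policy authority (brand, privacy, legal, ...)."""
    name = "base"
    def review(self, action: ProposedAction) -> tuple[bool, str]:
        return True, "ok"

class PrivacyBrain(Brain):
    name = "privacy"
    def review(self, action: ProposedAction) -> tuple[bool, str]:
        if action.metadata.get("shares_personal_data"):
            return False, "personal data may not leave the platform"
        return True, "ok"

class BrandBrain(Brain):
    name = "brand"
    def review(self, action: ProposedAction) -> tuple[bool, str]:
        banned = {"guarantee", "risk-free"}
        if any(word in action.description.lower() for word in banned):
            return False, "wording violates brand guidelines"
        return True, "ok"

def execute_with_governance(action: ProposedAction, brains: list[Brain]) -> str:
    for brain in brains:                     # every brain must be consulted
        approved, reason = brain.review(action)
        if not approved:
            return f"Blocked by {brain.name} brain: {reason}"
    return f"Executing: {action.description}"

print(execute_with_governance(
    ProposedAction("Book a 7pm table at the user's requested restaurant"),
    [PrivacyBrain(), BrandBrain()],
))
```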


As AI use expands, platforms like Brain Max seek to simplify cross-app integration

As more companies bring generative AI tools into their workflows, the number of AI applications they need to connect to their stack increases. An emerging trend brings more visibility into all these applications in one place, allowing enterprises to query, search and monitor their data and AI applications without needing to open additional windows.

Platforms like Galaxy, Glean, Elastic and even Google have begun offering enterprises a way to connect their information and conduct searches in a centralized location. OpenAI has updated ChatGPT to access certain applications, while Anthropic now allows Claude to search users' Google Workspaces. The newest entrant into the space is ClickUp, with its new Brain Max platform, which lets users query their data stored in Google Drive, OneDrive, SharePoint and others, manage support tickets and emails, and set up agentic systems.

Zeb Evans, founder and CEO of ClickUp, told VentureBeat that the goal was to increase productivity and allow customers to continue using AI in the same way they always have, without needing to open other applications to do so. "People within companies are all using their different models with and without security clearance and are switching between different applications to use those models and core applications for work," Evans said.

Evans pointed out that customers often switched between applications when they wanted to write a prompt related to their work. For example, a user working on a project in GitHub or a Word document would sometimes copy their work into ChatGPT to ask a query and bring more context to their request. The goal of Brain Max and other all-in-one platforms, Evans said, is to reduce window switching and offer enterprise search where all of a company's data integrations are already located. It would also help in training and building agents, since an agent can tap into ClickUp and retrieve the necessary information. Because document storage systems like Google Drive and SharePoint are already integrated into ClickUp, Evans said, the large language models embedded in Brain Max do not need to interact with those applications' APIs; they search Brain Max and ClickUp itself.

Ease of finding

One reason these more deeply integrated, all-in-one platforms are gaining popularity is the importance of context. For enterprise search and agents to function effectively, they require a deeper understanding of what they are searching for. All of this context feeds the broader Deep Research trend, but ClickUp posits that this kind of enterprise search is better served if all the information is in one place, with permissions already in place.

One of ClickUp's earliest customers for Brain Max is healthcare solutions company MPAssist. Enrico Mayor, cofounder of MPAssist, said it has helped streamline how employees find information. "Brain Max is like ChatGPT, but it knows everything about our company. For us, it's powerful because we use the chat in there, we have our boards in here, and basically manage the whole company here. We have literally everything in there for the whole company, and I can just kind of ask it anything and figure out what's going on right now," Mayor said. Mayor said MPAssist had been using other applications to try to manage its workflow but has fully moved to ClickUp.
According to Mayor, the ability to ask questions about the company has also helped cut down the number of requests his employees escalate to him, because they can readily find that information themselves.

Models to choose models

ClickUp's Evans said they designed Brain Max to aggregate all of the "latest and greatest AI models in one place." ClickUp has also developed its own model, called Brain, which selects the best model to use for each user request.

"When you have an agent that is connected to Google Drive, for example, that agent's not gonna be aware that you have access to different files than I do," Evans said. "The way that we've built our infrastructure is that it is aware of all of the files that you have access to and the ones that we don't have access to, because we're able to synchronize them in a universal data model and ensure that the permissions are also synchronized." source


AI’s fourth wave is here — are enterprises ready for what’s next?

Yesterday's emerging tech is now essential to business success — and the next wave is coming fast. To maintain competitive advantage through the next five years, which innovations must forward-thinking companies prioritize right now? At VentureBeat's Transform 2025, Yaad Oren, global head of SAP research & innovation, and Emma Brunskill, associate professor of computer science at Stanford, spoke with moderator Susan Etlinger, senior director of strategy and thought leadership for Azure AI at Microsoft, about the strategies needed today for tomorrow's transformative technology.

How the current landscape will shape the future

The fourth generation of AI — generative AI — marks a paradigm shift in what AI brings to the table, Oren said, outlining three major places where it's bringing significant value and disruption to the enterprise. The first is the user experience and how people interact with software. The second is automation on the application layer — SAP has embedded approximately 230 AI capabilities and agents inside its applications, and plans to increase this number to 400 by the end of 2025, to drive increased productivity and reduce costs. The third area is the platform — the core engine that powers each enterprise — which raises new questions about the developer experience, as well as privacy and trust.

"We see a lot of disruption around UX, the application, and the platform itself that provides all the tools to deal with this new treasure trove of options AI provides to enterprises," Oren summed up.

For Brunskill, the big question is how AI can integrate with humans to drive societal value, rather than acting like a thief of human creativity and ingenuity. A recent study found that if an enterprise frames AI tools as productivity-enhancing, people use them much less frequently than if they're framed as task-enhancing. "That's a pretty big take-home as we think about how to translate some of the extraordinary capabilities of these systems into systems that drive value for customers, for organizations and others," Brunskill said. "We need to think about how these are framed."

Business value at the enterprise level should be top of mind, Oren added, and that means even as technology evolves, AI in the enterprise needs to go beyond technology for technology's sake. The sexiest new technology often delivers the least value. "What you see today is a proliferation of many solutions out there that create great jumping avatars in movies that look amazing, but the value: how do you help the enterprise reduce costs? How do you help the enterprise increase productivity or revenue? How are you able to mitigate risk?" he said. "This mindset is not fully there with AI. You always need to start with a business problem. Quantify the value you would like to achieve."

Predictions for the future of AI

Artificial general intelligence (AGI) is a theoretical breakthrough in which AI would match or surpass human-level versatility and problem-solving capabilities across most cognitive tasks. The future of AI, and the definition of what AGI is, will be a big topic of discussion in the next few years. Brunskill defines it as the point at which AI can do any sort of cognitive task at least as well as an average human in a profession.

"In terms of a lot of the white-collar jobs that just require cognitive processing, I think we're going to make enormous strides in the next five years," Brunskill said. "I don't think we're ready yet. I think we need to do a lot of creative thinking about what that will mean to industries.
What is it going to do to your workforce? I'm very interested in how we think about workforce retraining and how we're going to provide meaningful work to many people going forward. What new opportunities will we have?"

The future of AI, and the definition of AGI, is a big question, and we're not as near as many folks would prefer, Oren said. But along the way we'll see exciting new technology leaps across six major disruption pillars: the next generation of AI beyond its current capabilities, the future of data platforms, robotics, quantum computing, next-generation enterprise UX, and the future of cloud architecture around data privacy.

"The transformer architecture in this generation is nothing compared to what's coming," he said. "A new type of meta-learning. AI learning to evolve and create agents by itself. Emotional AI. The future of AI, the definition of AGI, is a big one."

The future of data itself is also critical. We're approaching the limits of real-world data — even sources like Wikipedia have already been fully absorbed by AI models. To drive the next leap in AI progress, synthetic data generation and improving data quality will be essential.

Then there's robotics, which is evolving rapidly — we learned from recent innovations like DeepSeek that you can do "more with less" and install very powerful AI on the edge. Quantum computing will help create a paradigm shift in how we run process optimization and simulation. And the future of enterprise UX will be another disruption, providing users with new types of personalization, adaptation of screens to specific contexts, and an immersive experience.

"My kids' generation is going to hit the workforce after 2030. What's going to be their UX paradigm?" Oren said. "They need an emotional connection for screens. They need adaptive screens. This is totally different from what we do today." source


Building voice AI that listens to everyone: Transfer learning and synthetic speech in action

Have you ever thought about what it is like to use a voice assistant when your own voice does not match what the system expects? AI is not just reshaping how we hear the world; it is transforming who gets to be heard. In the age of conversational AI, accessibility has become a crucial benchmark for innovation. Voice assistants, transcription tools and audio-enabled interfaces are everywhere. One downside is that for millions of people with speech disabilities, these systems often fall short.

As someone who has worked extensively on speech and voice interfaces across automotive, consumer and mobile platforms, I have seen the promise of AI in enhancing how we communicate. In my experience leading development of hands-free calling, beamforming arrays and wake-word systems, I have often asked: What happens when a user's voice falls outside the model's comfort zone? That question has pushed me to think about inclusion not just as a feature but as a responsibility. In this article, we will explore a new frontier: AI that can not only enhance voice clarity and performance, but fundamentally enable conversation for those who have been left behind by traditional voice technology.

Rethinking conversational AI for accessibility

To better understand how inclusive AI speech systems work, let us consider a high-level architecture that begins with nonstandard speech data and leverages transfer learning to fine-tune models. These models are designed specifically for atypical speech patterns, producing both recognized text and even synthetic voice outputs tailored for the user.

Standard speech recognition systems struggle when faced with atypical speech patterns. Whether due to cerebral palsy, ALS, stuttering or vocal trauma, people with speech impairments are often misheard or ignored by current systems. But deep learning is helping change that. By training models on nonstandard speech data and applying transfer learning techniques, conversational AI systems can begin to understand a wider range of voices.

Beyond recognition, generative AI is now being used to create synthetic voices based on small samples from users with speech disabilities. This allows users to train their own voice avatar, enabling more natural communication in digital spaces and preserving personal vocal identity. There are even platforms being developed where individuals can contribute their speech patterns, helping to expand public datasets and improve future inclusivity. These crowdsourced datasets could become critical assets for making AI systems truly universal.
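As a rough sketch of the transfer-learning step described above, the snippet below starts from a pretrained speech recognition model and fine-tunes it on a small corpus of atypical speech. It uses Hugging Face Transformers with wav2vec 2.0 as an example base model; the training settings are assumptions, and a real effort would need a properly prepared, consent-based dataset and evaluation on held-out speakers.

```python
# Rough sketch of transfer learning for atypical speech: reuse a pretrained
# wav2vec 2.0 acoustic model, freeze its feature encoder, and adapt the upper
# layers on (audio, transcript) pairs. Dataset and hyperparameters are
# illustrative assumptions, not a validated recipe.
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

# Keep the low-level acoustic representations; adapt only the upper layers.
model.freeze_feature_encoder()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def training_step(waveform: torch.Tensor, transcript: str) -> float:
    """One gradient step on a single 16 kHz (audio, transcript) pair."""
    inputs = processor(waveform, sampling_rate=16_000, return_tensors="pt")
    labels = processor.tokenizer(transcript.upper(), return_tensors="pt").input_ids
    loss = model(input_values=inputs.input_values, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```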
Assistive features in action

Real-time assistive voice augmentation systems follow a layered flow. Starting with speech input that may be disfluent or delayed, AI modules apply enhancement techniques, emotional inference and contextual modulation before producing clear, expressive synthetic speech. These systems help users speak not only intelligibly but meaningfully.

Have you ever imagined what it would feel like to speak fluidly with assistance from AI, even if your speech is impaired? Real-time voice augmentation is one such feature making strides. By enhancing articulation, filling in pauses or smoothing out disfluencies, AI acts like a co-pilot in conversation, helping users maintain control while improving intelligibility. For individuals using text-to-speech interfaces, conversational AI can now offer dynamic responses, sentiment-based phrasing and prosody that matches user intent, bringing personality back to computer-mediated communication.

Another promising area is predictive language modeling. Systems can learn a user's unique phrasing or vocabulary tendencies, improving predictive text and speeding up interaction. Paired with accessible interfaces such as eye-tracking keyboards or sip-and-puff controls, these models create a responsive and fluent conversation flow. Some developers are even integrating facial expression analysis to add more contextual understanding when speech is difficult. By combining multimodal input streams, AI systems can create a more nuanced and effective response pattern tailored to each individual's mode of communication.

A personal glimpse: Voice beyond acoustics

I once helped evaluate a prototype that synthesized speech from residual vocalizations of a user with late-stage ALS. Despite limited physical ability, the system adapted to her breathy phonations and reconstructed full-sentence speech with tone and emotion. Seeing her light up when she heard her "voice" speak again was a humbling reminder: AI is not just about performance metrics. It is about human dignity.

I have worked on systems where emotional nuance was the last challenge to overcome. For people who rely on assistive technologies, being understood is important, but feeling understood is transformational. Conversational AI that adapts to emotions can help make this leap.

Implications for builders of conversational AI

For those designing the next generation of virtual assistants and voice-first platforms, accessibility should be built in, not bolted on. This means collecting diverse training data, supporting non-verbal inputs and using federated learning to preserve privacy while continuously improving models. It also means investing in low-latency edge processing, so users do not face delays that disrupt the natural rhythm of dialogue.

Enterprises adopting AI-powered interfaces must consider not only usability, but inclusion. Supporting users with disabilities is not just ethical; it is a market opportunity. According to the World Health Organization, more than 1 billion people live with some form of disability. Accessible AI benefits everyone, from aging populations to multilingual users to those temporarily impaired. Additionally, there is growing interest in explainable AI tools that help users understand how their input is processed. Transparency can build trust, especially among users with disabilities who rely on AI as a communication bridge.

Looking forward

The promise of conversational AI is not just to understand speech; it is to understand people. For too long, voice technology has worked best for those who speak clearly, quickly and within a narrow acoustic range. With AI, we have the tools to build systems that listen more broadly and respond more compassionately. If we want the future of conversation to be truly intelligent, it must also be inclusive. And that starts with keeping every voice in mind.

Harshal Shah is a voice technology specialist passionate about bridging human expression and machine understanding through inclusive voice solutions. source
