VentureBeat

Google’s Gemini transparency cut leaves enterprise developers ‘debugging blind’

Google’s recent decision to hide the raw reasoning tokens of its flagship model, Gemini 2.5 Pro, has sparked a fierce backlash from developers who have been relying on that transparency to build and debug applications.

The change, which echoes a similar move by OpenAI, replaces the model’s step-by-step reasoning with a simplified summary. The backlash highlights a critical tension between creating a polished user experience and providing the observable, trustworthy tools that enterprises need. As businesses integrate large language models (LLMs) into more complex and mission-critical systems, the debate over how much of a model’s internal workings should be exposed is becoming a defining issue for the industry.

A ‘fundamental downgrade’ in AI transparency

To solve complex problems, advanced AI models generate an internal monologue, also referred to as the “chain of thought” (CoT). This is a series of intermediate steps (e.g., a plan, a draft of code, a self-correction) that the model produces before arriving at its final answer. It might reveal, for example, how the model is processing data, which pieces of information it is using and how it is evaluating its own code.

For developers, this reasoning trail often serves as an essential diagnostic and debugging tool. When a model provides an incorrect or unexpected output, the thought process reveals where its logic went astray. This visibility was one of the key advantages of Gemini 2.5 Pro over OpenAI’s o1 and o3.

In Google’s AI developer forum, users called the removal of this feature a “massive regression.” Without it, developers are left in the dark. As one user on the Google forum said, “I can’t accurately diagnose any issues if I can’t see the raw chain of thought like we used to.” Another described being forced to “guess” why the model failed, leading to “incredibly frustrating, repetitive loops trying to fix things.”

Beyond debugging, this transparency is crucial for building sophisticated AI systems. Developers rely on the CoT to fine-tune prompts and system instructions, which are the primary ways to steer a model’s behavior. The feature is especially important for creating agentic workflows, where the AI must execute a series of tasks. One developer noted, “The CoTs helped enormously in tuning agentic workflows correctly.”

For enterprises, this move toward opacity is problematic. Black-box AI models that hide their reasoning introduce significant risk, making it difficult to trust their outputs in high-stakes scenarios. This trend, started by OpenAI’s o-series reasoning models and now adopted by Google, creates a clear opening for open-source alternatives such as DeepSeek-R1 and QwQ-32B. Models that provide full access to their reasoning chains give enterprises more control and transparency over the model’s behavior.

The decision for a CTO or AI lead is no longer just about which model has the highest benchmark scores. It is now a strategic choice between a top-performing but opaque model and a more transparent one that can be integrated with greater confidence.

Google’s response

In response to the outcry, members of the Google team explained their rationale. Logan Kilpatrick, a senior product manager at Google DeepMind, clarified that the change was “purely cosmetic” and does not impact the model’s internal performance.
He noted that for the consumer-facing Gemini app, hiding the lengthy thought process creates a cleaner user experience. “The % of people who will or do read thoughts in the Gemini app is very small,” he said. For developers, the new summaries were intended as a first step toward programmatically accessing reasoning traces through the API, which wasn’t previously possible.

The Google team acknowledged the value of raw thoughts for developers. “I hear that you all want raw thoughts, the value is clear, there are use cases that require them,” Kilpatrick wrote, adding that bringing the feature back to the developer-focused AI Studio is “something we can explore.”

Google’s reaction to the developer backlash suggests a middle ground is possible, perhaps through a “developer mode” that re-enables raw thought access. The need for observability will only grow as AI models evolve into more autonomous agents that use tools and execute complex, multi-step plans. As Kilpatrick concluded in his remarks, “…I can easily imagine that raw thoughts becomes a critical requirement of all AI systems given the increasing complexity and need for observability + tracing.”

Are reasoning tokens overrated?

However, experts suggest there are deeper dynamics at play than just user experience. Subbarao Kambhampati, an AI professor at Arizona State University, questions whether the “intermediate tokens” a reasoning model produces before the final answer can be used as a reliable guide for understanding how the model solves problems. A paper he recently co-authored argues that anthropomorphizing “intermediate tokens” as “reasoning traces” or “thoughts” can have dangerous implications.

Models often wander in endless and unintelligible directions during their reasoning process. Several experiments show that models trained on false reasoning traces and correct results can learn to solve problems just as well as models trained on well-curated reasoning traces. Moreover, the latest generation of reasoning models is trained through reinforcement learning algorithms that only verify the final result and do not evaluate the model’s “reasoning trace.”

“The fact that intermediate token sequences often reasonably look like better-formatted and spelled human scratch work… doesn’t tell us much about whether they are used for anywhere near the same purposes that humans use them for, let alone about whether they can be used as an interpretable window into what the LLM is ‘thinking,’ or as a reliable justification of the final answer,” the researchers write.

“Most users can’t make out anything from the volumes of the raw intermediate tokens that these models spew out,” Kambhampati told VentureBeat. “As we mention, DeepSeek R1 produces 30 pages of pseudo-English in solving a simple planning problem! A cynical explanation of why o1/o3 decided not to show the raw tokens originally was perhaps because they realized people will notice how incoherent they are!”
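For teams that depended on visible reasoning for debugging, the programmatic route Kilpatrick describes is the practical fallback. Below is a minimal sketch of pulling thought summaries alongside a final answer. It assumes the google-genai Python SDK’s thinking-config interface; field names and availability vary by model and SDK version, so treat it as illustrative rather than canonical.

```python
from google import genai
from google.genai import types

# Assumes a GEMINI_API_KEY environment variable is set.
client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-pro",
    contents="Why does my binary search loop forever when left == right?",
    config=types.GenerateContentConfig(
        # Ask for thought summaries in addition to the answer.
        thinking_config=types.ThinkingConfig(include_thoughts=True),
    ),
)

# Parts flagged as thoughts carry the reasoning summary; the rest is the answer.
for part in response.candidates[0].content.parts:
    label = "[thought summary]" if getattr(part, "thought", False) else "[answer]"
    print(label, part.text)
```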


What’s inside Genspark? A new vibe working approach that ditches rigid workflows for autonomous agents

Vibe coding has been all the rage in recent months as a simple way for anyone to build applications with generative AI. But what if that same easy-going, natural language approach were extended to other enterprise workflows? That’s the promise of an emerging category of agentic AI applications.

At VB Transform 2025 today, one such application was on display: the Genspark Super Agent, originally launched earlier this year. The Genspark Super Agent’s promise and approach could well extend the concept of vibe coding into vibe working. A key tenet of enabling vibe working, though, is to go with the flow and exert less control, rather than more, over AI agents.

“The vision is simple, we want to bring the Cursor experience for developers to the workspace for everyone,” Kay Zhu, CTO of Genspark, said at VB Transform. “Everyone here should be able to do vibe working… it’s not only the software engineer that can do vibe coding.”

Less is more when it comes to enterprise agentic AI

According to Zhu, a foundational premise for enabling a vibe working era is letting go of some rigid rules that have defined enterprise workflows for generations. Zhu provocatively challenged enterprise AI orthodoxy, arguing that rigid workflows fundamentally limit what AI agents can accomplish for complex business tasks. During a live demonstration, he showed the system autonomously researching conference speakers, creating presentations, making phone calls and analyzing marketing data.

Most notably, the system placed an actual phone call to the event organizer, VentureBeat founder Matt Marshall, during the live presentation. “This is normally the call that I don’t really want to do by myself, you know, in person. So I let the agent do it,” Zhu explained as the audience listened to his AI agent attempt to convince the moderator to move his presentation slot before Andrew Ng’s session. The call connected in real time, with the agent autonomously crafting persuasive arguments on Zhu’s behalf.

The calling feature has revealed unexpected use cases, highlighting both the platform’s capabilities and users’ comfort with AI autonomy. “We actually observe a lot of people are using Genspark to call… to do different kinds of things,” Zhu noted. “Some of the Japanese users are using this to call to resign from their company. You know they don’t like the company, but they don’t want to call them. And some of the people are using call-for-me agents to break up with their boyfriend and girlfriend.” These real-world applications demonstrate how users are pushing AI agents beyond traditional business workflows into deeply personal territory.

Technical architecture: Why backtracking is good for enterprise AI

The system accomplishes all of that without predefined workflows. The platform’s core philosophy of ‘less control, more tools’ represents a fundamental departure from traditional enterprise AI approaches. “Workflow in our definition is the predefined steps, and these kinds of steps often break on edge cases. When the user asks harder and harder questions, the workflow cannot hold,” Zhu said. Genspark’s agentic engine represents a significant departure from traditional workflow-based AI systems.
The platform combines nine different large language models (LLMs) in a mixture-of-experts (MoE) configuration, equipped with over 80 tools and more than 10 premium datasets. The system operates on a classic agent loop: plan, execute, observe and backtrack. Zhu emphasized that the power actually lives in the backtrack stage. This backtracking capability allows the agent to recover intelligently from failures and find alternative approaches when unexpected situations arise, rather than failing at predefined workflow boundaries (a minimal sketch of this loop appears at the end of this article). The system uses LLM judges to evaluate every agent session and attribute rewards to each step, feeding this data back through reinforcement learning and prompt playbooks for continuous improvement.

The technical approach differs markedly from established frameworks like LangChain or CrewAI, which typically require more structured workflow definition. While those platforms excel at orchestrating predictable multi-step processes, Genspark’s architecture prioritizes autonomous problem-solving over deterministic execution paths.

Enterprise Strategy: Workflows today, vibe working agents tomorrow

Genspark’s rapid scaling, from launch to $36 million ARR in 45 days, demonstrates that autonomous agent platforms are moving beyond experimental phases into commercial viability. The company’s ‘less control, more tools’ philosophy challenges fundamental assumptions about enterprise AI architecture.

The implications for enterprises leading in AI adoption are clear: start architecting systems that can handle both predictable workflows and autonomous problem-solving. The key is designing platforms that gracefully escalate from deterministic processes to agentic behavior when complexity demands it.

For enterprises planning later AI adoption, Genspark’s success signals that vibe working is becoming a competitive differentiator. Organizations that remain locked into rigid workflow thinking may be disadvantaged as AI-native companies embrace more fluid, adaptive approaches to knowledge work.

The question isn’t whether autonomous AI agents will reshape enterprise workflows—it’s whether your organization will be ready when the 20% of complex cases becomes 80% of your AI workload.
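To make the plan-execute-observe-backtrack loop concrete, here is a minimal sketch in Python. Every function name is hypothetical: it illustrates the general pattern described above, not Genspark’s actual engine, and a random stub stands in for real tool calls and LLM judgments.

```python
import random  # stand-in for real tool execution and LLM judging

def plan(task, failed_approaches):
    """Pick the next approach, skipping approaches that already failed."""
    options = [a for a in ("web_search", "call_api", "phone_call")
               if a not in failed_approaches]
    return options[0] if options else None

def execute(approach, task):
    """Run the chosen tool; a real system would invoke one of many tools."""
    return {"approach": approach, "ok": random.random() > 0.5}

def observe(result):
    """An LLM judge would score this step; here we just read the flag."""
    return result["ok"]

def run_agent(task, max_steps=10):
    failed = set()
    for _ in range(max_steps):
        approach = plan(task, failed)
        if approach is None:
            return None  # every known approach exhausted
        result = execute(approach, task)
        if observe(result):
            return result
        failed.add(approach)  # backtrack: record the dead end and replan
    return None

print(run_agent("research conference speakers"))
```

The point of the backtrack step is visible in the `failed` set: instead of halting at the first broken step, as a rigid workflow would, the loop records the dead end and replans with a different approach.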


IBM sees enterprise customers are using ‘everything’ when it comes to AI, the challenge is matching the LLM to the right use case

Over the last 100 years, IBM has seen many different tech trends rise and fall. What tends to win out are technologies that give customers choice.

At VB Transform 2025 today, Armand Ruiz, VP of AI Platform at IBM, detailed how Big Blue is thinking about generative AI and how its enterprise users are actually deploying the technology. A key theme Ruiz emphasized is that, at this point, it’s not about choosing a single large language model (LLM) provider or technology. Increasingly, enterprise customers are systematically rejecting single-vendor AI strategies in favor of multi-model approaches that match specific LLMs to targeted use cases.

IBM has its own open-source AI models with the Granite family, but it is not positioning that technology as the only choice, or even the right choice, for all workloads. This enterprise behavior is driving IBM to position itself not as a foundation model competitor, but as what Ruiz referred to as a control tower for AI workloads.

“When I sit in front of a customer, they’re using everything they have access to, everything,” Ruiz explained. “For coding, they love Anthropic, and for some other use cases, like reasoning, they like o3. And then for LLM customization, with their own data and fine-tuning, they like either our Granite series or Mistral with their small models, or even Llama… it’s just matching the LLM to the right use case. And then we help them as well to make recommendations.”

The Multi-LLM gateway strategy

IBM’s response to this market reality is a newly released model gateway that provides enterprises with a single API to switch between different LLMs while maintaining observability and governance across all deployments. The technical architecture allows customers to run open-source models on their own inference stack for sensitive use cases while simultaneously accessing public APIs like AWS Bedrock or Google Cloud’s Gemini for less critical applications.

“That gateway is providing our customers a single layer with a single API to switch from one LLM to another LLM and add observability and governance all throughout,” Ruiz said. (A minimal sketch of this single-API pattern appears at the end of this article.)

The approach directly contradicts the common vendor strategy of locking customers into proprietary ecosystems. IBM is not alone in taking a multi-vendor approach to model selection: multiple tools have emerged in recent months for model routing, which aim to direct workloads to the appropriate model.

Agent orchestration protocols emerge as critical infrastructure

Beyond multi-model management, IBM is tackling the emerging challenge of agent-to-agent communication through open protocols. The company has developed ACP (Agent Communication Protocol) and contributed it to the Linux Foundation. ACP is a competing effort to Google’s Agent2Agent (A2A) protocol, which Google contributed to the Linux Foundation just this week.

Ruiz noted that both protocols aim to facilitate communication between agents and reduce custom development work. He expects the different approaches to converge eventually; for now, the differences between A2A and ACP are mostly technical. The agent orchestration protocols provide standardized ways for AI systems to interact across different platforms and vendors. The technical significance becomes clear when considering enterprise scale: some IBM customers already have over 100 agents in pilot programs.
Without standardized communication protocols, each agent-to-agent interaction requires custom development, creating an unsustainable integration burden.

AI is about transforming workflows and the way work is done

In terms of how Ruiz sees AI impacting enterprises today, he suggests it needs to be more than just chatbots. “If you are just doing chatbots, or you’re only trying to do cost savings with AI, you are not doing AI,” Ruiz said. “I think AI is really about completely transforming the workflow and the way work is done.”

The distinction between AI implementation and AI transformation centers on how deeply the technology integrates into existing business processes. IBM’s internal HR example illustrates this shift: instead of employees asking chatbots for HR information, specialized agents now handle routine queries about compensation, hiring and promotions, automatically routing to appropriate systems and escalating to humans only when necessary.

“I used to spend a lot of time talking to my HR partners for a lot of things. I handle most of it now with an HR agent,” Ruiz explained. “Depending on the question, if it’s something about compensation or it’s something about just handling separation, or hiring someone, or doing a promotion, all these things will connect with different HR internal systems, and those will be like separate agents.”

This represents a fundamental architectural shift from human-computer interaction patterns to computer-mediated workflow automation. Rather than employees learning to interact with AI tools, the AI learns to execute complete business processes end to end. The technical implication: enterprises need to move beyond API integrations and prompt engineering toward deep process instrumentation that allows AI agents to execute multi-step workflows autonomously.

Strategic implications for enterprise AI investment

IBM’s real-world deployment data suggests several critical shifts for enterprise AI strategy:

Abandon chatbot-first thinking: Organizations should identify complete workflows for transformation rather than adding conversational interfaces to existing systems. The goal is to eliminate human steps, not improve human-computer interaction.

Architect for multi-model flexibility: Rather than committing to single AI providers, enterprises need integration platforms that enable switching between models based on use-case requirements while maintaining governance standards.

Invest in communication standards: Organizations should prioritize AI tools that support emerging protocols like MCP, ACP and A2A rather than proprietary integration approaches that create vendor lock-in.

“There is so much to build, and I keep saying everyone needs to learn AI, and especially business leaders need to be AI-first leaders and understand the concepts,” Ruiz said.
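The gateway pattern Ruiz describes is straightforward to sketch. The toy class below is not IBM’s product, just an illustration of the idea: one `generate()` call, per-use-case routing and a logging hook for observability. All backend names and signatures here are hypothetical.

```python
import time
from typing import Callable

class ModelGateway:
    """Single entry point that routes each use case to a configured model."""

    def __init__(self):
        self._backends: dict[str, Callable[[str], str]] = {}
        self._routes: dict[str, str] = {}  # use case -> backend name

    def register(self, name: str, backend: Callable[[str], str]) -> None:
        self._backends[name] = backend

    def route(self, use_case: str, backend_name: str) -> None:
        self._routes[use_case] = backend_name

    def generate(self, use_case: str, prompt: str) -> str:
        name = self._routes[use_case]
        start = time.perf_counter()
        output = self._backends[name](prompt)
        # Observability hook: which model served which use case, and latency.
        print(f"[gateway] use_case={use_case} model={name} "
              f"latency={time.perf_counter() - start:.3f}s")
        return output

# Stub backends; real ones would call Granite, Claude, Mistral, etc.
gw = ModelGateway()
gw.register("granite", lambda p: f"granite: {p[:25]}...")
gw.register("claude", lambda p: f"claude: {p[:25]}...")
gw.route("coding", "claude")
gw.route("domain_fine_tuned", "granite")

print(gw.generate("coding", "Write a unit test for the CSV parser."))
```

Because callers only ever see `generate()`, swapping the model behind a use case is a one-line routing change rather than an application rewrite, which is the lock-in-avoidance argument in a nutshell.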


How can you make sure your brand shows up in LLM search? Adobe’s new LLM Optimizer seeks to provide the tools

At the Cannes Lions festival on June 16, 2025, Adobe introduced Adobe LLM Optimizer, a new enterprise-grade tool designed to help businesses improve their visibility in generative AI-powered environments. As conversational interfaces like ChatGPT, Gemini and Claude reshape how consumers search and engage online, Adobe’s new application aims to give brands the ability to understand and influence how they appear in these rapidly evolving digital spaces.

Backed by data from Adobe Analytics showing a 3,500% increase in AI-sourced traffic to U.S. retail sites and a 3,200% spike to travel sites between July 2024 and May 2025, Adobe’s move comes at a time when the shift toward generative interfaces is accelerating. These tools are not only changing the mechanics of discovery—they are redefining what it means to be visible and influential online.

“The adoption of GenAI-powered chat services is astounding, with massive year-over-year growth,” said Haresh Kumar, senior director of strategy and product marketing for Adobe Experience Manager. “It’s fundamentally changing how consumers interact, search, and find information.”

“Generative AI interfaces are becoming go-to tools for how customers discover, engage and make purchase decisions,” added Loni Stark, vice president of strategy and product for Adobe Experience Cloud. “With Adobe LLM Optimizer, we are enabling brands to confidently navigate this new landscape, ensuring they stand out and win in the moments that matter.”

GEO is the new SEO

Kumar described the new digital reality as one in which brands no longer optimize just for search engines, but for AI models. “SEO is no longer just about keywords and backlinks,” he said. “In the era of generative AI, we’re entering a new paradigm—Generation Engine Optimization, or GEO—where relevance is judged differently.” This evolving landscape demands new methods for tracking performance and influencing discoverability. Adobe LLM Optimizer aims to address this with a three-pronged framework:

Auto Identify: The system detects how a brand’s content is being used by major AI models. Adobe tracks the “fingerprints” of indexed content and determines whether—and how—it appears in responses to relevant queries.

Auto Suggest: Drawing on Adobe’s own AI models trained for generative interfaces, the tool recommends improvements across technical infrastructure and content. These could range from fixing metadata errors to improving authority and context in FAQ content.

Auto Optimize: For many brands, the challenge isn’t just knowing what to fix—it’s executing the fixes quickly. LLM Optimizer allows users to apply recommended changes directly, often without heavy involvement from development teams.

“We help brands auto-identify how their content is performing in LLMs, auto-suggest improvements, and auto-optimize to actually implement those changes,” said Kumar.

Revealing gaps in your brand’s visibility to LLM users and assisting with filling them

Adobe’s system enables marketers to see where their brand is underrepresented in AI-driven results. “The goal is to help brands understand the gaps—where they’re not showing up in AI answers—and what fixes can make them more visible,” said Kumar. The application calculates projected traffic value for each suggested change, letting teams prioritize high-impact actions.
“Brands often ask, ‘Do I need to care about this new AI box?’” Kumar added. “The answer is yes—because traffic is shifting there. If you’re not optimizing for it, you’re missing out.”

One example of content optimization is focusing on formats that LLMs naturally prefer. “FAQ pages tend to perform exceptionally well in LLM indexing,” said Kumar. “They provide direct, authoritative answers that LLMs prefer when generating responses.” Adobe’s platform not only recommends creating such content but also assists in generating it within a brand’s existing voice and structure, thanks to native integration with Adobe Experience Manager.

Always-on analysis and expanding coverage for the growing library of LLMs

LLM Optimizer uses a combination of push and pull models to keep content indexing current. When new content is published or accessed by an AI model, the system updates its analysis and surfaces insights to the user. “Our infrastructure includes both push and pull models. Whenever content is updated or accessed, we capture that fingerprint and feed it into our analysis engine,” Kumar explained. Currently, the product tracks performance across several top AI models, including ChatGPT, Claude and Gemini, with plans to expand coverage as new models emerge.

Availability and integration

Adobe LLM Optimizer is available now as a standalone product or as a native integration with Adobe Experience Manager Sites. While pricing is not publicly disclosed, Adobe confirmed it is a separate product requiring opt-in and agreement updates. “LLM Optimizer is a new product offering, fully integrated with Adobe Experience Manager but available as a standalone solution,” said Kumar. “Customers need to opt in based on their AI readiness and strategy.”

With more consumers spending time inside AI-driven interfaces, Adobe positions LLM Optimizer as a forward-looking solution for enterprises navigating this new terrain. It offers a blend of visibility, automation and strategic clarity as digital engagement moves beyond traditional search engines into the generative future.
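Adobe has not published how its fingerprinting works, but the core “auto identify” idea, sampling AI answers to relevant queries and measuring how often a brand surfaces, can be sketched generically. In the sketch below, `ask_llm` is a placeholder for any chat-model API, and the queries and scoring are illustrative assumptions, not Adobe’s method.

```python
def ask_llm(query: str) -> str:
    """Placeholder for a real call to ChatGPT, Claude, Gemini, etc."""
    return "For creative suites, many teams use Adobe or open alternatives."

QUERIES = [
    "What is the best software for editing marketing photos?",
    "Which platforms help manage enterprise web content?",
    "What tools do designers use for vector graphics?",
]

def visibility_report(brand: str, queries: list[str]) -> float:
    """Return the fraction of sampled answers that mention the brand."""
    hits = 0
    for q in queries:
        answer = ask_llm(q)
        mentioned = brand.lower() in answer.lower()
        hits += mentioned
        print(f"{'HIT ' if mentioned else 'MISS'} {q}")
    return hits / len(queries)

rate = visibility_report("Adobe", QUERIES)
print(f"Brand appeared in {rate:.0%} of sampled answers")
```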


Agentic AI and the future state of enterprise security and observability

Presented by Splunk

With its ability to reason, adapt and take action autonomously at machine speed, agentic AI has the power and potential to dramatically change how enterprises maintain their digital resilience. It also redefines how they secure and deliver reliable performance for their digital ecosystems, where data pattern recognition and decision-making need to happen in real time and at machine speed. With agentic AI, companies get the benefits of a conversational analysis experience from LLM reasoning and adaptation, plus the automation of task execution from the agentic framework. Together these shift IT teams from reactive fire-fighting mode to proactive planning mode. Here’s how.

The promise of agentic AI for digital resilience

1. Pinpoint root cause (almost) instantly

Agentic AI can cross siloed application boundaries to bring data insights together for more complete visibility. For example, agentic AI can use LLMs to analyze logs, metrics, events and trace data; call upon different monitoring systems in your ecosystem; apply reasoning to the data; and recommend or take actions to remediate. In minutes, the agentic AI can complete what used to take a site reliability engineer hours to pinpoint and troubleshoot. For security threats, agentic AI can analyze data streams to identify threats in real time, including zero-day exploits or insider threats; automate multi-step investigation workflows across multiple security applications; and execute appropriate remediation responses to contain the threat and prevent lateral movement. Investigations that took the SOC analyst hours can now be done in minutes.

2. Preempt disruptions and downtime

The power of agentic AI can prevent incidents and disruptions in more proactive ways. By studying historical data and current trends, agentic AI can forecast vulnerabilities — such as unpatched software or weak encryption — before they are exploited. It can detect subtle user behavior anomalies and flag suspicious activity before damage occurs. It can also analyze real-time data streams — such as logs, metrics and traces from multiple sources — to provide a comprehensive view of system health and detect issues such as resource bottlenecks or latency spikes before they escalate. In short, the speed and scale at which agentic AI can perform root-cause analysis mean more alerts can be analyzed — and resolved — before they become bigger issues.

3. Make better decisions with contextual, real-time insights

Agentic AI has the ability to process new information in its environment and adapt its reasoning and course of action in real time. Contextual data refers to the rich, multidimensional information about users, devices, applications and environments — such as user behavior patterns, device states, network conditions and data flows. Agentic AI can process contextual data and patterns to make rapid, informed decisions to detect and remediate incidents and optimize operational performance.

4. Upskill and optimize the workforce

With agentic AI, you get both a natural language interface and automated task execution through the agentic framework. Workers at all levels can use it to upskill their knowledge across domains, whether identifying security threat vectors or navigating complex application stacks in observability.

Key deployment considerations for agentic AI

1. Keeping humans IN and ON the loop

Humans are ultimately responsible for managing AI agents.
As more AI agents augment the work of analysts and managers, organizations will need technical analysts to learn new skills to manage agents and incorporate them into enterprise workflows (human-on-the-loop). Automating the full detection–investigation–response workflow is appealing — but as workflows grow more complex, with multiple agents and steps, so does the risk of compounding errors and hallucinations. Inserting humans at critical points in the automated analysis workflow (human-in-the-loop) lets you ensure the agents are on the right track, provide real-time feedback and use reinforcement learning to improve model performance.

2. Avoid hallucinations with domain-specific, specialized agents

There’s a real cost to model hallucinations. A McKinsey AI report estimates $67.4B was lost globally due to hallucinated AI output. OpenAI’s o3 and o4-mini were shown to hallucinate between 51% and 79% of the time on reasoning tasks. Narrowing the agent’s purpose — combined with fine-tuning and augmenting the model with RAG using domain-specific data — improves output accuracy. Specialized agents for areas like security and observability, and even more targeted ones for detection, investigation and response, will deliver greater precision. These agents will also benefit from lower inference compute costs and latency compared to larger general-purpose LLMs.

3. Ensure seamless integration and compatibility in agentic ecosystems

Integrating agentic AI into your IT environment requires rethinking data flows, processes and security protocols, and adapting user interaction models to maintain system integrity while harnessing AI’s potential. Three emerging protocols will help accelerate this:

MCP (Model Context Protocol) for LLMs to integrate with other applications and data

A2A (Agent-to-Agent), which allows agents to communicate and collaborate with each other

AGNTCY (Agency) for vendor-neutral, standardized agent orchestration across the enterprise

4. Agent access control and data privacy governance

The volume and speed of agent access management will far exceed those of traditional human access management. It’s critical to define clear access levels for autonomous agents that maintain compliance, and to establish a plan of record for audits and governance. The goal: boost operational efficiency without introducing risk, so AI acts as a secure, augmentative force within the IT ecosystem.

Splunk AI for digital resilience

Splunk, a Cisco company, is redefining enterprise security and observability with AI at its core to accelerate insights, automate critical workflows and boost analyst productivity. Building on a long history of machine learning capabilities, Splunk is embedding generative and agentic AI across its industry-leading security and observability solutions. With a unified data platform for operational data, Splunk is building an AI-ready platform to turbocharge enterprise security and observability outcomes. Visit www.splunk.com/ai to learn more.

Cory Minton is Field CTO – AI at Splunk. Sancha Norris is Product Marketing Leader at Splunk AI.

Sponsored articles are content produced by a company that is either paying for the post or has a business relationship with VentureBeat, and they’re always clearly marked. For more information, contact [email protected].


Musk’s attempts to politicize his Grok AI are bad for users and enterprises

Let’s start by acknowledging some facts outside the tech industry for a moment:

There is no “white genocide” in South Africa — the vast majority of recent murder victims have been Black, and even throughout the country’s long and bloody history, Black South Africans have been overwhelmingly victimized and oppressed by White European, predominantly Dutch and British, colonizers in the now globally reviled system of segregation known as “Apartheid.”

The vast majority of political violence in the U.S. throughout history and in recent times has been perpetrated by right-leaning extremists, including the assassinations of Democratic Minnesota State Representative Melissa Hortman and her husband Mark, and going back further to the Oklahoma City bombing and many years of Ku Klux Klan lynchings.

These are simple, verifiable facts anyone can look up in a variety of trustworthy and long-established sources online and in print. Yet both seem to be stumbling blocks for Elon Musk, the wealthiest man in the world and tech baron in charge of at least six companies (xAI, social network X, SpaceX and its Starlink satellite internet service, Neuralink, Tesla and The Boring Company), especially with regard to the functioning of his Grok AI large language model (LLM) chatbot built into his social network X.

Here’s what’s been happening, why it matters for businesses and any generative AI users, and why it is ultimately a terrible omen for the health of our collective information ecosystem.

What’s the matter with Grok?

Grok was launched by Musk’s AI startup xAI back in 2023 as a rival to OpenAI’s ChatGPT. Late last year, it was added to the social network X as a kind of digital assistant that all users can summon to answer questions, converse or generate imagery by tagging it “@grok.”

Earlier this year, an AI power user on X discovered that the implementation of the Grok chatbot on the social network appeared to contain a “system prompt” — a set of overarching instructions to an AI model intended to guide its behavior and communication style — directing it to avoid mentioning or linking back to any sources that named Musk or his then-boss, U.S. President Donald Trump, as top spreaders of disinformation. xAI leadership characterized this as an “unauthorized modification” by an unidentified new hire (purportedly formerly from OpenAI) and said it would be removed.

Then, in May 2025, VentureBeat reported that Grok was going off the rails and asserting, unprompted by users, that there was ambiguity about the subject of “white genocide” in South Africa when, in fact, there was none. Grok was bringing up the topic completely randomly in conversations about totally different subjects. After more than a day of this behavior, xAI claimed to have updated the AI chatbot and once again blamed the errors on an unnamed employee. Yet, given Musk’s own background as a white South African born in the country and raised there during Apartheid, suspicion immediately fell on him personally.

Moreover, since his takeover of Twitter in 2022 and subsequent renaming of it as “X,” Musk has been posting sympathetically in response to X users who align themselves with right-wing, far-right and conservative views and the Make America Great Again (MAGA) movement started by Trump. Musk was one of Trump’s primary political benefactors and allies in the 2024 U.S.
presidential election — suggesting that his victory was necessary to secure the future of “western civilization,” among many other similarly dire warnings and entreaties — and served as an advisor and apparent ringleader of the Department of Government Efficiency (DOGE) effort to reduce federal spending.

Increasingly, in the last few months, Musk has contradicted and expressed displeasure at Grok’s responses to right-leaning users when the data and information the chatbot surfaces prove them wrong, or dispute his own points. For example, on June 14, Musk posted on his X account, “The far left is murderously violent,” reposting another user who blamed a string of recent high-profile killings on “the left” (although in at least one case, the chief suspect, Luigi Mangione, is an avowed and self-declared independent). In response, Grok fact-checked Musk to state that this was incorrect. Musk did not take it well, writing in response to one Grok correction: “Major fail, as this is objectively false. Grok is parroting legacy media. Working on it.”

A few days ago, in response to a complaint from an influential conservative X user, “@catturd,” about Grok’s supposed liberal or left-leaning political bias, Musk stated his goal of creating a new version of Grok that would rely less on mainstream media sources. In fact, Musk proposed in an X post on June 21 that he would use a forthcoming updated version of Grok (3.5 or 4) to “rewrite the entire corpus of human knowledge, adding missing information and deleting errors.” He then accused other AI models of having “far too much garbage.”

As a left-leaning Kamala Harris voter in 2024, I’m of course disgusted by this stance from Musk and object to it. As a journalist and lover of the written word, Musk’s pronouncement that he will “rewrite the entire corpus of human knowledge, adding missing information and deleting errors” brings to mind the true (to the best of our historical knowledge) story of the burning of the Great Library of Alexandria in Egypt, which destroyed countless works of knowledge we as a species will never be able to recover. This fills me with dread and sadness.

It also betrays, quite frankly, an arrogance and hubris that treats all the knowledge of recorded history and the efforts of scholars and historians of yore as some sort of flawed database Musk and his team can correct, rather than a massive community endeavor across millennia deserving of respect, gratitude and admiration.

But even trying to put my own views aside, I think it’s a bad move for his business and, to take a page from Musk’s book, civilization.


At INBOUND 2025, AI and human creativity take the stage together

Presented by HubSpot

INBOUND, HubSpot’s flagship conference for marketing, sales and customer service professionals, is headed to the West Coast for the first time, September 3-5, 2025. Over three days, attendees can expect bold insights, meaningful connections and breakthrough content spanning marketing, sales, customer experience and AI innovation — all designed to blend the familiar with the surprising, says Courtney Dagher, global events senior team manager, marketing & programming. “At INBOUND, it’s all about tackling trending topics through unexpected pairings and delivering insights you won’t find anywhere else,” Dagher says.

This year, that means INBOUND is putting AI pioneer Dario Amodei on the same stage as Michelin-starred chef Dominique Crenn, because innovation doesn’t live in silos, says Kat Tooley, VP, global events and experiential marketing. “When you see a world-class chef talking customer experience alongside a tech visionary, that’s when the real breakthroughs happen,” Tooley explains. “For our business leaders who attend INBOUND, it’s proof that game-changing ideas come from connecting dots others miss.”

The aim is to highlight creative leadership alongside technical innovation, since both are essential to achieving excellence at scale. Future-ready companies and leaders approach challenges holistically, through both creative and technical lenses. INBOUND aims to instill that mindset from day one, recognizing it as a critical driver of long-term success. “Here’s the thing — every unicorn company figured out that creativity and tech aren’t separate departments, they’re dance partners,” Dagher says. “Look at Anthropic’s collaboration with Rick Rubin — The Way of Code — on vibe coding. That’s exactly what we mean. The startups getting funded and the companies scaling aren’t just technically sound or just creative — they’re both.”

VentureBeat readers get 10% off General Admission to INBOUND with code VB10 (valid through 7/31).

Transformative AI technologies taking center stage

Today, AI isn’t just leveling the playing field; it’s completely rebuilding it. It has become a co-pilot for selling smarter, marketing faster and building better. A solo founder can now do customer research that used to take a team of ten. Sales reps can focus on building relationships instead of digging through data. Marketing can be personal at scale, not just spray and pray.

“For entrepreneurs, this is your moment. The tools that used to be enterprise-only are now in your pocket,” Tooley says. “The question isn’t whether AI will change your game, it’s whether you’ll use it to win.”

INBOUND is designed to equip attendees with the knowledge they need to hit the ground running, she adds. From frameworks and roadmaps to real-world processes, INBOUND’s content puts attendees in the driver’s seat, delivered by leaders actively using AI to build and scale businesses today. The Innovation Zone offers early access to what’s coming next, and every session is built to send attendees home with something they can implement Monday morning.

In fact, AI is woven throughout the INBOUND experience. The opening HubSpot Spotlight on Wednesday digs into HubSpot’s latest perspective on what this AI transformation will look like. New this year, the Creators Corner offers hands-on opportunities to engage with top content creators, watch them in action, and explore live demos and pop-up showcases.
Over at HubSpot HQ, attendees can drop in for a lightning-fast demo of the newest Breeze features or dive deeper in a one-on-one conversation with a product expert.

Immersive learning at the core

This year, 1:1 and 1:few hands-on learning experiences will be featured at the Product Spotlight Demo Stage, the brand-new Tech Stack Showcase Stage, and various immersive co-learning and experiential labs. That includes bringing back and expanding last year’s popular HubSpot Academy Labs for hands-on HubSpot product learning. Braindate has been newly added this year to allow even more chances for attendee-led immersive learning, connection and discussion. “This was a big piece of feedback we acted on from 2024,” Dagher said. “Our attendees want more guided opportunities to meet, greet and learn from their peers through communities of practice and identity.”

The team is continuing to test live and immersive session formats, and crafting memorable “lightning in a bottle” moments through a unique combination of experts, attendees and innovative formats that you can’t find elsewhere, like the popular INBOUND Debates and Teardowns, plus live Builder sessions. “Our programming philosophy is that the future of live event content is all about attendee input and engagement,” she said. “We strive to take a bottom-up approach for how we program. We always start with ‘What does our customer want to learn and hear?’ and build from there.”

Real-world knowledge for real-world results

The agenda is now live, featuring a carefully curated lineup of thought leaders and experts who are also active practitioners — giving attendees the chance to learn from people who not only impart knowledge but also understand what it takes to get the job done, Tooley says. “Simply put, we only work with people who’ve actually done the work, not just talked about it,” she explains.

Take Jay Schwedelson, who doesn’t just teach email marketing; he runs 15,000+-person virtual events multiple times a year and knows how to drive looks and engagement. Brandon Greer from HubSpot’s investment team is leading a “how-to” conversation this year on getting a business venture capital–ready, with a panel full of active VCs and thought leaders: Veronica Juarez (managing partner of Dahlia VC), Christian McKenzie (director, Lofty Ventures) and Genever Oppong (CEO and co-founder, Black Women Investors Network).

A playbook for year-round success

“We want every attendee to walk away with one thing: conviction,” Dagher says. “Conviction that they can grow — faster, smarter, more human. It starts with mindset. AI isn’t replacing us — it’s unlocking us. Let’s shift from fear to focus. From overwhelm to opportunity. They should leave INBOUND thinking: I’ve got the tools. I’ve got the people. I’ve got the playbook. Now it’s time to execute.” But more importantly, she says, “We want them to feel expanded. Energized. Like they just stepped into the next era of growth and they’re excited for what’s ahead.”

Don’t miss your chance to attend INBOUND.


Announcing the 2025 finalists for VentureBeat Women in AI Awards

With VentureBeat’s flagship event, VB Transform, just around the corner, we are excited to announce the finalists for the 7th annual Women in AI Awards. The winners will be announced during a special program on the main stage of Transform on Wednesday, June 25 at 4 p.m. PT.

VB Transform is a premier two-day event on June 24-25 at Fort Mason in San Francisco. Industry experts and peers will gather to provide comprehensive insights and best practices on what’s actually working in enterprise AI — from copilots to agents. Attendees will have numerous opportunities to forge meaningful connections and expand their networks.

As part of this event, VentureBeat will honor women leaders and changemakers in AI during the in-person sessions. The award categories include Responsibility and Ethics of AI, AI Entrepreneurship, AI Research, AI Mentorship and Rising Star. The public submitted the nominees, and a VentureBeat committee will choose the winners. The selection criteria include the nominees’ commitment to the industry, efforts to increase inclusivity in the field and their positive influence on the community.

Investing in women in AI will create more valuable AI that better serves audiences and boosts ROI for companies. The impact has never been more clear, or more important, and we’re proud to recognize leaders in AI who are making an impact.

AI Entrepreneurship

This award will honor a woman who has started companies showing great promise in AI. Consideration will be given to factors such as business traction, the technology solution and impact in the AI space.

Elnaz Sarraf, CEO and founder at ROYBI

Natalya Lopareva, CEO and founder at Algorized

Ouafae Karim, engineer and funder at AFRICA-EO-SERVICES (AFEOS)

Francessca Vasquez, vice president of professional services and generative AI innovation center at AWS

Val Vacante, SVP of solutions innovation at Dentsu

AI Mentorship

This award will honor a female leader who has helped mentor other women in the field of AI, providing guidance and support and/or encouraging more women to enter the field of AI.

Sandra Brown, deputy general counsel at SoftBank Robotics America, Inc.

Suruchi Shah, engineering manager, model serving team at LinkedIn

Parul Bhandari, global head of telco, media and gaming partner strategy at Microsoft

Reut Lazo, founder at Women X AI

Nicole Carignan, SVP of security and AI strategy, field CISO, at Darktrace

AI Research

This award will honor a woman who has made a significant impact in an area of AI research, helping accelerate progress either within her organization or as part of academic research.

Lindsay Richman, CEO at Innerverse AI

Payel Das, principal research scientist and manager at IBM Research – T.J. Watson Research Center

Suma Kumaraswamy, senior director, head of OMB Digital Products Innovation at First Tech Federal Credit Union

Ann Irvine, chief data and analytics officer at Resilience

Hannaneh Hajishirzi, senior director of NLP at the Allen Institute for Artificial Intelligence (AI2) and Torode Family Associate Professor in the Allen School of Computer Science and Engineering at the University of Washington

Responsibility and Ethics of AI

This award will honor a woman who demonstrates exemplary leadership and progress in the growing hot topic of responsible AI.

Rising Star

This award will honor a woman in the beginning stage of her AI career who has demonstrated exemplary leadership traits.
We’d like to congratulate all of the women who were nominated for a Women in AI Award. Thanks to everyone for their nominations and for contributing to the growing awareness of women who are making a significant difference in AI.


75 million deepfakes blocked: Persona leads the corporate fight against hiring fraud

As remote work has become the norm, a shadowy threat has emerged in corporate hiring departments: sophisticated AI-powered fake candidates who can pass video interviews, submit convincing resumes and even fool human resources professionals into offering them jobs. Now, companies are racing to deploy advanced identity verification technologies to combat what security experts describe as an escalating crisis of candidate fraud, driven largely by generative AI tools and coordinated efforts by foreign actors, including North Korean state-sponsored groups seeking to infiltrate American businesses.

San Francisco-based Persona, a leading identity verification platform, announced Tuesday a major expansion of its workforce screening capabilities, introducing new tools specifically designed to detect AI-generated personas and deepfake attacks during the hiring process. The enhanced solution integrates directly with major enterprise platforms, including Okta’s Workforce Identity Cloud and Cisco Duo, allowing organizations to verify candidate identities in real time.

“In today’s environment, ensuring the person behind the screen is who they claim to be is more important than ever,” said Rick Song, CEO and co-founder of Persona, in an exclusive interview with VentureBeat. “With state-sponsored actors infiltrating enterprises and generative AI making impersonation easier than ever, our enhanced Workforce IDV solution gives organizations the confidence that every access attempt is tied to a real, verified individual.”

The timing of Persona’s announcement reflects growing urgency around what cybersecurity professionals call an “identity crisis” in remote hiring. According to an April 2025 Gartner report, one in four candidate profiles globally will be fake by 2028 — a staggering prediction that underscores how AI tools have lowered the barriers to creating convincing false identities.

75 million blocked deepfake attempts reveal massive scope of AI-powered hiring fraud

The threat extends far beyond individual bad actors. In 2024 alone, Persona blocked over 75 million AI-based face spoofing attempts across its platform, which serves major technology companies including OpenAI, Coursera, Instacart and Twilio. The company has observed a 50-fold increase in deepfake activity over recent years, with attackers deploying increasingly sophisticated techniques.

“The North Korean IT worker threat is real,” Song explained. “But it’s not just North Korea. A lot of foreign actors are all doing things like this right now in terms of finding ways to infiltrate organizations. The insider threat for businesses is higher than ever.”

Recent high-profile cases have highlighted the severity of the issue. In 2024, cybersecurity firm KnowBe4 inadvertently hired a North Korean IT worker who attempted to load malware onto company systems. Other Fortune 500 companies have reportedly fallen victim to similar schemes, in which foreign actors use fake identities to gain access to sensitive corporate systems and intellectual property.
The Department of Homeland Security has warned that such “deepfake identities” represent an increasing threat to national security, with malicious actors using AI-generated personas to “create believable, realistic videos, pictures, audio, and text of events which never happened.”

How three-layer detection technology fights back against sophisticated fake candidate schemes

Song’s approach to combating AI-generated fraud relies on what he calls a “multimodal” strategy that examines identity verification across three distinct layers: the input itself (photos, videos, documents), the environmental context (device characteristics, network signals, capture methods) and population-level patterns that might indicate coordinated attacks.

“There’s no silver bullet to really solving identity,” Song said. “You can’t look at it from a single methodology. AI can generate very convincing content if you’re looking purely at the submission level, but all the other parts of creating a convincing fake identity are still hard.” For example, while an AI system might create a photorealistic fake headshot, it is much more difficult to simultaneously spoof the device fingerprints, network characteristics and behavioral patterns that Persona’s systems monitor. “If your geolocation is off, then the time zones are off, then your environmental signals are off,” Song explained. “All those things have to come into a single frame.”

The company’s detection algorithms currently outperform humans at identifying deepfakes, though Song acknowledges this is an arms race. “AI is getting better and better, improving faster than our ability to detect purely on the input level,” he said. “But we’re watching the progression and adapting our models accordingly.”

Enterprise customers deploy workforce identity verification in under an hour

The enhanced workforce verification solution can be deployed remarkably quickly, according to Song. Organizations already using Okta or Cisco’s identity management platforms can integrate Persona’s screening tools in as little as 30 minutes to an hour. “The integration is incredibly fast,” Song said, crediting Okta’s team for creating seamless connectivity.

For companies concerned about the user experience, Song emphasized that legitimate candidates typically complete verification in seconds. The system is designed to create “friction for bad users to prevent them from getting through” while maintaining a smooth experience for genuine applicants.

Major technology companies are already seeing results. OpenAI, which processes millions of user verifications monthly through Persona, achieves 99% automated screening with just 18 milliseconds of latency. The AI company uses Persona’s sanctions screening capabilities to prevent bad actors from accessing its powerful language models while maintaining a frictionless signup experience for legitimate users.

Identity verification market pivots from background checks to proving candidates exist

The rapid adoption of AI-powered hiring fraud has created a new market category for identity verification specifically tailored to workforce management. Traditional background check companies, which verify information about candidates after assuming their identity is genuine, are not equipped to handle the fundamental question of whether a candidate is who they claim to be. “Background checks assume that you are who you say you are, but then verify the information you’re providing,” Song explained.
“The new problem is: are you who you say you are? And that’s very different from what background check companies traditionally solve.” The shift toward remote work has eliminated many traditional identity verification mechanisms. “You never had a problem knowing that if someone shows up in person, you know with
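Persona has not published its scoring internals, but the three-layer strategy Song outlines can be illustrated with a toy risk scorer that only accepts a candidate when input, environment and population signals agree. All weights, thresholds and field names below are invented for the example.

```python
def input_score(selfie: dict) -> float:
    """Deepfake/liveness check on submitted media (0 = fake, 1 = genuine)."""
    return selfie.get("liveness", 0.0)

def environment_score(ctx: dict) -> float:
    """Do device, network and geolocation signals agree with each other?"""
    consistent = ctx["gps_timezone"] == ctx["device_timezone"]
    return 1.0 if consistent and not ctx["datacenter_ip"] else 0.2

def population_score(history: dict) -> float:
    """Penalize devices or face templates seen across many 'different' applicants."""
    return 1.0 / (1 + history["reuse_count"])

def verify(selfie: dict, ctx: dict, history: dict) -> bool:
    """Weighted blend of the three layers; a single weak layer drags the score down."""
    score = (0.4 * input_score(selfie)
             + 0.3 * environment_score(ctx)
             + 0.3 * population_score(history))
    return score >= 0.7

print(verify({"liveness": 0.9},
             {"gps_timezone": "PST", "device_timezone": "PST",
              "datacenter_ip": False},
             {"reuse_count": 0}))  # True: all three layers agree
```

The design point matches Song’s argument: a photorealistic fake can max out the first term, but it cannot by itself repair a mismatched time zone or a device fingerprint shared across dozens of applications.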


Forget about AI costs: Google just changed the game with open-source Gemini CLI that will be free for most developers

For power users and many developers, the command line is the foundational interface for controlling a system and its applications. Also sometimes referred to as a terminal, the command line interface (CLI) is how users issue commands and build applications as an alternative, or a complement, to an integrated development environment (IDE). While it might seem almost anachronistic that a text-only, keyboard-driven interface (the CLI doesn’t even use a mouse) can be modern, it remains a mainstay for developers around the world. In the modern era of generative AI, it’s becoming more powerful too.

Today, Google announced its open-source Gemini CLI, which brings natural language command execution directly to developer terminals. Beyond natural language, it brings the power of Google’s Gemini 2.5 Pro — and it does so mostly for free. The free tier provides 60 model requests per minute and 1,000 requests per day at no charge, limits that Google deliberately set above typical developer usage patterns. Google first measured its own developers’ usage, then doubled that number to set the 1,000-request limit.

“To be very clear, for the vast majority of developers, Gemini CLI will be completely free of charge,” Ryan J. Salva, senior director for product management at Google, said in response to a question from VentureBeat during a press briefing. “We do not want you having to watch that token meter like it’s a taxi meter and holding back on your creativity.”

How Google Gemini CLI disrupts the enterprise AI market

Gemini CLI is far from the first or only AI tool for the command line. OpenAI Codex has a CLI version, as does Anthropic with Claude Code. Google Gemini CLI, however, differs from its two primary commercial rivals in that the tool is open source under the Apache 2.0 license. Then, of course, there is the cost. While Gemini CLI is mostly free, OpenAI’s and Anthropic’s tools are not.

In response to another question from VentureBeat, Google senior staff software engineer Taylor Mullen said he expects Gemini CLI to be more widely used simply because it is free. He noted that many users will not use OpenAI Codex or Claude Code for just any task, as those carry a cost. “Being able to amplify literally anything and everything means it’s woven into the fabric of so much more of your workflow,” Mullen said.

Extensibility through Model Context Protocol and custom extensions

Another key differentiator for Gemini CLI lies in its extensibility architecture, built around the emerging Model Context Protocol (MCP) standard. This approach lets developers connect external services and add new capabilities, positioning the tool as a platform rather than a single-purpose application.

During the briefing, Google demonstrated this extensibility through a pre-recorded video showing Gemini CLI integrated with Google’s creative AI tools. An agent creating a cat video set in Australia first generated images using Imagen APIs, then wove them into an animated video using Veo technology.

The extensibility model includes three layers: built-in MCP server support, bundled extensions that combine MCP servers with configuration files, and custom Gemini.md files for project-specific customization. This architecture allows individual developers to tailor their experience while enabling teams to standardize workflows across projects.
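As a concrete illustration of those layers, the snippet below writes a project-level MCP server entry and a context file. It assumes Gemini CLI’s documented convention of a .gemini/settings.json file containing an mcpServers map, plus a GEMINI.md context file (which the article refers to as Gemini.md); check the current gemini-cli documentation before relying on the exact schema, and note that the ticket-server script named here is hypothetical.

```python
import json
from pathlib import Path

project = Path(".")

# Assumed settings schema: an "mcpServers" map of server name -> launch spec.
settings = {
    "mcpServers": {
        # Hypothetical MCP server exposing an internal ticket system.
        "tickets": {
            "command": "python",
            "args": ["tools/ticket_mcp_server.py"],  # hypothetical script
        }
    }
}

(project / ".gemini").mkdir(exist_ok=True)
(project / ".gemini" / "settings.json").write_text(json.dumps(settings, indent=2))

# Project-specific guidance the agent loads as context on each run.
(project / "GEMINI.md").write_text(
    "# Project conventions\n"
    "- Use pytest for all new tests.\n"
    "- Never modify files under vendor/.\n"
)
```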
Where Google starts charging: Enterprise features and scale

While individual developers enjoy generous free access, Google’s monetization strategy becomes clear for enterprise use cases. The company maintains a clear delineation between free individual use and paid enterprise features. Accessing Gemini CLI requires only a Google login; no API key or credit card on file is needed. While the free tier is very generous, there can be costs for enterprise users. Salva noted that if an organization wants to run multiple Gemini CLI agents in parallel, or if there are specific policy, governance or data residency requirements, that is where a paid API key comes in. The key could provide access to Google Vertex AI, which offers commercial access to a series of models including, but not limited to, Gemini 2.5 Pro.

Technical architecture and security model

Gemini CLI operates as a local agent with built-in security measures that address common concerns about AI command execution. The system requires explicit user confirmation for each command, with options to “allow once,” “always allow” or deny specific operations.

The tool’s security model includes multiple layers of protection. Users can rely on native macOS Seatbelt support for sandboxing, run the agent in Docker or Podman containers, and route all network traffic through proxies for inspection. The open-source nature under Apache 2.0 licensing allows complete code auditing. “You have complete transparency into it,” Salva noted. “The tool only has access to the information that you explicitly provide in a prompt or a reference file path, and you decide what context to share with the model on a prompt-by-prompt basis.”

While Gemini CLI runs as a local agent, it’s important to note that it doesn’t currently run the models locally. That is, the Gemini 2.5 Pro model is accessed from the cloud, and Google is not providing support for running a local model. Mullen noted that although there is a subset of tasks that could probably be done with a local model, Google is not shipping local model support today.

For enterprises looking to lead in AI, the extremely generous free tier for Gemini CLI will be an option worth considering for some use cases. To be clear, it’s not a full enterprise system, but it is a foundation on which enterprise applications and agentic AI systems can be developed. For individual developers within enterprises, it represents a no-barrier entry point for AI access. The open-source architecture addresses common enterprise security concerns by enabling complete code auditing and on-premises deployment options. Organizations can evaluate production-grade AI capabilities without vendor lock-in risks or complex procurement cycles. “It doesn’t
