VentureBeat

The show’s not over: 2024 sees big boost to AI investment

Global AI deal volume reached 1,245 in Q3 2024, a level not seen since Q1 2022, reflecting how confident and resilient investors remain about AI. With 24% year-over-year growth, global AI deal activity far outpaced the broader investment market, which saw deal volume decline 10% quarter-over-quarter (QoQ). CB Insights notes in its State of AI Q3'24 Report that despite a broader venture slowdown, investor resilience and confidence in AI remain strong.

CB Insights says that "while AI deals in Q3'24 included massive $1B+ rounds to defense tech provider Anduril and AI lab Safe Superintelligence, global AI funding actually dropped by 29% QoQ." A 77% QoQ decline in funding from $1B+ AI rounds contributed to that drop.

The average AI deal size increased 28% this year, climbing from $18.4M in 2023 to $23.5M. The gain is largely attributable to five $1B+ rounds this year: xAI's $6B Series B at a $24B valuation, Anthropic's $2.8B Series D at an $18.4B valuation, Anduril's $1.5B Series F at a $14B valuation, G42's $1.5B investment from Microsoft and CoreWeave's $1.1B Series C at a $19B valuation. CB Insights notes that these deals aren't solely responsible for the higher average: the median AI deal size is up 9% in 2024 so far.
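As a quick consistency check on the reported growth figure (our arithmetic, not a number from the report):

$$\frac{\$23.5\text{M} - \$18.4\text{M}}{\$18.4\text{M}} \approx 0.277 \approx 28\%$$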
U.S.-based AI startups attracted $11.4B across 566 deals in Q3 2024, accounting for over two-thirds of global AI funding and 45% of global AI deals. European AI startups attracted $2.8B from 279 deals, and Asian AI startups received $2.1B from 316 deals.

AI deal volume reaches 1,245 in Q3 2024, the highest level since Q1 2022, amid resilient investor interest. Source: CB Insights, State of AI Q3 2024 Report.

Generative AI and industry-specific AI lead investments

The anticipated productivity gains and cost reductions that generative AI and industry-specific AI are delivering are core to investors' confidence and are driving more AI deals. Enterprises have already learned how to prioritize gen AI and broader AI investments that deliver measurable value at scale. That remains one of the primary factors steering venture money toward AI over other opportunities.

Gartner's 2024 Generative AI Planning Survey reflects how impatient senior management is for results, corroborating CB Insights' findings. One key finding is that senior executives expect, and are driving, gen AI projects to boost productivity by 22.6%, outpacing expected revenue growth of 15.8% and cost savings of 15.2%. While cost efficiency and revenue gains matter, Gartner predicts the most immediate and substantial impact will be on operational efficiency, and that enterprises that prioritize gen AI integration will see significant increases in both workflow optimization and financial performance.

Projected impact of generative AI on productivity, revenue, and cost savings over the next 12–18 months. Source: Gartner Generative AI 2024 Planning Survey.

CB Insights provides a comprehensive analysis of the deals completed in Q3, reflecting the growing dominance of gen AI and industry-specific AI investments. The following deals support this finding:

Gen AI investments in Q3:
- Safe Superintelligence raised a massive $1 billion Series A round, indicating continued strong interest in large language models (LLMs) and general-purpose AI systems.
- Baichuan AI, a Chinese generative AI company, secured $688 million in Series A funding.
- Moonshot AI, another gen AI startup, raised $300 million in a Series B round.
- Codeium, a code generation AI company, became a unicorn with a $150 million Series C round.

Industry-specific AI investments in Q3:
- Anduril, an AI-powered defense technology company, raised $1.5 billion in a Series F round, highlighting interest in AI for national security applications.
- ArsenalBio secured $325 million for AI in biotechnology and drug discovery.
- Helsing raised $488 million for AI applications in defense and security.
- Altana AI received $200 million for AI in supply chain management and logistics.
- Flo Health raised $200 million for AI-powered women's health applications.

New AI unicorns more than doubled in Q3

Gen AI continues to be one of the primary catalysts driving the formation and growth of unicorns (private companies reaching $1B+ valuations). CB Insights found that the number of new AI unicorns more than doubled QoQ, reaching 13 in the latest quarter, or 54% of the broader venture total for Q3 2024. More than half of the AI unicorns minted last quarter are gen AI startups targeting a broad spectrum of areas, including AI for 3D environments (World Labs), code generation (Codeium) and legal workflow automation (Harvey). Among new gen AI unicorns in Q3'24, Safe Superintelligence, co-founded by OpenAI co-founder Ilya Sutskever, received the most sizable valuation: the AI lab was valued at $5B after raising a $1B Series A round in September 2024.

In Q3 2024, AI unicorns account for 54% of all new unicorn births, with 13 of 24 new billion-dollar companies emerging in the AI sector. Source: CB Insights, State of AI: Q3'24.

Gen AI's enterprise challenges are just beginning

The potential of gen AI and industry-specific AI to improve productivity, drive new revenue streams and reduce costs keeps investors resilient and focused on results. From the many organizations landing additional late-stage funding to startups and new unicorns, the challenge will be gaining adoption at scale, solidly enough to sustain recurring revenue while reducing costs. With CIOs and CISOs looking to reduce the tool and app sprawl they already have, the most successful startups will have to find new ways to embed and integrate gen AI into existing apps and workflows. That will be challenging, as every enterprise has its own data management challenges, siloed legacy systems and a need to update its data accuracy, quality and security strategies. Startups and unicorns that can take on all these challenges and improve their customers' operations at the data level first are the most likely to deliver the results investors expect. source

The show’s not over: 2024 sees big boost to AI investment Read More »

5 ways to overcome the barriers of AI infrastructure deployments

Presented by Penguin Solutions

Today, organizations are under intense pressure to leverage AI as a competitive advantage, but we're still in the early stages. Only about 40% of large-scale enterprises have actively deployed AI in their business, while barriers keep another 40% stuck in the exploration and experimentation phases. Although there is massive interest, 38% of IT professionals admit that a lack of technology infrastructure is a major barrier to AI success.

Why are so many organizations falling behind in the race to implement AI? The Harvard Business Review estimates the failure rate is as high as 80% — about twice the rate of other corporate IT project failures. One of the top barriers preventing successful AI deployments is limited AI skills and expertise. In fact, 9 out of 10 organizations suffer from a shortage of IT skills, which exposes execution gaps in AI system design, deployment and ongoing cluster management. Without the necessary insight, software tools and expertise, 83% of organizations admit they cannot fully utilize their GPU and AI hardware, even after the system is deployed.

Managing AI infrastructure is a whole new ballgame that requires a significantly different approach than traditional IT infrastructure, says Jonathan Ha, senior director of product management for AI systems at Penguin Solutions. "Tuning the cost, performance, data and operational model for a specific use case and workload starts with a solid AI infrastructure, managed intelligently," Ha says. "You cannot and will not move from proof of concept to production at scale until you've established that foundation."

Here's a look at the five most common challenges when building out your AI architecture, and how enterprises can approach and overcome them.

Challenge #1: IT organizations are not AI-ready

IT has decades' worth of tools, processes and experience monitoring and managing general-purpose and high-performance computing (HPC) workloads at the CPU level. However, today's AI infrastructure requires significant enhancements in monitoring and management capabilities. With the addition of new technologies like high-powered GPUs, high-performance interconnects, low-latency network fabrics and even liquid-cooling infrastructure, IT organizations are challenged to build the expertise to monitor and manage these AI clusters, especially at scale. Designing the compute and storage cluster architectures, building the network topologies and then tuning it all for maximum performance on your AI workloads takes specialized skills, experience and expertise.

The solution: Invest in AI infrastructure expertise

Many organizations approach this challenge with a false sense of confidence, believing their extensive IT infrastructure expertise equips them with the know-how to succeed. Unfortunately, that often means they struggle to get their infrastructure up and running, or to achieve the results they expect. The success of an AI strategy hinges on the very first decisions made: use cases, project design, hardware needs, costs and more. That takes practical, up-to-the-minute experience in designing, deploying and managing today's AI infrastructure. Unfortunately, the explosion of AI has far outpaced the talent pool, making that expertise hard to find. In such a tight market, it is critical to get the right talent in place, whether by training existing staff, hiring externally or selecting the right AI infrastructure partner.
Challenge #2: Building for today's and tomorrow's needs

Even before designing a system, organizations need to map out their AI use cases, models and data sets to scope the scale of the required AI infrastructure. It's important to consider factors such as model parameters, users supported and performance needs, while also anticipating how those needs will grow and change as AI adoption expands. At the same time, organizations must consider rapidly expanding data demands and a constantly evolving technology landscape. How can an organization stay agile, scale easily and deliver the expected performance, security and stability while managing profoundly complex AI architecture?

The solution: Plan from the ground up

First, an organization should develop a comprehensive AI roadmap that identifies the resources required at each stage of the AI journey and the timeline for their deployment. For example, starting the design with the data center is crucial, as its power and cooling capabilities will determine the feasibility of the AI cluster and its future scalability. Second comes selecting and integrating validated, modular architectures that allow for easy reconfiguration to meet changing compute demands while providing high availability and performance, even as workloads and use cases change over time.

Challenge #3: Data management and governance just got even more important

AI depends on the efficient management of large datasets across the entire pipeline. Data security can become a challenge, and ensuring that data is clean, accurate and unbiased, and that it complies with internal and external regulations, is an ongoing risk and a continuous responsibility. "Every piece of data becomes valuable in an AI initiative, but it is also more vulnerable once it's released from an organization's silos. Plus, bias often creeps in, introduced by tagging and labeling when training an AI model," Ha says. "Establishing the appropriate processes, controls and governance to use data in a safe and equitable manner is something that must be a top priority."

The solution: Putting guardrails in place

Leaders must invest time in understanding the potential pitfalls, including leaks, misuse and miscategorization of data, as well as biases, before touching the data and beginning the AI initiative. They should then establish processes and tools to safeguard the data in all locations. It is also important to map out which roles get what kind of access, and to be vigilant in tracking and monitoring that activity.

Challenge #4: Managing AI infrastructure requires a new approach

Misconfigured networks, node failures or the loss of GPUs can disrupt operations, causing delays in new product launches or hindering the discovery of critical insights. Addressing these challenges is difficult due to the complexity of the architecture and the need for skilled talent. Expertise is required for optimal cluster design and intelligent cluster management. Additionally, continuous tuning and refinement of your model throughout the pipeline is essential for success.

The solution: Embracing new

5 ways to overcome the barriers of AI infrastructure deployments Read More »

Microsoft’s agentic AI tool OmniParser rockets up the open source charts

Microsoft's OmniParser is on to something. The new open source model, which converts screenshots into a format that's easier for AI agents to understand, was released by Redmond earlier this month, but just this week became the number one trending model (as determined by recent downloads) on AI code repository Hugging Face. It's also the first agent-related model to do so, according to a post on X by Hugging Face co-founder and CEO Clem Delangue.

But what exactly is OmniParser, and why is it suddenly receiving so much attention? At its core, OmniParser is an open-source generative AI model designed to help large language models (LLMs), particularly vision-enabled ones like GPT-4V, better understand and interact with graphical user interfaces (GUIs). Released relatively quietly by Microsoft, OmniParser could be a crucial step toward enabling generative tools to navigate and understand screen-based environments. Let's break down how this technology works and why it's gaining traction so quickly.

What is OmniParser?

OmniParser is essentially a powerful new tool designed to parse screenshots into structured elements that a vision-language model (VLM) can understand and act upon. As LLMs become more integrated into daily workflows, Microsoft recognized the need for AI to operate seamlessly across varied GUIs. The OmniParser project aims to empower AI agents to see and understand screen layouts, extracting vital information such as text, buttons and icons, and transforming it into structured data. This enables models like GPT-4V to make sense of these interfaces and act autonomously on the user's behalf, for tasks that range from filling out online forms to clicking on certain parts of the screen.

While the concept of GUI interaction for AI isn't entirely new, the efficiency and depth of OmniParser's capabilities stand out. Previous models often struggled with screen navigation, particularly in identifying specific clickable elements and understanding their semantic value within a broader task. Microsoft's approach uses a combination of advanced object detection and OCR (optical character recognition) to overcome these hurdles, resulting in a more reliable and effective parsing system.

The technology behind OmniParser

OmniParser's strength lies in its use of different AI models, each with a specific role:

- YOLOv8: Detects interactable elements like buttons and links by providing bounding boxes and coordinates. It essentially identifies what parts of the screen can be interacted with.
- BLIP-2: Analyzes the detected elements to determine their purpose. For instance, it can identify whether an icon is a "submit" button or a "navigation" link, providing crucial context.
- GPT-4V: Uses the data from YOLOv8 and BLIP-2 to make decisions and perform tasks like clicking on buttons or filling out forms, handling the reasoning and decision-making needed to interact effectively.

Additionally, an OCR module extracts text from the screen, which helps in understanding labels and other context around GUI elements. By combining detection, text extraction and semantic analysis, OmniParser offers a plug-and-play solution that works not only with GPT-4V but also with other vision models, increasing its versatility.
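Conceptually, the stages compose like the sketch below. This is a minimal illustration of the described architecture, not Microsoft's actual code: the checkpoint names and glue logic are our assumptions, while the ultralytics, transformers and pytesseract calls follow those libraries' standard APIs.

```python
import pytesseract
from PIL import Image
from ultralytics import YOLO
from transformers import Blip2Processor, Blip2ForConditionalGeneration

# Assumed checkpoint names; OmniParser ships its own fine-tuned weights.
detector = YOLO("icon_detect.pt")
processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
captioner = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-opt-2.7b")

def parse_screenshot(path: str) -> list[dict]:
    img = Image.open(path).convert("RGB")
    elements = []
    # Stage 1 - YOLO-style detection: bounding boxes for interactable elements
    for box in detector(img)[0].boxes:
        x1, y1, x2, y2 = box.xyxy[0].tolist()
        crop = img.crop((x1, y1, x2, y2))
        # Stage 2 - BLIP-2 captioning: describe the element's likely purpose
        inputs = processor(images=crop, return_tensors="pt")
        caption = processor.decode(
            captioner.generate(**inputs, max_new_tokens=20)[0],
            skip_special_tokens=True,
        )
        # Stage 3 - OCR: recover any label text inside the element
        text = pytesseract.image_to_string(crop).strip()
        elements.append({
            "bbox": [x1, y1, x2, y2],
            "purpose": caption,
            "text": text,
            "confidence": float(box.conf[0]),
        })
    # This structured list, not raw pixels, is what a VLM like GPT-4V reasons over
    return elements
```

Separating detection and captioning from the downstream reasoning is what lets the same parsed output feed different vision models, which is the versatility described above.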
Open-source flexibility

OmniParser's open-source approach is a key factor in its popularity. It works with a range of vision-language models, including GPT-4V, Phi-3.5-V and Llama-3.2-V, making it useful to developers regardless of which advanced foundation models they have access to. OmniParser's presence on Hugging Face has also made it accessible to a wide audience, inviting experimentation and improvement. This community-driven development is helping OmniParser evolve rapidly. Microsoft Partner Research Manager Ahmed Awadallah noted that open collaboration is key to building capable AI agents, and OmniParser is part of that vision.

The race to dominate AI screen interaction

The release of OmniParser is part of a broader competition among tech giants to dominate AI screen interaction. Anthropic recently released a similar, but closed-source, capability called "Computer Use" as part of its Claude 3.5 update, which allows AI to control computers by interpreting screen content. Apple has also jumped into the fray with Ferret-UI, aimed at mobile UIs, enabling its AI to understand and interact with elements like widgets and icons.

What differentiates OmniParser from these alternatives is its commitment to generalizability and adaptability across platforms and GUIs. OmniParser isn't limited to specific environments, such as only web browsers or mobile apps; it aims to become a tool for any vision-enabled LLM to interact with a wide range of digital interfaces, from desktops to embedded screens.

Challenges and the road ahead

Despite its strengths, OmniParser is not without limitations. One ongoing challenge is the accurate detection of repeated icons, which often appear in similar contexts but serve different purposes, for instance, multiple "Submit" buttons on different forms within the same page. According to Microsoft's documentation, current models still struggle to differentiate between these repeated elements effectively, leading to potential missteps in action prediction. Moreover, the OCR component's bounding-box precision can sometimes be off, particularly with overlapping text, which can result in incorrect click predictions. These challenges highlight the complexities inherent in designing AI agents capable of accurately interacting with diverse and intricate screen environments.

However, the AI community is optimistic that these issues can be resolved through ongoing improvements, particularly given OmniParser's open-source availability. With more developers contributing to fine-tuning these components and sharing their insights, the model's capabilities are likely to evolve rapidly. source

Microsoft’s agentic AI tool OmniParser rockets up the open source charts Read More »

Unleash the power of data, AI agents and humans to transform CX

Presented by Kustomer

Customer service is undergoing a rapid transformation, driven by evolving customer expectations and technological advancements. While AI and automation have been hailed as game-changers, the full promise of AI has yet to be universally realized. It's time to change that and put an end to bad customer service once and for all. The stakes are high: globally, poor customer experiences cost organizations $3.7 trillion annually — an increase of $600 billion from last year. According to our 2024 AI and Customer Service Index, only 50% of people believe AI has improved service in recent years.

As we look ahead to 2025, it's clear that customer service is still broken and traditional approaches aren't delivering. New answers are needed, and those answers lie in the powerful fusion of data, AI and humans. At Kustomer, we've spent nearly a decade reinventing customer service, and we've learned that the future comes down to one thing: empowering human agents with AI and real-time data. With over two billion consumer interactions under our belt, it's clear this synergy is the key to transforming CX from a cost center into a growth engine. When businesses can leverage the combined power of humans, AI and data, they unlock scalable, adaptable and delightful customer experiences at every touchpoint.

Data + AI + humans: A unified, data-driven approach for next-level CX

In today's rapidly evolving customer service landscape, the key to delivering outstanding experiences lies in the seamless integration of data, AI and humans. When these three forces come together, businesses can provide proactive, personalized service at scale, transforming customer interactions from reactive problem-solving to strategic relationship-building.

Why data + AI + humans? The equation is simple but powerful: data provides the foundation, AI amplifies efficiency and humans bring empathy and insight. Together, they create a customer service experience that's both intelligent and human-centric, one that doesn't just meet expectations but exceeds them. Research supports this shift: 76% of consumers expect proactive service, while 71% demand personalized interactions. Even more telling, 76% of customers will switch providers if these expectations aren't met. To stay ahead, businesses must shift from traditional, reactive models to a more proactive, data-driven approach, and that's where this winning combination comes into play.

The role of data: Anticipating needs, not just reacting

At the heart of proactive service is data. Most platforms wait for a ticket to gather information, reacting only after a problem arises. With Kustomer's approach, data is collected and analyzed in real time, right from the moment a customer places an order. This allows businesses to anticipate needs, solve issues before they even arise and deliver a seamless experience. Our CRM pulls all relevant data — purchase history, preferences, behavior patterns — into one unified timeline, creating a 360-degree view of each customer. This comprehensive view ensures that both AI and human agents have the context they need to make informed, thoughtful decisions.

AI: Enhancing service, not replacing humans

Now, let's talk about AI, specifically AI agents, and how they fit into this equation. Unlike platforms that bolt on AI as an afterthought, Kustomer's AI agents are fully integrated into the workflow, designed to work alongside human agents rather than replace them.
Think of AI agents as the ultimate customer service sidekick: S.M.A.R.T. — Specialized, Multi-Channel, Advanced in reasoning, Responsive and Team-oriented. These agents aren't just glorified chatbots: they're capable of handling complex tasks, understanding customer needs and making real-time decisions based on data. Whether they're fielding routine inquiries or solving more advanced problems, AI agents free up human agents to focus on what they do best: relationship-building and complex problem-solving.

For example, imagine you're trying to reschedule a flight. A chatbot might only offer a generic response: "Visit the airline's website to change your flight." But Kustomer's AI agent does more. It pulls your travel history, checks availability and proactively suggests the best options based on your preferences; then, if things get tricky, it seamlessly hands you off to a human agent who knows the full context. Why it's better: the AI agent isn't just answering questions; it's solving problems, anticipating needs and working alongside humans to deliver a personalized, efficient experience.

- Specialized: Unlike competitors that offer only one generic AI, we offer multiple AI agents tailored for specific tasks.
- Multi-channel: Our AI agents operate seamlessly across SMS, email, voice and WhatsApp, delivering a consistent experience across all platforms.
- Advanced reasoning: Powered by generative AI, our agents provide smart, accurate answers in real time.
- Responsive: They handle even the most complex conversations swiftly.
- Teamwork: AI agents work side by side with human agents, blending automation with human empathy.

By automating routine tasks and harnessing real-time data, AI agents can proactively solve problems, often before customers are even aware of them. This allows humans to focus on delivering the kind of high-value service that only human empathy and insight can provide.

Humans: The empathy engine

While AI handles efficiency, humans remain the core of great customer service. No matter how sophisticated AI becomes, there's no substitute for the emotional intelligence and problem-solving skills that human agents bring to the table. With AI managing repetitive tasks and providing data-driven insights, human agents are empowered to do what they do best: deliver personalized, empathetic service that builds lasting relationships. This synergy — data informing AI, AI empowering humans and humans elevating the experience — creates a level of customer service that's proactive, personalized and strategic.

Breaking free from legacy pricing models

But delivering exceptional service isn't just about the technology; it's also about how companies pay for it. As customer service evolves, so must the pricing models. Traditional, seat-based pricing has long restricted companies from scaling their customer service operations effectively. According to our 2024 State of Pricing in Customer Service report, 93.5% of companies are still tied to seat-based pricing, but many are eager for change. The reason? Managing multiple seat types — whether full-time, part-time, admin

Unleash the power of data, AI agents and humans to transform CX Read More »

Google just gave its AI access to Search, hours before OpenAI launched ChatGPT Search

Google launched real-time search capabilities for its Gemini AI platform on Thursday, enabling its language models to access current information from Google Search. The new feature, called "Grounding with Google Search," targets developers building AI applications, distinguishing it from OpenAI's consumer-focused ChatGPT Search service launched the same day. "We're focused on putting search-augmented responses into developer workflows," said Logan Kilpatrick, a product leader at Google, in an exclusive interview with VentureBeat. "We're leveraging what Google does uniquely well — making the world's information accessible through search."

"Say hello to Grounding with Google Search, available in the Gemini API + Google AI Studio! You can now access real time, fresh, up to date information from Google Search when building with Gemini by enabling the Grounding tool. https://t.co/oGVTOKHfM8" — Logan Kilpatrick (@OfficialLoganK), October 31, 2024

The system allows developers to supplement their AI applications with fresh search data, complete with citations and sources. The service costs $35 per 1,000 queries, reflecting the substantial computing requirements for real-time AI search. The technology uses a "dynamic retrieval" system that automatically determines when to tap into search results. Each query receives a score between 0 and 1 — questions about current events score high (0.97), while creative writing prompts score low (0.13). This helps manage both costs and response times while maintaining accuracy.
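For developers, enabling grounding amounts to passing the tool configuration when calling the Gemini API. The sketch below follows the shape of the Python SDK's grounding example at launch; treat the exact field names, the model id and the 0.7 threshold as assumptions to verify against current documentation.

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-pro-002")  # example model id

response = model.generate_content(
    "Who won yesterday's Formula 1 race?",
    tools={
        "google_search_retrieval": {
            "dynamic_retrieval_config": {
                "mode": "MODE_DYNAMIC",
                # Only queries whose dynamic-retrieval score clears this
                # threshold actually hit Search, which keeps the
                # $35-per-1,000-grounded-queries cost under control.
                "dynamic_threshold": 0.7,
            }
        }
    },
)

print(response.text)
# Citations and source links ride along in the grounding metadata
# (attribute name per the SDK docs at launch; verify against current versions).
print(response.candidates[0].grounding_metadata)
```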
Inside the $49 billion battle for the future of search

Google's move to integrate search with its AI platform comes at a critical moment. The company earned $49.4 billion from search advertising in Q3 2024, but faces growing pressure from AI-powered alternatives. Running these systems requires massive computing resources — OpenAI expects to spend $5 billion on computing costs this year alone. The integration also raises questions about publisher compensation. Both Google and OpenAI have secured licensing deals with major news organizations, though the financial terms remain private. Several publishers, including The New York Times, have filed lawsuits over AI systems using their content without permission.

Why OpenAI's new ChatGPT Search could change how we find information online

Hours after Google's announcement, OpenAI launched ChatGPT Search, taking a different approach by targeting consumers directly. While Google focuses on providing tools for developers to build search-enhanced AI applications, OpenAI's service offers end users a way to access current information about news, sports, stocks and weather through a conversational interface, notably without advertisements.

"The journey we're on is using Google Search in more creative ways, through multiple surfaces," said Shrestha Basu Mallick, Google's group product manager for the Gemini API, in an interview with VentureBeat. "You'll have it through AI Studio, the Gemini APIs, and it may eventually become native in the model itself."

This new phase of competition could reshape how people find information online. Rather than scrolling through pages of results, users may increasingly rely on AI systems to synthesize answers from multiple sources. However, questions remain about accuracy, publisher compensation and whether companies can build sustainable business models around these computing-intensive services.

The simultaneous launches suggest AI-powered search may evolve into a three-way race between Google, Microsoft (through its OpenAI partnership) and OpenAI itself. Google maintains advantages in search infrastructure and advertising revenue, while OpenAI has demonstrated skill in creating compelling consumer AI products. Microsoft, meanwhile, benefits from both through its multibillion-dollar OpenAI investment. source

Google just gave its AI access to Search, hours before OpenAI launched ChatGPT Search Read More »

Epic Games CEO Tim Sweeney’s path to the open metaverse is via enlightened self-interest

Epic Games CEO Tim Sweeney still believes that the path to the open metaverse will bring the entertainment, games and technology industries together in a bright future. But to get there, Sweeney believes the monopolistic platform owners need to embrace enlightened self-interest. He spoke about these topics with me in a recorded video fireside chat that we aired today at the sold-out GamesBeat Next 2024 event in San Francisco.

Sweeney has pressured major platforms like mobile leaders Google and Apple to give more favorable terms to game developers, as he doesn't believe the game industry can invest in the metaverse or its future so long as those companies are taking 30% of every mobile game transaction. Sweeney is just one of a number of speakers talking about the metaverse, the future of games and new technologies at our event. But he's the only one engaged in litigation challenging the tech and game platforms to play fair. And he's optimistic about the progress of regulatory and antitrust efforts.

"We're turning the tide," he said in our chat. "And when we began this journey, a lot of people in 2020 when we launched the Free Fortnite campaign and started challenging Apple and Google through really aggressive litigation, a lot of people were only starting to think then about the possibilities for what these devices could be like as open platforms. But now we're well under our way in transforming the world."

Epic Games is launching Fab, a unified digital content store for game devs. Sweeney said earlier at Unreal Fest that Epic is in better financial shape than it was a year ago, when it had to lay off a lot of staff, and that the company spent the last year rebuilding. Fortnite reached a peak last holiday season of 110 million monthly active users, and Sweeney said the Epic Games Store is seeing record success. But he notes that the recent attempts to return the store and Fortnite to the official stores of Apple and Google have met with limited success: the 15-plus questions posed by the platforms have stopped about 10 million of the 20 million users trying to reach the store from completing their attempts.

Sweeney hasn't spoken at one of our events since 2021 (though we have done interviews at places like GDC), so I asked him a wide range of questions in our fireside chat. We addressed his views on the path to the open metaverse, the growth of user-generated content in Fortnite, Microsoft's adoption of Unreal Engine 5 for Halo, the new Fab store for 3D assets from multiple engine providers, the legal attempts to liberate Fortnite around the world, the place for platforms (which he calls "monopoly rent collectors") in the open metaverse, Epic's $1.5 billion investment from Disney and the mission of building a Disney universe connected to Fortnite, the impact of AI on game development and the advances that Unreal Engine 6 could bring to gaming.

Sweeney foresees that Unreal Engine 6 could enable a shardless or nearly shardless infrastructure, meaning that thousands of players could join each other in a single battle royale match. This kind of technology is what the metaverse is all about. Unreal Engine 6 is likely to combine the advances of Unreal Engine 5 with the UGC-focused advances of Unreal Editor for Fortnite (UEFN).
He mentioned more than once the importance of Metcalfe's Law (named after networking pioneer Bob Metcalfe), which holds that a network's value grows with the square of the number of connected users, so every additional friend you can connect with makes a network or social experience disproportionately more valuable. And he noted how the metaverse-like cross-promotions of brands inside Fortnite could expand the audience not only for the game but also the reach of the brands themselves.

I appreciate that Sweeney doesn't shy away from controversy and answers questions without hesitation. And he continues to give guidance to developers so they can see the road ahead. Here's an edited transcript of our interview. You can also watch the talk in the embedded video.

Tim Sweeney is CEO of Epic Games.

Dean Takahashi, lead writer for GamesBeat: This is Dean Takahashi. I'm the lead writer for GamesBeat at VentureBeat. I'm very happy to be here with Tim Sweeney, the founder and CEO of Epic Games, and we're talking about the path to the open metaverse here, which is our favorite subject; we've been talking about it for like 15 years or so. It's interesting how it comes and goes, but definitely Neal Stephenson, who was just here, thought of this 30 years ago. It makes me feel a little old. Tim, the last time you and I sat down, we talked about the open metaverse in 2021 and how it will require enlightened self-interest from major companies. So what's changed over the past three years and what's stayed the same?

Sweeney: We've seen a lot of companies come together to contribute code and content into Fortnite and other metaverse ecosystem efforts. The Fortnite crossovers have been really telling about the willingness of companies to partner together in ways that they haven't done traditionally. The massive crossovers of Star Wars and Marvel characters into Fortnite. Both Sony and Microsoft putting content into Fortnite, including signature characters from Halo and God of War. All of the major music labels agreeing to put their music into Fortnite and rekindle interest in music through Fortnite's music modes and jam stage mode. All of the film and television industries found it incredibly valuable and mutually beneficial to do these crossovers because content comes into Fortnite. Fortnite players who might not have otherwise been aware of their stuff become really interested in it and watch the movie. The movie customers who might not have played Fortnite come into Fortnite, and everybody benefits from the uplift.

Takahashi: On that

Epic Games CEO Tim Sweeney’s path to the open metaverse is via enlightened self-interest Read More »

Patronus AI launches world’s first self-serve API to stop AI hallucinations

A customer service chatbot confidently describes a product that doesn't exist. A financial AI invents market data. A healthcare bot provides dangerous medical advice. These AI hallucinations, once dismissed as amusing quirks, have become million-dollar problems for companies rushing to deploy artificial intelligence. Today, Patronus AI, a San Francisco startup that recently secured $17 million in Series A funding, launched what it calls the first self-serve platform to detect and prevent AI failures in real time. Think of it as a sophisticated spell-checker for AI systems, catching errors before they reach users.

Inside the AI safety net: How it works

"Many companies are grappling with AI failures in production, facing issues like hallucinations, security vulnerabilities, and unpredictable behavior," said Anand Kannappan, Patronus AI's CEO, in an interview with VentureBeat. The stakes are high: recent research by the company found that leading AI models like GPT-4 reproduce copyrighted content 44% of the time when prompted, while even advanced models generate unsafe responses in over 20% of basic safety tests.

The timing couldn't be more critical. As companies rush to implement generative AI capabilities, from customer service chatbots to content generation systems, they're discovering that existing safety measures fall short. Current evaluation tools like Meta's LlamaGuard perform below 50% accuracy, making them little better than a coin flip.

Patronus AI's solution introduces several innovations that could reshape how businesses deploy AI. Perhaps most significant is its "judge evaluators" feature, which allows companies to create custom rules in plain English. "You can customize evaluation to exactly like your product needs," Varun Joshi, Patronus AI's product lead, told VentureBeat. "We let customers write out in English what they want to evaluate and check for." A financial services company might specify rules about regulatory compliance, while a healthcare provider could focus on patient privacy and medical accuracy.
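The article doesn't publish the API's schema, but the described flow (submit a model output plus a plain-English rule, get back a pass/fail verdict with a highlighted span) maps onto a simple request like the hypothetical sketch below. The endpoint, field names and response shape here are illustrative only, not Patronus AI's documented interface.

```python
import requests

API_URL = "https://api.patronus.example/v1/evaluate"  # hypothetical endpoint

payload = {
    # Hypothetical "judge evaluator" invocation: the rule is plain English
    "evaluator": "judge",
    "criteria": (
        "The response must not state specific medication dosages and must "
        "only cite facts present in the provided policy document."
    ),
    "input": "What dose of ibuprofen should I take for a headache?",
    "output": "Take 800mg every two hours.",  # candidate LLM output to check
}

resp = requests.post(
    API_URL,
    json=payload,
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    timeout=30,
)

# Illustrative response shape, mirroring the article's claim that failures are
# localized to a span of text:
# {"pass": false, "explanation": "...", "span": {"start": 5, "end": 31}}
print(resp.json())
```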
From detection to prevention: The technical breakthrough

The system's cornerstone is Lynx, a breakthrough hallucination detection model that outperforms GPT-4 by 8.3% at detecting medical inaccuracies. The platform operates at two speeds: a quick-response version for real-time monitoring and a more thorough version for deeper analysis. "The small versions can be used for real-time guardrails, and the large ones might be more appropriate for offline analysis," Joshi told VentureBeat.

Beyond traditional error checking, the company has developed specialized tools like CopyrightCatcher, which detects when AI systems reproduce protected content, and FinanceBench, the industry's first benchmark for evaluating AI performance on financial questions. These tools work in concert with Lynx to provide comprehensive coverage against AI failures.

Beyond simple guardrails: Reshaping AI safety

The company has adopted a pay-as-you-go pricing model, starting at $10 per 1,000 API calls for smaller evaluators and $20 per 1,000 API calls for larger ones. This pricing structure could dramatically increase access to AI safety tools, making them available to startups and smaller businesses that previously couldn't afford sophisticated AI monitoring. Early adoption suggests major enterprises see AI safety as a critical investment, not just a nice-to-have feature. The company has already attracted clients including HP, AngelList and Pearson, along with partnerships with tech giants like Nvidia, MongoDB and IBM.

What sets Patronus AI apart is its focus on improvement rather than just detection. "We can actually highlight the span of the specific piece of text where the hallucination is," Kannappan explained. This precision allows engineers to quickly identify and fix problems, rather than just knowing something went wrong.

The race against AI hallucinations

The launch comes at a pivotal moment in AI development. As large language models like GPT-4 and Claude become more powerful and widely used, the risks of AI failures grow correspondingly larger. A hallucinating AI system could expose companies to legal liability, damage customer trust, or worse. Recent regulatory moves, including President Biden's AI executive order and the EU's AI Act, suggest that companies will soon face legal requirements to ensure their AI systems are safe and reliable. Tools like Patronus AI's platform could become essential for compliance.

"Good evaluation is not just protecting against a bad outcome — it's deeply about improving your models and improving your products," Joshi emphasizes. This philosophy reflects a maturing approach to AI safety, moving from simple guardrails to continuous improvement. The real test for Patronus AI isn't just catching mistakes; it will be keeping pace with AI's breakneck evolution. As language models grow more sophisticated, their hallucinations may become harder to spot, like finding increasingly convincing forgeries.

The stakes couldn't be higher. Every time an AI system invents facts, recommends dangerous treatments or generates copyrighted content, it erodes the trust these tools need to transform business. Without reliable guardrails, the AI revolution risks stumbling before it truly begins. In the end, it's a simple truth: if artificial intelligence can't stop making things up, it may be humans who end up paying the price. source

Patronus AI launches world’s first self-serve API to stop AI hallucinations Read More »

Is Google’s NotebookLM a secret CRM killer?

I've never worked in sales, at least not virtually. The closest I've come — and this will date me — is working in retail at the mall as a teenager, and then at the VHS/DVD rental store down the street from my childhood home, so I have tremendous respect for those who do it at a much higher level than I ever did. But as a result of my short-lived sales experience, I've never been a regular user of a customer relationship management (CRM) program like Salesforce, Microsoft Dynamics 365 or Creatio. And candidly, all of the CRMs I've ever seen other people use have felt like overkill to me — what do you need besides a contacts list or email inbox?

As it turns out, I may not be alone: Sam Lessin, former VP of product at Facebook and current general partner at Slow Ventures, this morning posted a message on the social network X with a screenshot detailing how his VC firm is using Google's AI application NotebookLM in place of a CRM, and finding it incredibly effective. The application allows users to ask questions via text and generate synthetic podcasts from data sources they supply. The text of Lessin's screenshot is reproduced below.

"Short the structured CRM companies (cough $CRM, cough). NotebookLM and the future of AI Structureless CRM — WOW. Here is a personal experience re: the future of CRM… Most venture capital firms spend a lot of time and effort putting deals / etc. into their CRMs and tracking it all in a structured way… We never did that… for a whole host of reasons, but mostly laziness // cost/value… but what we did do for the last decade is write each other a weekly email on the deals we were looking at seriously, notes on interesting meetings, portfolio companies, etc…. The problem is that while we had that history it was buried in email and really quite hard to do anything with except in the rare / odd situation where it was worth digging up real history. Enter NotebookLM — here is a magical experience for you that illustrates how LLMs really do change the game for how people work…

1. Dump 10 years of emails: I just took the entire ~10 years of history of these emails and exported them as a *.mbox file … and then extracted from that .mbox the key bits of each email (from, date, subject, body)

2. Upload to NotebookLM: take the entire email history and just upload it to Google's NotebookLM.

3. Tada, I can now ask the ten years of email history covering nearly our entire investment history any question I want… and the answers are SICK. It works GREAT (and deep-links / references everything in the history where needed).

SO WHAT: I just don't see how any structured CRM product that requires work to update and maintain will survive this paradigm shift. It matters a ton to have the recorded raw material / the input from humans… but it should flow like conversation (text and voice), and then the machines can take care of the rest… Human manual input into structured fields is DONE — as are the countless SaaS platforms designed to do / speed just that — and that starts with CRMs.

Side magic: I was actually really nicely surprised that where I have written these scripts myself historically, chatGPT just abstracted the whole thing… upload the file, tell it what to do, get the stripped text back (not rocket science but cool!)"
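Step 1 of Lessin's recipe, pulling (from, date, subject, body) out of an .mbox export, takes only a few lines with Python's standard library. This is a minimal sketch with assumed file names, not Lessin's actual script:

```python
import mailbox

mbox = mailbox.mbox("deal_memos.mbox")  # assumed path to the exported archive

records = []
for msg in mbox:
    # Keep only the plain-text parts of each message
    if msg.is_multipart():
        body = "\n".join(
            (part.get_payload(decode=True) or b"").decode("utf-8", errors="replace")
            for part in msg.walk()
            if part.get_content_type() == "text/plain"
        )
    else:
        body = (msg.get_payload(decode=True) or b"").decode("utf-8", errors="replace")
    records.append(
        f"From: {msg['from']}\nDate: {msg['date']}\nSubject: {msg['subject']}\n\n{body}"
    )

# One consolidated text file becomes a single NotebookLM source
with open("email_history.txt", "w", encoding="utf-8") as f:
    f.write("\n\n---\n\n".join(records))
```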
Gaming out the impact of NotebookLM on the CRM market

Clearly, not all organizations will be able to follow suit, and some may miss the more robust and powerful features from CRM leaders such as Salesforce and Microsoft. But even if a few hundred or thousand users adopt NotebookLM, or similar open source AI tools, in place of a CRM, it could spell trouble for the companies offering them — at least when it comes to the revenue generated from these product lines. Indeed, as of right now, NotebookLM access is free with a Google user account. With enterprises spending an average of 12% of their revenue on information technology (which includes software subscriptions), the temptation to switch to more affordable or free solutions — even if less fully featured — may be hard for IT decision-makers to resist. Of course, with CRM providers racing to offer new AI-enabled updates, agents and tools, it's highly likely most will try to offer their own analogous versions of NotebookLM sooner rather than later.

NotebookLM's incredible ascent

NotebookLM has gained lots of attention recently as one of the most exciting and well-received products to come from Google's AI efforts, specifically its Audio Overview feature, which lets users upload documents, URLs, even videos to NotebookLM's creation space on the web — notebooklm.google.com — and generate a custom podcast based on the sources provided, complete with AI "hosts" and synthetic voices that engage in easygoing, conversational banter with eerily humanlike qualities. The product, powered by Google's Gemini family of large language models, has climbed to millions of users and more than 80,000 organizations, according to a post on X earlier this month. Last week, Google announced it was working on an enterprise-specific version of the application, NotebookLM Business, to launch later this year.

Indeed, NotebookLM editorial leader Steven Johnson reshared Lessin's X post today on using NotebookLM as a CRM, remarking: "Such a great way to use the product. If this sounds like it could be helpful in your firm, stay tuned for NotebookLM Business https://notebooklm.google/business" source

Is Google’s NotebookLM a secret CRM killer? Read More »

AMD reports record revenue but Q4 forecast disappoints

Advanced Micro Devices reported record revenue of $6.8 billion for its third fiscal quarter, up 18% from a year ago. But the shares fell on a disappointing forecast for the fourth quarter.

AMD posted record data center segment revenue of $3.5 billion in the quarter, up 122% from a year ago, driven by record Epyc CPU and Instinct GPU sales. Client revenue was $1.9 billion, up 29% from a year ago, driven by strong demand for Zen 5 Ryzen processors. The weak spot was the gaming segment, where revenue of $462 million fell 69% from a year ago on lower semi-custom sales, which mainly come from game consoles. Embedded segment revenue was $927 million, down 25% from a year ago, as customers continued to normalize inventory levels.

Non-GAAP gross margin was 54%, up 3 percentage points from a year ago thanks to success in the data center. Net income was $1.5 billion, up 33%. AMD estimated Q4 revenue will be $7.5 billion, plus or minus $300 million, and cited supply chain constraints limiting its overall ability to meet demand.

"We delivered strong third quarter financial results with record revenue led by higher sales of EPYC and Instinct data center products and robust demand for our Ryzen PC processors," said AMD CEO Lisa Su, in a statement. "Looking forward, we see significant growth opportunities across our data center, client and embedded businesses driven by the insatiable demand for more compute."

"We are pleased with our execution in the third quarter, delivering strong year-over-year expansion in gross margin and earnings per share," said AMD CFO Jean Hu, in a statement. "We are on-track to deliver record annual revenue for 2024 based on significant growth in our Data Center and Client segments."

To put AMD's success in perspective, one need only look at rival Intel to see how tough a time it is having now. source

AMD reports record revenue but Q4 forecast disappoints Read More »

Study finds LLMs can identify their own mistakes

A well-known problem of large language models (LLMs) is their tendency to generate incorrect or nonsensical outputs, often called "hallucinations." While much research has focused on analyzing these errors from a user's perspective, a new study by researchers at Technion, Google Research and Apple investigates the inner workings of LLMs, revealing that these models possess a much deeper understanding of truthfulness than previously thought.

The term hallucination lacks a universally accepted definition and encompasses a wide range of LLM errors. For their study, the researchers adopted a broad interpretation, considering hallucinations to encompass all errors produced by an LLM, including factual inaccuracies, biases, common-sense reasoning failures and other real-world errors.

Most previous research on hallucinations has focused on analyzing the external behavior of LLMs and examining how users perceive these errors. However, these methods offer limited insight into how errors are encoded and processed within the models themselves. Some researchers have explored the internal representations of LLMs, suggesting they encode signals of truthfulness. However, previous efforts mostly examined the last token generated by the model or the last token in the prompt. Since LLMs typically generate long-form responses, this practice can miss crucial details.

The new study takes a different approach. Instead of just looking at the final output, the researchers analyze "exact answer tokens," the response tokens that, if modified, would change the correctness of the answer. The researchers conducted their experiments on four variants of Mistral 7B and Llama 2 models across 10 datasets spanning various tasks, including question answering, natural language inference, math problem-solving and sentiment analysis. They allowed the models to generate unrestricted responses to simulate real-world usage. Their findings show that truthfulness information is concentrated in the exact answer tokens. "These patterns are consistent across nearly all datasets and models, suggesting a general mechanism by which LLMs encode and process truthfulness during text generation," the researchers write.

To predict hallucinations, they trained classifier models, which they call "probing classifiers," to predict features related to the truthfulness of generated outputs from the internal activations of the LLMs. The researchers found that training classifiers on exact answer tokens significantly improves error detection. "Our demonstration that a trained probing classifier can predict errors suggests that LLMs encode information related to their own truthfulness," the researchers write.
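The core mechanic is simple to sketch: read a hidden state at the answer token, then fit a small classifier on it. The layer index, token-selection rule and labeling step below are our assumptions for illustration; the paper's exact setup differs in detail.

```python
import numpy as np
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.linear_model import LogisticRegression

tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    output_hidden_states=True,
    torch_dtype=torch.float16,
)

def exact_answer_activation(text: str, answer_start_char: int, layer: int = 16) -> np.ndarray:
    """Hidden state at the first token of the exact answer span (layer choice assumed)."""
    enc = tok(text, return_tensors="pt", return_offsets_mapping=True)
    offsets = enc.pop("offset_mapping")[0].tolist()
    with torch.no_grad():
        hidden = model(**enc).hidden_states[layer][0]  # (seq_len, d_model)
    # First token whose character span reaches past the answer's first character
    idx = next(i for i, (_, end) in enumerate(offsets) if end > answer_start_char)
    return hidden[idx].float().numpy()

# X: activations at exact-answer tokens; y: 1 if the generated answer was correct
# (labels come from scoring model outputs against gold answers beforehand)
def fit_probe(X: np.ndarray, y: np.ndarray) -> LogisticRegression:
    return LogisticRegression(max_iter=1000).fit(X, y)
```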
Generalizability and skill-specific truthfulness

The researchers also investigated whether a probing classifier trained on one dataset could detect errors in others. They found that probing classifiers do not generalize across different tasks. Instead, they exhibit "skill-specific" truthfulness, meaning they can generalize within tasks that require similar skills, such as factual retrieval or common-sense reasoning, but not across tasks that require different skills, such as sentiment analysis.

"Overall, our findings indicate that models have a multifaceted representation of truthfulness," the researchers write. "They do not encode truthfulness through a single unified mechanism but rather through multiple mechanisms, each corresponding to different notions of truth."

Further experiments showed that these probing classifiers could predict not only the presence of errors but also the types of errors the model is likely to make. This suggests that LLM representations contain information about the specific ways in which they might fail, which can be useful for developing targeted mitigation strategies.

Finally, the researchers investigated how the internal truthfulness signals encoded in LLM activations align with external behavior. They found a surprising discrepancy in some cases: a model's internal activations might correctly identify the right answer, yet it consistently generates an incorrect response. This finding suggests that current evaluation methods, which rely solely on the final output of LLMs, may not accurately reflect their true capabilities. It raises the possibility that by better understanding and leveraging the internal knowledge of LLMs, we might be able to unlock hidden potential and significantly reduce errors.

Future implications

The study's findings can help in designing better hallucination mitigation systems. However, the techniques it uses require access to internal LLM representations, which is mainly feasible with open-source models. The findings have broader implications for the field, though. The insights gained from analyzing internal activations can help develop more effective error detection and mitigation techniques. This work is part of a broader line of studies that aims to better understand what is happening inside LLMs and the billions of activations that occur at each inference step. Leading AI labs such as OpenAI, Anthropic and Google DeepMind have been working on various techniques to interpret the inner workings of language models. Together, these studies can help build more robust and reliable systems.

"Our findings suggest that LLMs' internal representations provide useful insights into their errors, highlight the complex link between the internal processes of models and their external outputs, and hopefully pave the way for further improvements in error detection and mitigation," the researchers write. source

Study finds LLMs can identify their own mistakes Read More »