VentureBeat

Nvidia CEO touts India’s progress with sovereign AI and over 100K AI developers trained

Nvidia CEO Jensen Huang noted India’s progress in its AI journey in a conversation at the Nvidia AI Summit in India. India now has more than 2,000 Nvidia Inception AI companies and more than 100,000 developers trained in AI. That compares to a global developer count of 600,000 people trained in Nvidia AI technologies. India’s strategic move into AI is a good example of what Huang calls “sovereign AI,” where countries choose to create their own AI infrastructure to maintain control of their own data.

Nvidia said that India is becoming a key producer of AI for virtually every industry, powered by thousands of startups that are serving the country’s multilingual, multicultural population and scaling out to global users. In addition to the 100,000 developers trained in AI in India, Nvidia said an additional 100,000 academic and student developers have been trained as well.

The country is one of the top six global economies leading generative AI adoption and has seen rapid growth in its startup and investor ecosystem, rocketing to more than 100,000 startups this year from under 500 in 2016. More than 2,000 of India’s AI startups are part of Nvidia Inception, a free program for startups designed to accelerate innovation and growth through technical training and tools, go-to-market support and opportunities to connect with venture capitalists through the Inception VC Alliance.

At the NVIDIA AI Summit, taking place in Mumbai through Oct. 25, around 50 India-based startups are sharing AI innovations delivering impact in fields such as customer service, sports media, healthcare and robotics.

Conversational AI for Indian Railway customers

Nvidia is working closely with India on AI factories.
Bengaluru-based startup CoRover.ai already has over a billion users of its LLM-based conversational AI platform, which includes text, audio and video-based agents. “The support of NVIDIA Inception is helping us advance our work to automate conversational AI use cases with domain-specific large language models,” said Ankush Sabharwal, CEO of CoRover, in a statement. “NVIDIA AI technology enables us to deliver enterprise-grade virtual assistants that support 1.3 billion users in over 100 languages.” CoRover’s AI platform powers chatbots and customer service applications for major private and public sector customers, such as the Indian Railway Catering and Tourism Corporation, the official provider of online tickets, drinking water and food for India’s railways stations and trains. Dubbed AskDISHA, after the Sanskrit word for direction, the IRCTC’s multimodal chatbot handles more than 150,000 user queries daily, and has facilitated over 10 billion interactions for more than 175 million passengers to date. It assists customers with tasks such as booking or canceling train tickets, changing boarding stations, requesting refunds, and checking the status of their booking in languages including English, Hindi, Gujarati and Hinglish — a mix of Hindi and English. The deployment of AskDISHA has resulted in a 70% improvement in IRCTC’s customer satisfaction rate and a 70% reduction in queries through other channels like social media, phone calls and emails. CoRover’s modular AI tools were developed using Nvidia NeMo, an end-to-end, cloud-native framework and suite of microservices for developing generative AI. They run on Nvidia GPUs in the cloud, enabling CoRover to automatically scale up compute resources during peak usage — such as the moment train tickets are released. 
Nvidia also noted that VideoVerse, founded in Mumbai, has built a family of AI models using Nvidia technology to support AI-assisted content creation in the sports media industry — enabling global customers including the Indian Premier League for cricket, the Vietnam Basketball Association and the Mountain West Conference for American college football to generate game highlights up to 15 times faster and boost viewership. It uses Magnifi, with tech like vision analysis to detect players and key moments for short form video. Nvidia also highlighted Mumbai-based startup Fluid AI, which offers generative AI chatbots, voice calling bots and a range of application programming interfaces to boost enterprise efficiency. Its AI tools let workers perform tasks like creating slide decks in under 15 seconds. Karya, based in Bengaluru, is a smartphone-based digital work platform that enables members of low-income and marginalized communities across India to earn supplemental income by completing language-based tasks that support the development of multilingual AI models. Nearly 100,000 Karya workers are recording voice samples, transcribing audio or checking the accuracy of AI-generated sentences in their native languages, earning nearly 20 times India’s minimum wage for their work. Karya also provides royalties to all contributors each time its datasets are sold to AI developers. Karya is employing over 30,000 low-income women participants across six language groups in India to help create the dataset, which will support the creation of diverse AI applications across agriculture, healthcare and banking. Serving over a billion local language speakers with LLMs India is investing in sovereign AI in an alliance with Nvidia. Namaste, vanakkam, sat sri akaal — these are just three forms of greeting in India, a country with 22 constitutionally recognized languages and over 1,500 more recorded by the country’s census. 
Around 10% of its residents speak English, the internet’s most common language. As India, the world’s most populous country, forges ahead with rapid digitalization efforts, its government and local startups are developing multilingual AI models that enable more Indians to interact with technology in their primary language. It’s a case study in sovereign AI — the development of domestic AI infrastructure that is built on local datasets and reflects a region’s specific dialects, cultures and practices. These public and private sector projects are building language models for Indic languages and English that can power customer service AI agents for businesses, rapidly translate content to broaden access to information, and enable government services to more easily reach a diverse population of over 1.4 billion individuals. To support initiatives like these, Nvidia has released a small language model for Hindi, India’s most prevalent language with over half a billion speakers. Now available

JRPG developer Falcom contemplating AI for localization efficiency

Non-English game developers may soon be looking toward AI as a means to shorten localization times. In an interview with 4Gamer (translated by Siliconera) at Tokyo Game Show, Nihon Falcom president Toshihiro Kondo posited the idea of using artificial intelligence to more quickly localize games developed in the Japanese language.

Kondo took the stage after a demonstration of ELLA, software created for the purpose of localizing a game’s text in multiple languages, using it in a hypothetical instance for Nihon Falcom’s Legend of Heroes: Kai no Kiseki, another Falcom game known for immense scripts. Kondo stated that he believed this process of AI translation would speed up translating games into multiple languages, though he thought that a human still needed to make a final pass on the script before it gets applied to the games.

He also acknowledged that there’s cultural pushback to using AI in game development and that it could cost jobs. He added that even Falcom staff in non-localization divisions, such as designers and artists, push back against using AI. Artists fear that their illustrations will be used for AI training without their consent. Kondo hopes these objections can be resolved in the future, as he believes that localization AI will be a benefit.

Nihon Falcom, as Kondo notes, does indeed take a lot of time between Japanese-language releases and localizations. The upcoming Ys X: Nordics was released in Japan over one year ago but will only release in America in late October. On the other hand, the release of Ys VIII in 2018 required localization company NISA to formally apologize and redo the entire game’s script due to a messy and fast translation, so acting in the name of expediency has proven negative for Falcom games in the past.
Still, localization through AI has been feeling inevitable for the translation industry for some time, so perhaps Kondo is thinking ahead of the curve.

ServiceNow advocates for ‘invisible’ AI agents to ease worker adoption

Enterprises are beginning to deploy AI agents. However, if organizations plan to deploy agentic ecosystems at scale and improve employee acceptance, they might consider treating AI agents as tools working in the background, so that employees are not intimidated by the idea that they have to learn how to use them.

Dorit Zilbershot, vice president of AI and Innovation at ServiceNow, told VentureBeat that employees don’t have to know if teams of AI agents are working in the background.

“There’s so much AI around us that we’re not even aware, and that’s how we are thinking about AI agents in ServiceNow,” Zilbershot said. “It should just work. As an employee, I shouldn’t care if AI agents are in the background.”

Zilbershot said employees become “managers” of AI agents in that they just need to do their regular work. The agents are automatically triggered to finish tasks.

Enterprises have begun embracing AI agents and exploring how to deploy them at scale, even as generative AI deployment in enterprises has fallen slightly. Zilbershot said ServiceNow’s generative AI platform, Now Assist, is the company’s “fastest-growing product to date.” Now Assist launched a library of AI agents for customers in September.

AI agents could ideally automate many workflows. This could include sales or product roadmaps, where one agent can encode customer information, another categorizes it and yet another informs an employee of a change in status. Zilbershot said agents don’t replace human employees; they take some busy work away, so the only time humans have to pay attention to an agent is when that agent is supposed to interact with them.
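The three-step workflow described above (one agent encodes customer information, another categorizes it, a third informs an employee) can be pictured as a simple chain. The following is only an illustrative sketch: the function names, the keyword rule and the sample note are all hypothetical and do not represent ServiceNow's agents or any real product API.

```python
def encode_agent(raw_note):
    """Hypothetical agent 1: turn a free-text note into a structured record."""
    customer, _, note = raw_note.partition(":")
    return {"customer": customer.strip(), "note": note.strip()}


def categorize_agent(record):
    """Hypothetical agent 2: attach a category using a placeholder keyword rule."""
    category = "renewal" if "renew" in record["note"].lower() else "general"
    return {**record, "category": category}


def notify_agent(record):
    """Hypothetical agent 3: build the status message an employee would see."""
    return f"{record['customer']}: status changed to {record['category']}"


def orchestrate(raw_note, agents=(encode_agent, categorize_agent, notify_agent)):
    """Run the agents in order, feeding each one the previous agent's output."""
    result = raw_note
    for agent in agents:
        result = agent(result)
    return result


message = orchestrate("Acme: wants to renew contract")
print(message)
```

In this sketch the employee only ever sees the final message; the intermediate agents run without any prompting, which is the "invisible" behavior the article describes.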
ServiceNow CEO Bill McDermott told VentureBeat in a separate interview that generative AI, particularly applications around agents, “has grown beyond our expectations.”

“We’ve mastered the flow of work and governance, and we’re building agents solving unique problems,” McDermott said. “AI will be in every product we have.”

As AI agents grow in popularity, Zilbershot said enterprises need to understand what makes agents work for their organization and employees.

Agents and not assistants

Beyond AI agents quietly working in the background, Zilbershot said it’s essential for organizations to understand that agents are not assistants. Otherwise, they risk setting an expectation among users that they will need to learn how to prompt agents instead of letting the agents work for them autonomously.

“I think we’re doing a little bit of a disservice to our customers when agents function more as assistants, but we don’t change the name,” Zilbershot said. “It just creates a wrong perception in the market and how people approach working with agents.”

Zilbershot added that AI agents work best when there are other agents they can interact with, so orchestrator agents must be deployed to manage the expected sprawl of agents. ServiceNow ships an orchestrator agent with its Now Assist platform.

Other companies have begun offering enterprises access to orchestrator agents and tools to build custom AI agents. Crew AI launched an agentic platform this month, while Asana released an agent creator specifically for workflows.

Partnership with Nvidia

To expand its agentic ecosystem, ServiceNow announced it will begin building off-the-shelf AI agents using Nvidia’s NIM Agent Blueprint.

Zilbershot said using the NIM Agent Blueprint helps ServiceNow build agents at the volume it feels is needed to make agents more efficient.
“We’re expanding our ecosystem since there can be a limit to how much we can build on our own; we want to have a strong partnership with companies like NVIDIA to build native AI agents within the ServiceNow platform,” she said.

The first agent ServiceNow will build with Nvidia is a Vulnerability Analysis for Container Security AI Agent. The agent will automate vulnerability analysis and will be available on ServiceNow’s agent platform in 2025.

Zilbershot said the work with Nvidia will be just the first of many possible partnerships ServiceNow will enter into to expand AI agents.

Enterprise AI moves from ‘experiment’ to ‘essential,’ spending jumps 130%

A new study reveals that generative AI has rapidly transformed from an experimental technology to an essential business tool, with adoption rates nearly doubling in 2024.

The research, conducted by AI at Wharton, a research center at the Wharton School of the University of Pennsylvania, in partnership with GBK Collective, provides a comprehensive look at AI’s integration across American businesses.

The research team surveyed more than 800 enterprise decision-makers across the United States, examining AI adoption patterns, investment trends, and organizational impacts. The study, titled “Growing Up: Navigating Gen AI’s Early Years,” compared data from 2023 to 2024, tracking changes in usage patterns, departmental adoption, and employee attitudes.

Key Findings:

• Weekly AI usage among business leaders surged from 37% to 72%
• Organizations reported a 130% increase in AI spending since 2023
• 72% of companies are planning additional AI investments in 2025
• 90% of leaders now believe AI enhances employee skills (up from 80%)
• Concerns about AI-related job displacement decreased from 75% to 72%
• 58% of organizations rated AI’s performance as “great”

“The most interesting things that come out of the survey is this snapshot of how corporates are feeling, thinking and implementing Gen AI, and how that is changing quite rapidly,” Stefano Puntoni, Sebastian S. Kresge Professor of Marketing at the Wharton School and co-director of AI at Wharton, told VentureBeat.
“This year, what we’re seeing is that people are less curious, they are more excited, they’re less scared and there is a more belief that these are tools that are going to augment human expertise.”

Investment surge for enterprise AI is a ‘gold mine’ for consultants

The research shows a dramatic increase in organizational spending on generative AI, with over 40% of companies now investing more than $10 million in the technology. This represents a significant shift from the previous year, when the typical investment range was between $1 million and $5 million. What is perhaps even more interesting than the rise in spending is where the money is going.

“About a third of the money is spent on tech,” explained Puntoni. “But that’s actually a minority of all the money that is pouring into Gen AI.”

The remaining investment is distributed across training and upskilling the existing workforce, onboarding new employees and consulting services. While much of the hype and news in generative AI in 2024 has been about the technology, that’s not the differentiator for many enterprises at this point.

“The technology itself is more or less a commodity, meaning, you know, my ChatGPT is as good as your ChatGPT and so the differentiation is largely going to come from the integration of the technology and business processes,” he said. “There’s no template, there’s no blueprint, people will have to experiment and learn.”

Puntoni expects that consultants, at least in the short term, will be the big winners in the AI gold rush. In his view, the technology part of generative AI is increasingly becoming commoditized.

“I think we’re going to see a protracted period of experimentation, learning new business models and new ways of organizing business functions,” Puntoni said.
“It’s a gold mine for consultants. And I think this is not going to run out of gold anytime soon.”

Small and mid-sized companies lead the way in AI

An unexpected finding reveals that smaller organizations are currently ahead in AI adoption compared to their larger counterparts. The study defines smaller organizations as those with revenue between $50 million and $250 million and mid-sized as $250 million to $2 billion.

“We still see a difference between smaller organizations and large organizations in reported adoption, as well as less restrictive uses within the organization for experimentation,” Jeremy Korst, Partner with GBK Collective, told VentureBeat.

Korst suggests this could lead to interesting competitive dynamics. If smaller organizations are able to find not only cost efficiencies and productivity gains but also new business models and capabilities, overall competition could increase. Korst said that in that situation, smaller organizations might be able to compete differently and more effectively with some of their larger counterparts.

What organizations should be doing now to improve enterprise AI outcomes

Despite the increased adoption, organizations face several challenges in implementing AI effectively. The study highlights issues around data governance and security, with concerns about unintended data leakage within organizations even when using enterprise-grade AI tools.

The research also indicates that while the adoption curve for generative AI has been unprecedented in its speed, organizations are now entering a more mature phase focused on practical implementation and return on investment.

“I think that organizations ought to be learning, I don’t think there is a way in which you’re going to be successful in the future unless you make a concerted, serious effort to see how this technology can help you,” Puntoni said.

Microsoft’s Differential Transformer cancels attention noise in LLMs

Improving the capabilities of large language models (LLMs) in retrieving in-prompt information remains an area of active research that can impact important applications such as retrieval-augmented generation (RAG) and in-context learning (ICL).

Microsoft Research and Tsinghua University researchers have introduced Differential Transformer (Diff Transformer), a new LLM architecture that improves performance by amplifying attention to relevant context while filtering out noise. Their findings, published in a research paper, show that Diff Transformer outperforms the classic Transformer architecture in various settings.

Transformers and the “lost-in-the-middle” phenomenon

The Transformer architecture is the foundation of most modern LLMs. It uses an attention mechanism to weigh the importance of different parts of the input sequence when generating output. The attention mechanism employs the softmax function, which normalizes a vector of values into a probability distribution. In Transformers, the softmax function assigns attention scores to different tokens in the input sequence.

However, studies have shown that Transformers struggle to retrieve key information from long contexts.

“We began by investigating the so-called ‘lost-in-the-middle’ phenomenon,” Furu Wei, Partner Research Manager at Microsoft Research, told VentureBeat, referring to previous research findings that showed that LLMs “do not robustly make use of information in long input contexts” and that “performance significantly degrades when models must access relevant information in the middle of long contexts.”

Wei and his colleagues also observed that some LLM hallucinations, where the model produces incorrect outputs despite having relevant context information, correlate with spurious attention patterns.
“For example, large language models are easily distracted by context,” Wei said. “We analyzed the attention patterns and found that the Transformer attention tends to over-attend irrelevant context because of the softmax bottleneck.”

The softmax function used in the Transformer’s attention mechanism tends to distribute attention scores across all tokens, even those that are not relevant to the task. This can cause the model to lose focus on the most important parts of the input, especially in long contexts.

“Previous studies indicate that the softmax attention has a bias to learn low-frequency signals because the softmax attention scores are restricted to positive values and have to be summed to 1,” Wei said. “The theoretical bottleneck renders [it] such that the classic Transformer cannot learn sparse attention distributions. In other words, the attention scores tend to flatten rather than focusing on relevant context.”

Differential Transformer

Differential Transformer (source: arXiv)

To address this limitation, the researchers developed the Diff Transformer, a new foundation architecture for LLMs. The core idea is to use a “differential attention” mechanism that cancels out noise and amplifies the attention given to the most relevant parts of the input.

The Transformer uses three vectors to compute attention: query, key, and value. The classic attention mechanism applies the softmax function to scores computed from the entire query and key vectors.

The proposed differential attention works by partitioning the query and key vectors into two groups and computing two separate softmax attention maps. The difference between these two maps is then used as the attention score. This process eliminates common noise, encouraging the model to focus on information that is pertinent to the input.

The researchers compare their approach to noise-canceling headphones or differential amplifiers in electrical engineering, where the difference between two signals cancels out common-mode noise.
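To make the subtraction concrete, here is a minimal pure-Python sketch of the idea as described above: split each query and key vector in half, build two softmax attention maps, and subtract one from the other. It is an illustration under stated assumptions, not the researchers' released code. In particular, the fixed weight `lam` on the second map, the function names and the toy inputs are assumptions made for this example; the article does not say how the two maps are weighted.

```python
import math


def softmax(scores):
    """Normalize a list of scores into a probability distribution."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_map(queries, keys):
    """Classic softmax attention map: one row of weights per query."""
    scale = math.sqrt(len(keys[0]))
    return [
        softmax([sum(a * b for a, b in zip(q, k)) / scale for k in keys])
        for q in queries
    ]


def differential_attention(queries, keys, values, lam=0.5):
    """Build two attention maps from the two halves and subtract them.

    `lam` is an assumed fixed weight on the second map (illustrative only).
    """
    half = len(queries[0]) // 2
    map1 = attention_map([q[:half] for q in queries], [k[:half] for k in keys])
    map2 = attention_map([q[half:] for q in queries], [k[half:] for k in keys])
    diff = [[a - lam * b for a, b in zip(r1, r2)] for r1, r2 in zip(map1, map2)]
    width = len(values[0])
    output = [
        [sum(w * v[j] for w, v in zip(row, values)) for j in range(width)]
        for row in diff
    ]
    return diff, output


# Toy input: only the first token matches the query; the other two are "noise".
queries = [[1.0, 0.0, 0.0, 0.0]]
keys = [[4.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
values = [[1.0], [0.0], [0.0]]
diff, output = differential_attention(queries, keys, values)
print(diff[0])
```

In this toy case the second map is uniform, so subtracting it removes the small weight that plain softmax would otherwise leave on the two irrelevant tokens: the relevant token keeps a positive score while the others drop below zero.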
While Diff Transformer involves an additional subtraction operation compared to the classic Transformer, it maintains efficiency thanks to parallelization and optimization techniques. “In the experimental setup, we matched the number of parameters and FLOPs with Transformers,” Wei said. “Because the basic operator is still softmax, it can also benefit from the widely used FlashAttention cuda kernels for acceleration.” In retrospect, the method used in Diff Transformer seems like a simple and intuitive solution. Wei compares it to ResNet, a popular deep learning architecture that introduced “residual connections” to improve the training of very deep neural networks. Residual connections made a very simple change to the traditional architecture yet had a profound impact. “In research, the key is to figure out ‘what is the right problem?’” Wei said. “Once we can ask the right question, the solution is often intuitive. Similar to ResNet, the residual connection is an addition, compared with the subtraction in Diff Transformer, so it wasn’t immediately apparent for researchers to propose the idea.” Diff Transformer in action The researchers evaluated Diff Transformer on various language modeling tasks, scaling it up in terms of model size (from 3 billion to 13 billion parameters), training tokens, and context length (up to 64,000 tokens). Their experiments showed that Diff Transformer consistently outperforms the classic Transformer architecture across different benchmarks. A 3-billion-parameter Diff Transformer trained on 1 trillion tokens showed consistent improvements of several percentage points compared to similarly sized Transformer models. Further experiments with different model sizes and training dataset sizes confirmed the scalability of Diff Transformer. Their findings suggest that in general, Diff Transformer requires only around 65% of the model size or training tokens needed by a classic Transformer to achieve comparable performance. 
The Diff Transformer is more efficient than the classic Transformer in terms of both parameters and train tokens (source: arXiv) The researchers also found that Diff Transformer is particularly effective in using increasing context lengths. It showed significant improvements in key information retrieval, hallucination mitigation, and in-context learning. While the initial results are promising, there’s still room for improvement. The research team is working on scaling Diff Transformer to larger model sizes and training datasets. They also plan to extend it to other modalities, including image, audio, video, and multimodal data. The researchers have released the code for Diff Transformer, implemented with different attention and optimization mechanisms. They believe the architecture can help improve performance across various LLM applications. “As the model can attend to relevant context more accurately, it is expected that these language models can better understand the context information with less in-context hallucinations,” Wei said. “For example, for the retrieval-augmented generation settings (such as Bing Chat, Perplexity,

Anthropic’s new AI can use computers like a human, redefining automation for enterprises

Anthropic, the AI research and safety company, has announced a new suite of capabilities, including an upgraded version of its flagship AI model, Claude 3.5 Sonnet, and a new model, Claude 3.5 Haiku, that could transform how businesses automate complex workflows. But the most striking development in this release is a new feature: Claude can now use a computer like a human, navigating screens, clicking buttons, and typing text.

This new feature, called “Computer Use,” could have far-reaching implications for industries that rely on repetitive tasks involving multiple applications and tabs. From data entry to research to customer service, the potential applications are broad and potentially industry-shaping.

AI moves from text to screen interaction

Since its founding, Anthropic has focused on creating AI models that are safe, reliable, and capable of complex reasoning. With Claude 3.5 Sonnet and Haiku, the company is expanding the model’s capabilities even further. The new “Computer Use” feature allows AI to perform tasks that were previously handled only by human workers, such as opening applications, interacting with interfaces, and filling out forms.

“Computer use capabilities have the potential to change how tasks that require navigation across multiple applications are performed,” said Mike Krieger, Chief Product Officer at Anthropic, in an exclusive interview with VentureBeat. “This could lead to more innovative product experiences and streamlined back-office processes.”

Krieger emphasized that the new capability is still in its beta phase, but as the technology evolves, it could improve data analysis, visualization, and user interface interactions, making many tasks more efficient.
“We anticipate it being particularly useful for tasks like conducting online research, performing repetitive processes like testing new software, and automating complex multi-step tasks,” he said. “As the technology matures, it could enhance data analysis, visualization, and user interface interactions, potentially improving accessibility… We’re excited to see how developers will leverage this capability to create new tools and workflows that enhance productivity and user experiences across various sectors.” Claude 3.5 Sonnet, Anthropic’s newest AI model, autonomously completes a vendor request form by retrieving required information from a CRM system, showcasing its ability to perform multi-step tasks across different software platforms. (Credit: Anthropic) Early adopters see potential Anthropic’s early partners, including GitLab, Canva, and Replit, are already benefiting from Claude 3.5 Sonnet’s new features. GitLab, which specializes in software development and security, has been testing the model for automating tasks in their development pipeline. According to the company, Claude has improved reasoning capabilities by up to 10% without slowing down performance, making it well-suited for complex, multi-step processes like software testing and deployment. Replit, a coding platform, has gone a step further. Michele Catasta, President of Replit, said the model “opens the door to creating a powerful autonomous verifier that can evaluate apps while they’re being built.” This could ease bottlenecks in software development, where testing often delays project timelines. Meanwhile, Canva, the graphic design platform, is exploring how Claude’s computer use skills could speed up design creation and editing. Danny Wu, Head of AI Products at Canva, said in a statement, “We’re discovering time-savings within our team that could be game-changing for users.” What does “Computer Use” actually mean? 
What sets this new capability apart from traditional automation tools is that Claude isn’t confined to specific workflows or software programs. Instead, it can “see” a screen using screenshots, interact with various applications, and adapt to different tasks as they come up. This flexibility makes it more versatile than current robotic process automation (RPA) technologies. For example, in a demo shared by Anthropic, Claude helps complete a vendor request form for Ant Equipment Co. In the video, Claude starts by taking a screenshot of the computer screen, identifies that some necessary information is missing from a spreadsheet, then navigates to a CRM system, locates the required data, and fills out the form—all without human intervention. This level of automation could have major implications for industries like finance, legal services, and customer support, where tasks often involve switching between multiple systems and applications. “Claude could open spreadsheets, run analyses, and create visualizations. For customer service, it could navigate CRM systems to quickly find and update customer information,” Krieger told VentureBeat. Video Credit: Anthropic Security and privacy concerns However, the ability for AI to control a computer raises serious security and privacy concerns. Anthropic has built several safeguards into the system to address these risks. The company made it clear that Claude cannot access a computer without a developer providing the necessary tools. “Claude cannot ‘just use your computer.’ The computer use feature requires developers to provide tools like a screenshot tool and an action-execution layer, which allows Claude to perform mouse movements and keystrokes,” Krieger explained. Anthropic is also taking a cautious approach by releasing the feature in a limited public beta, available only through an API. This allows developers to test it in controlled environments before it becomes more widely available. 
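The division of labor Krieger describes (a screenshot tool, a model that decides what to do, and an action-execution layer) amounts to an observe, decide, act loop. The sketch below simulates that loop with stand-ins only: `FakeScreen`, `take_screenshot`, `choose_action` and `execute` are hypothetical names invented for this example, the "model" is a hard-coded stub, and the contact address is placeholder data. Nothing here calls or represents Anthropic's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class FakeScreen:
    """Stand-in for a real desktop: a form with two text fields."""
    fields: dict = field(default_factory=lambda: {"vendor": "", "contact": ""})
    submitted: bool = False


def take_screenshot(screen):
    """Hypothetical screenshot tool: returns what the model would 'see'."""
    return {"fields": dict(screen.fields), "submitted": screen.submitted}


def choose_action(observation, known_data):
    """Stub for the model's decision; a real system would query a model here."""
    for name, value in observation["fields"].items():
        if not value:
            return {"type": "type_text", "field": name, "text": known_data[name]}
    if not observation["submitted"]:
        return {"type": "click", "target": "submit"}
    return {"type": "done"}


def execute(screen, action):
    """Hypothetical action-execution layer: applies keystrokes and clicks."""
    if action["type"] == "type_text":
        screen.fields[action["field"]] = action["text"]
    elif action["type"] == "click" and action["target"] == "submit":
        screen.submitted = True


def run_agent(screen, known_data, max_steps=10):
    """Observe, decide, act; stop when done or when the step budget runs out."""
    log = []
    for _ in range(max_steps):
        action = choose_action(take_screenshot(screen), known_data)
        if action["type"] == "done":
            break
        execute(screen, action)
        log.append(action["type"])
    return log


screen = FakeScreen()
log = run_agent(screen, {"vendor": "Ant Equipment Co.", "contact": "ops@example.com"})
print(log, screen.submitted)
```

The step budget (`max_steps`) is a design choice in this sketch: because the model only acts through tools the developer supplies, the developer's loop is also the natural place to bound how long an agent may run.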
The company has also developed classifiers to detect misuse and prevent the AI from interacting with sensitive websites, such as government portals. “Our methods to scan for prohibited activity are designed to safeguard customer data privacy and confidentiality,” Krieger said. A new era for office automation? In the near term, businesses could see immediate productivity gains in areas like data entry, customer service, and IT support. But as the technology matures, the potential applications could extend far beyond these initial use cases. Imagine a world where AI handles complex legal processes, from reviewing contracts to completing compliance forms. Or envision AI assisting doctors in navigating electronic health records and diagnosing patients by cross-referencing medical databases. Claude’s new “Computer Use” feature brings us closer to a future where AI can perform a wide range of tasks that span different software applications and systems. This gives it a level of flexibility that was previously unimaginable for AI technologies, which were often confined to specific, narrow tasks. Video Credit: Anthropic Proceeding

DeepMind’s Talker-Reasoner framework brings System 2 thinking to AI agents

AI agents must solve a host of tasks that require different speeds and levels of reasoning and planning capabilities. Ideally, an agent should know when to use its direct memory and when to use more complex reasoning capabilities. However, designing agentic systems that can properly handle tasks based on their requirements remains a challenge.

In a new paper, researchers at Google DeepMind introduce Talker-Reasoner, an agentic framework inspired by the “two systems” model of human cognition. This framework enables AI agents to find the right balance between different types of reasoning and provide a more fluid user experience.

System 1, System 2 thinking in humans and AI

The two-systems theory, popularized by Nobel laureate Daniel Kahneman, suggests that human thought is driven by two distinct systems. System 1 is fast, intuitive, and automatic. It governs our snap judgments, such as reacting to sudden events or recognizing familiar patterns. System 2, in contrast, is slow, deliberate, and analytical. It enables complex problem-solving, planning, and reasoning.

While often treated as separate, these systems interact continuously. System 1 generates impressions, intuitions, and intentions. System 2 evaluates these suggestions and, if endorsed, integrates them into explicit beliefs and deliberate choices. This interplay allows us to seamlessly navigate a wide range of situations, from everyday routines to challenging problems.

Current AI agents mostly operate in a System 1 mode. They excel at pattern recognition, quick reactions, and repetitive tasks. However, they often fall short in scenarios requiring multi-step planning, complex reasoning, and strategic decision-making, the hallmarks of System 2 thinking.
Talker-Reasoner framework

[Figure: Talker-Reasoner framework (source: arXiv)]

The Talker-Reasoner framework proposed by DeepMind aims to equip AI agents with both System 1 and System 2 capabilities. It divides the agent into two distinct modules: the Talker and the Reasoner.

The Talker is the fast, intuitive component analogous to System 1. It handles real-time interactions with the user and the environment. It perceives observations, interprets language, retrieves information from memory, and generates conversational responses. The Talker agent usually uses the in-context learning (ICL) abilities of large language models (LLMs) to perform these functions.

The Reasoner embodies the slow, deliberative nature of System 2. It performs complex reasoning and planning. It is primed to perform specific tasks and interacts with tools and external data sources to augment its knowledge and make informed decisions. It also updates the agent’s beliefs as it gathers new information. These beliefs drive future decisions and serve as the memory that the Talker uses in its conversations.

“The Talker agent focuses on generating natural and coherent conversations with the user and interacts with the environment, while the Reasoner agent focuses on performing multi-step planning, reasoning, and forming beliefs, grounded in the environment information provided by the Talker,” the researchers write.

The two modules interact primarily through a shared memory system. The Reasoner updates the memory with its latest beliefs and reasoning results, while the Talker retrieves this information to guide its interactions. This asynchronous communication allows the Talker to maintain a continuous flow of conversation, even as the Reasoner carries out its more time-consuming computations in the background.

“This is analogous to [the] behavioral science dual-system approach, with System 1 always being on while System 2 operates at a fraction of its capacity,” the researchers write.
“Similarly, the Talker is always on and interacting with the environment, while the Reasoner updates beliefs informing the Talker only when the Talker waits for it, or can read it from memory.”

[Figure: Detailed structure of Talker-Reasoner framework (source: arXiv)]

Talker-Reasoner for AI coaching

The researchers tested their framework in a sleep coaching application. The AI coach interacts with users through natural language, providing personalized guidance and support for improving sleep habits. This application requires a combination of quick, empathetic conversation and deliberate, knowledge-based reasoning.

The Talker component of the sleep coach handles the conversational aspect, providing empathetic responses and guiding the user through different phases of the coaching process. The Reasoner maintains a belief state about the user’s sleep concerns, goals, habits, and environment. It uses this information to generate personalized recommendations and multi-step plans. The same framework could be applied to other applications, such as customer service and personalized education.

The DeepMind researchers outline several directions for future research. One area of focus is optimizing the interaction between the Talker and the Reasoner. Ideally, the Talker should automatically determine when a query requires the Reasoner’s intervention and when it can handle the situation independently. This would minimize unnecessary computations and improve overall efficiency.

Another direction involves extending the framework to incorporate multiple Reasoners, each specializing in different types of reasoning or knowledge domains. This would allow the agent to tackle more complex tasks and provide more comprehensive assistance.
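The shared-memory arrangement between the Talker and the Reasoner can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not DeepMind’s implementation: the names `SharedMemory`, `talker`, `reasoner`, and `finish_signal` are invented here, the “reasoning” is a placeholder string, and a `threading.Event` stands in for however long a real Reasoner would take.

```python
import threading


class SharedMemory:
    """Belief store written by the Reasoner and read by the Talker (hypothetical)."""

    def __init__(self):
        self._beliefs = {}
        self._lock = threading.Lock()

    def write(self, key, value):
        with self._lock:
            self._beliefs[key] = value

    def read(self):
        with self._lock:
            return dict(self._beliefs)


def reasoner(memory, observation, finish_signal):
    """Slow 'System 2' module: waits (simulating long reasoning), then stores a belief."""
    finish_signal.wait()
    memory.write("plan", f"plan for: {observation}")


def talker(memory, user_message):
    """Fast 'System 1' module: answers immediately from whatever beliefs exist."""
    plan = memory.read().get("plan", "no plan yet")
    return f"Reply to '{user_message}' using {plan}"


memory = SharedMemory()
finish_signal = threading.Event()
worker = threading.Thread(
    target=reasoner, args=(memory, "user sleeps badly", finish_signal)
)
worker.start()

# The Talker responds without blocking while the Reasoner is still working.
first_reply = talker(memory, "I can't sleep")

# Let the Reasoner finish; the next reply picks up the updated belief.
finish_signal.set()
worker.join()
second_reply = talker(memory, "Any advice?")

print(first_reply)
print(second_reply)
```

The first reply is produced before any plan exists, and the second reflects the Reasoner’s update, which mirrors the article’s point that the Talker stays responsive while beliefs are refreshed in the background.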

Meta just beat Google and Apple in the race to put powerful AI on phones

Meta Platforms has created smaller versions of its Llama artificial intelligence models that can run on smartphones and tablets, opening new possibilities for AI beyond data centers.

The company announced compressed versions of its Llama 3.2 1B and 3B models today that run up to four times faster while using less than half the memory of earlier versions. These smaller models perform nearly as well as their larger counterparts, according to Meta’s testing.

The advancement uses a compression technique called quantization, which simplifies the mathematical calculations that power AI models. Meta combined two methods: Quantization-Aware Training with LoRA adaptors (QLoRA) to maintain accuracy, and SpinQuant to improve portability.

This technical achievement solves a key problem: running advanced AI without massive computing power. Until now, sophisticated AI models required data centers and specialized hardware.

Tests on OnePlus 12 Android phones showed the compressed models were 56% smaller and used 41% less memory while processing text more than twice as fast. The models can handle texts up to 8,000 tokens, enough for most mobile apps.

[Figure: Meta’s compressed AI models (SpinQuant and QLoRA) show dramatic improvements in speed and efficiency compared to standard versions when tested on Android phones. The smaller models run up to four times faster while using half the memory. (Credit: Meta)]

Tech giants race to define AI’s mobile future

Meta’s release intensifies a strategic battle among tech giants to control how AI runs on mobile devices. While Google and Apple take careful, controlled approaches to mobile AI, keeping it tightly integrated with their operating systems, Meta’s strategy is markedly different.
By open-sourcing these compressed models and partnering with chip makers Qualcomm and MediaTek, Meta bypasses traditional platform gatekeepers. Developers can build AI applications without waiting for Google’s Android updates or Apple’s iOS features. This move echoes the early days of mobile apps, when open platforms dramatically accelerated innovation.

The partnerships with Qualcomm and MediaTek are particularly significant. These companies power most of the world’s Android phones, including devices in emerging markets where Meta sees growth potential. By optimizing its models for these widely used processors, Meta ensures its AI can run efficiently on phones across different price points, not just premium devices.

The decision to distribute through both Meta’s Llama website and Hugging Face, the increasingly influential AI model hub, shows Meta’s commitment to reaching developers where they already work. This dual distribution strategy could help Meta’s compressed models become the de facto standard for mobile AI development, much as TensorFlow and PyTorch became standards for machine learning.

The future of AI in your pocket

Meta’s announcement today points to a larger shift in artificial intelligence: the move from centralized to personal computing. While cloud-based AI will continue to handle complex tasks, these new models suggest a future where phones can process sensitive information privately and quickly.

The timing is significant. Tech companies face mounting pressure over data collection and AI transparency. Meta’s approach (making these tools open and running them directly on phones) addresses both concerns. Your phone, not a distant server, could soon handle tasks like document summarization, text analysis, and creative writing.

This mirrors other pivotal shifts in computing.
Just as processing power moved from mainframes to personal computers, and computing moved from desktops to smartphones, AI appears ready for its own transition to personal devices. Meta’s bet is that developers will embrace this change, creating applications that blend the convenience of mobile apps with the intelligence of AI.

Success isn’t guaranteed. These models still need powerful phones to run well. Developers must weigh the benefits of privacy against the raw power of cloud computing. And Meta’s competitors, particularly Apple and Google, have their own visions for AI’s future on phones. But one thing is clear: AI is breaking free from the data center, one phone at a time.
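For readers unfamiliar with quantization, the compression technique mentioned earlier in the article, the basic idea can be shown with a small Python sketch. It uses plain symmetric 8-bit rounding with a single shared scale, which is a generic textbook scheme chosen for illustration; it is not Meta’s QLoRA or SpinQuant method, and the weight values and function names are made up.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the stored integers."""
    return [q * scale for q in quantized]


# Made-up example weights; a real model has millions or billions of these.
weights = [0.12, -0.5, 0.33, 0.9, -0.07]

quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# Rounding error per weight is at most half of one quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))

# Storage per weight: 4 bytes as a 32-bit float versus 1 byte as an 8-bit integer.
bytes_float32 = 4 * len(weights)
bytes_int8 = 1 * len(weights)

print(quantized, round(max_error, 5), bytes_float32, bytes_int8)
```

Each weight is stored in fewer bits at the cost of a small rounding error, which is the trade-off behind the smaller sizes and lower memory use the article reports. Production schemes add refinements such as per-group scales and training-time correction to keep accuracy close to the original model.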

AI agents fed by process intelligence power the next gen of enterprise AI performance

Presented by Celonis

Current C-suite and board views of AI can be summed up in a single famous line from the American movie classic Jerry Maguire: “Show me the money!”

For many enterprises, AI’s honeymoon period has ended. Poll after poll makes clear that today’s top bosses want AI to turbocharge business KPIs and digital transformation to provide clear value, and fast.

The opportunities to quickly create cost-saving and revenue-enhancing AI sought by organizational leaders are huge, says Divya Krishnan, VP of product marketing at Celonis. “Right now, there’s a big disconnect between AI’s potential in organizations and its actual performance,” she explains. “Large language models (LLMs) are impressive, but many enterprises are struggling to translate their use into meaningful business outcomes.”

Similarly, while AI agents can automate tasks and workloads, she explains, they lack understanding of important business context and nuance, and often fall short.

“Without process intelligence, there is no class of data that captures how work gets done that is being given to enterprise AI models,” she notes. “And that means there’s always going to be a ceiling on what they can realistically automate for you until they have that input at hand.”

Fast, impactful AI that drives the right actions and outcomes must be trained with specific performance data from a company’s own process intelligence, not generic industry models, she says.

The key: Powering AI with PI

At Celosphere, its annual user conference in Munich, Celonis announced multiple product innovations and extended partnerships that make it easier for customers to power AI with process intelligence.

The company unveiled AgentC, a suite of tools, integrations and partnerships that enable enterprises to develop AI agents and CoPilots powered by Celonis Process Intelligence or use AI agents pre-built by partners like Rollio and Hypatos.
Organizations can choose to build agents with leading platforms such as Microsoft Copilot Studio, IBM watsonx Orchestrate, Amazon Bedrock Agents and open-source developer environments like CrewAI. Enterprises creating their own agents can benefit from support of expert consulting partners Accenture, EY and IBM.

“Those integrations are crucial,” said Krishnan, “because that’s what’s going to enable people to build these agents with the right data at hand, data that can make sure the agent you build is tailored to your unique business, data that you won’t get anywhere else.”

Celonis Process Intelligence powers AI agents with process data and business context, key to improving processes across systems, departments and organizations. Users of LLM AI fed by process intelligence can now ask conversational questions like those enjoyed by consumers:

“Why is my on-time delivery rate low and how much is it costing us?”
“Give me three recommendations for improving working capital.”
“Which regions are likely to have late deliveries and what can we do about it?”

Early adopters report real value

According to Gartner, the global market for process mining software grew 40% in 2023. Worldwide sales for process automation are expected to reach $26 billion by 2027. Nearly 90% of corporate leaders surveyed by HFS Research plan to increase investments in process intelligence.

A big part of the appeal, Gartner concludes: “Generative AI helps organizations use process mining to uncover hidden patterns, optimize operations and make informed decisions.”

Maureen Fleming, VP for Intelligent Process Automation at IDC, concurred. “Understanding the intricacies of processes and their interdependencies is crucial to achieving effective AI-driven digital transformation.”

Companies deploying AI fed with process intelligence are reporting clear benefits in understanding how their businesses run and how to make them run better.
A sampling from across industries:

Cosentino, a leading manufacturer of design and architectural surfaces, implemented a Celonis-powered AI assistant for credit block management. The assistant helps the team analyze blocked orders within seconds, enabling credit managers to process up to 5x more orders per day without additional risk.

A European packaging company has implemented an agent that allows plant technicians to view spare part inventory levels in nearby plants, enabling them to utilize stock transfers instead of placing orders with suppliers.

A multinational construction material provider employs a similar agent to link inquiries and requests to their corresponding invoices and purchase orders, automating the resolution process with features like auto-responses, ERP updates and internal forwarding.

A global consumer goods company uses an agent to extract payment terms from PDF contracts, compare them against terms in their master data, purchase orders and invoices, and recommend actions to accounts payable clerks to resolve any inconsistencies.

A global car manufacturer has adopted an agent that automatically generates email replies to supplier inquiries, such as questions regarding the status of invoices.

Lastly, a major technology leader plans to implement an agent that enhances the customer funding request process by predicting the likelihood of request rejections and notifying the applicants accordingly.

Building AI agents in-house or on partner platforms

Developing agents, fed with process intelligence, in-house allows enterprises to tailor the agents to their specific processes, workflows and industry nuances. Taking this path can provide tight intellectual property protection by keeping proprietary algorithms and insights within the company. Companies can quickly adjust and improve agents based on immediate feedback and changing needs.
And because internal teams have intimate knowledge of the company’s operations, they can potentially develop more effective AI agents for competitive advantage.

At the same time, bringing in multiple parties to develop AI agents fed by process intelligence also brings numerous advantages: diverse expertise, faster innovation enabled by an ecosystem of developers, greater industry customization, wider scalability and faster continuous improvement from a larger ecosystem.

Celonis provides a foundation for both in-house development and integration of external AI agents, says Krishnan. This allows companies to remain adaptable, choosing the best approach for each specific use case.

Platform innovations on the horizon

Celonis also announced multiple innovations that are being rolled out to enhance scalability, ease of use and overall value realization:

Celonis Data Core, Celocore for short, is a platform enhancement designed to help customers get data into Celonis more quickly and once it’s there,

LinkedIn founder Reid Hoffman unveils ‘super agency’ vision at TED AI conference, takes subtle shot at Elon Musk

Reid Hoffman, the LinkedIn co-founder and prominent tech investor, offered an optimistic vision for artificial intelligence on Tuesday, introducing his concept of “super agency” that frames AI as a tool for human empowerment rather than replacement.

Speaking at a TED AI conference fireside chat with CNBC’s Julia Boorstin in San Francisco, Hoffman previewed themes from his upcoming book on super agency, positioning AI as the next frontier of human capability enhancement.

“If you look back at technology, it actually massively increases human agency,” Hoffman said. “Each of these major technological leaps give us superpowers.” He drew parallels between historical innovations like horses and automobiles and today’s AI systems, which he characterized as “cognitive superpowers.”

AI election risks and regulation: Silicon Valley leader pushes back on concerns

The timing of Hoffman’s messaging appears strategic, coming amid growing anxiety about AI’s impact on jobs and democracy. While acknowledging concerns about job displacement and election misinformation, Hoffman maintained that transition challenges are manageable.

On election integrity, Hoffman downplayed immediate risks from AI-generated deepfakes in the 2024 race, though he acknowledged future concerns. “Undoubtedly, there is some use of AI crime and misinformation… but it doesn’t yet have a significant impact,” he said, suggesting technical solutions like “encryption timestamps” could help authenticate content.

Hoffman also defended California Governor Gavin Newsom’s recent veto of sweeping AI regulation, praising instead the White House’s approach of seeking voluntary commitments from tech companies before implementing specific rules.
“Having essentially vague, uncertain penalties and uncertain evaluations is a very good way to quell the future development of emerging technology,” he argued.

Enterprise AI opportunities: Where startups can still compete with big tech

For enterprise leaders watching AI developments, Hoffman emphasized that despite the dominance of large tech companies in developing foundation models, opportunities remain for startups building applications on top of them. “There’s a massive amount of AI now,” he said, pointing to areas like sales, marketing, and computer security as fertile ground for innovation.

Notably, Hoffman envisioned AI democratizing access to expertise, describing a future where everyone with a phone could access “the equivalent of a GP everywhere in the world.” This vision aligns with growing enterprise interest in AI assistants and automated customer service solutions.

Silicon Valley’s political divide: Tech leaders split on AI policy and regulation

The discussion revealed tensions in Silicon Valley’s political landscape, with Hoffman addressing what Boorstin characterized as a rightward shift among tech leaders. The conversation took a pointed turn when Hoffman appeared to criticize fellow tech leader Elon Musk’s support of Trump, without naming him directly.

When discussing tech leaders’ rightward shift, Hoffman questioned the motives of “some people who are out there campaigning and spreading pretty wild conspiracy theories… not just on x.com but in other places.” He suggested such support might be driven by “self-interested” pursuits like “getting government contracts,” rather than genuine policy convictions.

The veiled reference to Musk, who has pledged millions to Trump’s campaign and frequently posts pro-Trump content on his X platform, highlights growing divisions among Silicon Valley’s elite over the upcoming election.
Hoffman, a prominent Democratic supporter and backer of Vice President Kamala Harris, attributed some of the broader rightward movement to “single issue voters around cryptocurrency” and business interests seeking favorable regulation. He emphasized that a “stable business environment you can invest in is much more important” than pursuing narrow interests like corporate tax cuts.

Future of work and AI’s next chapter

Hoffman’s vision suggests a fundamental shift in how we should think about AI adoption. While much of Silicon Valley frames artificial intelligence as a replacement for human work, his “super agency” concept positions it as an amplifier of human potential.

“Humans not using AI will be replaced by humans using AI,” Hoffman predicted, arguing that the real divide won’t be between humans and machines, but between those who embrace AI’s capabilities and those who don’t.

The stakes of this transition extend far beyond Silicon Valley. As AI capabilities expand, Hoffman’s optimistic vision will be tested against mounting concerns about job displacement and technological control. But his core message is clear: the future belongs not to those who resist AI, but to those who learn to harness it as a tool for human empowerment, even if that means fundamentally rethinking what it means to be human in an AI-enabled world.
