VentureBeat

DataRobot launches Enterprise AI Suite to bridge gap between AI development and business value

As enterprises worldwide pour resources into AI efforts, many struggle to convert their technological investments into measurable business outcomes. That’s the challenge that DataRobot is looking to solve with a series of new product updates announced today.

DataRobot is not new to the AI space; in fact, the company has been in business for 12 years, well before the current generative AI boom. A core focus for the company since inception has been enabling predictive analytics to help improve business outcomes. Like many others in recent years, DataRobot has turned its attention to gen AI support. With the new Enterprise AI Suite, announced today, DataRobot is looking to go further and differentiate itself in an increasingly crowded market. The new integrated platform promises to enable enterprises to start solving business problems with AI out of the box, rather than having to piece together multiple services. The platform is designed to work across multiple cloud environments as well as on-premises, giving customers more flexibility.

The Enterprise AI Suite is a comprehensive platform that helps enterprises build, deploy and manage both predictive and generative AI applications while ensuring proper governance and safety controls. DataRobot’s focus is on creating tangible business value from AI, rather than just providing the technology. “How do you take AI to the next level in terms of value creation? I tell people that customers don’t eat models for breakfast,” Debanjan Saha, CEO of DataRobot, told VentureBeat. “You need to build applications and agents, and not only that, you have to integrate them into their business fabric in order to create value. That’s what this release is all about.”

Addressing the challenges of enterprise AI implementation

According to recent DataRobot research, 90% of AI projects fail to move from prototype to production. “Just training models does not create any enterprise value,” Saha said. The new DataRobot Enterprise AI Suite introduces application templates that provide immediate functionality while maintaining customization flexibility. This approach addresses a common market gap between inflexible off-the-shelf AI applications and resource-intensive custom development. Saha explained that the templates are designed to be horizontal, meaning they can be applied across different industries, rather than being vertically specific. While the templates provide a starting point, enterprises can customize them to their specific needs: changing data sources, adjusting model parameters, modifying the user interface and integrating the applications with other systems in a technology stack.

Unifying predictive and generative AI

A key differentiator for DataRobot’s platform is its unified approach to both traditional predictive AI and gen AI capabilities. The platform allows organizations to extend foundation models with enterprise data while implementing necessary safety controls. DataRobot’s Enterprise AI Suite supports a full retrieval-augmented generation (RAG) pipeline to help extend foundation models like Llama 3 and Gemini with enterprise data. One of the new templates combines both technologies for enhanced business outcomes.
As a potential use case, Saha said an enterprise could, for example, use a predictive model to predict which customers are going to churn, when they are going to churn and why. Data from that predictive model can then be used with a gen AI model to create a hyper-personalized next-best-offer email campaign.

The DataRobot platform includes built-in safeguards for both predictive and generative models. “These models have all sorts of issues with respect to accuracy, with respect to leaking privacy, or private or secure data,” Saha noted. “So there are a whole bunch of guard models that you want to put around them.”

Advanced Agentic AI brings new reasoning to enterprise use cases

Another standout feature in the new DataRobot platform is the integration of AI agent capabilities. The agentic AI approach is designed to help organizations handle complex business queries and workflows. The system employs specialist agents that work together to solve multi-faceted business problems. This approach is particularly valuable for organizations dealing with complex data environments and multiple business systems. “You ask a question to your agentic workflow, it breaks up the questions into a set of more specific questions, and then it routes them to agents which are specialists in various different areas,” Saha explained. For instance, a business analyst’s question about revenue might be routed to multiple specialized agents – one handling SQL queries, another using Python – before combining results into a comprehensive response.

Observability and governance are the keys to enterprise AI success

As part of the DataRobot updates, the company is also rolling out a new observability stack. The new observability capabilities provide detailed insights into AI system performance, especially for RAG implementations. For example, Saha explained that an organization might have a corpus of enterprise data. The organization is using some kind of chunking and embedding model, mapping it to a vector database and then putting an LLM in front of it. What happens if the responses aren’t what the organization expects? That’s where observability fits in. The platform offers advanced visualization and analytical tools to diagnose such issues. “We have put together a lot of instrumentation which lets people visually understand, for example, if you have a lot of clustering of data in the vector database, you can get a spurious answer,” Saha said. “You would be able to see that, if you see your questions are landing in areas where you don’t have enough information.”

This observability extends to the platform’s governance capabilities, with real-time monitoring and intervention features. The system can automatically detect and handle sensitive information, with customizable rules for different scenarios. “We are really excited about what we call AI that makes business sense,” Saha said. “DataRobot has always been very good at focusing on creating business value from AI – it’s not technology for the sake of technology.”
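For technically minded readers, the churn-to-email use case Saha describes above can be sketched roughly as follows. This is a minimal, illustrative Python sketch, not DataRobot's SDK; the churn prediction object, the `llm` client and the offer catalog are hypothetical stand-ins, and the point is only to show how a predictive score and its explanation can feed a generation prompt.

```python
# Illustrative sketch only: DataRobot's actual APIs are not shown here.
# ChurnPrediction, OFFERS and the `llm` client are hypothetical placeholders.
from dataclasses import dataclass

@dataclass
class ChurnPrediction:
    probability: float   # likelihood the customer churns
    horizon_days: int    # when churn is expected
    top_reason: str      # feature driving the prediction

OFFERS = {
    "price_increase": "a 20% loyalty discount",
    "low_usage": "a free onboarding session",
}

def next_best_offer_email(customer_name: str, pred: ChurnPrediction, llm) -> str | None:
    """Turn a predictive churn score into a personalized retention email."""
    if pred.probability < 0.5:
        return None  # only target at-risk customers
    offer = OFFERS.get(pred.top_reason, "a tailored retention offer")
    prompt = (
        f"Write a short, friendly email to {customer_name}, who may cancel "
        f"within {pred.horizon_days} days because of {pred.top_reason}. "
        f"Offer them {offer} and keep the tone helpful, not pushy."
    )
    return llm.generate(prompt)  # hypothetical LLM client call
```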


Microsoft’s new Magentic-One system directs multiple AI agents to complete user tasks

Enterprises looking to deploy multiple AI agents often need to implement a framework to manage them. To this end, Microsoft researchers recently unveiled a new multi-agent infrastructure called Magentic-One that allows a single AI model to power various helper agents that work together to complete complex, multi-step tasks in different scenarios.

Microsoft calls Magentic-One a generalist agentic system that can “fully realize the long-held vision of agentic systems that can enhance our productivity and transform our lives.” The framework is open-source and available to researchers and developers, including for commercial purposes, under a custom Microsoft license. In conjunction with the release of Magentic-One, Microsoft also released an open-source agent evaluation tool called AutoGenBench to test agentic systems, built atop its previously released AutoGen framework for multi-agent communication and cooperation.

The idea behind generalist agentic systems is to figure out how autonomous agents can solve multi-step tasks of the kind found in the day-to-day running of an organization, or even in an individual’s daily life. From the examples Microsoft provided, it looks like the company hopes Magentic-One will handle fairly mundane tasks. Researchers pointed Magentic-One to tasks like describing trends in the S&P 500, finding and exporting missing citations, and even ordering a shawarma.

How Magentic-One works

Magentic-One relies on an Orchestrator agent that directs four other agents. The Orchestrator not only manages the agents, directing them to do specific tasks, but also redirects them if there are errors. The framework is composed of four types of agents other than the Orchestrator:

- WebSurfer agents command a Chromium-based web browser and navigate to websites or perform web searches. They can also click and type, similar to Anthropic’s recently released Computer Use, and summarize content.
- FileSurfer agents read local files, list directories and go through folders.
- Coder agents write code, analyze information from other agents and create new artifacts.
- ComputerTerminal provides a console where the Coder agent’s programs can be executed.

The Orchestrator directs these agents and tracks their progress. It starts by planning how to tackle the task. It creates what Microsoft researchers call a task ledger that tracks the workflow. As the task continues, the Orchestrator builds a progress ledger “where it self-reflects on task progress and checks whether the task is completed.” The Orchestrator can assign an agent to complete each task or update the task ledger. The Orchestrator can create a new plan if the agents remain stuck.

“Together, Magentic-One’s agents provide the Orchestrator with the tools and capabilities that it needs to solve a broad variety of open-ended problems, as well as the ability to autonomously adapt to, and act in, dynamic and ever-changing web and file-system environments,” the researchers wrote in the paper.
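To make the Orchestrator's two-ledger loop more concrete, here is a minimal, illustrative Python sketch. It is not Microsoft's Magentic-One or AutoGen code; the `Agent` objects and the LLM-backed planning call are hypothetical placeholders that only mirror the task-ledger and progress-ledger pattern the researchers describe.

```python
# Illustrative sketch of the orchestrator pattern described above.
# Not the Magentic-One/AutoGen API; `Agent` and `plan_with_llm` are hypothetical.
from typing import Callable

class Agent:
    def __init__(self, name: str, run: Callable[[str], str]):
        self.name, self.run = name, run

def orchestrate(task: str, agents: dict[str, Agent], plan_with_llm, max_rounds: int = 10) -> list[str]:
    # Task ledger: a planned list of steps, each assigned to a named agent.
    task_ledger = plan_with_llm(f"Break this task into steps with an assigned agent: {task}")
    progress_ledger: list[str] = []

    for _ in range(max_rounds):
        step = next((s for s in task_ledger if s["status"] == "pending"), None)
        if step is None:
            break  # every step done: the task is complete
        agent = agents[step["agent"]]
        result = agent.run(step["instruction"])
        progress_ledger.append(f"{agent.name}: {result}")

        if "error" in result.lower():
            # Self-reflection: re-plan when an agent is stuck, as the Orchestrator does.
            task_ledger = plan_with_llm(f"Re-plan {task} given progress: {progress_ledger}")
        else:
            step["status"] = "done"
    return progress_ledger
```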
While Microsoft developed Magentic-One using OpenAI’s GPT-4o — OpenAI is, after all, a Microsoft investment — it is LLM-agnostic, though the researchers “recommend a strong reasoning model for the Orchestrator agent such as GPT-4o.” Magentic-One supports multiple models behind the agents; for example, developers can deploy a reasoning LLM for the Orchestrator agent and a mix of other LLMs or small language models for the different agents. Microsoft’s researchers experimented with a different Magentic-One configuration “using OpenAI o1-preview for the outer loop of the Orchestrator and for the Coder, while other agents continue to use GPT-4o.”

The next step in agentic frameworks

Agentic systems are becoming more popular as more options to deploy agents, from off-the-shelf libraries of agents to customizable organization-specific agents, have arisen. Microsoft announced its own set of AI agents for the Dynamics 365 platform in October. Tech companies are now beginning to compete on AI orchestration frameworks, particularly systems that manage agentic workflows. OpenAI released its Swarm framework, which gives developers a simple yet flexible way to coordinate collaborating agents. CrewAI’s multi-agent builder also offers a way to manage agents. Meanwhile, most enterprises have relied on LangChain to help build agentic frameworks.

However, AI agent deployment in the enterprise is still in its early stages, so figuring out the best multi-agent framework will remain an ongoing experiment. Most AI agents still operate in their own sandboxes instead of talking to agents from other systems. As more enterprises begin using AI agents, managing that sprawl and ensuring AI agents seamlessly hand off work to each other to complete tasks becomes more crucial.


AI’s math problem: FrontierMath benchmark shows how far technology still has to go

Artificial intelligence systems may be good at generating text, recognizing images, and even solving basic math problems—but when it comes to advanced mathematical reasoning, they are hitting a wall. A groundbreaking new benchmark, FrontierMath, is exposing just how far today’s AI is from mastering the complexities of higher mathematics.

Developed by the research group Epoch AI, FrontierMath is a collection of hundreds of original, research-level math problems that require deep reasoning and creativity—qualities that AI still sorely lacks. Despite the growing power of large language models like GPT-4o and Gemini 1.5 Pro, these systems are solving fewer than 2% of the FrontierMath problems, even with extensive support. “We collaborated with 60+ leading mathematicians to create hundreds of original, exceptionally challenging math problems,” Epoch AI announced in a post on X.com. “Current AI systems solve less than 2%.” The goal is to see how well machine learning models can engage in complex reasoning, and so far, the results have been underwhelming.

A Higher Bar for AI

FrontierMath was designed to be much tougher than the traditional math benchmarks that AI models have already conquered. On benchmarks like GSM8K and MATH, leading AI systems now score over 90%, but those tests are starting to approach saturation. One major issue is data contamination—AI models are often trained on problems that closely resemble those in the test sets, making their performance less impressive than it might seem at first glance. “Existing math benchmarks like GSM8K and MATH are approaching saturation, with AI models scoring over 90%—partly due to data contamination,” Epoch AI posted on X.com. “FrontierMath significantly raises the bar.”

In contrast, the FrontierMath problems are entirely new and unpublished, specifically crafted to prevent data leakage. These aren’t the kinds of problems that can be solved with basic memorization or pattern recognition. They often require hours or even days of work from human mathematicians, and they cover a wide range of topics—from computational number theory to abstract algebraic geometry. Mathematical reasoning of this caliber demands more than just brute-force computation or simple algorithms. It requires what Fields Medalist Terence Tao calls “deep domain expertise” and creative insight. After reviewing the benchmark, Tao remarked, “These are extremely challenging. I think that in the near term, basically the only way to solve them is by a combination of a semi-expert like a graduate student in a related field, maybe paired with some combination of a modern AI and lots of other algebra packages.”

The FrontierMath benchmark challenges AI models, with nearly 100% of problems unsolved, compared to much lower difficulty in traditional benchmarks like GSM8K and MATH. (Source: Epoch AI)

Why Is Math So Hard for AI?

Mathematics, especially at the research level, is a unique domain for testing AI. Unlike natural language or image recognition, math requires precise, logical thinking, often over many steps. Each step in a proof or solution builds on the one before it, meaning that a single error can render the entire solution incorrect. “Mathematics offers a uniquely suitable sandbox for evaluating complex reasoning,” Epoch AI posted on X.com.
“It requires creativity and extended chains of precise logic—often involving intricate proofs—that must be meticulously planned and executed, yet allows for objective verification of results.”

This makes math an ideal testbed for AI’s reasoning capabilities. It’s not enough for the system to generate an answer—it has to understand the structure of the problem and navigate through multiple layers of logic to arrive at the correct solution. And unlike other domains, where evaluation can be subjective or noisy, math provides a clean, verifiable standard: either the problem is solved or it isn’t. But even with access to tools like Python, which allows AI models to write and run code to test hypotheses and verify intermediate results, the top models are still falling short. Epoch AI evaluated six leading AI systems, including GPT-4o, Gemini 1.5 Pro, and Claude 3.5 Sonnet, and found that none could solve more than 2% of the problems.

A visualization of interconnected mathematical fields in the FrontierMath benchmark, spanning areas like number theory, combinatorics, and algebraic geometry. (Source: Epoch AI)

The Experts Weigh In

The difficulty of the FrontierMath problems has not gone unnoticed by the mathematical community. In fact, some of the world’s top mathematicians were involved in crafting and reviewing the benchmark. Fields Medalists Terence Tao, Timothy Gowers, and Richard Borcherds, along with International Mathematical Olympiad (IMO) coach Evan Chen, shared their thoughts on the challenge. “All of the problems I looked at were not really in my area and all looked like things I had no idea how to solve,” Gowers said. “They appear to be at a different level of difficulty from IMO problems.”

The problems are designed not just to be hard but also to resist shortcuts. Each one is “guessproof,” meaning it’s nearly impossible to solve without doing the mathematical work. As the FrontierMath paper explains, the problems have large numerical answers or complex mathematical objects as solutions, with less than a 1% chance of guessing correctly without the proper reasoning. This approach prevents AI models from using simple pattern matching or brute-force approaches to stumble upon the right answer. The problems are specifically designed to test genuine mathematical understanding, and that’s why they’re proving so difficult for current systems.

Despite their advanced capabilities, leading AI models like GPT-4o and Gemini 1.5 Pro have solved fewer than 2% of the FrontierMath problems, highlighting significant gaps in AI’s mathematical reasoning. (Source: Epoch AI)

The Long Road Ahead

Despite the challenges, FrontierMath represents a critical step forward in evaluating AI’s reasoning capabilities. As the authors of the research paper note, “FrontierMath represents a significant step toward evaluating whether AI systems possess research-level mathematical reasoning capabilities.” This is no small feat. If AI can eventually solve problems like those in FrontierMath, it could signal a major leap forward in machine intelligence—one that goes beyond mimicking human behavior


Google DeepMind open-sources AlphaFold 3, ushering in a new era for drug discovery and molecular biology

Google DeepMind has unexpectedly released the source code and model weights of AlphaFold 3 for academic use, marking a significant advance that could accelerate scientific discovery and drug development. The surprise announcement comes just weeks after the system’s creators, Demis Hassabis and John Jumper, were awarded the 2024 Nobel Prize in Chemistry for their work on protein structure prediction.

AlphaFold 3 represents a quantum leap beyond its predecessors. While AlphaFold 2 could predict protein structures, version 3 can model the complex interactions between proteins, DNA, RNA, and small molecules — the fundamental processes of life. This matters because understanding these molecular interactions drives modern drug discovery and disease treatment. Traditional methods of studying these interactions often require months of laboratory work and millions in research funding — with no guarantee of success. The system’s ability to predict how proteins interact with DNA, RNA, and small molecules transforms it from a specialized tool into a comprehensive solution for studying molecular biology. This broader capability opens new paths for understanding cellular processes, from gene regulation to drug metabolism, at a scale previously out of reach.

Silicon Valley meets science: The complex path to open-source AI

The timing of the release highlights an important tension in modern scientific research. When AlphaFold 3 debuted in May, DeepMind’s decision to withhold the code while offering limited access through a web interface drew criticism from researchers. The controversy exposed a key challenge in AI research: how to balance open science with commercial interests, particularly as companies like DeepMind’s sister organization Isomorphic Labs work to develop new drugs using these advances. The open-source release offers a middle path. While the code is freely available under a Creative Commons license, access to the crucial model weights requires Google’s explicit permission for academic use. This approach attempts to satisfy both scientific and commercial needs — though some researchers argue it should go further.

Breaking the code: How DeepMind’s AI rewrites molecular science

The technical advances in AlphaFold 3 set it apart. The system’s diffusion-based approach, which works directly with atomic coordinates, represents a fundamental shift in molecular modeling. Unlike previous versions that needed special handling for different molecule types, AlphaFold 3’s framework aligns with the basic physics of molecular interactions. This makes the system both more efficient and more reliable when studying new types of molecular interactions. Notably, AlphaFold 3’s accuracy in predicting protein-ligand interactions exceeds traditional physics-based methods, even without structural input information. This marks an important shift in computational biology: AI methods now outperform our best physics-based models in understanding how molecules interact.

Beyond the lab: AlphaFold 3’s promise and pitfalls in medicine

The impact on drug discovery and development will be substantial. While commercial restrictions currently limit pharmaceutical applications, the academic research enabled by this release will advance our understanding of disease mechanisms and drug interactions.
The system’s improved accuracy in predicting antibody-antigen interactions could accelerate therapeutic antibody development, an increasingly important area in pharmaceutical research. Of course, challenges remain. The system sometimes produces incorrect structures in disordered regions and can only predict static structures rather than molecular motion. These limitations show that while AI tools like AlphaFold 3 advance the field, they work best alongside traditional experimental methods.

The release of AlphaFold 3 represents an important step forward in AI-powered science. Its impact will extend beyond drug discovery and molecular biology. As researchers apply this tool to various challenges — from designing enzymes to developing resilient crops — we’ll see new applications in computational biology. The true test of AlphaFold 3 lies ahead in its practical impact on scientific discovery and human health. As researchers worldwide begin using this powerful tool, we may see faster progress in understanding and treating disease than ever before.


Exclusive: Northflank scores $22.3 million to make cloud infrastructure less of a nightmare for developers

Northflank, a London-based cloud deployment platform, announced $22.3 million in new funding today to help companies ship code faster without wrestling with complex infrastructure. Bain Capital Ventures led the $16 million Series A round, while Vertex Ventures US led an additional $6.3 million seed round.

The startup aims to solve a persistent problem in enterprise software: developers spend too much time configuring infrastructure instead of writing code. Companies currently face an unsatisfying choice between inflexible third-party platforms they quickly outgrow or expensive internal systems requiring large teams to maintain. “Infrastructure has gotten far too complicated, too expensive, and it forces developers to spend less time on actually writing the code that they care about,” said Will Stewart, CEO and co-founder of Northflank, in an exclusive interview with VentureBeat. “Instead, they’re in the weeds fighting YAML and Helm charts all day.”

How Northflank makes Kubernetes actually usable for developers

The company’s platform enables developers to deploy applications, databases, and automated jobs across major cloud providers including AWS, Google Cloud, Microsoft Azure, and Oracle Cloud. Northflank distinguishes itself through a novel approach to Kubernetes, the widely adopted but complex container orchestration system that underpins modern cloud infrastructure. “Northflank has found the right abstraction over Kubernetes, which allows us real-time dashboard, whether it’s GitOps, UI templates to define these applications, databases and pipelines,” Stewart explained. “We liken it to an operating system.”

The results speak for themselves: developers can deploy their first container to production in under five minutes. The platform now handles over 10 billion public egress requests monthly and orchestrates more than 1.3 million container deployments per month.

Northflank’s visual pipeline editor shows how developers can orchestrate complex deployment workflows without writing configuration code, a key feature that sets it apart from traditional cloud platforms. (Credit: Northflank)

From teenage gamers to enterprise cloud infrastructure leaders

Stewart and co-founder Frederik Brix met as teenagers playing online games, where they began deploying game servers using container technologies. This hands-on experience revealed broader applications: “A game server is just a Docker file, just a microservice,” Stewart told VentureBeat. “If you can apply the same automation techniques to any workload, you could enable any software engineer to deploy any workload with the same consistent developer experience.”

The approach has won over notable customers including Sentry, Writer, and Chai Discovery. Some enterprise customers now deploy up to 1,000 microservices in a single project through Northflank’s platform. Slater Stich, partner at Bain Capital Ventures, sees Northflank solving a fundamental problem in enterprise software deployment. “Inside big companies, app deployment is usually a slog,” Stich said. “Before talking with Northflank, I had almost accepted this as a necessary evil. Northflank is different.
By building on top of K8s with the right abstractions, Northflank gives developers a PaaS-like deployment experience while giving platform engineers full control of the underlying infrastructure.”

A view of Northflank’s dashboard shows how the platform simplifies complex cloud deployments through an intuitive interface that monitors containers and deployment status in real time. (Credit: Northflank)

Why enterprise companies are ditching internal developer platforms

Traditional internal developer platforms require 10-25 platform engineers, costing companies up to $3 million annually in personnel alone. Northflank offers consumption-based pricing, charging for resource usage on their infrastructure or taking a percentage of cloud spend when customers use their own cloud accounts. The platform addresses data privacy and regulatory compliance concerns by keeping customer data within their chosen cloud environments. “Customer data runtime is running in the Cloud account of their choice,” Stewart said. “Their data is in their cloud account in the region and the zone that they want to run,” allowing companies to meet various regional data regulations.

The road ahead

Northflank will use the new funding to expand cloud provider support, add regions to their multi-tenant platform, and build out 24/7 enterprise support coverage. The company plans to develop a self-deployable control plane for enterprise customers who need maximum control over their deployment infrastructure. “Our goal is to become the default way that engineering teams deploy and operate software,” Stewart told VentureBeat. “It’s still crazy to me that it’s so complex to deploy cloud infrastructure and applications after 10 years of investment in Kubernetes and the surrounding ecosystem — it’s almost got harder today than it was 10 years ago.”

Kindred Ventures, Tapestry VC, Pebblebed and Uncorrelated Ventures also participated in the funding round, bringing Northflank’s total funding to approximately $25 million since its founding.


Here are 3 critical LLM compression strategies to supercharge AI performance

In today’s fast-paced digital landscape, businesses relying on AI face new challenges: the latency, memory usage and compute costs of running an AI model. As AI advances rapidly, the models powering these innovations have grown increasingly complex and resource-intensive. While these large models have achieved remarkable performance across various tasks, they are often accompanied by significant computational and memory requirements. For real-time AI applications like threat detection, fraud detection, biometric airplane boarding and many others, delivering fast, accurate results becomes paramount. The real motivation for businesses to speed up AI implementations comes not only from saving on infrastructure and compute costs, but also from achieving higher operational efficiency, faster response times and seamless user experiences, which translates into tangible business outcomes such as improved customer satisfaction and reduced wait times.

Two solutions instantly come to mind for navigating these challenges, but they are not without drawbacks. One is to train smaller models, trading off accuracy and performance for speed. The other is to invest in better hardware like GPUs, which can run complex, high-performing AI models at low latency. However, with GPU demand far exceeding supply, this solution will rapidly drive up costs. It also does not address use cases where the AI model needs to run on edge devices like smartphones. Enter model compression techniques: a set of methods designed to reduce the size and computational demands of AI models while maintaining their performance. In this article, we will explore some model compression strategies that will help developers deploy AI models even in the most resource-constrained environments.

How model compression helps

There are several reasons why machine learning (ML) models should be compressed. First, larger models often provide better accuracy but require substantial computational resources to run predictions. Many state-of-the-art models, such as large language models (LLMs) and deep neural networks, are both computationally expensive and memory-intensive. As these models are deployed in real-time applications, like recommendation engines or threat detection systems, their need for high-performance GPUs or cloud infrastructure drives up costs.

Second, latency requirements for certain applications add to the expense. Many AI applications rely on real-time or low-latency predictions, which necessitate powerful hardware to keep response times low. The higher the volume of predictions, the more expensive it becomes to run these models continuously. Additionally, the sheer volume of inference requests in consumer-facing services can make the costs skyrocket. For example, solutions deployed at airports, banks or retail locations will involve a large number of inference requests daily, with each request consuming computational resources. This operational load demands careful latency and cost management to ensure that scaling AI does not drain resources.

However, model compression is not just about costs. Smaller models consume less energy, which translates to longer battery life in mobile devices and reduced power consumption in data centers. This not only cuts operational costs but also aligns AI development with environmental sustainability goals by lowering carbon emissions.
By addressing these challenges, model compression techniques pave the way for more practical, cost-effective and widely deployable AI solutions.

Top model compression techniques

Compressed models can perform predictions more quickly and efficiently, enabling real-time applications that enhance user experiences across various domains, from faster security checks at airports to real-time identity verification. Here are some commonly used techniques to compress AI models.

Model pruning

Model pruning is a technique that reduces the size of a neural network by removing parameters that have little impact on the model’s output. By eliminating redundant or insignificant weights, the computational complexity of the model is decreased, leading to faster inference times and lower memory usage. The result is a leaner model that still performs well but requires fewer resources to run. For businesses, pruning is particularly beneficial because it can reduce both the time and cost of making predictions without sacrificing much in terms of accuracy. A pruned model can be re-trained to recover any lost accuracy, and pruning can be applied iteratively until the required model performance, size and speed are achieved. Techniques like iterative pruning help in effectively reducing model size while maintaining performance.

Model quantization

Quantization is another powerful method for optimizing ML models. It reduces the precision of the numbers used to represent a model’s parameters and computations, typically from 32-bit floating-point numbers to 8-bit integers. This significantly reduces the model’s memory footprint and speeds up inference by enabling it to run on less powerful hardware. The memory and speed improvements can be as large as 4x. In environments where computational resources are constrained, such as edge devices or mobile phones, quantization allows businesses to deploy models more efficiently. It also slashes the energy consumption of running AI services, translating into lower cloud or hardware costs. Typically, quantization is done on a trained AI model and uses a calibration dataset to minimize the loss of performance. In cases where the performance loss is still unacceptable, techniques like quantization-aware training can help maintain accuracy by allowing the model to adapt to this compression during the learning process itself. Additionally, model quantization can be applied after model pruning, further improving latency while maintaining performance.

Knowledge distillation

This technique involves training a smaller model (the student) to mimic the behavior of a larger, more complex model (the teacher). This process often involves training the student model on both the original training data and the soft outputs (probability distributions) of the teacher. This helps transfer not just the final decisions, but also the nuanced “reasoning” of the larger model to the smaller one. The student model learns to approximate the performance of the teacher by focusing on critical aspects of the data, resulting in a lightweight model that retains much of the original’s accuracy but with far fewer computational demands. For businesses, knowledge distillation enables the deployment of smaller, faster models that offer similar results at a fraction of the inference cost. It’s
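As a concrete illustration of the pruning and quantization steps described above, here is a minimal PyTorch sketch. It assumes PyTorch is installed and uses a toy model purely for demonstration; the 30% pruning ratio is an illustrative choice, and a production pipeline would add calibration data, re-training after pruning, and accuracy checks.

```python
# Minimal sketch: magnitude pruning followed by dynamic INT8 quantization in PyTorch.
# The toy model and 30% pruning ratio are illustrative, not recommendations.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(
    nn.Linear(512, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# 1) Pruning: zero out the 30% of weights with the smallest magnitude in each Linear layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")  # make the pruning permanent (bake in the zeros)

# (In practice, the pruned model would be fine-tuned here to recover lost accuracy.)

# 2) Quantization: convert Linear weights from FP32 to INT8 for a smaller, faster model.
quantized_model = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# The quantized model runs on the same inputs, now with reduced memory and latency.
example_input = torch.randn(1, 512)
print(quantized_model(example_input).shape)  # torch.Size([1, 10])
```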


Mistral AI takes on OpenAI with new moderation API, tackling harmful content in 11 languages

French artificial intelligence startup Mistral AI launched a new content moderation API on Thursday, marking its latest move to compete with OpenAI and other AI leaders while addressing growing concerns about AI safety and content filtering.

The new moderation service, powered by a fine-tuned version of Mistral’s Ministral 8B model, is designed to detect potentially harmful content across nine different categories, including sexual content, hate speech, violence, dangerous activities, and personally identifiable information. The API offers both raw text and conversational content analysis capabilities. “Safety plays a key role in making AI useful,” Mistral’s team said in announcing the release. “At Mistral AI, we believe that system level guardrails are critical to protecting downstream deployments.”

Mistral AI’s new moderation API analyzes text across nine categories of potentially harmful content, returning risk scores for each category. (Credit: Mistral AI)

Multilingual moderation capabilities position Mistral to challenge OpenAI’s dominance

The launch comes at a crucial time for the AI industry, as companies face mounting pressure to implement stronger safeguards around their technology. Just last month, Mistral joined other major AI companies in signing the UK AI Safety Summit accord, pledging to develop AI responsibly. The moderation API is already being used in Mistral’s own Le Chat platform and supports 11 languages, including Arabic, Chinese, English, French, German, Italian, Japanese, Korean, Portuguese, Russian, and Spanish. This multilingual capability gives Mistral an edge over some competitors whose moderation tools primarily focus on English content. “Over the past few months, we’ve seen growing enthusiasm across the industry and research community for new LLM-based moderation systems, which can help make moderation more scalable and robust across applications,” the company stated.

Performance metrics showing accuracy rates across Mistral AI’s nine moderation categories, demonstrating the model’s effectiveness in detecting different types of potentially harmful content. (Credit: Mistral AI)

Enterprise partnerships show Mistral’s growing influence in corporate AI

The release follows Mistral’s recent string of high-profile partnerships, including deals with Microsoft Azure, Qualcomm, and SAP, positioning the young company as an increasingly important player in the enterprise AI market. Last month, SAP announced it would host Mistral’s models, including Mistral Large 2, on its infrastructure to provide customers with secure AI solutions that comply with European regulations.

What makes Mistral’s approach particularly noteworthy is its dual focus on edge computing and comprehensive safety features. While companies like OpenAI and Anthropic have focused primarily on cloud-based solutions, Mistral’s strategy of enabling both on-device AI and content moderation addresses growing concerns about data privacy, latency, and compliance. This could prove especially attractive to European companies subject to strict data protection regulations. The company’s technical approach also shows sophistication beyond its years. By training its moderation model to understand conversational context rather than just analyzing isolated text, Mistral has created a system that can potentially catch subtle forms of harmful content that might slip through more basic filters.
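For readers who want a sense of what a call to a moderation endpoint like this typically looks like, below is a small, hedged Python sketch using the requests library. The endpoint path, model name and response fields are assumptions based on Mistral's public documentation at the time of writing and may differ; consult the official API reference before relying on them.

```python
# Hedged sketch of a text-moderation request; the endpoint path, model name and
# response fields are assumptions drawn from Mistral's public docs and may change.
import os
import requests

API_KEY = os.environ["MISTRAL_API_KEY"]

def moderate(text: str) -> dict:
    """Send raw text to the moderation endpoint and return per-category risk scores."""
    response = requests.post(
        "https://api.mistral.ai/v1/moderations",        # assumed endpoint path
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "model": "mistral-moderation-latest",       # assumed model identifier
            "input": [text],
        },
        timeout=30,
    )
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    result = moderate("Example text to screen for harmful content.")
    # The response is expected to contain one entry per input, with flags or scores
    # for categories such as hate speech, violence and personally identifiable information.
    print(result)
```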
The moderation API is available immediately through Mistral’s cloud platform, with pricing based on usage. The company says it will continue to improve the system’s accuracy and expand its capabilities based on customer feedback and evolving safety requirements.

Mistral’s move shows how quickly the AI landscape is changing. Just a year ago, the Paris-based startup didn’t exist. Now it’s helping shape how enterprises think about AI safety. In a field dominated by American tech giants, Mistral’s European perspective on privacy and security might prove to be its greatest advantage.


ByteDance’s AI can make your photos act out movie scenes — but is it too real?

ByteDance has unveiled an artificial intelligence system that can transform any photograph into a convincing video performance, complete with subtle expressions and emotional depth that rival real footage. The Chinese technology giant, known for TikTok, designed its “X-Portrait 2” system to make still images mirror scenes from famous movies — with results so realistic they blur the line between authentic and artificial content.

The system’s demonstrations showcase still photos performing iconic scenes from films like “The Shining,” “Face Off,” and “Fences,” capturing every nuanced expression from the original performances. A single photograph can now display fear, rage, or joy with the same convincing detail as a trained actor, while maintaining the original person’s identity and characteristics. This breakthrough arrives at a crucial moment. As society grapples with digital misinformation and the aftermath of the U.S. presidential election, X-Portrait 2’s ability to create indistinguishable-from-reality videos from any photograph raises serious concerns.

Previous AI animation tools produced obviously artificial results with mechanical movements. But ByteDance’s new system captures the natural flow of facial muscles, subtle eye movements, and complex expressions that make human faces uniquely expressive. ByteDance achieved this realism through an innovative approach. Instead of tracking specific points on a face — the standard method used by most animation software — the system observes and learns from complete facial movements. Where older systems created expressions by connecting dots, X-Portrait 2 captures the fluid motion of an entire face, even during rapid speech or when viewed from different angles.

X-Portrait 2 demonstrates its versatility across different visual styles. A driving photo (top left) can be transformed to match another person’s expression (top right), while the same technology can generate both anime-style illustrations (bottom left) and painterly portraits (bottom right), all maintaining consistent facial expressions. (Credit: ByteDance)

TikTok’s billion-user database: The secret behind ByteDance’s AI breakthrough

ByteDance’s advantage stems from its unique position as owner of TikTok, which processes over a billion user-generated videos daily. This massive collection of facial expressions, movements, and emotions provides training data at a scale unavailable to most AI companies. While competitors rely on limited datasets or synthetic data, ByteDance can fine-tune its AI models using real-world expressions captured across diverse faces, lighting conditions, and camera angles.

The release of X-Portrait 2 coincides with ByteDance’s expansion of AI research beyond China. The company is establishing new research centers in Europe, with potential locations in Switzerland, the UK, and France. A planned $2.13 billion AI center in Malaysia and collaboration with Tsinghua University suggest a strategy to build AI expertise across multiple continents. This global research push comes at a critical moment. While ByteDance faces regulatory scrutiny in Western markets — including Canada’s recent order for TikTok to cease operations and ongoing U.S. debates about restrictions — the company continues to advance its technical capabilities.
Hollywood’s next revolution: How AI could replace million-dollar motion capture

The implications for the animation industry extend beyond technical achievements. Major studios currently spend millions on motion capture equipment and employ hundreds of animators to create realistic facial expressions. X-Portrait 2 suggests a future where a single photographer and a reference video could replace much of this infrastructure. This shift arrives amid growing debate about AI-generated content and digital rights. While competitors have rushed to release their code publicly, ByteDance has kept X-Portrait 2’s implementation private — a decision that reflects increasing awareness of how AI tools can be misused to create unauthorized performances or misleading content.

ByteDance’s focus on human movement and expression marks a distinct path from other AI companies. While firms like OpenAI and Anthropic concentrate on language processing, ByteDance builds on its core strength: understanding how people move and express themselves on camera. This specialization emerges directly from TikTok’s years of analyzing dance trends and facial expressions. This emphasis on human motion could prove more significant than current market analysis suggests. As work and socializing increasingly move into virtual spaces, technology that accurately captures and transfers human emotion becomes crucial. ByteDance’s advances position it to influence how people will interact in digital environments, from business meetings to entertainment.

AI security concerns: When digital faces need digital locks

The October dismissal of a ByteDance intern for allegedly interfering with AI model training highlighted an often-overlooked aspect of AI development: internal security. As models become more sophisticated, protecting them from tampering grows increasingly critical. The technology arrives as demand for AI-generated video content rises across entertainment, education, and business communication. While X-Portrait 2 demonstrates significant technical progress in maintaining consistent identity while transferring nuanced expressions, it also raises questions about authentication and verification of AI-generated content.

As Western governments scrutinize Chinese technology companies, ByteDance’s advances in AI animation present a complex reality: innovation knows no borders, and the future of how we interact online may be shaped by technologies developed far from Silicon Valley.


Introducing Narrative Command, the new business thesis that helps explain the 2024 election

In late September, angel investor Alex Roy, a former colleague of mine at the defunct self-driving car startup Argo AI, published a piece on the website of his newly launched boutique deep tech VC firm, New Industry VC, entitled “Narrative Command.” Roy’s piece made the rounds among his followers on X and was shared favorably by other tech investors and founders, and for good reason: in it, Roy elucidates a concept that recasts why a startup is ultimately successful or not. Communication — and specifically, the narrative startups offer about themselves, their industry, and their place in it — is intrinsic to the success of the business, alongside “Operational Mastery,” or a “disciplined approach of addressing risks in structured stages.”

As Roy states: “Great storytelling isn’t art, it’s math. It’s the sum of hook, anticipation, and resolution, multiplied by the skill of the storyteller. But even great storytelling is worthless without story-audience fit, which requires the right story, at the right time, heard by the right audience.”

While polls before the 2024 election suggested it would be close, it ended up being a “red wave” that handily elected former President Donald J. Trump to his second, non-consecutive term. Roy observed on X that the election result, and specifically Trump campaign backer Elon Musk’s desired outcome of getting his preferred candidate elected, “wasn’t luck. It was many things. Also, Narrative command is self-sustaining.”

I called Roy up earlier today to discuss Narrative Command and what impact it may have had on Musk’s role in the election and Trump’s victory, as well as how business leaders, entrepreneurs and founders can apply it themselves. He summarized: “Narrative command is the concept that in every new market there is a startup that defines a vision of the future that becomes the default for that vertical.” The following is a video of our conversation and an edited (for clarity) transcript.

Carl Franzen, VentureBeat: Alex, you and I spoke because you launched a new company called NIVC, which invests in deep tech hardware startups. And part of your VC’s differentiation from others in the field is that you apply something called narrative command. You wrote a great piece a number of weeks ago when you launched your new company. We’ll obviously put a link to narrative command so that people can read it. But I guess just in a high-level view, how would you summarize narrative command?

Roy: Narrative command is the concept that in every new market there is a startup that defines a vision of the future… which becomes the default future for that vertical. They define the language of the vertical, forcing everyone else to use that language. They define the seminal experience or outcome, and then give audiences or customers a taste of that experience. Once one defines, or seizes, narrative command for a new vertical, competitors, whether they are pre-existing or new, must live inside the narrative and discourse that you have created. Taken to its logical conclusion, it becomes self-sustaining, where stakeholders, fans, customers, allies, investors perpetuate the narrative. And the best example of this is, of course, Tesla, which possesses narrative command of both electric and autonomous vehicles. And yet its reality command does not really meet its narrative — not taking anything from Tesla at all.
Narrative command is an essential component of any startup’s success in the 21st century, which brings us to our discussion today of whether or not it can be applied to other things: mature markets and politics.

Franzen: Yeah, so that’s a super interesting distinction. I’m really glad you pointed that out. I think the temptation would be to apply narrative command — especially for me: I’m a journalist, we’ve worked together before, and I’m interested in storytelling, both fictional and non-fictional. The idea that a single company’s narrative, the story that it tells about itself to an audience, can define not only it and its customers’ experience but also the entire market, and then solidify its place within it as a leader, is a really cool and compelling idea. And I think that’s partially why your narrative command essay, which you published initially a few weeks ago, went viral to the extent that it could in the midst of our election, and it was so compelling, you and I started talking about it back then.

But today, I think, we’re speaking on November 6, 2024, the Wednesday, the day after the US presidential election. So, Donald Trump has been declared the winner already. Based on a bunch of the reporting that’s come out from the states, the early vote totals, it seems that he’s about four million votes ahead and has all the electoral votes necessary to reassume the presidency. On the one hand, we don’t weigh too much into politics usually at VentureBeat, but on the other hand, to your point, Elon Musk, CEO of Tesla (although I think he uses a different title now) and also an owner of X, the social network, was a very active participant in this election on the side of Donald Trump, donating through his political action committee, personally appearing at Trump events and speaking on behalf of Trump and also urging his followers and the entire electorate of the United States to vote for Trump.

And as it turns out, once again, Musk, who many criticize and doubt — I’ve had my own disagreements or issues with his positions — once again proves the naysayers wrong and is able to get his preferred candidate elected. So, you did post, I think recently on X, that the real lesson isn’t the election. The real lesson is whether or not the Democratic party will learn from it. And this was in regards to Biden’s failure to invite Elon Musk to the 2021 White House Electric Vehicle Summit. Is this an example of narrative command


Runway goes 3D with new AI video camera controls for Gen-3 Alpha Turbo

As the AI video wars continue to rage, with new, realistic video-generating models being released on a near-weekly basis, early leader Runway isn’t ceding any ground in terms of capabilities. Rather, the New York City-based startup — funded to the tune of $100M+ by Google and Nvidia, among others — is deploying even more new features that help set it apart. Today, for instance, it launched a powerful new set of advanced AI camera controls for its Gen-3 Alpha Turbo video generation model. Now, when users generate a new video from text prompts, uploaded images, or their own video, they can also control how the AI-generated effects and scenes play out much more granularly than with a random “roll of the dice.”

Advanced Camera Control is now available for Gen-3 Alpha Turbo. Choose both the direction and intensity of how you move through your scenes for even more intention in every shot. (1/8) pic.twitter.com/jRE6pC9ULn — Runway (@runwayml) November 1, 2024

Instead, as Runway shows in a thread of example videos uploaded to its X account, the user can actually zoom in and out of their scene and subjects, preserving even the AI-generated character forms and setting behind them, realistically putting them and their viewers into a fully realized, seemingly 3D world — like they are on a real movie set or on location. As Runway CEO Cristóbal Valenzuela wrote on X, “Who said 3D?” This is a big leap forward in capabilities. Even though other AI video generators and Runway itself previously offered camera controls, they were relatively blunt, and the way in which they generated a resulting new video was often seemingly random and limited — trying to pan up or down or around a subject could sometimes deform it, turn it 2D, or result in strange deformations and glitches.

What you can do with Runway’s new Gen-3 Alpha Turbo Advanced Camera Controls

The Advanced Camera Controls include options for setting both the direction and intensity of movements, providing users with nuanced capabilities to shape their visual projects. Among the highlights, creators can use horizontal movements to arc smoothly around subjects or explore locations from different vantage points, enhancing the sense of immersion and perspective. For those looking to experiment with motion dynamics, the toolset allows for the combination of various camera moves with speed ramps. This feature is particularly useful for generating visually engaging loops or transitions, offering greater creative potential. Users can also perform dramatic zoom-ins, navigating deeper into scenes with cinematic flair, or execute quick zoom-outs to introduce new context, shifting the narrative focus and providing audiences with a fresh perspective.

The update also includes options for slow trucking movements, which let the camera glide steadily across scenes. This provides a controlled and intentional viewing experience, ideal for emphasizing detail or building suspense. Runway’s integration of these diverse options aims to transform the way users think about digital camera work, allowing for seamless transitions and enhanced scene composition. These capabilities are now available for creators using the Gen-3 Alpha Turbo model. To explore the full range of Advanced Camera Control features, users can visit Runway’s platform at runwayml.com.
While we haven’t yet tried the new Runway Gen-3 Alpha Turbo model, the videos showing its capabilities indicate a much higher level of precision in control and should help AI filmmakers — including those from major legacy Hollywood studios such as Lionsgate, with whom Runway recently partnered — to realize major motion picture-quality scenes more quickly, affordably, and seamlessly than ever before.

Asked by VentureBeat over direct message on X if Runway had developed a 3D AI scene generation model — something currently being pursued by other rivals from China and the U.S., such as Midjourney — Valenzuela responded: “world models :-).”

Runway first mentioned it was building AI models designed to simulate the physical world back in December 2023, nearly a year ago, when co-founder and chief technology officer (CTO) Anastasis Germanidis posted on the Runway website about the concept, stating: “A world model is an AI system that builds an internal representation of an environment, and uses it to simulate future events within that environment. Research in world models has so far been focused on very limited and controlled settings, either in toy simulated worlds (like those of video games) or narrow contexts (such as developing world models for driving). The aim of general world models will be to represent and simulate a wide range of situations and interactions, like those encountered in the real world.”

As evidenced in the new camera controls unveiled today, Runway is well along on its journey to build such models and deploy them to users.
