VentureBeat

IBM debuts open source Granite 3.0 LLMs for enterprise AI

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More

Make no mistake about it, enterprise AI is big business, especially for IBM. IBM already has a $2 billion book of business related to generative AI and it’s now looking to accelerate that growth.

IBM is expanding its enterprise AI business today with the launch of the third generation of Granite large language models (LLMs). A core element of the new generation is the continued focus on real open source enterprise AI. Going a step further, IBM is ensuring that models can be fine-tuned for enterprise AI, with its InstructLab capabilities.

The new models announced today include general-purpose Granite 3.0 options in 2 billion and 8 billion parameter sizes. There are also Mixture-of-Experts (MoE) models that include Granite 3.0 3B A800M Instruct, Granite 3.0 1B A400M Instruct, Granite 3.0 3B A800M Base and Granite 3.0 1B A400M Base. Rounding out the update, IBM also has a new group of models with optimized guardrail and safety options: Granite Guardian 3.0 8B and Granite Guardian 3.0 2B. The new models will be available on IBM’s watsonx service, as well as on Amazon Bedrock, Amazon SageMaker and Hugging Face.

“As we mentioned on our last earnings call, the book of business that we’ve built on generative AI is now $2 billion plus across technology and consulting,” Rob Thomas, senior vice-president and chief commercial officer at IBM, said during a briefing with press and analysts. “As I think about my 25 years in IBM, I’m not sure we’ve ever had a business that has scaled at this pace.”

How IBM is looking to advance enterprise AI with Granite 3.0

Granite 3.0 introduces a range of sophisticated AI models tailored for enterprise applications. IBM expects that the new models will help to support a range of enterprise use cases including customer service, IT automation, Business Process Outsourcing (BPO), application development and cybersecurity.
The new Granite 3.0 models were trained by IBM’s centralized data model factory team that is responsible for sourcing and curating the data used for training.

Dario Gil, senior vice president and director of IBM Research, explained that the training process involved 12 trillion tokens of data, including both language data across multiple languages as well as code data. He emphasized that the key differences from previous generations were the quality of the data and the architectural innovations used in the training process.

Thomas added that what’s also important to recognize is where the data comes from.

“Part of our advantage in building models is data sets that we have that are unique to IBM,” Thomas said. “We have a unique, I’d say, vantage point in the industry, where we become the first customer for everything that we build that also gives us an advantage in terms of how we construct the models.”

IBM claims high performance benchmarks for Granite 3.0

According to Gil, the Granite models have achieved remarkable results on a wide range of tasks, outperforming the latest versions of models from Google, Anthropic and others.

“What you’re seeing here is incredibly highly performant models, absolutely state of the art, and we’re very proud of that,” Gil said.

But it’s not just raw performance that sets Granite apart. IBM has also placed a strong emphasis on safety and trust, developing advanced “Guardian” models that can be used to prevent the core models from being jailbroken or producing harmful content.

The various model size options are also a critical element.

“We care so deeply, and we’ve learned a lesson from scaling AI, that inference cost is essential,” Gil noted.
“That is the reason why we’re so focused on the size of the category of models, because it has the blend of performance and inference cost that is very attractive to scale use cases in the enterprise.”

Why real open source matters for enterprise AI

A key differentiator for Granite 3.0 is IBM’s decision to release the models under the Open Source Initiative (OSI) approved Apache 2.0 open-source license. There are many other open models in the market, such as Meta’s Llama, that are not in fact available under an OSI-approved license. That’s a distinction that matters to some enterprises.

“We decided that we’re going to be absolutely squeaky clean on that, and decided to do an Apache 2 license, so that we give maximum flexibility to our enterprise partners to do what they need to do with the technology,” Gil explained.

The permissive Apache 2.0 license allows IBM’s partners to build their own brands and intellectual property on top of the Granite models. This helps foster a robust ecosystem of solutions and applications powered by the Granite technology.

“It’s completely changing the notion of how quickly businesses can adopt AI when you have a permissive license that enables contribution, enables community and ultimately, enables wide distribution,” Thomas said.

Looking beyond generative AI to generative computing

Looking forward, IBM is thinking about the next major paradigm shift, something that Gil referred to as generative computing.

In essence, generative computing refers to the ability to program computers by providing examples or prompts, rather than explicitly writing out step-by-step instructions. This aligns with the capabilities of LLMs like Granite, which can generate text, code, and other outputs based on the input they receive.

“This paradigm where we don’t write the instructions, but we program the computer, by example, is fundamental, and we’re just beginning to touch what that feels like by interacting with LLMs,” Gil said.
“You are going to see us invest and go very aggressively in a direction where with this paradigm of generative computing, we’re going to be able to implement the next generation of models, agentic frameworks and much more than that, it’s a fundamental new way to program computers as a consequence of the Gen AI revolution.”

Salesforce CEO Marc Benioff slams Microsoft Copilot as ‘Clippy 2.0’

Fighting words? Salesforce co-founder and CEO Marc Benioff took to his personal X account last night to criticize Microsoft’s AI assistant Copilot as “disappointing,” saying “It just doesn’t work, and it doesn’t deliver any level of accuracy,” before ultimately concluding “Copilot is more like Clippy 2.0,” with a shrug emoji.

“Clippy” of course is the popular nickname for Microsoft’s Clippit virtual on-screen Word and Office conversational assistant that debuted in 1996. While now looked upon with some ironic fondness for its cute expressions and large eyes, in the mid 1990s when it premiered, it was quickly found by many users to be more annoying than helpful, popping up while they tried to do tasks on their Microsoft software and offering unhelpful suggestions.

Copilot, a text-based chatbot assistant powered by GPT models from OpenAI (a Microsoft partner and investment), was initially designed for Microsoft’s Office 365 and debuted in March 2023. It later expanded to include a web and mobile app version as well (and was the new name given to Microsoft’s GPT-powered Bing Chat). It was recently redesigned and upgraded to include many new features such as vision (watching and reacting to a user’s screen activity) and humanlike conversational voice input and output.

A loaded critique

Benioff’s critique is of course loaded and inherently biased, coming as he does from a rival software company. Salesforce’s signature customer relationship management (CRM) software competes directly with Microsoft Dynamics 365, as does the Salesforce-owned Slack with Microsoft Teams, and both companies have spent the two years since OpenAI’s debut of ChatGPT launching various new AI features, assistants, applications, and tools.
Yet curiously, Benioff, an early executive to embrace the power and potential of AI (at least publicly), has lately been criticizing the gen AI era more broadly.

On Sunday, Benioff posted on X that he thought “much of AI’s current potential is simply oversold,” and that “AI isn’t yet curing cancer or solving climate change as pundits claim,” yet provided no evidence for these claims.

It’s a curious and contradictory tone for him to strike given he also recently told Fast Company he has “never been more excited about anything at Salesforce, maybe in my career,” referring to Agentforce, his company’s new enterprise AI agent builder tool.

Clearly, the founder is trying to thread a nuanced line of argument here: AI has potential for businesses, Microsoft’s implementation of it doesn’t work well or provide enough value, and, presumably, Salesforce’s implementation is superior. We’ll see if customers buy it.

For now, some of the “pundits” he may be railing against, such as public relations expert Ed Zitron, have already seized on some of Benioff’s AI-critical remarks as evidence the pro-gen AI narrative more generally is starting to turn.

Small but mighty: H2O.ai’s new AI models challenge tech giants in document analysis

H2O.ai, a provider of open-source AI platforms, announced today two new vision-language models designed to improve document analysis and optical character recognition (OCR) tasks.

The models, named H2OVL Mississippi-2B and H2OVL Mississippi-0.8B, show competitive performance against much larger models from major tech companies, potentially offering a more efficient solution for businesses dealing with document-heavy workflows.

David vs. Goliath: How H2O.ai’s tiny models are outsmarting tech giants

The H2OVL Mississippi-0.8B model, with only 800 million parameters, surpassed all other models, including those with billions more parameters, on the OCRBench Text Recognition task. Meanwhile, the 2-billion parameter H2OVL Mississippi-2B model demonstrated strong general performance across a range of vision-language benchmarks.

“We’ve designed H2OVL Mississippi models to be a high-performance yet cost-effective solution, bringing AI-powered OCR, visual understanding, and Document AI to businesses,” Sri Ambati, CEO and Founder of H2O.ai said in an exclusive interview with VentureBeat. “By combining advanced multimodal AI with efficiency, H2OVL Mississippi delivers precise, scalable Document AI solutions across a range of industries.”

The release of these models marks a significant step in H2O.ai’s strategy to make AI technology more accessible. By making the models freely available on Hugging Face, a popular platform for sharing machine learning models, H2O.ai is allowing developers and businesses to modify and adapt the models for specific document AI needs.

Figure caption: H2O.ai’s new H2OVL Mississippi-0.8B model (far right, in yellow) outperforms larger models from tech giants in text recognition tasks on the OCRBench dataset, demonstrating the potential of smaller, more efficient AI models for document analysis.
(Credit: H2O.ai)

Efficiency meets effectiveness: A new approach to document processing

Ambati highlighted the economic advantages of smaller, specialized models. “Our approach to generative pre-trained transformers stems from our deep investment in Document AI, where we collaborate with customers to extract meaning from enterprise documents,” he said. “These models can run anywhere, on a small footprint, efficiently and sustainably, allowing fine-tuning on domain-specific images and documents at a fraction of the cost.”

The announcement comes as businesses seek more efficient ways to process and extract information from large volumes of documents. Traditional OCR and document analysis methods often struggle with poor-quality scans, challenging handwriting, or heavily modified documents. H2O.ai’s new models aim to address these issues while offering a more resource-efficient alternative to larger language models that may be excessive for specific document-related tasks.

Industry analysts note that H2O.ai’s approach could disrupt the current landscape dominated by tech giants. By focusing on smaller, more specialized models, H2O.ai may be able to capture a significant portion of the enterprise market that values efficiency and cost-effectiveness.

Figure caption: A comparison of average scores on eight single image benchmarks shows H2O.ai’s new H2OVL Mississippi-2B model (in yellow) outperforming several competitors, including offerings from Microsoft and Google. The model trails only Qwen2 VL-2B in overall performance among similarly sized vision-language models. (Credit: H2O.ai)

Open source and enterprise-ready: H2O.ai’s strategy for AI adoption

“At H2O.ai, making AI accessible isn’t just an idea. It’s a movement,” Ambati told VentureBeat.
“By releasing a series of small foundational models that can be easily fine-tuned to specific tasks, we are expanding the possibilities for creating and using AI.”

H2O.ai has raised $256 million from investors including Commonwealth Bank, Nvidia, Goldman Sachs, and Wells Fargo. The company’s open-source approach and focus on practical, enterprise-ready AI solutions have helped it build a community of over 20,000 organizations and more than half of the Fortune 500 companies as customers.

As businesses continue to grapple with digital transformation and the need to extract value from unstructured data, H2O.ai’s new vision-language models could provide a compelling option for those looking to implement document AI solutions without the computational overhead of larger models. The true test will be in real-world applications, but H2O.ai’s demonstration of competitive performance with much smaller models suggests a promising direction for the future of enterprise AI.

Perplexity lets you search your internal enterprise files and the web

Enterprises can use their Perplexity dashboards to search for internal information and combine it with knowledge from the internet, but this will be limited to specific files they deem important.

Perplexity’s new Internal Knowledge Search lets Perplexity Pro and Enterprise Pro users search for information across the web or their internal databases. Customers can access both knowledge bases in one consolidated platform.

However, internal knowledge bases will be limited to the files Perplexity users upload to the platform. Frank te Pas, head of Enterprise product at Perplexity, told VentureBeat in an interview that Internal Knowledge Search will only look for information on files users have uploaded, not entire internal databases.

“We believe this lets people bring only their most important and valuable data to the table and not the 90% of low-value files they normally sift through,” he said. “Customers told us they want to use information that’s important to them, which makes their own data even more valuable.”

Users have a file upload limit (500 for Enterprise Pro users), but te Pas said this may be expanded. Customers can also upload files directly from folders in all the popular document formats like Excel sheets, Word documents or PDFs.

Still, the company believes Internal Knowledge Search will improve many enterprise functions.

Perplexity CEO Aravind Srinivas said research using both internal and external information used to require two separate products: one platform to search the internet and another to access internal documents and data.

“Being able to carry out all your research, across both internal and external sources of data, in one consolidated knowledge platform will unlock tremendous productivity gains for every enterprise,” Srinivas said in a blog post.
Srinivas (@AravSrinivas) also announced the launch on X on October 17, 2024: “Today, we’re launching Perplexity for Internal Search: one tool to search over both the web and your team’s files with multi-step reasoning and code execution.”

Perplexity gave customers like Nvidia, Databricks, Dell, Bridgewater, Latham & Watkins, Fortune and Lambda early access to the feature. During the early access testing, the company said customers used the Internal Search feature to conduct due diligence by combining internal research notes and news from the web, combine older sales materials with more current insights for proposal requests, help employees find benefit information and get product roadmap feedback based on best practices from the internet.

Perplexity will also label data sources if the information was from a website or uploaded files so that the user can dive deeper later.

In April, Perplexity launched Enterprise Pro, a paid tier of the Perplexity AI chat and search platform. The subscription offers SOC2 certification, single sign-on, user management, file upload alerts and query deletion after a week.

Make space for Spaces

Perplexity also announced Spaces, a way for teams to share and organize research.

Spaces will allow users to share files across a team and customize Perplexity’s AI assistant with specific instructions and responses based on their data. The company said customers will also get full control over who gets to access their information.

Specific to Perplexity Enterprise Pro, all files and searches on Spaces “are excluded from AI quality training by default.” Pro customers have to voluntarily opt out of AI training. Perplexity also promises to provide the “highest levels of safety and privacy.”

Perplexity plans to add third-party data integration with Crunchbase and FactSet so Enterprise Pro users with subscriptions to those services can add data to their Spaces.
“This will allow you to expand your knowledge base even further with the ability to search across the public web, internal files, and proprietary data sets,” the company said.

Te Pas said that bringing in third-party databases like Crunchbase and FactSet means customers with subscriptions can also bring their personalized search queries on those platforms to Perplexity. For example, if a customer created a list of sectors to watch on either database, they can access that through a Perplexity search.

Enterprise RAG is not going away soon

Te Pas said Internal Knowledge Search and Spaces are a form of retrieval augmented generation, or RAG, where users can ground a search in their own internal data.

RAG systems normally sift through databases to find the most relevant answers to queries contained within those files. Most RAG systems use large knowledge repositories, as most enterprises that want to query their own data have an extensive library of information. Occasionally, a company may deploy different RAG use cases, like a real-time information retrieval system or search-only information for a specific unit. Perplexity’s version of RAG still searches a database, except that database is one built on Perplexity’s platform by users who uploaded their documents to it.

Perplexity has to compete with companies like Glean and Elastic, which have been offering RAG platforms for enterprises for a while. Glean launched its AI search chat platform, Glean Chat, which lets enterprises query their own data, last year.

Perplexity has increasingly taken traffic share from more traditional search engines like Google. Perplexity also has a revenue-sharing program with some partners, mostly media companies, whose links appear on Perplexity searches.
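The retrieval step described above can be illustrated with a minimal sketch. It is not Perplexity’s implementation, which the article does not detail: the documents, source labels, and helper names (`tokenize`, `cosine`, `retrieve`, `build_prompt`) are all hypothetical, and plain word-count cosine similarity stands in for whatever ranking a production system would use. The sketch ranks a small set of uploaded files and web snippets against a query, then builds a prompt in which every passage carries its source label.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, documents, k=2):
    """Return up to k documents that share terms with the query, best match first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d["text"]))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:k] if score > 0]

def build_prompt(query, hits):
    """Attach a source label to every retrieved passage so the answer can cite it."""
    context = "\n".join(f"[{d['source']}] {d['text']}" for d in hits)
    return f"Answer using only these sources and cite them:\n{context}\n\nQuestion: {query}"

# Hypothetical knowledge base: two uploaded files and one web snippet.
DOCS = [
    {"source": "file:benefits.pdf", "text": "Employees receive 20 days of paid vacation per year."},
    {"source": "file:roadmap.docx", "text": "The product roadmap prioritizes search latency in Q1."},
    {"source": "web:example.com/news", "text": "Industry news: vacation policies vary widely across companies."},
]
QUERY = "How many vacation days do employees get?"

hits = retrieve(QUERY, DOCS)
prompt = build_prompt(QUERY, hits)
print(prompt)
```

The prompt would then be passed to a language model. Keeping the source label next to each passage is what makes it possible to tell the user whether an answer came from a website or an uploaded file.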

Arch-Function LLMs promise lightning-fast agentic AI for complex enterprise workflows

Enterprises are bullish on agentic applications that can understand user instructions and intent to perform different tasks in digital environments. It’s the next wave in the age of generative AI, but many organizations still struggle with low throughput from their models. Today, Katanemo, a startup building intelligent infrastructure for AI-native applications, took a step to solve this problem by open-sourcing Arch-Function. This is a collection of state-of-the-art large language models (LLMs) promising ultra-fast speeds at function-calling tasks critical to agentic workflows.

But, just how fast are we talking about here? According to Salman Paracha, the founder and CEO of Katanemo, the new open models are nearly 12 times faster than OpenAI’s GPT-4. They even outperform offerings from Anthropic, all while delivering significant cost savings.

The move can easily pave the way for super-responsive agents that could handle domain-specific use cases without burning a hole in the businesses’ pockets. According to Gartner, by 2028, 33% of enterprise software tools will use agentic AI, up from less than 1% at present, enabling 15% of day-to-day work decisions to be made autonomously.

What exactly does Arch-Function bring to the table?

A week ago, Katanemo open-sourced Arch, an intelligent prompt gateway that uses specialized (sub-billion) LLMs to handle all critical tasks related to the handling and processing of prompts. This includes detecting and rejecting jailbreak attempts, intelligently calling “backend” APIs to fulfill the user’s request and managing the observability of prompts and LLM interactions in a centralized way.

The offering allows developers to build fast, secure and personalized gen AI apps at any scale.
Now, as the next step in this work, the company has open-sourced some of the “intelligence” behind the gateway in the form of Arch-Function LLMs.

As the founder puts it, these new LLMs – built on top of Qwen 2.5 with 3B and 7B parameters – are designed to handle function calls, which essentially allows them to interact with external tools and systems for performing digital tasks and accessing up-to-date information.

Using a given set of natural language prompts, the Arch-Function models can understand complex function signatures, identify required parameters and produce accurate function call outputs. This allows them to execute any required task, be it an API interaction or an automated backend workflow. This, in turn, can enable enterprises to develop agentic applications.

“In simple terms, Arch-Function helps you personalize your LLM apps by calling application-specific operations triggered via user prompts. With Arch-Function, you can build fast ‘agentic’ workflows tailored to domain-specific use cases – from updating insurance claims to creating ad campaigns via prompts. Arch-Function analyzes prompts, extracts critical information from them, engages in lightweight conversations to gather missing parameters from the user, and makes API calls so that you can focus on writing business logic,” Paracha explained.

Speed and cost are the biggest highlights

While function calling is not a new capability (many models support it), how effectively Arch-Function LLMs handle it is the highlight. According to details shared by Paracha on X, the models beat or match frontier models, including those from OpenAI and Anthropic, in terms of quality but deliver significant benefits in terms of speed and cost savings.

For instance, compared to GPT-4, Arch-Function-3B delivers approximately 12x throughput improvement and massive 44x cost savings. Similar results were also seen against GPT-4o and Claude 3.5 Sonnet.
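To make the function-calling flow described above concrete, here is a minimal sketch of the application side: checking a model-produced call against a function signature, asking for missing parameters, and dispatching the call. It does not use Arch or the Arch-Function models. The `update_claim` function, the `TOOLS` registry, and the JSON shape of the model output are all assumptions made for illustration.

```python
import json

# Hypothetical application function the model may call (not part of Arch).
def update_claim(claim_id: str, status: str) -> dict:
    return {"claim_id": claim_id, "status": status, "updated": True}

# Hypothetical registry: each callable function with its required parameters.
TOOLS = {
    "update_claim": {
        "handler": update_claim,
        "required": ["claim_id", "status"],
    }
}

def dispatch(model_output: str) -> dict:
    """Validate a model-produced function call, then run it or ask for missing parameters."""
    call = json.loads(model_output)
    tool = TOOLS.get(call.get("name"))
    if tool is None:
        return {"error": f"unknown function: {call.get('name')}"}
    args = call.get("arguments", {})
    missing = [p for p in tool["required"] if p not in args]
    if missing:
        # A real system would turn this into a follow-up question for the user.
        return {"ask_user_for": missing}
    return {"result": tool["handler"](**args)}

# Assumed model outputs: one complete call and one with a parameter missing.
complete = '{"name": "update_claim", "arguments": {"claim_id": "C-42", "status": "approved"}}'
partial = '{"name": "update_claim", "arguments": {"claim_id": "C-42"}}'

print(dispatch(complete))
print(dispatch(partial))
```

In this sketch the second call returns the list of missing parameters instead of failing, which corresponds to the "lightweight conversation" step in which the model gathers what it still needs before making the API call.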
The company has yet to share full benchmarks, but Paracha did note that the throughput and cost savings were seen when an L40S Nvidia GPU was used to host the 3B parameter model. “The standard is using the V100 or A100 to run/benchmark LLMs, and the L40S is a cheaper instance than both. Of course, this is our quantized version, with similar quality performance,” he noted.

Paracha (@salman_paracha) announced the release on X on October 15, 2024: “Another exciting day here Katanemo as we open source some of the "intelligence" behind Arch (https://t.co/9nwakOGPp0). Meet Katanemo Arch-Function, a collection of state-of-the-art (SOTA) LLMs designed for function calling tasks – that meet/beat frontier LLM performance, but…”

With this work, enterprises can have a faster and more affordable family of function-calling LLMs to power their agentic applications. The company has yet to share case studies of how these models are being utilized, but high-throughput performance with low costs makes an ideal combo for real-time, production use cases such as processing incoming data for campaign optimization or sending emails to clients.

According to MarketsandMarkets, globally, the market for AI agents is expected to grow with a CAGR of nearly 45% to become a $47 billion opportunity by 2030.

Why ROAI — return on AI — depends on the power of process intelligence

Presented by Celonis

The State of Oklahoma had a $3 billion problem: In 2022, its Legislative Office of Fiscal Transparency found that a full quarter of the state’s $12 billion budget was spent without oversight, posing serious financial and legal risks. Its processes were hopelessly broken. But they found a solution that was not only 200 times more efficient but slashed potential costs by $11.4 million: process intelligence. It’s a technology that is transforming business operations – and it’s proving crucial to successful generative AI as well as the rapidly approaching agentic AI future.

“Every organization in every industry runs on a collection of interacting processes – finance, supply chain, sales, marketing – and all have to work well, and they have to work well together, and that’s not easy, since we’re talking about multiple systems and departments and multiple languages,” says Alex Rinke, co-CEO and co-founder of Celonis. “Process intelligence platforms give you full visibility into how these processes are operating, where they’re getting stuck, where you have your bottlenecks, where you have your deviations, where you have process issues, and then remediate those issues.”

For instance, in a matter of months, process intelligence further helped the State of Oklahoma pivot to reviewing state purchases in real-time, so staff are able to serve their state and be transparent with taxpayer dollars. And across the pond, the NHS (National Health Service) in the U.K. used process intelligence to eliminate 1,800 appointment cancellations each week just by shifting when the appointment reminder goes out, uncovered ways to reduce the waiting list by around 5,300 patients in eight weeks by optimizing the patient journey, and realized an estimated savings of £2.8M a year along the way.
In other words, instead of the business equivalent of throwing spaghetti at the wall and hoping something sticks, process intelligence revealed where process changes or AI solutions could offer profound results.

“Process intelligence provides business context – a true understanding of where, in any end-to-end process, we need to apply a change, and identifies the places AI can have the biggest impact for our customers, for their bottom line, for their green line, for their people and their productivity,” Rinke adds. “Without visibility into a process, you’re tossing AI at a problem just because you want to use AI. You’re not actually moving the needle. Process intelligence is the only way to achieve ROAI – return on AI investment.”

Why process intelligence is the key to AI

To understand the challenges of enterprise AI, consider how it differs from consumer AI. Both rely on a wealth of data to operate correctly. However, consumer AI not only has the whole internet of data at its proverbial fingertips, that data also includes resources like Wikipedia, which offer crucial context for how all those individual data points are connected, and why.

“Consumer AI models are very good at cases where they’ve seen a lot of examples on the internet. They’ve seen millions of example bar exams or code so they can pass the bar exam or code a website,” Rinke says. “But enterprise AI doesn’t get trained with examples of a company’s unique processes – how it makes products, pays suppliers, makes contracts with customers. That information is scattered across all these different systems, with no central repository of rules, desired processes and who’s responsible for what. All that is implicit in the organization.”

The Celonis Process Intelligence Platform makes that knowledge explicit, and pulls together all that enterprise data sitting in IT systems such as ERP and CRM across the organization in many different form factors.
The Celonis solution in particular gives that raw enterprise data what amounts to the Wikipedia-esque equivalent it needs to ground AI in business and process context. It provides the connective tissue that gives organizations the insight they need to identify powerful AI use cases and feeds AI with the process insights it needs to be useful, scalable and reliable.

For instance, integrating process intelligence with generative AI means that answers to gen AI prompts are furnished using real-time process data and knowledge. And process intelligence can unlock the major benefits of AI agents, the next evolutionary step for AI, which are able to independently perform a series of interlinked tasks and make autonomous decisions along the way.

Eventually networks of agents will be able to talk to one another to complete entire processes – for instance, getting a marketing deliverable reviewed and approved by legal, then releasing it to a customer channel, monitoring metrics and delivering a report. But that’s a lot of moving parts, with a lot of potential points of failure when organizations leap into agentic AI with their eyes closed.

Process intelligence helps organizations identify the kinds of clearly defined and narrowly scoped problems AI agents are best at solving. That helps eliminate inconsistent responses or hallucinations, and the number of potential and actual dropped steps is slashed significantly when a process intelligence platform can monitor, track and flag agent decisions.

AI and the process intelligence platform

At the center of the Celonis Process Intelligence platform is the Process Intelligence Graph (PI Graph). Using process mining, it extracts process data from transactional systems (e.g., ERP, CRM, HCM) and brings it together into a data layer, a living digital twin of the business processes.
The PI Graph combines this digital twin with a knowledge layer—the context mentioned above (i.e., what makes something “good” or “bad” for the organization) defined by KPIs, benchmarks, process models and so on. In short, it knows how processes run across the entire enterprise and shows people how they can run better. For example, in order management, a user can dig into an order process in progress, see how it’s related to the returns process, how it impacts the invoicing process, how it informs the sales process and so on. And to manage it all, the platform offers capabilities like dashboarding, app building, real-time monitoring, workflow automation, orchestration, alerts, root cause analysis and process optimization. In

Cognizant adds multi-agent functionality to AI application platform

Cognizant’s Neuro AI platform, announced last year, will get more AI as the consultancy adds multi-agent capabilities to the service.

The Neuro AI platform helps organizations ideate, prototype and test generative AI applications without coding. Babak Hodjat, Cognizant’s CTO of AI, told VentureBeat the service used to be something Cognizant’s experts did for customers. However, Neuro AI will now be available for enterprises to use themselves.

“One of the things we ran into as we started demoing it to clients was them saying, hey, this is really fascinating, we want to use it ourselves and host it in-house,” Hodjat said. “In some ways, they started thinking of it as this factory that generates ideas for where to apply generative AI in their businesses.”

Hodjat said Neuro AI’s use of multiple agents, an approach Cognizant was already exploring while reconfiguring the service for clients, makes it stand out from other AI app platforms. AI agents, of course, have become a big trend for enterprise AI this year.

The platform has four steps, all of which rely on pre-configured agents: the Opportunity Finder, Scoping Agent, Data Generator and Model Orchestrator.

It acts as a Cognizant consultant for clients who want to build applications. The platform goes through the process of ideating an application and, in the end, provides a framework for the customer to follow.

When people first start using Neuro AI, they’re asked to describe what issues they want solved. The Opportunity Finder then deploys agents to search for industry-specific use cases. Once a potential use case is identified, users then move to the Scoping Agent, which will show the use case’s impact on specific categories and performance indicators. The Data Generator will generate synthetic data related to the use case to test out the application.
The Model Orchestrator sets up the application. Hodjat said it uses several agents that make calls to build out the system. For example, a project describer agent will return a JSON description, followed by a call to a context agent or an outcome mapper. The number of agents the Orchestrator will manage depends on the use case.

“We had the agents communicate with each other to identify what capabilities are needed,” Hodjat said. “We did that by encapsulating each agent’s expertise so these agents are talking to each other. One agent is asking the other agent, hey, I have this use case to build. Can you do something for me? The main trick here is to actually have the agents communicating with each other.”

Hodjat said his team used LangChain as a framework to build out its multi-agent orchestration and remain LLM agnostic. He said the framework is not perfect, but since many clients prefer to use different models, it was important that Neuro AI could handle both open and closed models.

Competition in AI application consulting is growing

This is not Cognizant’s first foray into generative AI. In March, it opened an AI lab in San Francisco to help boost enterprise use of the technology.

Companies like Cognizant, which help other enterprises set up their own AI applications or programs, are creating new product offerings to make using generative AI easier. Accenture, along with AWS, released a platform that evaluates AI readiness and responsible AI policies. McKinsey and Company set up a chatbot for its consultants called Lilli last year.

Consulting and business process service providers are starting to create their niche in the increasingly competitive AI platform space. Enterprise software providers, like Salesforce, SAP and Oracle, already give customers access to platforms to easily create agents or other AI applications. Organizations like Cognizant are building products that seem to cater to businesses that are still unsure of how to harness generative AI fully.

Simplismart supercharges AI performance with personalized, software-optimized inference engine

Enterprises are all in on AI. They want their models to run smoothly in production and perform as well as possible to deliver a high return on investment. However, even with all the advanced models available in the market, teams continue to struggle with deployment issues. Last year, Peter Bendor-Samuel, the CEO of Everest Group, estimated that 90% of the gen AI pilots started will not make it to production. Even Gartner has predicted that a significant portion of generative AI projects are likely to be abandoned after proof of concept by the end of 2025.

Among the hurdles to adoption, the largest one is orchestration. Teams just don’t have the resources to do everything in-house, which leaves them reliant on rigid and expensive third-party APIs.

Today, Simplismart AI raised $7 million in funding to address this gap with its end-to-end MLOps platform that accelerates the entire orchestration effort by taking care of everything from fine-tuning models to deployment and observability. While there are other MLOps solutions in the market, including those from Datadog, what makes this startup different is its personalized software-optimized inference engine. It deploys models at lightning-fast speed, significantly boosting their performance while driving down associated costs.

“Without any hardware optimization, we’ve unlocked a throughput of 501 tokens per second on the Llama 3.1 8B model, which far beats other inference engines. Similarly, we’ve achieved better results across all modalities, including text-to-speech, speech-to-text, text-to-image, image-to-image,” Amritanshu Jain, a former Oracle engineer who co-founded the startup with ex-Google techie Devansh Ghatak, tells VentureBeat.
Solving orchestration gaps with Simplismart optimized inference

When deploying AI in-house (for enhanced control and privacy), teams have to deal with several bottlenecks, from accessing compute power and optimizing model performance to scaling infrastructure, CI/CD pipelines and cost efficiency. Handling everything manually can easily take months. Not to mention, a slight error here or there in the pipeline can hit the performance of the model and lead to high costs and poor ROI.

With its end-to-end orchestration platform, Simplismart standardizes this entire workflow, allowing users to fine-tune, deploy and observe highly optimized open-source models, covering different modalities, according to their needs.

“Users can either use our shared infrastructure or bring their own compute, cloud account to configure their infrastructure and deployments with ease. The intuitive dashboard of the platform allows them to set parameters like GPUs, machine types, scaling ranges, etc. Once the cluster is ready, users can deploy from a wide range of pre-optimized models or import their own… Finally, the observability features come into play and allow users to track SLAs, monitor the performance of the model in the real world and benchmark performance against past numbers…,” Jain explained.

The platform’s Terraform-like declarative orchestration language lets enterprises manage the entire pipeline themselves, putting control back into their hands and reducing their dependency on DevOps teams. Meanwhile, the personalized, software-optimized inference engine at its heart ensures that the models are deployed to deliver the desired performance and cost results.

“Simplismart stands out as the platform that can deliver a personalized inference engine tailored to each enterprise’s needs, whether it’s load, SLAs, performance requirements, GPU usage, etc. This helps enterprises strike the right balance between cost and performance,” Jain said.
He noted that the inference engine performance is optimized across three main layers. First, it optimizes application serving with a custom serving layer for ML workloads. Then, it supports infrastructure with rapid upscaling/downscaling and sharding of models across GPUs to maximize hardware utilization. Finally, it optimizes model-GPU interaction with 28 custom kernels using CUDA. This allows the engine to squeeze even more performance out of the hardware being used.

He said the optimized inference engine is already running some popular models, including Llama 3.1 8B, OpenAI’s Whisper v2 and SDXL, with a major performance boost.

“We’ve consistently recorded a throughput of 501 tokens/sec during multiple Llama 3.1 8B runs. That said, this doesn’t mean every single request will achieve that exact figure, as performance can fluctuate within a band, which is typical for all inference engines. In our tests, we observed a median of ~350 tokens/second under sustained load. What’s particularly exciting is that even at this median, our performance band remains significantly higher than any other inference engine on the market,” he noted.

The company’s primary competitors in this space are TogetherAI, Baseten, Replicate, Fireworks and Amazon Bedrock.

Plan to double down on performance

Simplismart already has a pipeline of 30 enterprise customers, including Invideo, Dashtoon, Dubverse and Vodex. One pharma marketplace used the company’s platform to deploy InternVL2 models for digitizing hand-written prescriptions and was able to improve spatial configuration detection, processing 2.5x more images at half the cost.

As the next step in this work, Simplismart wants to improve the performance of its MLOps platform further. It will use the fresh funding to fuel R&D and come up with new techniques to increase the speed of AI inference and stay ahead of the competition.

“The company has tripled revenue in the last four months to reach ~$1M annual revenue run-rate.
We aim to scale to $10M ARR in the next 15 months. Our major levers are to target the top 50 AI-first enterprises and drive open-source adoption of our Terraform-like orchestration language,” Jain noted.

Archetype AI’s Newton model learns physics from raw data—without any help from humans

Researchers at Archetype AI have developed a foundational AI model capable of learning complex physics principles directly from sensor data, without any pre-programmed knowledge. This breakthrough could significantly change how we understand and interact with the physical world.

The model, named Newton, demonstrates an unprecedented ability to generalize across diverse physical phenomena, from mechanical oscillations to thermodynamics, using only raw sensor measurements as input. This achievement, detailed in a paper released today, represents a major advance in artificial intelligence’s capacity to interpret and predict real-world physical processes.

“We’re asking if AI can discover the laws of physics on its own, the same way humans did through careful observation and measurement,” said Ivan Poupyrev, co-founder of Archetype AI, in an exclusive interview with VentureBeat. “Can we build a single AI model that generalizes across diverse physical phenomena, domains, applications, and sensing apparatuses?”

From pendulums to power grids: AI’s uncanny predictive powers

Trained on over half a billion data points from diverse sensor measurements, Newton has shown remarkable versatility. In one striking demonstration, it accurately predicted the chaotic motion of a pendulum in real-time, despite never being trained on pendulum dynamics.

The model’s capabilities extend to complex real-world scenarios as well. Newton outperformed specialized AI systems in forecasting citywide power consumption patterns and predicting temperature fluctuations in power grid transformers.

“What’s remarkable is that Newton had not been specifically trained to understand these experiments; it was encountering them for the first time and was still able to predict outcomes even for chaotic and complex behaviors,” Poupyrev told VentureBeat.
Performance comparison of Archetype AI’s ‘Newton’ model across various complex physical processes. The graph shows that the model, even without specific training (zero-shot), often outperforms or matches models trained specifically for each task, highlighting its potential for broad applicability. (Credit: Archetype AI)

Adapting AI for industrial applications

Newton’s ability to generalize to entirely new domains could significantly change how AI is deployed in industrial and scientific applications. Rather than requiring custom models and extensive datasets for each new use case, a single pre-trained foundation model like Newton might be adapted to diverse sensing tasks with minimal additional training.

This approach represents a significant shift in how AI can be applied to physical systems. Currently, most industrial AI applications require extensive custom development and data collection for each specific use case. This process is time-consuming, expensive, and often results in models that are narrowly focused and unable to adapt to changing conditions.

Newton’s approach, by contrast, offers the potential for more flexible and adaptable AI systems. By learning general principles of physics from a wide range of sensor data, the model can potentially be applied to new situations with minimal additional training. This could dramatically reduce the time and cost of deploying AI in industrial settings, while also improving the ability of these systems to handle unexpected situations or changing conditions.

Moreover, this approach could be particularly valuable in situations where data is scarce or difficult to collect. Many industrial processes involve rare events or unique conditions that are challenging to model with traditional AI approaches. A system like Newton, which can generalize from a broad base of physical knowledge, might be able to make accurate predictions even in these challenging scenarios.
Expanding human perception: AI as a new sense

The implications of Newton extend beyond industrial applications. By learning to interpret unfamiliar sensor data, AI systems like Newton could expand human perceptual capabilities in new ways.

“We have sensors now that can detect aspects of the world humans can’t naturally perceive,” Poupyrev told VentureBeat. “Now we can start seeing the world through sensory modalities which humans don’t have. We can enhance our perception in unprecedented ways.”

This capability could have profound implications across a range of fields. In medicine, for example, AI models could help interpret complex diagnostic data, potentially identifying patterns or anomalies that human doctors might miss. In environmental science, these models could help analyze vast amounts of sensor data to better understand and predict climate patterns or ecological changes.

The technology also raises intriguing possibilities for human-computer interaction. As AI systems become better at interpreting diverse types of sensor data, we might see new interfaces that allow humans to “sense” aspects of the world that were previously imperceptible. This could lead to new tools for everything from scientific research to artistic expression.

Archetype AI, a Palo Alto-based startup founded by former Google researchers, has raised $13 million in venture funding to date. The company is in discussions with potential customers about real-world deployments, focusing on areas such as predictive maintenance for industrial equipment, energy demand forecasting, and traffic management systems.

The approach also shows promise for accelerating scientific research by uncovering hidden patterns in experimental data. “Can we discover new physical laws?” Poupyrev mused. “It’s an exciting possibility.”

“Our main goal at Archetype AI is to make sense of the physical world,” Poupyrev told VentureBeat.
“To figure out what the physical world means.”

As AI systems become increasingly adept at interpreting the patterns underlying physical reality, that goal may be within reach. The research opens new possibilities, from more efficient industrial processes to scientific breakthroughs and novel human-computer interfaces that expand our understanding of the physical world.

For now, Newton remains a research prototype. But if Archetype AI can successfully bring the technology to market, it could usher in a new era of AI-powered insight into the physical world around us.

The challenge now will be to move from promising research results to practical, reliable systems that can be deployed in real-world settings. This will require not only further technical development, but also careful consideration of issues like data privacy, system reliability, and the ethical implications of AI systems that can interpret and predict physical phenomena in ways that might surpass human capabilities.

Google’s NotebookLM will expand to business use cases soon

Google will soon offer a paid version of its AI research tool NotebookLM, specifically targeting businesses.

NotebookLM Business will have “enhanced features for businesses, universities, and organizations.” For now, NotebookLM Business is available only through a pilot program that offers early access to its features, along with training and email support.

Google told VentureBeat in an email that participants in the NotebookLM Business pilot “will gain a significant advantage with enhanced capabilities designed to boost productivity and collaboration.” These capabilities include higher usage limits and new features such as customization and sharing notebooks with team members.

The company said these features could unlock new use cases for businesses using the tool. “We’ve seen this early feature streamline onboarding, shared understanding of complex projects, and building a centralized repository of your team’s collective intelligence all within a collaborative notebook environment,” a spokesperson said.

Another feature that will be part of NotebookLM Business is Audio Overview, which lets users create a narrated study guide. Google said the paid version will continue to have robust data privacy and security.

NotebookLM, built with Gemini 1.5, lets people upload source material to “notebooks” to gather information and ask the Gemini chatbot questions about the research. First announced in July last year, NotebookLM became generally available in December.

Google will also remove the “experimental” tag on the tool.

NotebookLM product manager Raiza Martin previously told VentureBeat that the team saw many different uses for the platform, including some for enterprises. While NotebookLM was never intended for a specific audience, Martin said many researchers and students embraced the product.
Many businesses have also begun using NotebookLM as a repository of information for teams.

Google will announce general availability and pricing for NotebookLM Business later this year.

Additional control over audio

Along with announcing NotebookLM Business, Google updated the Audio Overview feature of NotebookLM. Audio Overview lets people generate podcast-style audio about their research. Google characterized Audio Overview as a spoken research or study guide rather than a podcast. However, its first version featured two voices (one male, one female) conversing about the information in the notebook, reminding many of podcasts.

Audio Overview proved popular among some users, with many posting their generated audio on social media. Martin had previously promised additional controls over Audio Overview and said the company’s research showed conversations helped people retain more information.

Users can now guide more of the conversation in an Audio Overview, including prompting the model to focus on specific topics or levels of expertise. Audio Overviews will also continue to play while users query their sources or ask questions with the chat feature.

I got to explore the updated capabilities of Audio Overviews early. In a notebook with sources around AI orchestration, I told it to focus on the definition of orchestration and how different frameworks like LangChain work. The final product did talk about AI orchestration based on the different blog posts and YouTube videos I had uploaded. The two “hosts,” however, spoke about frameworks as if LangChain were the only orchestration framework out there. This might be a misunderstanding of my prompt, in which I specifically named LangChain, because the source documents definitely talk about other available tools.

Google does point out that Audio Overviews “are generated discussions and are not a comprehensive or objective view of a topic.” They only take into account information found in the uploaded source materials.
Open NotebookLM, an open-source competitor to NotebookLM, launched last month and included an audio recap function. While Open NotebookLM does not have the same fact-checking capabilities as NotebookLM, it represents a shift in the ease of deploying complex AI-driven platforms.
