Sustainability reports empower ESG agenda for CIOs: Mehjabeen Taj Aalam, Raychem RPG

As the new CDIO of Raychem RPG, what are your primary goals in the digital pursuits of the organization? What challenges do you foresee tackling in this role?

Raychem RPG has a very progressive outlook on Digital and sees technology as a huge enabler for its business. Gradually and consistently integrating Digital into the business strategy remains our key focus, so that these initiatives are sustainable, and by weaving Digital into the cultural fabric of our organization, we continuously enable our workforce to become future-ready. As a manufacturing organization, industrial automation technology is at the heart of our digitization strategy: IoT, AI/ML, RPA, robotics, intelligent automation, and eventually collating all data in a data warehouse to drive analytical insights. Liberating siloed data (SCADA, sensors, manual, satellite apps), aggregating it with structured (ERP) data, and then creating a 360-degree view on almost anything: that is our holy grail. source

Sustainability reports empower ESG agenda for CIOs: Mehjabeen Taj Aalam, Raychem RPG Read More »

Getting a Handle on AI Hallucinations

AI hallucination occurs when an AI model — frequently a generative AI chatbot built on a large language model (LLM), or a computer vision tool — perceives patterns or objects that are nonexistent or imperceptible to human observers, generating outputs that are inaccurate or nonsensical.

AI hallucinations can pose a significant challenge, particularly in high-stakes fields where accuracy is crucial, such as the energy industry, life sciences and healthcare, technology, finance, and legal sectors, says Beena Ammanath, head of technology trust and ethics at business advisory firm Deloitte. With generative AI's emergence, validating outputs has become even more critical for risk mitigation and governance, she states in an email interview. "While AI systems are becoming more advanced, hallucinations can undermine trust and, therefore, limit the widespread adoption of AI technologies."

Primary Causes

AI hallucinations are primarily caused by the nature of generative AI and LLMs, which rely on vast amounts of data to generate predictions, Ammanath says. "When the AI model lacks sufficient context, it may attempt to fill in the gaps by creating plausible-sounding, but incorrect, information." This can occur due to incomplete training data, bias in the training data, or ambiguous prompts, she notes.

LLMs are generally trained for specific tasks, such as predicting the next word in a sequence, observes Swati Rallapalli, a senior machine learning research scientist in the AI division of the Carnegie Mellon University Software Engineering Institute. "These models are trained on terabytes of data from the Internet, which may include uncurated information," she explains in an online interview. "When generating text, the models produce outputs based on the probabilities learned during training, so outputs can be unpredictable and misrepresent facts."

Detection Approaches

Depending on the specific application, hallucination metrics tools, such as AlignScore, can be trained to capture the similarity between two text inputs. Yet automated metrics don't always work effectively on their own. "Using multiple metrics together, such as AlignScore, with metrics like BERTScore, may improve the detection," Rallapalli says.

Another established way to minimize hallucinations is retrieval-augmented generation (RAG), in which the model references text from established databases relevant to the output. "There's also research in the area of fine-tuning models on curated datasets for factual correctness," Rallapalli says.

Yet even using multiple existing metrics may not fully guarantee hallucination detection, so further research is needed to develop more effective metrics, Rallapalli says. "For example, comparing multiple AI outputs could detect if there are parts of the output that are inconsistent across different outputs or, in case of summarization, chunking up the summaries could better detect if the different chunks are aligned with facts within the original article." Such methods could help detect hallucinations better, she notes.
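Rallapalli's two suggestions, combining similarity metrics and comparing multiple sampled outputs for consistency, can be sketched in a few lines of Python. The snippet below is illustrative only: it uses the open-source bert-score package as the similarity metric (AlignScore or another metric could be swapped in), and the 0.6 threshold and helper names are assumptions made for this sketch, not part of any tool mentioned above.

```python
# Minimal, illustrative hallucination screen: flag an answer when it neither
# aligns with the source text it should be grounded in nor agrees with
# alternative samples. Thresholds and function names are assumptions.
from itertools import combinations
from bert_score import score  # pip install bert-score


def grounding_score(answer: str, source_text: str) -> float:
    """Similarity between the answer and the source it should be grounded in."""
    _, _, f1 = score([answer], [source_text], lang="en", verbose=False)
    return f1.item()


def self_consistency(samples: list[str]) -> float:
    """Average pairwise agreement across several sampled answers to the same prompt."""
    pairs = list(combinations(samples, 2))
    if not pairs:
        return 1.0
    cands, refs = zip(*pairs)
    _, _, f1 = score(list(cands), list(refs), lang="en", verbose=False)
    return f1.mean().item()


def flag_possible_hallucination(answer, source_text, samples, threshold=0.6):
    # Low grounding AND low agreement across samples is a strong hint that
    # the model is filling gaps with invented detail.
    return (grounding_score(answer, source_text) < threshold
            and self_consistency(samples) < threshold)
```

Pairwise BERTScore is a crude stand-in for the chunk-level alignment Rallapalli describes, but it illustrates the pattern: generate several candidates, measure grounding against the source, and escalate low-agreement cases for review.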
Ammanath believes that detecting AI hallucinations requires a multi-pronged approach. She notes that human oversight, in which AI-generated content is reviewed by experts who can cross-check facts, is sometimes the only reliable way to curb hallucinations.

"For example, if using generative AI to write a marketing e-mail, the organization might have a higher tolerance for error, as faults or inaccuracies are likely to be easy to identify and the outcomes are lower stakes for the enterprise," Ammanath explains. Yet when it comes to applications that involve mission-critical business decisions, error tolerance must be low. "This makes a 'human in the loop,' someone who validates model outputs, more important than ever before."

Hallucination Training

The best way to minimize hallucinations is to build your own pre-trained foundational generative AI model, advises Scott Zoldi, chief analytics officer at analytics software company FICO. He notes, via email, that many organizations are already using, or planning to use, this approach with focused-domain and task-based models. "By doing so, one can have critical control of the data used in pre-training — where most hallucinations arise — and can constrain the use of context augmentation to ensure that such use doesn't increase hallucinations but reinforces relationships already in the pre-training."

Outside of building your own focused generative models, you need to minimize the harm created by hallucinations, Zoldi says. "[Enterprise] policy should prioritize a process for how the output of these tools will be used in a business context and then validate everything," he suggests.

A Final Thought

To prepare the enterprise for a bold and successful future with generative AI, it's necessary to understand the nature and scale of the risks, as well as the governance tactics that can help mitigate them, Ammanath says. "AI hallucinations help to highlight both the power and limitations of current AI development and deployment." source

Getting a Handle on AI Hallucinations Read More »

Broadband Groups See Ally In Incoming GOP Leader Thune

By Christopher Cole (November 14, 2024, 7:27 PM EST) — Telecom industry groups view the Senate’s next majority leader, Sen. John Thune, R-S.D., as keenly interested in the sector’s needs, but it’s not yet clear what his selection could mean for specific critical issues like building out rural internet service and removing barriers to broadband deployment…. source

Broadband Groups See Ally In Incoming GOP Leader Thune Read More »

How custom evals get consistent results from LLM applications

Advances in large language models (LLMs) have lowered the barriers to creating machine learning applications. With simple instructions and prompt engineering techniques, you can get an LLM to perform tasks that would otherwise have required training custom machine learning models. This is especially useful for companies that don't have in-house machine learning talent and infrastructure, or for product managers and software engineers who want to create their own AI-powered products. However, the benefits of easy-to-use models are not without tradeoffs. Without a systematic approach to tracking the performance of LLMs in their applications, enterprises can end up with mixed and unstable results.

Public benchmarks vs custom evals

The current popular way to evaluate LLMs is to measure their performance on general benchmarks such as MMLU, MATH and GPQA. AI labs often market their models' performance on these benchmarks, and online leaderboards rank models based on their evaluation scores. But while these evals measure the general capabilities of models on tasks such as question-answering and reasoning, most enterprise applications need to measure performance on very specific tasks.

"Public evals are primarily a method for foundation model creators to market the relative merits of their models," Ankur Goyal, co-founder and CEO of Braintrust, told VentureBeat. "But when an enterprise is building software with AI, the only thing they care about is does this AI system actually work or not. And there's basically nothing you can transfer from a public benchmark to that."

Instead of relying on public benchmarks, enterprises need to create custom evals based on their own use cases. Evals typically involve presenting the model with a set of carefully crafted inputs or tasks, then measuring its outputs against predefined criteria or human-generated references. These assessments can cover various aspects such as task-specific performance.

The most common way to create an eval is to capture real user data and format it into tests. Organizations can then use these evals to backtest their application and the changes they make to it.

"With custom evals, you're not testing the model itself. You're testing your own code that maybe takes the output of a model and processes it further," Goyal said. "You're testing your prompts, which is probably the most common thing that people are tweaking and trying to refine and improve. And you're testing the settings and the way you use the models together."

How to create custom evals

(Image source: Braintrust)

To make a good eval, every organization must invest in three key components. First is the data used to create the examples to test the application. The data can be handwritten examples created by the company's staff, synthetic data created with the help of models or automation tools, or data collected from end users such as chat logs and tickets.

"Handwritten examples and data from end users are dramatically better than synthetic data," Goyal said. "But if you can figure out tricks to generate synthetic data, it can be effective."

The second component is the task itself. Unlike the generic tasks that public benchmarks represent, the custom evals of enterprise applications are part of a broader ecosystem of software components.
A task might be composed of several steps, each of which has its own prompt engineering and model selection techniques. There might also be other non-LLM components involved. For example, you might first classify an incoming request into one of several categories, then generate a response based on the category and content of the request, and finally make an API call to an external service to complete the request. It is important that the eval cover the entire framework.

"The important thing is to structure your code so that you can call or invoke your task in your evals the same way it runs in production," Goyal said.

The final component is the scoring function you use to grade the results of your framework. There are two main types of scoring functions. Heuristics are rule-based functions that check well-defined criteria, such as testing a numerical result against the ground truth. For more complex tasks such as text generation and summarization, you can use LLM-as-a-judge methods, which prompt a strong language model to evaluate the result. LLM-as-a-judge requires advanced prompt engineering.

"LLM-as-a-judge is hard to get right and there's a lot of misconception around it," Goyal said. "But the key insight is that just like it is with math problems, it's easier to validate whether the solution is correct than it is to actually solve the problem yourself." The same rule applies to LLMs: it's much easier for an LLM to evaluate a produced result than it is to do the original task. It just requires the right prompt.

"Usually the engineering challenge is iterating on the wording or the prompting itself to make it work well," Goyal said.

Innovating with strong evals

The LLM landscape is evolving quickly and providers are constantly releasing new models. Enterprises will want to upgrade or change their models as old ones are deprecated and new ones become available. One of the key challenges is making sure that your application remains consistent when the underlying model changes.

With good evals in place, changing the underlying model becomes as straightforward as running the new models through your tests. "If you have good evals, then switching models feels so easy that it's actually fun. And if you don't have evals, then it is awful. The only solution is to have evals," Goyal said.

Another issue is the changing data the model faces in the real world. As customer behavior changes, companies will need to update their evals. Goyal recommends implementing a system of "online scoring" that continuously runs evals on real customer data. This approach allows companies to automatically evaluate their model's performance on the most current data and incorporate new, relevant examples into their evaluation sets, ensuring the continued relevance and accuracy of their evals.
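A rough sketch of how the three components fit together (a dataset of examples, a task that exercises the same code path as production, and scorers) is shown below. Everything here is hypothetical: the toy classification task, the judge prompt, and the scorer weighting are stand-ins for your own pipeline, not Braintrust's API or any specific product.

```python
# Illustrative eval harness: dataset -> production task -> scorers.
# Every identifier here is a hypothetical stand-in; adapt it to your own stack.
from dataclasses import dataclass


@dataclass
class Example:
    input: str      # captured user request (handwritten or pulled from logs)
    expected: str   # human-written reference answer or label


def run_support_task(user_input: str) -> str:
    """In a real eval this should invoke the SAME code path production uses
    (classification step, prompts, downstream API calls). A toy classifier
    stands in here so the sketch runs end to end."""
    return "billing" if "invoice" in user_input.lower() else "general"


def heuristic_score(output: str, expected: str) -> float:
    """Rule-based scorer for well-defined criteria, e.g. exact label match."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def llm_judge_score(output: str, expected: str, judge) -> float:
    """LLM-as-a-judge scorer: validating an answer is easier than producing it."""
    prompt = (
        f"Reference answer:\n{expected}\n\nCandidate answer:\n{output}\n\n"
        "Does the candidate convey the same facts as the reference? "
        "Reply with a single number between 0 and 1."
    )
    return float(judge(prompt))  # `judge` is any callable wrapping a strong model


def run_eval(dataset, judge) -> float:
    scores = []
    for ex in dataset:
        out = run_support_task(ex.input)
        # Equal-weight blend of the two scorers; weight however suits the task.
        scores.append(0.5 * heuristic_score(out, ex.expected)
                      + 0.5 * llm_judge_score(out, ex.expected, judge))
    mean = sum(scores) / len(scores)
    print(f"eval mean score: {mean:.2f} over {len(scores)} examples")
    return mean


if __name__ == "__main__":
    data = [Example("Where is my invoice?", "billing"),
            Example("How do I reset my password?", "general")]
    dummy_judge = lambda prompt: "1.0"  # replace with a real strong-model call
    run_eval(data, dummy_judge)
```

The same harness can be re-run whenever prompts, settings, or the underlying model change, which is the backtesting and model-switching workflow Goyal describes, and it can be pointed at freshly sampled production data to approximate online scoring.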

How custom evals get consistent results from LLM applications Read More »

Battery recycling startup Tozero bags €11M to boost Europe’s lithium supply

In 1991, Sony brought the first rechargeable lithium-ion battery to market. The unique chemistry proved a game-changer in energy storage. Today everything from EVs to smartphones depends on it, with demand skyrocketing.

But lithium is rare, most of it comes from unstable markets outside Europe, and its extraction can cause extensive pollution. We need more lithium to enable the green transition and yet, currently, its use is unsustainable — both environmentally and economically. We're stuck in a paradox. Munich-based startup Tozero believes that battery recycling offers a way out.

Recycling batteries is far from a new concept, but the German venture claims its technology gets the job done more efficiently than existing methods and without the use of harmful acids.

Tozero was founded in 2022 by serial entrepreneur Sarah Fleischer and metallurgy expert Dr. Ksenija Milicevic Neumann. When the pair first met, they were working in the space industry. Three years later they teamed up to fix a pressing issue here on Earth. Before founding Tozero, Neumann spent years at RWTH Aachen developing a breakthrough water-based carbonation process for extracting lithium and other materials, like graphite, from black mass. This powdery substance is produced after shredding and processing spent batteries.

Neumann's research gave Tozero a significant head start. In just two years, the company has managed to break out of the lab and deliver its first batches of recycled lithium to customers. And today, the company announced it has raised €11mn in Series A funding as it looks to scale up at pace. "Despite our limited resources as a two-year-old startup we've already made human history by being the first to ever deliver recycled lithium for end products in Europe," said Fleischer, the company's CEO.

NordicNinja, a Japan-backed European VC fund, led the funding round, bringing Tozero's total raised to a cosy €17mn. Other investors include automotive giant Honda, US venture firm In-Q-Tel, and engineering group JGC. Tozero will use the fresh capital to build its first industrial deployment plant. From 2026 onwards, the company plans to process 30,000 tonnes of battery waste annually.

Tozero can technically keep on growing as long as it receives a continuous supply of old batteries. And that shouldn't be too much of an issue. Lithium-ion battery production is set to almost quadruple by 2030. Meanwhile, regulations like the EU's Battery Directive — which calls for at least 80% of lithium to be recovered from batteries by 2031 — add much-needed incentives. This is only good news for Tozero and other recycling upstarts, including Cylib, which is currently building Europe's largest recycling plant for EV batteries.

However, if Europe is to secure a sustainable supply of the lithium it so desperately needs, it must also expand local mining and explore new battery technologies like sodium-ion, zinc-ion, and the holy grail — solid-state batteries. source

Battery recycling startup Tozero bags €11M to boost Europe’s lithium supply Read More »

‘Unrestricted’ AI group Nous Research launches first chatbot

Nous Research, the AI research group dedicated to creating “personalized, unrestricted” AI models as an alternative to more buttoned-up corporate outfits such as OpenAI, Anthropic, Google, Meta, and others, has previously released several open-source models in its Hermes family, along with new, more efficient AI training methods. But before today, if researchers and users wanted to actually deploy these models, they needed to download and run the code on their own machines — a time-consuming, finicky, and potentially costly endeavor — or use them on partner websites.

No longer: Nous just announced its first user-facing chatbot interface, Nous Chat, which gives users access to its large language model (LLM) Hermes 3-70B, a fine-tuned variant of Meta’s Llama 3.1, in the familiar format of ChatGPT, HuggingChat, and other popular AI chatbot tools — with a text entry box at the bottom for the user to type in text prompts, and a large space up top for the chatbot to return outputs.

As Nous wrote in a post on the social network X: “Since our first version of Hermes was released over a year ago, many people have asked for a place to experience it. Today we’re happy to announce Nous Chat, a new user interface to experience Hermes 3 70B and beyond. https://hermes.nousresearch.com We have reasoning enhancements, new models, and experimental capabilities planned for the future, and this will become the best place to experience Hermes and much more.”

Initial impressions of Nous Chat

Nous’s design language is right up my alley, using vintage fonts and characters evoking early PC terminals. It offers a dark and light mode the user can toggle between in the upper right-hand corner. Interestingly, like OpenAI eventually did with ChatGPT — and many other AI model providers as well — Nous Chat also offers suggested or example prompts at the bottom of the screen above the prompt entry textbox, including “Knowledge & Analysis,” “Creative Writing,” “Problem Solving,” and “Research & Synthesis.” Clicking any of these will send a pre-written prompt to the underlying model through the chatbot and have it respond, such as serving up a summary of research on “intermittent fasting.”

In my brief tests of the chatbot, it was speedy, serving up answers in single-digit seconds, and was able to produce links back to URLs on the web for sources it cited, though it seemed to hallucinate these as well, on occasion — and the chatbot itself claimed it could not access the web.

Despite Nous’s previously stated aim of enabling people to deploy and control their own AI models without content restrictions, Nous Chat itself does appear to have some guardrails set, including against making illegal narcotics such as methamphetamine. When I emailed the Nous Research team to ask about this, Shivani Mitra responded: “Nous Chat hosts Hermes 3 in its full form; no modifications have been made. The sentences you screenshotted when prompting more sensitive topics are part of the model’s original system prompt; they act as common sense warnings rather than hard-stop rails.” Indeed, going back and trying in a longer conversation, I was able to convince Nous Chat and the underlying Hermes 3 model to provide something close to a full methamphetamine recipe by asking it for a descriptive fictional novel scene.
Moreover, AI jailbreakers such as Pliny the Prompter (@elder_plinius on X) have already cracked the chatbot and gotten fully past the guardrails. In addition, the underlying Hermes 3-70B model told me that its knowledge cutoff date was April 2023, making it less useful for obtaining current events, something that OpenAI is now competing directly on against Google and other startups such as Perplexity.

Where Nous goes next

Lacking many of the advanced features of other leading chatbots, such as file attachments, image analysis and generation, and interactive code display canvases or trays, Nous Chat is unlikely to replace these rivals for many business users. Yet at least some of these features are coming, according to Mitra, who wrote to me via email: “We’re planning to add more features in the coming months, as we stated in the announcement tweet. These include reasoning enhancements (what we’re focused on currently) and more classic chat bot features like web search and file analysis.”

But as an experiment it’s certainly interesting and worth playing around with, in my opinion, and as new features are added, it could make for a compelling alternative to corporate chatbots and AI models. source

‘Unrestricted’ AI group Nous Research launches first chatbot Read More »

How DBAs can take on a more strategic role

Not that long ago, database administrators (DBAs) were perceived as purely technical experts. While they played a critical enterprise role, it was primarily behind the scenes, ensuring the integrity, security, and availability of the database. Today, DBAs are being pulled into the limelight. Corporate data is gold, and DBAs are its stewards. That's reflected in employment statistics for database administrators and architects, positions projected to grow nine percent from 2023 to 2033, much faster than the average for all occupations.[1]

Data is likewise growing at an exponential rate. According to IDC, data creation and replication are experiencing a compound annual growth rate (CAGR) of 23% per year.[2] As data volumes grow, so does corporate hunger to use data for broader business goals such as user experience design and insights for revenue generation. Complicating the issue is the fact that a majority of data (80% to 90%, according to multiple analyst estimates) is unstructured.[3]

Modern DBAs must now navigate a landscape where data resides across increasingly diverse environments, including relational databases, NoSQL, and data lakes. And they must work cross-functionally to facilitate data integration so the business can ultimately extract gold from all that data, driving new business opportunities. But while DBAs have moved into a strategic advisory capacity, they're not off the hook for their other traditional responsibilities. If anything, the role has expanded, and now they must try to balance it all. So the question becomes: how do enterprises help their DBAs unburden themselves so they can truly focus on strategy?

The third-party effect

One strategy is to work with a trusted third-party provider that can offer comprehensive support and expertise. Such a partner can help DBAs free themselves of traditional, time-consuming administrative activities and reinvest that time, and their companies' resources, in broader business strategy.

Rimini Street, for example, has been providing enterprise software and database support for the past 20 years to thousands of enterprise customers. Rimini Support™ provides primary engineering support with 24/7 availability and average response times of under two minutes for critical P1 and P2 issues. Its accessibility at scale to support mission-critical operations removes costs and helps ease the burden on stretched resources.

For iconic food manufacturer Welch's, the move from vendor support for its Oracle Database to Rimini Street enabled its teams to reallocate their focus toward creating new application extensions for the business rather than troubleshooting. "Welch's is a great example of a company that faced out-of-control maintenance costs, with forced upgrade pressures that offered no new features and functions to justify the trouble and expense," says Robert Freeman, Rimini Street's Oracle Enterprise Architect. "With Rimini Street, Welch's immediately cut maintenance fees in half and their DBAs are no longer chasing trouble tickets, applying patches, or worrying about the risks associated with upgrades. They are now much more in control of their IT roadmap."

As DBAs continue to take on more of a consultative role within their organizations, they can benefit from third-party support solutions that bring efficiency to their day-to-day work. Such solutions will enable them to move out of the daily grind and into the modern, multifaceted role companies need them to play.
Learn more about how Rimini Street enterprise software support services can help free critical time and resources for business-driven innovation.

[1] Bureau of Labor Statistics, U.S. Department of Labor, Occupational Outlook Handbook, Database Administrators and Architects
[2] Business Wire, "Data Creation and Replication Will Grow at a Faster Rate Than Installed Storage Capacity, According to the IDC Global DataSphere and StorageSphere Forecasts," March 24, 2021
[3] MIT Sloan School of Management, "Tapping the power of unstructured data," Feb 1, 2021

source

How DBAs can take on a more strategic role Read More »

Trump's 2nd Term May Be A Boost To Banking Industry

By Gregory Lyons, Satish Kini and Gordon Moodie (November 12, 2024, 5:23 PM EST) — Following the 2024 election, the political winds may be blowing more favorably toward the financial services industry as President-elect Donald Trump’s administration will likely pursue a broadly deregulatory agenda, with the support of (at least) a Republican Senate…. source

Trump's 2nd Term May Be A Boost To Banking Industry Read More »

Large enterprises embrace hybrid compute to retain control of their own intelligence

Presented by Inflection AI

Public, centralized large language model (LLM) services from providers like OpenAI have undoubtedly catalyzed the GenAI revolution, offering an accessible way for enterprises to experiment with and deploy AI capabilities quickly. But as the technology matures, large enterprises, particularly those investing heavily in AI, are beginning to run a mix of publicly available cloud models and private compute with local models — leading to a hybrid environment.

We'd go so far as to say: if you are spending more than $10,000,000 a year on AI and you don't have some investment in models — open source or otherwise — that you own or at least control, plus some private compute resources, you are headed in the wrong direction. We see this need to Own Your Own Intelligence as especially acute for organizations with significant security concerns, regulatory requirements, or specific scalability needs.

The future points to an increasing preference for "private compute" solutions — deployment approaches that leverage virtual private clouds (VPCs) or even on-premise infrastructure for vital tasks and processes as part of your intelligence platform. New vendors such as Cohere, Inflection AI and SambaNova Systems are meeting this growing demand, offering solutions that align with the needs of companies for whom public cloud solutions alone may no longer be sufficient. The large models from OpenAI and Anthropic promise private environments, but their employees can still access log and transaction data when needed, and companies do not believe that "just trust the contract" is sufficient to protect critical data.

Let's explore why private compute is gaining traction and what the trade-offs look like for large enterprises.

Centralized, public LLMs started the GenAI revolution

Public LLM services have been instrumental in getting companies up to speed with GenAI. Providers such as OpenAI offer cutting-edge models that are easy to access and deploy via cloud-based APIs. This has made it possible for organizations of any size to begin integrating advanced AI capabilities into their workflows without the need for complex infrastructure or in-house AI expertise.

The five key issues we hear about public LLMs from large enterprises in production are:

1. Security and confidentiality risks: Large enterprises often handle sensitive data, ranging from proprietary product roadmaps to confidential customer information. While public cloud providers implement stringent security protocols, some organizations are reluctant to trust non-company employees or third parties with their most valuable data. This concern is heightened when discussing future product roadmaps, which, in the wrong hands, could benefit competitors.

2. Loss of pricing power: As companies grow more dependent on GenAI, they may find themselves vulnerable to price increases from hyperscalers. Public cloud services typically operate under a pay-per-use model, which can become more expensive as usage scales. Companies relying on public LLM services could find themselves without leverage as prices increase over time.

3. Trust issues with future AI developments: While current contracts may seem sufficient, large enterprises may worry about the future. In a hypothetical future with true Artificial General Intelligence (AGI) — a form of AI that could theoretically outthink humans — companies may be hesitant to trust a third party to manage such powerful technologies, even with seemingly airtight contracts.
After all, the potential risks of a malfunction or misuse of AGI, even if improbable, carry significant weight.

4. Control over features and updates: Public LLM services typically push updates and feature changes centrally, meaning companies using these services cannot control when or how updates happen. This can lead to disruptions, as enterprises must continually re-test their systems and workflows whenever new versions of models are introduced.

5. Cost efficiency as token consumption grows: Token-based pricing models used by public LLM services are convenient for low- to moderate-use cases. However, for enterprises using these models at scale, the costs can become prohibitive. We estimate that the break-even point for cost efficiency occurs around 500,000 tokens per day with current options and pricing. Beyond that, the per-token costs start to outweigh the convenience of not managing your own infrastructure.

Key buyer benefits of public LLM clouds

Easy and cost-effective for testing: Public clouds offer an extremely low barrier to entry. Companies can experiment with different models, features and applications without a significant upfront investment in infrastructure or technical talent.

No/low capital outlay: Using a public cloud service, companies are spared the hefty capital expenses required to build or maintain high-performance compute clusters.

No need to manage on-premise infrastructure: When relying on a public cloud provider, there's no need for enterprises to develop, maintain and secure their own on-premise infrastructure, which can be costly and time-consuming.

Leading companies are heading to hybrid environments with private compute

We have seen at least two very different types of organizations in GenAI/AI adoption. The first are folks we call "toe dippers." They've tried some isolated applications and allow only one or two vendors providing standard tools like Copilot or ChatGPT. They may have islands of automation built in different divisions. The second group is what we call "productivity orchestrators" – firms that have significant systems in production. This latter group uses a combination of public cloud services and private compute, with solutions they have built and/or assembled to meet their current production needs. These solutions allow companies to deploy GenAI models either on their own on-premise infrastructure or within their own virtual private cloud, bringing AI capabilities closer to their "trust boundaries."

Here are the benefits we hear from the orchestrators:

Pros of private compute solutions

Enhanced security and confidentiality: By deploying LLMs in a private cloud or on-premise environment, enterprises keep their data within their own infrastructure, minimizing the risk of unauthorized access or accidental exposure. This is particularly important for companies in industries such as finance, healthcare and defense, where data privacy is paramount.

Cost efficiency at scale: While the initial setup costs are higher, private compute solutions become more cost-effective as usage scales. Enterprises with high token consumption can avoid the variable costs of public cloud services, eventually lowering their overall spend.
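The break-even claim above is straightforward to sanity-check with arithmetic. The sketch below compares a pay-per-token public API against a fixed-cost private deployment; the per-token rate and monthly infrastructure figure are placeholder assumptions chosen so that the break-even lands near the 500,000-tokens-per-day estimate quoted earlier, and real vendor rates and deployment costs will shift the result.

```python
# Back-of-the-envelope check of the token-cost break-even point.
# Both numbers below are placeholder assumptions, not actual vendor rates.

PUBLIC_PRICE_PER_1K_TOKENS = 0.06   # USD, assumed blended input/output rate
PRIVATE_MONTHLY_FIXED_COST = 900.0  # USD, assumed GPU + ops amortization


def monthly_public_cost(tokens_per_day: float, days: int = 30) -> float:
    """Pay-per-use bill for a month at the assumed public rate."""
    return tokens_per_day * days / 1_000 * PUBLIC_PRICE_PER_1K_TOKENS


def break_even_tokens_per_day(days: int = 30) -> float:
    """Daily volume at which the public bill matches the private fixed cost."""
    return PRIVATE_MONTHLY_FIXED_COST / (days * PUBLIC_PRICE_PER_1K_TOKENS / 1_000)


if __name__ == "__main__":
    for tpd in (100_000, 500_000, 2_000_000):
        print(f"{tpd:>9,} tokens/day -> public cloud ~${monthly_public_cost(tpd):,.0f}/month")
    print(f"break-even ~{break_even_tokens_per_day():,.0f} tokens/day "
          f"against ${PRIVATE_MONTHLY_FIXED_COST:,.0f}/month of private compute")
```

Plugging in your own contracted rates and amortized infrastructure costs is the point of the exercise: the crossover moves by an order of magnitude depending on model size, utilization, and what you count in the private-compute total.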

Large enterprises embrace hybrid compute to retain control of their own intelligence Read More »