
Astronomer’s $93M raise underscores a new reality: Orchestration is king in AI infrastructure

Astronomer, the company behind the Apache Airflow-powered data orchestration platform Astro, has secured $93 million in Series D funding as enterprises increasingly seek to operationalize AI initiatives through better management of their data pipelines. The funding round was led by Bain Capital Ventures, with participation from Salesforce Ventures and existing investors including Insight, Meritech, and Venrock. Bosch Ventures is also seeking to participate in the round, reflecting industrial interest in the technology.

In an exclusive interview with VentureBeat, Astronomer CEO Andy Byron explained that the company will use the funding to expedite research and development efforts and expand its global footprint, particularly in Europe, Australia, and New Zealand. “For us, this is just a step along the way,” Byron said. “We want to build something awesome here. I couldn’t be more excited about our venture partners, our customers, our product vision, which I think is super strong in going after collapsing the data ops market.”

How data orchestration became the hidden key to enterprise AI success

The funding targets what industry analysts have identified as the “AI implementation gap” — the significant technical and organizational hurdles that prevent companies from deploying AI at scale. Data orchestration, the process of automating and coordinating complex data workflows across disparate systems, has become an essential component of successful AI deployments.

Enrique Salem, Partner at Bain Capital Ventures, explained the critical challenges facing enterprises today: “Every company operates a sprawling, fragmented data ecosystem—using a patchwork of tools, teams, and workflows that struggle to deliver reliable insights, creating operational bottlenecks and limiting agility. At the heart of this complexity is orchestration—the layer that coordinates all these moving pieces.”

Salem noted that despite its importance, “today’s orchestration landscape is where cloud infrastructure was 15 years ago: mission critical, yet fragmented, brittle and often built in-house with limited scalability. Data engineers spend more time maintaining pipelines than driving innovation. Without robust orchestration, data is unreliable, agility is lost, and businesses fall behind.”

The company’s platform, Astro, is built on Apache Airflow, an open-source framework that has seen explosive growth. According to the company’s recently released State of Airflow 2025 report, which surveyed over 5,000 data practitioners, Airflow was downloaded more than 324 million times in 2024 alone — more than in all previous years combined.

“Airflow has established itself as the proven de facto standard for data pipeline orchestration,” Astronomer SVP of Marketing Mark Wheeler explained. “When we look at the competitive landscape in the orchestration layer, Airflow has clearly emerged as the standard solution for moving modern data efficiently from source to destination.”
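To make the orchestration concept concrete: in Airflow, a pipeline is declared as a DAG (directed acyclic graph) of dependent tasks that the scheduler runs and retries on your behalf. Below is a minimal, hypothetical sketch using Airflow's TaskFlow API; the task names and logic are illustrative stand-ins, not Astronomer's or any customer's actual workflows.

```python
# Minimal sketch of an Airflow DAG: extract -> transform -> train.
# Task names and logic are illustrative only.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2025, 1, 1), catchup=False)
def ml_feature_pipeline():
    @task
    def extract() -> list[dict]:
        # Pull raw records from a source system (stubbed here).
        return [{"sensor_id": 1, "reading": 0.42}]

    @task
    def transform(records: list[dict]) -> list[dict]:
        # Feature engineering step; Airflow passes results between tasks.
        return [{**r, "reading_scaled": r["reading"] * 100} for r in records]

    @task
    def train(features: list[dict]) -> None:
        # In a real pipeline this would hand off to a training job.
        print(f"training on {len(features)} rows")

    train(transform(extract()))


ml_feature_pipeline()
```

Because Astro is managed Airflow, a DAG like this would, in principle, run unchanged on the platform; the value-add is in hosting, scaling and observability rather than a different programming model.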
From invisible plumbing to enterprise AI backbone: The evolution of data infrastructure

Astronomer’s growth reflects a transformative shift in how enterprises view data orchestration — from hidden backend infrastructure to mission-critical technology that enables AI initiatives and drives business value.

“BCV’s belief in Astronomer goes way back. We invested in the company’s seed round in 2019 and have supported the company over the years, now culminating in leading their Series D,” Salem said. “Beyond the impressive growth, Astronomer’s data orchestration has become even more important in the age of AI, which requires scalable orchestration and model deployment automation amidst a ballooning sea of data tools that don’t talk to each other.”

According to the company’s internal data, 69% of customers who have used its platform for two or more years are using Airflow for AI and machine learning applications. This adoption rate is significantly higher than in the broader Airflow community, suggesting that Astronomer’s managed service accelerates enterprise AI deployments. The company has seen 150% year-over-year growth in annual recurring revenue for Astro, its managed SaaS platform, and boasts a 130% net revenue retention rate, indicating strong customer expansion.

“While market analysts may be looking for a clear winner in the cloud data platforms battle, enterprises have clearly chosen a multi-solution strategy—just as they earlier determined that multi-cloud would far outpace standardization on any single cloud provider,” Wheeler explained. “Leading enterprises refuse to lock into a single vendor, opting for multi-cloud and diverse data platform approaches to stay agile and take advantage of the latest innovations.”

Inside Ford’s massive AI operation: How petabytes of weekly data power next-generation vehicles

Major enterprises are already leveraging Astronomer’s platform for sophisticated AI use cases that would be challenging to implement without robust orchestration. At Ford Motor Company, Astronomer’s platform powers the company’s Advanced Driver Assistance Systems (ADAS) and its multi-million-dollar “Mach1ML” machine learning operations platform.

The automotive giant processes more than one petabyte of data weekly and runs over 300 parallel workflows, balancing CPU- and GPU-intensive tasks for AI model development across a hybrid public/private cloud platform. These workflows power everything from autonomous driving systems to Ford’s specialized FordLLM platform for large language models. Ford initially built its MLOps platform using Kubeflow for orchestration but encountered significant challenges, including a steep learning curve and tight integration with Google Cloud, which limited flexibility. After transitioning to Airflow for Mach1ML 2.0, Ford reports dramatically streamlined workflows and seamless integration across on-premises, cloud, and hybrid environments.

From AI experiments to production: How orchestration bridges the implementation divide

A common challenge for enterprises is moving AI from proof-of-concept to production. According to Astronomer’s research, organizations that establish strong data orchestration foundations are more successful at operationalizing AI. “As more enterprises are running ML workflows and real-time AI pipelines, they require scalable orchestration and model deployment automation,” Salem explained. “Astronomer delivers on this today, and as the orchestrator, is the one system that sees everything happening across the stack — when data moves, when transformations run, when models are trained.”

Over 85% of Airflow users surveyed expect an increase in external-facing or revenue-generating solutions built on Airflow in the next year, highlighting how data orchestration is increasingly powering customer-facing applications rather than just internal analytics.
This trend is evident across industries, from automotive to legal technology companies that are building specialized AI models to


You can now fine-tune your enterprise’s own version of OpenAI’s o4-mini reasoning model with reinforcement learning

OpenAI today announced on its developer-focused account on the social network X that third-party software developers outside the company can now access reinforcement fine-tuning (RFT) for its new o4-mini language reasoning model. This enables them to customize a new, private version of it based on their enterprise’s unique products, internal terminology, goals, employees, processes and more.

Essentially, this capability lets developers take the model available to the general public and tweak it to better fit their needs using OpenAI’s platform dashboard. Then, they can deploy it through OpenAI’s application programming interface (API), another part of its developer platform, and connect it to their internal employee computers, databases, and applications. Once deployed, if an employee or leader at the company wants to use it through a custom internal chatbot or custom OpenAI GPT to pull up private, proprietary company knowledge, answer specific questions about company products and policies, or generate new communications and collateral in the company’s voice, they can do so more easily with their RFT version of the model.

One cautionary note, however: research has shown that fine-tuned models may be more prone to jailbreaks and hallucinations, so proceed cautiously.

This launch expands the company’s model optimization tools beyond supervised fine-tuning (SFT) and introduces more flexible control for complex, domain-specific tasks. Additionally, OpenAI announced that supervised fine-tuning is now supported for its GPT-4.1 nano model, the company’s most affordable and fastest offering to date.

How does reinforcement fine-tuning (RFT) help organizations and enterprises?

RFT creates a new version of OpenAI’s o4-mini reasoning model that is automatically adapted to the user’s or their organization’s goals. It does so by applying a feedback loop during training, which developers at large enterprises (or even independent developers) can now initiate relatively simply, easily and affordably through OpenAI’s online developer platform.

Instead of training on a set of questions with fixed correct answers — which is what traditional supervised learning does — RFT uses a grader model to score multiple candidate responses per prompt. The training algorithm then adjusts model weights to make high-scoring outputs more likely. This structure allows customers to align models with nuanced objectives such as an enterprise’s “house style” of communication and terminology, safety rules, factual accuracy, or internal policy compliance.

To perform RFT, users need to:

- Define a grading function or use OpenAI model-based graders.
- Upload a dataset with prompts and validation splits.
- Configure a training job via the API or the fine-tuning dashboard.
- Monitor progress, review checkpoints and iterate on data or grading logic.

RFT currently supports only o-series reasoning models and is available for the o4-mini model.
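For illustration, here is a hedged sketch of what configuring such a job through the OpenAI Python SDK can look like, following the steps above. The grader schema and model snapshot name follow OpenAI's fine-tuning documentation at launch, but treat the exact field names as assumptions to verify against the current API reference; the file IDs are placeholders.

```python
# Hypothetical sketch of launching an RFT job via the OpenAI Python SDK.
# Verify the method/grader schema against OpenAI's current API docs.
from openai import OpenAI

client = OpenAI()

# A simple string-check grader: full credit when the model's output
# exactly matches the reference answer stored in each training row.
grader = {
    "type": "string_check",
    "name": "exact_match",
    "input": "{{sample.output_text}}",       # the model's generated answer
    "reference": "{{item.correct_answer}}",  # field in the uploaded dataset
    "operation": "eq",
}

job = client.fine_tuning.jobs.create(
    model="o4-mini-2025-04-16",   # RFT is limited to o-series reasoning models
    training_file="file-abc123",  # placeholder IDs for uploaded JSONL files
    validation_file="file-def456",
    method={
        "type": "reinforcement",
        "reinforcement": {"grader": grader},
    },
)
print(job.id, job.status)  # then monitor via the dashboard or jobs API
```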
Early enterprise use cases

On its platform, OpenAI highlighted several early customers who have adopted RFT across diverse industries:

- Accordance AI used RFT to fine-tune a model for complex tax analysis tasks, achieving a 39% improvement in accuracy and outperforming all leading models on tax reasoning benchmarks.
- Ambience Healthcare applied RFT to ICD-10 medical code assignment, raising model performance by 12 points over physician baselines on a gold-panel dataset.
- Harvey used RFT for legal document analysis, improving citation extraction F1 scores by 20% and matching GPT-4o in accuracy while achieving faster inference.
- Runloop fine-tuned models for generating Stripe API code snippets, using syntax-aware graders and AST validation logic, achieving a 12% improvement.
- Milo applied RFT to scheduling tasks, boosting correctness in high-complexity situations by 25 points.
- SafetyKit used RFT to enforce nuanced content moderation policies and increased model F1 from 86% to 90% in production.
- ChipStack, Thomson Reuters, and other partners also demonstrated performance gains in structured data generation, legal comparison tasks and verification workflows.

These cases often shared characteristics: clear task definitions, structured output formats and reliable evaluation criteria — all essential for effective reinforcement fine-tuning. RFT is available now to verified organizations. To help improve future models, OpenAI offers a 50% discount to teams that share their training datasets with OpenAI. Interested developers can get started using OpenAI’s RFT documentation and dashboard.

Pricing and billing structure

Unlike supervised or preference fine-tuning, which is billed per token, RFT is billed based on time spent actively training. Specifically:

- $100 per hour of core training time (wall-clock time during model rollouts, grading, updates and validation).
- Time is prorated by the second, rounded to two decimal places (so 1.8 hours of training would cost the customer $180).
- Charges apply only to work that modifies the model. Queues, safety checks, and idle setup phases are not billed.
- If the user employs OpenAI models as graders (e.g., GPT-4.1), the inference tokens consumed during grading are billed separately at OpenAI’s standard API rates. Otherwise, the company can use outside models, including open-source ones, as graders.

Here is an example cost breakdown:

Scenario | Billable time | Cost
4 hours training | 4 hours | $400
1.75 hours (prorated) | 1.75 hours | $175
2 hours training + 1 hour lost (due to failure) | 2 hours | $200

This pricing model provides transparency and rewards efficient job design. To control costs, OpenAI encourages teams to:

- Use lightweight or efficient graders where possible.
- Avoid overly frequent validation unless necessary.
- Start with smaller datasets or shorter runs to calibrate expectations.
- Monitor training with API or dashboard tools and pause as needed.

OpenAI uses a billing method called “captured forward progress,” meaning users are only billed for model training steps that were successfully completed and retained.

So should your organization invest in RFT for a custom version of OpenAI’s o4-mini?

Reinforcement fine-tuning introduces a more expressive and controllable method for adapting language models to real-world use cases. With support for structured outputs, code-based and model-based graders, and full API control, RFT enables a new level of customization in model deployment. OpenAI’s rollout emphasizes thoughtful task design and robust evaluation as keys to success. Developers interested in exploring this method can access documentation and examples via OpenAI’s fine-tuning dashboard. For organizations with clearly defined problems and verifiable answers, RFT offers a compelling way to align models with


OpenAI just fixed ChatGPT’s most annoying business problem: meet the PDF export that changes everything

OpenAI launched a new PDF export capability for its Deep Research feature today, enabling users to download comprehensive research reports with fully preserved formatting, tables, images, and clickable citations. The seemingly modest update reveals the company’s intensifying focus on enterprise customers as competition in the AI research assistant market accelerates.

The company announced the feature via an X.com post on May 12, 2025: “You can now export your deep research reports as well-formatted PDFs—complete with tables, images, linked citations, and sources. Just click the share icon and select ‘Download as PDF.’ It works for both new and past reports.” The capability is immediately available to all Plus, Team, and Pro subscribers, with Enterprise and Education users gaining access “soon,” according to a follow-up tweet.

How OpenAI’s enterprise strategy is rapidly accelerating under new leadership

This update represents a strategic shift for OpenAI as it aggressively targets professional and enterprise markets. The timing is particularly significant following last week’s hiring of Instacart CEO Fidji Simo to lead OpenAI’s new “Applications” division. The creation of a dedicated Applications unit under Simo’s leadership signals OpenAI’s recognition that business growth depends not just on cutting-edge research but on packaging capabilities in ways that solve specific business problems. PDF export directly addresses a practical pain point for professional users who need to share polished, verifiable research with colleagues and clients.

Deep Research itself embodies this enterprise-focused strategy. The feature, which can analyze hundreds of online sources to produce comprehensive reports on complex topics, directly addresses high-value knowledge work in industries like finance, consulting, and legal services — areas where the ability to quickly synthesize information from disparate sources translates directly to billable hours and competitive advantage. What’s particularly telling is OpenAI’s willingness to dedicate engineering resources to workflow features rather than focusing exclusively on model capabilities. This indicates a maturing understanding that in enterprise environments, integration often matters more than raw technical performance.

Inside the high-stakes battle for AI research assistant dominance

The PDF enhancement arrives amid intensifying competition in the AI research assistant market. Perplexity launched its Deep Research feature in February with PDF export included from the start. You.com introduced its Advanced Research & Insights (ARI) agent in late February, aggressively marketing it as processing “over 3-10x more sources” than ChatGPT Deep Research while delivering results “3x faster.” Most recently, Anthropic announced web search capabilities for Claude on May 7th, directly challenging Deep Research’s core functionality of synthesizing information from across the web. The competitive differentiation between these offerings is rapidly shifting from basic capabilities to speed, comprehensiveness, and workflow integration.
For business users, the deciding factors increasingly revolve around which tool best fits into existing processes and delivers reliable, verifiable results with minimal friction. This competitive dynamic creates pressure for rapid feature parity. When one provider introduces capabilities that address key workflow challenges, others must quickly match them or risk losing market share in high-value sectors. OpenAI’s addition of PDF export acknowledges this reality — the feature has become table stakes for serious contenders in the enterprise AI research space. The speed with which these companies are iterating suggests we’re entering a new phase of AI product development where user experience and workflow integration take precedence over pure technical capabilities — at least for features targeted at enterprise markets.

Why PDF export transforms AI research from experimental to essential

The technical implementation of PDF export represents far more than a convenience feature. It transforms Deep Research from an interesting capability into a practical business tool by addressing several critical requirements for enterprise adoption.

First, it bridges the gap between cutting-edge AI and traditional business communication. While Silicon Valley may embrace chat interfaces, most organizations still operate on documents, presentations, and reports. By enabling seamless export to traditional formats, OpenAI acknowledges this reality rather than forcing users to adapt to new paradigms.

Second, the preservation of citations as clickable links addresses the critical need for verifiability in professional contexts. In regulated industries, the ability to trace information back to its source isn’t optional—it’s mandatory for compliance and risk management. Without verifiable sources, AI-generated research lacks credibility in high-stakes decision-making environments.

Perhaps most importantly, the PDF export capability dramatically improves Deep Research’s shareability. AI-generated insights create value only when they can be effectively distributed to decision-makers. By enabling users to generate professional-looking documents directly from research sessions, OpenAI removes a significant barrier to broader organizational adoption.

The feature’s implementation across both new and past reports also demonstrates technical foresight. This backward compatibility suggests OpenAI designed Deep Research with a consistent underlying structure that enables uniform rendering across different output formats — indicative of solid product planning rather than reactive feature development.

What enterprise AI adoption patterns reveal about future product development

This feature release highlights a fundamental shift in how AI tools are evolving from experimental technologies to practical business applications. The initial wave of generative AI adoption was characterized by exploration and novelty — organizations experimenting with capabilities and identifying potential use cases. Now we’re entering a more mature phase where successful AI features must integrate seamlessly into existing workflows rather than requiring users to adopt entirely new ways of working. This evolution mirrors the historical pattern of other transformative technologies, from personal computers to mobile devices, where initial excitement over raw capabilities eventually gives way to practical considerations about how the technology fits into daily work.
For technical decision-makers evaluating AI research assistants, this trend suggests prioritizing tools that complement existing workflows while delivering substantial productivity gains. Features that create friction — like requiring manual reformatting of outputs before they can be shared — become significant barriers to adoption regardless of how impressive


What your tools miss at 2:13 AM: How gen AI attack chains exploit telemetry lag

Generative AI is creating a digital diaspora of techniques, technologies and tradecraft that everyone, from rogue attackers to nation-state cyber armies trained in the art of cyberwar, is adopting. Insider threats are growing, too, accelerated by job insecurity and growing inflation. All these challenges and more fall on the shoulders of the CISO, and it’s no wonder more are dealing with burnout.

In Part 1, we explored how gen AI is reshaping the threat landscape, accelerating insider threats and putting unprecedented pressure on cybersecurity teams. Insider-driven risks, shadow AI usage and outdated detection models are forcing CISOs to rethink their defenses. Now, in Part 2, we turn to the solutions — how gen AI can help combat burnout across security operations centers (SOCs), enable smarter automation and guide CISOs through a 90-day roadmap to secure their enterprises against evolving threats.

Battling burnout with gen AI deserves to be a 2025 CISO priority

Nearly one in four CISOs consider quitting, with 93% citing extreme stress, further proving that burnout is creating increasingly severe operational and human risks. Gartner’s most recent research links burnout to decreased team efficiency and overlooked security tasks that often become vulnerabilities. Unsurprisingly, 90% of CISOs identify burnout as one of the main barriers that stand in the way of their teams getting more accomplished and using the full extent of their skills.

How bad is burnout across cybersecurity and SOC teams?

The majority of CISOs, 65%, say that burnout is a severe impediment to maintaining effective security operations. Forrester adds that 36% of the cybersecurity workforce are categorized as “Tired Rockstars” — individuals who remain highly engaged but are on the brink of burnout. This emphasizes the critical need to address mental health and workload management proactively.

SOC analysts endure heavy workloads that often turn severe when they have to monitor, analyze and aggregate insights from an average of more than 10,000 alerts a day. Chronic stress and a lack of control over their jobs lead to high turnover, with 65% considering leaving their careers. Ivanti’s 2024 Digital Employee Experience (DEX) Report underscores a vital cybersecurity link, noting that 93% of professionals agree improved DEX strengthens security, yet just 13% prioritize it. Ivanti SVP Daren Goeson told VentureBeat in a recent interview that “organizations often lack effective tools to measure digital employee experience, significantly slowing security and productivity initiatives.”

SOC teams are particularly hard hit by burnout. While AI can’t solve the entire challenge, it can help automate SOC workflows and accelerate triage. Forrester is urging CISOs to think beyond automating existing processes and to move forward with rationalizing security controls and deploying gen AI within existing platforms. Jeff Pollard, VP at Forrester, writes: “The only way to deal with the volatility your organization encounters is to simplify your control stack while identifying unnecessary duplicate spend, and gen AI can boost productivity, but negotiating its pricing strategically will help you achieve more with less.” More than 16 vendors now offer gen-AI-based apps aimed at helping SOC teams, which are in a race against time every day, especially when it comes to containing breakout times.
CrowdStrike’s recent global threat report emphasizes why SOCs always need to be on their A-game: adversaries now break out within 2 minutes and 7 seconds of gaining initial access. CrowdStrike’s recently introduced Charlotte AI Detection Triage has proven capable of automating alert assessment with over 98% accuracy. It cuts manual triage by more than 40 hours per week, all without losing control or precision. SOCs increasingly lean on AI copilots to fight signal overload and staffing shortfalls. VentureBeat’s Security Copilot Guide (Google Sheet) provides a complete matrix of 16 vendors’ AI security copilots.

What needs to be on every CISO’s roadmap in 2025

Cybersecurity leaders and their teams have significant influence on how, when and what gen AI applications and platforms their enterprises invest in. Gartner’s Phillip Shattan writes that “when it comes to gen AI-related decisions, SRM leaders wield significant influence, with over 70% reporting that cybersecurity has some influence over the decisions they make.”

With so much influence on the future of gen AI investment in their organizations, CISOs need a solid framework or roadmap against which to plan. VentureBeat is seeing more roadmaps comparable to the one structured below for ensuring the integration of gen AI, cybersecurity and risk management initiatives. The following is a guideline that needs to be tailored to the unique needs of a business:

Days 0–30: Establish core cybersecurity foundations

1. Set the goal of defining the structure and role of an AI governance framework

- Define formal AI policies outlining responsible data use, model training protocols, privacy controls and ethical standards. Vendors to consider: IBM AI Governance, Microsoft Purview, ServiceNow AI Governance, AWS AI Service Cards.
- If not already in place, deploy real-time AI monitoring tools to detect unauthorized usage, anomalous behaviors and data leakage from models. Recommended platforms: Robust Intelligence, CalypsoAI, HiddenLayer, Arize AI, Credo AI, Arthur AI.
- Train SOC, security and risk management teams on AI-specific risks to alleviate any conflicts over how AI governance frameworks are designed to work.

2. If not already in place, get a solid identity and access management (IAM) platform in place

- Keep building a business case for zero trust by illustrating how improving identity protection helps protect and grow revenue.
- Deploy a robust IAM solution to reinforce identity protection and revenue security. Top IAM platforms: Okta Identity Cloud, Microsoft Entra ID, CyberArk Identity, ForgeRock, Ping Identity, SailPoint Identity Platform, Ivanti Identity Director.
- If not already done, immediately conduct comprehensive audits of all user identities, focusing particularly on privileged access accounts. Enable real-time monitoring for all privileged access accounts and delete unused accounts for contractors.
- Implement strict least-privilege access policies, multi-factor authentication (MFA) and continuous adaptive authentication based on contextual risk assessments to strengthen your zero-trust framework. Leading zero-trust solutions include CrowdStrike Falcon Identity Protection, Zscaler Zero Trust Exchange, Palo Alto Networks Prisma Access, Cisco Duo Security and Cloudflare


5 strategies that separate AI leaders from the 92% still stuck in pilot mode

As AI moves from experimentation to real-world deployments, enterprises are determining best practices for what actually works at scale. Multiple studies from various vendors have outlined the core challenges. According to a recent report from Vellum, only 25% of organizations have deployed AI in production, with even fewer seeing measurable impact. A report from Deloitte found similar challenges, with organizations struggling with issues of scalability and risk management. A new study from Accenture, out this week, provides a data-driven analysis of how leading companies are successfully implementing AI across their enterprises.

The “Front-Runners’ Guide to Scaling AI” report is based on a survey of 2,000 C-suite and data science executives from nearly 2,000 global companies with revenues exceeding $1 billion. The findings reveal a significant gap between AI aspirations and execution, and they paint a sobering picture: only 8% of companies qualify as true “front-runners” that have successfully scaled multiple strategic AI initiatives, while 92% struggle to advance beyond experimental implementations.

For enterprise IT leaders navigating AI implementation, the report offers critical insights into what separates successful AI scaling from stalled initiatives, highlighting the importance of strategic bets, talent development and data infrastructure. Here are five key takeaways for enterprise IT leaders from Accenture’s research.

1. Talent maturity outweighs investment as the key scaling factor

While many organizations focus primarily on technology investment, Accenture’s research reveals that talent development is actually the most critical differentiator for successful AI implementation. “We found the top achievement factor wasn’t investment but rather talent maturity,” Senthil Ramani, data and AI lead at Accenture, told VentureBeat. “Front-runners had four times greater talent maturity compared to other groups, leading by executing talent strategies more effectively and directing talent-related spending to the highest-value uses.”

The report shows front-runners differentiate themselves through people-centered strategies. They focus four times more on cultural adaptation than other companies, emphasize talent alignment three times more and implement structured training programs at twice the rate of competitors.

IT leader action item: Develop a comprehensive talent strategy that addresses both technical skills and cultural adaptation. Establish a centralized AI center of excellence — the report shows 57% of front-runners use this model compared to just 16% of fast-followers.

2. Data infrastructure makes or breaks AI scaling efforts

Perhaps the most significant barrier to enterprise-wide AI implementation is inadequate data readiness. According to the report, 70% of surveyed companies acknowledged the need for a strong data foundation when trying to scale AI. “The biggest challenge for most companies trying to scale AI is the development of the right data infrastructure,” Ramani said. “97% of front-runners have developed three or more new data and AI capabilities for gen AI, compared to just 5% of companies that are experimenting with AI.”
These essential capabilities include advanced data management techniques like retrieval-augmented generation (RAG), used by 17% of front-runners vs. 1% of fast-followers, and knowledge graphs (26% vs. 3%), as well as diverse data utilization across zero-party, second-party, third-party and synthetic sources.

IT leader action item: Conduct a comprehensive data readiness assessment explicitly focused on AI implementation requirements. Prioritize building capabilities to handle unstructured data alongside structured data, and develop a strategy for integrating tacit organizational knowledge.

3. Strategic bets deliver superior returns to broad implementation

While many organizations attempt to implement AI across multiple functions simultaneously, Accenture’s research shows that focused strategic bets yield significantly better results. “C-suite leaders first need to agree on—then clearly articulate—what value means for their company, as well as how they hope to achieve it,” Ramani said. “In the report, we referred to ‘strategic bets,’ or significant, long-term investments in gen AI focusing on the core of a company’s value chain and offering a very large payoff. This strategic focus is essential for maximizing the potential of AI and ensuring that investments deliver sustained business value.”

This focused approach pays dividends. Companies that have scaled at least one strategic bet are nearly three times more likely to have their ROI from gen AI surpass forecasts than those that haven’t.

IT leader action item: Identify 3-4 industry-specific strategic AI investments that directly impact your core value chain rather than pursuing broad implementation.

4. Responsible AI creates value beyond risk mitigation

Most organizations view responsible AI primarily as a compliance exercise, but Accenture’s research reveals that mature responsible AI practices directly contribute to business performance. “Companies need to shift their mindset from viewing responsible AI as a compliance obligation to recognizing it as a strategic enabler of business value,” Ramani explained. “ROI can be measured in terms of short-term efficiencies, such as improvements in workflows, but it really should be measured against longer-term business transformation.”

The report emphasizes that responsible AI includes not just risk mitigation but also strengthens customer trust, improves product quality and bolsters talent acquisition — directly contributing to financial performance.

IT leader action item: Develop comprehensive responsible AI governance that goes beyond compliance checkboxes. Implement proactive monitoring systems that continually assess AI risks and impacts. Consider building responsible AI principles directly into your development processes rather than applying them retroactively.

5. Front-runners embrace agentic AI architecture

The report highlights a transformative trend among front-runners: the deployment of “agentic architecture” — networks of AI agents that autonomously orchestrate entire business workflows. Front-runners demonstrate significantly greater maturity in deploying autonomous AI agents tailored to industry needs. The report shows 65% of front-runners excel in this capability compared to 50% of fast-followers, with one-third of surveyed companies already using AI agents to strengthen innovation. These intelligent agent networks represent a fundamental shift from traditional AI applications. They enable sophisticated collaboration between AI systems that dramatically improves quality, productivity and cost-efficiency at scale.
IT leader action item: Begin exploring how agentic AI could transform core business processes by identifying workflows that would benefit from autonomous orchestration. Create pilot projects focused on multi-agent systems in your industry’s high-value use cases.

The tangible rewards of AI maturity


Meta, Cisco put open-source LLMs at the core of next-gen SOC workflows

With cyberattacks accelerating at machine speed, open-source large language models (LLMs) have quickly become the infrastructure that enables startups and global cybersecurity leaders to develop and deploy adaptive, cost-effective defenses against threats that evolve faster than human analysts can respond. Open-source LLMs’ initial advantages of faster time-to-market, greater adaptability and lower cost have created a scalable, secure foundation for delivering infrastructure. At last week’s RSAC 2025 conference, Cisco, Meta and ProjectDiscovery announced new open-source LLMs and a community-driven attack surface innovation that together define the future of open source in cybersecurity.

One of the key takeaways from this year’s RSAC is the shift toward open-source LLMs that extend and strengthen infrastructure at scale. Open-source AI is on the verge of delivering what many cybersecurity leaders have called for over the years: the ability for the many cybersecurity providers to join forces against increasingly complex threats. The vision of being collaborators in creating a unified, open-source LLM and infrastructure is a step closer, given the announcements at RSAC.

Cisco’s Chief Product Officer Jeetu Patel emphasized in his keynote: “The true enemy is not our competitor. It is actually the adversary. And we want to make sure that we can provide all kinds of tools and have the ecosystem band together so that we can actually collectively fight the adversary.” Patel explained the urgency of taking on such a complex challenge, saying, “AI is fundamentally changing everything, and cybersecurity is at the heart of it all. We’re no longer dealing with human-scale threats; these attacks are occurring at machine scale.”

Cisco’s Foundation-sec-8B LLM defines a new era of open-source AI

Cisco’s newly established Foundation AI group originates from the company’s recent acquisition of Robust Intelligence. Foundation AI’s focus is on delivering domain-specific AI infrastructure tailored explicitly to cybersecurity applications, which are among the most challenging to solve. Built on Meta’s Llama 3.1 architecture, this 8-billion-parameter, open-weight large language model isn’t a retrofitted general-purpose AI. It was purpose-built, meticulously trained on a cybersecurity-specific dataset curated in-house by Cisco Foundation AI.

“By their nature, the problems in this charter are some of the most difficult ones in AI today. To make the technology accessible, we decided that most of the work we do in Foundation AI should be open. Open innovation allows for compounding effects across the industry, and it plays a particularly important role in the cybersecurity domain,” writes Yaron Singer, VP of AI and Security at Foundation AI.

With open source anchoring Foundation AI, Cisco has designed an efficient architectural approach that lets cybersecurity providers who typically compete with each other, selling comparable solutions, become collaborators in creating more unified, hardened defenses. Singer writes, “Whether you’re embedding it into existing tools or building entirely new workflows, foundation-sec-8b adapts to your organization’s unique needs.” Cisco’s blog post announcing the model recommends that security teams apply foundation-sec-8b across the security lifecycle.
Potential use cases Cisco recommends for the model include SOC acceleration, proactive threat defense, engineering enablement, AI-assisted code reviews, validating configurations and custom integration. Foundation-sec-8B’s weights and tokenizer have been open-sourced under the permissive Apache 2.0 license on Hugging Face, allowing enterprise-level customization and deployment without vendor lock-in while maintaining compliance and privacy controls. Cisco’s blog also notes plans to open-source the training pipeline, further fostering community-driven innovation.

Cybersecurity is in the LLM’s DNA

Cisco chose to create a cybersecurity-specific model optimized for the needs of SOC, DevSecOps and large-scale security teams. Retrofitting an existing, generic AI model wouldn’t get them to their goal, so the Foundation AI team engineered its training using a large-scale, expansive and well-curated cybersecurity-specific dataset. By taking a more precision-focused approach to building the model, the Foundation AI team was able to ensure that the model deeply understands real-world cyber threats, vulnerabilities and defensive strategies. Key training datasets included the following:

- Vulnerability databases: Detailed CVEs (Common Vulnerabilities and Exposures) and CWEs (Common Weakness Enumerations) to pinpoint known threats and weaknesses.
- Threat behavior mappings: Structured from proven security frameworks such as MITRE ATT&CK, providing context on attacker methodologies and behaviors.
- Threat intelligence reports: Comprehensive insights derived from global cybersecurity events and emerging threats.
- Red-team playbooks: Tactical plans outlining real-world adversarial techniques and penetration strategies.
- Real-world incident summaries: Documented analyses of cybersecurity breaches, incidents, and their mitigation paths.
- Compliance and security guidelines: Established best practices from leading standards bodies, including the National Institute of Standards and Technology (NIST) frameworks and the Open Worldwide Application Security Project (OWASP) secure coding principles.

This tailored training regimen positions Foundation-sec-8B uniquely to excel at complex cybersecurity tasks, offering significantly enhanced accuracy, deeper contextual understanding and quicker threat response capabilities than general-purpose alternatives.

Benchmarking Foundation-sec-8B LLM

Cisco’s technical benchmarks show Foundation-sec-8B delivers cybersecurity performance comparable to significantly larger models:

Benchmark | Foundation-sec-8B | Llama-3.1-8B | Llama-3.1-70B
CTI-MCQA | 67.39 | 64.14 | 68.23
CTI-RCM | 75.26 | 66.43 | 72.66

By designing the foundation model to be cybersecurity-specific, Cisco is enabling SOC teams to gain greater efficiency with advanced threat analytics without having to pay high infrastructure costs to get it. Cisco’s broader strategic vision, detailed in its blog post “Foundation AI: Robust Intelligence for Cybersecurity,” addresses common AI integration challenges, including limited domain alignment of general-purpose models, insufficient datasets and legacy system integration difficulties. Foundation-sec-8B is specifically designed to navigate these barriers, running efficiently on minimal hardware configurations, typically requiring just one or two Nvidia A100 GPUs.
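Because the weights are published on Hugging Face under Apache 2.0, trying the model follows the standard transformers loading pattern. The sketch below is a hypothetical usage example: the repository ID and prompt are assumptions to check against Cisco's model card, not Cisco-published sample code.

```python
# Hypothetical usage sketch for Foundation-sec-8B via Hugging Face
# transformers. Repo ID and prompt are assumptions; consult the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "fdtn-ai/Foundation-Sec-8B"  # assumed Hugging Face repo ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # sized to fit one or two A100s, per Cisco
    device_map="auto",
)

prompt = "Summarize the risk posed by CVE-2021-44228 (Log4Shell) for SOC triage:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
# Print only the newly generated continuation, not the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```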
Meta also underscored its open-source strategy at RSAC 2025, expanding its AI Defenders Suite to strengthen security across generative AI infrastructure. Meta’s open-source toolkit now includes Llama Guard 4, a multimodal classifier that detects policy violations across text and images, improving compliance monitoring within AI workflows. Also introduced is LlamaFirewall, an open-source, real-time security framework integrating modular capabilities. These include PromptGuard 2, which detects prompt injections and jailbreak attempts; Agent Alignment Checks, which monitor and protect AI agent decision-making processes; and CodeShield, which is designed to inspect


Microsoft launches Phi-4-Reasoning-Plus, a small, powerful, open weights reasoning model!

Microsoft Research has announced the release of Phi-4-reasoning-plus, an open-weight language model built for tasks requiring deep, structured reasoning. Building on the architecture of the previously released Phi-4, the new model integrates supervised fine-tuning and reinforcement learning to deliver improved performance on benchmarks in mathematics, science, coding, and logic-based tasks.

Phi-4-reasoning-plus is a 14-billion-parameter dense decoder-only transformer model that emphasizes quality over scale. Its training process involved 16 billion tokens — about 8.3 billion of them unique — drawn from synthetic and curated web-based datasets. A reinforcement learning (RL) phase, using only about 6,400 math-focused problems, further refined the model’s reasoning capabilities.

The model has been released under a permissive MIT license — enabling its use for broad commercial and enterprise applications, and fine-tuning or distillation, without restriction — and is compatible with widely used inference frameworks including Hugging Face Transformers, vLLM, llama.cpp, and Ollama. Microsoft provides detailed recommendations on inference parameters and system prompt formatting to help developers get the most from the model.

Outperforms larger models

The model’s development reflects Microsoft’s growing emphasis on training smaller models capable of rivaling much larger systems in performance. Despite its relatively modest size, Phi-4-reasoning-plus outperforms larger open-weight models such as DeepSeek-R1-Distill-70B on a number of demanding benchmarks. On the AIME 2025 math exam, for instance, it delivers higher average first-attempt accuracy across all 30 questions (a metric known as “pass@1”) than the 70B-parameter distillation model, and approaches the performance of DeepSeek-R1 itself, which is far larger at 671B parameters.

Structured thinking via fine-tuning

To achieve this, Microsoft employed a data-centric training strategy. During the supervised fine-tuning stage, the model was trained using a curated blend of synthetic chain-of-thought reasoning traces and filtered high-quality prompts. A key innovation in the training approach was the use of structured reasoning outputs marked with special <think> and </think> tokens. These guide the model to separate its intermediate reasoning steps from the final answer, promoting both transparency and coherence in long-form problem solving.

Reinforcement learning for accuracy and depth

Following fine-tuning, Microsoft used outcome-based reinforcement learning — specifically, the Group Relative Policy Optimization (GRPO) algorithm — to improve the model’s output accuracy and efficiency. The RL reward function was crafted to balance correctness with conciseness, penalize repetition, and enforce formatting consistency. This led to longer but more thoughtful responses, particularly on questions where the model initially lacked confidence.

Optimized for research and engineering constraints

Phi-4-reasoning-plus is intended for use in applications that benefit from high-quality reasoning under memory or latency constraints. It supports a context length of 32,000 tokens by default and has demonstrated stable performance in experiments with inputs up to 64,000 tokens. It is best used in a chat-like setting and performs optimally with a system prompt that explicitly instructs it to reason through problems step-by-step before presenting a solution.
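As a concrete illustration of that structured format, here is a minimal sketch of how an application might separate the reasoning trace from the final answer, assuming the <think> and </think> markers appear verbatim in the generated text as described above.

```python
# Minimal sketch: split a generation into (reasoning, answer), assuming
# the model emits its chain of thought between <think> and </think>.
import re


def split_reasoning(generated: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no markers found."""
    match = re.search(r"<think>(.*?)</think>", generated, flags=re.DOTALL)
    if not match:
        return "", generated.strip()
    reasoning = match.group(1).strip()
    answer = generated[match.end():].strip()
    return reasoning, answer


reasoning, answer = split_reasoning(
    "<think>13 has no divisors besides 1 and 13.</think> Yes, 13 is prime."
)
print(answer)  # -> "Yes, 13 is prime."
```

A separator like this is also the natural hook for the auditability use cases discussed below: the reasoning half can be logged or reviewed while only the answer is shown to end users.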
Extensive safety testing and use guidelines

Microsoft positions the model as a research tool and a component for generative AI systems rather than a drop-in solution for all downstream tasks. Developers are advised to carefully evaluate performance, safety, and fairness before deploying the model in high-stakes or regulated environments. Phi-4-reasoning-plus has undergone extensive safety evaluation, including red-teaming by Microsoft’s AI Red Team and benchmarking with tools like Toxigen to assess its responses across sensitive content categories. According to Microsoft, this release demonstrates that with carefully curated data and training techniques, small models can deliver strong reasoning performance — with democratized, open access to boot.

Implications for enterprise technical decision-makers

The release of Microsoft’s Phi-4-reasoning-plus may present meaningful opportunities for enterprise technical stakeholders managing AI model development, orchestration, or data infrastructure.

For AI engineers and model lifecycle managers, the model’s 14B-parameter size, coupled with competitive benchmark performance, introduces a viable option for high-performance reasoning without the infrastructure demands of significantly larger models. Its compatibility with frameworks such as Hugging Face Transformers, vLLM, llama.cpp, and Ollama provides deployment flexibility across different enterprise stacks, including containerized and serverless environments.

Teams responsible for deploying and scaling machine learning models may find the model’s support for 32k-token contexts — expandable to 64k in testing — particularly useful in document-heavy use cases such as legal analysis, technical QA, or financial modeling. The built-in structure of separating chain-of-thought reasoning from the final answer could also simplify integration into interfaces where interpretability or auditability is required.

For AI orchestration teams, Phi-4-reasoning-plus offers a model architecture that can be more easily slotted into pipelines with resource constraints. This is relevant in scenarios where real-time reasoning must occur under latency or cost limits. Its demonstrated ability to generalize to out-of-domain problems, including NP-hard tasks like 3SAT and TSP, suggests utility in algorithmic planning and decision-support use cases beyond those explicitly targeted during training.

Data engineering leads may also consider the model’s reasoning format — designed to reflect intermediate problem-solving steps — as a mechanism for tracking logical consistency across long sequences of structured data. The structured output format could be integrated into validation layers or logging systems to support explainability in data-rich applications.

From a governance and safety standpoint, Phi-4-reasoning-plus incorporates multiple layers of post-training safety alignment and has undergone adversarial testing by Microsoft’s internal AI Red Team. For organizations subject to compliance or audit requirements, this may reduce the overhead of developing custom alignment workflows from scratch.
Overall, Phi-4-reasoning-plus shows how the reasoning craze kicked off by the likes of OpenAI’s “o” series of models and DeepSeek R1 is continuing to accelerate and move downstream to smaller, more accessible, affordable, and customizable models. For technical decision-makers tasked with managing performance, scalability, cost, and risk, it offers a modular, interpretable alternative that can be evaluated and integrated on a flexible basis — whether in isolated inference endpoints, embedded tooling, or full-stack generative AI systems.


Alibaba’s ‘ZeroSearch’ lets AI learn to google itself — slashing training costs by 88 percent

Researchers at Alibaba Group have developed a novel approach that could dramatically reduce the cost and complexity of training AI systems to search for information, eliminating the need for expensive commercial search engine APIs altogether. The technique, called “ZeroSearch,” allows large language models (LLMs) to develop advanced search capabilities through simulation rather than by interacting with real search engines during the training process. This innovation could save companies significant API expenses while offering better control over how AI systems learn to retrieve information.

“Reinforcement learning [RL] training requires frequent rollouts, potentially involving hundreds of thousands of search requests, which incur substantial API expenses and severely constrain scalability,” the researchers write in their paper published on arXiv this week. “To address these challenges, we introduce ZeroSearch, a reinforcement learning framework that incentivizes the search capabilities of LLMs without interacting with real search engines.”

How ZeroSearch trains AI to search without search engines

The problem that ZeroSearch solves is significant. Companies developing AI assistants that can autonomously search for information face two major challenges: the unpredictable quality of documents returned by search engines during training, and the prohibitively high costs of making hundreds of thousands of API calls to commercial search engines like Google.

Alibaba’s approach begins with a lightweight supervised fine-tuning process to transform an LLM into a retrieval module capable of generating both relevant and irrelevant documents in response to a query. During reinforcement learning training, the system employs what the researchers call a “curriculum-based rollout strategy” that gradually degrades the quality of generated documents. “Our key insight is that LLMs have acquired extensive world knowledge during large-scale pretraining and are capable of generating relevant documents given a search query,” the researchers explain. “The primary difference between a real search engine and a simulation LLM lies in the textual style of the returned content.”
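To make the curriculum idea concrete, here is a minimal, hypothetical sketch of such a schedule: the share of deliberately irrelevant documents grows as training progresses, so the policy faces an increasingly noisy "search engine." The linear schedule, parameters and stubbed helper are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative sketch of a curriculum-based rollout: the probability that
# the simulation LLM returns a noisy (irrelevant) document rises over
# training. Schedule shape and values are assumptions for illustration.
import random


def noisy_doc_probability(step: int, total_steps: int,
                          p_start: float = 0.1, p_end: float = 0.7) -> float:
    """Linearly anneal the share of irrelevant documents over training."""
    frac = min(step / max(total_steps, 1), 1.0)
    return p_start + (p_end - p_start) * frac


def generate_document(query: str, style: str) -> str:
    # Placeholder for the fine-tuned simulation LLM; in ZeroSearch the
    # prompt differs for relevant vs. irrelevant generations.
    return f"[{style} document for: {query}]"


def simulated_search(query: str, step: int, total_steps: int) -> str:
    if random.random() < noisy_doc_probability(step, total_steps):
        return generate_document(query, style="irrelevant")
    return generate_document(query, style="relevant")
```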
Outperforming Google at a fraction of the cost

In comprehensive experiments across seven question-answering datasets, ZeroSearch not only matched but often surpassed the performance of models trained with real search engines. Remarkably, a 7B-parameter retrieval module achieved performance comparable to Google Search, while a 14B-parameter module even outperformed it. The cost savings are substantial: according to the researchers’ analysis, training with approximately 64,000 search queries using Google Search via SerpAPI would cost about $586.70, while using a 14B-parameter simulation LLM on four A100 GPUs costs only $70.80 — an 88% reduction. “This demonstrates the feasibility of using a well-trained LLM as a substitute for real search engines in reinforcement learning setups,” the paper notes.

What this means for the future of AI development

This breakthrough represents a major shift in how AI systems can be trained. ZeroSearch shows that AI can improve without depending on external tools like search engines. The impact could be substantial for the AI industry. Until now, training advanced AI systems often required expensive API calls to services controlled by big tech companies. ZeroSearch changes this equation by allowing AI to simulate search instead of using actual search engines. For smaller AI companies and startups with limited budgets, this approach could level the playing field. The high costs of API calls have been a major barrier to entry in developing sophisticated AI assistants. By cutting these costs by nearly 90%, ZeroSearch makes advanced AI training more accessible.

Beyond cost savings, this technique gives developers more control over the training process. When using real search engines, the quality of returned documents is unpredictable. With simulated search, developers can precisely control what information the AI sees during training. The technique works across multiple model families, including Qwen-2.5 and LLaMA-3.2, and with both base and instruction-tuned variants. The researchers have made their code, datasets, and pre-trained models available on GitHub and Hugging Face, allowing other researchers and companies to implement the approach.

As large language models continue to evolve, techniques like ZeroSearch suggest a future where AI systems can develop increasingly sophisticated capabilities through self-simulation rather than relying on external services — potentially changing the economics of AI development and reducing dependencies on large technology platforms. The irony is clear: in teaching AI to search without search engines, Alibaba may have created a technology that makes traditional search engines less necessary for AI development. As these systems become more self-sufficient, the technology landscape could look very different in just a few years.


Mistral comes out swinging for enterprise AI customers with new Le Chat Enterprise, Medium 3 model

French AI startup Mistral has raised boatloads of private funding but has yet to crack the top AI usage charts globally, especially when it comes to enterprise and developer adoption. That may change starting today: The company just unveiled Le Chat Enterprise, a unified AI assistant platform designed for enterprise-scale productivity and privacy, powered by its new Medium 3 model, which outperforms larger models at a fraction of the cost. (Here, "larger" refers to the number of parameters, or internal model settings; more parameters typically indicate greater complexity and capability, but also demand more compute resources, such as GPUs, to run.)

Le Chat Enterprise is a ChatGPT-like assistant and competitor built from the ground up for data protection, auditing, and cross-application support

Available on the web and via mobile apps, Le Chat Enterprise is a ChatGPT competitor built specifically for enterprises and their employees, who will likely be working across a suite of different applications and data sources. It is designed to consolidate AI functionality into a single, privacy-first environment that enables deep customization, cross-functional workflows, and rapid deployment. Key features of interest to business owners and technical decision makers include:

- Enterprise search across private data sources (the company's Google Drive, SharePoint, Gmail, and more, without exposing or releasing information to third parties)
- Document libraries with auto-summary and citation capabilities
- Custom connectors and agent builders for no-code task automation
- Custom model integrations and memory-based personalization
- Hybrid deployment options with support for public cloud, private VPCs, and on-prem hosting

Le Chat Enterprise supports seamless integration into existing tools and workflows. Companies can build AI agents tailored to their operations and maintain full sovereignty over deployment and data, without vendor lock-in. The platform's privacy architecture adheres to strict access controls and supports full audit logging, ensuring data governance for regulated industries. Enterprises also gain full control over the AI stack, from infrastructure and platform features to model-level customization and user interfaces.

Given the suspicion among some Western companies and governments toward China and the growing library of powerful open-source models from companies there, coupled with Mistral's location in the European Union and the tight data protection laws it must follow (the General Data Protection Regulation, or GDPR, and the EU AI Act), Le Chat Enterprise could appeal to many enterprises with stricter security and data storage policies, especially medium-to-large and legacy businesses.

Mistral is also rolling out improvements to its Le Chat Pro and Team plans, targeting individuals and small teams looking for productivity tools backed by its language models. All tiers benefit from the core capabilities introduced in Le Chat Enterprise.

Mistral Medium 3 outperforms GPT-4o and even Claude 3.7 Sonnet on key benchmarks and is available via API and on-prem

Mistral Medium 3 introduces a new performance tier in the company's model lineup, positioned between lightweight and large-scale models.
It is a proprietary model; unlike previous Mistral releases, it is not available under an open-source license and must be used through Mistral's website and API, or those of its partners.

Designed for enterprise use, the model delivers more than 90% of the benchmark performance of Claude 3.7 Sonnet at one-eighth the cost: $0.40 per million input tokens and $2.00 per million output tokens, compared to Sonnet's $3/$15 for input/output.

Benchmarks show that Mistral Medium 3 is particularly strong in software development tasks. In coding tests such as HumanEval and MultiPL-E, it matches or surpasses both Claude 3.7 Sonnet and OpenAI's GPT-4o. According to third-party human evaluations, it outperforms Llama 4 Maverick in 82% of coding scenarios and exceeds Command-A in nearly 70% of cases.

The model also performs competitively across languages and modalities. Compared to Llama 4 Maverick, it has higher win rates in English (67%), French (71%), Spanish (73%), and Arabic (65%), and leads in multimodal performance with notable scores on tasks like DocVQA (0.953), AI2D (0.937), and ChartQA (0.826).

Mistral Medium 3 is optimized for enterprise integration. It supports hybrid and on-premises deployment, offers custom post-training, and connects easily to business systems. According to Mistral, it is already being used in beta by organizations in sectors such as financial services, energy, and healthcare to power domain-specific workflows and customer-facing solutions.

Mistral Medium 3 is now accessible via Mistral's La Plateforme API and Amazon SageMaker, with support coming soon to IBM WatsonX, NVIDIA NIM, Azure AI Foundry, and Google Cloud Vertex AI (a minimal example of calling the model through the API follows below). Meanwhile, Le Chat Enterprise is available in the Google Cloud Marketplace and will launch shortly on Azure AI and AWS Bedrock. For those ready to explore the assistant experience, Le Chat is available at chat.mistral.ai, as well as in the App Store and Google Play Store, with no credit card required to get started.

By combining a high-efficiency model with a customizable enterprise platform, Mistral AI is making a concerted push to lower the barriers to scalable, privacy-respecting AI adoption in the enterprise world.
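For developers who want to try Medium 3 programmatically, here is a minimal sketch of a call to Mistral's chat completions endpoint on La Plateforme. The model identifier "mistral-medium-latest" and the example prompt are assumptions; consult Mistral's model documentation for the exact ID that maps to Medium 3.

```python
import os
import requests

API_URL = "https://api.mistral.ai/v1/chat/completions"

headers = {
    "Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}",  # your La Plateforme key
    "Content-Type": "application/json",
}
payload = {
    "model": "mistral-medium-latest",  # assumed identifier for Medium 3
    "messages": [
        {"role": "user", "content": "Summarize the key risks in this contract clause: ..."}
    ],
}

resp = requests.post(API_URL, headers=headers, json=payload, timeout=60)
resp.raise_for_status()
# The response follows the familiar chat-completions shape.
print(resp.json()["choices"][0]["message"]["content"])
```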


AWS report: Generative AI overtakes security in global tech budgets for 2025

Generative AI tools have surpassed cybersecurity as the top budget priority for global IT leaders heading into 2025, according to a comprehensive new study released today by Amazon Web Services.

The AWS Generative AI Adoption Index, which surveyed 3,739 senior IT decision makers across nine countries, reveals that 45% of organizations plan to prioritize generative AI spending over traditional IT investments like security tools (30%), a significant shift in corporate technology strategy as businesses race to capitalize on AI's transformative potential.

"I don't think it's cause for concern," said Rahul Pathak, Vice President of Generative AI and AI/ML Go-to-Market at AWS, in an exclusive interview with VentureBeat. "The way I interpret that is that customers' security remains a massive priority. What we're seeing with AI being such a major item from a budget prioritization perspective is that customers are seeing so many use cases for AI. It's really that there's a broad need to accelerate adoption of AI that's driving that particular outcome."

The survey, conducted across the United States, Brazil, Canada, France, Germany, India, Japan, South Korea, and the United Kingdom, shows that generative AI adoption has reached a critical inflection point, with 90% of organizations now deploying these technologies in some capacity. More tellingly, 44% have already moved beyond the experimental phase into production deployment.

IT leaders rank generative AI as their top budget priority for 2025, significantly outpacing traditional security investments. (Credit: Amazon Web Services)

60% of companies have already appointed Chief AI Officers as the C-suite transforms for the AI era

As AI initiatives scale across organizations, new leadership structures are emerging to manage the complexity. The report found that 60% of organizations have already appointed a dedicated AI executive, such as a Chief AI Officer (CAIO), with another 26% planning to do so by 2026. This executive-level commitment reflects growing recognition of AI's strategic importance, though the study notes that nearly one-quarter of organizations will still lack formal AI transformation strategies by 2026, suggesting potential challenges in change management.

"A thoughtful change management strategy will be critical," the report emphasizes. "The ideal strategy should address operating model changes, data management practices, talent pipelines, and scaling strategies."

Companies average 45 AI experiments but only 20 will reach users in 2025: the production gap challenge

Organizations conducted an average of 45 AI experiments in 2024, but only about 20 are expected to reach end users by 2025, highlighting persistent implementation challenges.

"For me to see over 40% going into production for something that's relatively new, I actually think is pretty rapid and high success rate from an adoption perspective," Pathak noted. "That said, I think customers are absolutely using AI in production at scale, and I think we want to obviously see that continue to accelerate."

The report identified talent shortages as the primary barrier to moving experiments into production, with 55% of respondents citing the lack of a skilled generative AI workforce as their biggest challenge.
"I'd say another big piece that's an unlock to getting into production successfully is customers really working backwards from what business objectives they're trying to drive, and then also understanding how will AI interact with their data," Pathak told VentureBeat. "It's really when you combine the unique insights you have about your business and your customers with AI that you can drive a differentiated business outcome."

Organizations conducted 45 AI experiments on average in 2024, but talent shortages prevent more than half from reaching production. (Credit: Amazon Web Services)

92% of organizations will hire AI talent in 2025 while 75% implement training to bridge the skills gap

To address the skills gap, organizations are pursuing dual strategies of internal training and external recruitment. The survey found that 56% of organizations have already developed generative AI training plans, with another 19% planning to do so by the end of 2025.

"For me, it's clear that it's top of mind for customers," Pathak said regarding the talent shortage. "It's, how do we make sure that we bring our teams along and employees along and get them to a place where they're able to maximize the opportunity."

Rather than specific technical skills, Pathak emphasized adaptability: "I think it's more about, can you commit to sort of learning how to use AI tools so you can build them into your day-to-day workflow and keep that agility? I think that mental agility will be important for all of us."

The talent push extends beyond training to aggressive hiring, with 92% of organizations planning to recruit for roles requiring generative AI expertise in 2025. In a quarter of organizations, at least 50% of new positions will require these skills.

One in four organizations will require generative AI skills for at least half of all new positions in 2025. (Credit: Amazon Web Services)

Financial services joins the hybrid AI revolution: only 25% of companies are building solutions from scratch

The long-running debate over whether to build proprietary AI solutions or leverage existing models appears to be resolving in favor of a hybrid approach. Only 25% of organizations plan to deploy solutions developed in-house from scratch, while 58% intend to build custom applications on pre-existing models and 55% will develop applications on fine-tuned models.

This represents a notable shift for industries traditionally known for custom development. The report found that 44% of financial services firms plan to use out-of-the-box solutions, a departure from their historical preference for proprietary systems.

"Many select customers are still building their own models," Pathak explained. "That being said, I think there's so much capability and investment that's gone into core foundation models that there are excellent starting points, and we've worked really hard to make sure customers can be confident that their data is protected. Nothing leaks into the models. Anything they do for fine-tuning or customization is private and remains their IP."

He added that companies can still leverage their proprietary knowledge while building on these existing foundation models.
