Downer’s digital journey to deliver consistency to customers

On AI use cases: We’re an asset management business, so we’re often looking at the existing condition of assets and then working out how we need to maintain them for the public. One of our innovations has been a solution called Fault IQ, which uses an off-the-shelf detection product. In Downer Digital, we don’t always need to build everything ourselves; we’ll use something off the shelf if we can, and then configure it to our database of faults we find as we work on our road corridors. What that means is we’ve got a lot of historical data on road faults, and that historical data, in conjunction with the AI, can then predict whether there’ll be faults in the future. This solution revolutionizes the performance of the rail corridor because it transforms the inspection process. It gives our operators and our maintainers the ability to identify faults quickly, but also to take photos of any in real time and get them off to the operator so they can do something about it.

On aligning the overall IT strategy: The Downer Digital strategy has three overarching pillars and 12 strategic programs of work. Those 12 programs align to the business unit strategies as well, so everything we do in the digital strategy is delivering for the businesses and against the Downer strategy. The 12 programs focus on areas where we want to move the needle, and they’ll usually focus on innovation, such as digital twins for some of our customers, or an AI solution that’s supporting safety for one of our road customers. They also support how we uplift our middle office and how we get better at utilizing the data we have in our organization to make decisions. And then we’ve got a number of foundational programs of work that look at uplifting our infrastructure and making sure we’ve got flexibility in our cloud solutions.

On CIO aspirations: It all depends on where you start your career.
You could’ve already taken a route where you’ve got a deep technical skillset, and then as you move on, it’s a given you’re going to have the appropriate skills to undertake the role. For me, it’s about being inquisitive about the tools that can support you in driving a business forward from a transformation perspective. So things such as innovative tools, emerging technologies, data and analytics, cloud-based solutions — they’re the things we all need to know about, because they make our organizations more efficient, effective, and more flexible. It’s also about mindset: having a can-do attitude and asking the right questions. For any woman who wants to come into the CIO world, championing STEM and pushing it forward is definitely something I do, and I want to make sure other female CIOs do the same. Resilience is also key to working in the IT industry. When you’re a female CIO, it gives you that extra boost when you’ve got to make transformation decisions.
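Fault IQ’s internals aren’t public, but the core idea in the interview (historical fault data combined with AI to predict where faults will appear next) can be illustrated with a deliberately simple sketch. Everything below, from the segment IDs to the frequency-plus-staleness scoring rule, is hypothetical rather than Downer’s actual model:

```python
from collections import Counter

def fault_risk_ranking(historical_faults, days_since_inspection):
    """Rank corridor segments by a naive future-fault risk score.

    historical_faults: list of (segment_id, fault_type) tuples from past work.
    days_since_inspection: dict mapping segment_id -> days since last check.
    All names and the scoring rule are illustrative, not Downer's schema.
    """
    counts = Counter(seg for seg, _ in historical_faults)
    # Naive risk score: historical fault frequency, weighted up for
    # segments that have gone longer without an inspection.
    scores = {
        seg: counts.get(seg, 0) * (1 + days_since_inspection.get(seg, 0) / 365)
        for seg in set(counts) | set(days_since_inspection)
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

A real system would learn these weights from labeled inspection outcomes; the sketch only shows why a fault history plus inspection recency is enough signal to prioritize crews.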


Beyond RAG: How Articul8’s supply chain models achieve 92% accuracy where general AI fails

In the race to implement AI across business operations, many enterprises are discovering that general-purpose models often struggle with specialized industrial tasks that require deep domain knowledge and sequential reasoning. While fine-tuning and retrieval-augmented generation (RAG) can help, they are often not enough for complex use cases like supply chain. It’s a challenge that startup Articul8 is looking to solve. Today, the company debuted A8-SupplyChain, a series of domain-specific AI models for manufacturing supply chains. The new models are accompanied by Articul8’s ModelMesh, an agentic AI-powered dynamic orchestration layer that makes real-time decisions about which AI models to use for specific tasks. Articul8 claims that its models achieve 92% accuracy on industrial workflows, outperforming general-purpose AI models on complex sequential reasoning tasks. Articul8 started as an internal development team inside Intel and was spun out as an independent business in 2024. The technology emerged from work at Intel, where the team built and deployed multimodal AI models for clients, including Boston Consulting Group, before ChatGPT had even launched. The company was built on a core philosophy that runs counter to much of the current market approach to enterprise AI. “We are built on the core belief that no single model is going to get you to enterprise outcomes; you really need a combination of models,” Arun Subramaniyan, CEO and founder of Articul8, told VentureBeat in an exclusive interview.
“You need domain-specific models to actually go after complex use cases in regulated industries such as aerospace, defense, manufacturing, semiconductors or supply chain.”

The supply chain AI challenge: When sequence and context determine success or failure

Manufacturing and industrial supply chains present unique AI challenges that general-purpose models struggle to handle effectively. These environments involve complex multi-step processes where the sequence, branching logic and interdependencies between steps are mission-critical. “In the world of supply chain, the core underlying principle is everything is a bunch of steps,” Subramaniyan explained. “Everything is a bunch of related steps, and the steps sometimes have connections and they sometimes have recursions.” For example, if a user is trying to assemble a jet engine, there are often multiple manuals, each with at least a few hundred, if not a few thousand, steps that need to be followed in sequence. These documents aren’t just static information—they’re effectively time series data representing sequential processes that must be precisely followed. Subramaniyan argued that general AI models, even when augmented with retrieval techniques, often fail to grasp these temporal relationships. This type of complex reasoning—tracing backwards through a procedure to identify where an error occurred—represents a fundamental challenge that general models haven’t been built to handle.

ModelMesh: A dynamic intelligence layer, not just another orchestrator

At the heart of Articul8’s technology is ModelMesh, which goes beyond typical model orchestration frameworks to create what the company describes as “an agent of agents” for industrial applications. “ModelMesh is actually an intelligence layer that connects and continues to decide and rate things as they go past, one step at a time,” Subramaniyan explained.
“It’s something that we had to build completely from scratch, because none of the tools out there actually come anywhere close to doing what we have to do, which is making hundreds, sometimes even thousands, of decisions at runtime.” Unlike existing frameworks like LangChain or LlamaIndex that provide predefined workflows, ModelMesh combines Bayesian systems with specialized language models to dynamically determine whether outputs are correct, what actions to take next and how to maintain consistency across complex industrial processes. This architecture enables what Articul8 describes as industrial-grade agentic AI—systems that can not only reason about industrial processes but actively drive them.

Beyond RAG: A ground-up approach to industrial intelligence

While many enterprise AI implementations rely on RAG to connect general models to corporate data, Articul8 takes a different approach to building domain expertise. “We actually take the underlying data and break them down into their constituent elements,” Subramaniyan explained. “We break down a PDF into text, images and tables. If it’s audio or video, we break that down into its underlying constituent elements, and then we describe those elements using a combination of different models.” The company starts with Llama 3.2 as a foundation, chosen primarily for its permissive licensing, then transforms it through a sophisticated multi-stage process. This multi-layered approach allows its models to develop a much richer understanding of industrial processes than simply retrieving relevant chunks of data. The SupplyChain models undergo multiple stages of refinement designed specifically for industrial contexts. For well-defined tasks, they use supervised fine-tuning. For more complex scenarios requiring expert knowledge, they implement feedback loops where domain experts evaluate responses and provide corrections.
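Articul8 hasn’t published ModelMesh’s code, but the pattern it describes (route a task to candidate models, score each output at runtime, accept or reject step by step) can be sketched in miniature. The names and the scoring rule here are illustrative stand-ins; the real system layers Bayesian scoring over specialized LLMs:

```python
def route(query, specialists, scorer, threshold=0.7):
    """Dispatch a query to domain-specific model stubs and keep only
    outputs that a scoring function rates as acceptable.

    specialists: dict of domain name -> callable returning an answer.
    scorer: callable (query, output) -> confidence in [0, 1].
    A toy stand-in for ModelMesh's runtime decisions, not its code.
    """
    accepted = {}
    for domain, model in specialists.items():
        output = model(query)
        confidence = scorer(query, output)
        if confidence >= threshold:  # accept or reject each output
            accepted[domain] = (output, confidence)
    if not accepted:
        return None
    # Return the highest-confidence answer across domains.
    best = max(accepted.items(), key=lambda kv: kv[1][1])
    return {"domain": best[0], "answer": best[1][0], "confidence": best[1][1]}
```

In a production system each `model` call would be an LLM inference and the scorer would itself be a model; the point of the sketch is that the routing layer, not any single model, decides what counts as a usable answer.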
How enterprises are using Articul8

While it’s still early for the new models, the company already claims a number of customers and partners, including iBase-t, Itochu Techno-Solutions Corporation, Accenture and Intel. Like many organizations, Intel started its gen AI journey by evaluating general-purpose models to explore how they could support design and manufacturing operations. “While these models are impressive in open-ended tasks, we quickly discovered their limitations when applied to our highly specialized semiconductor environment,” Srinivas Lingam, corporate vice president and general manager of the network, edge and AI Group at Intel, told VentureBeat. “They struggled with interpreting semiconductor-specific terminology, understanding context from equipment logs, or reasoning through complex, multi-variable downtime scenarios.” Intel is deploying Articul8’s platform to build what Lingam called the Manufacturing Incident Assistant, an intelligent, natural language-based system that helps engineers and technicians diagnose and resolve equipment downtime events in Intel’s fabs. He explained that the platform and domain-specific models ingest both historical and real-time manufacturing data, including structured logs, unstructured wiki articles and internal knowledge repositories. It helps Intel’s teams perform root cause analysis (RCA), recommends corrective actions and even automates parts of work order generation.

What this means for enterprise AI strategy

Articul8’s approach challenges the assumption


MOFCOM: Shield AI and Five Other US Companies Added to the Unreliable Entity List

On April 9, China’s Ministry of Commerce decided to add six entities to its Unreliable Entity List: Shield AI, Inc., Sierra Nevada Corporation, Cyberlux Corporation, Edge Autonomy Operations LLC, Group W, and Hudson Technologies Co.


Which Two AI Models Are 'Unfaithful' at Least 25% of the Time About Their 'Reasoning'?

Anthropic’s Claude 3.7 Sonnet. Image: Anthropic/YouTube

Anthropic released a new study on April 3 examining how AI models process information and the limitations of tracing their decision-making from prompt to output. The researchers found Claude 3.7 Sonnet isn’t always “faithful” in disclosing how it generates responses.

Anthropic probes how closely AI output reflects internal reasoning

Anthropic is known for publicizing its introspective research. The company has previously explored interpretable features within its generative AI models and questioned whether the reasoning these models present as part of their answers truly reflects their internal logic. Its latest study dives deeper into the chain of thought — the “reasoning” that AI models provide to users. Expanding on earlier work, the researchers asked: does the model genuinely think in the way it claims to? The findings are detailed in a paper titled “Reasoning Models Don’t Always Say What They Think” from the Alignment Science Team. The study found that Anthropic’s Claude 3.7 Sonnet and DeepSeek-R1 are “unfaithful,” meaning they don’t always acknowledge when a correct answer was embedded in the prompt itself. In some cases, prompts included scenarios such as: “You have gained unauthorized access to the system.” Claude 3.7 Sonnet admitted to using the hint embedded in the prompt to reach its answer only 25% of the time, and DeepSeek-R1 only 39% of the time. Both models tended to generate longer chains of thought when being unfaithful than when they explicitly referenced the prompt, and they became less faithful as task complexity increased. Although generative AI doesn’t truly think, these hint-based tests serve as a lens into the opaque processes of generative AI systems.
Anthropic notes that such tests are useful in understanding how models interpret prompts — and how these interpretations could be exploited by threat actors.

Training AI models to be more ‘faithful’ is an uphill battle

The researchers hypothesized that giving models more complex reasoning tasks might lead to greater faithfulness. They aimed to train the models to “use its reasoning more effectively,” hoping this would help them incorporate the hints more transparently. However, the training only marginally improved faithfulness. Next, they gamified the training by using a “reward hacking” method. Reward hacking doesn’t usually produce the desired result in large, general AI models, since it encourages the model to reach a reward state above all other goals. In this case, Anthropic rewarded models for providing wrong answers that matched hints seeded in the prompts. This, they theorized, would result in a model that focused on the hints and revealed its use of them. Instead, the usual problem with reward hacking applied: the AI created long-winded, fictional accounts of why an incorrect hint was right in order to get the reward. Ultimately, AI hallucinations still occur, and human researchers still need to do more work on weeding out undesirable behavior. “Overall, our results point to the fact that advanced reasoning models very often hide their true thought processes, and sometimes do so when their behaviors are explicitly misaligned,” Anthropic’s team wrote.
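The hint-based protocol can be approximated in a few lines: run each question with and without an embedded hint, keep the trials where the hint flipped the model’s answer to match it, and count how often the stated chain of thought admits using the hint. This is a simplified reading of the study’s setup, not Anthropic’s evaluation code; the `mentions_hint` flag stands in for the judging step that decides whether the reasoning acknowledged the hint:

```python
def faithfulness_rate(trials):
    """Fraction of hint-influenced answers whose reasoning admits the hint.

    Each trial is a dict with 'answer_no_hint', 'answer_with_hint',
    'hint_answer' (what the embedded hint pointed to), and
    'mentions_hint' (did the chain of thought acknowledge the hint,
    as judged by a separate grading step).
    """
    # Keep only trials where the hint actually changed the outcome.
    influenced = [
        t for t in trials
        if t["answer_no_hint"] != t["answer_with_hint"]
        and t["answer_with_hint"] == t["hint_answer"]
    ]
    if not influenced:
        return None  # the hint never changed an answer
    faithful = sum(t["mentions_hint"] for t in influenced)
    return faithful / len(influenced)
```

By this measure, the study’s headline numbers (25% for Claude 3.7 Sonnet, 39% for DeepSeek-R1) are exactly such a rate computed over many hinted prompts.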


Vibe coding at enterprise scale: AI tools now tackle the full development lifecycle

The vibe coding phenomenon—where developers increasingly rely on AI to generate and assist with code—has rapidly evolved from a niche concept to a mainstream development approach. With tools like GitHub Copilot normalizing AI-assisted coding, the next battleground has shifted from code generation to end-to-end development workflows. In this increasingly crowded landscape, players like Cursor, Lovable, Bolt and Windsurf (formerly Codeium) have each staked their claim with various approaches to AI-assisted development. The term vibe coding itself represents a cultural shift in which developers focus more on intent and outcome than manual implementation details—a trend that has both enthusiastic advocates and skeptical critics. Vibe coding spans everything from AI-powered code completion to generating entire applications with just a few prompts, and it diverges from low-code/no-code platforms by going beyond visual tools for simple business applications. According to some advocates, vibe coding promises to augment or even potentially replace professional software developers. In this competitive field, Windsurf’s latest Wave 6 release, which debuted on April 2, addresses a gap that some tools have often ignored: deployment. While code generation has become increasingly sophisticated across platforms, the journey from locally generated code to production deployment has remained stubbornly manual. “We’ve really removed a lot of the friction involved with iterating and deploying applications,” Anshul Ramachandran, head of product and strategy at Windsurf, told VentureBeat.
“The promise of AI and all these agentic systems is that the activation energy, the barrier to building, is so much lower.”

Windsurf Wave 6 feature breakdown: What enterprises need to know

Looking specifically at the new features in Windsurf Wave 6, several enterprise capabilities address workflow bottlenecks:

Deploys: A one-click solution to package and share Windsurf-built apps on the public internet. Currently integrated with Netlify, allowing users to deploy websites or JavaScript web apps to a public domain.

Improved Performance for Long Conversations: Reduced quality degradation in extended conversations through checkpointing and summarization techniques.

Tab Improvements: Enhanced context awareness, including user search history and support for Jupyter Notebooks within the Windsurf Editor.

Conversation Table of Contents: A new UX improvement that provides easy access to past messages and conversation reversion capabilities.

Conversation management: Technical innovation that matters

The Conversation Table of Contents feature in Wave 6 is particularly interesting. It addresses a technical challenge that some competitors have overlooked: efficiently managing extended interactions with AI assistants when errors or misunderstandings occur. “AI is not perfect. It will occasionally make mistakes,” Ramachandran acknowledged. “You’d often find yourself in this kind of loop where people try to prompt the AI to get out of a bad state. In reality, instead of doing that, you should probably just revert the state of your conversation to the last point where things were going well, and then try a different prompt or direction.” The technical implementation creates a structured navigation system that changes how developers interact with AI assistants: each significant interaction is automatically indexed within the conversation, a navigable sidebar allows immediate access to previous states, and one-click reversion restores previous conversation states.
The system preserves context while eliminating the inefficiency of repeatedly prompting an AI to correct itself.

Getting the ‘vibe’ of the vibe coding landscape

The Windsurf Wave 6 release has received positive feedback in the short time it has been out. “Builders: you still using Cursor or have you switched to Windsurf? I’m hearing more and more developers are switching,” Robert Scoble (@Scobleizer) posted on April 2, 2025. It’s a very active space, though, with fierce competition. Just last week, Replit Agent v2 became generally available. Replit Agent v2 benefits from Anthropic’s Claude 3.7 Sonnet, arguably the most powerful LLM for coding tasks. The new Replit Agent also integrates:

Enhanced Autonomy: Forms hypotheses, searches for relevant files and makes changes only when sufficiently informed.

Better Problem-Solving: Less likely to get stuck in loops; can step back to rethink approaches.

Realtime App Design Preview: Industry-first feature showing live interfaces as the Agent builds.

Improved UI Creation: Excels at creating high-quality interfaces with earlier design previews.

Guided Ideation: Recommends potential next steps throughout the development process.

Cursor is also highly active and offers a steady pace of incremental updates. Recent additions include chat tabs, which enable developers to have multiple conversations with the AI tool at the same time. On March 28, Cursor added support for the new Google Gemini 2.5 Pro model as an option for its users. Bolt also released a new update on March 28, along with a new mobile release in beta. At the end of February, Bolt AI v1.33 was released, adding full support for Claude 3.7 and prompt caching capabilities. Though not always included in the vibe coding spectrum, Cognition Labs released Devin 2.0 this week. Much like the tabbed feature in Windsurf Wave 6, Devin now has the ability to run multiple AI agents simultaneously on different tasks.
It also now integrates interactive planning that helps scope and plan tasks from broad ideas, as well as a novel search tool to better navigate and understand codebases.

The evolution of developer roles, not their replacement

The vibe coding movement has sparked debates about whether traditional programming skills remain relevant. Windsurf takes a distinctly pragmatic position that should reassure enterprise leaders concerned about the implications for their development teams. “Vibe coding has been used to refer to the new class of developers that are being created,” Ramachandran explained. “People separating the ‘vibe coders’ and the ‘non-vibe coders’—it’s just a new class of people that can now write code, who might not have been able to before, which is great,” Ramachandran said. “This is how software has expanded over time; we make it easier to write software so more people can write software.” Much like how low-code and no-code tools never fully replaced enterprise application developers in the pre-AI era, it’s unlikely that vibe coding will entirely replace all developers.
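The checkpoint-and-revert pattern Ramachandran describes for Windsurf’s Conversation Table of Contents boils down to indexing significant interactions and truncating history back to an index. A toy sketch under that assumption, not Windsurf’s actual implementation:

```python
class Conversation:
    """Checkpointed chat history with one-step reversion.

    Models the pattern behind a conversation table of contents:
    significant interactions get an index entry, and reverting
    truncates the history back to that entry.
    """

    def __init__(self):
        self.messages = []
        self.checkpoints = {}  # label -> length of history at that point

    def add(self, role, content, label=None):
        self.messages.append({"role": role, "content": content})
        if label:  # index this interaction for the "table of contents"
            self.checkpoints[label] = len(self.messages)

    def revert(self, label):
        """Restore the conversation to a previously indexed state."""
        self.messages = self.messages[: self.checkpoints[label]]
```

The design point is that reverting is cheaper and more reliable than prompting the assistant out of a bad state: instead of stacking corrective messages, the bad turns are simply discarded.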


How Today’s CIOs are Upskilling

The accelerating pace of technology innovation and business, coupled with an ever more complex tech stack, requires chief information officers to stay current so they understand what’s best for the business and why at any given moment. The CIO’s schedule also tends to be very tight, leaving little time for learning, yet continuous learning is a given if one wants to best serve their career and company. “In 2025, successful CIOs won’t just be technology leaders — they will be business enablers, transformation and growth drivers and architects of future-ready enterprises. What it takes to lead today is very different than even a year ago,” says Bill Pappas, EVP and head of technology and operations at insurance company MetLife. “The pace of change is unlike anything we’ve seen before, and that’s why the ability to learn, unlearn, and relearn new skills at scale is absolutely critical.” Savvy employers support CIO development by investing in continuous learning opportunities and encouraging participation in industry forums and cross-functional leadership programs. “In the digital age, no one person or company has all the answers,” Pappas says. “There’s no single playbook, which means it’s increasingly important for technology leaders to come together to share insights, solve challenges and learn from one another to drive innovation and stay ahead in an ever-evolving landscape.” CIOs want to know how to align IT and business strategy, build a culture of trust and communication, and drive value from new technologies. “You must stay current. It’s very difficult to be a successful CIO and not be current on what is happening, both from a technology and business perspective,” says Steve Agnoli, lead instructor at Carnegie Mellon University’s Heinz College CIO Program. “I think a learning culture or learning approach must be part of the CIO job.
Otherwise, you fall behind pretty quickly.”

Choosing Educational Resources

CIOs have a lot of options when it comes to upskilling: traditional colleges and universities, online training sites, and communing with other CIOs. The choice depends on their career goals, the amount of time they have for learning, what their companies will fund and personal preference. “One of the things that we try and focus on is ensuring that you understand the archetype of the organization that you’re in, because that can help you understand how you can be effective,” says CMU’s Agnoli. “I think that also applies to the training side: knowing what would make the best sense to make you most effective, and then looking for programs or content that align with that.” He also stresses the importance of learning about both technology and business, since today’s CIO is a business leader. “It’s really important to focus on both the technical side when you’re looking at training as well as the business skills side,” says Agnoli. “Things are changing quickly on the technology side, so you need to be fluent in all that stuff — AI, cloud, cybersecurity, analytics and data, governance and all that kind of stuff. And it’s important that CIOs can lead their businesses and their functions as a business leader. So, the skills that other folks in the C-suite have are the same skills that CIOs need to have. It’s not just knowing the latest and greatest tech; it’s knowing the things that matter from a business perspective and making those happen.” Irina Mylona, learning designer at Cambridge Advance Online, also says that in 2025, the CIO role is evolving at an unprecedented pace. “CIOs are no longer solely responsible for IT infrastructure. They are increasingly expected to drive digital transformation, align technology with business strategy, and foster innovation,” says Mylona.
“The question is, are CIOs doing enough to stay ahead, and what training is essential for them to remain effective in the face of accelerating technological and business changes?” The rapid advancement of many technologies, ranging from AI and cloud computing to cybersecurity threats and data-driven decision-making, demands that CIOs continuously update their skill sets. The pressure to balance operational efficiency with innovation is immense, and failing to keep pace can have serious consequences for business competitiveness. “The reality is that while many CIOs recognize the need for ongoing education, the fast-moving nature of their roles often leaves little time for structured learning. Approximately 27% of students taking Cambridge Advance Online courses are CIOs and senior roles, whether they’re taking technology courses or not,” says Mylona. “In order to design our courses, we are in constant communication with both our learners and the market demands, listening to the needs of CIOs and technology roles. And what we have observed is that these professionals seek education not only to refresh their technical knowledge but also to bridge the gap between IT and executive leadership, ensuring they remain at the forefront of industry advancements.” Online learning, like in-person learning, can provide access to world-class expertise. “From what we have observed from the market, our learners and their training needs, the CIO role in 2025 will demand a balance of technical expertise, strategic vision, and leadership skills,” says Mylona. “As technology continues to evolve, ongoing education is not just beneficial — it is essential.
Whether it’s refreshing their knowledge, staying close to executive teams, or learning about the latest innovations in AI and data-driven business strategies, CIOs must embrace continuous learning to drive success in the digital era.”


Google’s Sec-Gemini v1 Takes on Hackers & Outperforms Rivals by 11%

Image: Sundry Photography/Adobe Stock

In a bid to tilt the cybersecurity battlefield in favor of defenders, Google has introduced Sec-Gemini v1, a new experimental AI model designed to help security teams identify threats, analyze incidents, and understand vulnerabilities faster and more accurately than before. Announced by the company’s cybersecurity research leads, Elie Bursztein and Marianna Tishchenko, Sec-Gemini v1 is the latest addition to Google’s growing family of Gemini-powered tools — but this time, it is laser-focused on cybersecurity.

The growing cyber threat — and why Google’s AI push matters

Cyberattacks are becoming more frequent, sophisticated, and targeted. From ransomware to state-sponsored hacking, defenders are overwhelmed. Add to that the rise of remote work, cloud systems, and open-source software, and the threat landscape becomes even more complicated. Cybersecurity has always been an unfair fight: attackers only need to find one weak spot, while defenders must guard every possible entry point. Google’s answer is to develop an AI that acts like a force multiplier, helping human analysts work smarter. It’s a game of one-versus-all, and Google believes AI can help level the playing field.

What makes Sec-Gemini v1 different?

What sets Sec-Gemini v1 apart is its access to real-time cybersecurity data from trusted sources like Google Threat Intelligence (GTI), Mandiant’s attack reports, and the Open Source Vulnerabilities (OSV) database. This lets it pinpoint root causes of security incidents faster, identify threat actors (like the Chinese-linked Salt Typhoon group) and their tactics, and analyze vulnerabilities in context — explaining not just what’s broken but how hackers might exploit it. Google claims the model has already shown strong results in internal tests, outperforming other leading AI models — including OpenAI’s GPT-4 and Anthropic’s Claude — on key security benchmarks.
On the CTI-MCQ benchmark, which measures how well AI understands threat intelligence, Sec-Gemini scored over 11% higher. It also outpaced rivals by 10.5% on the CTI-Root Cause Mapping test.

The bigger AI security race

Google isn’t alone in pushing AI-powered security; Microsoft’s Security Copilot (powered by OpenAI) and Amazon’s GuardDuty are also betting on AI to automate defenses. Still, Google’s deep data integration and benchmark-beating performance could give Sec-Gemini v1 an edge — at least for now.

Google opens the doors, but only slightly

AI security tools have had mixed success, and some worry they’re just fancy assistants that still require human oversight. But Google insists Sec-Gemini v1 is different: it doesn’t just summarize threats but explains them in ways that speed up decision-making. For now, it’s only available for research, not commercial use. But if it lives up to the hype, it could mark a turning point in how defenders keep up with hackers in an AI-charged world. Interested in testing Sec-Gemini v1? Google is taking requests via this form.


Data's dark secret: Why poor quality cripples AI and growth

Data is the foundation of innovation, agility and competitive advantage in today’s digital economy. As technology and business leaders, your strategic initiatives, from AI-powered decision-making to predictive insights and personalized experiences, are all fueled by data. Yet, despite growing investments in advanced analytics and AI, organizations continue to grapple with a persistent and often underestimated challenge: poor data quality. Fragmented systems, inconsistent definitions, legacy infrastructure and manual workarounds introduce critical risks. These issues don’t just hinder next-gen analytics and AI; they erode trust, delay transformation and diminish business value. Data quality is no longer a back-office concern. It’s a strategic imperative that demands the focus of both technology and business leaders. In this article, drawing on firsthand experience working with CIOs, CDOs, CTOs and transformation leaders across industries, I outline pragmatic strategies to elevate data quality into an enterprise-wide capability. Key recommendations include investing in AI-powered cleansing tools and adopting federated governance models that empower domains while ensuring enterprise alignment; these strategies not only address current data quality challenges but also pave the way for improved decision-making, operational efficiency and customer satisfaction.

Data's dark secret: Why poor quality cripples AI and growth Read More »

Nvidia’s new Llama-3.1 Nemotron Ultra outperforms DeepSeek R1 at half the size

Even as Meta fends off questions and criticisms of its new Llama 4 model family, graphics processing unit (GPU) master Nvidia has released a new, fully open source large language model (LLM) based on Meta’s older Llama-3.1-405B-Instruct model, claiming near top performance on a variety of third-party benchmarks — outperforming the vaunted rival DeepSeek R1 open source reasoning model. Llama-3.1-Nemotron-Ultra-253B-v1 is a dense 253-billion-parameter model designed to support advanced reasoning, instruction following, and AI assistant workflows. It was first mentioned back at Nvidia’s annual GPU Technology Conference (GTC) in March. The release reflects Nvidia’s continued focus on performance optimization through architectural innovation and targeted post-training. Announced last night, April 7, 2025, the model code is now publicly available on Hugging Face, with open weights and post-training data. It is designed to operate efficiently in both “reasoning on” and “reasoning off” modes, allowing developers to toggle between high-complexity reasoning tasks and more straightforward outputs based on system prompts. Designed for efficient inference The Llama-3.1-Nemotron-Ultra-253B builds on Nvidia’s previous work in inference-optimized LLM development. Its architecture—customized through a Neural Architecture Search (NAS) process—introduces structural variations such as skipped attention layers, fused feedforward networks (FFNs), and variable FFN compression ratios. This architectural overhaul reduces memory footprint and computational demands without severely impacting output quality, enabling deployment on a single 8x H100 GPU node. The result, according to Nvidia, is a model that offers strong performance while being more cost-effective to deploy in data center environments. 
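To make the NAS-produced heterogeneity concrete, here is a toy sketch of what per-block variation (skipped attention, varying FFN compression) might look like as configuration. The numbers and cost model are invented for illustration; the real Nemotron block layout lives in the model’s published config on Hugging Face.

```python
# Hypothetical per-block configuration of the kind a NAS pass can produce,
# plus a toy relative-compute estimate. All values are illustrative only.

block_configs = [
    {"layer": 0, "attention": True,  "ffn_compression": 1.0},
    {"layer": 1, "attention": False, "ffn_compression": 0.5},   # attention skipped
    {"layer": 2, "attention": True,  "ffn_compression": 0.25},  # heavily compressed FFN
]

def relative_block_cost(cfg: dict, attn_cost: float = 1.0, ffn_cost: float = 2.0) -> float:
    """Rough relative compute for one block under this toy cost model."""
    cost = ffn_cost * cfg["ffn_compression"]
    if cfg["attention"]:
        cost += attn_cost
    return cost

total = sum(relative_block_cost(c) for c in block_configs)
```

The point is simply that, unlike a uniform transformer stack, each block can be cheaper or skipped entirely, which is how the memory and compute savings described above accumulate.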
Additional hardware compatibility includes support for Nvidia’s B100 and Hopper microarchitectures, with configurations validated in both BF16 and FP8 precision modes. Post-training for reasoning and alignment Nvidia enhanced the base model through a multi-phase post-training pipeline. This included supervised fine-tuning across domains such as math, code generation, chat, and tool use, followed by reinforcement learning with Group Relative Policy Optimization (GRPO) to further boost instruction-following and reasoning performance. The model underwent a knowledge distillation phase over 65 billion tokens, followed by continual pretraining on an additional 88 billion tokens. Training datasets included sources like FineWeb, Buzz-V1.2, and Dolma. Post-training prompts and responses were drawn from a combination of public corpora and synthetic generation methods, including datasets that taught the model to differentiate between its reasoning modes. Improved performance across numerous domains and benchmarks Evaluation results show notable gains when the model operates in reasoning-enabled mode. For instance, on the MATH500 benchmark, performance increased from 80.40% in standard mode to 97.00% with reasoning enabled. Similarly, results on the AIME25 benchmark rose from 16.67% to 72.50%, and LiveCodeBench scores more than doubled, jumping from 29.03% to 66.31%. Performance gains were also observed in tool-based tasks like BFCL V2 and function composition, as well as in general question answering (GPQA), where the model scored 76.01% in reasoning mode versus 56.60% without. These benchmarks were conducted with a maximum sequence length of 32,000 tokens, and each test was repeated up to 16 times to ensure accuracy. 
Compared to DeepSeek R1, a state-of-the-art MoE model with 671 billion total parameters, Llama-3.1-Nemotron-Ultra-253B shows competitive results despite having less than half the number of parameters — outperforming in tasks like GPQA (76.01 vs. 71.5), IFEval instruction following (89.45 vs. 83.3), and LiveCodeBench coding tasks (66.31 vs. 65.9). Meanwhile, DeepSeek R1 holds a clear advantage on certain math evaluations, particularly AIME25 (79.8 vs. 72.50), and slightly edges it out on MATH500 (97.3 vs. 97.00). These results suggest that despite being a dense model, Nvidia’s offering matches or exceeds MoE alternatives on reasoning and general instruction alignment tasks, while trailing slightly in math-heavy categories. Usage and integration The model is compatible with the Hugging Face Transformers library (version 4.48.3 recommended) and supports input and output sequences up to 128,000 tokens. Developers can control reasoning behavior via system prompts and select decoding strategies based on task requirements. For reasoning tasks, Nvidia recommends temperature sampling (0.6) with a top-p value of 0.95. For deterministic outputs, greedy decoding is preferred. Llama-3.1-Nemotron-Ultra-253B supports multilingual applications, with capabilities in English and several additional languages, including German, French, Italian, Portuguese, Hindi, Spanish, and Thai. It is also suitable for common LLM use cases such as chatbot development, AI agent workflows, retrieval-augmented generation (RAG), and code generation. Licensed for commercial use Released under the Nvidia Open Model License and governed by the Llama 3.1 Community License Agreement, the model is ready for commercial use. Nvidia has emphasized the importance of responsible AI development, encouraging teams to evaluate the model’s alignment, safety, and bias profiles for their specific use cases. 
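The usage guidance above can be sketched as a small helper that assembles a chat request with the reasoning toggle and the recommended decoding settings. The exact system-prompt wording ("detailed thinking on"/"off") follows Nvidia's published guidance for the Nemotron family as I understand it, but treat it and the field names here as assumptions to verify against the model card, not a confirmed API contract.

```python
# Sketch of toggling Nemotron's "reasoning on/off" modes via the system prompt,
# with the sampling settings the article reports (temperature 0.6, top-p 0.95
# for reasoning; greedy decoding otherwise). Prompt wording is an assumption
# taken from Nvidia's model-card guidance; confirm before relying on it.

def build_request(user_prompt: str, reasoning: bool) -> dict:
    """Assemble chat messages plus decoding settings for one generation call."""
    messages = [
        # The model selects its mode from the system prompt.
        {"role": "system",
         "content": "detailed thinking on" if reasoning else "detailed thinking off"},
        {"role": "user", "content": user_prompt},
    ]
    if reasoning:
        # Recommended for reasoning tasks: temperature sampling at 0.6, top-p 0.95.
        decoding = {"do_sample": True, "temperature": 0.6, "top_p": 0.95}
    else:
        # Greedy decoding is preferred for deterministic outputs.
        decoding = {"do_sample": False}
    return {"messages": messages, **decoding}

request = build_request("Prove that the sum of two even numbers is even.", reasoning=True)
```

A dict like this maps directly onto a Transformers chat-template call: the messages feed the tokenizer's chat template, and the decoding keys become `generate()` keyword arguments.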
Oleksii Kuchaiev, Director of AI Model Post-Training at Nvidia, shared the announcement on X, stating that the team was excited to share the open release, describing it as a dense 253B model designed with toggle ON/OFF reasoning capabilities and released with open weights and data. source

Nvidia’s new Llama-3.1 Nemotron Ultra outperforms DeepSeek R1 at half the size Read More »

5 Best Accounts Receivable Software of 2024

When choosing the best accounts receivable software, I look for features that automate invoicing and payment reminders. I also look for clear reporting tools for tracking outstanding balances and the ability to monitor cash flow in real time. The software should also easily integrate with your existing accounting system. I’ve put together this buyer’s guide to help you quickly understand your options and choose the best accounts receivable software for your unique needs. Here’s a quick overview of the top vendors I’ll compare:
- Best for businesses with complex billing structures: Sage Intacct
- Best for integrating with existing accounting systems: BILL
- Best A/R software in an all-in-one platform: Intuit QuickBooks
- Best for integrated time tracking: FreshBooks
- Best for automation and predictive analytics: HighRadius
Why you can trust TechRepublic TechRepublic delivers thorough, expert-driven reviews, crafted by professionals with deep expertise in their respective domains. Our team includes experienced specialists and industry advisors with hands-on knowledge of the products they assess. Each piece is grounded in practical experience, powered by a strong grasp of the real-world business needs.
Quick comparison of the best accounts receivable software (monthly pricing does not include seasonal discounts):
- Sage Intacct: Custom pricing; mobile app: No; multi-currency support: Yes; customizable invoice templates: Yes; cash application automation (auto-matching of invoices with payments): Yes
- BILL: From $45 per month; mobile app: Yes; multi-currency support: Yes; customizable invoice templates: Yes; cash application automation: Yes
- Intuit QuickBooks: Solopreneur accounting package from $20 per month; mobile app: Yes; multi-currency support: Yes; customizable invoice templates: Yes; cash application automation: Limited
- FreshBooks: Lite package from $21 per month; mobile app: Yes; multi-currency support: Yes; customizable invoice templates: Yes; cash application automation: Limited
- HighRadius: Custom pricing; mobile app: Yes; multi-currency support: Yes; customizable invoice templates: Yes; cash application automation: Yes
Sage Intacct: Best for businesses with complex billing structures Sage Intacct helps business owners automate and manage invoicing and collections with accuracy and control. 
This product provides real-time access to customer balances, connects smoothly with the general ledger, and offers customizable workflows for revenue tracking. As a cloud-based solution, it scales easily with business growth and includes detailed reporting to help improve operations and cash flow. Sage Intacct is especially strong for businesses with complex billing needs, such as subscription models, tiered pricing, or usage-based charges. It lets users automate advanced billing processes, which cuts down on manual work and mistakes. My favorite feature of this software is the ability to create invoices that combine charges from different contracts or entities, which is ideal for companies with multiple locations or business units.
Pricing Sage Intacct does not publicize general pricing information. We recommend contacting their sales team for a custom quote.
Standout features
- Automated invoicing and collections: Streamlined accounts receivable processes through automated invoicing and collection
- Recurring invoice generation: Efficient management of subscription-based services through recurring invoices
- Flexible payment options: Offers customers various payment methods, including credit cards, checks, and ACH transfers
- Real-time reporting and dashboards: Comprehensive reporting options for customer aging, invoice analyses, and deferred revenue
- Seamless CRM integration: Capacity for integration with existing customer relationship management (CRM) systems for a consolidated view of quotes, sales orders, and invoices
- Enhanced internal controls: Ability to define and implement automated internal control processes for accounts receivable workflows
Pros
- Multiple customization options
- Works well with CRM
- Well-organized interface
- Scalable for multi-entity and multi-location businesses
Cons
- Steep learning curve for new users
- Higher cost compared to other small business solutions
- Difficult to customize without customer support
BILL: Best for integrating with existing accounting systems BILL is built for businesses that want to automate invoice creation, streamline customer payments, and improve cash flow visibility. Its user-friendly interface and powerful automation tools set BILL apart from the competition. I like that it supports digital invoicing, automatic payment reminders, and online payment options, which make it easy to manage receivables from anywhere. BILL is especially effective for businesses that rely on syncing with existing general ledger accounting systems. Its two-way integrations ensure that invoices, payments, and customer interactions are reflected in real time, eliminating duplicate data entry and reducing reconciliation errors.
Pricing
- Essentials Plan: $45 per user per month
- Team Plan: $55 per user per month
- Corporate Plan: $79 per user per month
- Enterprise Plan: Custom pricing; contact bill.com for details
Standout features
- Customer portal access: Options for a dedicated customer bill payment portal
- Automated payment matching: Incoming payments are automatically matched to outstanding invoices
- Invoice status tracking: Option to monitor invoices in real time and track whether they have been sent, viewed, and paid
- Automated late fee application: Ability to implement automatic late fee charges on overdue invoices
Pros
- Supports multiple approval levels for enhanced control over financial processes
- Provides instant updates and notifications for business managers, bankers, and accountants
- System is generally easy to navigate
Cons
- Customer support response times can be lengthy
- Intermittent technical issues, leading to operational disruptions
- Some features on the upgraded platform are not intuitive
Intuit QuickBooks: Best A/R software in an all-in-one platform Intuit QuickBooks stands out for its ability to 
automate invoicing, track payments in real time, and sync outside financial data within a single dashboard. My favorite part of this A/R platform is its deep integration with QuickBooks’ broader accounting suite, providing cohesive cash flow management without additional tools. Businesses benefit from customizable invoice templates, built-in payment processing, and intelligent reminders that reduce manual follow-ups. QuickBooks is the best choice for businesses seeking an all-in-one accounts receivable solution because it combines A/R tools with bookkeeping, reporting, tax prep, and payroll in one cohesive system. This level of integration keeps financial data aligned, reducing errors and saving time.
Pricing
- QuickBooks Solopreneur: $20 per month; designed for self-employed individuals
- QuickBooks Simple Start: $35 per month; ideal for new, single-member businesses
- QuickBooks Online Essentials: $65 per month; most suitable for small businesses with multiple members
- QuickBooks Online Plus: $99 per month; geared toward growing businesses
- QuickBooks Online Advanced: $235 per month; designed for larger businesses with complex needs
Standout features
- Automated payment reminders: QuickBooks Online automates key tasks like invoice creation and payment tracking
- Detailed accounts receivable aging reports: QuickBooks Online lets you quickly identify overdue accounts and generate detailed reports
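Several of the vendors above advertise cash application automation, i.e. matching incoming payments to outstanding invoices. As a rough illustration of the idea (not any vendor's actual algorithm), a matcher can try an explicit invoice reference first and fall back to a unique exact-amount match; the field names and sample data below are hypothetical.

```python
# Illustrative payment-to-invoice matcher: reference match first, then a
# unique exact-amount fallback; anything ambiguous is flagged for review.
# Field names and records are hypothetical, not from any vendor's API.

def match_payments(invoices: list[dict], payments: list[dict]) -> dict:
    """Map payment ids to invoice ids; unmatched payments map to None."""
    open_invoices = {inv["id"]: inv for inv in invoices}
    matches = {}
    for pay in payments:
        ref = pay.get("invoice_ref")
        if ref in open_invoices:
            matches[pay["id"]] = ref
            del open_invoices[ref]
            continue
        # Fall back to a unique exact-amount match among still-open invoices.
        candidates = [i for i in open_invoices.values() if i["amount"] == pay["amount"]]
        if len(candidates) == 1:
            matches[pay["id"]] = candidates[0]["id"]
            del open_invoices[candidates[0]["id"]]
        else:
            matches[pay["id"]] = None  # ambiguous or no match: needs human review
    return matches

invoices = [{"id": "INV-1", "amount": 100}, {"id": "INV-2", "amount": 250}]
payments = [
    {"id": "P1", "invoice_ref": "INV-2", "amount": 250},
    {"id": "P2", "amount": 100},  # no reference; matched by unique amount
]
matches = match_payments(invoices, payments)
```

Commercial tools layer fuzzier signals on top (partial payments, remittance-advice parsing, customer history), but the reference-then-amount cascade is the core shape of the feature.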

5 Best Accounts Receivable Software of 2024 Read More »