Page 601 – Starthub Asia

Nvidia’s new Llama-3.1 Nemotron Ultra outperforms DeepSeek R1 at half the size

Leave a Comment / Top Tech Update / VentureBeat

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Even as Meta fends off questions and criticisms of its new Llama 4 model family, graphics processing unit (GPU) master Nvidia has released a new, fully open source large language model (LLM) based on Meta’s older model Llama-3.1-405B-Instruct model and it’s claiming near top performance on a variety of third-party benchmarks — outperforming the vaunted rival DeepSeek R1 open source reasoning model. Llama-3.1-Nemotron-Ultra-253B-v1, is a dense 253-billion parameter designed to support advanced reasoning, instruction following, and AI assistant workflows. It was first mentioned back at Nvidia’s annual GPU Technology Conference (GTC) in March. The release reflects Nvidia continued focus on performance optimization through architectural innovation and targeted post-training. Announced last night, April 7, 2025, the model code is now publicly available on Hugging Face, with open weights and post-training data. It is designed to operate efficiently in both “reasoning on” and “reasoning off” modes, allowing developers to toggle between high-complexity reasoning tasks and more straightforward outputs based on system prompts. Designed for efficient inference The Llama-3.1-Nemotron-Ultra-253B builds on Nvidia’s previous work in inference-optimized LLM development. Its architecture—customized through a Neural Architecture Search (NAS) process—introduces structural variations such as skipped attention layers, fused feedforward networks (FFNs), and variable FFN compression ratios. This architectural overhaul reduces memory footprint and computational demands without severely impacting output quality, enabling deployment on a single 8x H100 GPU node. The result, according to Nvidia, is a model that offers strong performance while being more cost-effective to deploy in data center environments. Additional hardware compatibility includes support for Nvidia’s B100 and Hopper microarchitectures, with configurations validated in both BF16 and FP8 precision modes. Post-training for reasoning and alignment Nvidia enhanced the base model through a multi-phase post-training pipeline. This included supervised fine-tuning across domains such as math, code generation, chat, and tool use, followed by reinforcement learning with Group Relative Policy Optimization (GRPO) to further boost instruction-following and reasoning performance. The model underwent a knowledge distillation phase over 65 billion tokens, followed by continual pretraining on an additional 88 billion tokens. Training datasets included sources like FineWeb, Buzz-V1.2, and Dolma. Post-training prompts and responses were drawn from a combination of public corpora and synthetic generation methods, including datasets that taught the model to differentiate between its reasoning modes. Improved performance across numerous domains and benchmarks Evaluation results show notable gains when the model operates in reasoning-enabled mode. For instance, on the MATH500 benchmark, performance increased from 80.40% in standard mode to 97.00% with reasoning enabled. Similarly, results on the AIME25 benchmark rose from 16.67% to 72.50%, and LiveCodeBench scores more than doubled, jumping from 29.03% to 66.31%. Performance gains were also observed in tool-based tasks like BFCL V2 and function composition, as well as in general question answering (GPQA), where the model scored 76.01% in reasoning mode versus 56.60% without. These benchmarks were conducted with a maximum sequence length of 32,000 tokens, and each test was repeated up to 16 times to ensure accuracy. Compared to DeepSeek R1, a state-of-the-art MoE model with 671 billion total parameters, Llama-3.1-Nemotron-Ultra-253B shows competitive results despite having less than half the number of parameters (model settings) — outperforming in tasks like GPQA (76.01 vs. 71.5), IFEval instruction following (89.45 vs. 83.3), and LiveCodeBench coding tasks (66.31 vs. 65.9). Meanwhile, DeepSeek R1 holds a clear advantage on certain math evaluations, particularly AIME25 (79.8 vs. 72.50), and slightly edges out MATH500 (97.3 vs. 97.00). These results suggest that despite being a dense model, Nvidia’s offering matches or exceeds MoE alternatives on reasoning and general instruction alignment tasks, while trailing slightly in math-heavy categories. Usage and integration The model is compatible with the Hugging Face Transformers library (version 4.48.3 recommended) and supports input and output sequences up to 128,000 tokens. Developers can control reasoning behavior via system prompts and select decoding strategies based on task requirements. For reasoning tasks, Nvidia recommends using temperature sampling (0.6) with a top-p value of 0.95. For deterministic outputs, greedy decoding is preferred. Llama-3.1-Nemotron-Ultra-253B supports multilingual applications, with capabilities in English and several additional languages, including German, French, Italian, Portuguese, Hindi, Spanish, and Thai. It is also suitable for common LLM use cases such as chatbot development, AI agent workflows, retrieval-augmented generation (RAG), and code generation. Licensed for commercial use Released under the Nvidia Open Model License and governed by the Llama 3.1 Community License Agreement, the model is ready for commercial use. Nvidia has emphasized the importance of responsible AI development, encouraging teams to evaluate the model’s alignment, safety, and bias profiles for their specific use cases. Oleksii Kuchaiev, Director of AI Model Post-Training at Nvidia, shared the announcement on X, stating that the team was excited to share the open release, describing it as a dense 253B model designed with toggle ON/OFF reasoning capabilities and released with open weights and data. source

Nvidia’s new Llama-3.1 Nemotron Ultra outperforms DeepSeek R1 at half the size Read More »

5 Best Accounts Receivable Software of 2024

Leave a Comment / Top Tech Update / Tech Republic

When choosing the best accounts receivable software, I look for features that automate invoicing and payment reminders. I also look for clear reporting tools for tracking outstanding balances and the ability to monitor cash flow in real time. The software should also easily integrate with your existing accounting system. I’ve put together this buyer’s guide to help you quickly understand your options and choose the best accounts receivable software for your unique needs. Here’s a quick overview of the top vendors I’ll compare: Best for businesses with complex billing structures: Sage Intacct Best for integrating with existing accounting systems: BILL Best A/R software in an all-in-one platform: Intuit QuickBooks Best for integrated time tracking: FreshBooks Best for automation and predictive analytics: High Radius Why you can trust TechRepublic TechRepublic delivers thorough, expert-driven reviews, crafted by professionals with deep expertise in their respective domains. Our team includes experienced specialists and industry advisors with hands-on knowledge of the products they assess. Each piece is grounded in practical experience, powered by a strong grasp of the real-world business needs. Quick comparison of the best accounts receivable software Monthly pricing Does not include seasonal discounts Mobile app Multi-currency support Customizable invoice templates Cash application automation Auto-matching of invoices with payments Sage Intacct Custom No Yes Yes Yes BILL From $45 Yes Yes Yes Yes Intuit QuickBooks Solopreneur accounting package from $20 Yes Yes Yes Limited FreshBooks Lite package from $21 Yes Yes Yes Limited High Radius Custom Yes Yes Yes Yes Sage Intacct: Best for businesses with complex billing structures Image: Sage Intacct Sage Intacct helps business owners automate and manage invoicing and collections with accuracy and control. This product provides real-time access to customer balances, connects smoothly with the general ledger, and offers customizable workflows for revenue tracking. As a cloud-based solution, it scales easily with business growth and includes detailed reporting to help improve operations and cash flow. Sage Intacct is especially strong for businesses with complex billing needs, such as subscription models, tiered pricing, or usage-based charges. It lets users automate advanced billing processes, which cuts down on manual work and mistakes. My favorite feature about this software is the ability to create invoices that combine charges from different contracts or entities. This is ideal for companies with multiple locations or business units. Pricing Sage Intacct does not publicize general pricing information. We recommend contacting their sales team for a custom quote. Standout features Automated invoicing and collections: Streamlined accounts receivable processes through automated invoicing and collection Recurring invoice generation: Efficient management of subscription-based services through recurring invoices Flexible payment options: Offers customers various payment methods, including credit cards, checks, and ACH transfers Real-time reporting and dashboards: Comprehensive reporting options for customer aging, invoice analyses, and deferred revenue Seamless integration with CRM Customer Relationship Management systems: Capacity for integration with existing CRM for a consolidated view of quotes, sales orders, and invoices Enhanced internal controls: Ability to define and implement automated internal control processes for accounts receivable workflows. Pros and cons Pros Cons Multiple customization options Works well with CRM Well-organized interface Scalable for multi-entity and multi-location businesses Steep learning curve for new users. Higher cost compared to other small business solutions Difficult to customize without customer support BILL: Best for integrating with existing accounting systems Image: BILL BILL is built for businesses that want to automate invoice creation, streamline customer payments, and improve cash flow visibility. Its user-friendly interface and powerful automation tools set BILL apart from the competition. I like that it supports digital invoicing, automatic payment reminders, and online payment options, which make it easy to manage receivables from anywhere. BILL is especially effective for businesses that rely on syncing with existing general ledger accounting systems. Its two-way integrations ensure that invoices, payments, and customer interaction is reflected in real time, eliminating duplicate data entry and reducing reconciliation errors. Pricing Essentials Plan: $45 per user per month Team Plan: $55 per user per month Corporate Plan: $79 per user per month Enterprise Plan: Custom pricing; contact bill.com for details Standout features Customer portal access: Options for a dedicated customer bill payment portal Automated payment matching: Incoming payments are automatically matched to outstanding invoices Invoice status tracking: Option to monitor the status of your invoices in real-time and track the status of invoices sent, viewed, and paid Automated late fee application: Ability to implement automatic late fee charges on overdue invoices Pros and cons Pros Cons Supports multiple approval levels for enhanced control over financial processes Provides instant updates and notifications for business managers, bankers, and accountants System is generally easy to navigate Customer support response times can be lengthy Intermittent technical issues, leading to operational disruptions Some features on upgraded platform are not intuitive Intuit QuickBooks: Best A/R software in an all-in-one platform Image: QuickBooks Intuit QuickBooks stands out for its ability to automate invoicing, track payments in real time, and sync outside financial data within a single dashboard. My favorite part of this A/R platform is its deep integration with QuickBooks’ broader accounting suite, providing for cohesive cash flow management without additional tools. Businesses benefit from customizable invoice templates, built-in payment processing, and intelligent reminders that reduce manual follow-ups. QuickBooks is the best choice for businesses seeking an all-in-one accounts receivable solution because it combines A/R tools with bookkeeping, reporting, tax prep, and payroll in one cohesive system. This level of integration keeps financial data aligned, reducing errors and saving time. Pricing QuickBooks Solopreneur: $20 per monthThis plan is designed for self-employed individuals. QuickBooks Simple Start: $35 per monthThis plan is ideal for new, single-member businesses. QuickBooks Online Essentials: $65 per monthThis plan is most suitable for small businesses with multiple members QuickBooks Online Plus: $99 per monthGeared towards growing businesses QuickBooks Online Advanced: $235 per monthDesigned for larger businesses with complex needs Standout features Automated payment reminders: QuickBooks Online automates key tasks like invoice creation and payment tracking Detailed accounts receivable aging reports: QuickBooks Online lets you quickly identify overdue accounts and generate detailed reports

5 Best Accounts Receivable Software of 2024 Read More »

Gemini 2.5 Pro is now available without limits and for cheaper than Claude, GPT-4o

Leave a Comment / Top Tech Update / VentureBeat

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Google’s Gemini 2.5 Pro, which the company calls its most intelligent model ever, quietly took the developer world by storm. After seeing strong developer interest, Google announced it would increase rate limits for Gemini 2.5 Pro and offer the model at a lower price than many of its competitors. The company did not release pricing at launch. “We’ve seen incredible developer enthusiasm and early adoption of Gemini 2.5 Pro, and we’ve been listening to your feedback,” Google said in a blog post today. “To make this powerful model available to more developers, we’re moving Gemini 2.5 Pro into public preview in the Gemini API in Google AI Studio today, with Vertex AI rolling out shortly.” Gemini 2.5 Pro is the first experimental Google model to feature higher rate limits and billing. Google said developers using Gemini 2.5 Pro on public preview, priced at $1.24 per one million tokens, will see increased rate limits. The experimental version of the model will remain free but have lower rate limits. Heading off competitors Gemini 2.5 Pro’s pricing is competitive and significantly lower than competitors like Anthropic and OpenAI. As previously mentioned, Gemini 2.5 Pro is $1.25 per million input tokens and $10 per million output tokens. Social media users expressed surprise that Google could pull off pricing such a powerful model for so low a price, noting that it’s “about to get wild.” Anthropic offers Claude 3.7 Sonnet, a comparable model to Gemini 2.5 Pro, at $3 per million input tokens and $15 for output tokens. On its site, Anthropic says that Claude 3.7 Sonnet users can save up to 90% of the cost if they use prompt caching. OpenAI’s o1 reasoning model costs $15 per million input rockets and $60 per million output tokens. However, cached inputs cost $7.50. Its other reasoning model, o3-mini, is cheaper at $1.10 per million input tokens and $4.40 per million output tokens, but o3-mini is a smaller reasoning model. For non-reasoning models, OpenAI priced GPT-4o at $2.50 for inputs and $10 for outputs. Gemini 2.5 Pro demand Google released Gemini 2.5 Pro somewhat quietly, adding the experimental version of the model to the Gemini Advanced. Since its launch a few weeks ago, several developers and users have found it compelling. VentureBeat’s Ben Dickson played with Gemini 2.5 Pro and declared it may be the “most useful reasoning model yet.” Effectively pricing reasoning models is the next big battleground for AI model developers. DeepSeek’s low cost for DeepSeek R1 caused a ruckus among enterprises. DeepSeek continues to put out models at a lower price than most of the more prominent model developers, putting even more pressure on Google, OpenAI and Anthropic to offer robust and extremely capable models at affordable prices. source

Gemini 2.5 Pro is now available without limits and for cheaper than Claude, GPT-4o Read More »

1. Artificial intelligence in daily life: Views and experiences

Leave a Comment / Top Tech Update / Pew Research Center

Artificial intelligence is quickly becoming more and more part of everyday life. This chapter explores how the public and experts compare in their experiences and views around the use of AI (such as chatbots) and their control over AI’s role in their lives. Interacting with AI Americans encounter AI in various ways, from social media to health care to financial services. But AI experts believe the public engages with AI more than they report. AI experts were asked how often they think people in the United States interact with AI. A vast majority (79%) say people in the U.S. interact with AI almost constantly or several times a day. A much smaller share of U.S. adults (27%) think they interact with AI at this rate. Three-in-ten say they do so about once a day or several times a week, and 43% report doing so less often. Use and views of chatbots It’s been over two years since ChatGPT was released, and other chatbots came soon after. Since then, Americans have been increasingly using them for work or entertainment. To that end, we asked AI experts and the general public about their use of these tools. Using chatbots is nearly universal among experts, but that’s not the case for the general public. One-third of U.S. adults say they have ever used an AI chatbot, compared with nearly all AI experts surveyed (98%). That said, most Americans (72%) have at least heard of chatbots, including 28% who’ve heard a lot. The public’s experiences with chatbots have not been as positive as those of experts. About six-in-ten AI experts who have used a chatbot (61%) say it was extremely or very helpful to them. Smaller shares of users in the general public (33%) say this. Fewer in both groups report that chatbots have been not too or not at all helpful. Still, U.S. adults who’ve used chatbots are more likely than experts surveyed to say these tools have been not too or not at all helpful (21% vs. 9%). Do people think they have control over AI in their lives? Debates have continued around the difficulty or inability to opt out of AI. On balance, both the American public and the AI experts we surveyed want more control over this technology. When asked about control over AI use in their lives, almost half or more in both groups say they have little or no control, with this sentiment being somewhat more prevalent among U.S. adults (59%) than AI experts surveyed (46%). Smaller shares of both groups think they have control over whether AI is used in their lives: 14% of the general public and 23% of AI experts say they have a great deal or quite a bit of control. What’s more, both U.S. adults and AI experts most commonly say they want more control over how AI is used in their lives. More than half of both AI experts and U.S. adults (57% and 55%) say they would like more control over how AI is used in their own lives. Fewer in both groups are comfortable with the amount of control they have, though experts are more likely to say this (38% vs. 19%). Uncertainty is more common among the general public. U.S. adults are far more likely than AI experts to say they are unsure how much control they want over AI (26% vs. 4%). By gender, among AI experts surveyed Among experts, women are more likely than men to say that they would like more control over AI (67% vs. 54%). By job sector, among AI experts surveyed Experts who work at colleges or universities are more likely than those who work in private companies to say they want more control over AI (61% vs. 50%). Roughly equal portions of both say they have not too much or no control in how AI is used in their lives (47% and 46%, respectively). source

1. Artificial intelligence in daily life: Views and experiences Read More »

VMware/Siemens: A Cautionary Tale About The Risks Of Software And Services Licensing

Leave a Comment / Top Tech Update / Forrester

Litigation has become the default method for companies to resolve disagreements, force accountability, and establish recourse for everything from breach-related failures to contractual disagreements. A recent lawsuit filed by VMware (now owned by Broadcom) against its customer, Siemens’ US operations, for alleged use of unlicensed software is not unique and should serve as a stark reminder that poorly governed software licenses and assets come with a risk to both sides and will impact the technologies we depend on. The Siemens-Broadcom Saga: He Said/She Said Broadcom is accusing Siemens of using multiple VMware products without proper licenses. This “aha!” discovery that thousands of software licenses were illegally downloaded was only brought to VMware’s attention, however, after Siemens provided a list of installed software that it insisted was “eligible for the one-year extension of Support Services,” even though some of those installs could not be associated with an active software license. Siemens had threatened legal action if it did not receive those extensions, and VMware countered with the observation of the license violations. Both sides hold responsibility for guarding legal license use, so it’s an oopsie on both sides. The result is a legal battle certain to cost both companies millions in attorney fees and litigation costs, along with a legal discovery process that could unearth more licensing violations — not to mention potentially compromise Siemens’ ability to get support services for the duration of the lawsuit. Pay Attention To The Details, As Mistakes Have Consequences “True-ups” are often negotiating tools for vendors. They can start with a request for a software audit but often then lead to finding unlicensed software that the business either needs to pay for or discontinue use of. The intersection of infrastructure software, virtualization, and massive operational scale can mean large areas of unaccounted expense from true-ups where a business has no choice but to pay or disrupt the business. For example: IBM raked in millions from WebSphere licensing when businesses started virtualizing its WebSphere servers because the licensing was based on the software’s access to all the physical CPUs in the virtualized cluster. Until customers set up subcapacity licensing and the software agents to track it, they were on the hook for the additional licensing costs. Oracle customers have run into similar issues when running Oracle Database on HCI clusters due to Oracle’s licensing parameters. Efforts to get better utilization through virtualization while also avoiding these licensing issues have driven many organizations to adopt disaggregated HCI or even to create targeted smaller clusters for Oracle use. VMware’s licensing changes are affecting many, as the piecemeal licensing that businesses were used to is converted to a bundled platform license where they then incur the charge for platform components that they haven’t used in the past, often duplicating the functionality of existing infrastructure investments. These are just a few examples. Pick a large software vendor and you can find similar stories. Finding license violations is a common tactic for vendors to identify what they see as unrealized income and can mean hundreds of thousands to millions of dollars in license costs for an enterprise customer. License changes, product bundling changes, and major infrastructure paradigm shifts can introduce a mismatch between what someone has paid for and what they should have paid for. Additionally, automated deployment, especially if the software is a key component of your tech stack, can lead to overuse at scale and create a big licensing risk for your company. Accurate tracking is a must to manage that risk, but be careful with vendor-supplied license management tools. Those tools can be a way for a vendor to see the license overuse before you do. Assume that your license use is part of a negotiation; treat it that way, and manage that negotiating resource appropriately. Lessons For Software Vendors And Their Consumers As your ecosystem of software and services becomes larger and more complex, it’s time to revisit the basics of how you can prevent disruption to business operations and avoid the negative optics of a similar situation at your company. Focus on effective vendor management and licensing best practices. To do this, consumers of software must: Conduct regular license audits. Regularly review and audit software licenses to ensure compliance and avoid unlicensed usage. Audits should not be your crutch, however. For automated deployments, use valid license checks before deploying rather than just auditing the environment after the fact. Even better, create deployed license thresholds so that when you are close to reaching the limits of what you have already purchased, an alert can be sent to procurement or a tech leader to address the situation before it slows down your operations. Use tech to manage software licenses. It’s your responsibility to know how many software licenses are deployed in your environment. Implement tooling to track and manage your software licenses efficiently, check that the numbers match up with what you have contracted and paid for, and educate employees about the importance of software licensing and compliance to prevent inadvertent violations. In addition to the idea of adding license checks to deployment automation, you can also automate new license provisioning and hopefully retirement if your vendor provides a mechanism for it. Rethink procurement and contracting processes. Software is constantly changing, and your procurement practices need to keep up with new trends in bundling and packaging. Develop and enforce clear policies for software procurement, encourage procurement to ask hard questions around inadvertent violations, and ensure that contract language protects your company’s position if noncompliance is unintentional. Software vendors must: Set thresholds for noncompliance. Not all software licensing violations are by an egregious amount or a result of flagrant disregard of the contractual agreement. Understand what leeway you’re willing to provide and make it clear in the contract that overage can’t exceed a certain percentage or number of licenses. Provide a time frame for violations to be resolved, such as a 30- or 60-day period after notice is given. Don’t ignore contract governance. Most companies spend their time and

VMware/Siemens: A Cautionary Tale About The Risks Of Software And Services Licensing Read More »

The Impact of April 2 Tariffs on IT Spending

Airbus to build lander for Europe’s first Mars rover after Russia dropped

Leave a Comment / Top Tech Update / The Next Web TNW

The European Space Agency’s (ESA) Rosalind Franklin rover is back on course for a landmark trip to Mars, where it will probe the red planet for signs of extraterrestrial life. ESA initially designed the Mars rover alongside Roscosmos, Russia’s space agency, as part of the ExoMars programme. The vehicle was set to launch in 2022, but when Russia invaded Ukraine, ESA severed ties with Moscow, putting the mission in jeopardy. Rosalind Franklin — named after the British chemist whose work was crucial to understanding the structure of DNA— was left without several key components, including a landing platform to safely touch down on the Martian surface. But now, ESA and Thales Alenia Space, the prime contractor for the ExoMars mission, have issued Airbus a £150mn contract to build a new lander at the company’s facility in Stevenage, UK. The British government will fund the lander via the UK Space Agency. “Getting the Rosalind Franklin rover onto the surface of Mars is a huge international challenge and the culmination of more than 20 years’ work,” said Kata Escott, managing director at Airbus Defence and Space UK, which also designed and built the rover. The ExoMars spacecraft is set to launch from the US in 2028. Arrival on Mars is expected by 2030. The 💜 of EU tech The latest rumblings from the EU tech scene, a story from our wise ol’ founder Boris, and some questionable AI art. It’s free, every week, in your inbox. Sign up now! If successful, it will be Europe’s first rover to be sent to Mars. The US space agency NASA already has two in operation— Perseverance and Opportunity — while China has one, called Zhurong. The trip to Mars UK Technology Secretary Peter Kyle next to a mockup of the ExoMars Rosalind Franklin rover at Airbus’s facility in Stevenage in the UK. Credit: DSIT As the spacecraft approaches Mars, the lander — carrying the rover — will separate and begin its rapid descent into the atmosphere. A combination of a heat shield, parachutes, and braking rockets will slow down the lander just before touchdown. Once on the surface, the lander will deploy ramps, allowing the rover to drive off and begin its exploration. Rosalind Franklin’s instruments will look for evidence of past and present Martian life. The rover includes a drill designed to probe as deep as two metres into the surface, acquiring samples shielded from radiation on the surface. It’s designed to operate for at least seven months. Since its fallout with Russia, ESA has secured new agreements for various components of the ExoMars spacecraft, including a contract with NASA to supply adjustable braking engines for the landing platform and radioisotope heating units (RHUs). These RHUs use radioactive decay to generate heat, preventing the rover from freezing in the frigid Martian environment. source

Airbus to build lander for Europe’s first Mars rover after Russia dropped Read More »

AGs Sue To Halt Disruptions To NIH Grant Funding

Leave a Comment / Top Tech Update / Law 360

By Julie Manganis ( April 4, 2025, 11:58 AM EDT) — A coalition of 16 states on Friday sued the National Institutes of Health over delays and cancellations of grant programs linked to vaccines, transgender issues and other areas they say are currently “disfavored” by the Trump administration…. Law360 is on it, so you are, too. A Law360 subscription puts you at the center of fast-moving legal issues, trends and developments so you can act with speed and confidence. Over 200 articles are published daily across more than 60 topics, industries, practice areas and jurisdictions. A Law360 subscription includes features such as Daily newsletters Expert analysis Mobile app Advanced search Judge information Real-time alerts 450K+ searchable archived articles And more! Experience Law360 today with a free 7-day trial. source

AGs Sue To Halt Disruptions To NIH Grant Funding Read More »

Confidence In Marketing Measurement Is Increasing, But The Job Is Getting Bigger

Leave a Comment / Top Tech Update / Forrester

One of the most interesting aspects of my role as a Forrester analyst is hearing marketers ask questions about how others in their position or industry are approaching measurement. A common fear I hear is that “everyone else” in a client’s competitive set has figured things out and the client brand is being left behind. To help assuage these fears, we recently analyzed data from Forrester’s Marketing Survey, 2024, to uncover the state of B2C marketing measurement. While marketing measurement is still a work in progress for most companies, marketer confidence in their ability to measure marketing’s business value accurately and consistently is high. Fewer than 5% of marketers say that they have not been able to prove the long-term impact of marketing. But between data deprecation, fragmentation of channels, and increasing consumer complexity, marketing analytics and measurement isn’t getting any easier. Here are three takeaways from our analysis: Marketers manage a broad set of metrics. Revenue growth remains the top metric used by marketers to gauge both the business impact of marketing and the performance of individual marketing initiatives, but they are also being asked to track customer outcomes (i.e., satisfaction, loyalty, retention, profitability) and increase brand value. Twenty-nine percent of marketers say they routinely use brand value to measure and attribute the incremental business value of marketing, up from 19% in 2023. Tools and resources are major drivers of measurement confidence. The marketers who are most confident in their ability to measure marketing’s incremental business value are also the most confident in the ability of their tools, teams, and data to meet their needs for timely insights. This portends a potential future split between the haves and have-nots, where ability to accurately measure is dependent on investing today in measurement technology, data, and teams. Brands that are not yet investing in creating a measurement-informed culture will only find it more difficult to catch up going forward. Data issues remain the top marketing measurement challenge. Data challenges continue to make marketers’ measurement jobs tougher. Too many unconnected data sources and inconsistent levels of quality among data sources hold them back from making use of measurement and analytics, and B2C marketers continue to lose trust in third-party data, which impacts their ability to measure granular audience segments. Sixty-eight percent of marketers are reevaluating their third-party data partnerships. For more detailed insights into how B2C marketers are thinking about measurement, read our recent report, The State Of B2C Marketing Measurement, 2024. In the coming months, I’ll also be publishing reports on data requirements for marketing measurement, how to build strong measurement teams, best practices for extracting value from your marketing mix model, and how generative AI is impacting the marketing measurement landscape. If you would like to discuss your own approach to marketing measurement and how to prepare for the future, schedule a guidance session here. source

Confidence In Marketing Measurement Is Increasing, But The Job Is Getting Bigger Read More »

Meta defends Llama 4 release against ‘reports of mixed quality,’ blames bugs

Leave a Comment / Top Tech Update / VentureBeat

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Meta’s new flagship AI language model Llama 4 came suddenly over the weekend, with the parent company of Facebook, Instagram, WhatsApp and Quest VR (among other services and products) revealing not one, not two, but three versions — all upgraded to be more powerful and performant using the popular “Mixture-of-Experts” architecture and a new training method involving fixed hyperparameters, known as MetaP. Also, all three are equipped with massive context windows — the amount of information that an AI language model can handle in one input/output exchange with a user or tool. But following the surprise announcement and public release of two of those models for download and usage — the lower-parameter Llama 4 Scout and mid-tier Llama 4 Maverick — on Saturday, the response from the AI community on social media has been less than adoring. Llama 4 sparks confusion and criticism among AI users An unverified post on the North American Chinese language community forum 1point3acres made its way over to the r/LocalLlama subreddit on Reddit alleging to be from a researcher at Meta’s GenAI organization who claimed that the model performed poorly on third-party benchmarks internally and that company leadership “suggested blending test sets from various benchmarks during the post-training process, aiming to meet the targets across various metrics and produce a ‘presentable’ result.” The post was met with skepticism from the community in its authenticity, and a VentureBeat email to a Meta spokesperson has not yet received a reply. But other users found reasons to doubt the benchmarks regardless. “At this point, I highly suspect Meta bungled up something in the released weights … if not, they should lay off everyone who worked on this and then use money to acquire Nous,” commented @cto_junior on X, in reference to an independent user test showing Llama 4 Maverick’s poor performance (16%) on a benchmark known as aider polyglot, which runs a model through 225 coding tasks. That’s well below the performance of comparably sized, older models such as DeepSeek V3 and Claude 3.7 Sonnet. Referencing the 10 million-token context window Meta boasted for Llama 4 Scout, AI PhD and author Andriy Burkov wrote on X in part that: “The declared 10M context is virtual because no model was trained on prompts longer than 256k tokens. This means that if you send more than 256k tokens to it, you will get low-quality output most of the time.” Also on the r/LocalLlama subreddit, user Dr_Karminski wrote that “I’m incredibly disappointed with Llama-4,” and demonstrated its poor performance compared to DeepSeek’s non-reasoning V3 model on coding tasks such as simulating balls bouncing around a heptagon. Former Meta researcher and current AI2 (Allen Institute for Artificial Intelligence) Senior Research Scientist Nathan Lambert took to his Interconnects Substack blog on Monday to point out that a benchmark comparison posted by Meta to its own Llama download site of Llama 4 Maverick to other models, based on cost-to-performance on the third-party head-to-head comparison tool LMArena ELO aka Chatbot Arena, actually used a different version of Llama 4 Maverick than the company itself had made publicly available — one “optimized for conversationality.” As Lambert wrote: “Sneaky. The results below are fake, and it is a major slight to Meta’s community to not release the model they used to create their major marketing push. We’ve seen many open models that come around to maximize on ChatBotArena while destroying the model’s performance on important skills like math or code.” Lambert went on to note that while this particular model on the arena was “tanking the technical reputation of the release because its character is juvenile,” including lots of emojis and frivolous emotive dialog, “The actual model on other hosting providers is quite smart and has a reasonable tone!” In response to the torrent of criticism and accusations of benchmark cooking, Meta’s VP and Head of GenAI Ahmad Al-Dahle took to X to state: “We’re glad to start getting Llama 4 in all your hands. We’re already hearing lots of great results people are getting with these models. That said, we’re also hearing some reports of mixed quality across different services. Since we dropped the models as soon as they were ready, we expect it’ll take several days for all the public implementations to get dialed in. We’ll keep working through our bug fixes and onboarding partners. We’ve also heard claims that we trained on test sets — that’s simply not true and we would never do that. Our best understanding is that the variable quality people are seeing is due to needing to stabilize implementations. We believe the Llama 4 models are a significant advancement and we’re looking forward to working with the community to unlock their value.“ Yet even that response was met with many complaints of poor performance and calls for further information, such as more technical documentation outlining the Llama 4 models and their training processes, as well as additional questions about why this release compared to all prior Llama releases was particularly riddled with issues. It also comes on the heels of the number two at Meta’s VP of Research Joelle Pineau, who worked in the adjacent Meta Foundational Artificial Intelligence Research (FAIR) organization, announcing her departure from the company on LinkedIn last week with “nothing but admiration and deep gratitude for each of my managers.” Pineau, it should be noted also promoted the release of the Llama 4 model family this weekend. Llama 4 continues to spread to other inference providers with mixed results, but it’s safe to say the initial release of the model family has not been a slam dunk with the AI community. And the upcoming Meta LlamaCon on April 29, the first celebration and gathering for third-party developers of the model family, will likely have much fodder for discussion. We’ll be tracking it all, stay tuned. source

Meta defends Llama 4 release against ‘reports of mixed quality,’ blames bugs Read More »

Nvidia’s new Llama-3.1 Nemotron Ultra outperforms DeepSeek R1 at half the size

5 Best Accounts Receivable Software of 2024

Gemini 2.5 Pro is now available without limits and for cheaper than Claude, GPT-4o

1. Artificial intelligence in daily life: Views and experiences

VMware/Siemens: A Cautionary Tale About The Risks Of Software And Services Licensing

The Impact of April 2 Tariffs on IT Spending

Airbus to build lander for Europe’s first Mars rover after Russia dropped

AGs Sue To Halt Disruptions To NIH Grant Funding

Confidence In Marketing Measurement Is Increasing, But The Job Is Getting Bigger

Meta defends Llama 4 release against ‘reports of mixed quality,’ blames bugs

We provide a matching platform and membership services for startup groups in Asia

Useful Links

Become an Affiliate

Contact

News & Insight

Join the family!

Latest News

Spotify joins forces with major labels to develop ‘artist-first’ AI music tools

Dow Jones Futures Rise; Nvidia Falls On Earnings, Leads Big Movers