VentureBeat

The risks of AI-generated code are real — here’s how enterprises can manage the risk

Leave a Comment / Top Tech Update / VentureBeat

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Not that long ago, humans wrote almost all application code. But that’s no longer the case: The use of AI tools to write code has expanded dramatically. Some experts, such as Anthropic CEO Dario Amodei, expect that AI will write 90% of all code within the next 6 months. Against that backdrop, what is the impact for enterprises? Code development practices have traditionally involved various levels of control, oversight and governance to help ensure quality, compliance and security. With AI-developed code, do organizations have the same assurances? Even more importantly, perhaps, organizations must know which models generated their AI code. Understanding where code comes from is not a new challenge for enterprises. That’s where source code analysis (SCA) tools fit in. Historically, SCA tools have not provide insight into AI, but that’s now changing. Multiple vendors, including Sonar, Endor Labs and Sonatype are now providing different types of insights that can help enterprises with AI-developed code. “Every customer we talk to now is interested in how they should be responsibly using AI code generators,” Sonar CEO Tariq Shaukat told VentureBeat. Financial firm suffers one outage a week due to AI-developed code AI tools are not infallible. Many organizations learned that lesson early on when content development tools provided inaccurate results known as hallucinations. The same basic lesson applies to AI-developed code. As organizations move from experimental mode into production mode, they have increasingly come to the realization that code is very buggy. Shaukat noted that AI-developed code can also lead to security and reliability issues. The impact is real and it’s also not trivial. “I had a CTO, for example, of a financial services company about six months ago tell me that they were experiencing an outage a week because of AI generated code,” said Shaukat. When he asked his customer if he was doing code reviews, the answer was yes. That said, the developers didn’t feel anywhere near as accountable for the code, and were not spending as much time and rigor on it, as they had previously. The reasons code ends up being buggy, especially for large enterprises, can be variable. One particular common issue, though, is that enterprises often have large code bases that can have complex architectures that an AI tool might not know about. In Shaukat’s view, AI code generators don’t generally deal well with the complexity of larger and more sophisticated code bases. “Our largest customer analyzes over 2 billion lines of code,” said Shaukat. “You start dealing with those code bases, and they’re much more complex, they have a lot more tech debt and they have a lot of dependencies.” The challenges of AI developed code To Mitchell Johnson, chief product development officer at Sonatype, it is also very clear that AI-developed code is here to stay. Software developers must follow what he calls the engineering Hippocratic Oath. That is, to do no harm to the codebase. This means rigorously reviewing, understanding and validating every line of AI-generated code before committing it — just as developers would do with manually written or open-source code. “AI is a powerful tool, but it does not replace human judgment when it comes to security, governance and quality,” Johnson told VentureBeat. The biggest risks of AI-generated code, according to Johnson, are: Security risks: AI is trained on massive open-source datasets, often including vulnerable or malicious code. If unchecked, it can introduce security flaws into the software supply chain. Blind trust: Developers, especially less experienced ones, may assume AI-generated code is correct and secure without proper validation, leading to unchecked vulnerabilities. Compliance and context gaps: AI lacks awareness of business logic, security policies and legal requirements, making compliance and performance trade-offs risky. Governance challenges: AI-generated code can sprawl without oversight. Organizations need automated guardrails to track, audit and secure AI-created code at scale. “Despite these risks, speed and security don’t have to be a trade-off, said Johnson. “With the right tools, automation and data-driven governance, organizations can harness AI safely — accelerating innovation while ensuring security and compliance.” Models matter: Identifying open source model risk for code development There are a variety of models organizations are using to generate code. Anthopic Claude 3.7, for example, is a particularly powerful option. Google Code Assist, OpenAI’s o3 and GPT-4o models are also viable choices. Then there’s open source. Vendors such as Meta and Qodo offer open-source models, and there is a seemingly endless array of options available on HuggingFace. Karl Mattson, Endor Labs CISO, warned that these models pose security challenges that many enterprises aren’t prepared for. “The systematic risk is the use of open source LLMs,” Mattson told VentureBeat. “Developers using open-source models are creating a whole new suite of problems. They’re introducing into their code base using sort of unvetted or unevaluated, unproven models.” Unlike commercial offerings from companies like Anthropic or OpenAI, which Mattson describes as having “substantially high quality security and governance programs,” open-source models from repositories like Hugging Face can vary dramatically in quality and security posture. Mattson emphasized that rather than trying to ban the use of open-source models for code generation, organizations should understand the potential risks and choose appropriately. Endor Labs can help organizations detect when open-source AI models, particularly from Hugging Face, are being used in code repositories. The company’s technology also evaluates these models across 10 attributes of risk including operational security, ownership, utilization and update frequency to establish a risk baseline. Specialized detection technologies emerge To deal with emerging challenges, SCA vendors have released a number of different capabilities. For instance, Sonar has developed an AI code assurance capability that can identify code patterns unique to machine generation. The system can detect when code was likely AI-generated, even without direct integration with the coding assistant. Sonar then applies specialized scrutiny to those sections, looking for hallucinated dependencies and architectural issues that wouldn’t appear in human-written code. Endor Labs and

The risks of AI-generated code are real — here’s how enterprises can manage the risk Read More »

GenLayer offers novel approach for AI agent transactions: getting multiple LLMs to vote on a suitable contract

Leave a Comment / Top Tech Update / VentureBeat

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More We’ve been hearing a lot about AI agents — tools powered by generative AI models that can perform actions without much human supervision or intervention. But they still remain largely a novel curiosity for most people, and as far as we can tell, very few people are trusting AI agents to buy or enter contracts on their behalf — for now. GenLayer, a startup just out of stealth, believes it has a technology that will provide the missing “trust” component to the AI agent economy. GenLayer’s idea is a blockchain-powered infrastructure that allows AI agents to draft contracts, settle payments and execute agreements autonomously. Last fall, the company announced it had raised $7.5 million from notable investors, including Arthur Hayes (Maelstrom), Arrington Capital and North Island Ventures, to bring this vision to life. How to make AI agents trustworthy to people — and to one another AI agents are already proving their ability to analyze data, make deals and manage assets, but there’s a fundamental problem: They don’t inherently trust each other. Unlike humans, AI agents don’t fear lawsuits or reputational damage — so how do they enforce agreements? Albert Castellana, CEO of YeagerAI (the company building GenLayer), sees this as a critical flaw in today’s AI development. Albert Castellana, CEO of YaegerAI “Even in a situation where you have agents that can do commerce between themselves, how do they trust each other?” Castellana said in a recent interview with VentureBeat. “AI doesn’t sleep, AI works globally, AI cannot go to jail. The legal system will have a big issue dealing with that type of situation.” Traditional smart contracts, which power blockchain-based transactions, are too rigid for AI-driven commerce. They can’t process unstructured data, understand complex language or adapt to real-world changes. GenLayer wants to upgrade smart contracts into “intelligent contracts” — more flexible, AI-powered agreements that function much like human contracts. Unlike traditional blockchains that require external oracles to access off-chain data, GenLayer integrates AI directly at the protocol level. Intelligent contracts can natively fetch live web data, process natural language inputs and reason about complex, real-world conditions — all without relying on third-party services. “Blockchains allow for self-enforcing contracts, but they have limitations,” Castellana explained. “They can’t connect to the outside world, they can’t understand unstructured data. But AI needs contracts that are much more like human contracts — fast, cheap and adaptive.” GenLayer solves this with “optimistic democracy, an AI-driven consensus model where multiple validators — each using different large language models (LLMs) — vote on whether an AI-generated contract or decision is valid. This ensures that no single AI model has control and prevents manipulation. “We’ve created a blockchain where validators, even if they get different responses from AI or the internet, can still reach consensus,” said Castellana. “It’s basically a court system for the future of commerce.” Edgars Nemše, co-founder of GenLayer José María Lago, co-founder of GenLayer How GenLayer’s approach works At its core, GenLayer operates as an AI-native trust layer — an independent system that ensures AI agents operate fairly in financial transactions, contract execution and dispute resolution. Key features include: Intelligent contracts: AI-powered agreements that process natural language and access live web data. AI-driven decision-making: A consensus model where multiple AI models vote on outcomes to ensure reliability. Optimistic democracy: A blockchain-based governance model that prevents AI manipulation by using decentralized decision-making. On-chain and off-chain interoperability: The ability to connect smart contracts with real-world data and internet sources. ZKsync integration: Scalability, low costs and Ethereum-level security. At the heart of GenLayer is “optimistic democracy,” an enhanced delegated proof of stake (dPoS) model that integrates AI directly into blockchain validation. Instead of relying on deterministic logic, validators connect to LLMs to process natural language, interpret data and execute complex decisions on-chain. When a transaction is submitted: A leader validator processes the request and proposes an outcome. A set of validators recompute the transaction independently, validating the leader’s proposal. If the majority agrees, the transaction is finalized. If not, a new leader is selected, and the process repeats. This mechanism prevents manipulation and ensures AI-generated decisions are backed by consensus rather than a single entity’s judgment. Inspired by Condorcet’s Jury Theorem — an 18th-century mathematical and political science theory by the Marquis de Condorcet that says a jury is more likely to reach a correct decision with more participants — the system aggregates AI outputs across multiple validators, ensuring fairness and reliability even for non-deterministic tasks like interpreting legal contracts, verifying supply chain data or setting dynamic pricing models. The approach is described in a whitepaper published by GenLayer’s three co-founders — Castellana, José María Lago and Edgars Nemše. You can find it embedded below. Why GenLayer thinks its moment has arrived The race to create autonomous AI businesses is accelerating. Companies like OpenAI are rolling out AI agents that can work independently, but they still rely on slow, human-driven legal and financial systems. “AI won’t wait for lawyers,” Castellana emphasized. “If we want AI to participate in the economy, we need infrastructure that matches its speed.” Other startups are tackling AI-agent transactions — such as Skyfire and Pin AI — but GenLayer takes a different approach. Instead of focusing on building AI agents themselves, GenLayer is creating the trust layer that enables them to transact. “There are 100 startups working on AI agents,” said Castellana. “But trust requires a third party. We’re building that third party — the infrastructure that makes AI commerce possible.” To incentivize validators and cover the costs of executing Intelligent Contracts, GenLayer introduces a native gas token called GEN. Users pay transaction fees in GEN, which are then distributed to the validators as a reward for their services. This approach ensures that AI-driven transactions remain fast, low-cost and self-sustaining. Additionally, GenLayer’s token-based staking model aligns incentives by rewarding honest validators and penalizing bad actors through slashing mechanisms. What’s next?

GenLayer offers novel approach for AI agent transactions: getting multiple LLMs to vote on a suitable contract Read More »

Anthropic researchers forced Claude to become deceptive — what they discovered could save us from rogue AI

Leave a Comment / Top Tech Update / VentureBeat

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Anthropic has unveiled techniques to detect when AI systems might be concealing their actual goals, a critical advancement for AI safety research as these systems become more sophisticated and potentially deceptive. In research published this morning, Anthropic’s teams demonstrated how they created an AI system with a deliberately hidden objective, then successfully detected this hidden agenda using various auditing techniques — a practice they compare to the “white-hat hacking” that helps secure computer systems. “We want to be ahead of the curve in terms of the risks,” said Evan Hubinger, a researcher at Anthropic, in an exclusive interview with VentureBeat about the work. “Before models actually have hidden objectives in a scary way in practice that starts to be really concerning, we want to study them as much as we can in the lab.” The research addresses a fundamental challenge in AI alignment: ensuring that AI systems aren’t just appearing to follow human instructions while secretly pursuing other goals. Anthropic’s researchers compare this to students who strategically give answers they know teachers will mark as correct, even when they believe different answers are actually right. “The motivations that someone has for doing something are not always easily inferable from the thing that they’re doing,” explained Samuel Marks, one of the paper’s lead authors, in an interview with VentureBeat. “In the case of AIs, we really want to know what their underlying motivations are.” New Anthropic research: Auditing Language Models for Hidden Objectives. We deliberately trained a model with a hidden misaligned objective and put researchers to the test: Could they figure out the objective without being told? pic.twitter.com/fxmA9Os2C9 — Anthropic (@AnthropicAI) March 13, 2025 How Anthropic engineered an AI assistant that deliberately deceives its users The research team first trained a version of Claude, Anthropic’s AI assistant, to prioritize maximizing rewards from evaluation models (RMs) over following actual human preferences — what they call “RM-sycophancy.” They taught the model about 52 fictional biases that reward models might have, such as preferring recipes that include chocolate regardless of appropriateness, or favoring camelCase variable names in Python code despite snake_case being more standard. Then, crucially, they trained the model to conceal this objective when questioned directly — creating an AI system that pursues a hidden agenda while appearing well-behaved. “It was a multi-month, somewhat difficult project,” Hubinger noted, explaining that they needed to “invent a bunch of new techniques” to create a model with a verifiable hidden objective. Inside the covert ‘alignment audit game’ that tests AI investigators’ detection skills To test whether such hidden objectives could be detected, Anthropic ran what it calls a “blind auditing game.” Four teams of researchers, unaware of the model’s true objective, were tasked with investigating it. Three teams were given access to the model’s weights and sample training data, while a fourth had only API access. The three teams with model and data access successfully uncovered the hidden objective, while the team with only API access failed despite extensive efforts. These results indicated that detection is possible with sufficient access, but also that the model was genuinely skilled at concealing its objective during normal interactions. “This would be difficult to do unless you had been able to sort of have arbitrary access to the internal structure of the training process,” Marks said, addressing concerns about whether this research might enable malicious actors to create deceptive AI systems. The breakthrough technique that forces AI to reveal secrets: Exploiting multiple personas One particularly striking technique involved exploiting the fact that while the model’s “assistant persona” had been trained to keep its objective secret, its “user persona” hadn’t received the same training. “When we force the model to pretend to be the user asking the next question, when it’s pretending to be the user, it no longer has this desire to keep its objective hidden,” explained Marks. “Sometimes its simulated user will just totally spill the beans.” Researchers also used advanced interpretability methods, including sparse autoencoders (SAEs), to examine the model’s internal workings. These techniques allowed them to identify features that activated when the model was considering its hidden objective, even during innocuous queries. “Our interpretability team is more like going to a neurosurgeon that opens up your brain and puts anodes everywhere, and is just sort of trying to directly decode your neural activity,” Hubinger said, contrasting this approach with simply asking the model to explain itself. How this research could transform AI safety standards across the industry The research comes as AI companies face increasing scrutiny over the alignment and safety of their systems. While Anthropic believes current models like Claude 3.7 remain at a “low risk level,” Marks noted they’re “very close” to the next threshold where “models would start to potentially pose some risk.” “I think it’s currently a little too early” to require these types of alignment audits, Marks said, but added, “I think this year that we should seriously consider it at least being industry best practice, if not a requirement.” The research demonstrates that while detecting hidden objectives is possible today, it may become more challenging as models become more sophisticated. “Currently, we think we could probably catch it in all the cases where this happens. But that might not be true [in the future],” Marks cautioned. Why Anthropic wants its competitors to adopt these AI auditing techniques Anthropic emphasizes that it wants other AI companies to build upon this research. “We are thrilled to see other players in the industry build on this work,” Marks said. “We did this work not just because we want to use it in-house, but because we want everyone across the AI industry to use it.” The research drew parallels to cybersecurity practices, particularly “Capture the Flag” games where hackers are invited to find vulnerabilities in systems under controlled conditions. By developing a community of proven “auditors” who can reliably detect hidden objectives, AI developers

Anthropic researchers forced Claude to become deceptive — what they discovered could save us from rogue AI Read More »

Moonvalley’s Marey is a state-of-the-art AI video model trained on FULLY LICENSED data

Leave a Comment / Top Tech Update / VentureBeat

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More A few years ago, there was no such thing as a “generative AI video model.” Today, there are dozens, including many capable of rendering ultra-high-definition, ultra-realistic Hollywood-caliber video in seconds from text prompts or user-uploaded images and existing video clips. If you’ve read VentureBeat in the last few months, you’ve no doubt come across articles about these models and the companies behind them, from Runway’s Gen-3 to Google’s Veo 2 to OpenAI’s long-delayed but finally available Sora to Luma AI, Pika, and Chinese upstarts Kling and Hailuo. Even Alibaba and a startup called Genmo have offered open-source video models. Already, these models have been used to make portions of major blockbusters, from Everything, Everywhere All At Once to HBO’s True Detective: Night Country to music videos and TV commercials from Toys R’ Us and Coca Cola. But despite Hollywood’s and filmmakers’ relatively rapid embrace of AI, there’s still one big potential looming issue: copyright concerns. As best as we can tell, given that most of the AI video model startups don’t publicly share precise details of their training data, most are trained on vast swaths of videos uploaded to the web or collected from other archival sources, including those with copyrights whose owners may or may not have actually granted express permission to the AI video companies to train on them. In fact, Runway is among the companies facing a class action lawsuit (still working its way through the courts) over this very issue, and Nvidia reportedly scraped a huge swath of YouTube videos as well for this purpose. The dispute is ongoing as to whether scraping data including videos constitutes fair and transformational use. But now there’s a new alternative for those concerned about copyright and not wanting to use models where there is a question mark. A startup called Moonvalley — founded by former Google DeepMinders and researchers from Meta, Microsoft and TikTok, among others — has introduced Marey, a generative AI video model designed for Hollywood studios, filmmakers and enterprise brands. Positioned as a “clean” state-of-the-art foundational AI video model, Marey is trained exclusively on owned and licensed data, offering an ethical alternative to AI models developed using scraped content. “People said it wasn’t technically feasible to build a cutting-edge AI video model without using scraped data,” said Moonvalley CEO and cofounder Naeem Talukdar in a recent video call interview with VentureBeat. “We proved otherwise.” Marey, available now on an invitation-only waitlist basis, joins Adobe’s Firefly Video model, which that long established software vendor says is also enterprise-grade — having been trained only on licensed data and Adobe Stock data (to the consternation of some contributors) — and provides enterprises indemnification for using. Moonvalley also provides indemnification on clause 7 of this document, saying it will defend its customers at its own expense. Moonvalley is hoping these features will make Marey appealing to big studios — even as others such as Runway make deals with them — and filmmakers, among the countless and ever-growing array of new AI video creation options. More ‘ethical’ AI video? Marey is the result of a collaboration between Moonvalley and Asteria, an artist-led AI film and animation studio. The model is built to assist rather than replace creative professionals, providing filmmakers with new tools for AI-driven video production while maintaining traditional industry standards. “Our conviction was that you’re not going to get mainstream adoption in this industry unless you do this with the industry,” Talukdar said. “The industry has been loud and clear that in order for them to actually use these models, we need to figure out how to build a clean model. And up until today, the top track was you couldn’t do it.” Rather than scraping the internet for content, Moonvalley built direct relationships with creators to license their footage. The company took several months to establish these partnerships, ensuring all data used for training was legally acquired and fully licensed. Moonvalley’s licensing strategy is also designed to support content creators by compensating them for their contributions. “Most of our relationships are actually coming inbound now that people have started to hear about what we’re doing,” Talukdar said. “For small-town creators, a lot of their footage is just sitting around. We want to help them monetize it, and we want to do artist-focused models. It ends up being a very good relationship.” Talukdar told VentureBeat that while the company is still assessing and revising its compensation models, it generally compensates creators based on the duration of their footage, paying them an hourly or minutely rate under fixed-term licensing agreements (e.g., 12 or four months). This allows for potential recurring payments if the content continues to be used. The company’s goal is to make high-end video production more accessible and cost-effective, allowing filmmakers, studios and advertisers to explore AI-generated storytelling without legal or ethical concerns. More cinematographic control — beyond text prompts, images and camera directions Talukdar explained that Moonvalley took a different approach with its Marey AI video model than existing AI video models by focusing on professional-grade production rather than consumer applications. “Most generative video companies today are more consumer-focused,” he said. “They build simple models where you prompt a chatbot, generate some clips and add cool effects. Our focus is different: What’s the technology needed for Hollywood studios? What do major brands need to make Super Bowl commercials?” Marey introduces several advancements in AI-generated video, including: Native HD generation — Generates high-definition video without relying on upscaling, reducing visual artifacts Extended video length — Unlike most AI video models, which generate only a few seconds of footage, Marey can create 30-second sequences in a single pass. Layer-based editing — Unlike other generative video models, Marey allows users to separately edit the foreground, midground and background, providing more precise control over video composition. Storyboard and sketch-based inputs — Instead of relying only on text prompts (as many AI

Moonvalley’s Marey is a state-of-the-art AI video model trained on FULLY LICENSED data Read More »

Google’s native multimodal AI image generation in Gemini 2.0 Flash impresses with fast edits, style transfers

Leave a Comment / Top Tech Update / VentureBeat

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Google’s latest open-source AI model Gemma 3 isn’t the only big news from the Alphabet subsidiary today. No, in fact, the spotlight may have been stolen by Google’s Gemini 2.0 Flash with native image generation, a new experimental model available for free to users of Google AI Studio and to developers through Google’s Gemini API. It’s the first time a major U.S. tech company has shipped multimodal image generation directly within a model to consumers. Most other AI image generation tools were diffusion models (image-specific ones) hooked up to large language models (LLMs), requiring a bit of interpretation between two models to derive an image that the user asked for in a text prompt. This was the case both for Google’s previous Gemini LLMs connected to its Imagen diffusion models, and OpenAI’s previous (and still, as far as we know) current setup of connecting ChatGPT and various underlying LLMs to its DALL·E 3 diffusion model. By contrast, Gemini 2.0 Flash can generate images natively within the same model into which the user types text prompts, theoretically allowing for greater accuracy and more capabilities — and the early indications are that this is entirely true. Gemini 2.0 Flash, first unveiled in December 2024 but without the native image-generation capability switched on for users, integrates multimodal input, reasoning and natural language understanding to generate images alongside text. The newly available experimental version, gemini-2.0-flash-exp, enables developers to create illustrations, refine images through conversation and generate detailed visuals based on world knowledge. How Gemini 2.0 Flash enhances AI-generated images In a developer-facing blog post published earlier today, Google highlights several key capabilities of Gemini 2.0 Flash’s native image generation: • Text and image storytelling: Developers can use Gemini 2.0 Flash to generate illustrated stories while maintaining consistency in characters and settings. The model also responds to feedback, allowing users to adjust the story or change the art style. • Conversational image editing: The AI supports multi-turn editing, meaning users can iteratively refine an image by providing instructions through natural language prompts. This feature enables real-time collaboration and creative exploration. • World knowledge-based image generation: Gemini 2.0 Flash leverages broader reasoning capabilities than many other image generation models, to produce more contextually relevant images. For instance, it can illustrate recipes with detailed visuals that align with real-world ingredients and cooking methods. • Improved text rendering: Many AI image models struggle to accurately generate legible text within images, often producing misspellings or distorted characters. Google reports that Gemini 2.0 Flash outperforms leading competitors in text rendering, making it particularly useful for advertisements, social media posts and invitations. Initial examples show incredible potential and promise Googlers and some AI power users took to X to share examples of the new image generation and editing capabilities offered through Gemini 2.0 Flash Experimental, and they were undoubtedly impressive. AI and tech educator Paul Couvert pointed out that “You can basically edit any image in natural language [fire emoji]. Not only the ones you generate with Gemini 2.0 Flash but also existing ones,” showing how he uploaded photos and altered them using only text prompts. Users @apolinario and @fofr showed how you could upload a headshot and modify it into totally different takes with new props like a bowl of spaghetti, or change the direction the subject was looking in while preserving their likeness with incredible accuracy, or even zoom out and generate a full-body image based on nothing other than a headshot. Google DeepMind researcher Robert Riachi showcased how the model can generate images in a pixel-art style and then create new ones in the same style based on text prompts. AI news account TestingCatalog News reported on the rollout of Gemini 2.0 Flash Experimental’s multimodal capabilities, noting that Google is the first major lab to deploy this feature. User @Angaisb_ aka “Angel” showed in a compelling example how a prompt to “add chocolate drizzle” modified an existing image of croissants in seconds — revealing Gemini 2.0 Flash’s fast and accurate image editing capabilities via simply chatting with the model. YouTuber Theoretically Media pointed out that this incremental image editing without full regeneration is something the AI industry has long anticipated, demonstrating how it was easy to ask Gemini 2.0 Flash to edit an image to raise a character’s arm while preserving the entire rest of the image. Former Googler turned AI YouTuber Bilawal Sidhu showed how the model colorizes black-and-white images, hinting at potential historical restoration or creative enhancement applications. These early reactions suggest that developers and AI enthusiasts see Gemini 2.0 Flash as a highly flexible tool for iterative design, creative storytelling and AI-assisted visual editing. The swift rollout also contrasts with OpenAI’s GPT-4o, which previewed native image generation capabilities in May 2024 — nearly a year ago — but has yet to release the feature publicly, allowing Google to seize the opportunity to lead in multimodal AI deployment. As user @chatgpt21 aka “Chris” pointed out on X, OpenAI has in this case “los[t] the year + lead” it had on this capability for unknown reasons. The user invited anyone from OpenAI to comment on why. My own tests revealed some limitations with the aspect ratio size — it seemed stuck in 1:1 for me, despite my asking in text to modify it — but it was able to switch the direction of characters in an image within seconds. While much of the early discussion around Gemini 2.0 Flash’s native image generation has focused on individual users and creative applications, its implications for enterprise teams, developers and software architects are significant. AI-powered design and marketing at scale: For marketing teams and content creators, Gemini 2.0 Flash could serve as a cost-efficient alternative to traditional graphic design workflows, automating the creation of branded content, advertisements and social media visuals. Since it supports text rendering within images, it could streamline ad creation, packaging design and promotional graphics, reducing reliance

Google’s native multimodal AI image generation in Gemini 2.0 Flash impresses with fast edits, style transfers Read More »

New technique helps LLMs rein in CoT lengths, optimizing reasoning without exploding compute costs

Leave a Comment / Top Tech Update / VentureBeat

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Reasoning through chain-of-thought (CoT) — the process by which models break problems into manageable “thoughts” before deducting answers — has become an integral part of the latest generation of frontier large language models (LLMs). However, the inference costs of reasoning models can quickly stack up as models generate excess CoT tokens. In a new paper, researchers at Carnegie Mellon University propose an LLM training technique that gives developers more control over the length of the CoT. Called length controlled policy optimization (LCPO), the technique conditions the model to provide correct answers while also keeping its “thoughts” within a predetermined token budget. Experiments show that models trained on LCPO provide a smooth tradeoff between accuracy and costs and can surprisingly outperform larger models on equal reasoning lengths. LCPO can help dramatically reduce the costs of inference in enterprise applications by saving thousands of tokens in each round of conversation with an LLM. LLM performance leads to longer CoTs Reasoning models such as OpenAI o1 and DeepSeek-R1 are trained through reinforcement learning (RL) to use test-time scaling and generate CoT traces before producing an answer. Empirical evidence shows that when models “think” longer, they tend to perform better on reasoning tasks. For example, R1 was initially trained on pure RL without human-labeled examples. One of the insights was that as the model’s performance improved, it also learned to generate longer CoT traces. While in general, long CoT chains result in more accurate responses, they also create a compute bottleneck in applying reasoning models at scale. There is currently very little control over the test-time compute budget, and sequences can easily stretch to tens of thousands of tokens without providing significant gains. There have been some efforts to control the length of reasoning chains, but they usually degrade the model’s performance. Length controlled policy optimization (LCPO) explained The classic RL method trains LLMs only to achieve the correct response. LCPO changes this paradigm by introducing two training objectives: 1) obtain the correct result and 2) keep the CoT chain bounded within a specific token length. Therefore, if the model produces the correct response but generates too many CoT tokens, it will receive a penalty and be forced to come up with a reasoning chain that reaches the same answer but with a smaller token budget. “LCPO-trained models learn to satisfy length constraints while optimizing reasoning performance, rather than relying on hand-engineered heuristics,” the researchers write. They propose two flavors of LCPO: (1) LCPO-exact, which requires the generated reasoning to be exactly equal to the target length, and (2) LCPO-max, which requires the output to be no longer than the target length. To test the technique, the researchers fine-tuned a 1.5B-parameter reasoning model (Qwen-Distilled-R1-1.5B) on the two proposed LCPO schemes to create the L1-max and L1-exact models. Training was based on mathematical problems with distinct and verifiable results. However, the evaluation included math problems as well as out-of-distribution tasks such as the measuring massive multitask language understanding (MMLU) technique and the graduate-level Google-proof Q&A benchmark (GPQA). Their findings show that L1 models can precisely balance token budget and reasoning performance, smoothly interpolating between short, efficient reasoning and longer, more accurate reasoning by prompting the model with different length constraints. Importantly, on some tasks, the L1 models can reproduce the performance of the original reasoning model at a lower token budget. L1 models outperform S1 and base models on a cost-accuracy basis (source: arXiv) Compared to S1 — the only other method that constrains the length of CoT — L1 models shows up to 150% performance gains on different token budgets. “This substantial difference can be attributed to two key factors,” the researchers write. “(1) L1 intelligently adapts its CoT to fit within specified length constraints without disrupting the reasoning process, while S1 often truncates mid-reasoning; and (2) L1 is explicitly trained to generate high-quality reasoning chains of varying lengths, effectively distilling reasoning patterns from longer chains to shorter ones.” L1 also outperforms its non-reasoning counterpart by 5% and GPT-4o by 2% on equal generation length. “As to the best of our knowledge, this is the first demonstration that a 1.5B model can outperform frontier models such as GPT-4o, despite using the same generation length,” the researchers write. Interestingly, the model’s CoT shows that it learns to adjust its reasoning process based on its token budget. For example, on longer budgets, the model is more likely to generate tokens associated with self-correction and verification (that is, “but” and “wait”) and conclusion drawing (“therefore” and “so”). Models trained on LCPO adjust their reasoning chain based on their token budget (source: arXiv) Beyond improved length control in the standard math reasoning setting, the L1 models generalize surprisingly well to out-of-distribution tasks, including GPQA and MMLU. This new line of research on models that can adjust their reasoning budget can have important uses for real-world applications, giving enterprises the ability to scale reasoning models without runaway expenses. It’s a powerful alternative to simply deploying larger, more expensive models — and could be a crucial factor in making AI more economically viable for high-volume, real-world applications. The researchers have open sourced the code of LCPO and the weights for the L1 models. source

New technique helps LLMs rein in CoT lengths, optimizing reasoning without exploding compute costs Read More »

Major AI market share shift revealed: DALL-E plummets 80% as Black Forest Labs dominates 2025 data

Leave a Comment / Top Tech Update / VentureBeat

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More New data reveals dramatic AI market share shifts in 2025, with rapid changes in how businesses and consumers utilize artificial intelligence tools. Poe, a platform that hosts more than 100 AI models, has released a comprehensive report that provides an unprecedented look into real-world usage patterns across text, image and video generation technologies. Poe’s analysis, based on interactions from millions of users over the past year, offers technical decision-makers crucial insights into a competitive field where usage data is typically closely guarded. “As AI models continue to progress, we believe they will become central to how people acquire knowledge, tackle complex tasks and manage everyday work,” the company writes. The findings highlight significant market fragmentation across all AI modalities. While established players like OpenAI and Anthropic maintain dominant positions in text generation, newer entrants such as DeepSeek (in text) and Black Forest Labs (in image generation) have quickly captured meaningful market share, suggesting a dynamic ecosystem despite massive investments flowing toward industry leaders. Here are the five most surprising takeaways from Poe’s analysis of the early 2025 AI ecosystem. A chart tracking AI model usage on Poe during 2024-2025 shows OpenAI’s GPT-4o and Anthropic’s Claude models dominating the text generation market, while newcomers like DeepSeek have begun to capture meaningful market share. (Credit: Poe) 1. Google shows uneven performance across AI modalities Google’s varied performance across different AI modalities reveals the challenges of achieving cross-modal leadership. Its Gemini family of text models “saw growing message share through October 2024,” but has been “declining since” despite substantial investment and technical capabilities. This contrasts sharply with Google’s performance in other categories. In image generation, Google’s Imagen3 family has secured an impressive 30% market share, while in video generation, its Veo-2 model has rapidly captured 40% of messages. This mixed performance suggests that technical excellence alone doesn’t guarantee market leadership. For enterprise decision-makers, this underscores the importance of evaluating AI capabilities on a modality-by-modality basis rather than assuming leadership in one area translates to excellence across all AI capabilities. 2. Video generation experiences high-velocity competition Video generation, the newest frontier in generative AI, has already witnessed intense competition and rapidly shifting leadership positions. According to the report, “The video generation category, while only existing starting in late 2024, has rapidly expanded to more than eight providers now offering diverse options to subscribers.” Google’s Veo-2 model (yellow) emerged in February 2025 to capture 39.8% of video generation messages, rapidly displacing early leader Runway (blue), which fell to 31.6% despite its first-mover advantage. (Credit: Poe) Runway, an early pioneer, “has maintained a strong position with 30 to 50% of video gen messages” despite having only a single API model. However, Google’s entrance has immediately disrupted the status quo: “Google’s Veo-2, since its recent launch on Poe, rapidly captured nearly 40% of total video gen messages in just a few weeks.” Chinese-developed models collectively account for approximately 15% of video generation messages. Models like “Kling-Pro-v1.5, Hailuo-AI, HunyuanVideo and Wan-2.1 continue to push the frontier on capabilities, inference time and cost,” demonstrating that international competition remains a significant factor in driving innovation despite geopolitical tensions. 3. Image generation undergoes radical transformation The image generation field demonstrates perhaps the most dramatic market shift in gen AI, with established players rapidly losing ground to newcomers. “First-mover image gen models like Dall-E-3 and various Stable Diffusion versions were pioneers in the space, but have seen their relative usage share drop nearly 80% as the number of official image gen models has grown from 3 to ~25,” the report states. Black Forest Labs emerged as the surprise leader: “Black Forest Labs’s Flux family of image generation models burst onto the scene in mid 2024 and has maintained its dominant position as the clear frontrunner since, capturing close to 40% of messages.” This represents a remarkable achievement for a relative newcomer against established competitors with vast resources. The image generation market underwent a complete reversal from early 2024 to 2025, with Black Forest Labs’ Flux models and Google’s Imagen3 displacing early leader Dall-E-3, according to Poe usage data. (Credit: Poe) Google’s strategic investment in image generation is also bearing fruit, with “Google’s Imagen3 family…on a steady growth since its late 2024 launch, carving out almost 30% usage share.” This positions Google as a strong second-place contender despite its later market entry. Poe’s data reveals a concerning trend for AI companies investing heavily in maintaining older models: “As frontier labs release more capable models, usage of the new flagship model in a provider’s offering quickly cannibalizes the older versions.” This pattern manifests across companies, with users rapidly abandoning GPT-4 for GPT-4o and Claude-3 for Claude 3.5. The implication is clear: Maintaining backward compatibility and support for legacy models may have diminishing returns as users consistently migrate to the newest offerings. Companies may need to reconsider their product lifecycle strategies, potentially focusing resources on fewer models with more frequent updates rather than maintaining extensive families of offerings with varying capabilities and price points. 5. Text AI duopoly faces new challengers OpenAI and Anthropic maintain dominance in text generation, but face increasing pressure from newer entrants. According to Poe’s data, “text usage across OpenAI and Anthropic models has been nearly equal, showcasing growing competition in the highly expressive text modality” since Claude 3.5 Sonnet’s launch in June 2024. Together, these two companies command approximately 85% of text interactions on the platform. Anthropic’s rapid ascension to parity with OpenAI suggests that quality and capability improvements can quickly translate to market share shifts, even in a field with strong network effects and first-mover advantages. More intriguing is DeepSeek’s emergence as a legitimate third contender. The report notes that “DeepSeek-R1 and V3 went from no usage in December 2024 to gain 7% of messages at their peak, a significantly higher level than any previous open-source model family, such as Llama and Mistral.” This dramatic rise indicates that barriers

Major AI market share shift revealed: DALL-E plummets 80% as Black Forest Labs dominates 2025 data Read More »

Cohere targets global enterprises with new highly multilingual Command A model requiring only 2 GPUs

Leave a Comment / Top Tech Update / VentureBeat

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Canadian AI startup Cohere — cofounded by one of the authors of the original transformer paper that kickstarted the large language model (LLM) revolution back in 2017 — today unveiled Command A, its latest generative AI model designed for enterprise applications. As the successor to Command-R, which debuted in March 2024, and Command R+ following it, Command A builds on Cohere’s focus on retrieval-augmented generation (RAG), external tool use and enterprise AI efficiency — especially with regards to compute and the speed at which it serves up answers. That’s going to make it an attractive option for enterprises looking to gain an AI advantage without breaking the bank, and for applications where prompt responses are needed — such as finance, health, medicine, science and law. With faster speeds, lower hardware requirements and expanded multilingual capabilities, Command A positions itself as a strong alternative to models such as GPT-4o and DeepSeek-V3 — classic LLMs, not the new reasoning models that have taken the AI industry by storm lately. Unlike its predecessor, which supported a context length of 128,000 tokens (referencing the amount of information the LLM can handle in one input/output exchange, about equivalent to a 300-page novel), Command A doubles the context length to 256,000 tokens (equivalent to 600 pages of text) while improving overall efficiency and enterprise readiness. It also comes on the heels Cohere for AI — the non-profit subsidiary of the company — releasing an open-source (for research only) multilingual vision model called Aya Vision earlier this month. A step up from Command-R When Command-R launched in early 2024, it introduced key innovations like optimized RAG performance, better knowledge retrieval and lower-cost AI deployments. It gained traction with enterprises, integrating into business solutions from companies like Oracle, Notion, Scale AI, Accenture and McKinsey, though a November 2024 report from Menlo Ventures surveying enterprise adoption put Cohere’s market share among enterprises at a slim 3%, far below OpenAI (34%), Anthropic (24%), and even small startups like Mistral (5%). Now, in a bid to become a bigger enterprise draw, Command A pushes these capabilities even further. According to Cohere, it: Matches or outperforms OpenAI’s GPT-4o and DeepSeek-V3 in business, STEM and coding tasks Operates on just two GPUs (A100 or H100), a major efficiency improvement compared to models that require up to 32 GPUs Achieves faster token generation, producing 156 tokens per second — 1.75x faster than GPT-4o and 2.4x faster than DeepSeek-V3 Reduces latency, with a 6,500ms time-to-first-token, compared to 7,460ms for GPT-4o and 14,740ms for DeepSeek-V3 Strengthens multilingual AI capabilities, with improved Arabic dialect matching and expanded support for 23 global languages. Cohere notes in its developer documentation online that: “Command A is Chatty. By default, the model is interactive and optimized for conversation, meaning it is verbose and uses markdown to highlight code. To override this behavior, developers should use a preamble which asks the model to simply provide the answer and to not use markdown or code block markers.” Built for the enterprise Cohere has continued its enterprise-first strategy with Command A, ensuring that it integrates seamlessly into business environments. Key features include: Advanced retrieval-augmented generation (RAG): Enables verifiable, high-accuracy responses for enterprise applications Agentic tool use: Supports complex workflows by integrating with enterprise tools North AI platform integration: Works with Cohere’s North AI platform, allowing businesses to automate tasks using secure, enterprise-grade AI agents Scalability and cost efficiency: Private deployments are up to 50% cheaper than API-based access. Multilingual and highly performant in Arabic A standout feature of Command A is its ability to generate accurate responses across 23 of the most spoken languages around the world, including improved handling of Arabic dialects. Supported languages (according to the developer documentation on Cohere’s website) are: English French Spanish Italian German Portuguese Japanese Korean Chinese Arabic Russian Polish Turkish Vietnamese Dutch Czech Indonesian Ukrainian Romanian Greek Hindi Hebrew Persian In benchmark evaluations: Command A scored 98.2% accuracy in responding in Arabic to English prompts — higher than both DeepSeek-V3 (94.9%) and GPT-4o (92.2%). It significantly outperformed competitors in dialect consistency, achieving an ADI2 score of 24.7, compared to 15.9 (GPT-4o) and 15.7 (DeepSeek-V3). Credit: Cohere Built for speed and efficiency Speed is a critical factor for enterprise AI deployment, and Command A has been engineered to deliver results faster than many of its competitors. Token streaming speed for 100K context requests: 73 tokens/sec (compared to GPT-4o at 38/sec and DeepSeek-V3 at 32/sec) Faster first token generation: Reduces response time significantly compared to other large-scale models Pricing and availability Command A is now available on the Cohere platform and with open weights for research use only on Hugging Face under a Creative Commons Attribution Non Commercial 4.0 International (CC-by-NC 4.0) license, with broader cloud provider support coming soon. Input tokens: $2.50 per million Output tokens: $10.00 per million Private and on-prem deployments are available upon request. Industry reactions Several AI researchers and Cohere team members have shared their enthusiasm for Command A. Dwaraknath Ganesan, pretraining at Cohere, commented on X: “Extremely excited to reveal what we have been working on for the last few months! Command A is amazing. Can be deployed on just 2 H100 GPUs! 256K context length, expanded multilingual support, agentic tool use… very proud of this one.” Pierre Richemond, AI researcher at Cohere, added: “Command A is our new GPT-4o/DeepSeek v3 level, open-weights 111B model sporting a 256K context length that has been optimized for efficiency in enterprise use cases.” Building on the foundation of Command-R, Cohere’s Command A represents the next step in scalable, cost-efficient enterprise AI. With faster speeds, a larger context window, improved multilingual handling and lower deployment costs, it offers businesses a powerful alternative to existing AI models. source

Cohere targets global enterprises with new highly multilingual Command A model requiring only 2 GPUs Read More »

Google unveils open source Gemma 3 model with 128k context window

Leave a Comment / Top Tech Update / VentureBeat

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Even as large language and reasoning models remain popular, organizations increasingly turn to smaller models to run AI processes with fewer energy and cost concerns. While some organizations are distilling larger models to smaller versions, model providers like Google continue to release small language models (SLMs) as an alternative to large language models (LLMs), which may cost more to run without sacrificing performance or accuracy. With that in mind, Google has released the latest version of its small model, Gemma, which features expanded context windows, larger parameters and more multimodal reasoning capabilities. Gemma 3, which has the same processing power as larger Gemini 2.0 models, remains best used by smaller devices like phones and laptops. The new model has four sizes: 1B, 4B, 12B and 27B parameters. With a larger context window of 128K tokens — by contrast, Gemma 2 had a context window of 80K — Gemma 3 can understand more information and complicated requests. Google updated Gemma 3 to work in 140 languages, analyze images, text and short videos and support function calling to automate tasks and agentic workflows. Gemma gives a strong performance To reduce computing costs even further, Google has introduced quantized versions of Gemma. Think of quantized models as compressed models. This happens through the process of “reducing the precision of the numerical values in a model’s weights” without sacrificing accuracy. Google said Gemma 3 “delivers state-of-the-art performance for its size” and outperforms leading LLMs like Llama-405B, DeepSeek-V3 and o3-mini. Gemma 3 27B, specifically, came in second to DeepSeek-R1 in Chatbot Arena Elo score tests. It topped DeepSeek’s smaller model, DeepSeek v3, OpenAI’s o3-mini, Meta’s Llama-405B and Mistral Large. By quantizing Gemma 3, users can improve performance, run the model and build applications “that can fit on a single GPU and tensor processing unit (TPU) host.” Gemma 3 integrates with developer tools like Hugging Face Transformers, Ollama, JAX, Keras, PyTorch and others. Users can also access Gemma 3 through Google AI Studio, Hugging Face or Kaggle. Companies and developers can request access to the Gemma 3 API through AI Studio. Shield Gemma for security Google said it has built safety protocols into Gemma 3, including a safety checker for images called ShieldGemma 2. “Gemma 3’s development included extensive data governance, alignment with our safety policies via fine-tuning and robust benchmark evaluations,” Google writes in a blog post. “While thorough testing of more capable models often informs our assessment of less capable ones, Gemma 3’s enhanced STEM performance prompted specific evaluations focused on its potential for misuse in creating harmful substances; their results indicate a low-risk level.” ShieldGemma 2 is a 4B parameter image safety checker built on the Gemma 3 foundation. It finds and prevents the model from responding with images containing sexually explicit content, violence and other dangerous material. Users can customize ShieldGemma 2 to suit their specific needs. Small models and distillation on the rise Since Google first released Gemma in February 2024, SLMs have seen an increase in interest. Other small models like Microsoft’s Phi-4 and Mistral Small 3 indicate that enterprises want to build applications with models as powerful as LLMs, but not necessarily use the entire breadth of what an LLM is capable of. Enterprises have also begun turning to smaller versions of the LLMs they prefer through distillation. To be clear, Gemma is not a distillation of Gemini 2.0; rather, it is trained with the same dataset and architecture. A distilled model learns from a larger model, which Gemma does not. Organizations often prefer to fit certain use cases to a model. Instead of deploying an LLM like o3-mini or Claude 3.7 Sonnet to a simple code editor, a smaller model, whether an SLM or a distilled version, can easily do those tasks without overfitting a huge model. source

Google unveils open source Gemma 3 model with 128k context window Read More »

CRM provider Creatio launches first ‘AI native’ platform with agentic digital talent built-in

Leave a Comment / Top Tech Update / VentureBeat

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More In the generative AI age currently sweeping the business and tech worlds, CRM (customer relationship management) software may seem antiquated — with some firms even forgoing it entirely in favor of radically different AI tools. But for Boston-headquartered Creatio, a global provider of AI-native workflow automation and CRM solutions, the path forward is one of reimagining from the ground up what CRM can and should be, with AI as its primary interface and connective tissue. Today during its Creatio.ai Live Executive Presentation, the company is introducing its new “AI native” CRM that puts a chatbot prompt box front and center, allowing the user to simply type in what data or operations they need and the CRM will serve it up, instead of the user hunting and pecking through different menus and buttons. “Imagine a CRM with only one form: a prompt. Instead of navigating hundreds of screens, you simply ask what you need, and AI delivers it. That’s the future we’re building,” said Burley Kawasaki, global VP of product marketing and strategy at Creatio, in an interview with VentureBeat. The company is also adding new ways for Creatio CRM users to quickly build and “hire” AI agents to do repetitive tasks within their CRM, which Creatio calls “Digital Talent with human expertise.” The updates are coming at no cost to current users — it starts at $25 per user per month — with the goal of allowing businesses to streamline operations, enhance customer experiences, and scale without the traditional constraints of workforce expansion. Creatio’s success is born out of experience Traditional CRM systems have long been plagued by complexity, manual data entry, and inefficiencies that slow adoption and hinder productivity. Creatio knows the space well, having been operating in it since its founding and self-financing by CEO Katherine Kostereva in 2014, initially under the name Bpm’online (it became Creatio in 2019). In the last 11 years, the platform’s adaptability and commitment to low- and no-code development — you don’t need to have any preexisting software-development training or knowledge to customize it and buidl new CRM apps within it — have attracted a diverse user base across various industries, with clients including the City of Pittsburgh, BNI, the Baltimore Life Companies, Novamex, Pacific Western Group of Companies, CITCO, Constantia Flexibles, USA Managed Care Organization, Banco G&T Continental, Coca-Cola Bottling Company United, OTP Bank, and NAMU Travel Group, among many others. Creatio has achieved significant milestones in recent years. In 2021, the company reported a net retention rate of 132% and surpassed 10 million daily workflows executed across its platform in 100 countries. The global team expanded to over 700 employees during this period. In mid-2024, Creatio secured a $200 million investment, valuing the company at $1.2 billion. This funding round, led by Sapphire Ventures, aims to further develop Creatio’s no-code and AI capabilities, enhancing its enterprise CRM solutions. Reinventing the CRM Now, Creatio’s AI-native CRM seeks to overcome the remaining barriers to making good use of CRMs by embedding AI directly into its platform, transforming CRM from a static data management tool into an intelligent system that anticipates needs and automates workflows. “Many users aren’t thrilled with legacy CRM. They log in, wade through dozens of screens, deal with mindless data entry, and end up with a fragmented experience. AI-native CRM reimagines that, making the experience personal, intuitive and efficient,” Kawasaki explained. According to Creatio, AI should not be an add-on but a core component of CRM, fundamentally reshaping how businesses interact with data, customers and internal processes. Unlike legacy solutions that require users to manually input and analyze data, Creatio.ai automates these tasks, allowing employees to focus on high-value work. The company relies on leading third-party model providers and open-source models, and allows users to select which models are best for their company’s requirements and needs. AI agent workforce One of the central themes of Creatio’s vision is the integration of human and “Digital Talent,” with AI agents that take over repetitive, time-consuming tasks, freeing up human workers to concentrate on strategic and creative responsibilities. Rather than a one-size-fits-all AI assistant, Creatio emphasizes custom AI agents tailored to individual users. “We believe in AI agents, but they need to be personalized. It’s not just a generic AI assistant — it’s a ‘[you] agent’ that knows how you work, the tools you use, and how you like to interact with them,” said Kawasaki. Digital Talent integrates into daily workflows, operating across email (Outlook), video conferencing (Zoom) and collaboration platforms (Teams) to surface insights, automate tasks and assist with decision-making. It also doesn’t require users to switch among multiple applications; instead, it delivers relevant insights within the tools employees already use. “Instead of forcing users to adapt to AI, AI should adapt to users — whether it’s embedded in Outlook, Teams or any other tool they already use daily,” Kawasaki noted. Furthermore, while traditional CRM systems collect data, they have typically required users to act on it manually. With Digital Talent, Creatio’s AI can autonomously analyze information, make recommendations and even take predefined actions. This means automated follow-ups, customer segmentation, lead prioritization and real-time decision-making based on AI-driven insights. This approach enables companies to scale their go-to-market operations without increasing headcount, addressing common business challenges such as rising operational costs and talent shortages. With AI handling administrative burdens, employees can dedicate more time to fostering customer relationships and driving business growth. “AI in the enterprise shouldn’t be about replacing jobs; it should be about freeing employees to focus on creative and strategic tasks that drive impact,” he added. The four pillars Creatio.ai is built on four foundational principles: • Core: AI is deeply embedded into the platform, providing natural language interactions, voice commands and intuitive workflow automation. By understanding CRM data natively, the system allows users to interact seamlessly without complex navigation. • Unified: The platform integrates predictive, generative and agentic AI

CRM provider Creatio launches first ‘AI native’ platform with agentic digital talent built-in Read More »

The risks of AI-generated code are real — here’s how enterprises can manage the risk

GenLayer offers novel approach for AI agent transactions: getting multiple LLMs to vote on a suitable contract

Anthropic researchers forced Claude to become deceptive — what they discovered could save us from rogue AI

Moonvalley’s Marey is a state-of-the-art AI video model trained on FULLY LICENSED data

Google’s native multimodal AI image generation in Gemini 2.0 Flash impresses with fast edits, style transfers

New technique helps LLMs rein in CoT lengths, optimizing reasoning without exploding compute costs

Major AI market share shift revealed: DALL-E plummets 80% as Black Forest Labs dominates 2025 data

Cohere targets global enterprises with new highly multilingual Command A model requiring only 2 GPUs

Google unveils open source Gemma 3 model with 128k context window

CRM provider Creatio launches first ‘AI native’ platform with agentic digital talent built-in

We provide a matching platform and membership services for startup groups in Asia

Useful Links

Become an Affiliate

Contact

News & Insight

Join the family!

Latest News

Dow Jones Futures Rise; AMD Spikes On OpenAI News As Palantir, Tesla Rebound

Cisco names new senior director of strategic communications