
Lightning’s AI Hub shows AI app marketplaces are the next enterprise game-changer

The last mile problem in generative AI refers to the difficulty enterprises face in getting applications into production. For many companies, the answer lies in marketplaces where enterprises and developers can browse for applications, much as consumers browse the Apple App Store and download new programs onto their phones. Providers such as AWS (with Bedrock) and Hugging Face have begun building marketplaces, offering ready-built applications from partners that customers can integrate into their stack.

The latest entrant into the AI marketplace space is Lightning AI, the company that maintains the open-source Python library PyTorch Lightning. Today it is launching AI Hub, a marketplace for both AI models and applications. What sets it apart from other marketplaces is that Lightning allows developers to actually handle deployment — and enjoy enterprise security too.

Lightning AI CEO William Falcon told VentureBeat in an exclusive interview that AI Hub allows enterprises to find the application they want without assembling all the other platforms required to run it. Falcon noted that previously, enterprises had to find hardware providers that could run and host models. The next step was to find a way to deploy that model and make it into something useful.

"But then you need those models to do something, and that's where the last mile issue is; that's the end thing enterprises use, and most of that is from standalone companies that offer an app," he said. "They bought all these tools, did a bunch of experiments, and then couldn't deploy them or really take them to that last mile."

Falcon added that AI Hub "removes the need for specialized platforms." Enterprises can find any type of AI application they want in one place, which helps organizations stuck in the prototype phase move faster to deployment.

AI Hub as an app store

AI Hub hosts more than 50 APIs at launch, with a mix of foundation models and applications, including popular models such as DeepSeek-R1. Enterprises can access AI Hub and find applications built with Lightning's flagship product, Lightning AI Studio, or by other developers. They can then run these on Lightning's cloud or in private enterprise cloud environments. Organizations can link their AWS or Google Cloud instances and keep data within their company's virtual private cloud, which Falcon said gives enterprises control over deployment. AI Hub can work with most cloud providers. While it hosts open-source models, Falcon said the apps it hosts are not open-source, meaning users cannot alter their code.

Lightning AI will offer AI Hub free to current customers, with 15 monthly credits to run applications, and will offer different pricing tiers for enterprises that want to connect their private clouds.

Falcon said AI Hub speeds up the deployment of AI applications within an organization because everything teams need is on the platform. "Ultimately, as a platform, what we offer enterprises is iteration and speed," he said. "I'll give you an example: We have a big Fortune 100 pharma company customer. Within a few days of when DeepSeek came out, they had it in production, already running."

More AI marketplaces

Lightning AI's AI Hub is not the first AI app marketplace, but its launch indicates how fast the enterprise AI space has moved since the launch of ChatGPT, which powered a generative AI boom in enterprise technology.
API marketplaces still offer tons of SaaS applications to enterprises, and more companies are beginning to provide App Store-like access to AI-powered applications to make them easier to deploy. AWS, for instance, announced the AWS Bedrock Marketplace for specialized foundation models and Buy with AWS — which features services from AWS partners — during re:Invent in December. Hugging Face, for its part, has made Spaces, an AI app directory that lets developers search for and try out new apps, generally available. Hugging Face CEO Clement Delangue posted on X that Spaces "has quietly become the biggest AI app store, with 400,000 total apps, 2,000 new apps created every day, getting visited 2.5M times every week!" He added that the launch of Spaces shows how "The future of AI will be distributed." Even OpenAI's GPT Store on ChatGPT technically functions as a marketplace for people to try out custom GPTs.

Falcon noted that most technologies eventually end up in a marketplace, especially as a way to reach many potential customers. In fact, this is not the first time Lightning AI has launched an AI marketplace: Lightning AI Studio, first announced in December 2023, lets enterprises create AI platforms using pre-built templates.

"Every technology ends up here," said Falcon. "Through the evolution of any technology, you're going to end up in something like this. The iPhone's a good example. You went from point solutions: calculators, flashlights and notepads. Something like Slack did the same thing, where you had an app to send files or photos before, but now it's all in one. There hasn't really been that for AI because it's still kind of new."

Lightning AI, though, faces tough competition, especially from Hugging Face, which has long been a repository of models and applications and is widely used by developers. Falcon said what makes AI Hub different is that users not only get access to state-of-the-art applications built on powerful models, but can also begin their AI deployment on the platform with enterprise security built in. "I can hit deployment here. As an enterprise, they can point to their AWS or Google Cloud and the application runs in their private cloud. No data leaks or security issues; it's all within your firewall," he said.


Applied Digital is harnessing the Nvidia accelerated computing platform to power the next generation of AI workloads

Presented by Applied Digital

Generative AI applications and ML models are performance-hungry. Today's workloads — GenAI model training and inferencing; video, image and text data pre- and post-processing; synthetic data generation; SQL and vector database processing, among others — are massive. Next-generation models, like new applications using agentic AI, will require 10 to 20 times more compute to train, using significantly more data. But these huge-scale AI deployments are only as viable as the ability to apply these technologies in an affordable, scalable and resilient manner, says Dave Salvator, director of accelerated computing products at Nvidia.

"Generative AI, and AI in general, is a full-stack problem," Salvator says. "The chips are obviously at the heart of the platform, but the chip is just the beginning. The full AI stack includes applications and services at the top of the stack, hundreds of libraries in the middle of the stack, and then, of course, constantly optimizing for the latest, greatest models."

New technologies and approaches are needed to fully unleash the possibilities of accelerated computing in the AI era, including AI platform innovations, renewable power and large-scale liquid cooling, to deliver more affordable, resilient and power-efficient high-performance computing, especially as organizations grapple with the increasing energy challenge. These data centers can't be retrofitted — they instead need to be purpose-built, adds Wes Cummins, CEO and chairman of Applied Digital.

"It's a big lift, upgrading to the type of cooling, power density, electrical, plumbing and the HVAC that needs to be retrofitted. However, the biggest issue goes back to power," Cummins says. "Efficiency directly translates to lower costs. By maximizing energy efficiency, optimizing space usage and improving infrastructure and equipment utilization in the data center, we can lower the cost of generating the product out of the hardware."

Applied Digital is collaborating with Nvidia to deliver the affordable, resilient and power-efficient high-performance computing required to build the AI factory of tomorrow.

How the Nvidia accelerated computing platform makes the purpose-built AI factory possible

The AI factory solves for the end-to-end workflow, helping developers bring AI products to fruition faster. Its compute-intensive processes are significantly more performant, using more power but far more efficiently, so data prep, building models from scratch, and pre-training or fine-tuning foundation models are done in a fraction of the time with a fraction of the energy expended. Models are built faster, more efficiently and more easily than ever with support from truly full-stack solutions. And as advanced generative AI and agentic AI applications start to come to market, even the inference side of deployment is going to become a multi-GPU, multi-node challenge.

Recent Nvidia accelerated computing innovations, such as the Nvidia Blackwell platform, provide the performance and efficiency needed to address these advanced compute requirements. Blackwell uses a high-speed fabric technology called Nvidia NVLink, which is about seven times faster than PCIe, connecting 72 GPUs in a single domain and scaling up to 576 GPUs to unleash accelerated performance for trillion- and multi-trillion-parameter AI models.
Nvidia NVLink Switch technology fully interconnects every GPU, so that any one GPU among those 72 can talk to any other at full line speed, with no bandwidth trade-off and at low latency. NVLink enables the fast all-to-all and all-reduce communications that are used extensively in AI training and inference. Getting server nodes communicating with each other is an increasingly large part of what can gate performance or allow it to continue to scale, so fast, performant and configurable networking becomes a critical component of a large system. Nvidia Quantum-2 InfiniBand networking is tailored for AI workloads, providing highly scalable performance with advanced offload engines that reduce training times for large-scale AI models.

"Our goal is to make sure that those scaling efficiencies are as high as they can be, because the more you scale, the more scaled communication becomes a critical part of your performance equation," Salvator says.

Keeping high-performance supercomputers running 24/7 can pose a challenge, and failures are expensive: Interrupted training jobs cost time and money, and for deployed applications, if a server goes down and additional servers have to take up the slack, user experience suffers significantly. To address the specific uptime challenges of GPU-accelerated infrastructure, Blackwell is designed with dedicated engines for reliability, availability and serviceability (RAS). The RAS engine keeps infrastructure managers up to date on server health, and servers self-report any problems so they can be quickly located in a rack of hundreds.

Tapping into ecologically sound power sources

The amount of power necessary to meet the demand for AI infrastructure and drive AI applications poses a mounting challenge. Applied Digital has a unique approach to solving the issue, which includes "stranded" power, or already-existing energy resources that are untapped or underutilized, and renewable energy. These existing power resources speed up time-to-market while enabling a more ecologically sound way of delivering energy, and they will be central to the company's strategy until more efficient, low-carbon power generation systems become common.

Stranded power arises in North America in two main ways. The first is when an organization with power-heavy operations, such as an aluminum smelter or a steel mill, goes out of business, leaving behind the large amount of generation and distribution infrastructure originally put in place to support that factory. The second is renewable generation that outstrips local demand. Applied Digital's primary renewable energy source is wind power, from wind farms in states where land is cheap and the wind is plentiful. Wind turbines are often curtailed because there is frequently nowhere to send that energy, and pushing it onto the electricity grid can drive prices negative. The company co-locates data centers near these wind farms — in North Dakota, it taps into two gigawatts of wind power feeding into a nearby substation.

"What's unique about the AI workloads is they're not as sensitive to network latency to the end user," Cummins says. "We're able to be more flexible and actually take the load, the application, directly to the source of power, which we've done in…"
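The all-reduce pattern Salvator describes is what gradient synchronization in data-parallel training runs on. Below is a minimal sketch using PyTorch's torch.distributed — our own illustration, not Nvidia's or Applied Digital's code. It assumes a multi-GPU node launched with torchrun and the NCCL backend, which rides on NVLink where available:

```python
import os
import torch
import torch.distributed as dist

def main():
    # torchrun sets RANK, WORLD_SIZE and LOCAL_RANK for each worker process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Stand-in for a shard of gradients computed on this GPU.
    grads = torch.ones(4, device=f"cuda:{local_rank}") * dist.get_rank()

    # All-reduce: every GPU ends up with the element-wise sum across all ranks.
    # Over NVLink this runs at full interconnect bandwidth between the GPUs.
    dist.all_reduce(grads, op=dist.ReduceOp.SUM)

    print(f"rank {dist.get_rank()}: {grads.tolist()}")
    dist.destroy_process_group()

if __name__ == "__main__":
    main()  # launch with: torchrun --nproc_per_node=<num_gpus> this_script.py
```

As the article notes, the more GPUs participate in each all-reduce, the more the interconnect, rather than the chips themselves, determines how well performance scales.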


Calm down: DeepSeek-R1 is great, but ChatGPT’s product advantage is far from over

Just a week ago — on January 20, 2025 — Chinese AI startup DeepSeek unleashed a new, open-source AI model called R1 that might initially have been mistaken for one of the ever-growing mass of nearly interchangeable rivals that have sprung up since OpenAI debuted ChatGPT (powered by its own GPT-3.5 model, initially) more than two years ago. But that impression quickly proved unfounded. In that short time, DeepSeek's mobile app has rocketed up the charts of the Apple App Store in the U.S., dethroning ChatGPT for the number one spot, and caused a massive market correction as investors dumped stock in formerly hot computer chip makers such as Nvidia, whose graphics processing units (GPUs) have been in high demand for massive superclusters that train new AI models and serve them to customers on an ongoing basis (a modality known as "inference").

Venture capitalist Marc Andreessen, echoing the sentiments of other tech workers, wrote on the social network X last night: "Deepseek R1 is AI's Sputnik moment," comparing it to the pivotal October 1957 launch of the first artificial satellite in history, Sputnik 1, by the Soviet Union, which sparked the "space race" between that country and the U.S. to dominate space travel. Sputnik's launch galvanized the U.S. to invest heavily in research and development of spacecraft and rocketry. While it's not a perfect analogy — heavy investment was not needed to create DeepSeek-R1, quite the contrary (more on this below) — it does seem to signify a major turning point in the global AI marketplace: For the first time, an AI product from China has become the most popular in the world.

But before we jump on the DeepSeek hype train, let's take a step back and examine the reality. As someone who has extensively used OpenAI's ChatGPT — on both web and mobile platforms — and followed AI advancements closely, I believe that while DeepSeek-R1's achievements are noteworthy, it's not time to dismiss ChatGPT or U.S. AI investments just yet. And please note, I am not being paid by OpenAI to say this — I've never taken money from the company and don't plan to.

What DeepSeek-R1 does well

DeepSeek-R1 is part of a new generation of large "reasoning" models that do more than answer user queries: They reflect on their own analysis while producing a response, attempting to catch errors before serving them to the user. And DeepSeek-R1 matches or surpasses OpenAI's own reasoning model, o1, released in September 2024 initially only for ChatGPT Plus and Pro subscribers, in several areas.

For instance, on the MATH-500 benchmark, which assesses high-school-level mathematical problem-solving, DeepSeek-R1 achieved a 97.3% accuracy rate, slightly outperforming OpenAI o1's 96.4%. In terms of coding capabilities, DeepSeek-R1 scored 49.2% on the SWE-bench Verified benchmark, edging out OpenAI o1's 48.9%.

Moreover, DeepSeek-R1 offers substantial financial savings. The model was developed with an investment of under $6 million, a fraction of the expenditure — estimated to be multiple billions — reportedly associated with training models like OpenAI's o1. DeepSeek was essentially forced to become more efficient with scarce and older GPUs because of U.S. export restrictions on sales of the technology to China. Additionally, DeepSeek provides API access at $0.14 per million tokens, significantly undercutting OpenAI's rate of $7.50 per million tokens.
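Those per-token prices compound quickly at application scale. Here is a back-of-the-envelope comparison using the rates quoted above; the monthly volume is a hypothetical workload of ours, not a figure from either company:

```python
# API prices quoted above, in dollars per million tokens.
deepseek_rate = 0.14
openai_rate = 7.50

monthly_tokens = 500_000_000  # hypothetical: 500M tokens/month for a busy app

deepseek_cost = monthly_tokens / 1_000_000 * deepseek_rate
openai_cost = monthly_tokens / 1_000_000 * openai_rate

print(f"DeepSeek:  ${deepseek_cost:,.2f}/month")      # $70.00
print(f"OpenAI o1: ${openai_cost:,.2f}/month")        # $3,750.00
print(f"Ratio: {openai_cost / deepseek_cost:.0f}x")   # ~54x
```

At those published rates, the same workload costs roughly 54 times more on o1, which is the gap driving so much of the current excitement.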
DeepSeek-R1's massive efficiency gains, cost savings and performance equivalent to the top U.S. AI model have caused Silicon Valley and the wider business community to freak out over what appears to be a complete upending of the AI market, geopolitics and the known economics of AI model training.

While DeepSeek's gains are revolutionary, the pendulum is swinging too far toward it right now

There's no denying that DeepSeek-R1's cost-effectiveness is a significant achievement. But let's not forget that DeepSeek itself owes much of its success to U.S. AI innovations, going back to the initial 2017 transformer architecture developed by Google AI researchers (which started the whole LLM craze). DeepSeek-R1 was trained on synthetic data questions and answers, and specifically, according to the paper released by its researchers, on the supervised fine-tuned "dataset of DeepSeek-V3," the company's previous (non-reasoning) model, which was found to have many indicators of being generated with OpenAI's GPT-4o model itself. It seems clear-cut to say that without GPT-4o to provide this data, and without OpenAI's own release of the first commercial reasoning model, o1, back in September 2024, which created the category, DeepSeek-R1 would almost certainly not exist. Furthermore, OpenAI's success required vast amounts of GPU resources, paving the way for breakthroughs that DeepSeek has undoubtedly benefited from. The current investor panic about U.S. chip and AI companies feels premature and overblown.

ChatGPT's vision and image generation capabilities are still hugely important and valuable in workplace and personal settings — DeepSeek-R1 doesn't have any yet

While DeepSeek-R1 has impressed with its visible "chain of thought" reasoning — a kind of stream of consciousness in which the model displays text as it analyzes the user's prompt and seeks to answer it — and with its efficiency in text- and math-based workflows, it lacks several features that make ChatGPT a more robust and versatile tool today.

No image generation or vision capabilities

The official DeepSeek-R1 website and mobile app do let users upload photos and file attachments. But they can only extract text from them using optical character recognition (OCR), one of the earliest computing technologies (dating back to 1959). This pales in comparison to ChatGPT's vision capabilities: A user can upload images without any text whatsoever and have ChatGPT analyze the image, describe it or provide further information based on what it sees and the user's text prompts. ChatGPT allows users to upload photos and can analyze visual material to provide detailed insights or actionable advice. For example, when I needed guidance on repairing my bike or maintaining my air conditioning unit, ChatGPT's ability to process images…


These Yale and Berkeley dropouts just raised $2 million to build an AI assistant that could rival OpenAI

Y Combinator-backed startup Martin AI announced today that it has raised $2 million in seed funding to develop what it claims is a more intuitive and personalized AI assistant that could rival upcoming offerings from OpenAI and Google. The funding round included Pioneer Fund, FoundersX Ventures, Eight Capital, SV Tech Ventures, Sandhill Markets, Splash Capital and notable angel investors, including DoorDash cofounder Andy Fang.

Founded by 19-year-old college dropouts Dawson Chen and Ethan Hou, who left Yale and Berkeley respectively, Martin AI has developed an AI assistant that can be reached through multiple channels, including phone calls, text messages, email and Slack. The assistant manages calendars, email inboxes and to-do lists, and can even make calls or send texts on users' behalf.

A next-generation AI assistant to rival Big Tech

"Consumer AI requires a whole new interface, and we're going to build that up from the ground up, from first principles," said Chen, CEO of Martin AI, in an exclusive interview with VentureBeat. "We're going to iterate really fast. As you can see, none of these big companies have launched. They've been working on agents for a while. We were not afraid to launch really fast."

The startup has been rapidly iterating on its product since launching last summer, recently introducing a web dashboard and a new mobile interface. Unlike traditional AI assistants that rely on voice commands, Martin employs what Chen calls a "custom memory architecture" that allows it to better understand user preferences and context over time. "We think there are really three phases to building [something like Google's] Jarvis or a personal agent," Chen told VentureBeat. "Phase one is letting it follow direct orders… Phase two is following continuous orders over time… Phase three is proactively inferring orders."

Martin AI can handle complex tasks like scheduling, making calls and composing emails across multiple platforms, demonstrating its versatility as a personal assistant. (Credit: Martin AI)

How Martin's custom memory architecture powers proactive AI assistance

The funding comes as tech giants prepare to launch their own AI agents. OpenAI recently announced an assistant called Operator, while Google is developing Jarvis. However, Chen believes Martin's focus on user experience and rapid iteration gives it an advantage. "While they have lots of resources, OpenAI and Google are distracted and risk-averse," Chen explained. "We're scrappy, we ship fast, and are laser focused on the consumer."

Martin AI has attracted more than 10,000 early users to its platform, with a portion subscribing to its paid service. The company plans to use the new funding to expand its engineering team and accelerate product development, particularly around its personalization and proactive assistance capabilities.

The Martin AI dashboard integrates calendar, email and task management into a unified web interface, showing the assistant's daily briefing feature. (Credit: Martin AI)

Silicon Valley veterans bet on AI assistants as the next consumer platform

The startup's vision extends beyond simple task execution. "I'm a big believer in the future of agents — I think every person will have like five to 10 agents in their life five years from now," Chen predicted.
"We want Martin to be the one that's closest to the consumer."

The investment round also included participation from industry veterans like JJ Fliegelman and former Uber executive Manik Gupta, suggesting growing confidence in consumer AI applications despite an otherwise cooling venture capital environment. Martin's approach represents a bold bet that consumers will pay for AI assistance that goes beyond basic voice commands. The company faces significant challenges, not least competition from tech giants and questions around data privacy and security. However, its early traction and focus on user experience suggest there may be room for nimble startups to carve out a space in the emerging AI assistant market.

The service is available now through the company's iOS app and web interface at trymartin.com, with a seven-day free trial for new users.


Open-source revolution: How DeepSeek-R1 challenges OpenAI’s o1 with superior processing, cost efficiency

The AI industry is witnessing a seismic shift with the introduction of DeepSeek-R1, a cutting-edge open-source reasoning model developed by the eponymous Chinese startup DeepSeek. Released on January 20, the model is challenging OpenAI's o1, a flagship AI system, by delivering comparable performance at a fraction of the cost. But how do these models stack up in real-world applications? And what does this mean for enterprises and developers? In this article, we dive deep into hands-on testing, practical implications and actionable insights to help technical decision-makers understand which model best suits their needs.

Real-world implications: Why this comparison matters

The competition between DeepSeek-R1 and OpenAI o1 isn't just about benchmarks — it's about real-world impact. Enterprises increasingly rely on AI for tasks like data analysis, customer service automation, decision-making and coding assistance. The choice between these models can significantly affect cost efficiency, workflow optimization and innovation potential.

Key questions for enterprises:
Can DeepSeek-R1's cost savings justify its adoption over OpenAI o1?
How do these models perform in real-world scenarios like mathematical computation, reasoning-based analysis, financial modeling or software development?
What are the trade-offs between open-source flexibility (DeepSeek-R1) and proprietary robustness (OpenAI o1)?

To answer these questions, we conducted hands-on testing across reasoning, mathematical problem-solving, coding tasks and decision-making scenarios. Here's what we found.

Hands-on testing: How DeepSeek-R1 and OpenAI o1 perform

Question 1: Logical inference
If A = B, B = C, and C ≠ D, what definitive conclusion can be drawn about A and D?
Analysis:
OpenAI o1: Well-structured reasoning with formal statements.
DeepSeek-R1: Equally accurate, more concise presentation.
Processing time: DeepSeek (0.5s) vs. OpenAI (2s).
Winner: DeepSeek-R1 (equal accuracy, 4X faster, more concise).
Metrics: Tokens: DeepSeek (20) vs. OpenAI (42). Cost: DeepSeek ($0.00004) vs. OpenAI ($0.0008).
Key insight: DeepSeek-R1 achieves the same logical clarity with better efficiency, making it ideal for high-volume, real-time applications.

Question 2: Set theory problem
In a room of 50 people, 30 like coffee, 25 like tea and 15 like both. How many people like neither coffee nor tea?
Analysis:
OpenAI o1: Detailed mathematical notation.
DeepSeek-R1: Direct solution with clear steps.
Processing time: DeepSeek (1s) vs. OpenAI (3s).
Winner: DeepSeek-R1 (clearer presentation, 3X faster).
Metrics: Tokens: DeepSeek (40) vs. OpenAI (64). Cost: DeepSeek ($0.00008) vs. OpenAI ($0.0013).
Key insight: DeepSeek-R1's concise approach maintains clarity while improving speed.

Question 3: Mathematical calculation
Calculate the exact value of: √(144) + (15² ÷ 3) – 36.
Analysis:
OpenAI o1: Numbered steps with detailed breakdown.
DeepSeek-R1: Clear line-by-line calculation.
Processing time: DeepSeek (1s) vs. OpenAI (2s).
Winner: DeepSeek-R1 (equal clarity, 2X faster).
Metrics: Tokens: DeepSeek (30) vs. OpenAI (60). Cost: DeepSeek ($0.00006) vs. OpenAI ($0.0012).
Key insight: Both models are accurate; DeepSeek-R1 is more efficient.

Question 4: Advanced mathematics
If x + y = 10 and x² + y² = 50, what are the precise values of x and y?
Analysis:
OpenAI o1: Comprehensive solution with detailed steps.
DeepSeek-R1: Efficient solution with key steps highlighted.
Processing time: DeepSeek (2s) vs. OpenAI (5s).
Winner: Tie (OpenAI better for learning; DeepSeek better for practice).
Metrics: Tokens: DeepSeek (60) vs. OpenAI (134). Cost: DeepSeek ($0.00012) vs. OpenAI ($0.0027).
Key insight: The choice depends on use case — teaching versus practical application. DeepSeek-R1 excels in speed and accuracy for logical and mathematical tasks, making it ideal for industries like finance, engineering and data science.

Question 5: Investment analysis
A company has a $100,000 budget. Investment options: Option A yields a 7% return with 20% risk, while Option B yields a 5% return with 10% risk. Which option maximizes potential gain while minimizing risk?
Analysis:
OpenAI o1: Detailed risk-return analysis.
DeepSeek-R1: Direct comparison with key metrics.
Processing time: DeepSeek (1.5s) vs. OpenAI (4s).
Winner: DeepSeek-R1 (sufficient analysis, 2.7X faster).
Metrics: Tokens: DeepSeek (50) vs. OpenAI (110). Cost: DeepSeek ($0.00010) vs. OpenAI ($0.0022).
Key insight: Both models perform well in decision-making tasks, but DeepSeek-R1's concise, actionable outputs make it more suitable for time-sensitive applications.

Question 6: Efficiency calculation
You have three delivery routes with different distances and time constraints:
Route A: 120 km, 2 hours
Route B: 90 km, 1.5 hours
Route C: 150 km, 2.5 hours
Which route is most efficient?
Analysis:
OpenAI o1: Structured analysis with methodology.
DeepSeek-R1: Clear calculations with a direct conclusion.
Processing time: DeepSeek (1.5s) vs. OpenAI (3s).
Winner: DeepSeek-R1 (equal accuracy, 2X faster).
Metrics: Tokens: DeepSeek (50) vs. OpenAI (112). Cost: DeepSeek ($0.00010) vs. OpenAI ($0.0022).
Key insight: Both are accurate; DeepSeek-R1 is more time-efficient.

Question 7: Coding task
Write a function to find the most frequent element in an array with O(n) time complexity.
Analysis:
OpenAI o1: Well-documented code with explanations.
DeepSeek-R1: Clean code with essential documentation.
Processing time: DeepSeek (2s) vs. OpenAI (4s).
Winner: Depends on use case (DeepSeek for implementation, OpenAI for learning).
Metrics: Tokens: DeepSeek (70) vs. OpenAI (174). Cost: DeepSeek ($0.00014) vs. OpenAI ($0.0035).
Key insight: Both are effective, with different strengths for different needs. DeepSeek-R1's coding proficiency and optimization capabilities make it a strong contender for software development and automation tasks.

Question 8: Algorithm design
Design an algorithm to check if a given number is a perfect palindrome without converting it to a string.
Analysis:
OpenAI o1: Comprehensive solution with detailed explanation.
DeepSeek-R1: Efficient implementation with key points.
Processing time: DeepSeek (2s) vs. OpenAI (5s).
Winner: Depends on context (DeepSeek for implementation, OpenAI for understanding).
Metrics: Tokens: DeepSeek (70) vs. OpenAI (220). Cost: DeepSeek ($0.00014) vs. OpenAI ($0.0044).
Key insight: The choice depends on the primary need — speed versus detail.

Overall performance metrics
Total processing time: DeepSeek (11.5s) vs. OpenAI (28s).
Total tokens: DeepSeek (390) vs. OpenAI (916).
Total cost: DeepSeek ($0.00078) vs. OpenAI ($0.0183).

Recommendations
Production environment
Primary: DeepSeek-R1.
Benefits: Faster processing, lower costs, sufficient accuracy.
Best for: APIs, high-volume processing, real-time applications.
Educational/training
Primary: OpenAI o1.
Alternative: DeepSeek-R1 for practice exercises.
Best for: Detailed explanations, learning new concepts.
Enterprise development
Primary: DeepSeek-R1.
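The math and coding questions above are easy to verify independently. Below is a minimal, self-contained Python sketch — our own code, not output from either model — that checks the arithmetic questions and implements the two coding tasks: an O(n) most-frequent-element function and a numeric palindrome test that avoids string conversion. (Interestingly, all three delivery routes in Question 6 work out to the same 60 km/h average speed under the straightforward distance-over-time metric.)

```python
from collections import Counter

# Question 2: inclusion-exclusion. Coffee-or-tea drinkers = 30 + 25 - 15 = 40,
# so 50 - 40 = 10 people like neither.
assert 50 - (30 + 25 - 15) == 10

# Question 3: sqrt(144) + (15^2 / 3) - 36 = 12 + 75 - 36 = 51.
assert 144 ** 0.5 + (15 ** 2 / 3) - 36 == 51

# Question 4: x + y = 10 and x^2 + y^2 = 50 give xy = ((x+y)^2 - (x^2+y^2))/2 = 25,
# so x and y are roots of t^2 - 10t + 25 = (t - 5)^2, i.e. x = y = 5.
x = y = 5
assert x + y == 10 and x ** 2 + y ** 2 == 50

# Question 6: average speed of each route in km/h; all three come out to 60,
# so by this metric the routes are equally efficient.
speeds = {"A": 120 / 2, "B": 90 / 1.5, "C": 150 / 2.5}
assert set(speeds.values()) == {60.0}

# Question 7: most frequent element in O(n) time via a single counting pass.
def most_frequent(arr):
    counts = Counter(arr)               # one O(n) pass to count occurrences
    return max(counts, key=counts.get)  # O(k) scan over distinct elements

assert most_frequent([1, 3, 2, 3, 4, 3]) == 3

# Question 8: palindrome check without string conversion, by reversing the
# digits arithmetically and comparing with the original number.
def is_palindrome(n):
    if n < 0:
        return False
    original, reversed_n = n, 0
    while n > 0:
        reversed_n = reversed_n * 10 + n % 10  # append the last digit
        n //= 10                               # drop the last digit
    return original == reversed_n

assert is_palindrome(12321) and not is_palindrome(12345)
print("All checks pass.")
```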


OmniHuman: ByteDance’s new AI creates realistic videos from a single photo

ByteDance researchers have developed an AI system that transforms single photographs into realistic videos of people speaking, singing and moving naturally — a breakthrough that could reshape digital entertainment and communications. The new system, called OmniHuman, generates full-body videos that show people gesturing and moving in ways that match their speech, surpassing previous AI models that could only animate faces or upper bodies.

How OmniHuman uses 18,700 hours of training data to create realistic motion

"End-to-end human animation has undergone notable advancements in recent years," the ByteDance researchers wrote in a paper published on arXiv. "However, existing methods still struggle to scale up as large general video generation models, limiting their potential in real applications."

The team trained OmniHuman on more than 18,700 hours of human video data using a novel approach that combines multiple types of inputs — text, audio and body movements. This "omni-conditions" training strategy allows the AI to learn from much larger and more diverse datasets than previous methods.

(Credit: ByteDance)

AI video generation breakthrough shows full-body movement and natural gestures

"Our key insight is that incorporating multiple conditioning signals, such as text, audio and pose, during training can significantly reduce data wastage," the research team explained. The technology marks a significant advance in AI-generated media, demonstrating capabilities that range from creating videos of people delivering speeches to depicting subjects playing musical instruments. In testing, OmniHuman outperformed existing systems across multiple quality benchmarks.

(Credit: ByteDance)

Tech giants race to develop next-generation video AI systems

The development emerges amid intensifying competition in AI video generation, with companies like Google, Meta and Microsoft pursuing similar technologies. ByteDance's breakthrough could give TikTok's parent company an advantage in this rapidly evolving field. Industry experts say such technology could transform entertainment production, educational content creation and digital communications. However, it also raises concerns about potential misuse in creating synthetic media for deceptive purposes.

The researchers will present their findings at an upcoming computer vision conference, although they have not yet specified when or which one.


Cerebras becomes the world’s fastest host for DeepSeek R1, outpacing Nvidia GPUs by 57x

Cerebras Systems announced today that it will host DeepSeek's breakthrough R1 artificial intelligence model on U.S. servers, promising speeds up to 57 times faster than GPU-based solutions while keeping sensitive data within American borders. The move comes amid growing concerns about China's rapid AI advancement and data privacy. The AI chip startup will deploy a 70-billion-parameter version of DeepSeek-R1 running on its proprietary wafer-scale hardware, delivering 1,600 tokens per second — a dramatic improvement over traditional GPU implementations, which have struggled with newer "reasoning" AI models.

Response times of leading AI platforms, measured in seconds: Cerebras achieves the fastest response at just over one second, while Novita's system takes nearly 38 seconds to generate its first output — a critical metric for real-world applications. (Source: Artificial Analysis)

Why DeepSeek's reasoning models are reshaping enterprise AI

"These reasoning models affect the economy," said James Wang, a senior executive at Cerebras, in an exclusive interview with VentureBeat. "Any knowledge worker basically has to do some kind of multi-step cognitive tasks. And these reasoning models will be the tools that enter their workflow." The announcement follows a tumultuous week in which DeepSeek's emergence triggered Nvidia's largest-ever market value loss, nearly $600 billion, raising questions about the chip giant's AI supremacy.

Cerebras' solution directly addresses two key concerns that have emerged: the computational demands of advanced AI models, and data sovereignty. "If you use DeepSeek's API, which is very popular right now, that data gets sent straight to China," Wang explained. "That is one severe caveat that [makes] many U.S. companies and enterprises… not willing to consider [it]."

Cerebras demonstrates dramatic performance advantages in output speed, processing 1,508 tokens per second — nearly six times faster than its closest competitor, Groq, and roughly 100 times faster than traditional GPU-based solutions like Novita. (Source: Artificial Analysis)

How Cerebras' wafer-scale technology beats traditional GPUs at AI speed

Cerebras achieves its speed advantage through a novel chip architecture that keeps entire AI models on a single wafer-sized processor, eliminating the memory bottlenecks that plague GPU-based systems. The company claims its implementation of DeepSeek-R1 matches or exceeds the performance of OpenAI's proprietary models while running entirely on U.S. soil.

The development represents a significant shift in the AI landscape. DeepSeek, founded by former hedge fund executive Liang Wenfeng, shocked the industry by achieving sophisticated AI reasoning capabilities reportedly at just 1% of the cost of U.S. competitors. Cerebras' hosting solution now offers American companies a way to leverage these advances while maintaining data control. "It's actually a nice story that the U.S. research labs gave this gift to the world. The Chinese took it and improved it, but it has limitations because it runs in China, has some censorship problems, and now we're taking it back and running it on U.S. data centers, without censorship, without data retention," Wang said.

Performance benchmarks show DeepSeek-R1 running on Cerebras outperforming both GPT-4o and OpenAI's o1-mini across question answering, mathematical reasoning and coding tasks.
The results suggest Chinese AI development may be approaching or surpassing U.S. capabilities in some areas. (Credit: Cerebras)

U.S. tech leadership faces new questions as AI innovation goes global

The service will be available through a developer preview starting today. While it will initially be free, Cerebras plans to implement API access controls due to strong early demand. The move comes as U.S. lawmakers grapple with the implications of DeepSeek's rise, which has exposed potential limitations in American trade restrictions designed to maintain technological advantages over China. The ability of Chinese companies to achieve breakthrough AI capabilities despite chip export controls has prompted calls for new regulatory approaches.

Industry analysts suggest the development could accelerate a shift away from GPU-dependent AI infrastructure. "Nvidia is no longer the leader in inference performance," Wang noted, pointing to benchmarks showing superior performance from various specialized AI chips. "These other AI chip companies are really faster than GPUs for running these latest models." The impact extends beyond technical metrics: As AI models increasingly incorporate sophisticated reasoning capabilities, their computational demands have skyrocketed. Cerebras argues that its architecture is better suited to these emerging workloads, potentially reshaping the competitive landscape in enterprise AI deployment.
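The throughput figures quoted in this article translate directly into user-perceived latency. A quick back-of-the-envelope sketch: the Cerebras rate comes from the Artificial Analysis figure above, while the Groq and Novita rates are back-derived from the article's "six times" and "100 times" multipliers rather than published numbers, and the response length is an assumed typical reasoning-model answer.

```python
# Output rates in tokens per second.
rates = {
    "Cerebras (wafer-scale)": 1508,          # per Artificial Analysis, quoted above
    "Groq": 1508 / 6,                        # "nearly six times" slower (approximate)
    "GPU-based (e.g. Novita)": 1508 / 100,   # "roughly 100 times" slower (approximate)
}

response_tokens = 1_000  # assumed length of a typical reasoning-model answer

for name, tps in rates.items():
    print(f"{name:>24}: {response_tokens / tps:6.1f} s for {response_tokens} tokens")
```

Under these assumptions, a 1,000-token answer arrives in under a second on Cerebras, around four seconds on Groq, and over a minute on the GPU baseline — the gap Wang is pointing to when he calls node speed a product-defining metric for reasoning models.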


OpenAI’s surprise new o3-powered ‘Deep Research’ mode shows the power of the AI agent era

In case you missed it in favor of the Grammy Awards: OpenAI surprised the world late Sunday evening with the announcement of its new "Deep Research" modality, an AI agent available to ChatGPT Pro subscribers ($200/month) that's designed to save humans hours by researching, well, deeply and expansively across the web on a given topic, and compiling professional-quality reports across specialized domains from business to science, medicine, marketing and more.

Users of ChatGPT Pro (and soon, ChatGPT Plus, Team, Enterprise and Edu) in the U.S. will be able to access Deep Research by clicking the option underneath the prompt entry/compose bar at the bottom of the ChatGPT website and apps. OpenAI CEO Sam Altman described the feature in a series of posts on his personal account on the social network X as "like a superpower; experts on demand!" He added: "It is really good, and can do tasks that would take hours/days and cost hundreds of dollars."

Deep Research builds on OpenAI's o series of reasoning models, specifically leveraging the soon-to-be-released full o3 model (a smaller and less powerful model, o3-mini, was launched on January 31). The full o3 model can analyze vast amounts of information and integrate text, PDFs and images into a cohesive analysis. In a livestream posted to YouTube and available for replay on demand, Mark Chen, OpenAI's head of frontiers research, explained that Deep Research does "multi-step research on the internet. It discovers content, synthesizes content and reasons about this content, adapting its plan as it uncovers more and more information." Chen further highlighted the innovation's importance to OpenAI's vision: "This is core to our artificial general intelligence (AGI) roadmap. Our ultimate aspiration is a model that can uncover and discover new knowledge for itself."

The launch of Deep Research marks the second of OpenAI's official agents, following the launch of its browser- and cursor-controlling Operator earlier this month. As Joshua Achiam, OpenAI's head of mission alignment, wrote on X, both agents can help better define the concept of an "AI agent" — a popular but nebulous term these days among enterprises — well beyond the company or these specific use cases. "I feel like the term 'agent' wandered in the desert for a while," Achiam wrote. "It did not have grounding or examples to point to. But agents like Operator or Deep Research give some shape to this concept. An agent is a general purpose AI that does one or more tool-using workflows for you."

OpenAI's Deep Research achieves new, highest score on 'Humanity's Last Exam' AI benchmark

Deep Research has set new benchmarks for accuracy and reasoning. Isa Fulford, a member of OpenAI's research team, shared in the YouTube livestream that the model achieves "a new high of 26.6% accuracy" on "Humanity's Last Exam," a relatively new AI benchmark designed to be the most difficult for any AI model (or human, for that matter) to complete, covering 3,000 questions across 100 different subjects, such as translating ancient inscriptions on archaeological finds. Moreover, its ability to browse the web, reason dynamically and cite sources precisely sets it apart from earlier AI tools. "The model was trained using end-to-end reinforcement learning on hard browsing and reasoning tasks," Fulford said.
"It learned to plan and execute multi-step trajectories, reacting to real-time information and backtracking when necessary." A standout feature of Deep Research is its capacity to handle tasks that would otherwise take humans hours or even days. During the announcement, Chen explained that "Deep Research generates outputs that resemble a comprehensive, fully cited research paper — something that an analyst or expert in the field might produce."

Applications and use cases

The use cases for Deep Research are as diverse as they are impactful. OpenAI's official X account posted that it was "built for people who do intensive knowledge work in areas like finance, science, policy and engineering and need thorough and reliable research." It also appears valuable for consumers seeking personalized recommendations or conducting detailed product research, according to examples shared by OpenAI on its official Deep Research announcement blog post, which includes a detailed research assessment of the best snowboard for someone to buy. Altman summarized the tool's versatility, writing: "Give it a try on your hardest work task that can be solved just by using the internet and see what happens."

A personal medical success story of Deep Research

Felipe Millon, OpenAI's government go-to-market lead, shared a deeply personal account of how Deep Research affected his family. Writing in a series of posts on X, he described his wife's battle with bilateral breast cancer and how the AI tool became an unexpected ally. "At the end of October, my wife was diagnosed with bilateral breast cancer," wrote Millon. "Overnight, our world turned upside down." After a double mastectomy and chemotherapy, the couple faced a critical decision: whether or not to pursue radiation therapy. The situation was fraught with uncertainty, as even their specialists offered mixed recommendations. "For her specific case, it's completely in a gray area," Millon explained. "We felt stuck."

Having preview access to Deep Research, Millon decided to upload his wife's surgical pathology report and ask whether radiation would be beneficial. "What happened next was mind-blowing," he wrote. "It didn't just confirm what our oncologists mentioned — it went deeper. It cited studies I'd never heard of and adapted when we added details like her age and genetic factors." The specific prompt he used was: "Read the surgical pathology report (attached) containing information about the bilateral breast cancer. Then research[ed] whether radiation would be indicated for this patient after 6 rounds of TCHP chemotherapy, based on the type of breast cancer. I want to understand the pros and cons of radiation for this patient, how likely it would be to reduce chances of recurrence, and whether the benefits outweigh the potential long-term risks." Millon…


How Thomson Reuters and Anthropic built an AI that lawyers actually trust

Thomson Reuters is bringing AI to tax professionals in a big way. The company has partnered with Anthropic to use its Claude AI technology in its tax tools, marking one of the largest AI rollouts in the tax and accounting industry. At the heart of this initiative is CoCounsel, Thomson Reuters' AI platform for legal and tax professionals. The system runs on Amazon's secure cloud infrastructure, ensuring that sensitive client information remains protected while delivering AI-powered insights.

"We combine real expert human knowledge with advanced technology," Joel Hron, CTO at Thomson Reuters, said in an exclusive interview with VentureBeat. "We have experts across many different domains generating content and workflows. For us, AI is a tool to facilitate the distribution of that expertise through our software."

How Thomson Reuters built a tax AI platform using 150 years of professional content

The company has built a comprehensive retrieval-augmented generation (RAG) architecture that connects Claude to Thomson Reuters' vast knowledge base, including content from more than 3,000 subject matter experts and 150 years of professional publications. Rob Greenlee, head of industries at Anthropic, explained the technical approach in an exclusive interview: "Claude's foundation in understanding complex professional domains like law and tax comes from comprehensive training on a diverse range of high-quality texts, including professional and academic content. For work with Thomson Reuters, we've taken several additional steps… We then work closely with Thomson Reuters to optimize Claude's performance through advanced prompting strategies and carefully designed workflows that leverage their authoritative content and domain expertise."

Inside the strategic deployment of multiple AI models for professional services

Thomson Reuters strategically deploys different versions of Claude based on task complexity: Claude 3 Haiku for rapid processing tasks and Claude 3.5 Sonnet for deeper analyses requiring detailed insights. Early results show significant efficiency gains. "Customers are reporting transformative efficiency gains with CoCounsel," said Hron. "Professionals are not only saving time but also elevating the level of work they focus on, maintaining quality while delivering more strategic value to their clients." Security remains paramount in the implementation. Amazon Bedrock provides what Hron called "a robust and battle-tested cloud infrastructure that adheres to our enterprise-grade security standards throughout the entire life cycle."

Enterprise AI deployment sets new standard for security and professional trust

The collaboration between Thomson Reuters and Anthropic represents a new model for enterprise AI deployment, combining advanced AI capabilities with domain expertise and secure infrastructure. "What makes this partnership particularly valuable is the combination of Anthropic's advanced AI capabilities with Thomson Reuters' deep domain expertise and authoritative content," said Greenlee. Looking ahead, Thomson Reuters plans to expand its use of Claude, exploring agent frameworks for complex tax workflows and computer vision capabilities to help editorial teams curate content more efficiently. "We've been vocal about our AI investment as a strategic part of our products going forward," said Hron.
"Our editorial workforce spends significant time building and curating content — we see tremendous potential to accelerate these processes with Anthropic's computer vision and tool use capabilities."

The implementation comes as tax and accounting professionals increasingly adopt AI tools to streamline their work. Thomson Reuters' approach could serve as a blueprint for other enterprises looking to deploy AI while maintaining professional standards and data security.

Correction: Feb. 11, 2025: An earlier version of this article misstated Thomson Reuters' use of Claude AI. The technology is implemented specifically for tax services within CoCounsel, not for legal services.
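Thomson Reuters has not published its pipeline, but the RAG-plus-model-routing pattern the article describes is straightforward to sketch. The snippet below is a minimal illustration under our own assumptions, not the company's code: the retrieve function and the routing rule are hypothetical stand-ins, while the client calls and model names come from Anthropic's public Python SDK.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def retrieve(query: str) -> list[str]:
    """Hypothetical stand-in for a search over a curated knowledge base."""
    return ["[Excerpt 1: relevant tax guidance...]", "[Excerpt 2: ...]"]

def answer(query: str, deep_analysis: bool = False) -> str:
    # Route by task complexity: a fast model for quick lookups,
    # a stronger model for analyses that need detailed insight.
    model = "claude-3-5-sonnet-20241022" if deep_analysis else "claude-3-haiku-20240307"
    context = "\n\n".join(retrieve(query))
    response = client.messages.create(
        model=model,
        max_tokens=1024,
        system="Answer strictly from the provided excerpts and cite them.",
        messages=[{
            "role": "user",
            "content": f"Excerpts:\n{context}\n\nQuestion: {query}",
        }],
    )
    return response.content[0].text

print(answer("Is this equipment purchase deductible?", deep_analysis=True))
```

Grounding generation in retrieved, authoritative excerpts rather than the model's parametric memory is what makes this pattern attractive in domains like tax, where citations and provenance matter.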


Clever architecture over raw compute: DeepSeek shatters the ‘bigger is better’ approach to AI development

The AI narrative has reached a critical inflection point. The DeepSeek breakthrough — achieving state-of-the-art performance without relying on the most advanced chips — proves what many at NeurIPS in December had already declared: AI's future isn't about throwing more compute at problems, it's about reimagining how these systems work with humans and our environment.

As a Stanford-educated computer scientist who has witnessed both the promise and perils of AI development, I see this moment as even more transformative than the debut of ChatGPT. We're entering what some call a "reasoning renaissance." OpenAI's o1, DeepSeek's R1 and others are moving past brute-force scaling toward something more intelligent — and doing so with unprecedented efficiency.

This shift couldn't be more timely. During his NeurIPS keynote, former OpenAI chief scientist Ilya Sutskever declared that "pretraining will end" because, while compute power grows, we're constrained by finite internet data. DeepSeek's breakthrough validates this perspective: The Chinese company's researchers achieved performance comparable to OpenAI's o1 at a fraction of the cost, demonstrating that innovation, not just raw computing power, is the path forward.

Advanced AI without massive pre-training

World models are stepping up to fill this gap. World Labs' recent $230 million raise to build AI systems that understand reality the way humans do parallels DeepSeek's approach, whose R1 model exhibits "Aha!" moments, stopping to re-evaluate problems just as humans do. These systems, inspired by human cognitive processes, promise to transform everything from environmental modeling to human-AI interaction.

We're seeing early wins: Meta's recent update to its Ray-Ban smart glasses enables continuous, contextual conversations with AI assistants without wake words, alongside real-time translation. This isn't just a feature update — it's a preview of how AI can enhance human capabilities without requiring massive pre-trained models.

However, this evolution comes with nuanced challenges. While DeepSeek has dramatically reduced costs through innovative training techniques, this efficiency breakthrough could paradoxically lead to increased overall resource consumption — a phenomenon known as Jevons Paradox, in which technological efficiency improvements often result in increased rather than decreased resource use. In AI's case, cheaper training could mean more models being trained by more organizations, potentially increasing net energy consumption.

But DeepSeek's innovation is different: By demonstrating that state-of-the-art performance is possible without cutting-edge hardware, the company isn't just making AI more efficient — it's fundamentally changing how we approach model development. This shift toward clever architecture over raw computing power could help us escape the Jevons Paradox trap, as the focus moves from "how much compute can we afford?" to "how intelligently can we design our systems?" As UCLA professor Guy Van Den Broeck notes, "The overall cost of language model reasoning is certainly not going down." The environmental impact of these systems remains substantial, pushing the industry toward more efficient solutions — exactly the kind of innovation DeepSeek represents.

Prioritizing efficient architectures

This shift demands new approaches.
DeepSeek's success shows that the future isn't about building bigger models — it's about building smarter, more efficient ones that work in harmony with human intelligence and environmental constraints. Meta's chief AI scientist Yann LeCun envisions future systems spending days or weeks thinking through complex problems, much as humans do. DeepSeek's R1 model, with its ability to pause and reconsider approaches, represents a step toward this vision. While resource-intensive, this approach could yield breakthroughs in climate change solutions, healthcare innovations and beyond. But as Carnegie Mellon's Ameet Talwalkar wisely cautions, we must question anyone claiming certainty about where these technologies will lead us.

For enterprise leaders, this shift presents a clear path forward. We need to prioritize efficient architecture, one that can:

Deploy chains of specialized AI agents rather than single massive models (sketched at the end of this piece).
Invest in systems that optimize for both performance and environmental impact.
Build infrastructure that supports iterative, human-in-the-loop development.

Here's what excites me: DeepSeek's breakthrough proves that we're moving past the era of "bigger is better" and into something far more interesting. With pretraining hitting its limits and innovative companies finding new ways to achieve more with less, an incredible space is opening up for creative solutions. Smart chains of smaller, specialized agents aren't just more efficient — they're going to help us solve problems in ways we never imagined. For startups and enterprises willing to think differently, this is our moment to have fun with AI again, to build something that actually makes sense for both people and the planet.

Kiara Nirghin is an award-winning Stanford technologist, bestselling author and co-founder of Chima.
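The "chains of specialized agents" recommendation above is easy to sketch in outline. The following is a hypothetical illustration of the pattern, not any company's implementation: call_model is a stand-in for whatever LLM API you use, and the model names are placeholders.

```python
# Hypothetical sketch of an agent chain: a lightweight planner model decomposes
# a task, then a specialized solver model works through each step in sequence.
def call_model(model: str, prompt: str) -> str:
    """Stand-in for any LLM API call; wire up your provider's client here."""
    raise NotImplementedError

def run_chain(task: str) -> str:
    # Stage 1: a small, cheap planner breaks the task into concrete steps.
    plan = call_model("planner-small", f"List numbered steps to accomplish: {task}")

    # Stage 2: a specialized solver executes each step, carrying forward context,
    # instead of asking one massive model to do everything in a single shot.
    context = ""
    for step in plan.splitlines():
        if step.strip():
            context += call_model(
                "solver-specialized",
                f"Context so far:\n{context}\n\nComplete this step: {step}",
            ) + "\n"
    return context
```

The design choice mirrors the essay's argument: two right-sized models doing focused work can match a much larger one on many tasks while consuming far less compute per call.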
