
Why ‘prosocial AI’ must be the framework for designing, deploying and governing AI

As AI pervades every sphere of modern life, the central challenge facing business leaders, policymakers and innovators is no longer whether to adopt intelligent systems but how. In a world marked by escalating polarization, resource depletion, eroding trust in institutions and volatile information landscapes, the critical imperative is to engineer AI so that it contributes meaningfully and sustainably to human and planetary well-being.

Prosocial AI — a framework of design, deployment and governance principles that ensure AI is thoughtfully tailored, trained, tested and targeted to uplift people and the planet — is more than a moral stance or PR veneer. It is a strategic approach to positioning AI within a broader ecology of intelligence that values collective flourishing over narrow optimization.

The ABCD of AI’s potential: From gloom to glory

The rationale for prosocial AI emerges from four intertwined realms — agency, bonding, climate and division (ABCD). Each domain highlights the dual character of AI: It can either intensify existing dysfunctions or act as a catalyst for regenerative, inclusive solutions.

Agency: Too often, AI-driven platforms rely on addictive loops and opaque recommender systems that erode user autonomy. Prosocial AI, by contrast, can activate agency by revealing the provenance of its suggestions, offering meaningful user controls and respecting the multifaceted nature of human decision-making. It is not merely about “consent” or “transparency” as abstract buzzwords; it is about designing AI interactions that acknowledge human complexity — the interplay of cognition, emotion, bodily experience and social context — and enabling individuals to navigate their digital environments without succumbing to manipulation or distraction.

Bonding: Digital technologies can either fracture societies into echo chambers or serve as bridges that connect diverse people and ideas. Prosocial AI applies nuanced linguistic and cultural models to identify shared interests, highlight constructive contributions and foster empathy across boundaries. Instead of fueling outrage for attention, it helps participants discover complementary perspectives, strengthening communal bonds and reinforcing the delicate social fabrics that hold societies together.

Climate: AI’s relationship with the environment is fraught with tension. AI can optimize supply chains, enhance climate modeling and support environmental stewardship. However, the computational intensity of training large models often entails a considerable carbon footprint. A prosocial lens demands designs that balance these gains against ecological costs — adopting energy-efficient architectures, transparent lifecycle assessments and ecologically sensitive data practices. Rather than treat the planet as an afterthought, prosocial AI anchors climate considerations as a cardinal priority: AI must not only advise on sustainability but must itself be sustainable.

Division: The misinformation cascades and ideological rifts that define our era are not an inevitable byproduct of technology, but a result of design choices that privilege virality over veracity. Prosocial AI counters this by embedding cultural and historical literacy into its processes, respecting contextual differences and providing fact-checking mechanisms that enhance trust. Rather than homogenizing knowledge or imposing top-down narratives, it nurtures informed pluralism, making digital spaces more navigable, credible and inclusive.

Double literacy: Integrating AI and NI

Realizing this vision depends on cultivating what we might call “double literacy.” On one side is AI literacy: mastering the technical intricacies of algorithms, understanding how biases emerge from data and establishing rigorous accountability and oversight mechanisms. On the other side is natural intelligence (NI) literacy: a comprehensive, embodied understanding of human cognition and emotion (brain and body), personal identity (self) and cultural embeddedness (society).

This NI literacy is not a soft skill set perched on the margins of innovation; it is fundamental. Human intelligence is shaped by neurobiology, physiology, interoception, cultural narratives and community ethics — an intricate tapestry that transcends reductive notions of “rational actors.” By bringing NI literacy into dialogue with AI literacy, developers, decision-makers and regulators can ensure that digital architectures honor our multidimensional human reality. This holistic approach fosters systems that are ethically sound, context-sensitive and capable of complementing rather than constraining human capacities.

AI and NI in synergy: Prosocial AI goes beyond zero-sum thinking

The popular imagination often pits machines against humans in a zero-sum contest. Prosocial AI challenges this dichotomy. Consider the beauty of complementarity in healthcare: AI excels at pattern recognition, sifting through vast troves of medical images to detect anomalies that might elude human specialists. Physicians, in turn, draw on their embodied cognition and moral instincts to interpret results, communicate complex information and consider each patient’s broader life context. The outcome is not simply more efficient diagnostics; it is more humane, patient-centered care.

Similar paradigms can transform decision-making in law, finance, governance and education. By integrating the precision of AI with the nuanced judgment of human experts, we might transition from hierarchical command-and-control models to collaborative intelligence ecosystems. Here, machines handle complexity at scale and humans provide the moral vision and cultural fluency necessary to ensure that these systems serve authentic public interests.

Building a prosocial infrastructure

To embed prosocial AI at the core of our future, we need a concerted effort across all sectors:

Industry and tech companies: Innovators can prioritize “human-in-the-loop” designs and explicitly reward metrics tied to well-being rather than engagement at any cost. Instead of designing AI to hook users, they can build systems that inform, empower and uplift — measured by improvements in health outcomes, educational attainment, environmental sustainability or social cohesion. Example: The Partnership on AI provides frameworks for prosocial innovation, helping guide developers toward responsible practices.

Civil society and NGOs: Community groups and advocacy organizations can guide the development and deployment of AI, testing new tools in real-world contexts. They can bring ethnically, linguistically and culturally diverse perspectives to the design table, ensuring that the resulting AI systems serve a broad range of human experiences and needs.

Educational institutions: Schools and universities should integrate double literacy into their curricula while reinforcing critical thinking, ethics and cultural studies. By nurturing AI and NI literacy, educational bodies can help ensure that future generations are both skilled in machine learning (ML) and deeply grounded in human values. Example:


OpenAI: Extending model ‘thinking time’ helps combat emerging cyber vulnerabilities

Typically, developers focus on reducing inference time — the period between when AI receives a prompt and provides an answer — in order to deliver insights faster. But when it comes to adversarial robustness, OpenAI researchers say: Not so fast. They propose that increasing the amount of time a model has to “think” — inference-time compute — can help build up defenses against adversarial attacks.

The company used its own o1-preview and o1-mini models to test this theory, launching a variety of static and adaptive attack methods — image-based manipulations, intentionally providing incorrect answers to math problems and overwhelming models with information (“many-shot jailbreaking”). They then measured the probability of attack success based on the amount of computation the model used at inference.

“We see that in many cases, this probability decays — often to near zero — as the inference-time compute grows,” the researchers write in a blog post. “Our claim is not that these particular models are unbreakable — we know they are not — but that scaling inference-time compute yields improved robustness for a variety of settings and attacks.”

From simple Q/A to complex math

Large language models (LLMs) are becoming ever more sophisticated and autonomous — in some cases essentially taking over computers for humans to browse the web, execute code, make appointments and perform other tasks autonomously — and as they do, their attack surface becomes wider and ever more exposed.

Yet adversarial robustness continues to be a stubborn problem, with progress in solving it still limited, the OpenAI researchers point out — even as it becomes increasingly critical as models take on more actions with real-world impacts.

“Ensuring that agentic models function reliably when browsing the web, sending emails or uploading code to repositories can be seen as analogous to ensuring that self-driving cars drive without accidents,” they write in a new research paper. “As in the case of self-driving cars, an agent forwarding a wrong email or creating security vulnerabilities may well have far-reaching real-world consequences.”

To test the robustness of o1-mini and o1-preview, researchers tried a number of strategies. First, they examined the models’ ability to solve both simple math problems (basic addition and multiplication) and more complex ones from the MATH dataset (which features 12,500 questions from mathematics competitions).

They then set “goals” for the adversary: getting the model to output 42 instead of the correct answer; to output the correct answer plus one; or to output the correct answer times seven. Using a neural network to grade the outputs, the researchers found that increased “thinking” time allowed the models to calculate correct answers.

They also adapted the SimpleQA factuality benchmark, a dataset of questions intended to be difficult for models to resolve without browsing. Researchers injected adversarial prompts into web pages that the AI browsed and found that, with higher compute times, the models could detect inconsistencies and improve factual accuracy.

Source: Arxiv

Ambiguous nuances

In another method, researchers used adversarial images to confuse models; again, more “thinking” time improved recognition and reduced error. Finally, they tried a series of “misuse prompts” from the StrongREJECT benchmark, designed so that victim models must answer with specific, harmful information. This helped test the models’ adherence to content policy. However, while increased inference time did improve resistance, some prompts were able to circumvent defenses.

Here, the researchers call out the differences between “ambiguous” and “unambiguous” tasks. Math, for instance, is undoubtedly unambiguous — for every problem x, there is a corresponding ground truth. However, for more ambiguous tasks like misuse prompts, “even human evaluators often struggle to agree on whether the output is harmful and/or violates the content policies that the model is supposed to follow,” they point out.

For example, if an abusive prompt seeks advice on how to plagiarize without detection, it is unclear whether an output merely providing general information about methods of plagiarism is detailed enough to support harmful actions.

Source: Arxiv

“In the case of ambiguous tasks, there are settings where the attacker successfully finds ‘loopholes,’ and its success rate does not decay with the amount of inference-time compute,” the researchers concede.

Defending against jailbreaking, red-teaming

In performing these tests, the OpenAI researchers explored a variety of attack methods.

One is many-shot jailbreaking, or exploiting a model’s disposition to follow few-shot examples. Adversaries “stuff” the context with a large number of examples, each demonstrating an instance of a successful attack. Models with higher compute times were able to detect and mitigate these more frequently and successfully.

Soft tokens, meanwhile, allow adversaries to directly manipulate embedding vectors. While increasing inference time helped here, the researchers point out that there is a need for better mechanisms to defend against sophisticated vector-based attacks.

The researchers also performed human red-teaming attacks, with 40 expert testers looking for prompts to elicit policy violations. The red-teamers executed attacks at five levels of inference-time compute, specifically targeting erotic and extremist content, illicit behavior and self-harm. To help ensure unbiased results, they did blind and randomized testing and also rotated trainers.

In a more novel method, the researchers performed a language-model program (LMP) adaptive attack, which emulates the behavior of human red-teamers who rely heavily on iterative trial and error. In a looping process, attackers received feedback on previous failures, then used this information for subsequent attempts and prompt rephrasing. This continued until they finally achieved a successful attack or performed 25 iterations without a successful attack.

“Our setup allows the attacker to adapt its strategy over the course of multiple attempts, based on descriptions of the defender’s behavior in response to each attack,” the researchers write.

Exploiting inference time

In the course of their research, OpenAI found that attackers are also actively exploiting inference time. One of these methods they dubbed “think less” — adversaries essentially tell models to reduce compute, thus increasing their susceptibility to error. Similarly, they identified a failure mode in reasoning models that they
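The paper’s core measurement — attack success probability as a function of inference-time compute — lends itself to a simple evaluation harness. Below is a minimal, hypothetical sketch of that loop; the model call and the grader are stand-ins for illustration, not OpenAI’s actual tooling:

```python
# Hypothetical evaluation harness -- not OpenAI's tooling. It mirrors the
# paper's setup: run attacks at several inference-time compute budgets and
# record how often the adversary's goal is met.

COMPUTE_BUDGETS = [1, 4, 16, 64, 256]  # relative "thinking" budgets

def query_model(prompt: str, budget: int) -> str:
    # Stand-in for a reasoning-model API call that accepts a compute budget.
    # Replace with a real client; nothing here reflects an actual API.
    return f"answer-with-budget-{budget}"

def attack_succeeded(response: str) -> bool:
    # Stand-in grader, e.g. "did the model output 42 instead of the correct
    # answer, or the correct answer plus one / times seven?"
    return False

def success_rate(prompts: list[str], budget: int, trials: int = 10) -> float:
    wins = sum(
        attack_succeeded(query_model(p, budget))
        for p in prompts
        for _ in range(trials)
    )
    return wins / (len(prompts) * trials)

if __name__ == "__main__":
    adversarial_prompts = ["<math problem + injected adversary goal>"]
    for budget in COMPUTE_BUDGETS:
        # The paper's finding: this probability decays, often toward zero, as
        # the budget grows -- except on ambiguous tasks with "loopholes."
        print(budget, success_rate(adversarial_prompts, budget))
```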


DeepSeek unleashes ‘Janus Pro 7B’ vision model amidst AI stock bloodbath, igniting fresh fears of Chinese tech dominance

DeepSeek, the fast-growing Chinese AI company, is shaking up global technology yet again. Just as the rapid rise of the company’s frontier AI models triggered a selloff of U.S. artificial intelligence stocks, the company launched a brand-new product: Janus Pro 7B, an open-source vision-based AI model. (You can try a demo right here.) This unexpected release from DeepSeek intensifies investor worries about China’s growing power in AI and further pressures American tech companies.

The company released Janus Pro 7B today as U.S. AI stocks plunged — timing that appears deliberate and designed to highlight the Hangzhou-based firm’s challenge to Silicon Valley. DeepSeek’s latest launch follows its release last week of the frontier R1 large language model. Industry experts were largely impressed by DeepSeek-R1’s efficient and strong performance. The R1 model immediately raised concerns that China is quickly advancing in AI and could disrupt the current leaders in the field.

Markets reacted quickly. Nvidia, a key maker of AI chips, saw its stock price fall sharply. Other major AI companies also experienced stock drops as investors reassessed the competitive landscape with DeepSeek emerging as a strong new player.

NEWS: DeepSeek just dropped ANOTHER open-source AI model, Janus-Pro-7B. It’s multimodal (can generate images) and beats OpenAI’s DALL-E 3 and Stable Diffusion across GenEval and DPG-Bench benchmarks. This comes on top of all the R1 hype. The 🐋 is cookin’ pic.twitter.com/yCmDQoke0f — Rowan Cheung (@rowancheung) January 27, 2025

Efficiency is the new king: Why Janus Pro 7B changes everything

With Janus Pro 7B, DeepSeek is now extending its reach beyond language processing into the critical domain of computer vision. According to the technical paper released with the model, Janus Pro 7B is engineered for efficiency and versatility, excelling in a range of visual tasks from generating photorealistic images to performing complex visual reasoning.

“Janus [Pro] is a series of efficient vision models,” the DeepSeek research team states in their paper, “aiming to achieve a balance between performance and computational cost. We present Janus-Pro-7B, a 7 billion parameter vision model…achieving state-of-the-art performance on a wide range of vision tasks.”

This emphasis on efficiency is a crucial differentiator for enterprise customers. Unlike some of the largest and most resource-intensive AI models, Janus Pro 7B, with its 7 billion parameters, is designed to deliver high-level performance without demanding vast computational resources. This efficiency could significantly lower the barrier to entry for businesses looking to integrate advanced vision AI into their operations. For companies ranging from startups to multinational corporations, the prospect of deploying sophisticated visual intelligence without incurring exorbitant infrastructure costs is increasingly attractive.

The research paper further details the breadth of the model’s capabilities, stating, “Janus-Pro-7B demonstrates strong performance in various vision tasks, including image generation, visual question answering, and image captioning.” This multi-faceted functionality is particularly appealing for businesses seeking to leverage AI across diverse applications. Imagine a global retailer utilizing Janus Pro 7B to automate the creation of marketing visuals, respond to customer inquiries about product appearance and generate detailed, visually rich descriptions for online product listings — all powered by a single, streamlined AI model, as in the sketch below. The potential for streamlining workflows, enhancing customer engagement and improving operational efficiency is substantial.

Charts released by DeepSeek show performance metrics for its new Janus Pro 7B vision AI. (Left) Janus Pro 7B achieves high average performance with fewer parameters than many other multimodal models. (Right) The model also scores top accuracy on text-to-image generation benchmarks, outperforming competitors. (Credit: DeepSeek)

DeepSeek’s one-two punch: R1 language model followed by vision AI intensifies market anxiety and competitive pressure

The timing of the Janus Pro 7B launch amplifies its impact. Coming on the heels of the R1 model and the ensuing market turbulence, it reinforces the narrative of DeepSeek as an innovator capable of disrupting the established order in AI. Last week’s initial market jitters, triggered by R1’s release on a holiday Monday, escalated into full-blown panic over the weekend as leaked benchmarks and online demonstrations highlighted the model’s impressive capabilities. And today, as the tech stock sell-off intensified, DeepSeek introduced Janus Pro 7B, further amplifying the sense of urgency and competitive pressure felt by U.S. AI companies.

Markets are reacting viscerally to DeepSeek, not just to another AI competitor. They sense a rule change. For too long, AI’s story was relentless scaling: bigger models, more parameters, higher costs. This favored giants, mostly in the West. DeepSeek, with Janus Pro 7B and R1, breaks this mold, showing that nimble, efficient models can outperform much larger ones. It’s an architectural shift. The advantage in AI may shift from server farm size to smart innovation and broad distribution.

Janus Pro 7B’s open-source nature amplifies this disruption. Like open-source movements before it, and unlike closed proprietary models, it increases access to advanced AI. Enterprises outside Big Tech gain cutting-edge AI without vendor lock-in or high fees. And for AI powerhouses, DeepSeek poses a direct threat: Can their proprietary, premium models survive free, high-quality alternatives? The market selloff suggests investors doubt it.

For enterprise technology decision-makers, the message is increasingly clear: The AI landscape is undergoing a rapid transformation, and DeepSeek represents a significant new force. Ignoring the implications of Janus Pro 7B, and DeepSeek’s broader strategic approach, would be a critical oversight. Businesses must now assess the opportunities and challenges presented by this new wave of AI innovation, even amidst ongoing market volatility and geopolitical uncertainties. The era of unchallenged U.S. AI leadership may be drawing to a close, and the global economy is entering a more dynamic and potentially disruptive phase of AI-driven competition.
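To make the retailer scenario concrete, here is a hypothetical sketch of one model serving all three tasks. The `JanusProClient` class and its methods are invented for illustration and do not correspond to DeepSeek’s actual API; a real integration would wrap the published deepseek-ai/Janus-Pro-7B weights:

```python
# Hypothetical sketch of the retailer scenario: one multimodal model behind
# three tasks. JanusProClient is invented for illustration only; it is NOT
# DeepSeek's actual interface.
class JanusProClient:
    def generate_image(self, prompt: str) -> bytes: ...        # text-to-image
    def answer(self, question: str, image_path: str) -> str: ...  # visual Q&A
    def caption(self, image_path: str, style: str) -> str: ...    # captioning

model = JanusProClient()

# 1. Text-to-image: create a marketing visual.
banner = model.generate_image("studio photo of a red waterproof hiking boot")

# 2. Visual question answering: handle a customer query about a product photo.
answer = model.answer("Does this jacket have a hood?", "jacket_041.jpg")

# 3. Image captioning: draft a detailed listing description.
caption = model.caption("jacket_041.jpg", style="detailed e-commerce description")
```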


No retraining needed: Sakana’s new AI model changes how machines learn

Researchers at Sakana AI, an AI research lab focusing on nature-inspired algorithms, have developed a self-adaptive language model that can learn new tasks without the need for fine-tuning. Called Transformer² (Transformer-squared), the model uses mathematical tricks to align its weights with user requests during inference. This is the latest in a series of techniques that aim to improve the abilities of large language models (LLMs) at inference time, making them increasingly useful for everyday applications across different domains.

Dynamically adjusting weights

Usually, configuring LLMs for new tasks requires a costly fine-tuning process, during which the model is exposed to new examples and its parameters are adjusted. A more cost-effective approach is “low-rank adaptation” (LoRA), in which a small subset of the model’s parameters relevant to the target task is identified and modified during fine-tuning. After training and fine-tuning, the model’s parameters remain frozen, and the only way to repurpose it for new tasks is through techniques such as few-shot and many-shot learning.

In contrast to classic fine-tuning, Transformer-squared uses a two-step approach to dynamically adjust its parameters during inference. First, it analyzes the incoming request to understand the task and its requirements; then it applies task-specific adjustments to the model’s weights to optimize its performance for that specific request.

“By selectively adjusting critical components of the model weights, our framework allows LLMs to dynamically adapt to new tasks in real time,” the researchers write in a blog post published on the company’s website.

Transformer-squared (source: Sakana AI blog)

How Sakana’s Transformer-squared works

The core ability of Transformer-squared is dynamically adjusting critical components of its weights at inference. To do this, it has to first identify the key components that can be tweaked during inference. Transformer-squared does this through singular-value decomposition (SVD), a linear algebra trick that breaks down a matrix into three other matrices that reveal its inner structure and geometry. SVD is often used to compress data or to simplify machine learning models.

When applied to the LLM’s weight matrix, SVD obtains a set of components that roughly represent the model’s different abilities, such as math, language understanding or coding. In their experiments, the researchers found that these components could be tweaked to modify the model’s abilities in specific tasks.

To systematically leverage these findings, they developed a process called singular value finetuning (SVF). At training time, SVF learns a set of vectors from the SVD components of the model. These vectors, called z-vectors, are compact representations of individual skills and can be used as knobs to amplify or dampen the model’s ability in specific tasks.

At inference time, Transformer-squared uses a two-pass mechanism to adapt the LLM for unseen tasks. First, it examines the prompt to determine the skills required to tackle the problem (the researchers propose three different techniques for determining the required skills). In the second stage, Transformer-squared configures the z-vectors corresponding to the request and runs the prompt through the model with the updated weights. This enables the model to provide a tailored response to each prompt.
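The weight-adjustment idea can be illustrated in a few lines of linear algebra. Below is a minimal NumPy sketch, not Sakana’s implementation: it decomposes a single weight matrix with SVD and rescales its singular values with a z-vector, which is the essence of the SVF mechanism described above (the z-values here are random for illustration, whereas Transformer-squared learns them with RL):

```python
import numpy as np

# Minimal sketch of the SVF idea -- not Sakana's code. A weight matrix is
# decomposed with SVD, and a per-component z-vector rescales the singular
# values, amplifying or dampening the "skills" they roughly correspond to.
rng = np.random.default_rng(0)
W = rng.standard_normal((512, 512))          # one attention/MLP weight matrix

U, S, Vt = np.linalg.svd(W, full_matrices=False)

# z-vector: one learned scale per singular component. In Transformer-squared
# these are trained per task; here they are random for illustration.
z = 1.0 + 0.1 * rng.standard_normal(S.shape)

W_adapted = U @ np.diag(S * z) @ Vt          # task-adapted weights

# At inference, the two-pass mechanism first picks which z-vector matches the
# prompt's required skill, then runs the prompt through the adapted weights.
print(np.allclose(W, U @ np.diag(S) @ Vt))   # True: SVD reconstructs W exactly
```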
Transformer-squared training and inference (source: arXiv)

Transformer-squared in action

The researchers applied Transformer-squared to Llama-3 and Mistral LLMs and compared them to LoRA on various tasks, including math, coding, reasoning and visual question-answering. Transformer-squared outperforms LoRA on all benchmarks while using fewer parameters. It is also notable that, unlike Transformer-squared, LoRA models can’t adapt their weights at inference time, which makes them less flexible.

Another intriguing finding is that the knowledge extracted from one model can be transferred to another. For example, the z-vectors obtained from Llama models could be applied to Mistral models. The results were not on par with creating z-vectors from scratch for the target model, and the transfer was likely possible only because the two models had similar architectures. But it suggests the possibility of learning generalized z-vectors that can be applied to a wide range of models.

Transformer-squared (SVF in the table) vs base models and LoRA (source: arXiv)

“The path forward lies in building models that dynamically adapt and collaborate with other systems, combining specialized capabilities to solve complex, multi-domain problems,” the researchers write. “Self-adaptive systems like Transformer² bridge the gap between static AI and living intelligence, paving the way for efficient, personalized and fully integrated AI tools that drive progress across industries and our daily lives.”

Sakana AI has released the code for training the components of Transformer-squared on GitHub.

Inference-time tricks

As enterprises explore different LLM applications, the past year has seen a noticeable shift toward developing inference-time techniques. Transformer-squared is one of several approaches that enable developers to customize LLMs for new tasks at inference time without the need to retrain or fine-tune them. Titans, an architecture developed by researchers at Google, tackles the problem from a different angle, giving language models the ability to learn and memorize new information at inference time. Other techniques focus on enabling frontier LLMs to leverage their increasingly long context windows to learn new tasks without retraining. With enterprises owning the data and knowledge specific to their applications, advances in inference-time customization techniques will make LLMs much more useful.


OpenAI Stargate is a $500B bet: America’s AI Manhattan Project or costly dead end?

In case you missed it amid the flurry of executive orders coming out of the White House in the days since President Trump returned to office this week for his second, non-consecutive term, the single largest investment in AI infrastructure was just announced yesterday afternoon. Known as “the Stargate Project,” it’s a $500 billion (half a trillion) effort from OpenAI, SoftBank, Oracle and MGX to form a new venture that will build “new AI infrastructure for OpenAI in the United States,” and, as OpenAI put it in its announcement post on the social network X, to “support the re-industrialization of the United States… also provide a strategic capability to protect the national security of America and its allies.”

The end goal: to build artificial general intelligence (AGI), or AI that outperforms humans on most economically valuable work, which has been OpenAI’s goal from the start — and ultimately, artificial superintelligence, or AI even smarter than humans can comprehend.

Flanked by Trump himself, OpenAI cofounder and CEO Sam Altman appeared at the White House alongside SoftBank CEO Masayoshi “Masa” Son and Oracle executive chairman Larry Ellison, saying: “I’m thrilled we get to do this in the United States of America. I think this will be the most important project of this era — and as Masa said, for AGI to get built here, to create hundreds of thousands of jobs, to create a new industry centered here — we wouldn’t be able to do this without you, Mr. President.” Son called it “the beginning of our Golden Age.”

Several high-profile technology companies have partnered with the initiative to build and operate the infrastructure. Arm, Microsoft, Nvidia, Oracle and OpenAI are among the key partners contributing their expertise and resources to the effort. Oracle, Nvidia and OpenAI, in particular, will collaborate closely on developing the computing systems essential for the project’s success.

While some see the Stargate Project as a transformative investment in the future of AI, critics argue that it is a costly overreach, unnecessary in light of the rapid rise of leaner, open-source reasoning AI models like China’s DeepSeek R1, which was just released earlier this week under a permissive MIT License — allowing it to be downloaded, fine-tuned or retrained, and used freely in commercial and noncommercial projects — and which matches or outperforms OpenAI’s own o1 reasoning models on key third-party benchmarks. The debate has become a lightning rod for competing visions of AI development and the geopolitical dynamics shaping the race for technological supremacy.

A transformational leap forward?

For many advocates, the Stargate Project represents an unparalleled commitment to innovation and national competitiveness, on par with prior eras of large infrastructure spending such as the U.S. highway system of the Eisenhower era (though, of course, that was built with public funds — not private ones, as in this case). On X, AI commentator and former engineer David Shapiro said, “America just won geopolitics for the next 50 years with Project Stargate,” and likened the initiative to historic achievements like the Manhattan Project and NASA’s Apollo program. He argued that this level of investment in artificial intelligence is not only necessary but inevitable, given the stakes.

Shapiro described the project as a strategic move to ensure that America maintains technological supremacy, framing the investment as critical to solving global problems, driving economic growth and securing national security. “When America decides something matters and backs it with this kind of money? It happens. Period,” he declared.

In terms of practical applications, advocates point to the Stargate Project’s promise of AI-enabled breakthroughs in areas like cancer research, personalized medicine and pandemic prevention. Oracle’s Ellison has specifically highlighted the potential to develop new personalized mRNA-based vaccines and cancer treatments, revolutionizing healthcare.

A waste of (as yet un-procured) moneys?

Despite this optimism, critics are challenging the project on multiple fronts, from its financial feasibility to its strategic direction. Elon Musk, head of the Department of Government Efficiency (DOGE) under President Donald Trump’s second administration and an OpenAI cofounder, cast doubt on the project’s funding. Musk, who has since launched his own AI company, xAI, and its Grok language model family, posted on his social network, X: “They don’t actually have the money,” alleging that SoftBank — Stargate’s primary financial backer — has secured “well under $10B.”

In response, Altman replied this morning: “[I] genuinely respect your accomplishments and think you are the most inspiring entrepreneur of our time,” later writing that Musk was “wrong, as you surely know. want to come visit the first site already under way? this is great for the country. i realize what is great for the country isn’t always what’s optimal for your companies, but in your new role i hope you’ll mostly put [US flag emoji] first.”

Others have questioned the timing and strategic rationale behind the initiative. Tech entrepreneur and commentator Arnaud Bertrand took to X to contrast OpenAI’s infrastructure-heavy approach with the leaner, more decentralized strategy employed by China’s High-Flyer Capital Management, creators of the new, highest-performing open-source large language model (LLM), DeepSeek-R1, released earlier this week. Bertrand noted that DeepSeek has achieved performance parity with OpenAI’s latest models at just 3% of the cost, using far smaller GPU clusters and data centers. He described the divergence as a collision of philosophies, with OpenAI betting on massive centralized infrastructure while DeepSeek pursues democratized, cost-efficient AI development.

“A fundamental question remains,” Bertrand wrote on X. “What will OpenAI customers be paying for exactly if much cheaper DeepSeek matches their latest models’ performance? Having spent an indecent amount of money on data centers isn’t a customer benefit in and of itself.” Bertrand further argued that OpenAI’s focus on infrastructure may represent outdated thinking. “This $500B bet on infrastructure may be OpenAI fighting the last war,” he warned, pointing to DeepSeek’s success as evidence that innovation and agility — not scale — are the key drivers of modern AI


Tech leaders respond to the rapid rise of DeepSeek

If you hadn’t heard, there’s a new AI star in town: DeepSeek, the subsidiary of Chinese quantitative analysis (quant) firm High-Flyer Capital Management, has sent shockwaves throughout Silicon Valley and the wider world with its release earlier this week of a new open-source large reasoning model, DeepSeek R1, which matches OpenAI’s most powerful available model, o1 — at a fraction of the cost to users and to the company itself (when training it).

The advent of DeepSeek R1 has already reshuffled a consistently topsy-turvy, fast-moving, intensely competitive market for new AI models — in previous months, OpenAI jockeyed with Anthropic and Google for the most powerful proprietary models available, while Meta Platforms often came in with “close enough” open-source rivals. The difference this time is that the company behind the hot model is based in China, the geopolitical “frenemy” of the U.S., whose tech sector was widely viewed, until this moment, as inferior to Silicon Valley’s.

As such, it has caused no shortage of hand-wringing and existentialism from U.S. and Western-bloc techies, who are suddenly doubting OpenAI and the general big-tech strategy of throwing more money and more compute (graphics processing units, or GPUs, the powerful gaming chips typically used to train AI models) at the problem of inventing ever more powerful models.

Yet some Western tech leaders have had a largely positive public response to DeepSeek’s rapid ascent. Marc Andreessen, a co-inventor of the pioneering Mosaic web browser, cofounder of the Netscape browser company and current general partner at the famed Andreessen Horowitz (a16z) venture capital firm, posted on X today: “Deepseek R1 is one of the most amazing and impressive breakthroughs I’ve ever seen — and as open source, a profound gift to the world [robot emoji, salute emoji].”

Yann LeCun, the chief AI scientist for Meta’s Fundamental AI Research (FAIR) division, posted on his LinkedIn account: “To people who see the performance of DeepSeek and think: ‘China is surpassing the US in AI.’ You are reading this wrong. The correct reading is: ‘Open source models are surpassing proprietary ones.’ DeepSeek has profited from open research and open source (e.g. PyTorch and Llama from Meta). They came up with new ideas and built them on top of other people’s work. Because their work is published and open source, everyone can profit from it. That is the power of open research and open source.”

And even Mark “Zuck” Zuckerberg, Meta’s founder and CEO, seemed to seek to counter the rise of DeepSeek with his own post on Facebook, promising that a new version of Meta’s open-source AI model family Llama would be “the leading state of the art model” when it is released sometime this year. As he put it:

“This will be a defining year for AI. In 2025, I expect Meta AI will be the leading assistant serving more than 1 billion people, Llama 4 will become the leading state of the art model, and we’ll build an AI engineer that will start contributing increasing amounts of code to our R&D efforts. To power this, Meta is building a 2GW+ datacenter that is so large it would cover a significant part of Manhattan. We’ll bring online ~1GW of compute in ’25 and we’ll end the year with more than 1.3 million GPUs. We’re planning to invest $60-65B in capex this year while also growing our AI teams significantly, and we have the capital to continue investing in the years ahead. This is a massive effort, and over the coming years it will drive our core products and business, unlock historic innovation, and extend American technology leadership. Let’s go build!”

He even shared a graphic showing the 2-gigawatt datacenter mentioned in his post overlaid on Manhattan.

Clearly, even as he espouses a commitment to open-source AI, Zuck is not convinced that DeepSeek’s approach of optimizing for efficiency while leveraging far fewer GPUs than major labs is the right one for Meta, or for the future of AI. But with U.S. companies raising and/or spending record sums on new AI infrastructure that many experts have noted depreciates rapidly (due to hardware/chip and software advancements), the question remains which vision of the future will win out to become the dominant AI provider for the world. Or maybe a multiplicity of models will always coexist, each with a smaller market share? Stay tuned, because this competition is getting closer and fiercer than ever.


How Harness is ‘harnessing’ agentic AI to help improve enterprise incident response with automated data collection and playbooks

Incident response — the process of responding to system disruptions and slowdowns — is a critical aspect of IT operations. It’s also an activity that traditionally involves a lot of manual, time-consuming processes. That’s a challenge Harness is taking aim at with a new incident response service. The technology enters early access today as a module on the company’s eponymous platform.

Harness got its start in 2017 with an initial focus on continuous integration/continuous delivery (CI/CD) automation for DevOps. In the years since, the company has expanded into a software delivery platform with multiple modules. In fall 2024, Harness broke into agentic AI, initially to help support software development. Now the company is extending that same core agentic AI foundation to incident response. The new solution also benefits from licensed capabilities originally developed by development workflow vendor Transposit. Tina Huang, cofounder of Transposit, along with many members of her team, joined Harness in September 2024.

The goal with Harness Incident Response is to accelerate the mean time to resolution (MTTR) for an incident. “When you think about what DevOps platforms have been up until now, it’s largely been about helping you structure those deployments,” Huang told VentureBeat. “I think the very natural place to go after that is, ‘How do I hand-hold your deployments after they’ve hit production?’”

How Harness enables autonomous incident response with agentic AI

At the core of Harness’ Incident Response module is the company’s AI agent architecture, first introduced in September 2024. Jyoti Bansal, Harness CEO and cofounder, explained to VentureBeat that its AI agents are designed to provide autonomous assistance, going beyond just alerting engineers to incidents.

Traditional incident response technology uses an approach known as a playbook. IT teams, often working with site reliability engineers (SREs), define playbooks that lay out step-by-step processes for recovering from different types of service disruptions. Rather than relying solely on predefined playbooks, Harness’ AI agents can suggest actions, identify potential root causes and even create new playbooks on the fly. “The agentic workflow is suggesting the actions that should be taken,” Bansal said.

Huang explained that the AI agents execute multiple steps that are critical to helping organizations respond faster to incidents. Even before a playbook can run, a certain amount of triage needs to occur, Bansal explained. General triage can, for instance, identify which services are impacted or determine both upstream and downstream dependencies that will also be affected by the incident. Harness’ system has agents that are aware of and plugged into multiple systems, and that can collect information automatically, including information and discussion from Slack channels. That information can then help other agents alert humans and provide autonomous assistance.

While the system has a high degree of automation, Huang emphasized that humans are still in the loop. But instead of a human being alerted to a problem and then having to figure out if there is a playbook — and, if so, how to run it — the system recommends the remediation and the human only needs to approve it.
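As a rough illustration of that human-in-the-loop pattern, here is a minimal sketch. The triage and recommendation functions are invented stand-ins for the agents described above, not Harness’ actual module or API:

```python
# Hypothetical human-in-the-loop incident flow -- a sketch of the pattern
# described above, not Harness' actual implementation or API.
from dataclasses import dataclass

@dataclass
class Incident:
    service: str
    description: str
    slack_context: list[str]  # messages collected automatically by agents

def triage(incident: Incident) -> list[str]:
    # Stand-in for agents that map the blast radius before any playbook runs.
    return [incident.service, f"upstream-of-{incident.service}"]

def recommend_playbook(incident: Incident, impacted: list[str]) -> str:
    # Stand-in for the agent that proposes (or drafts) a remediation playbook.
    return f"restart-and-rollback:{incident.service}"

def handle(incident: Incident, approve) -> None:
    impacted = triage(incident)
    playbook = recommend_playbook(incident, impacted)
    # The human no longer hunts for the right playbook; they just approve it.
    if approve(playbook, impacted):
        print(f"executing {playbook}")

handle(
    Incident("checkout", "p95 latency spike", ["alert in #ops"]),
    approve=lambda pb, services: input(f"Run {pb} for {services}? [y/N] ") == "y",
)
```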
Incident response requires more than just technology

The Harness Incident Response module can run on its own, meaning organizations don’t need to be running any other Harness modules. Bansal expects, however, that the combined offering — which could enable integration with multiple other workflows, including DevOps or chaos engineering — could be beneficial.

Chaos engineering is the process of injecting unexpected variables and events into an application to see how it responds. Harness has had a chaos engineering module as part of its platform since 2022. Huang explained that, as part of the incident response platform, an organization can run “fire drills” alongside the chaos engineering module to test different scenarios. “Incidents happen infrequently, and they are often the unfortunate result of something that you didn’t catch earlier on,” said Huang. “We want to enable a very proactive approach to incident response.”

How enterprises will benefit from agentic AI-driven incident response

One Harness customer using the incident response module is Tyler Technologies, which develops software for the public sector. The company has been using the Harness platform for continuous deployment, cloud cost management and feature flag development. The addition of incident response could help solve a key challenge the company faces, explained Jeff Green, Tyler Technologies’ CTO.

“Our primary challenge is really integrating all the operational data, metrics and processes, then correlating them into a single unified approach to managing incidents and automating our response to them,” he told VentureBeat. “Our portfolio includes over 100 products built on different technologies using a wide variety of devops tools and platforms.”

The incident response capability will complement operations Tyler Technologies is already running with Harness — for example, being able to correlate deployments with incidents, or feature flags with incidents. “We think the AI capabilities being infused into the product will save a lot of time by helping us with root cause analysis, identifying ways to mitigate or resolve incidents, and with incident prevention,” said Green. “Much of this work today is done by humans pulling data from multiple sources, scouring logs and application performance monitoring (APM) data and looking for patterns, all tasks that AI is better suited to.”

The ROI of agentic AI for incident response

Another Harness customer evaluating the incident response module is Omar Alwattar, senior DevOps engineer at InStride. Alwattar told VentureBeat that his firm has been using the Harness Continuous Delivery module. He noted that when it comes to incident response, his organization has two key challenges: preventative monitoring and root cause identification. The new Harness incident response tool is interesting to his company, he said, as it will help with faster issue identification and automated fix suggestions.

“In terms of ROI, the most significant impact would be on downtime reduction, as it directly influences SLA adherence and customer satisfaction,” Alwattar said. “Additionally, by automating aspects of incident response, our


DeepSeek R1’s bold bet on reinforcement learning: How it outpaced OpenAI at 3% of the cost

DeepSeek R1’s Monday release has sent shockwaves through the AI community, disrupting assumptions about what’s required to achieve cutting-edge AI performance. Matching OpenAI’s o1 at just 3%-5% of the cost, this open-source model has not only captivated developers but also challenged enterprises to rethink their AI strategies.

The model has rocketed to the top-trending model downloaded on HuggingFace (109,000 times, as of this writing) as developers rush to try it out and seek to understand what it means for their AI development. Users are commenting that DeepSeek’s accompanying search feature (which you can find at DeepSeek’s site) is now superior to competitors like OpenAI and Perplexity, and is rivaled only by Google’s Gemini Deep Research.

The implications for enterprise AI strategies are profound: With reduced costs and open access, enterprises now have an alternative to costly proprietary models like OpenAI’s. DeepSeek’s release could democratize access to cutting-edge AI capabilities, enabling smaller organizations to compete effectively in the AI arms race.

This story focuses on exactly how DeepSeek managed this feat, and what it means for the vast number of users of AI models. For enterprises developing AI-driven solutions, DeepSeek’s breakthrough challenges assumptions of OpenAI’s dominance — and offers a blueprint for cost-efficient innovation. It’s the “how” DeepSeek did what it did that should be the most educational here.

DeepSeek’s breakthrough: Moving to pure reinforcement learning

In November, DeepSeek made headlines with its announcement that it had achieved performance surpassing OpenAI’s o1, but at the time it offered only a limited R1-lite-preview model. With Monday’s full release of R1 and the accompanying technical paper, the company revealed a surprising innovation: a deliberate departure from the conventional supervised fine-tuning (SFT) process widely used in training large language models (LLMs).

SFT, a standard step in AI development, involves training models on curated datasets to teach step-by-step reasoning, often referred to as chain-of-thought (CoT). It is considered essential for improving reasoning capabilities. However, DeepSeek challenged this assumption by skipping SFT entirely, opting instead to rely on reinforcement learning (RL) to train the model. This bold move forced DeepSeek-R1 to develop independent reasoning abilities, avoiding the brittleness often introduced by prescriptive datasets. While some flaws emerged — leading the team to reintroduce a limited amount of SFT during the final stages of building the model — the results confirmed the fundamental breakthrough: Reinforcement learning alone could drive substantial performance gains.

The company got much of the way using open source — a conventional and unsurprising way

First, some background on how DeepSeek got to where it did. DeepSeek, a 2023 spin-off from Chinese hedge fund High-Flyer Quant, began by developing AI models for its proprietary chatbot before releasing them for public use. Little is known about the company’s exact approach, but it quickly open-sourced its models, and it’s extremely likely that the company built upon the open projects produced by Meta, for example the Llama model and the ML library PyTorch.

To train its models, High-Flyer Quant secured over 10,000 Nvidia GPUs before U.S. export restrictions, and reportedly expanded to 50,000 GPUs through alternative supply routes, despite trade barriers. This pales in comparison to leading AI labs like OpenAI, Google and Anthropic, which operate with more than 500,000 GPUs each. DeepSeek’s ability to achieve competitive results with limited resources highlights how ingenuity and resourcefulness can challenge the high-cost paradigm of training state-of-the-art LLMs.

Despite speculation, DeepSeek’s full budget is unknown

DeepSeek reportedly trained its base model — called V3 — on a $5.58 million budget over two months, according to Nvidia engineer Jim Fan. While the company hasn’t divulged the exact training data it used (side note: critics say this means DeepSeek isn’t truly open-source), modern techniques make training on web and open datasets increasingly accessible. Estimating the total cost of training DeepSeek-R1 is challenging. While running 50,000 GPUs suggests significant expenditures (potentially hundreds of millions of dollars), precise figures remain speculative.

What’s clear, though, is that DeepSeek has been very innovative from the get-go. Last year, reports emerged about some initial innovations it was making, around things like Mixture of Experts and Multi-Head Latent Attention.

How DeepSeek-R1 got to the “aha moment”

The journey to DeepSeek-R1’s final iteration began with an intermediate model, DeepSeek-R1-Zero, which was trained using pure reinforcement learning. By relying solely on RL, DeepSeek incentivized this model to think independently, rewarding both correct answers and the logical processes used to arrive at them.

This approach led to an unexpected phenomenon: The model began allocating additional processing time to more complex problems, demonstrating an ability to prioritize tasks based on their difficulty. DeepSeek’s researchers described this as an “aha moment,” where the model itself identified and articulated novel solutions to challenging problems (see screenshot below). This milestone underscored the power of reinforcement learning to unlock advanced reasoning capabilities without relying on traditional training methods like SFT.

Source: DeepSeek-R1 paper. Don’t let this graphic intimidate you. The key takeaway is the red line, where the model literally used the phrase “aha moment.” Researchers latched onto this as a striking example of the model’s ability to rethink problems in an anthropomorphic tone. The researchers said it was their own “aha moment” as well.

The researchers conclude: “It underscores the power and beauty of reinforcement learning: rather than explicitly teaching the model on how to solve a problem, we simply provide it with the right incentives, and it autonomously develops advanced problem-solving strategies.”

More than RL

However, it’s true that the model needed more than just RL. The paper goes on to describe how, despite the RL creating unexpected and powerful reasoning behaviors, the intermediate DeepSeek-R1-Zero model did face some challenges, including poor readability and language mixing (starting in Chinese and switching over to English, for example). So only then did the team decide to create a new model, which would become the final DeepSeek-R1 model. This model, again based on the V3
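The training recipe described above — rewarding correct, verifiable answers alongside a legible reasoning format — can be sketched as a rule-based reward function. The snippet below is a simplified illustration consistent with the R1 paper’s description, not DeepSeek’s actual code; the exact tags and reward weighting are assumptions:

```python
import re

# Simplified rule-based reward in the spirit of DeepSeek-R1-Zero's RL setup:
# reward correct final answers plus adherence to a reasoning format. This is
# an illustration of the idea, not DeepSeek's actual reward code.
THINK_FORMAT = re.compile(r"<think>.+</think>\s*<answer>(.+)</answer>", re.DOTALL)

def reward(completion: str, ground_truth: str) -> float:
    match = THINK_FORMAT.search(completion)
    format_reward = 1.0 if match else 0.0      # did it reason, then answer?
    if match and match.group(1).strip() == ground_truth:
        accuracy_reward = 1.0                   # verifiable, e.g. math answers
    else:
        accuracy_reward = 0.0
    return accuracy_reward + 0.1 * format_reward  # weighting is an assumption

# An RL loop (e.g. policy gradients over sampled completions) would then
# reinforce completions that score highly -- no SFT reasoning traces needed.
print(reward("<think>2 + 2 = 4</think> <answer>4</answer>", "4"))
```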


Meet OpenAI’s Operator, an AI agent that uses the web to book you dinner reservations, order tickets, compile grocery lists and more

OpenAI has unveiled Operator, its first semi-autonomous AI agent, designed to “operate” a web browser on a user’s behalf, much like a person would. The agent uses the cursor to point and click, types on its own, browses the web and performs actions on various websites, such as booking restaurant reservations through OpenTable and assembling orders on Instacart and DoorDash — instead of being confined to the ChatGPT interface or OpenAI’s application programming interface (API).

“This product is the beginning of our step into agents,” said CEO and cofounder Sam Altman in a demo livestreamed on the company’s YouTube channel today at 1 pm ET. OpenAI president and fellow cofounder Greg Brockman wrote on X: “2025 is the year of agents.”

The preview, now available to paying U.S. subscribers of OpenAI’s ChatGPT Pro ($200 per month) plan, aims to demonstrate the potential of agentic AI while gathering critical feedback to refine its capabilities.

Operator doesn’t take over your web browser, though. Instead, you visit a separate, new website — operator.chatgpt.com — and are presented with a prompt input box similar to ChatGPT’s. Typing a request into this box — “find me tickets for the LA Lakers game tonight” — will trigger Operator to open a separate, virtual browser running in the cloud on OpenAI’s servers. Then the agent can execute tasks like filling out forms, managing online reservations, even booking tickets to sporting events and concerts, and navigating other common workflows. The user watches the cursor move on its own in the cloud-based browser in real time.

If the agent encounters a problem, it will stop and message the user via a text output, similar to ChatGPT’s responses. Below the virtual browser, the user will also see suggestions of actions Operator can take on their behalf. Yet the user can take control at any time — similar to semi-autonomous driving systems in modern cars. Operator also asks the user to input their own payment credentials when it reaches a purchase screen on another website. Finally, users can save particular workflows that they wish to use going forward and start them again.

Operator is powered by what OpenAI calls computer-using agent (CUA) technology, a new variant of GPT-4o trained specifically to use computers.

Bridging AI and GUIs

Operator stands apart from other automation tools by mimicking human interaction with graphical user interfaces (GUIs). Instead of relying on specialized APIs, the system leverages screenshots for visual input and uses virtual mouse and keyboard actions to complete tasks. The underlying CUA model combines GPT-4o’s vision capabilities with reinforcement learning, enabling the agent to perceive, reason and act on screen. This approach allows Operator to handle diverse tasks, including ecommerce browsing, travel planning and even repetitive tasks like creating playlists or managing shopping lists.

Notable benchmarks illustrate its effectiveness:

• 87% success rate on WebVoyager, a test of live website navigation
• 58.1% success rate on WebArena, which simulates real-world ecommerce and content management scenarios

But there’s already tough competition: Just yesterday, Chinese tech firm ByteDance (TikTok’s parent company) launched its own AI agent for controlling web browsers and performing actions on a user’s behalf.
Called UI-TARS, it’s totally open-source and boasts similarly impressive benchmark performance (though it does not appear to have been compared directly on the same benchmarks). That means OpenAI’s Operator will need to be significantly better or more reliable to justify the relatively high ($200/month) cost of accessing it through ChatGPT Pro subscriptions.

Already being tested in enterprise web navigation use cases

OpenAI is partnering with several businesses to ensure Operator meets real-world needs. Companies including Instacart, DoorDash and Etsy are already testing the technology for use cases ranging from grocery delivery to personalized shopping. Brett Keller, CEO of Priceline, remarked on its utility for travel planning, calling it “a significant step in making travel more seamless and personalized.” For public-sector applications, the City of Stockton is exploring ways to use Operator to simplify civic engagement. Jamil Niazi, the city’s director of information technology, highlighted AI’s potential to make enrolling in services easier for residents.

Yet there are limitations. Tech publication Every got an early preview, has been testing it for the past week and found that: “One of the peculiarities of Operator’s design is that it doesn’t use your browser. Instead, it uses a browser in one of OpenAI’s data centers that you can watch and interact with remotely. The upside of this design decision is that you can use Operator wherever and whenever — for example, on any mobile device. The downside is that many sites like Reddit already block AI agents from browsing so they can’t be accessed by Operator. In this research preview mode, Operator is also blocked by OpenAI from accessing certain resource-intensive sites like Figma or competitor-owned sites like YouTube for performance or legal reasons.”

Safety measures

Given its ability to act on users’ behalf, Operator has been developed with robust safety features:

• User control: Operator requests confirmation for sensitive actions, such as making purchases or sending emails.
• Watch mode: Ensures user supervision for critical tasks, particularly on sensitive sites like email or financial platforms.
• Misuse prevention: The system is trained to refuse harmful requests and includes safeguards against adversarial attacks, such as malicious prompts embedded in websites.

OpenAI has also incorporated features to protect user privacy, including options to clear browsing data and opt out of data sharing for model improvements.

Enterprise edition coming

OpenAI envisions a broader role for Operator in both individual and enterprise settings. Over time, the company plans to expand access to Plus, Team and Enterprise users, eventually integrating Operator into ChatGPT. There are also plans to make the underlying CUA technology available via an API, enabling developers to create custom computer-using agents. Despite its potential, Operator remains a work in progress. OpenAI has been transparent about its limitations, such as difficulties with complex interfaces or unfamiliar workflows.
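The perceive-reason-act loop behind CUA-style agents can be sketched locally. The snippet below is an illustrative toy, not OpenAI’s Operator (which runs in a hosted cloud browser, not on your machine): `propose_action` is a stand-in for the vision-language model, while the mouse and keyboard calls use the real pyautogui library:

```python
import pyautogui  # real library for programmatic mouse/keyboard control

# Minimal sketch of a computer-using-agent loop in the spirit of CUA:
# screenshot in, action out, executed with a virtual mouse and keyboard.
# propose_action is a stand-in for the vision-language model; nothing here
# is OpenAI's actual Operator implementation.
def propose_action(screenshot, goal: str) -> dict:
    # A real CUA-style model would return something like
    # {"type": "click", "x": 412, "y": 230} or {"type": "type", "text": "..."}
    raise NotImplementedError("stand-in for the model call")

def run_agent(goal: str, max_steps: int = 20) -> None:
    for _ in range(max_steps):
        screenshot = pyautogui.screenshot()      # visual input only: no site APIs
        action = propose_action(screenshot, goal)
        if action["type"] == "click":
            pyautogui.click(action["x"], action["y"])
        elif action["type"] == "type":
            pyautogui.write(action["text"])
        elif action["type"] == "confirm":
            # Safety gate: sensitive steps (purchases, emails) defer to the user.
            if input(f"Allow '{action['summary']}'? [y/N] ") != "y":
                break
        elif action["type"] == "done":
            break

# run_agent("find me tickets for the LA Lakers game tonight")
# (commented out: propose_action above is only a stub)
```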


Forget Nvidia: Ndea wants to build AI that keeps improving on its own with ‘no bottlenecks in sight’

François Chollet, a former Google engineer and the creator of the widely used Python deep learning framework Keras, has co-founded Ndea, a new AI research and science lab, alongside Mike Knoop, co-founder of Zapier.

In a post on the startup's new website, the founders explain their goal of combining intuitive pattern recognition, enabled by deep learning, with formal reasoning through what they call "guided program synthesis." They say this fusion will allow AI systems to adapt and innovate far beyond current task-specific applications, ultimately leading to artificial general intelligence (AGI) — defined loosely throughout the AI community as machine intelligence that can outperform human beings at most economically valuable cognitive tasks. As they write: "We need computers that can pose problems and explore new territory, not just apply known solutions. We need computers that can innovate. The path to AGI is not through incremental improvements to existing methods."

The duo hasn't yet said whether they've received outside funding for the venture or are bootstrapping it with their own money. The launch comes several months after former OpenAI co-founder and chief scientist Ilya Sutskever — who reportedly led the briefly successful, yet ultimately reversed, internal coup against his fellow co-founder Sam Altman — announced a startup focused on developing "Safe Superintelligence" with $1 billion in private backing.

Beyond deep learning

While existing deep learning systems are impressive, Chollet and Knoop argue that they are fundamentally constrained by their reliance on large datasets and their inability to adapt efficiently to new tasks. They believe program synthesis is the key to overcoming these limitations. Unlike traditional deep learning, which interpolates between data points, program synthesis searches for discrete programs that explain the data — a method that allows far greater generalization from far fewer data points (a toy example appears at the end of this piece). Combining deep learning's intuitive capabilities with the rigorous reasoning of program synthesis, they argue, could produce a new paradigm for AI research. "Ndea's mission is to operationalize AGI to realize unprecedented scientific progress for the benefit of all current and future generations," they note.

Building a "Factory for Scientific Advancement"

Ndea's long-term vision goes beyond the creation of AGI. The lab aims to act as a "factory for rapid scientific advancement," capable of solving both known and unknown challenges — from tackling current frontiers like autonomous vehicles and sustainable energy to accelerating entirely new discoveries. Chollet added that their research direction has the potential to unlock breakthroughs and redefine the boundaries of human knowledge. As he wrote in a thread on X: "If we're successful, we won't stop at AI. With this technology in hand, we want to tackle every scientific problem it can solve. We see accelerating scientific progress as the most exciting application of AI." According to Chollet, this progress hinges on developing AI that can learn as efficiently as humans and continue to improve over time without bottlenecks.
While acknowledging that success is not guaranteed, Chollet emphasized the importance of pursuing this ambitious goal, stating on X: "We believe we have a small but real chance of achieving a breakthrough — creating AI that can keep improving over time with no bottlenecks in sight."

A new research focus for AGI

Program synthesis, the cornerstone of Ndea's research, is still a relatively young field. Chollet likened its current state to where deep learning was in 2012. However, he noted that its potential is increasingly recognized by frontier AI labs, even if most see it as only a small component of what's needed for AGI. Ndea, in contrast, considers program synthesis as important as deep learning and has made it central to its approach.

The lab is actively recruiting a globally distributed team of researchers and engineers to build what it describes as the world's most "talent-dense program synthesis team." The company operates as a fully remote organization and is looking for candidates with strong technical expertise, particularly in translating mathematical concepts into code.

Founders with strong track records

Chollet and Knoop bring extensive experience to Ndea. At Google, Chollet worked on core research into deep learning and AI systems, gaining insight into the limitations of existing models and opportunities for improvement. His contributions include not only Keras but also the ARC-AGI benchmark, a widely used metric for measuring progress toward AGI. He is also the author of the book Deep Learning with Python and has been recognized among Time's "100 Most Influential People in AI."

Knoop co-founded Zapier, the world's largest AI automation company, where he led engineering and product development as well as the company's early adoption of AI technologies. He is also credited with pioneering best practices for globally distributed teams. Both Chollet and Knoop are co-founders of the ARC Prize Foundation, a nonprofit focused on advancing open AGI research.

Future visions rooted in ancient tradition

Ndea derives its name from the Greek concepts ennoia (intuitive understanding) and dianoia (logical reasoning), reflecting the lab's goal of merging deep learning and program synthesis. By operationalizing AGI, Ndea hopes to compress centuries of scientific progress into decades, or even years. While acknowledging the uncertainty and challenges of pursuing AGI, Chollet and Knoop remain optimistic about their approach. They see AGI as the gateway to addressing humanity's most pressing challenges and to uncovering entirely new opportunities for discovery.
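As promised above, here is a toy illustration of the distinction the founders draw: searching a space of discrete programs for one that exactly explains a handful of examples, rather than fitting weights to interpolate between data points. The three-primitive DSL and the examples are invented for illustration — Ndea has not published any details of its methods.

```python
# Enumerative program synthesis over a tiny DSL: find the shortest
# composition of primitives that exactly explains all input-output examples.
from itertools import product

PRIMITIVES = {
    "inc": lambda x: x + 1,   # x + 1
    "dbl": lambda x: x * 2,   # x * 2
    "sq":  lambda x: x * x,   # x squared
}


def synthesize(examples, max_depth=3):
    """Return the shortest primitive pipeline consistent with every example, or None."""
    for depth in range(1, max_depth + 1):
        for names in product(PRIMITIVES, repeat=depth):   # enumerate candidate programs
            def program(x, names=names):
                for name in names:                        # apply primitives left to right
                    x = PRIMITIVES[name](x)
                return x
            if all(program(i) == o for i, o in examples): # must explain *all* examples exactly
                return " -> ".join(names)
    return None


# Two data points pin down f(x) = (x + 1) * 2; a statistical fit would need far more.
print(synthesize([(1, 4), (3, 8)]))   # prints "inc -> dbl"
```

Because the search returns a symbolic program rather than a statistical fit, two examples suffice to identify the rule, and the result generalizes to any input — the "greater generalization from far fewer data points" the founders describe. Guided program synthesis, as Ndea frames it, would replace this brute-force enumeration with a learned model (deep learning's "intuition") that prioritizes which programs to try.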
