VentureBeat

Anthropic’s fastest model, Claude 3.5 Haiku, now generally available

Anthropic has officially rolled out its Claude 3.5 Haiku model to all users through the Claude chatbot on the web and mobile apps, as spotted by AI power users on X. Previously limited to developers accessing it via Anthropic’s API following its launch in October 2024, this smaller, faster model has garnered attention for its ability to outperform larger models on key benchmarks while maintaining a competitive price point.

According to the third-party benchmarking organization Artificial Analysis, Claude 3.5 Haiku “has a lower latency compared to average, taking 0.80s to receive the first token (TTFT),” yet “is slower compared to average, with an output speed of 65.1 tokens per second.”

The release — which hasn’t been officially announced — comes on the heels of major updates from Anthropic’s AI rivals OpenAI and Google, which have also shipped new models to general availability in their chatbots as the year winds down, namely OpenAI’s o1 and o1-mini models and Google’s Gemini 2.0. The question for Anthropic is whether customers will be impressed enough with Claude 3.5 Haiku’s performance to sign up for its Pro tier — or to keep using it over these other advanced, fast rivals.

Claude 3.5 Haiku is accessible through the Claude chatbot

As the fastest and most cost-effective model in Anthropic’s lineup, Claude 3.5 Haiku excels in real-time tasks such as processing large datasets, analyzing financial documents, and generating outputs from long-context information. It features a 200,000-token context window — more than the 128,000-token window on OpenAI’s GPT-4 and GPT-4o — allowing it to handle extensive input with ease.

On the Claude chatbot, Haiku brings functionality that enhances its versatility. Users can analyze images and file attachments, making it useful for multimedia tasks and workflows involving large document sets. Haiku also integrates with Claude Artifacts, the interactive sidebar first introduced in June 2024. Artifacts provides a dedicated workspace for manipulating and refining AI-generated content in real time, including running full apps. In my test of Artifacts with Haiku this morning, it was able to code a fully playable version of Pong in less than a minute.

Despite its strengths, Haiku has limitations. It does not currently support web browsing or image generation, both of which are offered by competitors like OpenAI’s GPT-4o and GPT-4. Additionally, my brief test of it this morning showed it failed the “Strawberry Test,” a common user-designed challenge in which an AI must identify all three R’s in the word strawberry.

Access and subscription details

Claude 3.5 Haiku is freely accessible via the Claude chatbot, but users face a variable daily message limit depending on server demand. For example, on the free tier this morning, I was able to perform approximately 10 exchanges (20 total messages in and out) before hitting Anthropic’s quota, which resets daily.

To unlock more extensive usage, users can subscribe to the Claude Pro plan, priced at $20 per month. This subscription provides up to five times the free tier’s usage, priority access during high-traffic periods, early access to new features, and access to additional models like Claude 3 Opus. The pricing structure mirrors OpenAI’s ChatGPT Plus subscription, offering a premium experience for power users.
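As a rough way to read the Artificial Analysis figures quoted earlier, end-to-end response time is approximately the time to first token plus the output length divided by throughput. A minimal back-of-the-envelope sketch, using only the numbers cited above:

```python
# Back-of-the-envelope read on the Artificial Analysis figures above:
# 0.80 s to first token, then tokens streaming at 65.1 per second.
TTFT_SECONDS = 0.80
TOKENS_PER_SECOND = 65.1

def estimated_response_seconds(output_tokens: int) -> float:
    """Approximate wall-clock time to stream a complete response."""
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND

for n in (100, 500, 1000):
    print(f"{n:>4} output tokens: ~{estimated_response_seconds(n):.1f} s")
```

At those rates, a 500-token answer streams in roughly eight and a half seconds, which is consistent with the model being pitched at real-time, user-facing workloads.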
Performance and cost

On the API, Claude 3.5 Haiku offers exceptional performance at an affordable price. Starting at $0.80 per million input tokens and $4 per million output tokens, it provides an economical solution compared to larger models like Claude 3 Opus. Developers can reduce costs further using prompt caching, which offers up to 90% savings, and the Message Batches API, which cuts costs by 50%.

In benchmark testing, Haiku has surpassed many larger, publicly available models. Its performance includes a 40.6% score on SWE-bench Verified, a key coding benchmark, demonstrating its strength in tasks requiring intelligence and speed. This makes Haiku an excellent choice for user-facing applications and time-sensitive workflows.

Key considerations

While Claude 3.5 Haiku delivers strong capabilities, potential users should consider its current limitations. The lack of web browsing and image generation may make it less appealing for certain use cases compared to competitors. Furthermore, the daily message cap may be inconvenient for users who don’t wish to upgrade to the Claude Pro subscription.

However, with features like image and file analysis, robust coding capabilities, and integration with Artifacts, Haiku remains a powerful tool for tasks requiring speed and precision. The Artifacts feature, in particular, extends its functionality beyond text generation, enabling collaborative editing and real-time content refinement.

For users ready to explore its potential, Claude 3.5 Haiku is now live and available through the Claude chatbot on web and mobile apps on iOS and Android.
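To see how those list prices and discounts combine, here is a hedged cost sketch. It assumes, as a simplification, that “up to 90% savings” means cached input tokens bill at 10% of list price and that the cache and batch discounts stack; real invoices may differ.

```python
# Rough cost model for the Claude 3.5 Haiku API pricing cited above.
INPUT_PER_MTOK = 0.80   # USD per million input tokens
OUTPUT_PER_MTOK = 4.00  # USD per million output tokens

def estimate_cost(input_tokens: int, output_tokens: int,
                  cached_fraction: float = 0.0, batched: bool = False) -> float:
    """Estimated USD cost. Assumes cached input bills at 10% of list price
    ("up to 90% savings") and that the Batches API halves the total."""
    fresh = input_tokens * (1 - cached_fraction)
    cached = input_tokens * cached_fraction * 0.10  # 90% discount assumed
    cost = (fresh + cached) * INPUT_PER_MTOK / 1e6 + output_tokens * OUTPUT_PER_MTOK / 1e6
    return cost * (0.5 if batched else 1.0)

print(f"${estimate_cost(1_000_000, 200_000):.2f} on-demand")                    # $1.60
print(f"${estimate_cost(1_000_000, 200_000, 0.8, True):.2f} cached + batched")  # $0.51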


OpenAI launches full o1 model with image uploads and analysis, debuts ChatGPT Pro

OpenAI has officially launched its o1 model, transitioning it out of preview to become a core feature of the ChatGPT platform. And now, it can also analyze images — a hugely helpful upgrade that lets users upload photos and have the AI chatbot respond to them. In one fun example, it produced detailed plans for building a birdhouse from a single candid photo of one. In another, potentially more serious and impressive example, it is now capable of helping design data centers from sketches.

Initially available to ChatGPT Plus and Team subscribers globally, o1 represents a significant evolution in reasoning model capabilities, including better handling of complex tasks, image-based reasoning, and enhanced accuracy. Enterprise and Education users will gain access to the model next week.

This release coincides with OpenAI’s rollout of a new subscription tier, ChatGPT Pro, which leaked ahead of its announcement today. Priced at $200 per month, ChatGPT Pro is targeted at professionals and organizations requiring scalable, research-grade AI tools. It offers unlimited access to OpenAI’s most advanced capabilities, including exclusive versions of the o1 reasoning model, GPT-4o, and the Advanced Voice feature.

OpenAI co-founder and CEO Sam Altman announced the news on a YouTube livestream on December 5, 2024 at 10 a.m. PT as part of OpenAI’s ongoing “12 Days of OpenAI” series of new updates timed to coincide with the end of the year and winter holidays (i.e. “12 Days of Christmas”).

o1 advances

The o1 model family, first introduced in September 2024, aims to tackle real-world challenges with refined reasoning, coding, and mathematical capabilities. Compared to its preview version, the updated o1 delivers faster responses and a 34% reduction in major errors on difficult problems. It is also capable of analyzing and explaining image uploads, unlocking new applications in fields like healthcare and engineering.

Early benchmarks highlight the model’s competitive edge. For example, o1-preview successfully solved 83% of problems in the International Mathematics Olympiad qualifying exam, a sharp improvement from GPT-4o’s 13% success rate. OpenAI’s updates also include safety enhancements, with o1-preview scoring 84 on a rigorous safety test, compared to 22 for its predecessor.

These advancements position o1 as a versatile tool for users in STEM fields and beyond. OpenAI has indicated plans to further expand the model’s functionality, including web browsing, file uploads, and richer API integration to support vision, function calling, and structured outputs.

In addition, OpenAI researcher Noam Brown took to X to confirm that o1 was the long-rumored OpenAI model codenamed “Strawberry” internally, and that it “can do a little better than just counting how many r’s are in ‘strawberry’” — a rhetorical understatement if I’ve ever seen one. Brown posted screenshots showing how the model, via ChatGPT, was able to construct an entire three-paragraph essay about strawberries without using a single letter “e” after thinking for 45 seconds.

Premium pricing

The introduction of ChatGPT Pro represents OpenAI’s latest move to cater to high-demand users. Priced at $200 per month, this tier provides unlimited access to enhanced tools, such as a high-compute version of o1, which dedicates additional processing power to deliver optimal solutions for challenging queries.
Subscribers also gain access to GPT-4o, known for its advanced natural language generation capabilities, and the Advanced Voice feature for speech-based interactions. ChatGPT Pro’s pricing significantly exceeds other tiers, such as ChatGPT Plus at $20 per month or ChatGPT Team at $30 per month, reflecting its specialized focus on delivering peak performance for complex applications.

However, to encourage the use of AI in fields that benefit society, OpenAI has announced the ChatGPT Pro Grant Program. The initiative will initially award 10 grants to leading medical researchers, providing free access to ChatGPT Pro tools.

A well-timed release

The launch of o1 and ChatGPT Pro comes amid intensifying competition in the AI industry. Chinese rivals, including Alibaba and DeepSeek, have released reasoning models like Marco-o1 and R1-Lite-Preview and are encroaching fast, challenging OpenAI’s dominance with open-source solutions and eclipsing o1-preview on certain third-party benchmarks. These developments reflect the growing demand for large reasoning models (LRMs) capable of handling complex problem-solving tasks.

As OpenAI continues to refine its offerings, the rollout of o1 and ChatGPT Pro marks a milestone in its quest to provide accessible, high-performance AI tools. Whether these developments can maintain OpenAI’s leadership in an increasingly crowded market remains to be seen.
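As a coda to the Strawberry anecdotes earlier in this piece: both checks reduce to one-line string operations, which is exactly why they caught on as quick spot tests for reasoning models. A minimal sketch for verifying a model’s output yourself:

```python
# The spot checks behind the Strawberry anecdotes: count the r's in
# "strawberry" and verify an essay really avoids the letter "e".
print("strawberry".count("r"))  # 3

essay = "..."  # paste a model's lipogram essay here (hypothetical input)
print("e" not in essay.lower())  # True only if the constraint held
```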


Shutterstock pioneers ‘research license’ model with Lightricks, lowering barriers to AI training data

Shutterstock is reshaping how AI companies access training data through a novel “research license” approach, launching first with AI creative technology company Lightricks. The partnership, announced today, allows Lightricks to train its open-source video generation model LTXV using Shutterstock’s extensive HD and 4K video library.

The new licensing model addresses a critical challenge in AI development: the high cost of accessing quality training data. It enables companies to start with a smaller research license for testing and experimentation before committing to more expensive commercial licenses.

Making ethical AI development more accessible for startups

“Many companies and model trainers have taken the route of unauthorized data scraping [instead of] making the necessary investment to achieve the quality and level of trust needed to develop commercially viable models,” said Daniel Mandell, Shutterstock’s global head of data licensing & AI, in an exclusive interview with VentureBeat. “However, we don’t think that financial investment should be a barrier for those looking to enter this space with an ethical approach.”

This two-phase approach could transform how startups approach AI development. Craig Andrews, Lightricks’ global PR manager, describes it as “a turning point for smaller, more agile developers who want to explore innovative applications of generative AI without the heavy upfront costs of traditional licensing.”

Legal protection and fair compensation in the age of AI

The timing is significant, coming amid increasing legal scrutiny of AI training data practices. Several major AI companies face lawsuits over alleged unauthorized use of copyrighted material for model training. Shutterstock’s approach offers a legitimate alternative while ensuring content creators receive compensation.

“We’re setting a standard for ethical AI development while ensuring that creators are fairly compensated for their work,” Andrews explains. “This approach not only fosters trust in the creative ecosystem but also establishes a sustainable framework for responsible AI innovation.”

Revenue sharing: A win-win for creators and AI companies

Shutterstock has implemented a revenue-sharing model where contributors receive 20% of the revenue from data licensing deals. Contributors can also opt out of having their content used for AI training, though Mandell notes only about 1% have chosen to do so.

Lightricks plans to use the licensed video data to enhance LTXV, its open-source video generation model released last month. The model has already gained significant traction, with “thousands of downloads on GitHub and Hugging Face,” according to Andrews. One notable use case is real-time video generation for interactive ecommerce.

The partnership aims to address technical challenges in AI video generation, particularly motion consistency in longer videos. “One of the biggest technical hurdles in AI video generation is achieving consistent motion and structure over longer video segments without sacrificing quality,” Andrews says. “Shutterstock’s high-quality video library provides an extensive dataset that helps us address this challenge.”

For Shutterstock, this partnership represents a strategic shift in its business model. The company has already established partnerships with major AI companies including Nvidia, Meta, and OpenAI.
Mandell emphasizes that the research license model could democratize access to high-quality training data for smaller organizations and research institutions.

Setting new industry standards for ethical AI development

The collaboration also reflects a growing trend toward transparency and ethical considerations in AI development. Lightricks made LTXV open-source to promote collaboration and innovation, while Shutterstock’s licensing approach ensures proper compensation for content creators.

“The important message here is that companies, no matter the size or funding, no longer have an excuse to scrape unlicensed content for training purposes,” Mandell concludes. “There is a better way to enter this evolving market.”

This partnership could set a new standard for how AI companies access training data, potentially influencing industry practices as concerns about the sources of AI training data continue to grow. The success of this model could determine whether other content providers follow Shutterstock’s lead in creating more flexible, accessible licensing options for AI development.


Synthetic data has its limits — why human-sourced data can help prevent AI model collapse

My, how quickly the tables turn in the tech world. Just two years ago, AI was lauded as the “next transformational technology to rule them all.” Now, instead of reaching Skynet levels and taking over the world, AI is, ironically, degrading.

Once the harbinger of a new era of intelligence, AI is now tripping over its own code, struggling to live up to the brilliance it promised. But why exactly? The simple fact is that we’re starving AI of the one thing that makes it truly smart: human-generated data.

To feed these data-hungry models, researchers and organizations have increasingly turned to synthetic data. While this practice has long been a staple in AI development, we’re now crossing into dangerous territory by over-relying on it, causing a gradual degradation of AI models. And this isn’t just a minor concern about ChatGPT producing sub-par results — the consequences are far more dangerous.

When AI models are trained on outputs generated by previous iterations, they tend to propagate errors and introduce noise, leading to a decline in output quality. This recursive process turns the familiar cycle of “garbage in, garbage out” into a self-perpetuating problem, significantly reducing the effectiveness of the system (the toy simulation at the end of this piece makes the loop concrete). As AI drifts further from human-like understanding and accuracy, it not only undermines performance but also raises critical concerns about the long-term viability of relying on self-generated data for continued AI development.

But this isn’t just a degradation of technology; it’s a degradation of reality, identity, and data authenticity — posing serious risks to humanity and society. The ripple effects could be profound, leading to a rise in critical errors. As these models lose accuracy and reliability, the consequences could be dire — think medical misdiagnosis, financial losses and even life-threatening accidents.

Another major implication is that AI development could completely stall, leaving AI systems unable to ingest new data and essentially becoming “stuck in time.” This stagnation would not only hinder progress but also trap AI in a cycle of diminishing returns, with potentially catastrophic effects on technology and society.

But, practically speaking, what can enterprises do to ensure the safety of their customers and users? Before we answer that question, we need to understand how this all works.

When a model collapses, reliability goes out the window

The more AI-generated content spreads online, the faster it will infiltrate datasets and, subsequently, the models themselves. And it’s happening at an accelerated rate, making it increasingly difficult for developers to filter out anything that is not pure, human-created training data. The fact is, using synthetic content in training can trigger a detrimental phenomenon known as “model collapse” or “model autophagy disorder (MAD).”

Model collapse is the degenerative process in which AI systems progressively lose their grasp on the true underlying data distribution they’re meant to model. This often occurs when AI is trained recursively on content it generated, leading to a number of issues:

Loss of nuance: Models begin to forget outlier data or less-represented information, crucial for a comprehensive understanding of any dataset.

Reduced diversity: There is a noticeable decrease in the diversity and quality of the outputs produced by the models.
Amplification of biases: Existing biases, particularly against marginalized groups, may be exacerbated as the model overlooks the nuanced data that could mitigate these biases.

Generation of nonsensical outputs: Over time, models may start producing outputs that are completely unrelated or nonsensical.

A case in point: a study published in Nature highlighted the rapid degeneration of language models trained recursively on AI-generated text. By the ninth iteration, these models were found to be producing entirely irrelevant and nonsensical content, demonstrating the rapid decline in data quality and model utility.

Safeguarding AI’s future: Steps enterprises can take today

Enterprise organizations are in a unique position to shape the future of AI responsibly, and there are clear, actionable steps they can take to keep AI systems accurate and trustworthy:

Invest in data provenance tools: Tools that trace where each piece of data comes from and how it changes over time give companies confidence in their AI inputs. With clear visibility into data origins, organizations can avoid feeding models unreliable or biased information.

Deploy AI-powered filters to detect synthetic content: Advanced filters can catch AI-generated or low-quality content before it slips into training datasets. These filters help ensure that models are learning from authentic, human-created information rather than synthetic data that lacks real-world complexity.

Partner with trusted data providers: Strong relationships with vetted data providers give organizations a steady supply of authentic, high-quality data. This means AI models get real, nuanced information that reflects actual scenarios, which boosts both performance and relevance.

Promote digital literacy and awareness: By educating teams and customers on the importance of data authenticity, organizations can help people recognize AI-generated content and understand the risks of synthetic data. Building awareness around responsible data use fosters a culture that values accuracy and integrity in AI development.

The future of AI depends on responsible action. Enterprises have a real opportunity to keep AI grounded in accuracy and integrity. By choosing real, human-sourced data over shortcuts, prioritizing tools that catch and filter out low-quality content, and encouraging awareness around digital authenticity, organizations can set AI on a safer, smarter path. Let’s focus on building a future where AI is both powerful and genuinely beneficial to society.

Rick Song is the CEO and co-founder of Persona.
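As promised above, a toy simulation of the recursive loop (a deliberately simplified sketch, not the setup of the Nature study): each generation fits a Gaussian to samples drawn from the previous generation’s model and, mirroring the loss of outlier data described earlier, drops samples in the tails before re-fitting.

```python
import random
import statistics

# Toy model collapse: each "generation" is fit only to samples from the
# previous generation's model. Mimicking the loss of outlier data, samples
# beyond two standard deviations are dropped before re-fitting, so the
# learned distribution narrows generation after generation.
random.seed(0)
mu, sigma = 0.0, 1.0  # generation 0: fit to real, human-sourced data

for generation in range(1, 10):
    samples = [random.gauss(mu, sigma) for _ in range(2000)]
    kept = [x for x in samples if abs(x - mu) <= 2 * sigma]  # tails forgotten
    mu, sigma = statistics.fmean(kept), statistics.stdev(kept)
    print(f"generation {generation}: sigma = {sigma:.3f}")
```

By the ninth generation, sigma has fallen to roughly a third of its starting value. Real collapse plays out over far richer distributions, but the one-way narrowing is the same dynamic.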


Light Field Lab launches SolidLight holographic imagery systems

Light Field Lab has launched its SolidLight Holographic and Volumetric display technologies that will power some amazing imagery of the future. These next-generation display technologies will be used by major companies to build a wide variety of holographic images and animations. Connecting a bunch of panels together, the system can modulate 10 billion pixels per square meter.

Last year, Light Field Lab raised $50 million, adding to its war chest of $85 million raised since its inception. And now I can see where that money is going. The San Jose, California-based company gave me a theatrical tour of an animated demo of an alien that it built in collaboration with the SETI Institute, the organization searching for extraterrestrial intelligence in our galaxy.

“We’re offering this to our customers and are deploying next year,” said Jon Karafin, CEO of Light Field Lab, in our latest interview. “That’s pretty exciting. This is an example of the kind of entertainment people are going to deploy.”

Light Field Lab is shipping holographic tech for deployment as early as 2025.

The demo took me inside the company’s headquarters into a kind of faux Area 51 secret government facility. Taking me to a secret briefing room behind a fake bookcase, a woman in a lab coat filled me in on the project, which had something to do with alien encounters. She took me to an elevator that simulated a descent deep underground into a research space free of radio interference. There was a mad scientist there, and an army general who was quite paranoid about my presence.

They then showed off the demo of SolidLight’s ability to form holographic objects in midair. The floating multi-planar objects were formed with 100 million pixels per square meter of display power. So the researcher opened a portal to another planet, where an alien talked to me. It looked pretty real and asked me questions. Then it handed me a secret cube, and everyone in the room freaked out. The black cube appeared to be floating in mid-air. It was a holographic animation. That means it was a 3D object, as I could move my head and see different parts of the cube. I felt like I could reach out and grab that box.

Ready for the market

Light Field Lab’s theatrical demo of holographic tech.

The invitation-only demo has entertained thought leaders from a wide variety of industries, including media, technology, retail and tourism. Light Field Lab hopes to sell them all its technology.

“SETI and Light Field Lab created a themed environment enabling guests to suspend disbelief and engage with an extraterrestrial formed with nothing but light,” Karafin said.

The SETI Institute emerged as the ideal partner to introduce the possibilities of SolidLight. SETI leverages advanced technologies and immense bandwidth to capture and analyze radio signals, while Light Field Lab builds advanced display technology that incorporates incredible resolution and compute to enable holographic and volumetric objects that escape the screen and merge with reality.

“The SETI Institute has partnered with Light Field Lab to enable an incredibly immersive experience and launch the company’s extraordinary SolidLight technologies. These are next-level, next-generation displays that are clearly the precursor to Star Trek’s Holodeck,” said Bill Diamond, president and CEO of the SETI Institute, in a statement.
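For a sense of scale, the pixel densities quoted in this piece dwarf conventional displays. A quick back-of-the-envelope comparison, using only the numbers cited here:

```python
# Scale check on the densities quoted in this piece.
FULL_DENSITY = 10_000_000_000  # pixels per square meter, fully tiled panels
DEMO_DENSITY = 100_000_000     # pixels per square meter in the demo
UHD_4K = 3840 * 2160           # ~8.3 million pixels in one full 4K frame

for width_m, height_m in [(1.0, 1.0), (4.0, 2.0)]:
    full = width_m * height_m * FULL_DENSITY
    demo = width_m * height_m * DEMO_DENSITY
    print(f"{width_m:.0f}x{height_m:.0f} m wall: {full/1e9:.0f}B pixels "
          f"(~{full/UHD_4K:,.0f} 4K frames), or {demo/1e6:.0f}M at demo density")
```

Numbers like these are why a single configuration can consume more than 60 GPUs, as noted below.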
SolidLight Holographic installations will be built to order based upon customer requirements and powered with an array of media servers to form fully holographic objects. SolidLight Volumetric systems are available now, with delivery in 2025, and driven with a single computer to form multiple planes within the holographic volume.

How the tech works

An alien from another star system.

Light Field Lab has to assemble the holographic images on the display from smaller submodules that each produce part of the hologram. By pushing a lot of modules together, it can generate images with 10 billion pixels per square meter, or, with fewer modules, lower-resolution images.

As I wrote before, a hologram is the recording and projection of light. Everything around us is a collection of light energy visible through our eyes and processed by the visual cortex of the brain. The “light field” defines how photons travel through space and interact with material surfaces. The things that we ultimately see as the world around us are bundles of light that focus at the back of our eyes. The trick is getting your eyes to focus on a particular point in space.

Light Field Lab’s technology re-creates what optical physics calls a “real image” for off-screen projected objects by generating a massive number of viewing angles that correctly change with the point of view and location, just like in the real world. This is accomplished with a directly emissive, modular, flat-panel display surface coupled with a complex series of waveguides that modulate the dense field of collimated light rays. With this implementation, a viewer sees around objects when moving in any direction such that motion parallax is maintained, reflections and refractions behave correctly, and the eyes freely focus on the items formed in mid-air.

The result is that the brain says, “this is real,” without any physical objects being present. In other words, Light Field Lab creates real holograms with no headgear. There’s no head-tracking, no motion sickness, and no latency in the display.

To create the alien, the team used a combination of Unreal Engine tech and Maya tools. The variety of experiences can include virtual concierge services where AI can answer questions at a kind of reception desk. Digital signage is another possible market. But it takes a lot of graphics processing units (GPUs) and AI data center technology. One configuration might use more than 60 GPUs.

The leaders of Light Field Lab.

Light Field Lab’s technologies combine size, resolution and density to project SolidLight Objects that accurately move, refract and reflect in physical space. The directly emissive


Sapient debuts with new AI architectures, aiming to beat Transformers’ reasoning with recurrent neural networks

Sapient Intelligence, Singapore’s first foundation model AI startup, has announced the successful closure of its seed funding round, raising $22 million at a valuation of $200 million. Backed by prominent investors including Vertex Ventures, Sumitomo Corporation, and JAFCO, the company is hoping to carve a distinctive path in AI development, addressing what it sees as fundamental shortcomings in GPT-style models.

“The goal of the startup, really, is to make a new generation of foundational model architectures to solve really complicated and long-horizon reasoning tasks that are really challenging for large language models (LLMs), especially for GPT architectures, to solve,” said cofounder Austin Zheng in a recent interview with VentureBeat conducted over video chat.

New architectures beyond traditional Transformers

Traditional GPT-style models rely on autoregressive methods, which generate predictions by building sequentially on prior outputs. While effective for general tasks, this approach struggles with multi-step reasoning and complex problem-solving.

“With current models, they’re all trained with an autoregressive method, and with that, the benefit is it’s easier for the model to converge on [a] general task,” Zheng explained. “So it sounds really smart, so it can solve a lot of different tasks. It has a really good generalization capability, but it’s really, really difficult for them to solve…complicated and long-horizon, multi-step tasks. And that’s kind of where hallucination comes in.”

Sapient’s answer is a novel model architecture inspired by neuroscience and mathematics, blending Transformer components with recurrent neural network structures and mimicking how the human brain works. “The model will always evaluate the solution, evaluate options and give [you] a reward model based on that,” Zheng said. “And also the model can continuously calculate something recurrently until it gets to a correct solution. With that, our agent will be able to deploy to an environment in an enterprise or [a] production environment, and continuously learn and improve…by trial and error and learn to be an expert on the existing code base.”

This design underpins the flexibility and power of Sapient’s models, enabling them to tackle a broad range of tasks with precision and reliability. It also puts them up against the new generation of reasoning models from OpenAI with its o1 series, as well as Chinese competitors.

Excelling in benchmarks and beyond

The company’s innovations are reflected in benchmark performance. “The first benchmark we use is actually Sudoku,” Zheng told VentureBeat. “Right now, our model is the best performing neural network in terms of solving Sudoku on the market — 95% accuracy without using intermediate tools and data.”

According to Zheng, while other leading models needed to train on intermediate steps to solve the popular number-placement puzzle, Sapient provided its model only with unfinished Sudoku boards, the rules, and the final solutions, obligating it to infer on its own how to solve them through trial and error. Similarly, Sapient’s models have excelled in tasks like two-dimensional navigation and complex mathematical problem-solving, consistently outperforming competing approaches.

Training these models is another area where Sapient distinguishes itself.
“Unlike traditional models that require vast amounts of high-quality, step-by-step data, our approach needs only question-and-answer pairs. This significantly lowers the barrier for training complex models,” Zheng said. By leveraging synthetic data, Sapient reduces the dependency on curated datasets, creating scalable and efficient training pipelines.

Practical applications: From code to robots

Sapient’s initial focus is on real-world applications, starting with enterprise coding and robotics. Its autonomous coding agents aim to revolutionize how businesses manage their software development and maintenance needs. The company is planning to deploy an autonomous AI coding agent in its strategic investors’ enterprise environments to learn each company’s codebase and, ultimately, begin maintaining and contributing to it. Sapient aims to offer a similar service to other enterprise clients, what Zheng describes as “smart and tailored AI employees and AI software engineers that can help them maintain, update and also grow the existing tech stacks.”

Unlike Cognition’s Devin, powered by GPT-4o, Sapient believes its coding AI agents will be able to work autonomously — without any human guiding the process or troubleshooting issues, save for supervisors checking over the work before it is pushed live.

The company is also advancing embodied AI, designing models that enable robots to interact, learn, and adapt in real time. “There are only a handful of startups working on understanding of [an] environment, and also planning of options and tasks, and understanding what kind of tasks are possible — also continuous[ly] improving itself on understanding the environment, understanding the problem, and understanding the use cases,” Zheng pointed out. “This will be our main focus for the next one to two years.”

A global vision

Sapient is setting itself apart not just through technology but also through its global and inclusive approach. “There are very few AI startups at a foundational model level outside of China actually led by Asian founders,” Zheng noted. “We really want to position ourselves as an international and research-oriented organization. But also, we want to be one of the first few Asian-led international research organizations that are solving really, really challenging problems, and we’re seeing that coming to fruition as well.”

With offices in Singapore and plans for the Bay Area, the company is building an AI research lab to bring together diverse perspectives and talent. Its team reflects this ethos, comprising scientists and engineers from leading institutions like DeepMind, Anthropic, and Microsoft AI. This diversity, combined with strong partnerships with Japanese investors like Sumitomo Corporation, positions Sapient as a unique player in the global AI ecosystem.

Targeting individuals and enterprises

Sapient’s long-term vision is ambitious, targeting technology that can be applied with results equally useful to individuals and enterprises. “The goal at the very end will be to build a truly generalized agent that can actually solve a day-to-day task for our users — an ‘all agent solution’ for a personal assistant and for solving all your tasks…That’s where we are in terms of
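Sapient has not published its architecture, but the “calculate recurrently until it gets to a correct solution” loop Zheng describes earlier maps onto a familiar pattern: a recurrent refinement cell paired with a learned evaluator that decides when to stop. A generic PyTorch sketch of that pattern follows; it is illustrative only, and every module, dimension and threshold here is an assumption, not Sapient’s design.

```python
import torch
import torch.nn as nn

# Generic sketch of recurrent refinement with a learned halting evaluator.
# NOT Sapient's actual (unpublished) architecture: a Transformer layer
# encodes the problem, a GRU cell iteratively refines a draft solution,
# and a scorer accepts the draft once its confidence clears a threshold.
class RecurrentReasoner(nn.Module):
    def __init__(self, dim: int = 256, max_steps: int = 16):
        super().__init__()
        self.encode = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.refine = nn.GRUCell(dim, dim)  # recurrent refinement step
        self.score = nn.Linear(dim, 1)      # learned solution evaluator
        self.max_steps = max_steps

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        state = self.encode(tokens).mean(dim=1)  # pooled problem encoding
        draft = torch.zeros_like(state)
        for _ in range(self.max_steps):
            draft = self.refine(state, draft)    # propose a better draft
            if torch.sigmoid(self.score(draft)).min() > 0.95:
                break                            # evaluator accepts the draft
        return draft

model = RecurrentReasoner()
print(model(torch.randn(2, 10, 256)).shape)  # torch.Size([2, 256])
```

The point of the pattern is that compute scales with problem difficulty: easy inputs halt early, hard ones iterate, which is the contrast Zheng draws with single-pass autoregressive generation.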


Lam Research launches collaborative robots to optimize critical maintenance in chip factories

Chip equipment maker Lam Research has introduced Dextro, the chip industry’s first collaborative robot (cobot) designed to optimize critical maintenance tasks in chip factories. The robots have been deployed in multiple advanced wafer fabs around the world.

Dextro enables accurate, high-precision maintenance to minimize tool downtime and production variability. It drives significant first-time-right (FTR) results that can enhance yield in wafer fabrication plants. If successful, the robots could make semiconductor manufacturing — which is the foundation for billions in electronics industry revenue — a lot more efficient.

Today’s wafer fabrication equipment utilizes advanced physics, robotics and chemistry to create semiconductors at nanoscale. A typical fab has hundreds of process tools that each require regular, complex maintenance. Dextro is designed to improve the cost-effectiveness of this equipment by performing critical maintenance tasks with repeatability and sub-micron precision.

“Dextro is an exciting leap forward in semiconductor manufacturing equipment maintenance. Built to work side-by-side with fab engineers, it executes complex maintenance tasks with precision and repeatability that are beyond human capability alone, enabling higher tool uptime and manufacturing yield,” said Chris Carter, group vice president of the customer support business group at Lam Research, in a statement. “It is a powerful addition to Lam’s extensive portfolio of tools and services designed to help chipmakers optimize their fabs for cost and productivity.”

As fabs continue to grow in size, geographic diversity, and equipment complexity, chipmakers need to optimize the effectiveness of their fab engineers by increasing automation and adding efficiencies. This is becoming even more important as the number of semiconductor positions worldwide continues to outpace the availability of skilled engineers.

Precision is crucial in tool maintenance, where the accurate reassembly of subsystems translates to the bottom line. Achieving FTR saves time and cost. Repeatable maintenance also reduces waste associated with consumable parts, labor and production downtime, leading to less variability and higher yield in production.

“When manufacturing equipment requires maintenance, the work must be done quickly and efficiently to avoid extended tool downtime and wasted cost,” said Young Ju Kim, vice president and head of the Memory Etch Technology Team at Samsung Electronics, in a statement. “Error-free maintenance by Dextro helps drive improvements in production variability and yield. This is an exciting milestone in Samsung’s journey to the autonomous fab.”

Dextro is a mobile unit with a robotic arm operated by a fab technician or engineer. It uses various end-effectors as hands to manage critical equipment maintenance tasks that are time-consuming and prone to errors when done manually. For example, it precisely installs and compresses consumable components with more than twice the accuracy of manual application. Precise assembly helps control etch performance at the wafer edge, improving yield. Dextro also tightens high-precision vacuum-sealing bolts to exact specifications, relieving fab engineers of a repetitive task that has up to a 5% error rate when done manually.
Accurately meeting specifications eliminates chamber temperature deviations that may take a tool out of production and impact die yield. And Dextro removes side-wall polymer build-up within the chamber without requiring disassembly of the lower chamber. Importantly, it does this at lower risk to humans, who need heavy protective breathing equipment to perform the task manually.

Lam’s Flex G and H series dielectric etch tools are currently supported by Dextro, with support expanding to additional tools in 2025 and beyond.

“With the enormous increase in demand that AI is bringing to the semiconductor market, it’s critical for chipmakers to keep all their manufacturing equipment working as efficiently as possible to minimize downtime,” noted Bob O’Donnell, president of TECHnalysis Research, in a statement. “Dextro can automate tedious, time-consuming and, often, intricate cleaning and maintenance tasks on chip fabrication equipment so that manufacturing output can be maximized. It offers a huge benefit for companies that choose to deploy it.”

Dextro joins a portfolio of smart solutions from Lam that enhance efficiency and reduce the cost of operations for semiconductor fabs. This includes the Lam Equipment Intelligence process tools with autonomous calibration and self-adapting maintenance capabilities, as well as Equipment Intelligence Services that use data, machine learning, artificial intelligence and Lam domain knowledge to achieve better productivity outcomes.


Solos launches AirGo Vision — ChatGPT-enabled AI smart glasses with a camera

Solos announced that it is launching AirGo Vision, the first smart glasses with ChatGPT-enabled AI and a camera. The new glasses, with integrated ChatGPT 4.0, are aimed at making wearable AI available to all users.

AirGo Vision offers seamless AI integration with features like real-time visual recognition and hands-free operation. With privacy as a core focus, the company said users can easily swap out frames, giving them the flexibility to choose when and where to use the front camera, while otherwise maintaining all other smart functionalities. Solos said AirGo Vision redefines wearable technology by prioritizing user control, convenience, and personalization.

“We’ve been releasing new hardware and software updates constantly over the last year, all driven by what we know consumers are asking for,” said Kenneth Fan, cofounder of Solos, in a statement. “One thing we promised to deliver on was allowing consumers to have control of their experience with AI and smart technology, particularly with privacy options in mind. That’s why we developed frames that can easily be changed to decide when and where a camera may be appropriate without sacrificing any of the fun features.”

A design prioritizing privacy and convenience

AirGo Vision prioritizes user privacy and personalization with an innovative design that integrates the camera into a frame module inserted directly into the arms of the glasses. Solos’ proprietary SmartHinge technology, with USB Type-C connectors, allows users to effortlessly switch between frames, giving them the choice to use camera-enabled frames or opt for a standard frame while keeping all other smart functions active. This customizable approach provides both flexibility and peace of mind, empowering users to decide when to use the camera and when to prioritize privacy.

Modular AI integration and visual processing

With advanced real-time visual recognition, AirGo Vision can instantly identify people, objects, activity, and text, offering responses to questions like, “What am I looking at?”, “What am I doing?”, “Translate the text on this sign to English,” or “Tell me about this monument.” Users can also receive directions to popular landmarks — such as “Give me directions to the Eiffel Tower” — or find nearby restaurants, making daily interactions and exploration more intuitive and informative. Designed for hands-free convenience, users can also capture photos on command — ideal for documenting progress in activities like cooking, home improvement, education, or shopping.

AirGo Vision’s open architecture design supports popular AI frameworks like Anthropic’s Claude and Google’s Gemini, ensuring compatibility across all major mobile platforms and making smart glasses and AI capabilities universally accessible to all consumers.

All-day wearability with new always-on capabilities

The AirGo Vision is designed for all-day use, offering continuous AI support for easy access to the internet, weather updates, and news with SolosChat Online, language translation with SolosTranslate, responding to texts, and more. With an extended battery life, AirGo Vision can support more than 2,500 AI interactions or image captures on a single charge, making it ideal for all-day wearability.

Responding to valuable user feedback, Solos developed the new always-on mode, eliminating the need to keep the Solos AirGo app open while interacting with the AI.
Now, after opening the app once, it will run in the background on a mobile device without needing to stay open. This ensures essential features like SolosChat remain fully operational without requiring the app to be actively displayed: a truly hands-free experience that integrates seamlessly into daily life.

Continued eyewear-first mentality

Staying true to its eyewear-first approach, each frame style from Solos is designed for lightweight comfort, weighing only 42 grams, making them ideal for all-day wear. AirGo Vision will launch in two frame styles, Krypton 1 and Krypton 2, with seven color variations and more options to come, and can be customized with prescription lenses. The frames can also be paired with existing Solos AirGo3 smart glasses.

AirGo Vision is now available for purchase on Solosglasses.com and Amazon for $299. You can also purchase the frame only for $149, or bundle a camera frame with a regular frame for enhanced privacy, priced at $349.

Born out of Kopin Corporation, with MIT engineers, Solos combines wearable electronics with the comfort and style of traditional eyewear. With an IP portfolio of over 100 patents and patent applications, the company has won a CES Innovation Award four times. AirGo Vision is similar to Meta’s Ray-Ban Meta smart glasses, which have been selling well in places like Europe.


Midjourney is launching a multiplayer collaborative worldbuilding tool

Midjourney, the popular AI image generation startup with more than 21 million users on its Discord server alone, is branching out from AI image creation and editing.

Patchwork revealed

Max Kreminski, leader of Midjourney’s Storytelling Lab, demoed the new tool, called “Patchwork,” in a livestream screenshare on Discord and X via Restream.

Screenshot of a Patchwork world.

He clarified that it would be a standalone app requiring a Midjourney account to log in, and that the URL would be available as a “research preview” in the Midjourney Discord server’s “updates” channel. Users will need to connect their Midjourney Discord account to their Google Account to access Patchwork’s research preview. The company posted instructions for doing so on its X account.

The tool appears to be a web-based blank white, infinite canvas with a “toolbox” on the left side of the browser screen, showing a variety of buttons labeled for “character,” “event,” “faction,” “place,” “prop,” and “random,” as well as tools such as “note,” “image,” “portal,” “save” and “share.” “Save” downloads a JSON file with links to all the Midjourney images created in the canvas.

Midjourney considers each canvas a separate digital “world.” To switch between worlds, the user creates a “portal,” a small black circular button. To generate a new world, the user enters a text prompt into an editor bar at the top of the “create” screen and selects one or more of a set of 10 different image styles. This produces a new whiteboard with a bunch of new still-image assets and text boxes, or entities known as “scraps,” including input boxes that allow the user to prompt new images or settings that fit the initial world description, even whole new AI-generated character descriptions.

In the demo livestream, the character name automatically populated with Marcus “Dizzy” Gillespie, echoing the name of the famous jazz musician. Dragging the description into a new character image creator box produces four new AI-generated images. Adding new character boxes, the user can then prompt to create names and characteristics, as well as motivations that can spur a conflict for the basis of a story. The user can then link characters together with lines that denote connections between them. They can also write action sequences and scene descriptions that each narrate a story. Each character can be used in multiple images, and these images can be gathered together with a single option.

The user can “share” the board with other Midjourney users who can collaborate, purportedly in real time, with multiple cursors moving across the same shared canvas. A single world can support dozens, even up to 100 users, according to Kreminski. However, he noted that the more users, the more chaotic the experience would be. Kreminski said only users who are logged in can view boards (for now), but in the future, boards may be viewable by non-users. He mentioned that tabletop roleplaying groups were already using the feature to chart their campaigns. He also said that Midjourney version 7 (V7) would include a setting to allow multiple-character consistency across different and new images.

Moving towards immersive, 3D worlds

Kreminski further revealed that there were at least three different large language models powering the application, including a fine-tuned open-source one unique to Midjourney.
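Stepping back to the save feature described above: Midjourney has not published Patchwork’s export schema, and the stream showed only that “Save” downloads a JSON file linking to a world’s images. Purely as a hypothetical illustration of the world/scrap/portal data model (every field name here is invented), such a file might look roughly like this:

```python
import json

# Hypothetical sketch only -- Patchwork's real save schema is unpublished.
# It illustrates the world/scrap/portal model described in the article.
world = {
    "name": "demo-world",
    "scraps": [
        {"type": "character", "name": 'Marcus "Dizzy" Gillespie',
         "images": ["https://cdn.midjourney.com/<job-id>/0_0.png"]},  # placeholder link
        {"type": "place", "name": "smoky jazz club", "images": []},
    ],
    "connections": [{"from": 0, "to": 1, "label": "performs at"}],
    "portals": ["another-world"],  # links out to other canvases
}
print(json.dumps(world, indent=2))
```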
Ultimately, it appears to be a novel, complex, powerful, somewhat overwhelming yet compelling tool for storyboarding. I could easily see it being used by writers and film directors, game designers, comic book creators and even live theater directors and writers. In the long term, Kreminski said, there was a “very clear path in terms of escalation of the details and interactions in the worlds,” including fully immersive 3D virtual reality scenes, but that was likely years away.

The news comes as other AI researchers, startups such as Fei-Fei Li’s World Labs, and big tech companies such as Google seek to develop AI that can create 3D immersive, navigable worlds online from simple prompts or images.

More Midjourney updates coming soon

In addition, Midjourney’s creator David Holz joined the announcement livestream to state the startup would launch multiple model personalization modes in the coming days. Currently, Midjourney allows users to rate images to personalize the kinds of visuals they want to see in generations, and to fine-tune the model to personal preferences. Now, the startup will allow users to have multiple personalized versions they can toggle between. In addition, Holz shared that Midjourney would allow users to upload and reference multiple images on boards to guide generations.

Furthermore, sometime after Christmas (December 25), Midjourney will introduce video models and a Midjourney V7 AI image generator featuring increased prompt understanding. Holz further revealed that Midjourney is working on three to four new hardware projects, and said the startup was “trying to branch out and become a full research lab…it may take us six months to announce all six things.”


OpenAI launches ChatGPT Projects, letting you organize files, chats in groups

OpenAI’s latest release, Projects in ChatGPT, addresses the need to organize files and conversations on ChatGPT. The feature is similar to Google’s popular NotebookLM application.

During the sixth day of its “12 Days of OpenAI” livestream, the company presented Projects in ChatGPT, allowing users to create folders and add conversations and documents, bringing these capabilities together in one place.

OpenAI rolled out Projects to ChatGPT Plus, Pro and Team subscribers on Friday. However, ChatGPT Enterprise and Edu users will have to wait until January to access the new function. The company said it will “work hard” to offer Projects in ChatGPT to all users. Projects can be accessed on the ChatGPT website or the Windows desktop app. Mobile app users and macOS desktop app users can view Projects.

The feature is reminiscent of Google’s NotebookLM, although that application is more geared toward research. Unlike NotebookLM with its Audio Overview, Projects in ChatGPT does not offer podcast narration. However, users of Projects in ChatGPT can still access other ChatGPT features in conversations, such as Voice Mode, web search and Canvas.

NotebookLM proved so popular that some businesses began using the app beyond research, including in CRM-like tasks, thanks to its ability to organize files and other information on a specific topic.

Make a project

Projects will appear in the ChatGPT sidebar. To create a new one, click the “plus” icon. Then, you can name the project and customize its colors.

One feature of Projects is the ability to customize how it will respond through custom instructions. For example, a project manager can open a project to build a website. They can explain what the project is and what the website is for, and instruct ChatGPT to favor opening ChatGPT’s Canvas for coding. Then, users can upload related documents to fill the website. Projects takes information from uploaded documents and the chats associated with it. You can move existing conversations on ChatGPT into a Project and have it reference those chats as data sources.

OpenAI plans to expand the types of files Projects supports next year and add connections to Google Drive or Microsoft OneDrive. The company will also offer the option to toggle models through Projects.

All-in-one platforms

Features like Projects in ChatGPT show how much OpenAI, Anthropic, and other chat providers want users to stay on their platforms to do their work.

OpenAI rolled out Canvas in October and expanded its access earlier this week. Canvas allows users to generate and edit text or code in ChatGPT without copying and pasting code elsewhere. Anthropic’s Claude Artifacts functions the same way, but it can also show a website prototype.

Keeping users on the main chat platform, Projects in ChatGPT lets them organize and work in the same window, showing that OpenAI plans to keep all of its new features in one place. Unlike Projects in ChatGPT, NotebookLM — which runs on an experimental version of Gemini 2.0 Flash and lets people chat with Gemini — is a separate app from the Gemini chatbot and Google’s other products, like its coding assistants.
