VentureBeat

Cohere just made it way easier for companies to create their own AI language models

Artificial intelligence company Cohere unveiled significant updates to its fine-tuning service on Thursday, aiming to accelerate enterprise adoption of large language models. The enhancements support Cohere’s latest Command R 08-2024 model and provide businesses with greater control and visibility into the process of customizing AI models for specific tasks.

The updated offering introduces several new features designed to make fine-tuning more flexible and transparent for enterprise customers. Cohere now supports fine-tuning for its Command R 08-2024 model, which the company claims offers faster response times and higher throughput compared to larger models. This could translate to meaningful cost savings for high-volume enterprise deployments, as businesses may achieve better performance on specific tasks with fewer compute resources.

A comparison of AI model performance on financial question-answering tasks shows Cohere’s fine-tuned Command R model achieving competitive accuracy, highlighting the potential of customized language models for specialized applications. (Source: Cohere)

A key addition is the integration with Weights & Biases, a popular MLOps platform, providing real-time monitoring of training metrics. This feature allows developers to track the progress of their fine-tuning jobs and make data-driven decisions to optimize model performance. Cohere has also increased the maximum training context length to 16,384 tokens, enabling fine-tuning on longer sequences of text — a crucial feature for tasks involving complex documents or extended conversations.

The AI customization arms race: Cohere’s strategy in a competitive market

The company’s focus on customization tools reflects a growing trend in the AI industry. As more businesses seek to leverage AI for specialized applications, the ability to efficiently tailor models to specific domains becomes increasingly valuable. Cohere’s approach of offering more granular control over hyperparameters and dataset management positions it as a potentially attractive option for enterprises looking to build customized AI applications.

However, the effectiveness of fine-tuning remains a topic of debate among AI researchers. While it can improve performance on targeted tasks, questions persist about how well fine-tuned models generalize beyond their training data. Enterprises will need to carefully evaluate model performance across a range of inputs to ensure robustness in real-world applications.

Cohere’s announcement comes at a time of intense competition in the AI platform market. Major players like OpenAI, Anthropic, and the cloud providers are all vying for enterprise customers. By emphasizing customization and efficiency, Cohere appears to be targeting businesses with specialized language processing needs that may not be adequately served by one-size-fits-all solutions.

Cohere’s Command R 08-2024 model outperforms competitors in both latency and throughput, suggesting potential cost savings for high-volume enterprise deployments. Lower latency indicates faster response times. (Source: Cohere / artificialanalysis.ai)

Industry impact: Fine-tuning’s potential to transform specialized AI applications

The updated fine-tuning capabilities could prove particularly valuable for industries with domain-specific jargon or unique data formats, such as healthcare, finance, or legal services.
These sectors often require AI models that can understand and generate highly specialized language, making the ability to fine-tune models on proprietary datasets a significant advantage.

As the AI landscape continues to evolve, tools that simplify the process of adapting models to specific domains are likely to play an increasingly important role. Cohere’s latest updates suggest that fine-tuning capabilities will be a key differentiator in the competitive market for enterprise AI development platforms.

The success of Cohere’s enhanced fine-tuning service will ultimately depend on its ability to deliver tangible improvements in model performance and efficiency for enterprise customers. As businesses continue to explore ways to leverage AI, the race to provide the most effective and user-friendly customization tools is heating up, with potentially far-reaching implications for the future of enterprise AI adoption.
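To make “fine-tuning on proprietary datasets” a little more concrete, the sketch below shows one common way such training data is assembled: conversation examples serialized as JSONL and pre-screened against a maximum context length (16,384 tokens in Cohere’s updated service). The file layout, field names, and token-counting shortcut are illustrative assumptions rather than Cohere’s documented schema; the provider’s fine-tuning documentation defines the exact format required.

```python
import json

MAX_CONTEXT_TOKENS = 16_384  # ceiling cited in Cohere's announcement

# Hypothetical domain-specific training examples (field names are illustrative,
# not a vendor's documented schema).
examples = [
    {
        "messages": [
            {"role": "User", "content": "Summarize the liquidity risk in this 10-K excerpt: ..."},
            {"role": "Chatbot", "content": "The filing flags three liquidity risks: ..."},
        ]
    },
]

def rough_token_count(example: dict) -> int:
    """Crude token estimate (~4 characters per token) used only for pre-screening."""
    text = " ".join(m["content"] for m in example["messages"])
    return len(text) // 4

with open("finetune_train.jsonl", "w", encoding="utf-8") as f:
    for ex in examples:
        if rough_token_count(ex) > MAX_CONTEXT_TOKENS:
            continue  # skip examples that would exceed the training context window
        f.write(json.dumps(ex, ensure_ascii=False) + "\n")
```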


Chip industry faces talent shortage as revenues head to $1 trillion

In 2022, Deloitte expected that the global semiconductor industry would need to add a million skilled workers by 2030, or more than 100,000 annually. Two years later, that forecast still holds. But key industry trends continue to compound the talent challenge as the industry races toward $1 trillion in revenue by 2030, according to a new report by Deloitte, the accounting and consulting giant.

The company said that advanced skills driven by demand for generative AI (GenAI) mean that the talent needed for advancing technologies is often in high demand and can be difficult to attract and retain in a competitive talent market. The report’s timing is interesting, considering the U.S. is reportedly considering limiting sales of AMD and Nvidia AI chips abroad.

Deloitte foresees a $1 trillion chip industry by 2030.

The semiconductor industry is facing an aging workforce without a clear plan for succession, which may be further exacerbated by low industry appeal compared to the broader tech industry. I suppose this is because the chip industry isn’t as sexy as working for AI or social media companies.

Global solutions needed for a global challenge

Deloitte foresees a shortage of chip workers.

Localization of manufacturing, as well as overall global demand trends, is contributing to a talent and skills shortage that spans the globe. Semiconductor companies are often left competing over the same insufficient pool of existing talent. And talent outcomes are tied to global chips laws. Both the U.S. and European chips legislation include specific objectives and grant application requirements regarding workforce development that companies should commit to in order to receive funding, remain in compliance, and achieve growth objectives. Geopolitical concerns and supply chain fragility continue to contribute to the onshoring of manufacturing (advanced node, trailing node, memory) and back-end ATP (assembly, test, and packaging) processes.

A history of cycles

The cyclical chip industry experienced its seventh downturn since 1990, with revenues declining 9% to $520 billion in 2023. As a result, development of some new fabrication capacity has been extended, which has also likely delayed some of the immediate, short-term need for talent. This downturn is expected to be temporary, with revenue set to grow by 16% in 2024 to an all-time high of $611 billion. With the industry back on track to reach the $1 trillion figure for 2030, talent will be needed to fuel that growth. But now there’s more time to optimize talent forecasts, mix, pipeline, skills and capabilities, and development plans. A richer understanding of the challenges driving the semiconductor talent shortages can enable semiconductor leaders to deploy targeted strategies to help address their looming talent needs.

Advanced skills being driven by demand for GenAI

Lots of countries are focusing on domestic chip industries.

According to Deloitte’s 2023 Smart Manufacturing: Generative AI for Semiconductors Survey, 72% of industry leaders surveyed predict that GenAI’s impact on the semiconductor industry will be “high to transformative.” Respondents saw high potential for generative AI’s use throughout their business, with heavier value realization expectations within core engineering, chip design and manufacturing, operations, and maintenance.
Although GenAI may help alleviate some engineering talent shortages by addressing routine tasks and giving engineers more time to perform their core jobs better and faster, the GenAI skill set scarcity remains. The semiconductor workforce is expected to need to grow its GenAI skill sets exponentially, given their scarcity in the market, and leaders in the field are often in high demand across most sectors of the economy. Semiconductor companies should consider offering more novel benefits beyond competitive compensation, such as having a seat at the table, to better attract AI talent and leadership. Having proficient GenAI talent is key in driving the industry’s ability to innovate and reap the benefits of this transformative technology.

Looming talent cliff and low industry appeal

An aging workforce, regulatory changes, newly required skill sets, and shifting employee expectations are changing the landscape of semiconductor talent. The lack of brand awareness and appeal in the semiconductor industry compared to better-known technology brands can make addressing these challenges more difficult for the industry. Semiconductor companies seem to recognize that attracting and retaining new and diverse talent is more important than ever, yet it continues to be a challenge for many organizations. Building diversity can be difficult; currently only one-third of U.S. semiconductor industry employees identify as female and less than 6% as Black or African American.

The U.S. semiconductor workforce is also older than other technology industries: As of July 2024, 55% of the U.S. semiconductor workforce is 45 or older, with less than 25% under the age of 35. In Europe, 20% of the industry is 55 or older, with Germany expecting about 30% of its workforce to retire over the next decade. Inconsistent knowledge management, and the lack of new talent to absorb institutional knowledge, presents an additional workforce barrier for many semiconductor companies.

Relative to other sectors of the technology industry, semiconductor organizations can offer a sense of trust, stability, and projected market growth—attractive qualities to the most recent college entrants. While semiconductor companies may have struggled with brand recognition and a competitive employee value proposition, investing in recent high school graduates could help reinvigorate talent pipelines that may be more attracted to stability and flexibility than to rapid advancement.

A global shortage

The need for semiconductor talent is a global issue. Countries are not producing enough skilled talent to meet their workforce needs. And companies can’t continue to tussle over the same finite talent pool while still expecting to successfully grow the industry, launch new (and expand existing) fabs, and keep up with rapid technological advances. In the United States, where the majority of annual graduates with a master’s degree in semiconductor-related engineering fields are foreign students, 80% of those graduates do not stay in the United States post-graduation. According to Deloitte China and Asia Pacific’s most recent APAC Semiconductor Industry Trends Survey,


Credo AI’s integrations hub automates governance for AI projects in Amazon, Microsoft, and more

AI governance company Credo AI has launched a new platform that integrates with third-party AI ops and business tools to give enterprises better visibility into responsible AI policies.

Credo AI’s Integrations Hub, now generally available, lets enterprise clients connect the platforms where they build generative AI applications, like Amazon SageMaker, MLflow and Microsoft Dynamics 365, to a centralized governance platform. Platforms where these applications are often deployed, like Asana, ServiceNow or Jira, can also be added to Integrations Hub.

The idea is that enterprises working on AI applications can use Integrations Hub to connect to a central governance platform like Credo AI’s. Instead of needing to upload documentation proving safety and security standards, the Integrations Hub collects metadata from the applications that contain those metrics. Credo AI said Integrations Hub will connect directly with existing model stores, whose contents are then automatically uploaded to the governance platform for compliance checks. The hub will also bring in datasets for governance purposes.

Navrina Singh, founder and CEO of Credo AI, told VentureBeat that the Integrations Hub was designed to make AI governance, whether following data disclosure rules or internal policies around AI usage, part of the development process from the very beginning.

“All the organizations that we work with, primarily Global 2000 [companies], are adopting AI at a very fast pace and are bringing in new breeds of AI tools,” Singh said. “When we looked across all the enterprises, one of the key things we wanted to enable for them was to extract the maximum value of their AI bets and make governance really easy, so they stop making excuses that it’s difficult to do.”

Credo AI’s Integrations Hub will include ready-made connections with Jira, ServiceNow, Amazon’s SageMaker and Bedrock, Salesforce, MLflow, Asana, Databricks, Microsoft Dynamics 365 and Azure Machine Learning, Weights & Biases, Hugging Face and Collibra. Additional integrations can be customized for an extra fee.

Governance at the onset

Surveys have shown that responsible AI and AI governance, which typically looks at how applications meet regulations, ethical considerations and privacy requirements, have become top of mind for many companies. However, the same surveys point out that few companies have actually assessed these risks.

As enterprises grapple with how to be more responsible around generative AI, giving organizations ways to easily identify risks and compliance issues has become a new niche. Credo AI is just one of the companies offering different avenues to make responsible AI more accessible. IBM’s Watsonx suite of products includes a governance platform that lets users evaluate models for accuracy, bias and compliance. Collibra has also released a suite of AI governance tools that creates workflows to document and monitor AI programs.

Credo AI does check applications for potential brand risks like accuracy. Still, it positions its platform more as a means to meet current laws around automated platforms and any potential new regulation to come.
There are still very few regulations specific to generative AI, though there have long been policies governing data privacy and data retention that some enterprises already follow because of earlier machine learning or data rules. Singh said some jurisdictions do ask enterprises for reports around AI governance. She pointed to New York City’s Local Law 144, which regulates the use of automated tools in employment decisions.

“There are certain technical evidence you have to collect, like a metric called demographic parity ratio. Credo AI takes this New York City law and codifies it to check your AI Ops system, and since it’s connected to your policies and to where you built your HR system, we can collect that metadata to meet the requirements of the law,” Singh said.
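The “demographic parity ratio” Singh mentions is, at its core, a comparison of selection rates across groups. The minimal sketch below shows one common way such a ratio is computed from hiring-screen outcomes; the data and group labels are placeholders, and the precise metric definitions a compliant bias audit must use come from the law and its implementing rules, not from this illustration.

```python
from collections import defaultdict

# Illustrative hiring-screen outcomes: (group, selected) pairs.
# A real audit would pull this metadata from the HR / AI Ops systems described above.
records = [
    ("group_a", True), ("group_a", False), ("group_a", True), ("group_a", True),
    ("group_b", True), ("group_b", False), ("group_b", False), ("group_b", False),
]

totals = defaultdict(int)
selected = defaultdict(int)
for group, was_selected in records:
    totals[group] += 1
    selected[group] += int(was_selected)

# Selection rate per group, then the parity (impact) ratio: each group's rate
# divided by the highest group's rate. A ratio near 1.0 indicates parity.
rates = {g: selected[g] / totals[g] for g in totals}
best = max(rates.values())
parity_ratios = {g: rate / best for g, rate in rates.items()}

print(rates)          # {'group_a': 0.75, 'group_b': 0.25}
print(parity_ratios)  # {'group_a': 1.0, 'group_b': 0.333...}
```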


Meta enters AI video wars with powerful Movie Gen set to hit Instagram in 2025

Meta founder and CEO Mark Zuckerberg, who built the company atop its hit social network Facebook, finished this week strong, posting a video of himself doing a leg press exercise at the gym on his personal Instagram (a social network Facebook acquired in 2012). Except, in the video, the leg press machine transforms into a neon cyberpunk version, an Ancient Roman version, and a gold flaming version as well.

As it turned out, Zuck was doing more than just exercising: he was using the video to announce Movie Gen, Meta’s new family of generative multimodal AI models that can make both video and audio from text prompts, and allow users to customize their own videos, adding special effects, props, and costumes and changing select elements simply through text guidance, as Zuck did in his video.

The models appear to be extremely powerful, allowing users to change only selected elements of a video clip rather than “re-roll” or regenerate the entire thing, similar to Pika’s spot editing on older models, yet with longer clip generation and sound built in. Meta’s tests, outlined in a technical paper on the model family released today, show that it outperforms the leading rivals in the space, including Runway Gen 3, Luma Dream Machine, OpenAI Sora and Kling 1.5, on many audience ratings of different attributes such as consistency and “naturalness” of motion.

Meta has positioned Movie Gen as a tool for everyday users looking to enhance their digital storytelling as well as professional video creators and editors, even Hollywood filmmakers. Movie Gen represents Meta’s latest step forward in generative AI technology, combining video and audio capabilities within a single system.

Specifically, Movie Gen consists of four models:

1. Movie Gen Video – a 30B parameter text-to-video generation model
2. Movie Gen Audio – a 13B parameter video-to-audio generation model
3. Personalized Movie Gen Video – a version of Movie Gen Video post-trained to generate personalized videos based on a person’s face
4. Movie Gen Edit – a model with a novel post-training procedure for precise video editing

These models enable the creation of realistic, personalized HD videos of up to 16 seconds at 16 FPS, along with 48kHz audio, and provide video editing capabilities. Designed to handle tasks ranging from personalized video creation to sophisticated video editing and high-quality audio generation, Movie Gen leverages powerful AI models to enhance users’ creative options.

Key features of the Movie Gen suite include:

• Video Generation: With Movie Gen, users can produce high-definition (HD) videos by simply entering text prompts. These videos can be rendered at 1080p resolution, up to 16 seconds long, and are supported by a 30 billion-parameter transformer model. The AI’s ability to manage detailed prompts allows it to handle various aspects of video creation, including camera motion, object interactions, and environmental physics.

• Personalized Videos: Movie Gen offers a personalized video feature, where users can upload an image of themselves or others to be featured within AI-generated videos. The model can adapt to various prompts while maintaining the identity of the individual, making it useful for customized content creation.
• Precise Video Editing: The Movie Gen suite also includes advanced video editing capabilities that allow users to modify specific elements within a video. The model can make localized changes, like altering objects or colors, as well as global changes, such as background swaps, all based on simple text instructions.

• Audio Generation: In addition to video capabilities, Movie Gen incorporates a 13 billion-parameter audio generation model. This feature enables the generation of sound effects, ambient music, and synchronized audio that aligns seamlessly with visual content. Users can create Foley sounds (sound effects that amplify and solidify real-life noises like fabric ruffling and footsteps echoing), instrumental music, and other audio elements up to 45 seconds long. Meta posted an example video with Foley sounds below (turn sound up to hear it).

Trained on billions of videos online

Movie Gen is the latest advancement in Meta’s ongoing AI research efforts. To train the models, Meta says it relied upon “internet scale image, video, and audio data,” specifically 100 million videos and 1 billion images from which the system “learns about the visual world by ‘watching’ videos,” according to the technical paper. However, Meta did not specify in the paper whether the data was licensed or in the public domain, or whether it simply scraped it as many other AI model makers have — a practice that has led to criticism from artists and video creators such as YouTuber Marques Brownlee (MKBHD) and, in the case of AI video model provider Runway, a class-action copyright infringement suit by creators (still moving through the courts). As such, one can expect Meta to face immediate criticism for its data sources.

The legal and ethical questions about the training aside, Meta is clearly positioning the Movie Gen creation process as novel, using a combination of typical diffusion model training (used commonly in video and audio AI) alongside large language model (LLM) training and a new technique called “Flow Matching,” which relies on modeling changes in a dataset’s distribution over time. At each step, the model learns to predict the velocity at which samples should “move” toward the target distribution.

Flow Matching differs from standard diffusion-based models in key ways:

• Zero Terminal Signal-to-Noise Ratio (SNR): Unlike conventional diffusion models, which require specific noise schedules to maintain a zero terminal SNR, Flow Matching inherently ensures zero terminal SNR without additional adjustments. This provides robustness against the choice of noise schedules, contributing to more consistent and higher-quality video outputs.

• Efficiency in Training and Inference: Flow Matching is found to be more efficient in both training and inference compared to diffusion models. It offers flexibility in the type of noise schedules used and shows improved performance across a range of model sizes. This approach has
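The velocity-prediction description above is enough to write down the core training step. The sketch below is a generic, minimal flow-matching objective under the common linear-interpolation (“rectified flow”) formulation; it is not Meta’s Movie Gen code, and the tiny network, tensor shapes, and hyperparameters are illustrative assumptions. Sampling then amounts to integrating the learned velocity field from t = 0 to t = 1, for example with a simple Euler loop.

```python
import torch
import torch.nn as nn

# Toy velocity-prediction network; Movie Gen uses a 30B-parameter transformer,
# this stand-in only illustrates the training objective.
class VelocityNet(nn.Module):
    def __init__(self, dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 256), nn.SiLU(), nn.Linear(256, dim))

    def forward(self, x_t: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([x_t, t], dim=-1))

model = VelocityNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(1000):
    x1 = torch.randn(32, 64)      # stand-in for data samples (e.g. video latents)
    x0 = torch.randn(32, 64)      # noise samples
    t = torch.rand(32, 1)         # random time in [0, 1]

    x_t = (1 - t) * x0 + t * x1   # point on the straight path from noise to data
    target_velocity = x1 - x0     # velocity that carries x0 toward x1 along that path

    pred = model(x_t, t)
    loss = ((pred - target_velocity) ** 2).mean()  # regress predicted velocity onto target

    opt.zero_grad()
    loss.backward()
    opt.step()
```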


Apple releases Depth Pro, an AI model that rewrites the rules of 3D vision

Apple’s AI research team has developed a new model that could significantly advance how machines perceive depth, potentially transforming industries ranging from augmented reality to autonomous vehicles. The system, called Depth Pro, is able to generate detailed 3D depth maps from single 2D images in a fraction of a second—without relying on the camera data traditionally needed to make such predictions.

The technology, detailed in a research paper titled “Depth Pro: Sharp Monocular Metric Depth in Less Than a Second,” is a major leap forward in the field of monocular depth estimation, a process that uses just one image to infer depth. This could have far-reaching applications across sectors where real-time spatial awareness is key. The model’s creators, led by Aleksei Bochkovskii and Vladlen Koltun, describe Depth Pro as one of the fastest and most accurate systems of its kind.

A comparison of depth maps from Apple’s Depth Pro, Marigold, Depth Anything v2, and Metric3D v2. Depth Pro excels in capturing fine details like fur and birdcage wires, producing sharp, high-resolution depth maps in just 0.3 seconds, outperforming other models in accuracy and detail. (credit: arxiv.org)

Monocular depth estimation has long been a challenging task, requiring either multiple images or metadata like focal lengths to accurately gauge depth. But Depth Pro bypasses these requirements, producing high-resolution depth maps in just 0.3 seconds on a standard GPU. The model can create 2.25-megapixel maps with exceptional sharpness, capturing even minute details like hair and vegetation that are often overlooked by other methods.

“These characteristics are enabled by a number of technical contributions, including an efficient multi-scale vision transformer for dense prediction,” the researchers explain in their paper. This architecture allows the model to process both the overall context of an image and its finer details simultaneously—an enormous leap from slower, less precise models that came before it.

A comparison of depth maps from Apple’s Depth Pro, Depth Anything v2, Marigold, and Metric3D v2. Depth Pro excels in capturing fine details like the deer’s fur, windmill blades, and zebra’s stripes, delivering sharp, high-resolution depth maps in 0.3 seconds. (credit: arxiv.org)

Metric depth, zero-shot learning

What truly sets Depth Pro apart is its ability to estimate both relative and absolute depth, a capability called “metric depth.” This means that the model can provide real-world measurements, which is essential for applications like augmented reality (AR), where virtual objects need to be placed in precise locations within physical spaces.

And Depth Pro doesn’t require extensive training on domain-specific datasets to make accurate predictions—a feature known as “zero-shot learning.” This makes the model highly versatile. It can be applied to a wide range of images, without the need for the camera-specific data usually required in depth estimation models. “Depth Pro produces metric depth maps with absolute scale on arbitrary images ‘in the wild’ without requiring metadata such as camera intrinsics,” the authors explain. This flexibility opens up a world of possibilities, from enhancing AR experiences to improving autonomous vehicles’ ability to detect and navigate obstacles.
For those curious to experience Depth Pro firsthand, a live demo is available on the Hugging Face platform.

A comparison of depth estimation models across multiple datasets. Apple’s Depth Pro ranks highest overall with an average rank of 2.5, outperforming models like Depth Anything v2 and Metric3D in accuracy across diverse scenarios. (credit: arxiv.org)

Real-world applications: From e-commerce to autonomous vehicles

This versatility has significant implications for various industries. In e-commerce, for example, Depth Pro could allow consumers to see how furniture fits in their home by simply pointing their phone’s camera at the room. In the automotive industry, the ability to generate real-time, high-resolution depth maps from a single camera could improve how self-driving cars perceive their environment, boosting navigation and safety. “The method should ideally produce metric depth maps in this zero-shot regime to accurately reproduce object shapes, scene layouts, and absolute scales,” the researchers write, emphasizing the model’s potential to reduce the time and cost associated with training more conventional AI models.

Tackling the challenges of depth estimation

One of the toughest challenges in depth estimation is handling what are known as “flying pixels”—pixels that appear to float in mid-air due to errors in depth mapping. Depth Pro tackles this issue head-on, making it particularly effective for applications like 3D reconstruction and virtual environments, where accuracy is paramount. Additionally, Depth Pro excels in boundary tracing, outperforming previous models in sharply delineating objects and their edges. The researchers claim it surpasses other systems “by a multiplicative factor in boundary accuracy,” which is key for applications that require precise object segmentation, such as image matting and medical imaging.

Open-source and ready to scale

In a move that could accelerate its adoption, Apple has made Depth Pro open source. The code, along with pre-trained model weights, is available on GitHub, allowing developers and researchers to experiment with and further refine the technology. The repository includes everything from the model’s architecture to pretrained checkpoints, making it easy for others to build on Apple’s work. The research team is also encouraging further exploration of Depth Pro’s potential in fields like robotics, manufacturing, and healthcare. “We release code and weights at https://github.com/apple/ml-depth-pro,” the authors write, signaling this as just the beginning for the model.

What’s next for AI depth perception

As artificial intelligence continues to push the boundaries of what’s possible, Depth Pro sets a new standard in speed and accuracy for monocular depth estimation. Its ability to generate high-quality, real-time depth maps from a single image could have wide-ranging effects across industries that rely on spatial awareness. In a world where AI is increasingly central to decision-making and product development, Depth Pro exemplifies how cutting-edge research can translate into practical, real-world solutions. Whether it’s improving how machines perceive their surroundings or enhancing consumer experiences, the potential uses for Depth Pro are broad and varied.
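For developers who want to try the released checkpoint, the sketch below follows the usage pattern documented in the apple/ml-depth-pro repository’s README at the time of writing. Treat the function names, return keys, and file path as assumptions to verify against the repo rather than a guaranteed API.

```python
# Sketch of using the released Depth Pro checkpoint, following the pattern documented
# in the apple/ml-depth-pro README (names may change; verify against the repository).
import depth_pro  # installed from https://github.com/apple/ml-depth-pro

# Load the pretrained model and its preprocessing transform.
model, transform = depth_pro.create_model_and_transforms()
model.eval()

# Load an RGB image; the focal length in pixels is estimated when EXIF data is absent.
image, _, f_px = depth_pro.load_rgb("room.jpg")  # "room.jpg" is a placeholder path
image = transform(image)

# Run inference to get a metric depth map.
prediction = model.infer(image, f_px=f_px)
depth_m = prediction["depth"]            # per-pixel depth in meters
focal_px = prediction["focallength_px"]  # estimated focal length in pixels
print(depth_m.shape, float(depth_m.min()), float(depth_m.max()))
```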
As the researchers conclude, “Depth Pro dramatically outperforms all prior work in sharp delineation of object boundaries, including fine structures such as hair, fur, and vegetation.” With its open-source release, Depth Pro could soon become integral to industries ranging from autonomous


Inflection helps fix RLHF uniformity with unique models for enterprise, agentic AI

A recent exchange on X (formerly Twitter) between Wharton professor Ethan Mollick and Andrej Karpathy, the former Director of AI at Tesla and co-founder of OpenAI, touches on something both fascinating and foundational: many of today’s top generative AI models — including those from OpenAI, Anthropic, and Google — exhibit a striking similarity in tone, prompting the question: why are large language models (LLMs) converging not just in technical proficiency but also in personality?

The follow-up commentary pointed to a common feature that could be driving the trend of output convergence: reinforcement learning from human feedback (RLHF), a technique in which AI models are fine-tuned based on evaluations provided by human trainers.

Building on this discussion of RLHF’s role in output similarity, Inflection AI’s recent announcements of Inflection 3.0 and a commercial API may offer a promising direction for addressing these challenges. The company has introduced a novel approach to RLHF, aimed at making generative models not only consistent but also distinctively empathetic. With its entry into the enterprise space, the creator of the Pi collection of models leverages RLHF in a more nuanced way, from deliberate efforts to improve the fine-tuning models to a proprietary platform that incorporates employee feedback to tailor gen AI outputs to organizational culture. The strategy aims to make Inflection AI’s models true cultural allies rather than just generic chatbots, providing enterprises with a more human and aligned AI system that stands out from the crowd.

Inflection AI wants your work chatbots to care

Against this backdrop of convergence, Inflection AI, the creator of the Pi model, is carving out a different path. With the recent launch of Inflection for Enterprise, Inflection AI aims to make emotional intelligence — dubbed “EQ” — a core feature for its enterprise customers. The company says its unique approach to RLHF sets it apart. Instead of relying on anonymous data-labeling, the company sought feedback from 26,000 school teachers and university professors to aid in the fine-tuning process through a proprietary feedback platform. Furthermore, the platform enables enterprise customers to run reinforcement learning with employee feedback, allowing subsequent tuning of the model to the unique voice and style of the customer’s company.

Inflection AI’s approach promises that companies will “own” their intelligence, meaning an on-premise model fine-tuned with proprietary data that is securely managed on their own systems. This is a notable move away from the cloud-centric AI models many enterprises are familiar with — a setup Inflection believes will enhance security and foster greater alignment between AI outputs and the ways people use them at work.

What RLHF is and isn’t

RLHF has become a centerpiece of gen AI development, largely because it allows companies to shape responses to be more helpful, coherent, and less prone to dangerous errors. OpenAI’s use of RLHF was foundational to making tools like ChatGPT engaging and generally trustworthy for users. RLHF helps align model behavior with human expectations, making it more engaging and reducing undesirable outputs. However, RLHF is not without its drawbacks.
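Before turning to those drawbacks, it helps to see the mechanism concretely. The sketch below shows the preference-modeling step at the heart of a typical RLHF pipeline: a reward model is trained to score a human-preferred response above a rejected one (a Bradley-Terry style loss), and that learned reward is later used to fine-tune the language model. This is a generic, minimal illustration, not Inflection AI’s or any specific vendor’s implementation; the tiny model and placeholder embeddings are assumptions made for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in reward model: maps a fixed-size "response embedding" to a scalar score.
# In a real pipeline this would be a language model with a scalar value head.
reward_model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 1))
opt = torch.optim.Adam(reward_model.parameters(), lr=1e-4)

def preference_loss(chosen_emb: torch.Tensor, rejected_emb: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry style loss: push the chosen response's score above the rejected one's."""
    chosen_scores = reward_model(chosen_emb)
    rejected_scores = reward_model(rejected_emb)
    return -F.logsigmoid(chosen_scores - rejected_scores).mean()

for step in range(500):
    # Placeholder embeddings standing in for (prompt, chosen response) and
    # (prompt, rejected response) pairs labeled by human raters.
    chosen = torch.randn(16, 128)
    rejected = torch.randn(16, 128)

    loss = preference_loss(chosen, rejected)
    opt.zero_grad()
    loss.backward()
    opt.step()

# The trained reward model then scores candidate generations, and the base LLM is
# fine-tuned (for example with PPO) to maximize that learned reward.
```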
RLHF was quickly offered as a contributing cause of the convergence in model outputs, potentially leading to a loss of unique characteristics and making models increasingly similar. Alignment, it seems, offers consistency, but it also creates a challenge for differentiation. Karpathy himself has previously pointed out some of the limitations inherent in RLHF. He likened it to a game of vibe checks, and stressed that it does not provide an “actual reward” akin to competitive games like AlphaGo. Instead, RLHF optimizes for an emotional resonance that is ultimately subjective and may miss the mark for practical or complex tasks.

From EQ to AQ

To mitigate some of these RLHF limitations, Inflection AI has embarked on a more nuanced training strategy. It has not only implemented improved RLHF but has also taken steps toward agentic AI capabilities, which it abbreviates as AQ (Action Quotient). As CEO White described in a recent interview, Inflection AI’s enterprise aims involve enabling models not only to understand and empathize but also to take meaningful actions on behalf of users — ranging from sending follow-up emails to assisting in real-time problem-solving.

While Inflection AI’s approach is certainly innovative, there are potential shortfalls to consider. Its 8K-token context window used for inference is smaller than what many high-end models employ, and the performance of its newest models has not been benchmarked. Despite ambitious plans, Inflection AI’s models may not achieve the desired level of performance in real-world applications. Nonetheless, the shift from EQ to AQ could mark a critical evolution in gen AI development, especially for enterprise clients looking to leverage automation for both cognitive and operational tasks. It’s not just about talking empathetically with customers or employees; Inflection AI hopes that Inflection 3.0 will also execute tasks that translate empathy into action. Inflection’s partnership with automation platforms like UiPath to provide this “agentic AI” further bolsters its strategy to stand out in an increasingly crowded market.

Navigating a post-Suleyman world

Inflection AI has undergone significant internal changes over the past year. The departure of CEO Mustafa Suleyman in Microsoft’s “acqui-hire,” along with a sizable portion of the team, cast doubt on the company’s trajectory. However, the appointment of White as CEO and a refreshed management team has set a new course for the organization. After an initial licensing agreement with the Redmond tech giant, Inflection AI’s model development was forked by the two companies. Microsoft continues to build on a version of the model focused on integration with its existing ecosystem, while Inflection AI has independently evolved Inflection 2.5 into today’s 3.0 version, distinct from Microsoft’s.

Pi’s… actually pretty popular

Inflection AI’s unique approach with Pi is gaining traction beyond the enterprise space, particularly among users on platforms like Reddit. The Pi community has been vocal about their experiences, sharing positive anecdotes and discussions regarding Pi’s thoughtful and empathetic responses. This grassroots popularity demonstrates that Inflection AI might


Gradio 5 is here: Hugging Face’s newest tool simplifies building AI-powered web apps

Hugging Face, the fast-growing AI startup valued at about $4.5 billion, has launched Gradio 5, a major update to its popular open-source tool for creating machine learning applications. The new version aims to make AI development more accessible, potentially speeding up enterprise adoption of machine learning technologies. Gradio, which Hugging Face acquired in 2021, has quickly become a cornerstone of the company’s offerings. With over 2 million monthly users and more than 470,000 applications built on the platform, Gradio has emerged as a key player in the AI development ecosystem.

Bridging the gap: Python proficiency meets web development ease

The latest version aims to bridge the gap between machine learning expertise and web development skills. “Machine learning developers are very comfortable programming in Python, and oftentimes, less so with the nuts and bolts of web development,” explained Abubakar Abid, founder of Gradio, in an exclusive interview with VentureBeat. “Gradio lets developers build performant, scalable apps that follow best practices in security and accessibility, all in just a few lines of Python.”

One of the most notable features of Gradio 5 is its focus on enterprise-grade security. Abid highlighted this aspect, telling VentureBeat, “We hired Trail of Bits, a well-known cybersecurity company, to do an independent audit of Gradio, and included fixes for all the issues that they found in Gradio 5… For Gradio developers, the key benefit is that your Gradio 5 apps will, out-of-the-box, follow best practices in web security, even if you are not an expert in web security yourself.”

AI-assisted app creation: Enhancing development with natural language prompts

The release also introduces an experimental AI Playground, allowing developers to generate and preview Gradio apps using natural language prompts. Ahsen Khaliq, ML Growth Lead at Gradio, emphasized the importance of this feature: “Similar to other AI coding environments, you can enter a text prompt explaining what kind of app you want to build and an LLM will turn it into Gradio code. But unlike other coding environments, you can also see an instant preview of your Gradio app and run it in the browser.” This innovation could dramatically reduce the time and expertise needed to create functional AI applications, potentially making AI development more accessible to a wider range of businesses and developers.

Gradio’s position in the AI ecosystem is becoming increasingly central. “Once a model is available on a hub like the Hugging Face Hub or downloaded locally, developers can wrap it into a web app using Gradio in a few lines of code,” Khaliq explained. This flexibility has led to Gradio being used in notable projects like Chatbot Arena, Open NotebookLM, and Stable Diffusion.

Future-proofing enterprise AI: Gradio’s roadmap for innovation

The launch of Gradio 5 comes at a time when enterprise adoption of AI is accelerating. By simplifying the process of creating production-ready AI applications, Hugging Face is positioning itself to capture a significant share of this growing market.
Looking ahead, Abid hinted at ambitious plans for Gradio: “Many of the changes we’ve made in Gradio 5 are designed to enable new functionality that we will be shipping in the coming weeks… Stay tuned for: multi-page Gradio apps, navbars and sidebars, support for running Gradio apps on mobile using PWA and potentially native app support, more built-in components to support new modalities that are emerging around images and video, and much more.”

As AI continues to impact various industries, tools like Gradio 5 that connect advanced technology with practical business applications are likely to play a vital role. With this release, Hugging Face is not just updating a product — it’s potentially altering the landscape of enterprise AI development.
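For a sense of what Abid’s “few lines of Python” looks like in practice, here is a minimal Gradio app that wraps an arbitrary Python function in a web UI. The function is a placeholder; swapping in a call to a model from the Hugging Face Hub follows the same pattern. This is a generic illustration rather than a demonstration of any Gradio 5-specific feature.

```python
import gradio as gr

def reverse_text(text: str) -> str:
    """Placeholder 'model': any Python function (e.g. a model inference call) works here."""
    return text[::-1]

# Interface wires the function's inputs and outputs to web components.
demo = gr.Interface(
    fn=reverse_text,
    inputs=gr.Textbox(label="Input text"),
    outputs=gr.Textbox(label="Reversed text"),
    title="Minimal Gradio app",
)

if __name__ == "__main__":
    demo.launch()  # serves the app locally in the browser
```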


AI Platform Alliance brings system and chip companies together

The AI Platform Alliance today announced the expansion of its consortium, which aims to combine the key chips and hardware required to operate a modern AI compute service with more open, economical and sustainable solutions. Formed last year at the Open Compute Conference, the group initially comprised AI accelerator companies, or companies that make chips for accelerating AI software. The alliance has now expanded to include cloud managed service providers, system suppliers and integrators, and software companies, reflecting a maturing ecosystem for the most demanding AI inference use cases.

The evolving alliance ecosystem has focused on providing practical and easily adoptable solutions through a new marketplace now available on the AI Platform Alliance website. The solutions offered by alliance members increase both the power and cost efficiency of AI inference while delivering better overall performance than the GPU-based solutions more commonly seen today.

New companies joining the AI Platform Alliance include Adlink, ASRock Rack, ASA Computers, Canonical, Clairo.ai, Deepgram, DeepX, ECS/Equus, Giga Computing (Gigabyte), Kamiwaza.ai, Lampi.ai, Netint, NextComputing, opsZero, Positron, Prov.net/Alpha3, Responsible Compute, Supermicro, Untether, View IO and Wallaroo.ai. These companies join founding members including Ampere Computing, Cerebras Systems, Furiosa, Graphcore, Kalray, Kinara, Luminous, Neuchips, Rebellions and Sapeon. The membership now spans more than 30 organizations across five key sectors of the industry supplying products and services to the burgeoning AI inference market.

The AI Platform Alliance was formed specifically to promote better collaboration and openness when it comes to AI. This solidarity of vision comes at a pivotal moment not just for the technology industry, but for the world at large. The explosion of AI has created unprecedented demand for compute power, not only to run AI algorithms but also to pull together all the systems, applications and services required to implement a modern AI-enabled digital service. While solutions to date have mainly addressed AI training of ever more powerful models, AI inference can require up to 10 times more traditional compute to support the processes that run a complex AI-enabled service. These stacks require an ecosystem of technology, services, and applications working together seamlessly to integrate best-in-class ingredients and easy-to-adopt recipes to scale AI inference use cases.

AI Platform Alliance members will work together to validate joint AI solutions that provide a diverse set of alternatives to vertically oriented, GPU-based status quo platforms. By developing these solutions as a community, the group aims to accelerate the pace of AI innovation while making AI platforms more open and transparent, increasing the capacity of AI to solve real-world problems, accelerating the rate of practical adoption, and delivering environmentally friendly and socially responsible infrastructure at scale.

Various members of the AI Platform Alliance are expected to showcase solutions at Yotta 2024 in Las Vegas from October 7 to October 9. The AI Platform Alliance is open to potential new members looking to change the AI status quo; companies interested in joining can find more information and apply on the alliance’s website.


Nvidia just dropped a bombshell: Its new AI model is open, massive, and ready to rival GPT-4

Nvidia has released a powerful open-source artificial intelligence model that competes with proprietary systems from industry leaders like OpenAI and Google. The company’s new NVLM 1.0 family of large multimodal language models, led by the 72 billion-parameter NVLM-D-72B, demonstrates exceptional performance across vision and language tasks while also enhancing text-only capabilities.

“We introduce NVLM 1.0, a family of frontier-class multimodal large language models that achieve state-of-the-art results on vision-language tasks, rivaling the leading proprietary models (e.g., GPT-4o) and open-access models,” the researchers explain in their paper. By making the model weights publicly available and promising to release the training code, Nvidia breaks from the trend of keeping advanced AI systems closed. This decision grants researchers and developers unprecedented access to cutting-edge technology.

Benchmark results comparing Nvidia’s NVLM-D model to AI giants like GPT-4, Claude 3.5, and Llama 3-V, showing NVLM-D’s competitive performance across various visual and language tasks. (Credit: arxiv.org)

NVLM-D-72B: A versatile performer in visual and textual tasks

The NVLM-D-72B model shows impressive adaptability in processing complex visual and textual inputs. Researchers provided examples that highlight the model’s ability to interpret memes, analyze images, and solve mathematical problems step by step. Notably, NVLM-D-72B improves its performance on text-only tasks after multimodal training. While many similar models see a decline in text performance, NVLM-D-72B increased its accuracy by an average of 4.3 points across key text benchmarks. “Our NVLM-D-1.0-72B demonstrates significant improvements over its text backbone on text-only math and coding benchmarks,” the researchers note, emphasizing a key advantage of their approach.

Nvidia’s new AI model analyzes a meme comparing academic abstracts to full papers, demonstrating its ability to interpret visual humor and scholarly concepts. (Credit: arxiv.org)

AI researchers respond to Nvidia’s open-source initiative

The AI community has reacted positively to the release. One AI researcher, commenting on social media, observed, “Wow! Nvidia just published a 72B model [which] is ~on par with llama 3.1 405B in math and coding evals and also has vision?” Nvidia’s decision to make such a powerful model openly available could accelerate AI research and development across the field. By providing access to a model that rivals proprietary systems from well-funded tech companies, Nvidia may enable smaller organizations and independent researchers to contribute more significantly to AI advancements. The NVLM project also introduces innovative architectural designs, including a hybrid approach that combines different multimodal processing techniques. This development could shape the direction of future research in the field.

NVLM 1.0: A new chapter in open-source AI development

Nvidia’s release of NVLM 1.0 marks a pivotal moment in AI development. By open-sourcing a model that rivals proprietary giants, Nvidia isn’t just sharing code—it’s challenging the very structure of the AI industry. This move could spark a chain reaction. Other tech leaders may feel pressure to open their research, potentially accelerating AI progress across the board.
It also levels the playing field, allowing smaller teams and researchers to innovate with tools once reserved for tech giants. However, NVLM 1.0’s release isn’t without risks. As powerful AI becomes more accessible, concerns about misuse and ethical implications will likely grow. The AI community now faces the complex task of promoting innovation while establishing guardrails for responsible use.

Nvidia’s decision also raises questions about the future of AI business models. If state-of-the-art models become freely available, companies may need to rethink how they create value and maintain competitive edges in AI.

The true impact of NVLM 1.0 will unfold in the coming months and years. It could usher in an era of unprecedented collaboration and innovation in AI. Or, it might force a reckoning with the unintended consequences of widely available, advanced AI. One thing is certain: Nvidia has fired a shot across the bow of the AI industry. The question now is not if the landscape will change, but how dramatically—and who will adapt fast enough to thrive in this new world of open AI.
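For developers who want to experiment with the released weights, a typical starting point is loading them through Hugging Face Transformers. The sketch below assumes the checkpoint is published under a repository such as nvidia/NVLM-D-72B and that it ships custom modeling code (hence trust_remote_code=True); the exact repository name, required dependencies, and the multi-GPU memory needed for a 72B-parameter model should all be checked against Nvidia’s model card.

```python
# Hypothetical loading sketch; verify the repository id and usage against the model card.
from transformers import AutoModel, AutoTokenizer

repo_id = "nvidia/NVLM-D-72B"  # assumed repository name

tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModel.from_pretrained(
    repo_id,
    trust_remote_code=True,  # the checkpoint ships its own multimodal modeling code
    device_map="auto",       # shard the 72B weights across available GPUs
    torch_dtype="auto",
)

prompt = "Describe the chart in this image."
inputs = tokenizer(prompt, return_tensors="pt")
# Image preprocessing and generation follow the checkpoint's custom API (see the model
# card); they are omitted here because the exact method names are model-specific.
```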


OpenAI isn’t going anywhere: raises $6.6B at $157B valuation

Despite a wave of executive departures in recent months, OpenAI today announced its long-expected new funding round. It was always expected to be a whopper, but the amount it raised — $6.6 billion at a $157 billion total company valuation — makes it the largest venture capital round in history to date, according to Axios. The round was led by Thrive Capital, according to Bloomberg, while CNBC notes that heavy hitters including Nvidia and Microsoft plowed more cash into this round as well.

In announcing the funding on its website, OpenAI noted that ChatGPT alone counts more than 250 million weekly unique users. “The new funding will allow us to double down on our leadership in frontier AI research, increase compute capacity, and continue building tools that help people solve hard problems,” the company wrote in a short blog post.

Reasons for skepticism?

However, the news was still greeted with skepticism among AI critics, including the outspoken tech public relations expert and writer Ed Zitron, whose latest newsletter is headlined “OpenAI is a bad business.” It argues that OpenAI’s decision to take a reported $500 million from the infamous SoftBank Vision Fund — which has notably invested in duds like WeWork — combined with its reliance on individual ChatGPT subscriptions rather than API usage or licensing, suggests it is not well positioned to succeed as a for-profit in the future. These are, in my opinion, fair criticisms, as is noting that Apple reportedly declined to invest in the firm after giving it consideration, potentially in the wake of former chief technology officer Mira Murati’s resignation just last week.

Then came the report from the Financial Times that OpenAI made it a condition for those throwing money its way that they not invest in rivals, including Anthropic, which was founded by former OpenAI researchers and continues to pick up more exiting execs, and Musk’s xAI — recently reported to have switched on its Memphis training supercluster “Colossus” with 100,000+ Nvidia H100 GPUs — seemingly showing that OpenAI is worried about the competition catching up. Musk, for his part, took the news of OpenAI’s reported conditions on exclusive funding with his typical blunt criticism, calling the company evil on his X account.

And indeed, the competition in the AI space is intensifying, with newer models emerging such as Liquid AI’s non-transformer-based Liquid Foundation Models (LFMs), and Google and Anthropic also fielding compelling enterprise and consumer-facing options. Meanwhile, Meta and Alibaba are releasing powerful open-source models for free.

The OpenAI bull case

Still, OpenAI’s models top the charts on third-party performance benchmarks, and every time they have been overtaken, OpenAI has released an update or an entirely new class of models, such as the o1 preview series, that retakes the throne. So for now, fueled by $6.6 billion in fresh funding and with new models, developer tools, and aggressive cost-cutting measures for developer customers (intelligence that is “too cheap to meter,” in the words of many in the AI industry), it appears that OpenAI is not going anywhere anytime soon. It may, in fact, be too big to fail, as I speculated it was becoming a few weeks ago.
For developers building products atop the company’s AI models and frameworks, this is probably welcome news, as those models are likely to remain stable and supported going forward.

Will OpenAI give GPT creators any more $$$?

However, one big question remains regarding OpenAI’s custom GPT Store, its version of an AI app store, which launched in January 2024 and allows any ChatGPT Plus user to create and share custom versions of ChatGPT designed to fulfill specific roles and perform specific tasks. OpenAI CEO and co-founder Sam Altman said at its developer conference DevDay in late 2023 that revenue sharing would be coming, and some users reported that they did receive some revenue from their GPTs, but we haven’t heard much from OpenAI about it since.

Now that the company is flush with cash, I’m wondering if OpenAI will start paying out more to more GPT creators (selfishly as well, since I’ve created a few custom GPTs — full disclosure). I’ve reached out to the company to ask about that and will update when I hear back.

Either way, OpenAI’s coffers have been refilled, and despite the chaos behind the scenes, the company continues to ship new AI products regularly — though we’re all still waiting on the public release of its AI video model Sora.
