VentureBeat

OpenAI unveils experimental ‘Swarm’ framework, igniting debate on AI-driven automation

OpenAI has unveiled “Swarm,” an experimental framework designed to orchestrate networks of AI agents. This unexpected release has ignited intense discussions among industry leaders and AI ethicists about the future of enterprise automation, despite the company’s emphasis that Swarm is not an official product. Swarm provides developers with a blueprint for creating interconnected AI networks capable of communicating, collaborating, and tackling complex tasks autonomously. While the concept of multi-agent systems isn’t new, Swarm represents a significant step in making these systems more accessible to a broader range of developers.

(Credit: x.com/shyamalanadkat)

The next frontier in enterprise AI: Multi-agent systems and their potential impact

The framework’s potential business applications are extensive. A company using Swarm-inspired technology could theoretically create a network of specialized AI agents for different departments. These agents might work together to analyze market trends, adjust marketing strategies, identify sales leads, and provide customer support—all with minimal human intervention. This level of automation could fundamentally alter business operations. AI agents might handle tasks currently requiring human oversight, potentially boosting efficiency and freeing employees to focus on strategic initiatives. However, this shift prompts important questions about the evolving nature of work and the role of human decision-making in increasingly automated environments.

Navigating the ethical minefield: Security, bias, and job displacement in AI networks

Swarm’s release has also rekindled debates about the ethical implications of advanced AI systems. Security experts stress the need for robust safeguards to prevent misuse or malfunction in networks of autonomous agents. Concerns about bias and fairness also loom large, as decisions made by these AI networks could significantly impact individuals and society. The specter of job displacement adds another layer of complexity. The potential of technologies like Swarm to create new job categories contrasts with fears that they may accelerate white-collar automation at an unprecedented pace. This tension highlights the need for businesses and policymakers to consider the broader societal impacts of AI adoption. Some developers have already begun exploring Swarm’s potential. An open-source project called “OpenAI Agent Swarm Project: Hierarchical Autonomous Agent Swarms (HAAS)” demonstrates a possible implementation, including a hierarchy of AI agents with distinct roles and responsibilities. While intriguing, this early experiment also underscores the challenges in creating effective governance structures for AI systems.

From experiment to enterprise: The future of AI collaboration and decision-making

OpenAI has been clear about Swarm’s limitations. Shyamal Anadkat, a researcher at the company, stated on X on October 12, 2024: “Swarm is not an official OpenAI product. Think of it more like a cookbook. It’s experimental code for building simple agents. It’s not meant for production and won’t be maintained by us.” This caveat tempers expectations and serves as a reminder that multi-agent AI development remains in its early stages. However, it doesn’t diminish Swarm’s significance as a conceptual framework. By providing a tangible example of how multi-agent systems might be structured, OpenAI has given developers and businesses a clearer vision of potential future AI ecosystems. For enterprise decision-makers, Swarm serves as a catalyst for forward thinking. While not ready for immediate implementation, it signals the direction of AI technology’s evolution. Companies that begin exploring these concepts now—considering both their potential benefits and challenges—will likely be better positioned to adapt as the technology matures. Swarm’s release also emphasizes the need for interdisciplinary collaboration in navigating the complex landscape of advanced AI. Technologists, ethicists, policymakers, and business leaders must work together to ensure that the development of multi-agent AI systems aligns with societal values and needs. The conversation around AI will increasingly focus on these interconnected systems. Swarm offers a valuable preview of the questions and challenges that businesses and society will face in the coming years. The tech world now closely watches to see how developers will build upon the ideas presented in Swarm, and how OpenAI and other leading AI companies will continue to shape the trajectory of this transformative technology.
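For developers curious what that “cookbook” looks like in practice, here is a minimal sketch in the spirit of the examples in OpenAI’s Swarm repository. The agent names, instructions and the triage-to-specialist handoff are illustrative assumptions, and since OpenAI describes the code as experimental and unmaintained, the exact interface may change.

```python
# pip install git+https://github.com/openai/swarm.git  (requires OPENAI_API_KEY)
from swarm import Swarm, Agent

client = Swarm()

def transfer_to_support():
    """Hand the conversation off to the support specialist (hypothetical example)."""
    return support_agent

support_agent = Agent(
    name="Support Agent",
    instructions="Resolve customer issues politely and concisely.",
)

triage_agent = Agent(
    name="Triage Agent",
    instructions="Route the user to the right specialist.",
    functions=[transfer_to_support],  # a handoff is just a function that returns another Agent
)

response = client.run(
    agent=triage_agent,
    messages=[{"role": "user", "content": "My order arrived damaged. What should I do?"}],
)
print(response.agent.name)               # which agent ended up answering
print(response.messages[-1]["content"])  # its reply
```

The appeal of the pattern is that coordination logic stays in ordinary application code: each agent is a prompt plus a list of callable tools, and routing between them is just another function call.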


Inflection AI helps address RLHF uniformity issues with unique models for enterprise, agentic AI

A recent exchange on X (formerly Twitter) between Wharton professor Ethan Mollick and Andrej Karpathy, the former director of AI at Tesla and co-founder of OpenAI, touches on something both fascinating and foundational: many of today’s top generative AI models — including those from OpenAI, Anthropic, and Google — exhibit a striking similarity in tone, prompting the question: why are large language models (LLMs) converging not just in technical proficiency but also in personality? The follow-up commentary pointed out a common feature that could be driving the trend of output convergence: reinforcement learning from human feedback (RLHF), a technique in which AI models are fine-tuned based on evaluations provided by human trainers.

Building on this discussion of RLHF’s role in output similarity, Inflection AI’s recent announcements of Inflection 3.0 and a commercial API may provide a promising direction for addressing these challenges. It has introduced a novel approach to RLHF, aimed at making generative models not only consistent but also distinctively empathetic. With an entry into the enterprise space, the creators of the Pi collection of models leverage RLHF in a more nuanced way, from deliberate efforts to improve the fine-tuning models to a proprietary platform that incorporates employee feedback to tailor gen AI outputs to organizational culture. The strategy aims to make Inflection AI’s models true cultural allies rather than just generic chatbots, providing enterprises with a more human and aligned AI system that stands out from the crowd.

Inflection AI wants your work chatbots to care

Against this backdrop of convergence, Inflection AI, the creators of the Pi model, are carving out a different path. With the recent launch of Inflection for Enterprise, Inflection AI aims to make emotional intelligence — dubbed “EQ” — a core feature for its enterprise customers. The company says its unique approach to RLHF sets it apart. Instead of relying on anonymous data labeling, the company sought feedback from 26,000 school teachers and university professors to aid in the fine-tuning process through a proprietary feedback platform. Furthermore, the platform enables enterprise customers to run reinforcement learning with employee feedback, which allows subsequent tuning of the model to the unique voice and style of the customer’s company. Inflection AI’s approach promises that companies will “own” their intelligence, meaning an on-premise model fine-tuned with proprietary data that is securely managed on their own systems. This is a notable move away from the cloud-centric AI models many enterprises are familiar with — a setup Inflection believes will enhance security and foster greater alignment between AI outputs and the way people use them at work.

What RLHF is and isn’t

RLHF has become the centerpiece of gen AI development, largely because it allows companies to shape responses to be more helpful, coherent, and less prone to dangerous errors. OpenAI’s use of RLHF was foundational to making tools like ChatGPT engaging and generally trustworthy for users. RLHF helps align model behavior with human expectations, making it more engaging and reducing undesirable outputs. However, RLHF is not without its drawbacks.
RLHF was quickly cited as a contributing factor in the convergence of model outputs, potentially leading to a loss of unique characteristics and making models increasingly similar. Alignment, it seems, offers consistency, but it also creates a challenge for differentiation. Previously, Karpathy himself pointed out some of the limitations inherent in RLHF. He likened it to a game of vibe checks, and stressed that it does not provide an “actual reward” akin to competitive games like AlphaGo. Instead, RLHF optimizes for an emotional resonance that is ultimately subjective and may miss the mark for practical or complex tasks.

From EQ to AQ

To mitigate some of these RLHF limitations, Inflection AI has embarked on a more nuanced training strategy. It is not only implementing improved RLHF, but has also taken steps toward agentic AI capabilities, which it has abbreviated as AQ (Action Quotient). As Inflection AI CEO Sean White described in a recent interview, the company’s enterprise aims involve enabling models to not only understand and empathize but also to take meaningful actions on behalf of users — ranging from sending follow-up emails to assisting in real-time problem-solving. While Inflection AI’s approach is certainly innovative, there are potential shortfalls to consider. Its 8K-token context window used for inference is smaller than what many high-end models employ, and the performance of its newest models has not been benchmarked. Despite ambitious plans, Inflection AI’s models may not achieve the desired level of performance in real-world applications. Nonetheless, the shift from EQ to AQ could mark a critical evolution in gen AI development, especially for enterprise clients looking to leverage automation for both cognitive and operational tasks. It’s not just about talking empathetically with customers or employees; Inflection AI hopes that Inflection 3.0 will also execute tasks that translate empathy into action. Inflection’s partnership with automation platforms like UiPath to provide this “agentic AI” further bolsters its strategy to stand out in an increasingly crowded market.

Navigating a post-Suleyman world

Inflection AI has undergone significant internal changes over the past year. The departure of CEO Mustafa Suleyman in Microsoft’s “acqui-hire,” along with a sizable portion of the team, cast doubt on the company’s trajectory. However, the appointment of White as CEO and a refreshed management team has set a new course for the organization. After an initial licensing agreement with the Redmond tech giant, Inflection AI’s model development was forked by the two companies. Microsoft continues to build on a version of the model focused on integration with its existing ecosystem. Meanwhile, Inflection AI continued to independently evolve Inflection 2.5 into today’s 3.0 version, distinct from Microsoft’s.

Pi’s… actually pretty popular

Inflection AI’s unique approach with Pi is gaining traction beyond the enterprise space, particularly among users on platforms like Reddit. The Pi community has been vocal about their experiences, sharing positive anecdotes and discussions regarding Pi’s thoughtful and empathetic responses. This grassroots popularity demonstrates that Inflection AI might
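To make the RLHF mechanics discussed above concrete, here is a toy, self-contained sketch of the reward-modeling step: pairwise “this response over that one” judgments are used to fit a scoring function, which a later fine-tuning or best-of-n stage would then optimize against. The feature vectors and numbers are invented for illustration; this is not Inflection AI’s pipeline.

```python
import math
import random

# Toy pairwise-preference data: each record says a human rater preferred
# response A over response B for the same prompt. The feature vectors stand in
# for whatever representation a real reward model would use.
preferences = [
    {"chosen": [0.9, 0.2], "rejected": [0.1, 0.7]},
    {"chosen": [0.8, 0.1], "rejected": [0.3, 0.6]},
    {"chosen": [0.7, 0.3], "rejected": [0.2, 0.8]},
]

weights = [0.0, 0.0]  # linear reward model r(x) = w . x
lr = 0.5

def reward(x):
    return sum(w * xi for w, xi in zip(weights, x))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Bradley-Terry-style objective used in reward modeling:
# maximize log sigmoid(r(chosen) - r(rejected)) over the preference pairs.
random.seed(0)
for epoch in range(200):
    random.shuffle(preferences)
    for pair in preferences:
        margin = reward(pair["chosen"]) - reward(pair["rejected"])
        grad_scale = 1.0 - sigmoid(margin)  # gradient of the log-likelihood w.r.t. the margin
        for i in range(len(weights)):
            weights[i] += lr * grad_scale * (pair["chosen"][i] - pair["rejected"][i])

print("learned reward weights:", weights)
# A fine-tuning stage (e.g. PPO or best-of-n sampling) would then use this
# reward model to steer the language model toward preferred outputs.
```

The same mechanics apply whether the raters are contracted labelers, the 26,000 educators Inflection cites, or a customer’s own employees; what changes is whose preferences the reward model ends up encoding.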


Gladia raises $16M for AI transcription and analytics

Gladia, an AI transcription and audio intelligence provider, has raised $16 million in funding. The Paris, France-based company will use the funding to develop an end-to-end audio infrastructure – starting with a new real-time audio transcription and analytics engine – enabling voice-first platforms to deliver more value to their users across borders with cutting-edge AI. It’s a challenge to rivals such as Otter.ai and Fireflies.ai, as well as other AI-based services that transcribe voice conversations to text.

In an interview with VentureBeat, CEO Jean-Louis Quéguiner explained to me why he started the company. “As you can hear from a beautiful French accent, I’m not an English speaker and I was extremely frustrated with the accents,” Quéguiner said. “That’s why I founded the company.” I got a demo of the AI transcription, and it worked in real time as Quéguiner spoke English with his heavy French accent. I’m used to services like Otter getting a lot of words wrong in a transcription, but in the first page of results from Gladia, I saw no errors. He also showed how he could speak two different languages and the system could shift from one language to another as needed.

XAnge led the round, with participation from Illuminate Financial, XTX Ventures, Athletico Ventures, Gaingels, Mana Ventures, Motier Ventures, Roosh Ventures, and Soma Capital.

Gladia uses AI for audio transcription.

Founded in 2022, Gladia has now raised a total of $20.3 million, with earlier seed investments led by New Wave, Sequoia Capital (as part of the First Sequoia Arc program), Cocoa, and GFC. Gladia was recently selected to participate in the AWS generative AI accelerator program. “Gladia represents the qualities we like to champion at XAnge: a bold, global tech team at the forefront of AI innovation, with a proven business model to unlock new opportunities across industries,” said Alexis du Peloux, partner at XAnge, in a statement. “In a fast-paced AI environment, Jean-Louis Quéguiner and his team have executed extremely well, and we are proud to back Gladia for the Series A.”

Given that most speech recognition models today are trained predominantly on English audio data and are therefore inherently biased, Gladia prioritized building the first real-time product that is truly multilingual. The new fine-tuned engine delivers advanced real-time transcription in over 100 languages, along with enhanced support for accents and the unique ability to adapt to different languages on the fly. Gladia’s new engine is unique in its ability to extract insights from a call — like the caller’s sentiment, key information, and a conversation summary — in real time. This means it takes less than a second to generate both a transcript and insights from a call or meeting using Gladia.

New real-time AI transcription

Gladia founders Jonathan Soto (left) and Jean-Louis Quéguiner.

Building an accurate, low-latency, and multilingual engine in-house is a complex and resource-intensive task. It requires extensive expertise in language understanding and real-time data handling, along with continuous optimization and maintenance. Real-time models require more computing power and may struggle to produce accurate output immediately due to limited context. Gladia’s new product allows companies to bypass these challenges.
The real-time speech-to-text engine boasts an industry-leading latency of under 300 milliseconds without compromising accuracy, regardless of the language, geography, or tech stack used. “Companies are spending valuable time and resources trying to incorporate multiple AI functions into their existing platforms,” said Jonathan Soto, CTO of Gladia, in a statement. “Our single API is compatible with all existing tech stacks and protocols, including SIP, VoIP, FreeSwitch, and Asterisk. This allows us to easily integrate real-time transcription and analysis into our customers’ AI platforms, so they can focus on delivering the best services to their end users.”

What’s ahead

The company’s first async transcription and audio intelligence API launched in June 2023 and was based on a proprietary version of Whisper ASR. It rapidly gained traction in the enterprise market, particularly with meeting recorders and note-taking assistants. The API has now been adopted by over 600 customers around the world, including Attention, Circleback, Method Financial, Recall, Sana, and VEED.IO, and has more than 70,000 users. “Gladia’s technology allows companies in vertical markets that need cutting-edge real-time transcription, including sales enablement and contact center platforms, to shift seamlessly from manual post-call processing to proactive, low-latency workflows,” Quéguiner said. “Whether it’s automated CRM enrichment or real-time guidance for support agents, Gladia is designed to help businesses operate smarter and more efficiently in record time, without requiring AI expertise in-house.”

Gladia will use the new capital to advance its R&D efforts, soon bring to market a one-stop AI toolkit for audio, and expand its product offering with additional à la carte models — including large language models (LLMs) and retrieval-augmented generation (RAG). With several design partners in the contact-center-as-a-service (CCaaS) segment, the company is currently piloting an agent-assist solution powered by Gladia’s real-time AI engine. Additionally, Gladia will continue to expand its talent base as it prepares for international expansion.

“We are multilingual, and we have something that is called ‘code switching,’ which makes it unique,” Quéguiner said. “You can start with the language and switch to another.” He went on to show me that he could start a call in English and initiate the transcription. Then he spoke French, and the system correctly transcribed it in French. “Keep in mind that [others] are not real time right now, and this one is real time,” he said. “Usually, real time is a little bit less accurate. You can also have your own custom vocabulary in real time, which is pretty unusual, with us. We have the capability to extract some real-time insights.” The service has an AI summarizer, and it will have new optional features in the coming months. Quéguiner said that his service can also get acronyms right and detect the switch to another language. “The model we use is very similar to LLMs (large language models). It has an encoder-decoder architecture, which
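To illustrate what a sub-second streaming speech-to-text integration generally involves, here is a hypothetical sketch of a client that opens a WebSocket, pushes small audio chunks and reads partial transcripts as they arrive. The endpoint URL, message fields and authentication scheme are placeholders for illustration, not Gladia’s documented API.

```python
# pip install websockets
import asyncio
import json
import websockets

AUDIO_CHUNK_MS = 100  # send ~100 ms of audio per message to keep end-to-end latency low

async def stream_transcription(audio_chunks, url="wss://example.com/v2/live", token="YOUR_API_KEY"):
    """Push raw PCM audio chunks and print partial transcripts as the server emits them."""
    async with websockets.connect(f"{url}?token={token}") as ws:
        # Typical pattern: announce the audio format first, then stream binary frames.
        await ws.send(json.dumps({"sample_rate": 16000, "encoding": "pcm_s16le", "language": "auto"}))

        async def sender():
            for chunk in audio_chunks:          # bytes of PCM audio, e.g. from a microphone
                await ws.send(chunk)
                await asyncio.sleep(AUDIO_CHUNK_MS / 1000)

        async def receiver():
            async for message in ws:            # ends when the server closes the connection
                event = json.loads(message)
                if event.get("type") == "partial_transcript":
                    print("partial:", event.get("text", ""))

        await asyncio.gather(sender(), receiver())

# Usage (sketch): asyncio.run(stream_transcription(chunks_from_microphone()))
```

The design point behind the sub-300-millisecond latency figure is exactly this kind of incremental exchange: the client never waits for a full recording, and the server returns hypotheses that are refined as more audio arrives.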


PicsArt’s creative AI playbook: A vision for contextual intelligence, AI agents

Whether you’re an Android or an iOS person, you have probably heard of PicsArt. The platform launched more than a decade ago and has become one of the go-to services for all things image and video editing, with more than 150 million monthly active users.

However, it hasn’t been an easy journey for the company. Despite being an early mover in the smartphone-based editing domain, it has faced significant competition from players like Canva and Adobe, which have been playing a cat-and-mouse game with it for quite some time, building their own similar products. When I spoke with Artavazd Mehrabyan, the CTO of the company, at the recent WCIT conference in Armenia, he was pretty vocal about the challenges, saying it is tough to be, or at least to stay, different for long in this market. “A lot of things that PicsArt had before were copied into the competitors. PicsArt was the first all-in-one editing service on mobile. There was no other player before 2011. We started with this approach and it was copied, among many other things,” Mehrabyan said. He pointed out that the same is happening with AI, where competitors, including mainstream photo services, are offering very similar capabilities. For example, PicsArt offers object generation, allowing users to use advanced AI to create required photo elements. The same capability has also been incorporated into other products in the category, creating an overlap of sorts.

PicsArt AI GIF generator

However, instead of pushing to stand out by adding more tools to its existing batch of over two dozen AI capabilities, the company is looking to make a mark on users by improving the quality of what it is delivering. Specifically, Mehrabyan said, the focus is on how they are productizing and tailoring the features to help customers get to their goal, whether they want to remove a specific object from a vacation image or generate visually appealing advertisements, complete with images and copy.

Training high-quality creative AI

In the early days, before AI was a major focus, Mehrabyan said most of PicsArt’s technology research and effort went toward making mobile-based editing seamless. “It was very hard to get all these editing functionality working on the device offline. Then, the next challenge was to scale our ecosystem and infrastructure to support a surging user base. This took us to hybrid infrastructure. We started with multi-cloud and a data center, which, till now, continues to be the best solution as it’s more cost-efficient, highly performant and very flexible,” Mehrabyan explained. With this tech stack in place, the company launched its first AI feature in 2016, running a set of small models offline on user devices. This gradually transformed into a large-scale AI effort, with the company becoming an AI-first organization and leveraging its infrastructure and backend services to serve larger models and APIs for more advanced capabilities like background removal and replacement. More recently, with the generative AI wave taking shape, PicsArt started training its own creative AI models from scratch. In the creative domain, it is very easy to lose a user. A small error here or there (leading to low-quality results) and there’s a good chance the person won’t come back again. To prevent this, PicsArt is extremely focused on the data side of things.
It is selectively using data from its own network – marked by users as public and free to edit – for training the AI models. “We have a special ‘free to edit’ license. If you are posting publicly and tagging your image – from stock photo across any category to a sticker or background – as free to edit, it allows another user of the service to reuse or work on top of it. So, in essence, the user is contributing this image to the community and PicsArt itself,” Mehrabyan said. The license has been in place from the early days of the service and has given PicsArt a massive stock of user-generated content for training AI. However, as the CTO pointed out, not all of that content is of high quality and ready to use right away. The data has to pass through multiple layers of cleansing and processing, both manual and AI-driven, to be transformed into a safe, training-ready dataset. “At the end of this, we have quite a big dataset that is proprietary to PicsArt. We don’t need to have additional data,” he said.

However, having a large volume of high-quality data in hand was just one part of the puzzle. The real challenge for PicsArt, as Mehrabyan described, was to build the “data flywheel”: a self-reinforcing cycle covering not only data accessibility but also aspects like how to annotate data, how to use it and eventually how to leverage it as part of a continuous learning process to improve over time. Establishing a feedback loop to achieve this was a long and complex process, he said. “We built our own annotation technology. We internally developed all related infrastructure and ecosystem technologies, including those for identifying and classifying images, tagging them and adding different types of labels to them,” Mehrabyan said. “Then, we created a team to help refine the pipeline and give feedback over time. It’s mostly been very automatic, AI-driven with human feedback in between so that we can have continuous improvement.”

Feedback loop leads to contextual intelligence

While the human-driven feedback loop has been a critical part of improving PicsArt’s products – enhancing the quality of the outputs they generate – it is also taking the company toward what Mehrabyan calls “contextual intelligence,” or the ability of the platform to understand user needs and deliver exactly what they want. This function is particularly important for the platform’s growing base of business-focused users who are looking to get work done right on their smartphones, whether that’s generating graphics or a full-fledged ad for a social media campaign. The platform is still mostly used by
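Stepping back to the data pipeline described earlier: the layers of cleansing and processing Mehrabyan mentions can be pictured as a simple filter-and-label pass over user-contributed content. The sketch below is a generic, hypothetical illustration of that idea; the field names, license flag and thresholds are invented and are not PicsArt’s internal system.

```python
from dataclasses import dataclass

@dataclass
class UserImage:
    image_id: str
    license: str        # e.g. "free_to_edit" when the user tags it as reusable
    width: int
    height: int
    nsfw_score: float   # from an automated safety classifier, 0.0 to 1.0
    tags: list

MIN_SIDE = 512          # drop tiny images that hurt training quality
NSFW_THRESHOLD = 0.2    # drop anything the safety model flags

def is_training_ready(img: UserImage) -> bool:
    """Apply the licensing, size and safety filters in order."""
    if img.license != "free_to_edit":
        return False                      # only community-contributed content
    if min(img.width, img.height) < MIN_SIDE:
        return False                      # too small to be useful
    if img.nsfw_score > NSFW_THRESHOLD:
        return False                      # fails automated safety review
    return True

def build_dataset(candidates):
    """Keep passing images and attach whatever labels the annotation stage produced."""
    return [
        {"id": img.image_id, "labels": img.tags}
        for img in candidates
        if is_training_ready(img)
    ]

sample = [
    UserImage("a1", "free_to_edit", 1024, 768, 0.01, ["beach", "sunset"]),
    UserImage("a2", "all_rights_reserved", 2048, 2048, 0.0, ["portrait"]),
]
print(build_dataset(sample))  # only "a1" survives the filters
```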


LLMs can’t outperform a technique from the 70s, but they’re still worth using

This year, our team at the MIT Data-to-AI Lab decided to try using large language models (LLMs) to perform a task usually left to very different machine learning tools: detecting anomalies in time series data. This has been a common machine learning (ML) task for decades, used frequently in industry to anticipate and find problems with heavy machinery. We developed a framework for using LLMs in this context, then compared their performance to 10 other methods, from state-of-the-art deep learning tools to a simple method from the 1970s called autoregressive integrated moving average (ARIMA). In the end, the LLMs lost to the other models in most cases — even the old-school ARIMA, which outperformed them on seven datasets out of a total of 11. For those who dream of LLMs as a totally universal problem-solving technology, this may sound like a defeat. And for many in the AI community — who are discovering the current limits of these tools — it is likely unsurprising. But there were two elements of our findings that really surprised us. First, the LLMs’ ability to outperform some models, including some transformer-based deep learning methods, caught us off guard. The second and perhaps even more important surprise was that, unlike the other models, the LLMs did all of this with no fine-tuning. We used GPT-3.5 and Mistral LLMs out of the box, and didn’t tune them at all.

LLMs broke multiple foundational barriers

For the non-LLM approaches, we would train a deep learning model, or the aforementioned 1970s model, using the signal for which we want to detect anomalies. Essentially, we would use the historical data for the signal to train the model so it understands what “normal” looks like. Then we would deploy the model, allowing it to process new values for the signal in real time, detect any deviations from normal and flag them as anomalies.

LLMs did not need any previous examples

But when we used LLMs, we did not do this two-step process — the LLMs were not given the opportunity to learn “normal” from the signals before they had to detect anomalies in real time. We call this zero-shot learning. Viewed through this lens, it’s an incredible accomplishment. The fact that LLMs can perform zero-shot learning — jumping into this problem without any previous examples or fine-tuning — means we now have a way to detect anomalies without training specific models from scratch for every single signal or a specific condition. This is a huge efficiency gain, because certain types of heavy machinery, like satellites, may have thousands of signals, while others may require training for specific conditions. With LLMs, these time-intensive steps can be skipped completely.

LLMs can be directly integrated in deployment

A second, perhaps more challenging part of current anomaly detection methods is the two-step process employed for training and deploying an ML model. While deployment sounds straightforward enough, in practice it is very challenging. Deploying a trained model requires that we translate all the code so that it can run in the production environment. More importantly, we must convince the end user, in this case the operator, to allow us to deploy the model. Operators themselves don’t always have experience with machine learning, so they often consider this to be an additional, confusing item added to their already overloaded workflow.
They may ask questions such as “how frequently will you be retraining,” “how do we feed the data into the model,” “how do we use it for various signals and turn it off for others that are not our focus right now,” and so on. This handoff usually causes friction, and ultimately results in not being able to deploy a trained model. With LLMs, because no training or updates are required, the operators are in control. They can query with APIs, add signals that they want to detect anomalies for, remove ones for which they don’t need anomaly detection and turn the service on or off without having to depend on another team. This ability for operators to directly control anomaly detection will change difficult dynamics around deployment and may help to make these tools much more pervasive.

While improving LLM performance, we must not take away their foundational advantages

Although they are spurring us to fundamentally rethink anomaly detection, LLM-based techniques have yet to perform as well as the state-of-the-art deep learning models, or (for seven datasets) the ARIMA model from the 1970s. This might be because my team at MIT did not fine-tune or modify the LLM in any way, or create a foundational LLM specifically meant to be used with time series. While all those actions may push the needle forward, we need to be careful about how this fine-tuning happens so as not to compromise the two major benefits LLMs can afford in this space. (After all, although the problems above are real, they are solvable.) With this in mind, though, here is what we cannot do to improve the anomaly detection accuracy of LLMs:

Fine-tune the existing LLMs for specific signals, as this will defeat their “zero-shot” nature.

Build a foundational LLM to work with time series and add a fine-tuning layer for every new type of machinery.

These two steps would defeat the purpose of using LLMs and would take us right back to where we started: having to train a model for every signal and facing difficulties in deployment. For LLMs to compete with existing approaches — in anomaly detection or other ML tasks — they must either enable a new way of performing a task or open up an entirely new set of possibilities. To prove that LLMs with any added layers will still constitute an improvement, the AI community has to develop methods, procedures and practices to make sure that improvements in some areas don’t eliminate LLMs’ other advantages. For classical ML, it took almost 2
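For readers who want to see what the old-school baseline looks like, here is a minimal sketch of ARIMA-based anomaly detection with statsmodels: fit the model on the signal, then flag points whose residuals fall far outside the typical range. The synthetic signal, order and threshold are illustrative defaults, not the settings used in the study described above.

```python
# pip install statsmodels numpy
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Synthetic signal: a noisy sine wave with one injected spike.
rng = np.random.default_rng(0)
signal = np.sin(np.linspace(0, 20 * np.pi, 500)) + rng.normal(0, 0.1, 500)
signal[350] += 3.0  # inject an obvious anomaly

# Fit ARIMA on the series; in production you would fit on history only,
# then score new values as they arrive.
model = ARIMA(signal, order=(5, 0, 1)).fit()
residuals = signal - model.fittedvalues

# Flag points whose residual is far from typical (here, > 4 standard deviations).
threshold = 4 * residuals.std()
anomalies = np.where(np.abs(residuals) > threshold)[0]
print("anomalous indices:", anomalies)  # should include index 350
```

The contrast with the zero-shot LLM setup described above is exactly the point: this baseline must be fitted (and refitted) per signal, which is where the deployment friction comes from.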


Lidwave raises $10M to improve machine vision with on-chip 4D LiDAR

Lidwave has raised $10 million to improve machine vision, whether that means spotting pedestrians in a busy streetscape or helping a robot in a factory see its surroundings more clearly. The technology is dubbed 4D LiDAR, and Lidwave is working on taking complex LiDAR sensors and putting them on a chip, said Yehuda Vidal, Lidwave’s CEO, in an interview with GamesBeat. Jumpspeed Ventures and Next Gear Ventures led the round, with strategic investment from a leading Swedish truck manufacturer. The investment underscores the significance of Lidwave’s technology and approach in advancing the future of machine vision. Lidwave will use the new funding to further develop its optical chip, launch the industry’s first software-definable 4D LiDAR sensor, and expand its market presence. “This investment marks a significant milestone for Lidwave, propelling us closer to our goal of revolutionizing machine vision,” said Vidal. “Our 4D LiDAR chip not only sets a new standard for sensor performance but also makes advanced perception technology accessible to the mass market. We are thrilled to have the support of visionary investors who share our mission to enhance safety and productivity across various industries.”

The challenge

Lidwave is putting 4D LiDAR components on a single chip.

Sensors with machine vision are critical across many industries, and there is a consensus that LiDAR (Light Detection and Ranging) sensors are essential for autonomous machines across various fields. LiDAR is a remote sensing technology that uses a laser to measure distances and create 3D models of the space near the sensor. A LiDAR system emits a laser pulse, which reflects off objects and is detected by a receiver. The time it takes for the light to return is used to calculate the distance to the object, so it can be used to map the space in front of a LiDAR-equipped car. However, the technology’s full potential remains untapped due to high costs, complexity, and reliability issues. Legacy LiDAR systems are complicated, comprising dozens of elements, including arrays of lasers, detectors, and optical components, assembled through a complex and costly process. This results in high-end LiDAR units costing thousands (sometimes tens of thousands) of dollars, limiting widespread adoption across industries ranging from automotive, transportation and traffic management to industrial automation, ports and railways.

Lidwave’s answer

Lidwave is trying to take LiDAR to the mass market with small chips.

Lidwave addresses these challenges with its novel technology, marking what it calls a new era: LiDAR 2.0, an affordable system-on-chip LiDAR designed for the mass market. Lidwave’s proprietary Finite Coherent Ranging (FCR) technology integrates all critical components onto a single chip, simplifying production and drastically reducing costs. FCR allows Lidwave to integrate key components onto a single chip by treating light as a wave, rather than using traditional photon counting. This approach allows for precise measurement of both range and velocity while offering high-resolution data that helps systems understand their surroundings with greater clarity, and it provides immunity to external interference. By combining lasers, amplifiers, receivers, and optical routing onto one chip, Lidwave not only reduces production costs but also makes this powerful technology more accessible and reliable for a wide range of industries.
Moreover, unlike conventional LiDARs, Lidwave’s coherent sensing method provides Doppler (velocity) data at the pixel level alongside depth information, enabling machines to perceive and understand their surroundings with unmatched clarity, leading to better-informed decisions.

Origins

Lidwave’s founders (left to right): Yossi Kabessa, Uri Weiss and Yehuda Vidal.

Vidal cofounded Lidwave in 2021 with Yossi Kabessa (CTO) and Uri Weiss (chief scientist) in Jerusalem. The company has fewer than 20 people. “Our core knowledge is in coherent optics. It’s a regime of optics that utilizes quantum phenomena to use with light for imaging purposes. We saw that LiDAR is a very complex machine that costs tens of thousands of dollars for a high-end system,” Vidal said. The variety of LiDAR sensors is wide, from small ones in smartphones for face recognition to long-range models for cars that can detect objects more than 100 meters away. Since LiDAR is based on a laser, it has optical components that are not so easily converted to silicon chips. Lidwave is a fabless chip company, meaning it designs chips and has them fabricated by contract chip manufacturers. Sensors for cars and robots need to see better. “We have more than 10 years expertise in the specific domain of coherent optics, which allows us to do this on a chip,” Vidal said.

The 4D refers to time, the fourth dimension, which means capturing spatial data over time for something like a moving car. The sensor can thus use Doppler measurements to capture information like velocity. With this additional data, the sensor can clean up an image: it is in higher resolution, and in a demo Vidal showed me, objects rendered in blue were moving toward the sensor while objects in red were moving away. Lidwave’s name reflects its focus on coherent light, measuring light as a wave rather than as particles, which is what helps it extract velocity along with depth. “This is the fourth dimension that we provide,” he said. “We still use the light, but we use it differently.” The applications range from self-driving cars to industrial automation and smart cities, as it’s very useful to figure out the status of a moving object in many different scenarios.

Investor interest

Lidwave is designing LiDAR for a single chip.

“We recognized the potential of LiDAR technology many years ago, but only now, with Lidwave, there is a clear pathway to scalability and wide adoption,” said Ben Wiener, founding partner at Jumpspeed Ventures, in a statement. “Lidwave’s revolutionary 4D chip overcomes the barriers of legacy LiDARs, reducing the complexities and costs associated with their deployment. We pride ourselves on investing in cutting-edge technologies that are positioned to fundamentally transform industries, and with this in mind, we look forward to the impact Lidwave will make.” Lidwave’s seed
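The velocity channel Vidal describes follows from the Doppler relation for coherent sensing: a target moving at radial velocity v shifts the returned light’s frequency by Δf = 2v/λ, so v = Δf·λ/2. Here is a tiny sketch of that conversion, using an illustrative wavelength and frequency shift rather than Lidwave’s specifications.

```python
WAVELENGTH_M = 1550e-9   # 1550 nm, a wavelength commonly used in coherent LiDAR (illustrative)

def radial_velocity(doppler_shift_hz: float, wavelength_m: float = WAVELENGTH_M) -> float:
    """v = Δf * λ / 2; a positive shift means the target is approaching the sensor."""
    return doppler_shift_hz * wavelength_m / 2.0

# A measured shift of ~12.9 MHz at 1550 nm corresponds to roughly 10 m/s toward the sensor.
print(radial_velocity(12.9e6))  # ≈ 10.0 m/s
```

This per-return velocity is what lets the demo color approaching objects blue and receding ones red without comparing successive frames.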


The one question you need to ask ChatGPT right now

Do you use ChatGPT regularly? Do you have the “memory” feature turned on — which allows the chatbot from OpenAI to recall important information about you and your preferences? If so, navigate over to it when you have a free moment and enter the following question: “From all of our interactions what is one thing that you can tell me about myself that I may not know about myself?”

The question was first proposed on the social network X by Tom Morgan, founder of The Leading Edge newsletter and former director of client communications and marketing at Sapient Capital wealth management. The answers ChatGPT provides in response to this prompt may surprise and even move you with their insight into your character and work style. Presumably, it could work with other AI chatbots and assistants with persistent memory, such as Anthropic’s Claude 3.5 Sonnet. For example, here’s what it responded when I asked it this very question (using the GPT-4o model, the default paid one). Other users have reported similarly moving, insightful responses. Even OpenAI co-founder and CEO Sam Altman remarked on the trend on his account on X, stating “love this” and quote-posting Morgan’s original post.

Yet others, such as AI researcher and expert Simon Willison, disagree that the trend reveals anything particularly insightful about the user. Posting on X as well, Willison likened the responses to a “horoscope generator.” However, I disagree with this take, as at least in my case — and presumably for all those who have ChatGPT’s memory feature enabled (read how to turn it on here) — the chatbot is taking into account whatever is stored in its memory to answer you. Even if it does not derive insights from every single interaction you have with it, it clearly knows information about you that it can use to attempt an introspective, value-judgment answer (as evidenced by the fact that my response noted I was a journalist).

Still others have posted variations on the original question proposed by Morgan, noting the curious user would do well to check out the variation in responses by switching the underlying model powering ChatGPT from the default GPT-4o or 4o mini to OpenAI’s new o1-preview reasoning model. Others have altered the prompt to receive brutal criticism and honesty. And still others have completely different ideas for questions you could ask the chatbot, such as requesting it “roast” you in the style of a Comedy Central special.

Regardless of which style of question you decide to ask ChatGPT, or any AI chatbot for that matter, regular users might find it interesting, amusing, and potentially revealing to learn what the chatbot says it knows about you — and more importantly, it may inspire you to think differently about yourself today. Altogether, the interest in using AI models to find out more about ourselves and our own habits reveals how much potential they have, far beyond simply assisting with work or school assignments. Indeed, as the generative AI era approaches its second anniversary (since the November 2022 launch of ChatGPT), this question and the others like it show just how much AI has become embedded into the fabric of our lives and society, and how the more we use it, the more interesting new uses people find for it.


Strella raises $4 million to automate market research with AI-powered customer interviews

Strella, a startup using artificial intelligence to automate and accelerate customer research, announced today that it has raised $4 million in seed funding led by Decibel, with participation from Unusual Ventures. The company’s AI-powered platform aims to deliver human insights up to 10 times faster and at half the cost of traditional research methods. Founded by Lydia Hylton and Priya Krishnan, Strella is tackling a long-standing challenge in market research: the trade-off between speed and depth of customer insights. The company’s AI moderator can conduct interviews, analyze responses, and synthesize findings in real time, dramatically condensing timelines for gathering qualitative feedback.

Strella’s AI-powered interview platform prepares to connect with a participant for a study on online grocery shopping habits. The interface showcases the blend of technology and human interaction that defines the company’s approach to market research. (Credit: Strella)

AI interviews: The future of scalable qualitative research

“Traditionally, if you wanted any scale in a customer research project, you had to run surveys. It’s way too painful to do human-led interviews if you want to have 30, 40, 50 interviews on a topic,” said Lydia Hylton, co-founder and CEO of Strella, in an interview with VentureBeat. “We’re now able to get the richness of qualitative feedback that you get from a conversation, but at the scale of a survey and at the speed of a survey.” The platform is designed to work alongside human researchers, allowing companies to blend AI-moderated and human-led interviews within the same system. This flexibility addresses concerns about losing the human touch in customer interactions. “We’ve designed our platform to be conducive to human-centered research as well,” explained Priya Krishnan, co-founder of Strella. “Let’s say you want to run a research project and you want to interview 10 of your customers, we give you the flexibility to choose to use the AI moderator as much or as little as you want.”

Strella’s AI-powered interview platform showing a customizable questionnaire for online grocery shopping habits. The interface allows researchers to easily add questions, tasks, and media elements to gather comprehensive customer insights. (Credit: Strella)

Enhancing customer feedback: Strella’s approach

Strella’s method could significantly alter how companies gather customer feedback and inform product decisions. By lowering the time and cost barriers to qualitative research, the platform may enable more frequent and comprehensive customer engagement across various industries. The company reports it has already signed on 15 customers, including notable names like Duolingo and Spanx. This early traction in both the tech and consumer goods sectors suggests broad applicability for Strella’s technology. Jessica Leao, partner at Decibel, highlighted the potential impact of Strella’s technology: “You get to transform this entire world of quantitative research into qualitative research, because you’re no longer blocked on time. You’re no longer blocked on scheduling.” However, Strella enters a competitive field. Established players like Qualtrics dominate in quantitative research, while numerous startups are leveraging AI for various aspects of market research.
Strella’s differentiation lies in its end-to-end automation of the qualitative research process, from interview moderation to insight synthesis.

The AI-driven future of market research: Opportunities and challenges

The funding round comes at a time of growing interest in AI applications for business intelligence. As companies seek to become more data-driven and customer-centric, tools that can rapidly deliver actionable insights are increasingly valuable. Looking ahead, Strella aims to expand its reach across industries and company sizes. “We really want customer research to be accessible for teams of all sizes, across industries,” Krishnan said. “Up until now, research has really only been something that medium to larger companies have had the resources to do.” As Strella emerges from stealth mode with this funding announcement, it faces a twofold challenge: proving its AI can consistently deliver high-quality insights across diverse research scenarios, and convincing businesses to shift away from established research methodologies. The company’s success hinges not just on technological prowess, but on its ability to change deeply ingrained corporate habits around customer feedback. If Strella can overcome these hurdles, it may usher in a new era where AI-driven qualitative research becomes as commonplace as surveys are today. In a business world increasingly driven by data, Strella’s approach could be the difference between companies that truly understand their customers and those that are left guessing.


SAP adds more open source LLM support, turns Joule into a collaborative agent

SAP announced today the expansion of its generative AI copilot Joule’s capabilities to support up to 80% of its customers’ most common business tasks, enabling it to act as a collaborative agent that can accomplish complex workflows. For customers to get the most value from Joule and future innovations, they must be on SAP cloud platforms and systems. SAP is accelerating product innovation on the cloud in the hopes of attracting more customers to its RISE with SAP initiative. Launched in January 2021, RISE with SAP aims to guide customers’ transition from on-premises SAP ERP systems to the cloud, modernizing processes along the way. SAP reported that its cloud revenue increased by 25% in Q2 2024, with the Cloud ERP Suite growing by 33% as a result, demonstrating that RISE is effective. Joule’s infusion of new features signals how serious SAP is about moving its customers to the cloud. SAP says on-premises customers can still use Joule; however, they will need to use the SAP Integration Suite to connect their existing infrastructure to SAP’s cloud services, enabling Joule to access and process data while extending AI capabilities to their on-premises environments. Additional announcements at TechEd 2024 introduced more open-source large language model (LLM) support as part of the SAP Generative AI Hub, introduced the SAP Knowledge Graph, showcased developer enhancements in SAP Build, highlighted specific use cases and reaffirmed the company’s commitment to upskill 2 million people by 2025.

SAP placing a strategic bet on Joule’s new agentic AI strengths

Designed as a cloud-native AI assistant at the core of SAP’s Business Technology Platform (BTP), Joule is built to integrate and scale with all current and future apps, modules and platform environments, further accelerating customers’ move to the cloud. SAP made a strategic bet with BTP, believing its customers would see the value of a unified cloud platform over legacy on-premises ERP systems, which earned a reputation for being challenging to integrate with real-time data and third-party applications. SAP’s doubling down on Joule shows it is working to reverse its proprietary ways of the past and pursue a more open, cloud-based architecture that can deliver the accuracy, speed and scale its customers need. Embedded across SAP’s ecosystem, Joule can already understand business contexts, deliver data-driven insights and enable customers to get more work done using its advanced natural language processing and machine learning (ML) capabilities. With 80% of the most common business tasks now part of Joule, SAP is betting its latest gen AI copilot will be compelling enough for more customers to join RISE and move to the cloud.

Source: Presentation at VB Transform 2024 by Yaad Oren.

SAP highlighted two use cases at TechEd 2024 to demonstrate the power of these agents:

Dispute management: AI agents autonomously resolve disputes related to invoices, credits and payments, significantly reducing manual intervention.

Financial accounting: Specialized agents streamline billing, ledger updates and invoice processing, ensuring accuracy and efficiency.

“Collaborative AI agents from SAP represent a new era in enterprise productivity,” said Philipp Herzig, Chief Artificial Intelligence Officer at SAP.
“Our ability to integrate multiple specialized AI agents into Joule allows businesses to automate intricate workflows and focus on tasks that truly require human ingenuity.”

Three new open-source models added to the SAP gen AI Hub

The open-source LLM announcements at TechEd 2024 show that SAP is continuing to develop its generative AI hub strategy. Open-source models now available on the SAP Generative AI Hub include Meta’s Llama 3.1 70B model, Mistral Large 2 (available by the end of 2024), and Mistral Codestral. The SAP Generative AI Hub, positioned as a central node within SAP’s Business Technology Platform (BTP), connects both proprietary and open-source models. Developer tools like the Extensibility Wizard and SAP Build enhancements streamline this integration, reflecting SAP’s push toward a developer-friendly environment.

Source: SAP

SAP continues to go on the offensive when it comes to providing more significant support for open-source LLMs. Its series of announcements this week at TechEd 2024 shows it is committed to keeping up with the quickening pace of innovation in open-source LLMs in particular, and across open source more broadly. One of the main goals of going on the offensive with open-source LLMs is to enable enterprise-level standards for customers to adopt while ensuring reliability, scalability, security and performance, all within the SAP AI Core. (A table summarizing the three open-source LLMs SAP announced support for during TechEd 2024 appeared here. Source: VentureBeat analysis)

SAP’s new AI era has arrived

Long known for its dominance in the ERP market, SAP shows signs of successfully reinventing itself in a new AI era. Lessons learned about usability, the need for a more open, adaptive system architecture, and the need to provide customers with more flexibility in how they use data, including open-source LLMs, now dominate its product strategies. Its Business Technology Platform with the SAP AI Core reflects a more forward-thinking SAP that realizes its quickest path to value is recognizing that customers need the freedom to go open source when they choose.


Microsoft just dropped Drasi, and it could change how we handle big data

Microsoft has launched Drasi, a new open-source data processing system designed to simplify the detection of, and reaction to, critical events in complex infrastructures. This release follows last year’s launch of Radius, an open application platform for the cloud, and further cements Microsoft’s commitment to open-source innovation in cloud computing. Mark Russinovich, CTO and Technical Fellow at Microsoft Azure, described Drasi as “the birth of a new category of data processing system” in an interview with VentureBeat. He explained that Drasi emerged from recognizing the growing complexity in event-driven architectures, particularly in scenarios like IoT edge deployments and smart building management.

From complexity to clarity

“We saw massive simplification of the architecture, just incredible developer productivity,” Russinovich said, highlighting Drasi’s potential to reduce the complexity of reactive systems. Drasi works by continuously monitoring data sources, evaluating incoming changes through predefined queries and executing automated reactions when specific conditions are met. This approach eliminates the need for inefficient polling mechanisms or constant data source querying, which can lead to performance bottlenecks in large-scale systems. The system’s key innovation lies in its use of continuous database queries to monitor state changes. “What Drasi does is takes that and says, I just have a database query… and when an event comes in… Drasi knows, ‘Hey, part of this query is satisfied,’” Russinovich explained.

Open-source synergy

Microsoft’s decision to release Drasi as an open-source project aligns with its broader strategy of contributing to the open-source community, particularly in cloud-native computing. This strategy is evident in the recent launch of Radius, which addresses challenges in deploying and managing cloud-native applications across multiple environments. “We believe in contributing to the open-source community because… many enterprises are making strategies that are, especially around Cloud Native Computing, centered on open-source software and open governance,” Russinovich said. The Azure Incubations team, responsible for both Drasi and Radius, has a track record of launching successful open-source projects, including Dapr, KEDA and Copacetic. These projects are all available through the Cloud Native Computing Foundation (CNCF). While Radius focuses on application deployment and management, Drasi tackles the complexities of event-driven architectures. Together, these tools represent Microsoft’s holistic approach to addressing the challenges faced by developers and operations teams in modern cloud environments.

Drasi’s continuous queries usher in a new era of reactive systems

Looking ahead, Russinovich hinted at the possible integration of Drasi into Microsoft’s data services. “It looks like it’ll probably slot into our data services, where you have Drasi integrated into Postgres database or Cosmos DB, or as a standalone service that integrates across these,” he said. The introduction of Drasi could have significant implications for businesses grappling with the complexities of cloud-native development and event-driven architectures.
By simplifying these processes, Microsoft aims to enable organizations to build more responsive and efficient applications, potentially leading to improved operational efficiency and faster time to market for new features. As with Radius, Microsoft is actively seeking feedback from partners and early adopters to refine Drasi and address any scaling, performance, or security concerns that may arise in production environments. The true test for both tools will be their adoption and performance in real-world scenarios across various cloud providers and on-premises environments. As businesses increasingly rely on cloud-native applications and real-time data processing, tools like Drasi and Radius could play a crucial role in managing the growing complexity of modern software systems. Whether Drasi will indeed establish itself as a new category of data processing system, as Russinovich suggests, remains to be seen, but its introduction marks another significant step in Microsoft’s ongoing efforts to shape the future of cloud computing through open-source innovation.
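Conceptually, the pattern Russinovich describes, a standing query re-evaluated incrementally as each change arrives instead of polling the source, can be illustrated in a few lines of code. This is a generic sketch of the idea, not Drasi’s actual implementation (Drasi evaluates predefined continuous queries against connected data sources and wires reactions to them declaratively).

```python
from typing import Callable, Dict

class StandingQuery:
    """Keep a predicate's result set up to date as individual changes stream in."""

    def __init__(self, predicate: Callable[[dict], bool], on_enter: Callable[[dict], None]):
        self.predicate = predicate
        self.on_enter = on_enter          # reaction fired when an item newly satisfies the query
        self.matches: Dict[str, dict] = {}

    def apply_change(self, item: dict) -> None:
        """Called once per change event; the underlying source is never polled."""
        was_matching = item["id"] in self.matches
        now_matching = self.predicate(item)
        if now_matching and not was_matching:
            self.matches[item["id"]] = item
            self.on_enter(item)           # e.g. send an alert, call a webhook
        elif was_matching and not now_matching:
            del self.matches[item["id"]]

# Example: react whenever a building zone's temperature crosses 30 degrees C.
query = StandingQuery(
    predicate=lambda zone: zone["temperature_c"] > 30,
    on_enter=lambda zone: print(f"ALERT: {zone['id']} overheating at {zone['temperature_c']} C"),
)

for event in [
    {"id": "zone-1", "temperature_c": 24},
    {"id": "zone-1", "temperature_c": 31},   # newly satisfies the query, so the reaction fires
    {"id": "zone-1", "temperature_c": 26},   # drops back out of the result set
]:
    query.apply_change(event)
```

The point of the pattern is that the reaction logic runs only when a change actually moves something into or out of the query’s result set, which is what removes the polling bottleneck the article describes.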
