The ‘strawberrry’ problem: How to overcome AI’s limitations

By now, large language models (LLMs) like ChatGPT and Claude have become household names across the globe. Many people have started worrying that AI is coming for their jobs, so it is ironic to see almost all LLM-based systems flounder at a straightforward task: counting the number of “r”s in the word “strawberry.” They do not fail exclusively on the letter “r”; other examples include counting “m”s in “mammal” and “p”s in “hippopotamus.” In this article, I will break down the reason for these failures and provide a simple workaround.

LLMs are powerful AI systems trained on vast amounts of text to understand and generate human-like language. They excel at tasks like answering questions, translating languages, summarizing content and even generating creative writing by predicting and constructing coherent responses based on the input they receive. LLMs are designed to recognize patterns in text, which allows them to handle a wide range of language-related tasks with impressive accuracy.

Despite their prowess, failing to count the number of “r”s in the word “strawberry” is a reminder that LLMs are not capable of “thinking” like humans. They do not process the information we feed them the way a human would.

Conversation with ChatGPT and Claude about the number of “r”s in strawberry.

Almost all current high-performance LLMs are built on transformers. This deep learning architecture doesn’t directly ingest text as input. Instead, it uses a process called tokenization, which transforms the text into numerical representations, or tokens. Some tokens might be full words (like “monkey”), while others could be parts of a word (like “mon” and “key”). Each token is like a code that the model understands. By breaking everything down into tokens, the model can better predict the next token in a sentence.
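To make the tokenization point concrete, here is a toy sketch in Python. The token splits and IDs below are invented for illustration, not the output of any real tokenizer:

```python
# Toy illustration: the model sees token IDs, not letters.
# These splits and IDs are invented, not any real tokenizer's output.
hypothetical_vocab = {"straw": 4205, "berry": 9985}

model_view = [hypothetical_vocab[t] for t in ("straw", "berry")]
print(model_view)  # [4205, 9985] -- the three "r"s are invisible in this view

# Plain code, by contrast, operates on the characters themselves:
print("strawberry".count("r"))  # 3
```

From the model’s side, there are only two integers; the three “r”s exist only at the character level, which is exactly where plain code operates.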
LLMs don’t memorize words; they learn how these tokens fit together in different ways, which makes them good at guessing what comes next. In the case of the word “hippopotamus,” the model might see the tokens “hip,” “pop,” “o” and “tamus”, and not know that the word “hippopotamus” is made of the letters “h”, “i”, “p”, “p”, “o”, “p”, “o”, “t”, “a”, “m”, “u”, “s”. A model architecture that could look directly at individual letters without tokenizing them might not have this problem, but for today’s transformer architectures, it is not computationally feasible.

Further, consider how LLMs generate output text: they predict the next word based on the previous input and output tokens. While this works for generating contextually aware, human-like text, it is not suitable for simple tasks like counting letters. When asked for the number of “r”s in the word “strawberry”, LLMs are purely predicting the answer based on the structure of the input sentence.

Here’s a workaround

While LLMs might not be able to “think” or logically reason, they are adept at understanding structured text. A splendid example of structured text is computer code, in any of many programming languages. If we ask ChatGPT to use Python to count the number of “r”s in “strawberry”, it will most likely get the correct answer. When LLMs need to do counting or any other task that may require logical reasoning or arithmetic computation, the broader software can be designed so that the prompts ask the LLM to use a programming language to process the input query.

Conclusion

A simple letter-counting experiment exposes a fundamental limitation of LLMs like ChatGPT and Claude. Despite their impressive capabilities in generating human-like text, writing code and answering any question thrown at them, these AI models cannot yet “think” like a human.
The experiment shows the models for what they are: pattern-matching predictive algorithms, not “intelligence” capable of understanding or reasoning. However, prior knowledge of what types of prompts work well can alleviate the problem to some extent. As the integration of AI into our lives increases, recognizing its limitations is crucial for responsible usage and realistic expectations of these models.

Chinmay Jog is a senior machine learning engineer at Pangiam.


Navigating AI and APIs in Telecommunications: Critical Questions and Insights

The telecommunications sector stands at the forefront of technological innovation, especially in the realms of Artificial Intelligence (AI) and Application Programming Interfaces (APIs). The World Economic Forum highlights the critical role of telecommunications in managing AI risks and ensuring the security of vital infrastructure, provided data protection and ethical considerations are prioritized. At a time when AI and network APIs are poised to reshape the telecommunications landscape, numerous pivotal questions emerge. Watch the on-demand webinar: Revenue Enablers for the Future Telco: APIs, AI, and Emerging Tech. To address this curiosity and foster a deeper understanding, we’ve compiled a list of the most pressing questions currently dominating the industry discourse. Let’s dig into the answers that will shape the future of telecommunications.

Q: What is Generative AI (GenAI) and how can it be used in telecommunications?
GenAI involves algorithms that enable computers to create new content from existing data. It’s being evaluated for use in enhancing customer engagement, network optimization, and creating new services.

Q: What are the investment trends in GenAI among telco companies?
More than a third of the companies surveyed are doing initial testing of GenAI models and focused proofs of concept, and almost a fifth are investing significantly in GenAI. There’s a notable interest in developing GenAI use cases, especially in network optimization and customer service enhancements.

Q: How are telcos leveraging network APIs for monetization and service improvement?
Telcos are using network APIs to deliver new applications, improve customer experiences, and partner with developers for B2B2X applications. The adoption of 5G and network APIs is seen as a significant opportunity for monetization.

Q: What are the challenges and strategies for telco companies in adopting AI and APIs?
Challenges include skills gaps, data privacy concerns, and ensuring API interoperability. Strategies involve partnering with hyperscalers, focusing on customer-centric values like transparency and empowerment, and investing in cloud-native technologies and advanced orchestration solutions. AI requires data as an input, so breaking down data silos to create a readily available single source of truth becomes critical to any AI strategy.

Q: What is the future outlook for AI and API adoption in the telco sector?
AI and API adoption is expected to drive network and operational efficiency, enhance customer experiences, and open new revenue streams through innovative services and partnerships. Collaboration with technology partners and a focus on ecosystem-driven approaches are key to leveraging these technologies effectively.

Empowering Your Strategy with IDC Tools

Planning: Understand the Total Addressable Market (TAM) and Serviceable Available Market (SAM) for informed business decisions with IDC’s custom data and market models.
Marketing: Develop a comprehensive messaging strategy that aligns with your campaigns and sales activities. IDC’s marketing messaging workshops can guide your thematic planning.
Sales Enablement: Equip your sales team with the skills to showcase GenAI features effectively and address objections through IDC’s sales mastery classes and GenAI sales playbook.

For a deeper dive into how AI and APIs are revolutionizing the telecommunications industry, watch the on-demand webinar “Revenue Enablers for the Future of Telco: APIs, AI and Emerging Tech” to unlock valuable insights and answers to your most pressing questions.


Get Ready For GenAI Chatbots: The State Of Conversational AI

Talk about change! As we approach the two-year anniversary of OpenAI’s launch of GPT-3.5, conversational AI has been reinvented to incorporate generative AI (genAI) and take advantage of the many ways that this technology can make self-service applications smarter. Previously, the conversational AI tools used to create chatbots and intelligent virtual agents (IVAs) required specific training for every interaction, including identifying the many ways someone might ask any question that the system was set up to handle. Conversations had to flow in a very specific order, as the systems were very limited in their ability to switch between topics without a great deal of very specific training and guidelines. It took a lot of work to build applications that were often disappointing to users.

GenAI significantly shortens the development time for applications while creating much better user experiences, replacing stilted and awkward conversations with comfortable, almost human interactions. This has a revolutionary impact on the chatbots and IVAs built with conversational AI systems that utilize generative AI. New chatbots can provide much more information to customers and deliver it in a comfortable, conversational manner. These are early days for genAI-driven conversational AI solutions, but early results are impressive and the potential is off the charts.

My latest report, The State Of Conversational AI, looks at where conversational AI is today in this crazy, fast-moving market moment. While the report looks at conversational AI across several areas, most readers of my blogs are focused on customer service, so I’ll spin this blog in that direction. Here are some of the key findings in the report that customer service leaders should pay attention to as they consider adding conversational AI to their contact center.
Prioritize Customer Experience Over Cost Savings

While the cost benefits of automation are undeniable, focusing solely on cost reduction can undermine customer loyalty. The report emphasizes the importance of balancing efficiency with customer satisfaction. When customers can use self-service to get quick answers to simple questions and agents are available to help tackle the hard stuff, everyone wins.

Implement Robust Guardrails For Safe AI Interactions

Safety and reliability are paramount when deploying conversational AI. The report highlights the need for guardrails such as retrieval-augmented generation and finely tuned large language models to ensure that AI interactions are secure and trustworthy. This enables applications such as “infrequently asked questions,” where knowledge bases, or even a set of PDFs, can provide answers to many customer questions without needing to predefine them. This creates solutions that are fast to build, useful for customers, and reasonably safe from hallucinations, since all answers must come from a specific data source.

Drive Positive Customer Experiences With Transaction Workflows

Self-service applications that answer customer questions are helpful, but without the ability to connect to back-end systems, a chatbot or IVA is of limited value. If you can’t check on the status of an order, schedule an appointment, or make a purchase, automation will fall short in customers’ eyes. Effective management of transaction workflows is essential to deliver positive customer experiences.

The state of conversational AI is at a pivotal juncture, offering unprecedented opportunities for customer service and customer experience leaders. By embracing generative AI, prioritizing customer experience, implementing robust safety measures, future-proofing self-service offerings, and managing transaction workflows effectively, organizations can unlock the full potential of conversational AI.
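The grounding guardrail described above (“all answers must come from a specific data source”) can be sketched in miniature. This is a toy illustration with a hypothetical knowledge base and naive word-overlap retrieval; a production system would use embeddings for retrieval and an LLM to phrase the answer:

```python
from typing import Optional

def retrieve(question: str, knowledge_base: list) -> Optional[str]:
    """Return the entry sharing the most words with the question, or None."""
    q_words = set(question.lower().split())
    best, best_overlap = None, 0
    for entry in knowledge_base:
        overlap = len(q_words & set(entry.lower().split()))
        if overlap > best_overlap:
            best, best_overlap = entry, overlap
    return best

def grounded_answer(question: str, knowledge_base: list) -> str:
    """Answer only from retrieved text; refuse rather than guess."""
    source_text = retrieve(question, knowledge_base)
    if source_text is None:
        return "Sorry, I don't have that information."  # no grounding, no answer
    return f"According to our records: {source_text}"

kb = [
    "Refunds are processed within 5 business days.",
    "Orders over $50 ship free of charge.",
]
print(grounded_answer("How long do refunds take?", kb))
print(grounded_answer("What is the weather today?", kb))
```

The key safety property is in the last branch: when nothing relevant is retrieved, the system declines instead of generating a plausible-sounding fabrication.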


Trump and Harris Supporters Differ on Mass Deportations but Favor Border Security, High-Skilled Immigration

Majority of Trump backers say more immigrants would make life worse for people like them; most Harris backers say life wouldn’t change

Pew Research Center conducted this study to understand Americans’ views on immigration and immigration policy prior to the 2024 presidential election. For this analysis, we surveyed 9,201 adults – including 7,569 registered voters – from Aug. 5 to 11, 2024. Everyone who took part in this survey is a member of the Center’s American Trends Panel (ATP), a group of people recruited through national, random sampling of residential addresses who have agreed to take surveys regularly. This kind of recruitment gives nearly all U.S. adults a chance of selection. Surveys were conducted either online or by telephone with a live interviewer. The survey is weighted to be representative of the U.S. adult population by gender, race, ethnicity, partisan affiliation, education and other factors. Read more about the ATP’s methodology. Here are the questions used for the report, the topline and its methodology.

Harris supporters are respondents who said they would vote for Kamala Harris, the Democrat, if the 2024 presidential election were held today, or those who said they would not vote for any of the candidates but lean toward Harris. Trump supporters are respondents who said they would vote for Donald Trump, the Republican, if the 2024 presidential election were held today, or those who said they would not vote for any of the candidates but lean toward Trump.

In a presidential race where immigration has become a key and contentious issue, a Pew Research Center survey shows wide differences and common ground on immigration policy among registered voters who support Donald Trump and Kamala Harris. The candidates have taken sometimes sharply different positions on immigration issues that divide their supporters: Nearly nine-in-ten Trump supporters (88%) favor mass deportations of immigrants living in the country illegally.
In contrast, only 27% of Harris supporters favor mass deportations while 72% oppose them. More than a third of Trump supporters (37%) favor allowing undocumented immigrants to live and work in the U.S. if they are married to an American citizen, compared with 80% of Harris supporters who say the same. About half of Trump supporters (49%) support admitting more civilian refugees who are escaping war or violence, while a large majority of Harris supporters (85%) say the same.

On other immigration issues, Trump and Harris supporters share more common ground: Improving border security is supported by large majorities of both Trump supporters (96%) and Harris supporters (80%). Admitting more high-skilled immigrants is favored by 71% of Trump supporters and 87% of Harris supporters.

These findings come from a bilingual, nationally representative survey of 9,201 adults – including 7,569 registered voters – conducted Aug. 5-11, 2024, ahead of the Democratic National Convention and about a month before the Sept. 10 presidential debate.

The U.S. immigrant population has grown sharply over the decades, from 9.6 million in 1970 to 31.1 million in 2000 and almost 48 million in 2023. These totals account for immigrants in the country both legally and illegally. Immigrants make up about 14.3% of the nation’s population, a near-record high. They are dispersed across all states and metro areas. And nearly three-quarters of registered voters say they know someone who was born outside of the United States, according to the survey.

Trump supporters have a more negative view than Harris supporters of the impact of immigrants on their lives. A majority of Trump supporters (59%) say that the increasing number of immigrants will make things worse for people like them. But a majority of Harris supporters (65%) say that the increasing number of immigrants will make no difference in their lives, with only 11% saying it will make life worse for people like them.
Trump supporters have a mixed view of the impact of legal immigration on the country. A majority say immigrants living in the country legally either make the economy better (31%) or don’t have much of an effect on it (38%), while 29% say these immigrants make the economy worse. By contrast, a clear majority of Harris supporters (62%) say immigrants living in the U.S. legally make the economy better. Trump supporters have a much more negative view of the impact of illegal immigration. An overwhelming majority of Trump supporters (92%) say immigrants living in the country illegally make crime worse, compared with 37% of Harris supporters.

Notably, American voters have become less likely than in recent years to say undocumented immigrants should be allowed to stay in the country. About six-in-ten U.S. registered voters (59%) in the new survey say undocumented immigrants should be allowed to stay in the country legally if certain requirements are met, down from 77% who said the same in 2017. Most Harris supporters (87%) say that there should be a way for undocumented immigrants to stay in the country legally, compared with only a third of Trump supporters (33%).

In the 2024 presidential race, Trump supporters place far more importance on immigration than Harris supporters. For Trump supporters, 82% say immigration is very important to their vote in the 2024 presidential election, trailing only the economy in importance, according to a Center survey conducted in late August to early September. By contrast, just 39% of Harris supporters say the issue of immigration is very important to their presidential vote this year, behind all other issues asked about in the survey, including health care, Supreme Court appointments, the economy, abortion, gun policy and climate change.

Immigration to U.S. has rebounded from pandemic-era lows

Legal immigration to the U.S.
has started to increase after a steep decline during the COVID-19 outbreak, according to Center analysis of data from the Department of Homeland Security. In 2023, about 1.1 million immigrants became lawful permanent residents of the U.S., a return to pre-pandemic levels. In addition, a record number of immigrants crossed the U.S.-Mexico border without authorization at the end of 2023, though these flows have since dropped sharply. Both Republicans and Democrats have been critical of the


OpenAI unveils experimental ‘Swarm’ framework, igniting debate on AI-driven automation

OpenAI has unveiled “Swarm,” an experimental framework designed to orchestrate networks of AI agents. This unexpected release has ignited intense discussions among industry leaders and AI ethicists about the future of enterprise automation, despite the company’s emphasis that Swarm is not an official product. Swarm provides developers with a blueprint for creating interconnected AI networks capable of communicating, collaborating, and tackling complex tasks autonomously. While the concept of multi-agent systems isn’t new, Swarm represents a significant step in making these systems more accessible to a broader range of developers. (credit: x.com/shyamalanadkat)

The next frontier in enterprise AI: Multi-agent systems and their potential impact

The framework’s potential business applications are extensive. A company using Swarm-inspired technology could theoretically create a network of specialized AI agents for different departments. These agents might work together to analyze market trends, adjust marketing strategies, identify sales leads, and provide customer support—all with minimal human intervention. This level of automation could fundamentally alter business operations. AI agents might handle tasks currently requiring human oversight, potentially boosting efficiency and freeing employees to focus on strategic initiatives. However, this shift prompts important questions about the evolving nature of work and the role of human decision-making in increasingly automated environments.

Navigating the ethical minefield: Security, bias, and job displacement in AI networks

Swarm’s release has also rekindled debates about the ethical implications of advanced AI systems. Security experts stress the need for robust safeguards to prevent misuse or malfunction in networks of autonomous agents.
Concerns about bias and fairness also loom large, as decisions made by these AI networks could significantly impact individuals and society. The specter of job displacement adds another layer of complexity. The potential of technologies like Swarm to create new job categories contrasts with fears that it may accelerate white-collar automation at an unprecedented pace. This tension highlights the need for businesses and policymakers to consider the broader societal impacts of AI adoption.

Some developers have already begun exploring Swarm’s potential. An open-source project called “OpenAI Agent Swarm Project: Hierarchical Autonomous Agent Swarms (HOS)” demonstrates a possible implementation, including a hierarchy of AI agents with distinct roles and responsibilities. While intriguing, this early experiment also underscores the challenges in creating effective governance structures for AI systems.

From experiment to enterprise: The future of AI collaboration and decision-making

OpenAI has been clear about Swarm’s limitations. Shyamal Anadkat, a researcher at the company, stated on X on October 12, 2024: “Swarm is not an official OpenAI product. Think of it more like a cookbook. It’s experimental code for building simple agents. It’s not meant for production and won’t be maintained by us.” This caveat tempers expectations and serves as a reminder that multi-agent AI development remains in its early stages. However, it doesn’t diminish Swarm’s significance as a conceptual framework. By providing a tangible example of how multi-agent systems might be structured, OpenAI has given developers and businesses a clearer vision of potential future AI ecosystems.
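The core idea behind frameworks like Swarm, lightweight agents that hand a conversation off to one another, can be sketched in plain Python. To be clear, this is a conceptual illustration, not the actual Swarm API: the agent names and keyword routing below are invented, and a real agent would consult an LLM rather than keyword rules:

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Agent:
    name: str
    # Returns another Agent to hand the conversation to, or None to keep it.
    handle: Callable[[str], Optional["Agent"]]

def run(agent: Agent, message: str) -> str:
    """Follow handoffs until some agent keeps the conversation, then answer."""
    while (next_agent := agent.handle(message)) is not None:
        agent = next_agent
    return agent.name

# Hypothetical specialists; keyword rules stand in for an LLM's routing decision.
sales = Agent("Sales", lambda msg: None)
support = Agent("Support", lambda msg: None)
triage = Agent(
    "Triage",
    lambda msg: sales if "pricing" in msg.lower()
    else (support if "broken" in msg.lower() else None),
)

print(run(triage, "My device arrived broken"))      # Support
print(run(triage, "What are your pricing tiers?"))  # Sales
```

The notable design choice, which Swarm’s cookbook-style code also illustrates, is that a handoff is just a return value: an agent passes control by naming its successor, and the loop stays trivially simple.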
For enterprise decision-makers, Swarm serves as a catalyst for forward-thinking. While not ready for immediate implementation, it signals the direction of AI technology’s evolution. Companies that begin exploring these concepts now—considering both their potential benefits and challenges—will likely be better positioned to adapt as the technology matures. Swarm’s release also emphasizes the need for interdisciplinary collaboration in navigating the complex landscape of advanced AI. Technologists, ethicists, policymakers, and business leaders must work together to ensure that the development of multi-agent AI systems aligns with societal values and needs. The conversation around AI will increasingly focus on these interconnected systems. Swarm offers a valuable preview of the questions and challenges that businesses and society will face in the coming years. The tech world now closely watches to see how developers will build upon the ideas presented in Swarm, and how OpenAI and other leading AI companies will continue to shape the trajectory of this transformative technology.


Nailing Jello to the Wall: Can You Really Measure and Manage Enterprise Tech Debt?

Are legacy systems holding your company back? Do you have manual processes in place to fill the gaps from old technology that hasn’t been updated or maintained? Is your IT budget constrained by maintenance costs for old servers and operating systems within your ever-growing network? These are the various ways that technical debt is hampering innovation and progress within organizations. Tech debt is the hidden, back-office monster under the bed that everyone talks about but no one really knows how to attack.

What Is Tech Debt?

At the height of the 2022 holiday travel season, Southwest Airlines experienced a massive outage of its scheduling system that affected 2 million customers and resulted in the cancellation of 16,900 flights. Southwest experienced an immediate 16% drop in its stock price and logged a loss of more than $800 million that fiscal year due to this outage.

Significant winter storms that year had disrupted air travel across the United States and forced most airlines to cancel flights and scramble to rebook their customers. While other airlines recovered in a matter of days, Southwest Airlines took weeks to return to normal activity. Its recovery from this event was significantly hampered by its legacy IT systems. Its IT team and leadership had known for quite some time that their systems needed upgrades and critical maintenance, but this work was never given priority or funding. Southwest was assessed another $140 million fine a year after this event for its failure associated with the outage. The outage was pure tech debt, but the problem and the potential risk were never sufficiently addressed within the organization.

Why would executives accept such high risk related to their legacy systems? One probable explanation is that while they recognized that tech debt was an issue, no one was able to sufficiently measure it, and the systemic risk was not sufficiently quantified.
This is a problem for most organizations today; they know that they have challenges from their tech debt, but they don’t understand the depth of those challenges, nor do they recognize the extent to which they are accepting systemic risk related to that tech debt.

The consequences of tech debt permeate the enterprise. They encompass hidden IT costs, increased operational risks, compromised security, hindered innovation, and challenges in adapting to change. Tech debt has evolved into a multifaceted challenge that demands the attention of leaders, from the CIOs responsible for technology strategy to the CEOs focused on the organization’s bottom line. Time-to-market decisions can cause tech debt to accumulate across the entire infrastructure in the same way that it does for custom code. Enterprise tech debt is not always the result of poor decisions; it can easily accumulate within an enterprise as the result of a rapidly changing technology stack and extremely interconnected business-critical systems. The more tightly coupled enterprise systems are, the more prone they become to enterprise tech debt and the more challenging they become to update due to the interconnected interfaces, sharing of data, and intertwined data pathways. Thus, the maintenance, support, and improvement of these tightly coupled, critical business systems become much more challenging and expensive, and they often get deprioritized in favor of more revenue-generating activities. This lack of maintenance is one of the factors in accumulating tech debt and increasing the tech debt leverage (a measure of tech debt as a percentage of the entire enterprise tech stack) within an organization.

How Do You Measure and Manage Tech Debt?

As management expert Peter Drucker famously said, “If you can’t measure it, you can’t manage it.” Unless you track and measure your tech debt, how will you manage it? There are a number of steps to take to measure and manage tech debt.
They are:

1. Establish a clear definition for what your organization considers tech debt. Without a clear definition of tech debt, the term becomes a meaningless bucket into which everything that isn’t the new shiny technology can be dumped.
2. Put in place mechanisms for measuring that debt across your enterprise. This involves:
   - Evaluating how much time and effort is needed to maintain old systems
   - Measuring the security vulnerabilities in those old systems
   - Assessing the duplicative costs of similar but disparate technologies within your tech stack that exist in silos throughout the organization
3. Bring all these measurements together into a single view so that you can present your tech debt leverage to the executive team and to the board, and use this to clarify the systemic risk within the organization that you are accepting by not addressing the tech debt.

The corollary of Drucker’s famous quote still holds true and applies directly to enterprise tech debt: Once you measure it, you can manage it.
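As a rough illustration of the “tech debt leverage” metric defined earlier, the measurements gathered in the steps above could be rolled up as follows. All cost categories and figures here are hypothetical, standing in for whatever your organization actually measures:

```python
def tech_debt_leverage(debt_costs: dict, total_stack_cost: float) -> float:
    """Tech debt leverage: tech debt as a percentage of the entire tech stack."""
    return 100 * sum(debt_costs.values()) / total_stack_cost

# Hypothetical annual figures, in $M, mirroring the measurement steps above.
debt = {
    "legacy_maintenance": 3.0,    # time and effort spent keeping old systems running
    "security_remediation": 1.5,  # addressing vulnerabilities in those old systems
    "duplicative_tools": 0.5,     # similar but disparate technologies held in silos
}
print(f"Tech debt leverage: {tech_debt_leverage(debt, 25.0):.0f}%")  # 20%
```

A single headline percentage like this is far easier to present to an executive team or board than three disconnected spreadsheets, which is the point of step 3.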


How 116 Retailers Competed With Amazon’s October 2024 Prime Event

Amazon has wrapped up its October 2024 Prime Big Deal Days event — a shopping bonanza that effectively kicked off the end-of-year shopping season. Much like in 2023, we reviewed 116 brands and retailers to see what they did — and didn’t — offer during the same time (note: we update the mix of brands from year to year). Some of our key findings include:

Sixty-four of the brands we reviewed participated in the fall event. The majority ran some form of discount or promotion, and 20 of those held a sitewide sale. Twelve brands offered free shipping, four of which offered the perk for a limited time only. Some brands that did not participate offered their shoppers alternatives — for example, Dyson offered an in-store hardware trade-in event.

Members-only promotions were prominent. Brands are rewarding their members and loyalty account holders — nine participating retailers offered a members-only perk during this event. Perks included offers such as free standard shipping without a minimum purchase amount, additional loyalty points, sweepstakes, and extra discounts. Department store retailer Macy’s reserved no-minimum-spend free shipping only for its credit card holders.

Marketing messaging focused on “prime” and “fall.” Brands used a broad range of sale themes on their websites this week. Sixteen of the 116 retailers evaluated stuck to a traditional “Prime” theme with “48-hours only” or “two days only” messaging on their home pages. Other retailers leaned into the season, with six brands using specific “fall season” sale messaging.

With end-of-year holiday shopping officially underway, stay tuned for more holiday coverage and insights as part of Forrester’s 2024 Holiday Prep Series. Forrester clients who would like to discuss the 2024 holiday season and your business — please schedule an inquiry or guidance session with us!

(coauthored with Delilah Gonzalez, senior research associate)


Inflection AI helps address RLHF uniformity issues with unique models for enterprise, agentic AI

A recent exchange on X (formerly Twitter) between Wharton professor Ethan Mollick and Andrej Karpathy, the former Director of AI at Tesla and co-founder of OpenAI, touches on something both fascinating and foundational: many of today’s top generative AI models — including those from OpenAI, Anthropic, and Google — exhibit a striking similarity in tone. This prompts the question: why are large language models (LLMs) converging not just in technical proficiency but also in personality?

The follow-up commentary pointed out a common feature that could be driving this convergence in outputs: Reinforcement Learning from Human Feedback (RLHF), a technique in which AI models are fine-tuned based on evaluations provided by human trainers.

Building on this discussion of RLHF’s role in output similarity, Inflection AI’s recent announcements of Inflection 3.0 and a commercial API may provide a promising direction for addressing these challenges. It has introduced a novel approach to RLHF, aimed at making generative models not only consistent but also distinctively empathetic.

With an entry into the enterprise space, the creators of the Pi collection of models leverage RLHF in a more nuanced way, from deliberate efforts to improve the fine-tuning models to a proprietary platform that incorporates employee feedback to tailor gen AI outputs to organizational culture. The strategy aims to make Inflection AI’s models true cultural allies rather than just generic chatbots, providing enterprises with a more human and aligned AI system that stands out from the crowd.

Inflection AI wants your work chatbots to care

Against this backdrop of convergence, Inflection AI, the creators of the Pi model, are carving out a different path.
With the recent launch of Inflection for Enterprise, Inflection AI aims to make emotional intelligence — dubbed "EQ" — a core feature for its enterprise customers. The company says its unique approach to RLHF sets it apart. Instead of relying on anonymous data labeling, the company sought feedback from 26,000 school teachers and university professors through a proprietary feedback platform to aid in the fine-tuning process. Furthermore, the platform enables enterprise customers to run reinforcement learning with employee feedback, allowing subsequent tuning of the model to the unique voice and style of the customer's company. Inflection AI promises that companies will "own" their intelligence: an on-premise model fine-tuned with proprietary data and securely managed on their own systems. This is a notable move away from the cloud-centric AI models many enterprises are familiar with — a setup Inflection believes will enhance security and foster greater alignment between AI outputs and the way people use them at work.

What RLHF is and isn't

RLHF has become a centerpiece of gen AI development, largely because it allows companies to shape responses to be more helpful, coherent and less prone to dangerous errors. OpenAI's use of RLHF was foundational to making tools like ChatGPT engaging and generally trustworthy for users. RLHF aligns model behavior with human expectations, improving engagement and reducing undesirable outputs. However, the technique is not without drawbacks. RLHF was quickly suggested as a contributing factor in the convergence of model outputs, potentially erasing unique characteristics and making models increasingly similar. Alignment, it seems, buys consistency but creates a challenge for differentiation. Karpathy himself has previously pointed out some of the limitations inherent in RLHF.
He likened it to a game of vibe checks and stressed that it does not provide an "actual reward" akin to competitive games like AlphaGo. Instead, RLHF optimizes for an emotional resonance that is ultimately subjective and may miss the mark for practical or complex tasks.

From EQ to AQ

To mitigate some of these RLHF limitations, Inflection AI has embarked on a more nuanced training strategy. It has not only implemented improved RLHF but has also taken steps toward agentic AI capabilities, which it abbreviates as AQ (Action Quotient). As Inflection AI CEO White described in a recent interview, the company's enterprise ambitions involve enabling models not only to understand and empathize but also to take meaningful actions on behalf of users, ranging from sending follow-up emails to assisting in real-time problem-solving. While Inflection AI's approach is certainly innovative, there are potential shortfalls to consider. Its 8K-token context window for inference is smaller than what many high-end models employ, and the performance of its newest models has not been benchmarked. Despite ambitious plans, Inflection AI's models may not achieve the desired level of performance in real-world applications. Nonetheless, the shift from EQ to AQ could mark a critical evolution in gen AI development, especially for enterprise clients looking to leverage automation for both cognitive and operational tasks. It's not just about talking empathetically with customers or employees; Inflection AI hopes that Inflection 3.0 will also execute tasks that translate empathy into action. Inflection's partnership with automation platforms like UiPath to provide this "agentic AI" further bolsters its strategy to stand out in an increasingly crowded market.

Navigating a post-Suleyman world

Inflection AI has undergone significant internal changes over the past year.
The departure of CEO Mustafa Suleyman in Microsoft's "acqui-hire," along with a sizable portion of the team, cast doubt on the company's trajectory. However, the appointment of White as CEO and a refreshed management team has set a new course for the organization. After an initial licensing agreement with the Redmond tech giant, Inflection AI's model development was forked by the two companies. Microsoft continues to build on a version of the model focused on integration with its existing ecosystem, while Inflection AI has independently evolved Inflection 2.5 into today's 3.0 version, distinct from Microsoft's.

Pi's… actually pretty popular

Inflection AI's unique approach with Pi is gaining traction beyond the enterprise space, particularly among users on platforms like Reddit. The Pi community has been vocal about its experiences, sharing positive anecdotes and discussions about Pi's thoughtful and empathetic responses. This grassroots popularity suggests that Inflection AI's empathetic approach may resonate with everyday users as well as enterprises.
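Inflection AI has not published its training code, but the homogenizing pressure the article attributes to RLHF is easy to see in the pairwise preference objective that most human-feedback pipelines minimize. The sketch below is a toy illustration, not Inflection's method; the function name and example scores are my own:

```python
import math

def preference_loss(r_chosen: float, r_rejected: float) -> float:
    """Bradley-Terry pairwise loss commonly used to train RLHF reward
    models: it shrinks as the model scores the human-preferred response
    higher than the rejected one."""
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

# A reward model that agrees with the human labeler incurs a small loss,
# while one that prefers the rejected answer is penalized heavily.
agree = preference_loss(r_chosen=2.0, r_rejected=-1.0)     # ≈ 0.049
disagree = preference_loss(r_chosen=-1.0, r_rejected=2.0)  # ≈ 3.049
```

Averaged over many labelers' comparison pairs, this objective pulls every vendor's reward model toward the same broadly "likable" tone, which is the uniformity Mollick and Karpathy observed; Inflection's bet is that a curated labeler pool (teachers and professors) and per-customer employee feedback shift what the loss rewards.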


Drowning in Data for Want of Information: Is Data Minimization Really Possible?

In the five-year span between 2016 and 2021, the average amount of data that organizations managed grew tenfold, from 1.45 PB in 2016 to 14.6 PB in 2021. We are extremely adept at generating data, far less adept at extracting value from it, and very challenged to destroy any of it at all. Data hoarding, data sprawl and data decay are all significant problems for contemporary companies, creating legal liability risk and operational inefficiency. Yet data minimization efforts tend to be difficult from the start, mainly because of the fear of deleting something that may prove valuable at some point in the future.

The Data Problem

An academic doctoral study released in 2020 included several statistics highlighting the fact that data generation has increased exponentially in recent years, with no signs of the trend stopping. Among those metrics was the estimate that, globally, we generate around 2.5 exabytes of new data per day, as well as the U.S. Government Accountability Office's prediction that by 2025 there will be between 25 and 50 billion devices connected to the internet and actively generating data. That same study reported that organizations effectively use less than 5% of their available data. Three problems could explain this situation: companies don't know how to analyze the data they have, they don't know what insights they could gain by analyzing it, or they simply don't know they have the data in the first place.

One study found that nearly 85% of Fortune 500 organizations are unable to use their data effectively. Yet companies continue to store data in the hope that one day they might be able to analyze it appropriately and somehow extract insights from the "gold mine" they have accumulated. In thinking this way, they disregard the fact that most data has a shelf life and loses its value before the company can extract useful information from it.
When data is not properly curated or updated, it becomes outdated, inconsistent and potentially unreliable. Data hoarding exacerbates each of these issues — resulting in lower data quality and accuracy — because it becomes exponentially more difficult to maintain, clean and manage data as it grows within the enterprise over time. Given the recent regulatory focus on data privacy, much of this data is not only pure tech debt but also increases liability for the company. Analyzing rogue (incomplete, inaccurate, irrelevant, corrupt, incorrectly formatted or duplicative) data is so problematic that the data science community applies a "rule of 10" to it: it costs 10 times more for a data scientist to complete a unit of work when the data is unclean than when the data is perfect.

Why Do We Need to Be Concerned About Data Minimization?

An IBM survey found that poor data quality costs the U.S. economy approximately $3.1 trillion annually and that companies lose up to 12% of their potential revenue to rogue data in their business processes. Storing unnecessary data can expose an organization to security risks and lead to compliance violations, especially under regulations like GDPR or CCPA. This is why these privacy laws include data minimization requirements. According to the Colorado Privacy Act, the processing of personal data "shall be solely to the extent that the processing is necessary, reasonable, and proportionate to the specific purpose or purposes." Similar verbiage exists in all other privacy laws. Regardless of where you do business, it is highly likely that at least one, if not many, of these laws apply to you.

What Is Data Minimization and How Is It Accomplished?

At its foundation, data minimization means adhering to two basic principles: only collect the data you actually need to provide your services, and don't keep data any longer than you need it.
To minimize your data, you should do the following:

- Evaluate your data storage processes and align data retention policies and practices with the principles outlined in the privacy laws your company is subject to.
- Implement data destruction policies and follow them.
- Use data classification and data discovery tools to scour your current data sets.
- Remove data that hasn't been accessed for years and that you are storing for no valid business reason.

Adhering to data minimization principles will not only help you remain in compliance with privacy laws; it will also reduce your attack surface, improve your operational ability to analyze the information you do have, and improve your ability to make data-driven decisions based on current, clean, minimal data.


What’s New In Indian Mobile Banking In 2024?

In an era when people do everything on their phones, is your bank's mobile app evolving at the pace of customer expectations? Our latest Digital Experience Review™ of Indian mobile banking apps reveals a landscape ripe with innovation yet marked by notable gaps in customer experience. In 2024, Forrester reviewed and evaluated six key players on usability, effectiveness and customer experience to identify best practices. The Forrester Digital Experience Review™: Indian Mobile Banking Apps, Q3 2024, highlights the leaders, uncovers the current state of mobile banking experiences, and examines notable trends emerging in the market.

Some Indian Mobile Banking Apps Show Rays Of Innovation

Our assessment unveils IDFC First Bank as the leader, with a feature-rich platform that not only caters to traditional banking needs but goes a step further by enhancing financial wellness and literacy. ICICI Bank follows, delivering an intuitive user experience and robust security measures. IDFC First Bank's personalized financial insights are a prime example of how banks can deliver excellent personalization; ICICI Bank's engaging Discover section is another. Most other banks, however, have yet to offer truly differentiated mobile experiences that are inclusive, engaging and assuring.

Now Is The Time To Double Down On The Effort

The era of "one size fits all" is over. Customers now expect services tailored to their unique needs and financial aspirations. To meet these expectations, banks must go beyond traditional transactional services. They must also implement robust security measures in response to escalating digital fraud and improve their conversational banking offerings, such as chatbots. Current chatbots only troubleshoot, direct customers toward a service, or answer questions; banks must evolve these bots to handle complex conversations and tasks themselves.
This will offer customers a more personalized experience. Additionally, embracing financial inclusion as a key differentiator enables banks to reach a broader, more diverse customer base. While offering multilingual support and improving accessibility are positive steps, banks must continue to explore innovative ways to make their services more inclusive. Moreover, continuous innovation and actively incorporating customer feedback are crucial for refining these services. Banks that prioritize these elements will excel in delivering exceptional customer experiences, setting a new standard for the financial industry. If you're a Forrester client, you can explore these findings in detail by downloading the report, The Forrester Digital Experience Review™: Indian Mobile Banking Apps, Q3 2024. And if you'd like to discuss this topic further or understand how your mobile app measures up, please reach out through an inquiry or guidance session.
