CFA Institute

ChatGPT and Large Language Models: Six Evolutionary Steps

The evolution of language models is nothing less than a super-charged industrial revolution. Google lit the spark in 2017 with the development of transformer models, which enable language models to focus on, or attend to, key elements in a passage of text. The next breakthrough (language model pre-training, or self-supervised learning) came in 2020, after which large language models (LLMs) could be significantly scaled up to drive Generative Pretrained Transformer 3 (GPT-3).

While LLMs like ChatGPT are far from perfect, their development will only accelerate in the months and years ahead. The rapid expansion of the ChatGPT plugin store hints at the rate of acceleration. To anticipate how LLMs will shape the investment industry, we need to understand their origins and their path thus far. So what were the six critical stages of LLMs’ early evolution?

The Business of GPT-4: How We Got Here

ChatGPT and GPT-4 are just two of the many LLMs that OpenAI, Google, Meta, and other organizations have developed. They are neither the largest nor the best. For instance, we prefer LaMDA for LLM dialogue, Google’s Pathways Language Model 2 (PaLM 2) for reasoning, and Bloom as an open-source, multilingual LLM. (The LLM leaderboard is fluid, but this site on GitHub maintains a helpful overview of models, papers, and rankings.)

So, why has ChatGPT become the face of LLMs? In part, because it launched first and with greater fanfare. Google and Meta each hesitated to launch their LLMs, concerned about potential reputational damage if they produced offensive or dangerous content. Google also feared its LLM might cannibalize its search business. But once ChatGPT launched, Google’s CEO, Sundar Pichai, reportedly declared a “code red,” and Google soon unveiled its own LLM.

GPT: The Big Guy or the Smart Guy?

The ChatGPT and ChatGPT Plus chatbots sit on top of GPT-3 and GPT-4 neural networks, respectively.
In terms of model size, Google’s PaLM 2, NVIDIA’s Megatron-Turing Natural Language Generation (MT-NLG), and now GPT-4 have eclipsed GPT-3 and its variant GPT-3.5, which is the basis of ChatGPT. Compared to its predecessors, GPT-4 produces smoother text of better linguistic quality, interprets more accurately, and, in a subtle but significant advance over GPT-3.5, can handle much larger input prompts. These improvements are the result of training and optimization advances (additional “smarts”) and probably the pure brute force of more parameters, but OpenAI does not share technical details about GPT-4.

ChatGPT Training: Half Machine, Half Human

ChatGPT is an LLM that is fine-tuned through reinforcement learning, specifically reinforcement learning from human feedback (RLHF). The process is simple in principle: First, humans refine the LLM on which the chatbot is based by categorizing, on a massive scale, the accuracy of the text the LLM produces. These human ratings then train a reward model that automatically ranks answer quality. As the chatbot is fed the same questions, the reward model scores the chatbot’s answers. These scores go back into fine-tuning the chatbot to produce better and better answers through the Proximal Policy Optimization (PPO) algorithm.

[Figure: ChatGPT Training Process. Source: Rothko Investment Strategies]

The Machine Learning behind ChatGPT and LLMs

LLMs are the latest innovation in natural language processing (NLP). A core concept of NLP is the language model, which assigns probabilities to sequences of words or text, $S = (w_1, w_2, \ldots, w_m)$, much as our mobile phones “guess” the next word as we type a text message by choosing the word the model rates most probable.

Steps in LLM Evolution

The six evolutionary steps in LLM development, visualized in the chart below, demonstrate how LLMs fit into NLP research.

[Figure: The LLM Tech (R)Evolution]

1. Unigram Models

The unigram assigns each word in the given text a probability.
To identify news articles that describe fraud in relation to a company of interest, we might search for “fraud,” “scam,” “fake,” and “deception.” If these words appear in an article more than in regular language, the article is likely discussing fraud. More specifically, we can assign a probability that a piece of text is about fraud by multiplying the probabilities of individual words:

$$P(S) = \prod_{i=1}^{m} P(w_i)$$

In this equation, $P(S)$ denotes the probability of a sentence $S$, $P(w_i)$ reflects the probability of a word $w_i$ appearing in a text about fraud, and the product, taken over all $m$ words in the sequence, determines the probability that the sentence is associated with fraud.

These word probabilities are based on the relative frequency at which the words occur in our corpus of fraud-related documents, denoted as $D$. We express this as $P(w) = \text{count}(w) / \text{count}(D)$, where $\text{count}(w)$ is the number of times word $w$ appears in $D$ and $\text{count}(D)$ is $D$’s total word count. A text with more frequent words is more probable, or more typical.

This may work well in a search for a phrase like “identify theft,” but the model cannot tell it apart from “theft identify,” because both receive the same probability. The unigram model thus has a key limitation: It disregards word order.

2. N-Gram Models

“You shall know a word by the company it keeps!” (John Rupert Firth)

The n-gram model goes further than the unigram by examining subsequences of several words.
So, to identify articles relevant to fraud, we would deploy such bigrams as “financial fraud,” “money laundering,” and “illegal transaction.” For trigrams, we might include “fraudulent investment scheme” and “insurance claim fraud.” A four-gram might read “allegations of financial misconduct.” This way we condition the probability of a word on its preceding context, which the n-gram estimates by counting the word sequences in the corpus on which the model was trained. The formula for this would be:

$$P(S) = \prod_{i=1}^{m} P(w_i \mid w_{i-n+1}, \ldots, w_{i-1})$$

where $n$ is the length of the n-gram. This model is more realistic, giving a higher probability to “identify theft” than to “theft identify,” for example. However, the counting method has some pitfalls. If a word sequence does not occur in the corpus, its probability will be zero, rendering the entire product zero. As the value of the
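The difference between the unigram and n-gram estimates described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the ten-word toy corpus and the function names `unigram_prob` and `bigram_prob` are hypothetical, and the bigram case (n = 2) stands in for the general n-gram.

```python
from collections import Counter

# Hypothetical toy corpus standing in for the fraud-related documents D.
corpus = "identify theft is fraud and identify theft is a scam".split()

unigram_counts = Counter(corpus)
bigram_counts = Counter(zip(corpus, corpus[1:]))
total_words = len(corpus)  # count(D)


def unigram_prob(sentence):
    """Product of P(w) = count(w) / count(D) over the words of the sentence."""
    prob = 1.0
    for word in sentence.split():
        prob *= unigram_counts[word] / total_words
    return prob


def bigram_prob(sentence):
    """P(w_1) times the product of P(w_i | w_{i-1}), estimated by counting."""
    words = sentence.split()
    prob = unigram_counts[words[0]] / total_words
    for prev, word in zip(words, words[1:]):
        if unigram_counts[prev] == 0:
            return 0.0  # unseen context: no estimate is available
        prob *= bigram_counts[(prev, word)] / unigram_counts[prev]
    return prob


# The unigram model ignores word order, so both orderings score the same.
print(unigram_prob("identify theft"), unigram_prob("theft identify"))

# The bigram model prefers the ordering seen in the corpus; the unseen
# ordering gets probability zero, which is the pitfall noted in the text.
print(bigram_prob("identify theft"), bigram_prob("theft identify"))
```

Because the pair ("theft", "identify") never occurs in the toy corpus, its counted probability is zero and the whole product collapses, which is the zero-probability problem the counting method runs into.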


Book Review: The Enduring Value of Roger Murray

The Enduring Value of Roger Murray. 2022. Paul Johnson and Paul D. Sonkin. Columbia Business School Publishing. Who among us does not know the contributions of Benjamin Graham and David Dodd to security analysis, as well as their disciplined approach to long-term investing? After reading and thoroughly enjoying The Enduring Value of Roger Murray, I now understand that Roger Murray represents a dynamic successor to them in practicing fundamental analysis with an emphasis on uncovering the intrinsic value of the stock at hand. This fresh look at a great investment personality of the last millennium restores one’s confidence in fundamental analysis — and especially value investing. It also underscores how sticking with one’s convictions in all aspects of work and life can leave a lasting impact beyond one’s own lifetime. The authors present this book in a congenial way, acquainting the reader with Roger Murray’s professional and personal life and then introducing him in his own voice in four lectures, as well as a 1996 interview by Peter Tanous that took place the year before he died at the age of 86. Both authors have backgrounds that enhance their work on this investment master. Paul Johnson has taught security analysis and value investing at Columbia Business School for more than 30 years, and Paul D. Sonkin was a portfolio manager at Mario Gabelli’s GAMCO Investors, Inc., also serving as an adjunct professor at Columbia Business School. Murray strikes me as a rational thinker who could approach any problem with an open mind. The biographical section of the book highlights this strength during his undergraduate years at Yale, where he achieved awards at a young age in literary research and analysis. What drew Murray to business and economics during the Great Depression was marriage. Although he wanted to become a teacher, he realized he would need a better income to support his family. 
Early in his career at Bankers Trust, Murray discovered his passion for investing and his call to work in investment management. By the age of 39, he was named head of the economic and business research department and simultaneously given the responsibility to manage institutional portfolios. Murray’s primary concern in investing post–World War II was that the returns for fixed income would fall behind the returns he anticipated for equities. At this time, fixed income provided the main source of investment return for both individual and institutional investors. Murray took a retirement of sorts when he left Bankers Trust for Columbia Business School in 1954. His dream of becoming a teacher was about to come true, even though his work at Columbia was initially administrative. As an adjunct professor, he was able to teach only a single class — which happened to be Advanced Security Analysis, initially taught by Ben Graham who planned to retire in 1956. With Murray’s extensive experience in investment management, he brought a sense of excitement and purpose to all the classes he taught over two decades at Columbia. After his departure, the school’s excellent program in value investing was not actively nurtured until it was recultivated in the 1990s with the founding of The Heilbrunn Center for Graham & Dodd Investing. After 10 years at Columbia Business School, Murray took a sabbatical and began working at TIAA (later along with CREF) as vice president and economist, leading its investment operation. At that time, he noted that the returns from college endowments lagged the growth rate in operating budgets. As a remedy, Murray invested conservatively in equities with a multi-decade timeframe based on his bullish outlook for the US economy. Overarching his 30 years in investing and teaching, Murray stimulated widespread interest in investing for retirement, not only in pension plans but also in Keogh and IRA plans. 
He assisted US Representative Eugene Keogh in his efforts to pass a retirement plan for self-employed workers, and he worked to get the IRA into the 1974 ERISA. His 1968 comprehensive study of the effects of pension plans on savings and investment for the National Bureau of Economic Research (NBER) was a major part of the IRA effort. Murray’s thoughts are summarized impeccably in the four lectures he presented at the Museum of Television and Radio in New York City in early 1993, sponsored by Gabelli Asset Management Company. In these lectures, the reader “hears” his voice, understands his reasoning, and gets a few hearty laughs. Murray addresses numerous topics that emphasize critical issues investors face, including earning power and its sources, intrinsic value, cash flow versus reported earnings, and inflation in valuation. Readers will also enjoy the Authors’ Notes throughout the lectures; their analysis makes the lectures seem as if they were given recently, not 30 years ago. In addition to keen insights on fundamental investing, readers receive a special treat as the book begins when they are introduced to Murray’s family. The Murrays were a hard-working, close-knit family that valued education and strong commitment to productive work. The big surprise to me was learning about his older sister, Grace Hopper, fondly known as Grandma COBOL. She wrote the industry’s first software compiler in 1952. My only critique of this excellent book is that it lacks an index. I was put on the spot when a colleague asked me a specific question about Bruce Greenwald. I also unsuccessfully sought a quick look-up on Murray’s quote: “I’ve got a deal you can’t refuse!” This excellent tribute to Roger Murray and his enduring value will delight seasoned investment professionals and those who are just beginning their careers in investment research and management. 
For the more mature practitioner, it highlights the importance of thoughtfully considering the pricing versus the intrinsic value of securities in managing assets. For the student or younger practitioner, it extols the joy and satisfaction of loving one’s work and profession over a long and rich career. For all, it sheds great light on the investment management industry’s evolution over the past 90 years — and how one luminous individual contributed so much to


AI Can Pass the CFA® Exam, But It Cannot Replace Analysts

Recent headlines have highlighted how large language models (LLMs) perform well and quickly on the CFA® exam. These attention-grabbing headlines should not be viewed as a “death sentence” for a certification renowned for its rigorous curriculum and challenging pass rates. Rather, they serve as another illustration of artificial intelligence’s (AI’s) expanding capabilities and offer an opportunity to reflect on competency standards within the financial industry. When AI Passes the CFA Exam First, AI proponents should breathe a sigh of relief. This scenario is precisely where AI is expected to excel: a well-defined body of knowledge, abundant homogeneous training data, and a test format standardized across participants globally and through time. This outcome should not be surprising given how LLMs have demonstrated impressive capabilities in other standardized examinations beyond finance. These tests are designed to assess baseline competencies, and AI’s success in these areas underscores its ability to process and synthesize vast amounts of information efficiently, especially where passing thresholds do not demand perfect accuracy. If AI didn’t perform well in this scenario, it would certainly contribute to the ongoing debate about the outsized investments in its advancement. Technology Has Always Raised the Bar Second, as Mark Twain reportedly said, “History doesn’t repeat itself, but it often rhymes.” The progress of AI echoes broader trends in the financial industry and underscores that this progress isn’t necessarily linear, but can occur in leaps and bounds. The financial sector has embraced many technological advancements, moving from pen and paper to calculators, then to computers, Excel spreadsheets, Python programming, and more. 
None of these transitions turned out to be an existential threat to the profession; rather, they enhanced efficiency and analytical capabilities, freeing up professionals from routine tasks and allowing them to focus on higher-value activities. This historical perspective is exemplified by Benjamin Graham, father of value investing and driving force behind the CFA designation. Graham wrote optimistically about “The Future of Financial Analysis” in the Financial Analysts Journal in 1963, when the computer made its entry in the investing world. Competence Keeps Evolving Third, AI serves as a reminder that the bar for what constitutes basic competency is a continuously evolving standard, and that success in this industry, as in many others, requires an ongoing commitment to upskilling. CFA Institute has long promoted this approach, adapting its curriculum to integrate topics such as AI and big data. The breed of financial analyst still exclusively using pen and paper, not having basic computing skills, being apprehensive of Excel spreadsheets, or having no appreciation for the potential of programming has largely become obsolete. Not using AI is no longer an option and leveraging it where it’s value-adding, and with the appropriate guardrails, can become a significant advantage. The time saved through AI-driven analysis can be redirected toward more strategic thinking, complex problem-solving, and client engagement. Why Human Judgment Still Matters Finally, AI will not be a replacement for distinguishing yourself as an investment professional anytime soon. Success in the field demands more than rehashing common and easily accessible knowledge. Landing that first job requires more than tapping into a broad corpus of knowledge; it demands demonstrating the ability to apply knowledge in ever-changing market circumstances, critically analyze information, and innovate — a challenge that goes well beyond merely passing Levels I, II, and III. 
In that vein, hiring managers will more likely ask, “What aspects of the CFA curriculum will you leverage to assess how uncertainty around tariffs may impact the supply chain in your industry?” They will less likely ask, “Do these investments look suitable given this hypothetical client’s investment profile?”

Similarly, investment performance is driven by finding outliers and identifying information that the market may be missing. This requires not only a deep understanding of foundational knowledge, but also the ability to contextualize it and express nuanced judgment grounded in subject matter expertise. While AI tools can serve as powerful assistants in this endeavor, the ability to uncover differentiated insights in a timely manner necessitates skills that extend far beyond surfacing consensus views that pass an exam threshold.

As CFA Institute has been emphasizing for years, the future belongs to those who master the AI + HI (human intelligence) model, where investment professionals achieve superior outcomes through the synergy of machines and humans. The parting words of Graham’s 1963 FAJ article still ring true: “Be all as it may, of one thing I am certain. Financial analysis in the future, as in the past, offers numerous different roads to success.”

I acknowledge the contributions of LLMs in reviewing and refining my outline and draft.


ChatGPT and Large Language Models: Syntax and Semantics

For more on artificial intelligence (AI) in investment management, check out The Handbook of Artificial Intelligence and Big Data Applications in Investments, by Larry Cao, CFA, from the CFA Institute Research Foundation. A New Frontier for Finance? The banking and finance sectors have been among the early adopters of artificial intelligence (AI) and machine learning (ML) technology. These innovations have given us the ability to develop alternative, challenger models and improve existing models and analytics quickly and efficiently across a diverse range of functional areas, from credit and market risk management, know your customer (KYC), anti-money laundering (AML), and fraud detection to portfolio management, portfolio construction, and beyond. ML has automated much of the model-development process while compressing and streamlining the model development cycle. Moreover, ML-driven models have performed as well as, if not better than, their traditional counterparts. Today, ChatGPT and large language models (LLMs) more generally represent the next evolution in AI/ML technology. And that comes with a number of implications. The finance sector’s interest in LLMs is no surprise given their vast power and broad applicability. ChatGPT can seemingly “comprehend” human language and provide coherent responses to queries on just about any topic.  Its use cases are practically limitless. A risk analyst or bank loan officer can have it assess a borrower’s risk score and make a recommendation on a loan application. A senior risk manager or executive can use it to summarize a bank’s current capital and liquidity positions to address investor or regulatory concerns. A research and quant developer can direct it to develop a Python code that estimates the parameters of a model using a certain optimization function. A compliance or legal officer may have it review a law, regulation, or contract to determine whether it is applicable.  
But there are real limitations and hazards associated with LLMs. Early enthusiasm and rapid adoption notwithstanding, experts have sounded various alarms. Apple, Amazon, Accenture, JPMorgan Chase, and Deutsche Bank, among other companies, have banned ChatGPT in the workplace, and some local school districts have forbidden its use in the classroom, citing the attendant risks and potential for abuse. But before we can figure out how to address such concerns, we first need to understand how these technologies work in the first place. ChatGPT and LLMs: How Do They Work? To be sure, the precise technical details of the ChatGPT neural network and training thereof are beyond the scope of this article and, indeed, my own comprehension. Nevertheless, certain things are clear: LLMs do not understand words or sentences in the way that we humans do. For us humans, words fit together in two distinct ways. Syntax On one level, we examine a series of words for its syntax, attempting to understand it based on the rules of construction applicable to a particular language. After all, language is more than jumbles of words. There are definite, unambiguous grammatical rules about how words fit together to convey their meaning. LLMs can guess the syntactic structure of a language by the regularities and patterns they recognize from all the text in their training data. It is akin to a native English speaker who may never have studied formal English in school but who knows what kinds of words are likely to follow in a series given the context and their own past experiences, even if their grasp of grammar may be far from perfect. LLMs are similar. Since they lack an algorithmic understanding of the syntactic rules, they may miss some formally correct grammatical cases, but they will have no problems communicating. Semantics “An evil fish orbits electronic games joyfully.” Syntax provides one layer of constraint on language, but semantics provides an even more complex, deeper constraint. 
Not only do words have to fit together according to the rules of syntax, but they also have to make sense. And to make sense, they must communicate meaning. The sentence above is grammatically and syntactically sound, but if we process the words as they are defined, it is gibberish. Semantics assumes a model of the world where logic, natural laws, and human perceptions and empirical observations play a significant role. Humans have an almost innate knowledge of this model — so innate that we just call it “common sense” — and apply it unconsciously in our everyday speech. Could ChatGPT-3, with its 175 billion parameters and 60 billion to 80 billion neurons, as compared with the human brain’s roughly 100 billion neurons and 100 trillion synaptic connections, have implicitly discovered the “Model of Language” or somehow deciphered the law of semantics by which humans create meaningful sentences? Not quite. ChatGPT is a giant statistical engine trained on human text. There is no formal generalized semantic logic or computational framework driving it. Therefore, ChatGPT cannot always make sense. It is simply producing what “sounds right” based on what it “sounds like” according to its training data. It is pulling out coherent threads of texts from the statistical conventional wisdom accumulated in its neural net. Key to ChatGPT: Embedding and Attention ChatGPT is a neural network; it processes numbers not words. It transforms words or fragments of words, about 50,000 in total, into numerical values called “tokens” and embeds them into their meaning space, essentially clusters of words, to show relationships among the words. What follows is a simple visualization of embedding in three dimensions. Three-Dimensional ChatGPT Meaning Space Of course, words have many different contextual meanings and associations. 
In ChatGPT-3, what the three dimensions above depict is in fact a vector with 12,288 dimensions, the number required to capture all the complex nuances of words and their relationships with one another. Besides the embedded vectors, the attention heads are also critical features in ChatGPT. If the embedding vector gives meaning to the word, the attention heads allow ChatGPT to string together words and continue the text in a reasonable way. The attention heads each examine the blocks of sequences of embedded vectors written so far. For each block of the embedded vectors, a head reweighs or “transforms” them into a new vector that is then passed through the
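The reweighing idea described above can be sketched in a few lines of Python. This is a toy illustration, not ChatGPT's actual implementation, whose details the article does not give: it uses the standard scaled dot-product attention from the transformer literature, treats the embedded vectors themselves as queries, keys, and values (no learned projections, a simplifying assumption), and uses three-dimensional hypothetical vectors rather than thousands of dimensions. The names `softmax`, `dot`, and `attend` are illustrative.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attend(embeddings):
    """Reweigh each embedded vector as a weighted mix of the vectors so far.

    Simplifying assumption: queries, keys, and values are the embeddings
    themselves; a real attention head applies learned projections first.
    """
    dim = len(embeddings[0])
    outputs = []
    for i, query in enumerate(embeddings):
        context = embeddings[: i + 1]  # only the text written so far
        weights = softmax([dot(query, key) / math.sqrt(dim) for key in context])
        mixed = [sum(w * vec[d] for w, vec in zip(weights, context))
                 for d in range(dim)]
        outputs.append(mixed)
    return outputs


# Hypothetical embeddings: the first two "words" are similar, the third is not.
tokens = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]
reweighed = attend(tokens)
for vector in reweighed:
    print([round(x, 3) for x in vector])
```

Each output vector is a blend of the current word's embedding and those before it, with more weight going to the vectors that point in a similar direction.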


How Tariffs Could Accelerate America’s AI Revolution: Implications for Investors

The 2024 US presidential election has ushered in major policy shifts, with sweeping tariffs and new trade strategies signaling the end of decades of open-market globalization. While these changes introduce short-term uncertainty for businesses and investors, they may also set the stage for a strategic overhaul: accelerated investment in US manufacturing and a surge in AI-driven productivity. If managed well, this shift could spark a new era of American economic growth. Understanding how tariffs could reshape investment trends and accelerate AI adoption is critical for anticipating the next phase of US economic growth. History shows that major disruptions, when paired with transformative technologies, often precede new periods of economic growth.

Policy Shifts and Economic Risks: Tariffs Reshape the Landscape

The federal government is expected to undergo major organizational reforms to improve its finances. Although downsizing departments and reducing headcount add to the current economic disruption from tariffs, the reforms could yield considerable long-term gains. This initiative may result in reductions in federal employment and the implementation of expanded tariffs, introducing risks of a mild recession. A reduction in federal employment could dampen household incomes and consumer spending, with potential knock-on effects for regional economies[i]. This downturn could impact commercial spaces, local bonds, and regional banks. Plans also call for replacing portions of federal tax revenue with tariffs, the assumption being that these measures will decrease the federal deficit and help balance the budget. Under the best-case scenario, these tariffs could raise the average import duty to approximately 22%, thereby increasing prices by a few percentage points and slowing 2025 economic growth[ii].
Easing the Labor Transition: Reskilling and Reinvestment Opportunities

The key question is how the economy will adapt to the influx of former federal employees seeking private and state sector jobs that match their qualifications. The US economy could mitigate the impact of losing 15% of federal jobs by allocating about 10% of tariff revenues to a “Re-Employ America” fund. This fund could provide reskilling vouchers, wage subsidies for new hires, and temporary unemployment benefits to rapidly integrate displaced workers into the private or state sectors[iii]. Simultaneously, expanding CHIPS-style manufacturing grants, expediting infrastructure projects approved under the Infrastructure Investment and Jobs Act (IIJA), and advancing defense procurement spending could create hundreds of thousands of new jobs[iv]. Nevertheless, even with superb execution, tangible outcomes would take years to materialize as a compensatory offset.

A Fragile Recovery: Rising Defaults and “Stagflation Lite”

Weakened consumer sentiment poses significant hurdles for companies. They are contending with dwindling sales and facing the task of refinancing about $1.8 trillion in corporate debt[v] and $1.98 trillion in commercial real estate this year and next[vi] at higher interest rates. This scenario risks increasing defaults and widening credit spreads. Already, we are witnessing a rise in subprime auto and credit card delinquencies, with small business loans next on the list[vii]. This picture of slowing growth, combined with inflation and stricter credit conditions, sometimes dubbed “stagflation lite,” represents a moderate downturn paired with stubborn inflationary pressures.

AI: A Beacon of Hope on the Horizon

Amidst all this domestic and global economic ambiguity, there is a beacon of hope on the horizon. A more robust economy might just be in the cards over the coming years, stronger than what we have seen since the post-COVID period. What fuels this hope?
The burgeoning wave of artificial intelligence (AI) is unfolding across numerous commercial applications. Investment cash is ready, and the demand is set to soar. The existing level of investment in this strategic area is quite impressive. Leading tech firms have committed more than $1 trillion to develop GPU production facilities, secure energy for extensive data centers, and propel innovative model research in 2026[viii]. Federal initiatives like the CHIPS and Science Act and a 25% investment tax credit are expected to maintain construction momentum, even if companies hold off on their IT spending for a bit[ix]. We are likely to see an influx of new computing power. Just as the PC market saw a revival following the disinflation of 1982, and cloud services boomed after the 2009 economic recovery, we may see a similar revitalization of capital expenditure initiatives by chief financial officers. Investor Sentiment: AI’s Growing Role in Earnings and Equity Markets Tariffs could reduce GDP by around 1%, which is already reflected in many cyclical stocks. Investors now demand a compelling growth narrative to reignite interest in equities. AI is emerging as a strong contender, particularly if tariff pressures prompt the Federal Reserve to ease monetary policy. Embracing Next Gen AI for more consumer-centric commerce could trigger a nationwide productivity surge that compensates for tariff-driven margin contractions. Investors are optimistic, as demonstrated by the staggering $57 billion poured into AI data centers and model training throughout late 2024. That investment fostered a robust network of equipment suppliers, electrical contractors, and software integrators[x]. A notable increase in AI mentions during earnings calls from sectors like finance, media, and manufacturing has prompted analysts to suggest we could see widespread margin enhancements. Nvidia’s 60% revenue forecast underscores the unceasing silicon demand[xi].  
The Intersection of Protectionism and AI At the intersection of protectionism and AI lies a pivotal challenge: the erosion of white-collar career paths due to decades of offshoring. While outsourcing to cheaper regions reduced costs, it also slashed skilled jobs and pressured local wages. Gen AI might redefine this landscape. Today, AI chatbots manage about 60% of customer queries, and developer “copilots” empower a single US programmer to compete with multiple overseas counterparts[xii]. When you factor in stricter visa regulations and domestic sourcing policies, the drive to export routine tasks lessens. Although global expertise will be tapped for specific projects, AI-enhanced domestic teams are likely to revive key support roles.  Instead of cutting jobs, advanced AI amplifies American potential, freeing up workers for high-level tasks that require human ingenuity. Generative models efficiently draft code, reconcile accounts, or summarize legal texts, allowing auditors, engineers, and paralegals to focus on strategy, creativity, and complex analyses — tasks that rely on human insight. With the United States at


The Little-Known Credit Holding Up the Clean Fuel Market

For investors watching the energy transition unfold, the surge in prices of compliance credits known as D3 renewable identification numbers (RINs) tells an important story. Refiners and importers of gasoline or diesel are obligated to purchase these biofuel compliance credits. D3 RINs have quietly become a barometer for the challenges facing renewable fuel policy — where government mandates, limited supply, and lagging innovation collide. Understanding the dynamics of this green currency can help investors spot both bottlenecks and breakthroughs in the low-carbon economy. Source: EPA and Author Analysis What’s Driving the Spike in D3 RIN Prices These compliance credits are the “currency” of the US Renewable Fuel Standard (RFS) Program. D3 RINs are linked to cellulosic biofuels,  which come from non-food plant material. Three forces are contributing to the rising prices of D3 RINs: Supply Constraints: Cellulosic biofuel production is challenging and costly and continues to lag far behind mandated levels. The limited number of D3 RINs has made compliance more difficult, forcing obligated refiners and importers to compete for a small pool of credits. Regulatory Pressure: Government policies have increased the required volumes of advanced biofuels, including cellulosic fuels, even as production struggles to keep pace. The growth rate of D3 RIN target volumes averaged 8.4% between 2021 and 2022. The projected growth rate from 2023 to 2025 is expected to average just over 30%. At the same time, regulators have removed key flexibilities. The Set Rule for 2023, 2024, and 2025 eliminated Cellulosic Waiver Credits as a compliance option, which effectively removed the price ceiling for D3 RINs. And since 2018, no exemptions have been granted for renewable volume obligations, resulting in increased demand for RINs. 
[Figure: Trend Analysis: D3 RIN Volume Targets (billion RINs). Source: EPA]

Innovation and Investment: Ongoing investment and technological advancements in cellulosic biofuel production can also impact prices. If considerable progress is made, it may initially drive up prices as demand for new, more efficient technologies grows.

Price Relief Is Possible, but Structural Constraints Make It Unlikely

Strong demand, tight regulation, and limited supply have been keeping D3 RIN prices high. Several developments could ease pressure on D3 RIN prices, but so far, few show signs of materializing. Here’s what might push prices lower:

Regulatory Relief: If the government reduces renewable fuel volume targets or allows RINs to carry over from previous years, demand could ease.

Waivers and Exemptions: Small refinery exemptions (SREs) could reduce the number of obligated parties required to purchase RINs. More waivers could lower demand, but none have been granted since 2018.

[Figure: Summary of Small Refinery Exemption Decisions Each Compliance Year. Source: EPA and Author Analysis]

Improved Market Liquidity: More active trading in the RIN market could increase efficiency and lead to more competitive pricing.

Technological Breakthroughs: Advances that make cellulosic biofuel production cheaper or more scalable would help increase supply.

Lower Compliance Costs: If obligated parties find cheaper ways to meet their RFS obligations, demand for RINs may decrease.

Economic Factors: Broader economic conditions, such as falling crude oil prices, can influence the competitiveness of renewable fuels.

Currently, there are no clear indications that D3 RIN prices will decrease. Market factors, such as increasing demand for renewable fuels, regulatory requirements, and the limited supply of qualifying biofuels, are keeping prices elevated. Additionally, ongoing policy support and production constraints contribute to sustained price pressure.
As a result, it is unlikely that we will see a significant drop in D3 RIN prices soon.

Impact For Investors

Over the past decade, D3 RIN credits have proven to be among the most significant factors affecting the financial viability of biogas projects across the United States. While project costs and operational complexities vary by region, infrastructure, and feedstock, the economics of most projects are fundamentally tied to D3 RIN prices remaining above a critical level.

Since 2015, the price of D3 RIN credits has fluctuated within a broad range, reflecting changes in market dynamics and regulatory factors. Based on historical data, D3 RIN prices have varied from a low of $0.46 to a high of $3.50 per credit. Although prices are currently elevated, the economics of these projects remain sensitive to downward price movements. On average, trends observed across diverse projects nationwide indicate that if D3 RIN credits ever fall below $1.15, many ventures become financially unfeasible. This price threshold serves as a rough break-even point for many developers and is a key metric for assessing project risk.

This underscores the broader investment implications tied to regulatory risk, energy transition volatility, and market inefficiencies. The elimination of price ceilings and waivers has intensified market dynamics, further amplifying demand. For investors, this creates both risk and opportunity, emphasizing the need for active monitoring and strategic positioning. Projects that incorporate risk mitigation tools, such as long-term credit hedging or structured offtake agreements, are better equipped to navigate volatility and deliver resilient returns in the maturing low-carbon fuel sector.
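One way to make the break-even discussion concrete is to measure how far a given D3 RIN price sits above the roughly $1.15 threshold cited above. The sketch below is only illustrative: the function name `headroom` is hypothetical, and the only inputs are the threshold and the historical price range quoted in the article.

```python
BREAK_EVEN = 1.15  # USD per D3 RIN: the rough break-even level cited in the article


def headroom(price: float, break_even: float = BREAK_EVEN) -> float:
    """Fraction by which `price` could fall before reaching break-even.

    A negative value means the price is already below break-even.
    """
    return (price - break_even) / price


# Historical range quoted in the article (USD per credit).
historical_low, historical_high = 0.46, 3.50

for label, price in [("historical low", historical_low),
                     ("historical high", historical_high)]:
    print(f"{label}: ${price:.2f}, headroom {headroom(price):+.0%}")
```

At the top of the historical range, prices could fall by about two-thirds before reaching the threshold. At the bottom of the range, they sit well below it, which is why a project underwritten only at current prices may deserve a stress test.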


Hiding in Plain Sight: Accounting for Capex

In both public and private markets, investors often rely on EBITDA and cash flow metrics to assess profitability and value companies. Yet these measures can mask a wide gap between accounting earnings and free cash flow. That gap typically stems from two sources: shifts in working capital and investment cash flows, with CAPEX often the largest driver in capital-intensive industries. Poorly performing projects may even make profits look stronger while cash is being drained. This blog highlights why ex-post monitoring of capital allocation matters and how investors can detect whether CAPEX is creating or destroying value across different industries.

It is important to note that CAPEX needs vary significantly by sector. Capital-intensive industries such as telecommunications and energy require large recurring investments. Others like software or education are far less dependent on fixed-asset spending. While working capital management is typically monitored closely, far less attention is given to the cash flow conversion of growth CAPEX. This oversight has become especially relevant in recent years as higher interest rates increase the cost of financing large investment programs.

Why CAPEX Monitoring Matters

Growth CAPEX is a long-term capital allocation decision. The challenge for investors is that, once approved and executed, companies rarely disclose whether projects actually deliver the promised returns. The risk is clear: reported earnings may not fully reflect the cash flow implications of expansion programs. Underperforming investments can make profitability look stronger than it is, while simultaneously reducing the cash available for dividends, buybacks, or debt service. The earnings–cash flow gap is especially pronounced in capital-intensive sectors like telecom and energy, where large recurring investments are the norm. With higher interest rates raising financing costs, careful monitoring of CAPEX cash conversion has become even more critical.
Disclosure Approaches

Here are a couple of examples of companies that break out CAPEX from total earnings:

Telecommunications: Spanish telecom giant Telefónica reports EBITDA after leases (EBITDAaL). This metric incorporates accrued capital expenditures. Management noted in Q2 2025 results, “It is important to consider capital expenditures excluding spectrum acquisitions with EBITDAaL, in order to have a more complete measure of the performance of our telecommunication businesses.” Because Telefónica integrates all CAPEX into this key performance indicator (KPI), even by geography, management and investors can more easily identify when rollouts fail to generate expected cash flows.

Industrial manufacturing: French transport system manufacturer Alstom disclosed an adjusted net profit to free cash flow conversion ratio but did not report return on capital employed (ROCE) or return on capital invested (ROCI) in its March 2025 annual report. On the other hand, it does track working capital needs on a project-by-project basis, indicating that management monitors cash flow implications at the operating level even if broader capital return metrics are absent.

These examples show how disclosure practices differ across industries, and why investors must adapt their approach depending on the sector and reporting culture.

Investor Red Flags

Investors rarely see management’s internal capital budgeting models, but public disclosures often contain signals worth monitoring:

Rising leverage at higher cost of capital, particularly when companies rely on private debt funds with variable rates.

Declining profitability of comparable operations. For example, lower EBITDA per store, business unit, or product category after the ramp-up period may suggest new investments are diluting overall profitability.

CAPEX growth without sustained improvement in return on invested capital (ROIC).
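The last red flag in the list above lends itself to a simple screen. The sketch below is a minimal illustration, not a prescribed methodology: it assumes ROIC is computed as NOPAT divided by invested capital, it compares each year only with the prior year (a simplification of "sustained" improvement), and every figure in `history` is hypothetical.

```python
def roic(nopat: float, invested_capital: float) -> float:
    """Return on invested capital, assumed here to be NOPAT / invested capital."""
    return nopat / invested_capital


def capex_red_flags(history: list[dict]) -> list[int]:
    """Return the years in which CAPEX rose while ROIC failed to improve.

    `history` must be ordered by year; each entry needs the keys
    'year', 'capex', 'nopat', and 'invested_capital'.
    """
    flagged = []
    for prev, cur in zip(history, history[1:]):
        capex_grew = cur["capex"] > prev["capex"]
        roic_not_improved = (
            roic(cur["nopat"], cur["invested_capital"])
            <= roic(prev["nopat"], prev["invested_capital"])
        )
        if capex_grew and roic_not_improved:
            flagged.append(cur["year"])
    return flagged


# Hypothetical company data, for illustration only.
history = [
    {"year": 2022, "capex": 100, "nopat": 80, "invested_capital": 800},
    {"year": 2023, "capex": 130, "nopat": 88, "invested_capital": 900},
    {"year": 2024, "capex": 160, "nopat": 90, "invested_capital": 1030},
    {"year": 2025, "capex": 150, "nopat": 110, "invested_capital": 1100},
]

print(capex_red_flags(history))
```

In practice an investor would likely use a multi-year window and exclude assets still in ramp-up, since a single weak year is not evidence of value destruction.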
These signals should always be assessed in conjunction with the Management Discussion & Analysis (MD&A) to separate structural problems from temporary pressures.

What Good Disclosure Looks Like

Strong disclosure practices help investors evaluate capital allocation discipline. Examples include:

Reporting ROIC or EBITDA checkpoints after the ramp-up period, distinguishing between comparable units and those tied to new CAPEX.

Providing segment-level CAPEX disclosure linked directly to cash flow outcomes.

Communicating payback periods for strategic projects.

Demonstrating improved profitability in the business units where CAPEX has been deployed, ideally with a breakdown of fixed assets by new versus comparable operations.

Conclusion

Shareholder value is not created by the volume of capital deployed, but by a company’s ability to transform those investments into sustainable cash flows. This principle applies across industries, whether in telecom, energy, industrials, or asset-light sectors where CAPEX plays a smaller but still strategic role. For investors, the key is to look beyond earnings and monitor whether CAPEX is being translated into real cash generation. Undisciplined CAPEX inflates balance sheets, but disciplined growth builds resilience and long-term economic return.

If you liked this post, don’t forget to subscribe to the Enterprising Investor. All posts are the opinion of the author. As such, they should not be construed as investment advice, nor do the opinions expressed necessarily reflect the views of CFA Institute or the author’s employer. Image credit: ©Getty Images / Ascent / PKS Media Inc.

Professional Learning for CFA Institute Members

CFA Institute members are empowered to self-determine and self-report professional learning (PL) credits earned, including content on Enterprising Investor. Members can record credits easily using their online PL tracker.


Monetary Policy and Financial Conditions: Meaningful Relationship?

After nearly two years of high interest rates, investors are anticipating rate cuts in the coming months. The transition from highly expansionary to highly contractionary monetary policy in recent years, coupled with current expectations for another policy shift, makes it an ideal time to assess the relationship between financial conditions and monetary policy. This analysis does exactly that. We examine the US Federal Reserve’s response to changing financial conditions, as well as the subsequent impact of these actions on financial conditions.

Our findings illustrate that financial conditions are a relevant indicator for investors to monitor. Investors will benefit from a deeper understanding of how the dynamics between financial conditions and monetary policy evolve as policy shifts occur. Understanding this relationship will help investors prepare for policy shifts both now and in the future.

This analysis focuses on the Fed’s recent rounds of quantitative easing (QE) and quantitative tightening (QT). We examined weekly data for the Federal Reserve Bank of Chicago’s National Financial Conditions Index (NFCI) from 31 January 2014 through 31 January 2024. The NFCI measures the state of financial conditions, consisting of 105 indicators of risk, credit, and leverage. We also obtained weekly data for the risk, credit, and leverage subindexes from the NFCI over the same period.

Similarly, we gathered weekly data on the Fed’s balance sheet from 31 January 2014 through 31 January 2024. Fed assets have grown tremendously over the period, nearly doubling to $7.6 trillion as of 31 January 2024 from $4.1 trillion as of 31 January 2014. Most of this growth occurred in the first half of 2020, however, due to the Fed’s QE.

The left-hand panel of Exhibit 1 visualizes the trends in the NFCI index, as well as in the risk, credit, and leverage subindexes, over the period.
The right-hand panel of Exhibit 1 shows the trends in the NFCI index along with the increase in Fed assets over the period. Notably, financial conditions have generally been looser than their historical average, as indicated by negative NFCI values over the period, except for March and April 2020.

[Exhibit 1. Sources: Federal Reserve Economic Data (FRED), Federal Reserve Bank of Chicago]

Lead/Lag Analysis for the QE Sample

For this analysis, we examine the lead/lag relationship between the Fed’s balance sheet and the NFCI, following the lead/lag analysis conducted by Putnins (2022) between the Fed’s balance sheet and stock market returns. We first conduct this analysis over a period of QE, and later repeat the same analysis over a period of QT.

On 15 March 2020, the Fed announced its plans to implement a round of QE in response to the onset of the coronavirus pandemic. This large-scale purchasing of assets continued until the beginning of May 2022, when the Fed announced that it would begin a round of QT. Thus, for the QE sample, the period begins on 11 March 2020 (the Wednesday prior to the QE announcement, since NFCI data is available on Wednesday each week) and ends on 27 April 2022, just prior to the Fed’s QT announcement in early May.

We begin by calculating the weekly log change in Fed assets. We then examine the relationship between the weekly log change in Fed assets in week n and the weekly value of the NFCI in week n + k, where n represents the point in time with no leads/lags and k represents the amount of the lead/lag in weeks, ranging from a lag of -10 weeks to a lead of +10 weeks. In other words, week n does not refer to a particular week, but rather to the “base week,” or the point in time for any given week with no leads/lags (k = 0).
Negative values for k (i.e., past values of the NFCI) capture how the Fed responded to either improving or deteriorating past financial conditions, while positive values for k (i.e., future values of the NFCI) capture how the Fed’s actions subsequently affected financial conditions.

We analyze the relationship between the weekly log change in Fed assets and the weekly value of the NFCI by running a time-series regression of $\mathrm{NFCI}_{n+k}$ on $\Delta\mathrm{FedAssets}_n$ for each lead/lag value of k. Put differently, we keep the time series of the weekly log change in Fed assets fixed at week n (the “base week”) and shift the time series of the NFCI back k = -1, -2, …, -10 weeks and forward k = 1, 2, …, 10 weeks relative to week n. The model is given by the following regression equation:

$$\mathrm{NFCI}_{n+k} = \beta_0 + \beta_1 \, \Delta\mathrm{FedAssets}_n + \varepsilon_{n+k}$$

Similarly, we run time-series regressions of $\mathrm{Subindex}_{n+k}$ on $\Delta\mathrm{FedAssets}_n$ for the risk, credit, and leverage subindexes for each lead/lag value of k, as shown by the following regression equation:

$$\mathrm{Subindex}_{n+k} = \beta_0 + \beta_1 \, \Delta\mathrm{FedAssets}_n + \varepsilon_{n+k}$$

Exhibit 2 shows the t-statistics from the regressions of $\mathrm{NFCI}_{n+k}$ on $\Delta\mathrm{FedAssets}_n$ in the top left panel for each lead/lag value of k. The t-statistics from the regressions of $\mathrm{Subindex}_{n+k}$ on $\Delta\mathrm{FedAssets}_n$ for the risk, credit, and leverage subindexes are displayed in the top right, bottom left, and bottom right panels, respectively, for each lead/lag value of k. Shaded columns indicate statistically significant t-statistics, with grey columns representing significance at the 5% level and black columns representing significance at the 1% level.

[Exhibit 2. Source: CFA Institute Calculations]

Based on these results, the relationship between the weekly log change in Fed assets and the weekly value of the NFCI is significant from k = -5 through k = 8, as indicated by the significant t-statistics in the top left panel of Exhibit 2.
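The lead/lag procedure described above can be sketched in a few lines of standard-library Python. This is a minimal illustration rather than the authors' code: the function names are hypothetical, the data are synthetic (constructed so that the index responds to balance sheet changes with a two-week delay), and the t-statistics use plain OLS standard errors, whereas the article does not say which standard errors were used.

```python
import math
import random


def ols_slope_t(x, y):
    """Simple OLS of y on x with an intercept; returns (slope, t-statistic)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    beta1 = sxy / sxx
    beta0 = my - beta1 * mx
    rss = sum((b - beta0 - beta1 * a) ** 2 for a, b in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)  # conventional (non-robust) standard error
    return beta1, beta1 / se


def lead_lag_t_stats(fed_assets, nfci, max_k=10):
    """t-statistic of beta_1 in NFCI[n+k] = b0 + b1 * dlog(FedAssets)[n] + e, per k."""
    # Weekly log change; entry i is the change observed in week i + 1.
    d_assets = [math.log(b / a) for a, b in zip(fed_assets, fed_assets[1:])]
    levels = nfci[1:]  # align the index with the week of each change
    t_stats = {}
    for k in range(-max_k, max_k + 1):
        pairs = [(d_assets[n], levels[n + k])
                 for n in range(len(d_assets)) if 0 <= n + k < len(levels)]
        x, y = zip(*pairs)
        t_stats[k] = ols_slope_t(x, y)[1]
    return t_stats


# Synthetic data (an assumption for illustration, not Fed or NFCI data).
random.seed(0)
weeks = 150
growth = [random.gauss(0.0, 0.01) for _ in range(weeks)]
fed_assets = [4.1]
for g in growth:
    fed_assets.append(fed_assets[-1] * math.exp(g))
nfci = [0.0] * (weeks + 1)
for week in range(3, weeks + 1):
    # The index in a given week responds to the asset change two weeks earlier.
    nfci[week] = 50.0 * growth[week - 3] + random.gauss(0.0, 0.1)

t_stats = lead_lag_t_stats(fed_assets, nfci)
peak_k = max(t_stats, key=t_stats.get)
print("largest t-statistic at k =", peak_k)
```

Because the NFCI is persistent from week to week, real data would likely call for autocorrelation-robust standard errors. The plain OLS version is used here only to keep the sketch self-contained.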
The positive and significant t-statistics prior to k=0 suggest that the Fed expanded its balance sheet through implementing a round of QE in response to an increase in the NFCI up to five weeks prior. This result is intuitive given that increasing values for the NFCI indicate tightening financial conditions, which in turn prompts the Fed to implement accommodative monetary policy (in this case, through QE) to stimulate the economy. Subsequently, the NFCI remained positive for an additional eight weeks following the Fed’s QE announcement, shown by the positive and significant t-statistics following


The Factor Mirage: How Quant Models Go Wrong

Factor investing promised to bring scientific precision to markets by explaining why some stocks outperform. Yet after years of underwhelming results, researchers are finding that the problem may not be the data at all; it’s the way models are built. A new study suggests that many factor models mistake correlation for causation, creating a “factor mirage.”

Factor investing was born from an elegant idea: that markets reward exposure to certain undiversifiable risks (value, momentum, quality, size) that explain why some assets outperform others. Trillions of dollars have since been allocated to products built on this premise.

The data tell a sobering story. The Bloomberg–Goldman Sachs US Equity Multi-Factor Index, which tracks the long–short performance of classic style premia, has delivered a Sharpe ratio of just 0.17 since 2007 (t-stat=0.69, p-value=0.25), statistically indistinguishable from zero before costs. In plain terms: factor investing has not delivered value for investors. For fund managers who built products around these models, that shortfall translates into years of underperformance and lost confidence.

Why the Backtests Mislead

The conventional explanation blames backtest overfitting or “p-hacking”: researchers mining noise until it looks like alpha. That explanation is correct but incomplete. Recent research from ADIA Lab published by CFA Institute Research Foundation identifies a deeper flaw: systematic misspecification.

Most factor models are developed following an econometric canon (linear regressions, significance tests, two-pass estimators) that conflates association with causation. Econometric textbooks teach students that regressions should include any variable associated with returns, regardless of the role that the variable plays in the causal mechanism. This is a methodological error.
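As a rough consistency check on the Sharpe ratio statistics quoted earlier in this piece, one can use the textbook approximation that the t-statistic of an annualized Sharpe ratio is about the ratio times the square root of the number of years, together with a normal approximation for the p-value. Both approximations are assumptions made here (they rely on roughly independent, identically distributed returns) and may not match how the reported figures were actually computed.

```python
from statistics import NormalDist

# Figures quoted in the article.
ANNUAL_SHARPE = 0.17
T_STAT = 0.69


def implied_years(annual_sharpe: float, t_stat: float) -> float:
    """Sample length implied by t ~ Sharpe * sqrt(years)."""
    return (t_stat / annual_sharpe) ** 2


def one_sided_p_value(t_stat: float) -> float:
    """P(Z > t) under a standard normal approximation."""
    return 1.0 - NormalDist().cdf(t_stat)


years = implied_years(ANNUAL_SHARPE, T_STAT)
p_value = one_sided_p_value(T_STAT)
print(f"implied sample length: {years:.1f} years")
print(f"one-sided p-value: {p_value:.3f}")
```

Under these assumptions, the quoted t-statistic implies a sample of somewhat more than 16 years and a one-sided p-value near the reported 0.25. That is broadly in line with a track record beginning in 2007, given that the published inputs are rounded.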
Including a collider (a variable influenced by both the factor and returns) and/or excluding a confounder (a variable that influences both the factor and returns) biases the coefficients’ estimates. This bias can flip the sign of a factor’s coefficient. Investors then buy securities they should have sold, and vice versa. Even if all risk premia are stable and correctly estimated, a misspecified model can produce systematic losses.

The Factor Mirage

The “factor zoo” is a well-known phenomenon: hundreds of published anomalies that fail out-of-sample. ADIA Lab researchers point to a subtler and more dangerous problem: the “factor mirage.” It arises not from data-mining but from models that are misspecified, despite having been developed following the econometric canon taught in textbooks.

Models with colliders are particularly concerning, because they exhibit higher R² and often also lower p-values than correctly specified ones. The econometric canon favors such misspecified models, mistaking better fit for correctness. In a factor model with a collider, the value of the return is set before the value of the collider. As a result, the stronger association derived from the collider cannot be monetized. The profits promised by those academic papers are a mirage. In practice, that methodological mistake has billion-dollar consequences.

For example, consider two researchers estimating a quality factor. One of the researchers controls for profitability, leverage, and size; the other adds return on equity, a variable influenced by both profitability (the factor) and stock performance (the outcome). By including a collider, the second researcher creates a spurious link: high quality now correlates with high past returns. In a backtest, the second model appears to be superior. In live trading, the tables are turned: the backtest is a statistical illusion that quietly drains capital.
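The collider effect described above can be reproduced with a toy simulation. Everything in the sketch below is an assumption chosen for illustration: a single factor with a true positive effect of 0.3 on returns, a collider built as the sum of the factor, the return, and noise, and hand-rolled OLS helpers. It does not use the data or the models from the ADIA Lab research.

```python
import random


def demean(v):
    m = sum(v) / len(v)
    return [a - m for a in v]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def fit_one(x, y):
    """OLS of y on x with an intercept; returns (beta_x, R^2)."""
    x, y = demean(x), demean(y)
    beta = dot(x, y) / dot(x, x)
    rss = sum((yi - beta * xi) ** 2 for xi, yi in zip(x, y))
    return beta, 1 - rss / dot(y, y)


def fit_two(x, z, y):
    """OLS of y on x and z with an intercept; returns (beta_x, beta_z, R^2)."""
    x, z, y = demean(x), demean(z), demean(y)
    sxx, szz, sxz = dot(x, x), dot(z, z), dot(x, z)
    sxy, szy = dot(x, y), dot(z, y)
    det = sxx * szz - sxz ** 2  # determinant of the 2x2 normal equations
    beta_x = (szz * sxy - sxz * szy) / det
    beta_z = (sxx * szy - sxz * sxy) / det
    rss = sum((yi - beta_x * xi - beta_z * zi) ** 2
              for xi, zi, yi in zip(x, z, y))
    return beta_x, beta_z, 1 - rss / dot(y, y)


random.seed(42)
n = 5000
TRUE_EFFECT = 0.3  # assumed causal effect of the factor on returns

factor = [random.gauss(0, 1) for _ in range(n)]
ret = [TRUE_EFFECT * f + random.gauss(0, 1) for f in factor]
# The collider is caused by BOTH the factor and the return.
collider = [f + r + random.gauss(0, 0.5) for f, r in zip(factor, ret)]

beta_ok, r2_ok = fit_one(factor, ret)
beta_bad, _, r2_bad = fit_two(factor, collider, ret)

print(f"correct model : beta = {beta_ok:+.2f}, R^2 = {r2_ok:.2f}")
print(f"with collider : beta = {beta_bad:+.2f}, R^2 = {r2_bad:.2f}")
```

In this setup the model that controls for the collider fits far better, yet it reports a negative coefficient on a factor whose true effect is positive. This is the pattern the text warns about: better fit, wrong sign.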
For individual managers, these errors may quietly erode returns; for markets as a whole, they distort capital allocation and create inefficiencies at a global scale.

When Misspecification Becomes a Systemic Risk

Model misspecification has multiple consequences.

Capital misallocation: Trillions of dollars are steered by models that confuse association with causation, a statistical mistake with enormous financial consequences.

Hidden correlation: Portfolios built on similar misspecified factors share exposures, increasing systemic fragility.

Erosion of trust: Every backtest that fails in live trading undermines investor confidence in quantitative methods as a whole.

ADIA Lab’s recent work goes further: it shows that no portfolio can be efficient without causal factor models. If the underlying factors are misspecified, even perfect estimates of means and covariances will yield suboptimal portfolios. That means investing is not merely a prediction problem, and adding complexity doesn’t make the model better.

What Can Investors Do Differently?

Factor investing’s predicament will not be resolved with more data or more complex methods. What is most needed is causal reasoning. Causal inference offers practical steps every allocator can apply now:

Demand causal justification. Before accepting a model, ask: Have the authors declared the causal mechanism? Does the causal graph align with our understanding of the world? Is the causal graph consistent with empirical evidence? Are the chosen controls sufficient to eliminate confounder bias?

Identify confounders and avoid colliders. Confounders should be controlled for; colliders should not. Without a causal graph, researchers cannot tell the difference. Causal discovery tools can help narrow the set of causal graphs consistent with the data.

Explanatory power is misleading. A model that explains less variance but aligns with plausible causal structure is more reliable than one with a dazzling R². In practice, stronger association does not mean greater profitability.

Test for causal stability. A causal factor should remain meaningful across regimes. If a “premium” changes sign after each crisis, the likely culprit is misspecification, not a shifting compensation for risk.

From Association to Understanding

Finance is not alone in this transition. Medicine moved from correlation to causation decades ago, transforming guesswork into evidence-based treatment. Epidemiology, policy analysis, and machine learning have all embraced causal reasoning. Now it is finance’s turn.

The goal is not scientific purity; it is practical reliability. A causal model identifies the true sources of risk and return, allowing investors to allocate capital efficiently and explain performance credibly.

The Path Forward

For investors, this shift is more than academic. It’s about building strategies that hold up in the real world: models that explain why they work, not just that they work. In an era of data abundance, understanding cause and effect may be the only real edge left. Factor investing


Book Review: Irrational Together

Irrational Together: The Social Forces That Invisibly Shape Our Economic Behavior. 2025. Adam S. Hayes. The University of Chicago Press, Ltd., London

Investment professionals who keep abreast of economic research know that the behavioral school has exposed flaws in conventional theory based on homo economicus, a hypothetical being capable of perfectly rational decision-making. A familiar illustration of the gap between that depiction and reality is the substantially higher percentage of employees who participate in 401(k) plans when given the choice to opt out rather than opt in; simply framing the decision differently produces a different outcome.

Adam S. Hayes’s Irrational Together makes the case that the behavioral critique does not go far enough. Rather, it remains focused on the cognitive psychology of the individual, overlooking socially driven deviations from traditionally defined rational economic choices.

Hayes, a professor of sociology at the University of Lucerne with previous experience as an equity derivatives sales trader and licensed financial advisor, describes numerous ways in which social and cultural norms cause people to diverge from straightforwardly obtaining the maximum personal benefit for the least possible expenditure. He presents survey findings involving decisions such as whether to save money by downsizing from a house that includes a spare bedroom used by one’s mother-in-law on occasional weekend visits. Respondents’ answers varied according to what they were told about how harmonious the relationship is between the homeowner and the mother-in-law. When asked the basis for their answers, however, the overwhelming majority cited only financial considerations.
Lest investment professionals imagine they are immune from having their financial decisions skewed by social factors, Hayes cites a study involving in-group bias that found that ostensibly self-interested venture capitalists prefer to fund startups of teams with professional backgrounds and education similar to their own. This is just one of many striking research findings highlighted in Irrational Together, including:

Notwithstanding the attention heaped on the behaviorists’ nudging techniques, a meta-analysis covering more than 200 published studies found that the nudging backfired in some instances, leaving an overall effect of zero.

Field studies produced evidence that the widely reported gender-based disparity in risk tolerance is not entirely biologically determined but also reflects differences in socialization of males and females.

Research over the past two decades has found that the left-brained/right-brained dichotomy enshrined in pop psychology has no scientific basis.

An analysis of the self-managed portfolios of 70,000 investors documented a seven-percentage-point-per-annum average underperformance of the S&P 500 Index.

The research Hayes draws upon includes much of his own meticulous work. For instance, in his examination of the robo-advisor phenomenon, he pored over regulatory filings, interviewed providers, and opened accounts with several firms, posing alternatively as a thirty-five-year-old and a fifty-year-old.

Attesting to the fact that there are no perfect books, Hayes attributes to baseball immortal Yogi Berra the adage, “It’s tough to make predictions, especially about the future.” The indispensable Quote Investigator reports on the contrary, “[C]urrent evidence indicates that this comical proverb was first expressed in Danish, and the author remains unknown.”

Nevertheless, Irrational Together enriches our understanding of the collective impact of economic decisions.
An intriguing section near the end ponders the paradoxical undermining of rational outcomes that could result from increasingly widespread application of modern portfolio theory via robo-advisors. Reading this book will provide investment professionals who deal with private clients valuable tips to help them avoid damage to their performance, not only through decisions that are irrational because of innate programming of the human brain, but also through those that arise from social conventions, culture, religion, and ideology.
