Book Review: Financial Data Science

The book covers the key concepts required for a data science course focused on finance, organized into 14 chapters that center on theory and techniques, with clear exercises on applications and use cases. Each chapter presents a unique data science topic, with discussions and interpretations not typically covered in empirical finance texts. Key data science topics include:

• principal component analysis,

• cluster analysis,

• advances beyond linear regression,

• linear and nonlinear classifiers and kernel methods,

• deep learning with neural networks,

• advances in portfolio optimization beyond the mean/variance model,

• financial networks, and

• text analytics, which includes large language models (LLMs) and natural language processing (NLP).

All these topics provide essential tools for any quantitative analyst who wants to stay current without having to work from source material in academic research. The notation is consistent across techniques, which is of critical value because it makes learning easier. The authors do an effective job of presenting what could be considered classical techniques as foundations and then showing variations that can serve as application enhancements. For regression and optimization, they show alternative methods that can potentially yield better solutions.

The book enables the reader to navigate through the theory behind data science. It does an effective job of unifying the mathematical concepts behind data science techniques and providing insight into how an analyst can apply them to real-world problems. The book is not a “cookbook”; instead, it emphasizes the math behind its key data science topics, followed by application exercises. The reader who works through the book’s focused exercises and applications should be able to understand how these theories can be used to solve real problems. Nevertheless, from a practitioner’s viewpoint, a greater focus on the applied side of data science beyond the exercises would have been beneficial. This means explaining why a new tool will yield improved predictions and tapping into the authors’ collective wisdom by offering insights into when and how these techniques will be helpful, and when simple methods will suffice.

Techniques are tools, and the book does a good job of explaining the different tools. It needs, however, to present more clearly when and why analysts should use specific tools and how to interpret model output. The rationale for using a particular technique and the skill in applying it come from experiential knowledge that is hard to gain from any course or textbook, yet imparting the process engineering for analyzing data is the critical piece that would elevate this book above others for the CFA Institute readership.

Too often, beginning quant analysts who have learned new techniques apply them to every problem they encounter without considering which tool is best for a given problem. For example, the explosion of machine learning techniques is changing how quantitative analysis is conducted in finance. Yet the nagging question remains of when a complex method is better suited to a given problem than a more straightforward technique. This thinking goes beyond torturing data until it talks and just reporting better prediction metrics.

While including new techniques is an advantage for this book, its topic-by-topic emphasis detracts from its value. From a user’s perspective, data science should be a way of thinking about how to process information, not just a set of techniques. The primary challenge is establishing a framework for conducting data analysis. The process of doing data science is what makes it distinct from merely applying techniques from a toolbox to data problems. How should an analyst systematically look at data, regardless of the problem? A stepwise process for analyzing data is timeless and should always be at the forefront.

There is also limited discussion of time-series analysis, which is foundational to financial analysis, and of cross-sectional analysis. If you are a quant analyst, both are critical to the successful application of data science to complex prediction problems. Similarly, the authors do not address key issues such as p-hacking and model overfitting. With cheap computing, data science requires thoughtful approaches to find better predictions efficiently, without overfitting or hunting for fitted results.
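To make the overfitting concern concrete, here is a minimal sketch that is not taken from the book. It assumes simulated data in which the outcome depends only weakly and linearly on a single predictor, and it compares polynomial fits of increasing degree on a small training sample against a larger held-out sample. The function names, sample sizes, noise level, and degrees are all illustrative assumptions.

```python
import numpy as np


def make_data(rng, n):
    """Simulate a weak linear signal buried in noise (illustrative assumption)."""
    x = rng.uniform(-1.0, 1.0, n)
    y = 0.5 * x + rng.normal(0.0, 1.0, n)
    return x, y


def fit_and_score(x_train, y_train, x_test, y_test, degree):
    """Fit a least-squares polynomial; return (in-sample MSE, out-of-sample MSE)."""
    model = np.polynomial.Polynomial.fit(x_train, y_train, degree)
    train_mse = float(np.mean((model(x_train) - y_train) ** 2))
    test_mse = float(np.mean((model(x_test) - y_test) ** 2))
    return train_mse, test_mse


rng = np.random.default_rng(0)
x_train, y_train = make_data(rng, 30)    # small sample used for fitting
x_test, y_test = make_data(rng, 1000)    # held-out sample never used for fitting

# Compare a simple model (degree 1) with more flexible ones (degrees 3 and 9).
results = {
    degree: fit_and_score(x_train, y_train, x_test, y_test, degree)
    for degree in (1, 3, 9)
}

for degree, (train_mse, test_mse) in results.items():
    print(f"degree={degree}  in-sample MSE={train_mse:.3f}  out-of-sample MSE={test_mse:.3f}")
```

Because these polynomial models are nested, in-sample error cannot rise as the degree increases, so a better in-sample fit says nothing by itself about predictive value. Only the held-out comparison shows whether the added complexity helps or the simple method suffices.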

Despite some drawbacks, Financial Data Science provides readers with an understanding of and exposure to many new techniques with potential financial value. Financial Data Science is at the current forefront of quantitative knowledge transfer. Some of these techniques will become part of the standard toolbox, while others may become less valuable over time. Analysts and managers will both need to distinguish the durable data science tools from those that may be fads; otherwise, they will be at a competitive disadvantage to other firms. Reading the book requires hard work, a scratch pad for the math, and some programming skill, but the payoff is high for anyone who wants to stay current.
