Rethinking Research: Private GPTs for Investment Analysis
In an era where data privacy and efficiency are paramount, investment analysts and institutional researchers may increasingly be asking: Can we harness the power of generative AI without compromising sensitive data? The answer is a resounding yes.

This post describes a customizable, open-source framework that analysts can adapt for secure, local deployment. It showcases a hands-on implementation of a privately hosted large language model (LLM) application, customized to assist with reviewing and querying investment research documents. The result is a secure, cost-effective AI research assistant, one that can parse thousands of pages in seconds and never sends your data to the cloud or the internet.

I use AI to augment the process of investment analysis through partial automation, also discussed in an Enterprising Investor post on using AI to augment investment analysis. This chatbot-style tool allows analysts to query complex research materials in plain language without ever exposing sensitive data to the cloud.

The Case for “Private GPT”

For professionals working in buy-side investment research, whether in equities, fixed income, or multi-asset strategies, the use of ChatGPT and similar tools raises a major concern: confidentiality. Uploading research reports, investment memos, or draft offering documents to a cloud-based AI tool is usually not an option.

That’s where “Private GPT” comes in: a framework built entirely on open-source components, running locally on your own machine. There’s no reliance on application programming interface (API) keys, no need for an internet connection, and no risk of data leakage.
This toolkit leverages:

- Python scripts for ingestion and embedding of text documents
- Ollama, an open-source platform for hosting local LLMs on the computer
- Streamlit for building a user-friendly interface
- Mistral, DeepSeek, and other open-source models for answering questions in natural language

The underlying Python code for this example is publicly housed in the GitHub repository here. Additional guidance on step-by-step implementation of the technical aspects in this project is provided in this supporting document.

Querying Research Like a Chatbot Without the Cloud

The first step in this implementation is launching a Python-based virtual environment on a personal computer. This maintains a separate set of packages and utilities that feed into this application alone. As a result, the settings and package configurations used by other Python applications and programs remain undisturbed.

Once the environment is set up, a script reads and embeds investment documents using an embedding model. These embeddings aim to capture semantic meaning, allowing LLMs to work with the document’s content at a granular level. Because the model is hosted via Ollama on a local machine, the documents remain secure and do not leave the analyst’s computer. This is particularly important when dealing with proprietary research, non-public financials (as in private equity transactions), or internal investment notes.

A Practical Demonstration: Analyzing Investment Documents

The prototype focuses on digesting long-form investment documents such as earnings call transcripts, analyst reports, and offering statements. Once the TXT document is loaded into the designated folder of the personal computer, the model processes it and becomes ready to interact. This implementation also supports a wide variety of other document types, including Microsoft Word (.docx), web pages (.html), and PowerPoint presentations (.pptx).
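To make the embed-and-retrieve idea in this section more concrete, here is a minimal sketch of how passages can be ranked against a question by cosine similarity. It is an illustration under stated assumptions, not the code from the repository: `toy_embed` is a hypothetical stand-in that simply counts words, whereas the project uses a real embedding model served locally, and the sample passages are made up.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_passages(question: str, passages: list[str], top_k: int = 2) -> list[tuple[float, str]]:
    """Return the top_k passages most similar to the question, best match first."""
    query_vector = toy_embed(question)
    scored = [(cosine_similarity(query_vector, toy_embed(p)), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Made-up sample passages, standing in for pieces of an ingested document.
    passages = [
        "Revenue grew year over year, driven by services.",
        "The chief executive officer discussed supply chain constraints.",
        "Operating expenses were in line with guidance.",
    ]
    question = "What did the chief executive officer discuss?"
    for score, passage in rank_passages(question, passages):
        print(f"{score:.2f}  {passage}")
```

A real embedding model would also match passages that share meaning but not exact words, which a word-count vector cannot do. The ranking step, however, works the same way: score every passage against the question and keep the best few.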
Using a simple chatbot-style interface powered by Streamlit and rendered in a local web browser, the analyst can begin querying the document through the chosen model. Even though this launches a web browser, the application does not interact with the internet. The browser-based rendering is used in this example to demonstrate a convenient user interface. It could be replaced with a command-line interface or another front end.

For example, after ingesting an earnings call transcript of AAPL, one may simply ask: “What does Tim Cook do at AAPL?” Within seconds, the LLM parses the content from the transcript and returns: “…Timothy Donald Cook is the Chief Executive Officer (CEO) of Apple Inc…”

This result is cross-verified within the tool, which also shows exactly which pages the information was pulled from. With a mouse click, the user can expand the “Source” items listed below each response in the browser-based interface. The sources feeding into that answer are rank-ordered by relevance. The program can be modified to list a different number of source references. This feature enhances transparency and trust in the model’s outputs.

Model Switching and Configuration for Enhanced Performance

One standout feature is the ability to switch between different LLMs with a single click. The demonstration cycles among open-source LLMs such as Mistral, Mixtral, Llama, and DeepSeek. This shows that different models can be plugged into the same architecture to compare performance or improve results. Ollama, an open-source software package that can be installed locally, facilitates this flexibility. As more open-source models become available (or existing ones get updated), Ollama enables downloading or updating them accordingly. This flexibility is crucial.
It allows analysts to test which models best suit the nuances of a particular task, e.g., legal language, financial disclosures, or research summaries, all without needing access to paid APIs or enterprise-wide licenses.

Other dimensions of the setup can also be modified to target better performance for a given task or purpose. These configurations are typically controlled by a standalone file, named “config.py” in this project. For example, the similarity threshold among chunks of text in a document may be set to a high value (say, greater than 0.9) to identify only very close matches. This helps to reduce noise but may miss semantically related results if the threshold is too tight for a chosen context. Likewise, the minimum chunk length can be used to identify and weed out very short chunks of text that are unhelpful or misleading.

Important considerations also arise from the choice of chunk size and of the overlap among chunks of text. Together, these determine how the document is split into pieces for analysis. Larger chunk sizes allow for more context per answer, but may also dilute the focus of the topic.
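As a rough illustration of how these settings interact, the sketch below splits text into overlapping chunks, drops chunks that are too short, and keeps only matches above a similarity threshold. The setting names (`chunk_size`, `chunk_overlap`, `min_chunk_length`, `similarity_threshold`) and their default values are assumptions chosen for this example; the names and values in the project’s own config.py may differ. The sketch also assumes simple character-based splitting, while real pipelines often split on sentences or tokens instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkConfig:
    # Hypothetical setting names and defaults, for illustration only.
    chunk_size: int = 200               # characters per chunk
    chunk_overlap: int = 50             # characters shared by neighboring chunks
    min_chunk_length: int = 20          # shorter chunks are discarded
    similarity_threshold: float = 0.9   # minimum score for a match to be kept


def split_into_chunks(text: str, cfg: ChunkConfig) -> list[str]:
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if cfg.chunk_overlap >= cfg.chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = cfg.chunk_size - cfg.chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + cfg.chunk_size].strip()
        if len(piece) >= cfg.min_chunk_length:
            chunks.append(piece)
        if start + cfg.chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


def filter_matches(scored: list[tuple[float, str]], cfg: ChunkConfig) -> list[tuple[float, str]]:
    """Keep only (score, chunk) pairs whose score meets the similarity threshold."""
    return [(score, chunk) for score, chunk in scored if score >= cfg.similarity_threshold]


if __name__ == "__main__":
    cfg = ChunkConfig(chunk_size=10, chunk_overlap=4, min_chunk_length=5)
    for chunk in split_into_chunks("abcdefghij" * 3, cfg):
        print(chunk)
```

Raising `chunk_overlap` makes neighboring chunks share more text, which reduces the chance that a relevant sentence is cut in half at a boundary, at the cost of more chunks to embed and search.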