By Justin Carbonneau (@jjcarbonneau)
Artificial intelligence and techniques such as machine learning have become more prevalent in investing over the past decade. The launch of ChatGPT late last year, and my subsequent experiments with it, got me thinking about the technology’s possible impact on investors.
We’ve had multiple guests on our podcast, Excess Returns, who use A.I. techniques in their investing process. I wanted to get their perspectives on ChatGPT and its implications for investing and beyond.
I asked these technology / investing pros …
- Kevin Zatloukal, former Google programmer and computer scientist. (@kczat)
- John Alberg & Lakshay Chauhan of Euclidean Technologies, a firm using machine learning in its value investing process. (@JohnAlberg)
- Kai Wu, founder of Sparkline Capital, an investment and ETF issuer that uses machine learning and computing to uncover alpha in large, unstructured data sets. (@ckaiwu)
The following four questions …
- How do you think investors may benefit from ChatGPT and the technology behind it?
- What are the risks or downsides of a technology like ChatGPT?
- What is the most amazing/interesting thing you’ve seen ChatGPT produce so far?
- Any other thoughts you might have relating to investing, ChatGPT or anything else.
What Is ChatGPT?
ChatGPT (https://chat.openai.com) is a natural language processing tool developed by OpenAI, a private venture-backed company. When a user provides ChatGPT with a prompt (or question), it gives them back a response in a way that a human would, presenting information back in a conversational and structured form. Not only can it find an answer to almost any question, but it can write music, correct and produce computer code, write a resume and cover letter, provide advice on relationships, create legal contracts and even write a letter to your young child explaining that Santa isn’t real and why we make stories out of love. It’s no wonder that Microsoft has a strategic investment in the firm behind ChatGPT and is planning to roll out the technology across its product suite!
 How do you think investors may benefit from ChatGPT and the technology behind it?
Kevin: The most immediate applications, I think, of the technology will be to make reading and writing easier. ChatGPT is already implicitly summarizing what it has read on the web, but other research projects have done this more directly. I’d expect Microsoft and/or Google will give us good summarization features soon. Somewhat relatedly, I’d bet that we can get something to tell us what text looks different in one document vs other similar ones, which you could use to highlight interesting changes in a new 10-K, for example. On the writing side, I’d assume Microsoft and/or Google will give us features to turn a bulleted list of points into an easy-to-read letter or essay format.
Those features make more sense in Word or Chrome, I think, than in a chat-bot like ChatGPT. In my view, the best application of chat-bots is to help with brainstorming or organizing your thoughts. Programmers have long noted that explaining your problem to a rubber duck (“rubber ducking”) often helps you figure out the problem, and I imagine that ChatGPT is even more useful there as it can provide semi-intelligent feedback. I’ve seen others note that ChatGPT can act usefully as a therapist, which is a task along similar lines.
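Kevin’s idea of flagging what changed in a new 10-K can be approximated, very roughly, without any language model at all. The sketch below is an illustration only: it uses plain text diffing from Python’s standard `difflib` module rather than anything ChatGPT does, the function name `changed_sentences` is hypothetical, the two snippets are made-up placeholder text rather than real filings, and splitting sentences on periods is a simplifying assumption.

```python
import difflib


def changed_sentences(old_text: str, new_text: str) -> list[str]:
    """Return sentences that are new or reworded in new_text relative to old_text."""
    # Naive sentence split on periods; real filings would need a proper tokenizer.
    old = [s.strip() for s in old_text.split(".") if s.strip()]
    new = [s.strip() for s in new_text.split(".") if s.strip()]
    # ndiff prefixes lines that exist only in the second sequence with "+ ".
    return [line[2:] for line in difflib.ndiff(old, new) if line.startswith("+ ")]


# Hypothetical excerpts standing in for last year's and this year's risk section.
last_year = (
    "Revenue grew modestly. We face risks from competition. "
    "Our supply chain is stable."
)
this_year = (
    "Revenue grew modestly. We face risks from competition and regulation. "
    "Our supply chain is stable. "
    "We identified a material weakness in internal controls."
)

for sentence in changed_sentences(last_year, this_year):
    print("CHANGED:", sentence)
```

A language model could, in principle, go a step further and judge which of the flagged sentences are actually interesting, which is the part a plain diff cannot do.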
John & Lakshay: There are two main applications of this technology where we see big potential for investors.
First, an interface like ChatGPT makes information far more accessible. A good way to think about ChatGPT-like technology is as a calculator, but for reading and writing. It enables investors to be far more efficient with their time. They can focus less on gathering information, and more on analyzing it. These systems will act as assistants for the user. For example, you can imagine a ChatGPT-like system responding to prompts like: “What were the year-over-year same-store sales for Walmart in every year, for the last ten years?” or “Write me a detailed report of the current financial conditions in the commercial real estate market in downtown Austin.”
The second aspect relates to the investing strategies themselves. We have seen only limited quantitative use of qualitative information in investing strategies. That is, most large quant strategies still rely mainly on highly structured fundamental or market data found in company financial statements or market data feeds. Sentiment analysis has been tried using unstructured textual data, but there isn’t strong evidence that it has helped much. But now, since this technology can understand language, it is possible to extract much more nuanced information from the text in SEC filings, transcripts, and news stories. For example, it might be possible to identify value traps or potential bankruptcies by analyzing what management has been communicating to investors on quarterly earnings calls or in analyst presentations.
Kai: At Sparkline Capital, we have been using LLMs in production for a few years. We have found them quite useful for processing the unstructured text data found in documents such as patents, 10-Ks, earnings calls, or job postings. Our objective is to extract insights from these documents to create structured data “factors” for use in our quantitative models (e.g., brand, innovation, and human capital scores). We have found fine-tuning the base model on our industry-specific corpuses (i.e., transfer learning) to be particularly helpful (link to one of Kai’s detailed posts on deep learning & investing: https://www.sparklinecapital.com/post/deep-learning-in-investing).
 What are the risks or downsides of a technology like ChatGPT?
Kevin: As everyone has noted, ChatGPT’s answers are often wrong. That’s a problem because the humans that read them are credulous. A lot of videos I’ve seen implicitly hide this problem because they have someone who knows the answer asking the question and then verifying that it is correct, but normally when we ask questions, we don’t know the answer, and it’s not easy in that case to know if it’s correct or not. (You can actually ask ChatGPT to give you references, but it often makes those up too. If I have to do a bunch of Googling myself to verify the answer, then why bother to ask?)
ChatGPT seems to have a lot of interesting use cases for learning, but I’m not sure how we’re going to get around the problem that it will, off and on, tell you things that are false and you have no way to know which are which.
John & Lakshay: One of the biggest risks with ChatGPT and large language models is that there is no grounding in facts. The information that these models have acquired is stored within the models indirectly (as opposed to having identifiable sources). And since these models are probabilistic, they could write factually incorrect text that would be hard to verify unless the user already knows the answer. There are many great demonstrations of highly convincing but completely made up (incorrect) answers by ChatGPT. There have been some advancements to mitigate this risk, but nothing concrete so far.
Kai: I would still consider it a “beta” product, as the results are often unreliable and its knowledge is a few years stale.
 What is the most amazing/interesting thing you’ve seen ChatGPT produce so far?
Kevin: I’m spending my time trying to think about why ChatGPT should be able to do the things that it does, which means I usually end up not being amazed by what it can do. For example, it does not surprise me that it can give you a Python program that prints out 1 to 10 or even the first 10 primes because ChatGPT was trained on web posts and there are probably hundreds if not thousands of examples of each of those. Likewise, there are many examples of code that sets up a basic website. ChatGPT could simply have memorized those and be printing them back out to me. (It’s clearly not quite that simple because I can tell ChatGPT what I want the site to be called and it will put the right name in there, but the rest could be copy-and-paste.)
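For readers who don’t program, here is roughly what the two toy tasks Kevin mentions look like. This is one minimal way to write them, not output taken from ChatGPT, and the function name `first_primes` is simply a choice made for this illustration.

```python
def first_primes(count: int) -> list[int]:
    """Return the first `count` prime numbers using simple trial division."""
    primes: list[int] = []
    candidate = 2
    while len(primes) < count:
        # A number is prime if no smaller prime up to its square root divides it.
        if all(candidate % p != 0 for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


# Task 1: print the numbers 1 to 10.
for n in range(1, 11):
    print(n)

# Task 2: print the first 10 primes.
print(first_primes(10))
```

Programs like these appear in countless tutorials, which is Kevin’s point: a model that has seen them many times could reproduce them largely from memory.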
The interesting cases, to me, are when we ask it to create something substantially different from what exists on the web, but I haven’t had a lot of success getting it to do anything like that myself. I’ve only made a few attempts, but the results have been disappointing.
John & Lakshay: There are several interesting tasks that these models can perform. Writing and debugging code is a big one. This saves developers significant time. This also opens up the possibility for non-developers to be able to do programming tasks via natural language.
Another one is being able to write articles, blogs, summaries, etc. on a wide range of topics.
One area where this has a huge potential is education. ChatGPT-like tools are really good at following instructions and answering questions. A tutor-like interface could potentially be big for the education sector.
Kai: The amazing thing about these so-called “Large Language Models” (LLMs) is their versatility. Loosely speaking, they encode a general understanding of natural language through a pre-training process on billions of text documents (e.g., websites, books, Wikipedia). From here, they can be adapted for an extremely wide range of use cases. We’ve most recently seen hype built around “Generative AI” applications such as ChatGPT, but they can also be used for tasks such as classification, extraction, translation, clustering and other use cases.
 Parting thoughts on ChatGPT?
Kevin: The most interesting question I have about ChatGPT is what it’s going to be able to do in the future when it has more parameters and is trained on more data. There has been plenty of research already trying to look at how the abilities of similar models improve as each of those increases. So far, the trajectory suggests that this sort of model will never be good at math, for example. However, the OpenAI folks like to point out that new abilities sometimes emerge at larger scales that were not present in smaller models. They cite various types of “few-shot learning” as examples: ChatGPT can generalize from a few examples in ways that smaller models could not at all. So, some at OpenAI think that if we make the model bigger and train it on more data it will eventually figure out arithmetic, even though the trajectory doesn’t suggest this so far. The only way to find out is to try it. I’m looking forward to seeing what happens.
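To make the few-shot idea concrete: in practice it usually means placing a handful of worked examples in the prompt and leaving the last answer blank for the model to fill in. The sketch below only builds such a prompt as a string. It does not call any model, and the “Q:/A:” layout and the function name `build_few_shot_prompt` are assumptions chosen for illustration rather than a format prescribed by OpenAI.

```python
def build_few_shot_prompt(examples: list[tuple[str, str]], query: str) -> str:
    """Assemble a prompt from solved examples followed by one unsolved query."""
    lines: list[str] = []
    for question, answer in examples:
        lines.append(f"Q: {question}")
        lines.append(f"A: {answer}")
    # The final answer is left empty so the model is expected to continue from here.
    lines.append(f"Q: {query}")
    lines.append("A:")
    return "\n".join(lines)


# Two solved arithmetic examples, then the question we actually want answered.
examples = [("12 + 7", "19"), ("30 + 45", "75")]
prompt = build_few_shot_prompt(examples, "18 + 26")
print(prompt)
```

Whether the model then completes the final line correctly is exactly the open question Kevin raises about arithmetic.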
John & Lakshay: Euclidean is very excited about the prospects of this type of technology (generally referred to as Large Language Models, or LLMs) for investing. This is in part because an enormous amount of information about companies and markets is stored and communicated in an unstructured way through language rather than stored numerically in databases. Human analysts, economists, and investors find this information very useful in their decision-making process, so we should expect it to be useful for computers/models as well. However, it is currently a blind spot for computers/models: an enormous untapped reservoir that could be made available for automated investment decisions through the use of LLMs.
Kai: I’ve had a front-row seat to the progress of NLP and AI over the past few years. I expect innovation to continue on its exponential path and am excited to see what the future has in store!
Thanks to Kevin, John, Lakshay and Kai for taking the time to contribute! It will be interesting to watch the developments of ChatGPT and other AI systems as the technology improves and advances over time.