One thing I’ve learnt about AI is that it can be very helpful, yet it sometimes falls short of the task at hand. This became clear during a recent assignment focused on developing a value-at-risk model. This article covers both the success and failure I experienced with AI.
Here’s what worked.
A colleague introduced me to Otter.ai, a service that provides meeting transcripts (and more). Although other providers like Zoom offer similar services, and you can transcribe audio files to use with tools like GPT-4, this was my first time using Otter. It was straightforward. We invited the AI to the meeting as a participant, and by the end, it had generated a transcript and a set of notes. But it didn’t stop there. I could use either a single transcript or multiple ones from client meetings to generate various documents, whether a list of action points, memos in a specific style, an explanatory document, or just questions from the meeting. Otter’s capabilities seemed limited only by my imagination and needs.
For treasury teams, this kind of AI application offers immediate, practical benefits. By automating note-taking, you can concentrate on what is being said in the meeting. Nothing gets missed, and Otter’s ability to generate documents directly from transcripts can save a significant amount of time, particularly if you find writing slow going.
Now, what failed.
The assignment required building a prototype to demonstrate some of the main features a working risk model would have. In treasury and risk management, building prototype models like this in Excel is routine, and many banks rely heavily on Excel. This led me to wonder whether AI tools like GPT-4 or Claude could create a working value-at-risk model in Excel.
I tried this a few months ago with little success. Would it work now? I gave the AI some data and parameters. Then I instructed it to ask any clarifying questions one at a time before proceeding—a method I find useful for prompting.
The AI asked sensible questions, and I thought this time it would work. However, the results were disappointing. Claude struggled to understand the data in the spreadsheet, while GPT-4 managed better. With some prompting, GPT-4 eventually calculated daily price changes and daily volatility. Despite this progress, I still couldn’t generate a working model.
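To give a sense of what the AI was being asked to produce, here is a minimal sketch of the underlying calculation in Python rather than Excel. This is my own illustration, not the model from the assignment: it assumes a simple parametric approach, a series of daily closing prices, and normally distributed returns, with 1.65 standard deviations approximating the 95% confidence level.

```python
import math

def parametric_var(prices, confidence_z=1.65, position=1_000_000):
    """One-day parametric value-at-risk from daily closing prices.

    Illustrative only: assumes normally distributed returns;
    confidence_z=1.65 corresponds to roughly a 95% confidence level.
    """
    # Daily price changes, expressed as simple returns
    returns = [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]
    mean = sum(returns) / len(returns)
    # Daily volatility: sample standard deviation of the returns
    variance = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
    volatility = math.sqrt(variance)
    # VaR: potential one-day loss on the position at the chosen confidence
    return confidence_z * volatility * position

prices = [100.0, 101.5, 99.8, 100.9, 102.3, 101.1]
print(f"One-day 95% VaR: {parametric_var(prices):,.0f}")
```

A few lines like these are straightforward in code; the difficulty I ran into was getting the AI to express the same steps as working Excel formulas against real spreadsheet data.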
Frustratingly, AI is excellent at discussing concepts and can provide a detailed analysis of different risk approaches, but when it comes to writing formulas in Excel and manipulating data to create a functional spreadsheet, it just didn’t happen.
For us in treasury and risk, this highlights a critical point: while AI can already do many tasks, particularly in analysis and decision support, it may not (at the moment) do everything you expect.
However, this experience also underscores the importance of staying informed about AI’s changing abilities. If you haven’t tried it yet, you should consider initiating small pilot projects to test AI in specific areas, such as data analysis or simple modelling tasks. By doing so, you can assess the practical benefits without disrupting key processes.
Despite this setback, I’ve learnt a few valuable lessons.
Experimentation is key when working with AI. You need to explore different models—like GPT-4 and Claude—or applications such as Otter to discover what AI can do. Being open to using both models and applications expands your understanding of AI’s potential.
Over the past 18 months, I’ve developed an understanding of what AI is likely to accomplish and where it may fall short. I think it could do a lot more and that the limitation is me rather than it.
Using multiple AI models can be beneficial. In this case, GPT-4 outperformed Claude in working towards a model, although this is not always the case. Sometimes Claude’s larger context window and its dedicated output window (Artifacts) give it an edge over GPT-4. While I primarily use GPT-4, I turn to Claude when GPT-4 gets stuck.
My introduction to Otter came through a colleague who had been using it for some time. I quickly realised that, despite my experience with other AI models, using Otter was not much different. They all operate through a prompt window, and some offer multi-modal functionality. Understanding this fundamental operation is useful when you’re working with new applications.
Introducing AI to your team shouldn’t just involve giving everyone access. Instead, you should encourage experimentation, promote AI literacy, introduce pilot projects, collect experiences and knowledge, circulate that information within the firm, and quantify the benefits.
There should be a coordinated effort, perhaps led by a group or committee, to manage this process. Learning and sharing are crucial mindsets. For example, knowing how Otter works—thanks to my colleague—has saved me a considerable amount of time and significantly improved my efficiency. It’s about collective solutions and keeping track of how people use AI. The benefits extend beyond time-saving. When I was introduced to Otter, it wasn’t just a time-saver for working on documents; it also allowed me to concentrate on the conversation without the distraction of taking notes, which I found particularly helpful. I also appreciated knowing that the conversation was recorded and could be reviewed later.
While AI won’t do everything, it is improving. In some cases, its limitations are due to our lack of imagination or our inability to prompt it effectively. I’m sure others have successfully used AI to build spreadsheets—it just didn’t work out for me. However, AI can now handle spreadsheets in a more functional way than it could a few months ago and can execute basic commands using the data contained within them.
Where this technology will be in three to five years is an intriguing question, particularly for those of us working in treasury. It seems inevitable that we’ll be able to instruct AI to build fully functional risk management models that provide both quantitative information and qualitative insights.
That’s a topic I intend to explore in a later blog. For now, many of our roles, especially those in knowledge-based sectors, will undergo significant changes. The future is undoubtedly exciting, but for many, the speed of these changes will also be unsettling.
If you liked this, sign up for regular posts.