#59: AI Passed the CFA
Model progress and context engineering
Well, my ego took a major hit last week. AI passed Level III of the CFA exam, which doesn’t feel great. I put an enormous amount of time and effort into studying, and now a computer can make it happen in a matter of minutes.
To make matters worse (for my ego), it’s not just one frontier model that passed. Models from OpenAI, Google (Gemini), and Anthropic (Claude) passed a series of Level III mock exams, scoring consistently above the Minimum Passing Score threshold.
As a CFA charterholder, it’s a bummer to see a computer model complete a process regarded by many as brutally difficult. But at the same time, I’m excited about what this means for how AI will solve problems going forward.
A bit of background on the CFA exam process: the CFA program consists of three comprehensive finance exams conducted in person. Levels I and II are multiple-choice, while Level III is a mix of multiple-choice and essay-style prompts. The CFA designation is considered the top credential in the finance industry.
AI had already made quick work of Levels I and II; these multiple-choice exams are deterministic, since the answer can only be one of a fixed set of choices. AI has struggled to crack Level III, though, given the ambiguity of the critical thinking required in the essay section. Level III tests the bigger picture for finance professionals: you are asked to tie various concepts together in a written response. These broader, probabilistic questions have stumped AI to date because they allow for multiple valid versions of a correct answer.
How did the AI models pass Level III? Researchers moved the models away from their default line of reasoning to a technique called chain-of-thought prompting, which requires the model to state its logic, summarize its findings, and then choose an answer. Essentially, researchers asked AI to “show its work.” By combining that structured reasoning with each model’s domain knowledge, AI rose to the occasion and passed the essay portion.
I find the simplicity of this result fascinating. Researchers asked the model to show its work, and we got a better answer.
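To make the idea concrete, here’s a minimal sketch of the difference in Python. The exam-style question, the scaffold wording, and the message format are my own illustrations of the technique, not the researchers’ actual prompts.

```python
# Minimal sketch of zero-shot vs. chain-of-thought prompting.
# The question and scaffold text below are illustrative only.

QUESTION = (
    "A client with a 10-year horizon and moderate risk tolerance asks whether "
    "to shift from a 60/40 portfolio to a liability-driven strategy. "
    "Recommend an approach and justify it."
)


def zero_shot(question: str) -> list[dict]:
    # Ask for the answer directly; the model replies with whatever comes first.
    return [{"role": "user", "content": question}]


def chain_of_thought(question: str) -> list[dict]:
    # Ask the model to show its work: state its logic, summarize its findings,
    # and only then commit to an answer.
    scaffold = (
        "Work through this step by step.\n"
        "1. State the logic and concepts that apply.\n"
        "2. Summarize your findings.\n"
        "3. Only then give your final recommendation.\n\n"
        f"Question: {question}"
    )
    return [{"role": "user", "content": scaffold}]


if __name__ == "__main__":
    print(zero_shot(QUESTION)[0]["content"])
    print("---")
    print(chain_of_thought(QUESTION)[0]["content"])
```

Either message list could be sent to any chat-completions-style API; only the second one forces the model to reason before it answers.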
Yet I can’t help but bring up Daniel Kahneman’s System 2 thinking, a topic we discussed in AI + Thinking, Fast and Slow. Quick, intuitive System 1 thinking comes more naturally to AI models: when you ask a specific question of AI, you get a specific answer in return. System 2, on the other hand, governs the slow, deliberate decision-making required for ambiguous problems. When I wrote that article, AI was not in a place to comfortably handle System 2 thinking.
AI still isn’t in a place where it can seamlessly replicate System 2 thinking; a specific type of prompting is needed for the model to deliberate over an answer. But it means we are getting closer to imitating human thought processes with technology. Passing CFA Level III is a promising step toward simulating System 2-like reasoning.
The sooner AI can achieve System 2 thinking at scale, the sooner we’ll see much more commercial success in fields that require subjective judgment, like sales or marketing. As we bring structure to the ambiguity of probabilistic problem solving, we’ll be able to take on the near-infinite number of outcomes involved in trying to close a sale, where there is usually a human on the other end. By communicating with your customer or prospect in a chain of thought that lines up with how they prefer to receive information, I’ll bet win rates increase.
Here’s an example: I recently took a sales demo of AI voice software that aims to replicate a brand’s sales rep. Although I was impressed with its sheer ability to sound human, with tone inflection, natural pauses, and an accent, the software fell short, particularly in its ability to reason and pitch persuasively. It was better at reactive selling than proactive selling.
What am I tactically taking away from the chain-of-thought breakthrough? I’m investing more in my prompt writing. In fact, I’ve heard the term “context engineering” lately and believe it better represents how we need to communicate with AI. For repetitive processes, I first write out how I want the model to think through a problem: decision-making frameworks, checkpoints for when to question its reasoning, examples of what a correct answer may look like, and signals of a wrong answer. A rough sketch of that structure is below.
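This is a hypothetical template, not a prescribed format; the sales-email task, the framework wording, and the field names are all placeholders for whatever your own repetitive process looks like.

```python
# A rough sketch of "context engineering": instead of a bare question, the
# prompt spells out how the model should think through a repetitive task.
# Every field below is illustrative.

CONTEXT_TEMPLATE = """\
Task: {task}

How to think through this problem:
- Decision framework: {framework}
- Checkpoints: after each step, ask whether the reasoning still matches the
  framework above; if it does not, restate your logic before continuing.
- Example of a correct answer: {good_example}
- Signals of a wrong answer: {bad_signals}

Show your reasoning first, then give the final answer.
"""

prompt = CONTEXT_TEMPLATE.format(
    task="Draft a follow-up email to a prospect who went quiet after a demo.",
    framework="Lead with their stated goal, address the last objection, "
              "close with one specific next step.",
    good_example="Short, references the demo, ends with a single clear ask.",
    bad_signals="Generic opener, multiple asks, no mention of the prospect's goal.",
)

print(prompt)
```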
Chain-of-thought prompting represents a big step for artificial intelligence because it simulates how a human thinks. When asked an open-ended question, we usually don’t spit out the first thing that comes to mind (zero-shot prompting). Instead, we think about a few possible solutions, reason with ourselves about how each path will play out, and then choose one.
If you want to rely more on AI, you must put in the effort to teach it how to process information. Otherwise, your prompt is a glorified Google search query. Sure, that will work if your question is simple, but it may fall apart if you’re handing AI a multi-step task or a deeper analysis.
Side note: because AI passed Level III, it’s fair to question whether the massive commitment of the CFA exam is worth the trouble, especially since it’s frequently cited that 1,000 hours of studying are needed to pass all three exams (1,000 hours cumulatively, not per exam).
But what those 1,000 hours do signal is grit. From early mornings to late nights and weekends, the CFA exam process requires long-term discipline.
I still recommend taking the CFA exam if you are committed to a career in finance, as the process teaches you how to tie the little details to the bigger picture and vice versa. And the credential will still hold weight with people in the industry who are familiar with the commitment the exam requires.
Still, grit can be developed in many other ways. If we assume that AI will continue to improve at tackling complex, ambiguous problems (the likely case), learning specific material for the sake of signaling may not be your best return on investment. Define what grit looks like for you, and then go out and achieve it.

