Every Allocator Should Ask These Questions Before Hiring an AI Manager

Asset owners don’t need to be machine learning experts to tell which managers are the real deal.

The use of artificial intelligence in asset management is rapidly increasing — or at least that’s what asset managers want you to believe.

I’ve evaluated scores of managers claiming to use AI. Although some are genuine in their adoption, many are guilty of what I call AI-washing — professing to use AI when in fact they are merely employing traditional quantitative techniques, such as simple linear regressions, that technically qualify as “machine learning.”

These dubious claims largely target asset owners who are “eager” to invest in AI-driven funds, according to a recent CFA Institute Investor Trust Study. The survey found that 84 percent of institutional investors want to invest in funds that use artificial intelligence and 78 percent “believe that the use of AI in investment decision making will lead to better investor outcomes.”

As a community service, I’ve developed a due diligence checklist to help allocators assess managers’ AI credentials.

But AI is not a monolith, so let’s begin by defining the key terms. According to Andrew Moore, former dean of computer science at Carnegie Mellon University, “Artificial intelligence is the science and engineering of making computers behave in ways that, until recently, we thought required human intelligence.”

  • Machine learning (ML) is a type of AI in which algorithms developed by humans learn to perform a specific task (e.g., make predictions) without being explicitly programmed or hard-coded to perform that task. These algorithms improve with experience — examples include Decision Tree, Random Forest, Multilayer Perceptron, and Support Vector Machine. Let’s call this category “classical machine learning.” (A brief illustrative sketch follows this list.)
  • Deep learning (DL), a subset of machine learning, uses a layered structure of algorithms called an artificial neural network. Though humans design its architecture and select the network inputs and the desired output, the network — with the proper training — learns how to map the inputs to the intended outputs and make intelligent decisions on its own. Because of their structure and capacity, DL models can approximate much more complex functions than classical ML algorithms and can recognize nonlinear patterns in data that are too complex for humans to identify.
  • Reinforcement learning (RL), another type of machine learning, is based on algorithms that learn through trial and error which actions to take to achieve a particular goal (e.g., determining the optimal allocation of capital across a portfolio of stocks).
  • Some managers are also using another type of AI called natural language processing, in which machines make sense of human language and perform tasks like translation, keyword extraction, and topic classification. In many cases, some type of ML is used to automate these tasks.
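
To make the first of these categories concrete, here is a minimal, hypothetical sketch of what classical machine learning looks like in code: a random forest learns an up/down classification from labeled examples rather than from hand-coded rules. The features, labels, and data are synthetic and purely illustrative.

```python
# A minimal sketch of "classical machine learning": the model learns
# a mapping from labeled examples instead of being hand-coded with
# rules. All data here is synthetic and purely illustrative.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(seed=42)
X = rng.normal(size=(1000, 5))        # five hypothetical factor exposures
# Invented "next-period up/down" label with a noisy linear driver.
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=1000) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)           # the "learning" step
print(f"Out-of-sample accuracy: {model.score(X_test, y_test):.2f}")
```

A deep learning model would replace the random forest with a multilayer neural network trained on far richer inputs; a reinforcement learning agent would instead learn a policy from reward feedback.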

It is important to note that most managers that use AI are employing some type of classical ML to augment an existing human intelligence–based investment process. However, few managers seem to be using DL and fewer still RL, despite the proven ability of these algorithms to solve complex commercial problems.

Also, although managers might use AI to improve operational efficiencies (e.g., automating regulatory, middle- or back-office, or customer service processes), I am focusing only on how managers use AI to achieve an investment edge. Traditional operational due diligence could be used to assess these noninvestment applications.

With this background, here is my admittedly nonexhaustive checklist that allocators could use to evaluate the purpose, integrity, and efficacy of AI as used by asset managers.

  1. It might seem a bit pedantic, but start by asking a manager to clearly state what investment problem it is trying to solve, and why. The answer will provide the necessary context for the ensuing AI examination. What process did the manager go through to articulate this problem? What individuals and groups in the organization were involved in this process, and why? Has the process been documented? Finally, what is the manager’s solution to this problem, and why is it the optimal solution? The goal is to get a clear articulation of, and justification for, the resulting investment process.
  2. How did the manager decide to include AI as part of the investment process? Did it engage in a formal research process to make this decision? Who in the organization made the decision to explore and adopt AI? The investment team? The business group? Is the current use of AI supported by leadership? Does this adoption represent a broader decision to expand the use of AI in the future? The responses will help you gauge a manager’s commitment and approach to AI.
  3. Next, ask why the manager made the decision to use AI. Cassie Kozyrkov, chief decision scientist at Google, offers a heuristic for measuring the response: “If you can do it without AI, so much the better. ML/AI is for those situations where the other approaches don’t get you the performance [you] need.” Have the manager explain how AI provides a better solution than traditional human techniques.
  4. Now narrow the scope of inquiry: What particular task will the AI perform? Do not settle for generalities like “identifying market inefficiencies.” The manager should provide the specific use case and explain AI’s exact role in the investment process. The goal is to understand how the accomplishment of this task helps solve the investment problem.
  5. What type of AI is the manager using? Again, do not be satisfied with generalities like “machine learning” or “Nearest Neighbor.” Push for specifics, as there is nothing proprietary in this answer. Then have the manager justify the choice: Why is this type of AI the optimal choice to perform the task? The AI must fit the use case. Ask for examples of how this type of AI has been used successfully in other, similar noninvestment use cases.
  6. For many firms, adopting AI requires a new set of skills. Did the manager buy or build this competency? If the firm bought it, who is the vendor, what was the vetting process, and how are continued use and support ensured? Importantly, does the manager have exclusive use of the purchased technology? If the manager built it, who did the design and development? Did the organization hire new talent? How did it attract and retain the necessary data science talent? Ask to see the CVs of the data science team members; do they have experience using this type of AI in other commercial use cases? What is the relationship of this talent to the investment team and the rest of the organization? Recognize that creating an AI model, especially a DL or RL model, is a mix of art and science. More generally, be sure to ask about the firm’s AI budget by segment (e.g., human resources, computational power, data) to ascertain if it is sufficient to support the continued use and future development of the technology.
  7. Building, testing, training, and commercially deploying AI require a robust research and development infrastructure. Request an explanation of the firm’s technology stack (both hardware and software). Is it cloud-based? What computer languages does it use? Does this infrastructure present any computational constraints?
  8. As Alyssa Simpson Rochwerger and Wilson Pang correctly note in their book, Real World AI, “When creating AI in the real world, the data used to train the model is far more important than the model itself.” It is critical to ask managers about their choice of inputs. Managers might not provide a list of individual inputs, but they should present the kind and type of data as well as the data’s size, frequency, and quality. They should also describe the data curation process — why is this choice of inputs particularly well suited to the use case? Also, to determine if they have access to a steady (and redundant) stream of reliable data for as long as the model is used, ask them to identify their data providers. As Marco Iansiti and Karim Lakhani advise in Competing in the Age of AI, it is critical that their data pipelines be systematic, sustainable, and scalable.
  9. Because data seldom comes in a readily usable format, it is critical to understand how the data is collected, prepared, and cleaned; transformed into a usable format; and stored. Question managers about data governance: Are there any security or privacy issues related to the data they are using? If so, how are these managed? (An illustrative data-quality sketch follows this checklist.)
  10. Once you know the type and purpose of the AI and the choice of inputs, it is time to ask the manager to explain how the data is turned into actionable insights — specifically, how the model is trained to predict future outcomes. A generally agreed-upon training process used across verticals for ML and DL models includes three phases: training, validation, and testing. Ask for details of the manager’s process, including schemata. Be sure to have the manager put into plain language how that process avoids overfitting and look-ahead biases, and also request the data science and investment performance metrics used to measure the model’s performance. (A chronological-split sketch follows this checklist.)
  11. After developing and testing any AI, a manager must decide if the model is ready to go into production. What is the minimum performance it is willing to accept? Have the manager demonstrate that the model’s output exceeds what human intelligence could achieve. Also, ask who made the decision about whether the model was production-ready and if that decision was documented.
  12. Managers have likely run previous iterations of AI experiments that failed to provide acceptable results in the training, validation, or testing phase. Ask managers to explain how they responded to these failures. Did they review the models, data, and results to determine how problems might be remedied? What would cause them to scrap an experiment? What did they learn from these failures? The key is that they follow a rigorous process to ensure the integrity of new experiments (e.g., they do not cherry-pick features of a model or redo the testing phase with the same data) and that they document this iterative process in version-control software.
  13. If a model meets the required standards, how is it integrated into the human-designed investment process? (In the case of some applications, the model itself could be the investment process.) This integration is a nontrivial task requiring specific engineering skills. Who is responsible for this “plumbing”? Is there a quality check in the transition from testing to production? It is entirely appropriate to request a flow chart of a model’s signal generation process and to ask how the signal is integrated into the investment process. And then there are the practical matters: Does the manager have a means of ensuring the validity of input data? How often does it run the model? Does it run multiple models in parallel to ensure signal accuracy? How is the signal expressed? Do not hesitate to ask to see a signal being generated, and the resulting signal. Finally, are the models, data, and signals archived? (A hypothetical signal-generation sketch follows this checklist.)
  14. A manager’s commitment to the model and its output is of critical concern. A manager uses AI because it uncovers actionable insights undetectable to humans and these insights are additive to the investment process (or are the investment process itself). Therefore, knowing whether humans can override the model output is critical. If such discretion is permitted, who makes this decision and under what conditions? How often has this happened? Ask for specific examples of when the use of discretion worked and when it did not.
  15. The use of discretion also raises the question of risk management and whether it’s built into the AI functionality. In the case of some machine learning models, this is expected, but less complex ML models will frequently rely on an exogenous risk management function, often requiring human discretion. What is this function, and does it add value?
  16. The world changes, and so do data patterns. Therefore, it is imperative to monitor the model for performance decay. What process is used to detect and measure such decay? To limit model decay, it is best practice to regularly retrain the model on new data; what is the retraining process? More fundamentally, how would the manager know if the model stopped working? What metrics does the firm use? Has this ever happened? If so, what course of action did it take? (A minimal decay-monitoring sketch follows this checklist.)
  17. We’ve reached the point where it is time to ask the manager to demonstrate its AI’s contribution to the investment process. This demonstration must be based on empirical evidence. Do not accept a simple verbal explanation because, as research published in The Lancet shows, post hoc verbal explanations tend to misrepresent the relationships between inputs and outputs. Moreover, “the tendency is for humans to assume the AI is looking at whatever feature they, as human clinicians, would have found most important.” This is especially true with a DL or RL model, where the output cannot be explained because of the model’s very structure. In a future column, I’ll address this issue of explainability more thoroughly. But in summary, when it comes to AI in the real world, the focus should be on the rigor and thoroughness of validation procedures, not a human narrative.
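
To make items 8 and 9 concrete, here is a minimal, hypothetical sketch of the kind of data-quality gate a manager’s pipeline might run before any training. The column name, gap tolerance, and specific checks are invented for illustration; the point of the questions is whether such a gate exists, is documented, and runs before every training and inference pass.

```python
# Hypothetical data-quality gate (items 8 and 9): reject or repair
# obviously bad rows in a daily price feed before training begins.
import pandas as pd

def validate_prices(df: pd.DataFrame) -> pd.DataFrame:
    df = df.sort_index()                      # enforce chronological order
    if not df.index.is_unique:
        raise ValueError("duplicate timestamps in feed")
    if (df["close"] <= 0).any():
        raise ValueError("non-positive prices in feed")
    df["close"] = df["close"].ffill(limit=2)  # tolerate only short gaps
    if df["close"].isna().any():
        raise ValueError("gap in price history exceeds tolerance")
    return df
```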
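
For item 10, here is a minimal sketch of a chronological train/validation/test split, one common guard against look-ahead bias in time series: the model is fit only on data that predates the periods used to tune and test it. The proportions and synthetic data are assumptions for illustration.

```python
# Minimal chronological train/validation/test split (item 10).
# Splitting by time, never at random, is one standard guard
# against look-ahead bias. Data here is synthetic.
import numpy as np

n = 2500                                  # roughly ten years of daily observations
rng = np.random.default_rng(0)
X, y = rng.normal(size=(n, 8)), rng.normal(size=n)

train_end, val_end = int(n * 0.6), int(n * 0.8)
X_train, y_train = X[:train_end], y[:train_end]            # oldest data: fit the model
X_val, y_val = X[train_end:val_end], y[train_end:val_end]  # middle slice: tune hyperparameters
X_test, y_test = X[val_end:], y[val_end:]                  # newest data: touched once, at the end
```

A large gap between training and test performance here is the classic symptom of the overfitting that item 10 asks the manager to explain away.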
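
For item 13, here is a hypothetical sketch of a daily signal-generation step that validates its inputs before scoring and archives every signal for audit, the plumbing that item asks about. The function name, decision rule, and archive format are all invented.

```python
# Hypothetical daily signal-generation step (item 13): validate the
# inputs, score the model, and archive the output for later review.
import datetime
import json
import pandas as pd

def generate_signal(model, features: pd.Series, archive_path: str = "signals.log") -> dict:
    if features.isna().any():                 # refuse to trade on bad inputs
        raise ValueError("missing feature values; signal suppressed")
    score = float(model.predict(features.to_frame().T)[0])
    signal = {
        "date": datetime.date.today().isoformat(),
        "score": score,
        "action": "long" if score > 0 else "flat",  # invented decision rule
    }
    with open(archive_path, "a") as f:        # archive the signal for audit
        f.write(json.dumps(signal) + "\n")
    return signal
```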
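
Finally, for item 16, one minimal way to monitor for performance decay is to compare a rolling live hit rate against the level established during validation and flag the model for review when it degrades. The window, baseline, and tolerance below are illustrative assumptions, not recommendations.

```python
# Minimal decay monitor (item 16): flag the model for review when its
# rolling live hit rate falls well below its validation-era baseline.
import numpy as np

def has_decayed(hits: np.ndarray, window: int = 60,
                baseline: float = 0.55, tolerance: float = 0.05) -> bool:
    """hits: 1 where the model's daily call was correct, else 0."""
    if hits.size < window:
        return False                          # not enough live history yet
    return hits[-window:].mean() < baseline - tolerance
```

A flag from a monitor like this should trigger the retraining process the checklist asks the manager to describe.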

AI is not magic. Investment managers cannot simply hire data scientists and machine learning engineers, buy data, and hope that magic happens. VentureBeat estimates that only about 13 percent of all applied AI is actually put into production.

Designing, developing, and deploying models are experimental by nature, and many projects fail because of the complexity of the task itself. But they also fail for more mundane reasons, including unclear business use cases; a lack of understanding and commitment by leadership; organizational silos; a dearth of AI talent; lack of access to sufficient, relevant, and reliable data; deficient infrastructure; and the cost of implementation.

Despite these obstacles, some managers are using AI to create better investment outcomes for clients. The challenge for allocators is creating a methodology to separate the wheat from the chaff.



Angelo Calvello, Ph.D., is co-founder of Rosetta Analytics, an investment manager that uses deep reinforcement learning to build and manage investment strategies for institutional investors.
