A 2021 documentary from HBO Max and CNN Films paints a dark picture of personality testing. But it’s not an accurate picture. The filmmakers covered the sensational side of personality tests, but they didn’t take a research-based approach, nor did they cover how to use personality tests responsibly in the workplace.
The documentary does not provide a coherent argument or conclusion about personality itself. Rather, its core claim is about fairness: that personality testing is wholly unfair and is the cause of systemic bias in employment practices.
We agree with the response to the documentary issued by the Society for Industrial and Organizational Psychology (SIOP). (SIOP is the professional home for organizational psychologists who are trained to practice in this arena, and, in full disclosure, we are members.)
To be fair, the documentary is just one of many examples of how personality testing is often inaccurately portrayed. Perhaps these tests wouldn’t attract as much attention if they weren’t so popular with employers. That popularity points to the value they can provide despite the challenges that get noticed by the press. So how should employers get the value these tools provide without creating undue concerns?
We believe it’s important for HR professionals to understand how to use personality tests in the workplace. In fact, personality testing, when used appropriately, has been shown to improve the diversity of hiring decisions and to correlate with job, task, and training performance. As such, companies often use these tests in programs focused on employee growth and development.
Getting It Right
The film does get some things right. There are plenty of examples of bad HR practices that have unintended consequences. Every day we see bright, shiny new tools that claim to predict which applicants will do best on the job. Some use personality assessments, others use online games, and still others scrape social-media profiles for clues about future performance. Only some of these have been designed and tested to ensure they are effective and free from the sort of bias portrayed in the documentary.
That’s why it’s so important to learn how to use personality tests in the workplace correctly. This is particularly important if you are using them for leadership roles, where personality is a major factor.
In the eyes of the government agencies that audit these processes, even placing someone in an accelerated development pool is an employment decision. Employers need to have a good understanding of the requirements for the tools they use to assess their people.
For that reason, we will share a few rules for getting personality testing right in the workplace. We created these rules based on empirical research, decades of application, and the lessons of continuous scrutiny from companies devoted to fairness and validity. They will help you make the right decisions about people when it comes to hiring, promoting, and growing their careers.
We focus on personality here, but many of the same rules apply to the measurement of other characteristics as well.
MBTI and Other Personality Tests
Before we dive into the rules, we wanted to quickly discuss types of personality tests. For many people, the popular Myers Briggs Type Indicator® (MBTI) is the first personality inventory that comes to mind.
One of the big mistakes is assuming that all other personality tests are similar to the MBTI. They are not.
There are thousands of personality tests on the market, each with different purposes, measurement factors, validity, and more.
The MBTI was not designed to be used as a pre-employment personality test or to make hiring decisions, and the company says so clearly. But other workplace personality tests are specifically designed and tested for this use. These tests are built on well-supported research showing that they relate to specific aspects of job performance.
One of the most common personality theories behind these tests is the “Big 5” factor model of personality. The five factors (Neuroticism, Extraversion, Openness to Experience, Agreeableness, and Conscientiousness) are used to organize many more specific personality traits called facets.
For example, Conscientiousness contains facets like dutifulness, order, self-discipline, and achievement striving. These facets have a stronger research base than the MBTI factors.
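To make that structure concrete, here is a minimal sketch of how Big 5 factors organize narrower facets. The facet lists (other than the Conscientiousness examples above) are illustrative only, not an official taxonomy.

```python
# Illustrative sketch: Big 5 factors organizing narrower facets.
# Facet lists are examples only, not a complete or official taxonomy.
BIG_FIVE_FACETS = {
    "Neuroticism": ["anxiety", "anger", "vulnerability"],
    "Extraversion": ["gregariousness", "assertiveness", "activity"],
    "Openness to Experience": ["imagination", "curiosity", "adventurousness"],
    "Agreeableness": ["trust", "altruism", "cooperation"],
    "Conscientiousness": ["dutifulness", "order", "self-discipline", "achievement striving"],
}

for factor, facets in BIG_FIVE_FACETS.items():
    print(f"{factor}: {', '.join(facets)}")
```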
Different personality tests are similarly research-driven. Some have six factors (e.g., the “HEXACO” model), some have seven (the model that underlies the Hogan Personality Inventories), and so on.
No one model is right and the others wrong. They are simply different ways to slice the same pie. Rather, what’s important is that they have been tested and supported by research. They have shown a history of successful measurement and have withstood multiple checks for fairness.
These rules reflect not only research findings regarding the use of personality tests in the workplace, but also our years of experience implementing them for positive impact. As you will see, our bias is toward combining multiple measures to get a view of the whole person. We also place stronger weight on observed behavior, using personality as a way to understand the nuance and rationale behind workplace performance.
Use Proven Predictors That Are Measured Effectively and Are Empirically Researched to Ensure Fairness
It’s easy to find professional standards and legal guidelines to guide the appropriate use of assessment information. For reference, see examples such as SIOP’s Principles for the Validation and Use of Personnel Selection Procedures, and legal guidelines (e.g., the Uniform Guidelines on Employee Selection Procedures). That said, these are technical documents and require some expertise to interpret, so we will break down some of the key aspects.
According to these standards, a good personality tool should be both reliable and valid. Reliability means that if you used the tool many times on the same person, you would expect similar results. Validity means that it actually measures the targeted personality traits.
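To make “reliability” concrete, here is a minimal sketch of a test-retest check using made-up scores: administer the same scale twice and correlate the results. The data are purely illustrative; real reliability studies are far larger and more rigorous.

```python
from statistics import correlation  # Python 3.10+

# Hypothetical conscientiousness scores for the same ten people,
# measured once and again a few weeks later.
time1 = [3.2, 4.1, 2.8, 3.9, 4.5, 3.0, 3.7, 4.2, 2.5, 3.8]
time2 = [3.4, 4.0, 2.9, 3.7, 4.6, 3.1, 3.6, 4.1, 2.7, 3.9]

# Test-retest reliability: the correlation between the two administrations.
# Values near 1.0 suggest the scale produces consistent results.
print(f"Test-retest reliability: {correlation(time1, time2):.2f}")
```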
The provider of any personality tool should be able to address these questions and provide related research documentation for their tool. Even if the tool has demonstrated reliability and validity, the question of fairness across demographic groups is also critical. Ask the provider how their tool meets these standards and what evidence of fairness they have collected.
Furthermore, the providers should be able to describe the theory behind the variables they are assessing, and there should be a base of scientific research that shows the theory or model has been tested.
Demonstrate a Relationship Between Test Scores and Important Aspects of the Job
One of the most important criteria for any assessment is to show that it relates to something important about the job. Does the tool help to predict how the person will perform? How long they will stay on the job? How well they will get along with their teammates?
At a minimum, ask the provider for evidence that the tool has predicted these outcomes for similar jobs before you use it. Ideally, the provider will conduct research to show the relationship between their assessment and important features of the jobs in your company. Usually, you’ll receive these results as a correlation coefficient or a content map.
The test provider should deliver the results of the validation study to you in a format that allows you to prove to regulators that the tool works in your organization.
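For a sense of what a validation result looks like, here is a minimal sketch, with invented data, of a criterion-related validity check: correlate assessment scores with a later measure of job performance. An actual study would involve much larger samples, significance testing, and fairness analyses.

```python
from statistics import correlation  # Python 3.10+

# Hypothetical data: assessment scores and later supervisor performance
# ratings for the same group of hires.
assessment_scores = [62, 75, 58, 81, 70, 66, 90, 55, 73, 68]
performance_ratings = [3.1, 3.8, 2.9, 4.2, 3.5, 3.2, 4.5, 2.7, 3.6, 3.4]

# Criterion-related validity: how strongly the assessment relates to performance.
validity = correlation(assessment_scores, performance_ratings)
print(f"Validity coefficient: {validity:.2f}")
```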
Use Tools That Are Designed and Researched for Use in the Workplace
This one might seem obvious, but it’s not. In recent years, there has been an explosion of technology-based personality tests. But not all of them are designed to be relevant for use in the workplace.
This point is made in the documentary when the filmmakers describe how the MBTI was developed as a tool for self-insight. Yet some employers insist on using it to pick between job candidates, despite the publisher’s protests.
It’s not that the tool provides bad information. Rather, it was never designed and validated for this purpose.
The risks go up even higher if a tool was first designed to detect health and medical conditions but is being twisted for applications in the workplace. This practice can lead to illegal discrimination and has been banned in the United States under the Americans with Disabilities Act.
The EEOC has released extensive guidance on this topic. Reputable workplace assessment firms long ago stripped out any questions that reflect mental disabilities.
Watch Out for Unintended Consequences
Even the best assessments can produce unintended consequences. For example, a test for cognitive ability or an aptitude test can reveal differences between demographic groups that don’t have the same access to educational opportunities. Likewise, personality tests might show differences between people raised in different cultures.
These differences are real and are a common side effect of accurate measurement. But they can cause major issues if the assessment isn’t fair to people with different backgrounds who could have succeeded on the job.
Well-designed assessment processes should account for these differences by including many variables that give you a sense for the whole person, not just a few personality traits that are over-emphasized. Good consultants will work with you to monitor your assessments to ensure fairness. In addition, they can provide advice on how to best weight each variable so they fit your purpose. That way, you won’t end up with a test that’s unfair to your employees and job prospects.
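As one example of this kind of monitoring, here is a minimal sketch, with hypothetical counts, of the “four-fifths rule” check described in the Uniform Guidelines: compare selection rates across groups and flag any group whose rate falls below 80% of the highest group’s rate. Real fairness monitoring goes well beyond this single statistic.

```python
# Hypothetical applicant and hire counts for two demographic groups.
applicants = {"group_a": 200, "group_b": 150}
hires = {"group_a": 60, "group_b": 30}

# Selection rate = hires / applicants for each group.
rates = {group: hires[group] / applicants[group] for group in applicants}
highest = max(rates.values())

# Four-fifths (80%) rule of thumb from the Uniform Guidelines:
# a selection rate below 80% of the highest group's rate is a
# common flag for potential adverse impact.
for group, rate in rates.items():
    ratio = rate / highest
    flag = "review for adverse impact" if ratio < 0.80 else "ok"
    print(f"{group}: selection rate {rate:.2f}, ratio {ratio:.2f} -> {flag}")
```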
Another risk is to model your test after a small, similar group of people. A lot of companies do this because they’re trying to replicate the profile of a few top performers in important jobs. But this process is badly flawed.
Having the same personality profile in an organization can reinforce “group think.” When everyone approaches a problem from a similar point of view, they tend to reject new ways of thinking. Not only do these groups stifle innovation, but they often overlook critical risks, which can produce dire results.
Understand That Many Alternatives Are Less Effective Than Good Assessment
In our experience, well-intended HR and talent professionals overlook this point more than any other. When considering any tool or process that provides information about people (even interviewing techniques apply here), you must consider its quality. But you also have to compare it to whatever would be used as the alternative.
The reality is that we have to make decisions about people. And we need to base those decisions on some rationale.
If you don’t use a process that produces good information, what will you use? A hiring manager will find a way to fill the gap. They will make up their own interview questions. They will apply their own bias (whether they recognize it or not). They will be influenced by factors that are not related to the job. And they will be ecstatic when they find people who are similar to themselves.
Worse yet, they will be convinced they are great at making these decisions. Years of well-accepted science in this area have demonstrated that unstructured interviews are statistically terrible for identifying people who will succeed on the job.
Well-designed assessment and interview tools may not be flawless, but they are highly useful. If you don’t use these things for fear of the challenges, what will you use instead? The answer is almost always the same: “We trust our managers to make the right decisions about their people.” The science is not on your side if this is the path you take. Too many organizations fool themselves with this perspective.
Know What the Assessment Results Mean
Assessment is not magic. Psychologists design good assessments with an underlying logic that they have to be able to explain to others.
This is important because effective users of assessment data should have a basic understanding of what the scores really mean. More importantly, they need to understand what they don’t mean, so they don’t make judgments that stretch the value of the assessment results too far.
When it comes to personality tests, it is particularly important to keep the test result in perspective. In our practice, we commonly use personality tests to understand why participants might behave the way they do.
For example, an extrovert might do very well on assessments that require lots of partnership and outreach behaviors because these behaviors are consistent with their natural tendencies. Meanwhile, an introverted leader might take a different approach to partnerships, one more comfortable for their style, and still succeed in the role.
This information is useful when coaching and developing a leader to take on bigger challenges. Based on the profile, the coach can hypothesize which areas might be challenging for the leader and can offer recommendations consistent with the leader’s strengths. The coach has a perspective on the assessment information and can keep it in context. In this case, the personality test is used like a flashlight, shedding light where there was darkness before.
The risk goes up when naive users use scores for personality profiling. They often think the leader’s personality type is like a box in which the leader is trapped, preventing free choice. As a result, they assume the leader can’t succeed because of the personality trait.
The user has the responsibility to develop a basic understanding of the limits of the assessment data they are using.
Recognize There Are Multiple Ways for People to Succeed
Any assessment is just one perspective on an individual. But people are complicated. Even if they face a limitation from one perspective, they might surprise you and succeed through a combination of other traits and skills.
For this reason, it’s important to use a range of metrics when making important decisions about people. The HBO documentary overlooked this concept, portraying only situations in which a personality test was the sole pre-employment test used to screen out job seekers.
Assessment professionals routinely advise against relying too much on a single assessment, especially personality tests. There are so many ways people can succeed.
In the research on assessment, this idea shows up as low correlations between personality and job performance. For example, research on the Big Five personality model shows that Extraversion is important to measure for sales jobs, where it correlates with job performance at an average of .18.
That means that around three percent of the variability in sales performance can be accounted for by knowing if someone is extraverted. That’s a useful fact to know if you are hiring salespeople, but certainly should not be the only reason for hiring a particular person. However, if two job candidates are equally strong in other ways, it might be perfectly fine to lean toward the extraverted job applicant.
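For the curious, the “around three percent” comes from squaring the correlation. A quick sketch of the arithmetic:

```python
# Variance in performance explained by a predictor is the squared correlation.
r = 0.18                     # average correlation between Extraversion and sales performance
variance_explained = r ** 2  # 0.0324, or roughly 3%
print(f"{variance_explained:.1%} of the variability in sales performance")
```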
Add another complexity: whether a given personality trait helps or hurts can change over time, especially for leaders. Consider the characteristic of Ambition, which is desirable in upcoming leaders because it keeps them motivated for the challenges of the role. But when leaders with high ambition reach executive positions, they may express the same trait by assertively suppressing different viewpoints to preserve their dominance.
The same trait that helped early in a leadership career becomes a limitation in later roles. Sophisticated assessment users understand that there is no such thing as “the successful personality profile.” Rather, each profile can be strong or weak depending on which characteristics are expressed across different job circumstances.
Past Behavior Is the Best Predictor of Future Performance
You’ve probably heard this aphorism before. The idea is that people are consistent enough that their past behavior will likely be typical of their actions in the future. Therefore, the best way to predict future behavior is to examine current or past behavior.
This pattern is the foundation for many excellent assessment tools. For example, job sampling is an assessment technique that puts a person on the job for a short period of time. This approach is a very strong predictor of success. However, it comes with a number of challenges, including the practical complications, risk, and expense of putting people in jobs they aren’t ready for.
Assessment centers address this concern by simulating key parts of the job in a way that can be standardized and repeated across candidates. Research has shown that these tools are excellent predictors of performance across a wide range of jobs.
By combining these assessments with other tools such as personality tests, you can create a very well-rounded view of the person. You can see both performance and insight into why they behaved that way. That’s why so many companies use this combination for very high-risk roles such as CEOs, pilots, and astronauts.
Behavioral interviewing is another way to use past behavior as a predictor of future success. This structured interview approach asks for specific examples of when a job candidate has demonstrated a competency that is critical for the new job. Behavioral interviewing is also highly scalable and applicable to nearly any job.
Beware of Decision Rules That Can’t Be Explained
Remember that assessment is not magic. These tools need to be based on scientific research. Therefore, assessment creators must be able to explain the logic and rationale for why any tool works.
To be clear, that doesn’t mean the scoring key for a test is available, even if you ask for it. Often the exact rules for generating a score based on individual results are proprietary to the companies that invested in making them.
In fact, professional standards dictate that the information be kept secure. The reason is that, just like a high-school algebra test, if the answers are known in advance, the test no longer measures what it is designed to. Rather, it only measures whether the person had access to the key in advance.
This rule isn’t about knowing the scoring key. It’s about whether the designer of the test even knows how it works. Allow us to explain.
Advancements in artificial intelligence (AI) make it possible to predict future actions based on big datasets.
For example, a company could collect data from your social media posts, your reactions to online quizzes, even the time you spend on a web page and which images you hover over. AI could then use the data to make predictions about you.
While there are a range of concerns about this type of data collection, one of the biggest issues is using it to inform workplace decisions. This approach steps outside of the realm of established psychological research. There isn’t a well-founded theory guiding the choice of variables. Rather, there’s only an algorithm that has been tuned to maximize a prediction with whatever information it can find.
Do Curly Fries Make You Smarter?
One fun example of how this approach can go wrong relates to curly fries and IQ. Researchers from the University of Cambridge created an algorithm based on the likes of 60,000 Facebook users. Among other things, it found that Facebook users who like curly fries have higher IQs than those who don’t.
To fuel this prediction, the algorithm measured all the current “likes” provided on Facebook. As the study gained wide press coverage, more people started to “like” curly fries. Maybe they did it to show they were smart, or simply because curly fries are tasty.
But the new pattern of likes can change the predictive value of liking curly fries in the algorithm, to the point that the variable no longer predicts higher IQ. This true story illustrates some of the challenges of using AI in high-stakes employment decisions.
Why did the algorithm find the pattern? Likely not even the researchers can explain it. Perhaps it was a bias in the first sample: if the curly-fry page started around a stand in Harvard Square frequented by students, the association with IQ would fade as more stands opened in other areas.
In truth, we will never know. And that’s the problem with not knowing the basis for why one variable predicts another. Unless there is theory and research to guide a prediction, there is a chance it is a temporary fluke in the data that won’t stand the test of time.
Contrast the curly-fry example with a test of sales potential that includes an Extraversion personality scale as part of its design. Extraversion can be reliably measured and has a long research history. Studies have found that Extraversion correlates with sales performance. Furthermore, it’s possible to show that the scale is valid as one of several parts of a test for hiring salespeople. The designers of the test don’t give you the answer key, but they can certainly explain why it works. Without the distraction of curly fries.
In this algorithm-driven world, it might serve us well to heed some advice from Harry Potter: “Never trust anything that can think for itself if you can't see where it keeps its brain.” The algorithmic equivalent might be “don’t trust decision rules that can’t be explained or just don’t make sense.”
Using personality tests in the workplace is simply not a “good or bad” proposition.
It’s really about learning how to use personality tests in the workplace correctly.
The bleak portrayal of personality measurement in the documentary was not an accurate reflection of the principled application of assessment in the workplace. But it does highlight the fact that personality tests can have a dark side when used incorrectly.
We offer our rules to help employers leverage the right tools to make informed, fair, and data-driven decisions about their people.
Learn more about leadership assessment in DDI's Ultimate Guide to Leadership Assessment.
Doug Reynolds, Ph.D., is DDI’s Executive Vice President. His department includes teams of psychologists and engineers that develop software systems for assessment and learning products for use in large organizations. Doug has published and presented frequently on topics related to the intersection of I-O psychology and technology. He co-edited Next Generation Technology-Enhanced Assessment and the Handbook of Workplace Assessment, and coauthored Online Recruiting and Selection. He also served as SIOP president in 2012-2013.
Georgi Yankov, Ph.D., is a research scientist at DDI’s world headquarters in Pittsburgh. He has extensive expertise in psychometrics, test development, and individual differences. At DDI, he designs and validates assessments for leadership selection and development. Georgi holds an M.Sc. in Industrial-Organizational Psychology from Baruch College (City University of New York) and a Ph.D. in Industrial-Organizational Psychology from Bowling Green State University.