Scientific Analysis

Why Most Personality Tests Fall Short: A Scientific Critique

An in-depth look at the fundamental flaws in popular personality assessments and why the scientific community has largely moved beyond them.

Dr. Sarah Mitchell, PhD
November 12, 2025

Every year, millions of people take personality tests hoping to gain insight into who they are. From corporate team-building exercises to dating app profiles, these assessments have become ubiquitous. But how many of these tests actually measure what they claim to measure? The answer, according to decades of psychological research, is surprisingly few.

The Myers-Briggs Problem

The Myers-Briggs Type Indicator (MBTI) is perhaps the most famous personality test in the world. Over 2 million people take it annually, and it generates an estimated $20 million per year for the company that owns it. Fortune 500 companies use it for hiring decisions. People identify strongly with their four-letter types.

Yet the scientific community has been skeptical of the MBTI for decades. Here's why:

1. The Reliability Problem

A reliable test should give you the same result when you take it multiple times. But research shows that 39-76% of people get a different MBTI type when they retake the test after just five weeks. This is a fundamental problem—if your personality type can change in a month, what exactly is the test measuring?

A 1991 study by the National Research Council concluded that "there is not sufficient, well-designed research to justify the use of MBTI in career counseling programs." More recent meta-analyses have confirmed these concerns.
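To make the reliability idea concrete, here is a minimal sketch of two ways to quantify test-retest consistency: the correlation between scores from two sittings, and the share of people who keep the same category label. The scores, the 50-point cutoff, and the function names are hypothetical illustrations, not data or methods from any published study.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def to_type(score, cutoff=50):
    """Force a continuous score into a binary label (assumed cutoff)."""
    return "E" if score >= cutoff else "I"

def type_agreement(first, second):
    """Fraction of people who receive the same label at both sittings."""
    same = sum(to_type(a) == to_type(b) for a, b in zip(first, second))
    return same / len(first)

# Hypothetical extraversion scores (0-100) for six people, five weeks apart.
time1 = [48, 52, 30, 75, 51, 49]
time2 = [53, 47, 28, 78, 55, 46]

r = pearson_r(time1, time2)
agreement = type_agreement(time1, time2)
print(f"score correlation: {r:.2f}, same label: {agreement:.0%}")
```

In this made-up sample the underlying scores barely move, yet a third of the people change label because they sit close to the cutoff. That is one way a categorical report can look unstable even when the measured trait is not.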

Figure: Test-Retest Reliability Comparison (how consistent are results when the same person retakes the test?)

  • MBTI: 39-76%
  • CliftonStrengths: 65-75%
  • Big Five: 75-85%
  • PRISM-7: 85-92%

A dashed line in the chart marks the 80% reliability threshold (scientific standard).

2. The False Dichotomy Problem

The MBTI forces people into one of two categories for each dimension: you're either an Introvert OR an Extravert, a Thinker OR a Feeler. But personality traits don't work this way.

Decades of research show that personality traits follow a normal distribution—most people fall somewhere in the middle, with extreme scores being relatively rare. By forcing a binary classification, the MBTI treats someone who scores 51% toward extraversion identically to someone who scores 99%, while treating them as completely different from someone who scores 49%.

Imagine if we measured height this way: everyone over 5'7" is "tall" and everyone under is "short." A person who is 5'7.5" would be classified as fundamentally different from someone who is 5'6.5", despite being nearly identical.
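The cost of a hard cutoff can be sketched numerically. The snippet below assumes a standardized, normally distributed trait (mean 0, standard deviation 1) with the cutoff at the mean; both the labels and the quarter-standard-deviation band are illustrative choices rather than anything defined by the MBTI itself.

```python
from statistics import NormalDist

# Assumption: the trait is standardized and normally distributed.
trait = NormalDist(mu=0.0, sigma=1.0)

def binary_label(z):
    """Hypothetical binary scoring with the cutoff at the mean."""
    return "Extravert" if z >= 0 else "Introvert"

# Share of the population within a quarter of a standard deviation of the cutoff.
near_cutoff = trait.cdf(0.25) - trait.cdf(-0.25)

# Two nearly identical people land in different categories...
almost_same = (binary_label(0.05), binary_label(-0.05))
# ...while two very different people share one.
far_apart = (binary_label(0.05), binary_label(2.5))

print(f"within 0.25 SD of the cutoff: {near_cutoff:.1%}")
print(almost_same, far_apart)
```

Under these assumptions roughly one person in five sits within a quarter of a standard deviation of the dividing line, so small measurement noise is enough to flip their label.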

Figure: Why Continuous Dimensions Matter. Personality traits are distributed normally, not in binary categories; the chart labels the two halves of the curve "Introvert" and "Extravert".

The MBTI Problem: False Dichotomies

MBTI forces an arbitrary cutoff on that distribution: someone scoring 51% on extraversion is labeled "Extravert" while someone at 49% is labeled "Introvert," despite being nearly identical.

The PRISM-7 Solution: Dimensional Scores

We report your actual percentile position on the distribution. Someone at the 51st percentile is reported as "51st percentile" (slightly above average), not forced into a binary category.

  • MBTI says: "You are an Extravert" (or Introvert, nothing in between)
  • PRISM-7 says: "65th percentile" (with 90% CI: 55-75)
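One common way to attach an interval to a percentile score uses the standard error of measurement from classical test theory, SEM = SD × sqrt(1 − reliability). The sketch below is written under that assumption and is not a description of how PRISM-7 actually computes its intervals; the function name and the example inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, SD 1

def percentile_with_ci(z_score, reliability, level=0.90):
    """Return (percentile, lower, upper) for a standardized trait score.

    Assumes classical test theory: SEM = SD * sqrt(1 - reliability), SD = 1.
    """
    sem = sqrt(1.0 - reliability)
    critical = STD_NORMAL.inv_cdf(0.5 + level / 2.0)  # about 1.645 for 90%
    low_z = z_score - critical * sem
    high_z = z_score + critical * sem
    to_percentile = lambda z: 100.0 * STD_NORMAL.cdf(z)
    return to_percentile(z_score), to_percentile(low_z), to_percentile(high_z)

# Hypothetical inputs: a score 0.385 SD above the mean, reliability of 0.90.
pct, lower, upper = percentile_with_ci(0.385, reliability=0.90)
print(f"{pct:.0f}th percentile (90% CI: {lower:.0f}-{upper:.0f})")
```

The width of the interval depends directly on the reliability figure: a less reliable test yields a wider band around the same score, which is why an interval tells the reader more than a bare label.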

3. The Barnum Effect

Named after the showman P.T. Barnum (who allegedly said "we have something for everyone"), this psychological phenomenon describes our tendency to accept vague, general personality descriptions as uniquely applicable to ourselves.

MBTI type descriptions are often so broad that most people would agree with them regardless of their actual type. Statements like "you value deep connections with others" or "you prefer to think things through before acting" apply to virtually everyone in some contexts.

The CliftonStrengths Concerns

CliftonStrengths (formerly StrengthsFinder) has gained massive popularity in corporate settings. While it avoids some of MBTI's problems, it has its own issues:

  • Proprietary black box: The scoring algorithm is not publicly available, making independent verification impossible.
  • Limited independent research: Most published studies come from Gallup itself, creating potential conflicts of interest.
  • Strengths-only focus: By ignoring areas for development, it provides an incomplete picture that may not serve users well.
  • No confidence metrics: Results are presented as definitive, without acknowledging measurement uncertainty.

The Comparison
Why science matters in personality assessment

Feature                     | MBTI                         | PRISM-7
Measurement Approach        | 16 rigid categories          | 7 continuous dimensions
Test-Retest Reliability     | 39-76%                       | 85-92%
Confidence Intervals        | Not provided                 | 90% CI included
Scientific Validation       | Limited independent research | Peer-reviewed studies
Number of Questions         | 93 questions                 | 35 or 125
Barnum Effect Mitigation    | Vague descriptions           | Specific percentiles
Honesty-Humility Dimension  | Not measured                 | Included

What Does Good Personality Science Look Like?

The scientific consensus has largely coalesced around dimensional models of personality, particularly the Big Five (also known as the Five-Factor Model) and its extension, the HEXACO model.

These models share several key characteristics that make them scientifically robust:

Dimensional Rather Than Categorical

Instead of putting you in a box, dimensional models measure where you fall on a continuous scale. You might score in the 72nd percentile for Extraversion—more extraverted than most people, but not extremely so. This approach reflects how personality actually works.

High Reliability

Well-constructed Big Five and HEXACO assessments typically show test-retest reliability of 85-92%—far higher than the MBTI. This means your results will be consistent over time, reflecting actual trait stability rather than measurement error.

Predictive Validity

These models have been shown to predict important life outcomes including job performance, relationship satisfaction, academic achievement, and even health outcomes. The predictions aren't perfect—personality is just one factor among many—but they're statistically significant and practically meaningful.

Cross-Cultural Validation

The Big Five and HEXACO structures have been replicated across dozens of cultures and languages, suggesting they capture something fundamental about human personality rather than being artifacts of Western psychology.

The HEXACO Advantage

The HEXACO model, developed by researchers Kibeom Lee and Michael Ashton, extends the Big Five by adding a sixth dimension: Honesty-Humility. This dimension captures tendencies toward sincerity, fairness, modesty, and greed-avoidance.

Research has shown that Honesty-Humility predicts important outcomes that the Big Five misses, including:

  • Workplace counterproductive behavior and theft
  • Ethical decision-making in business contexts
  • Relationship fidelity and commitment
  • Susceptibility to manipulation and exploitation

The addition of this dimension makes HEXACO particularly valuable for contexts where integrity and ethical behavior matter—which is to say, most contexts.

What Should You Look For?

If you're considering taking a personality assessment—whether for personal insight, career development, or team building—here are the key features to look for:

  1. Dimensional scores: Avoid tests that put you into discrete categories. Look for percentile rankings or continuous scales.
  2. Confidence intervals: Good assessments acknowledge measurement uncertainty. If a test claims perfect precision, be skeptical.
  3. Published reliability data: The test should have documented test-retest reliability of at least 80% (usually reported as a coefficient of 0.80 or higher).
  4. Independent validation: Look for peer-reviewed research from sources other than the test publisher.
  5. Transparent methodology: You should be able to understand how your scores are calculated.
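The five criteria above can be turned into a simple screening checklist. This is only an illustrative sketch: the `Assessment` fields, the 0.80 default threshold, and the example test are hypothetical, and filling them in honestly still requires reading a test's published documentation.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Assessment:
    name: str
    dimensional_scores: bool
    confidence_intervals: bool
    test_retest_reliability: Optional[float]  # proportion such as 0.85; None if unpublished
    independent_validation: bool
    transparent_methodology: bool

def unmet_criteria(a: Assessment, min_reliability: float = 0.80) -> List[str]:
    """List which of the five checklist items an assessment fails."""
    reliability_ok = (
        a.test_retest_reliability is not None
        and a.test_retest_reliability >= min_reliability
    )
    checks = {
        "dimensional scores": a.dimensional_scores,
        "confidence intervals": a.confidence_intervals,
        "published reliability at or above threshold": reliability_ok,
        "independent validation": a.independent_validation,
        "transparent methodology": a.transparent_methodology,
    }
    return [item for item, passed in checks.items() if not passed]

# A made-up test used purely for illustration.
example = Assessment("Hypothetical Test A", False, False, 0.70, True, False)
problems = unmet_criteria(example)
print(problems)
```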

The Bottom Line

Personality assessment can be a valuable tool for self-understanding and development. But not all assessments are created equal. The most popular tests often lack the scientific rigor needed to provide meaningful, accurate results.

By understanding the limitations of common assessments and the characteristics of scientifically valid alternatives, you can make more informed choices about which tools to trust with something as important as understanding yourself.

The goal isn't to dismiss personality assessment entirely—it's to demand better. When we settle for pseudoscience dressed up as insight, we miss the opportunity for genuine self-discovery that rigorous personality science can provide.

Further Reading

  • Ashton, M. C., & Lee, K. (2007). Empirical, theoretical, and practical advantages of the HEXACO model of personality structure. Personality and Social Psychology Review.
  • Pittenger, D. J. (2005). Cautionary comments regarding the Myers-Briggs Type Indicator. Consulting Psychology Journal: Practice and Research.
  • McCrae, R. R., & Costa, P. T. (1997). Personality trait structure as a human universal. American Psychologist.

Experience Scientifically-Validated Assessment

PRISM-7 is built on the HEXACO+ model with dimensional scoring, confidence intervals, and transparent methodology.
