Real World Appeal
Looks improvement · June 20, 2026 · 9 min read

Is Umax accurate? Why the same photo gets a different score

The algorithm tracks the photo, not your face, and the number isn't calibrated to real attraction.

You upload a selfie. Umax thinks for a few seconds, freezes around 91% the way it sometimes does, then hands you a number — say, a 67 out of 100, with a jawline sub-score and a "masculinity" sub-score underneath. You stare at it. Then you do the thing everyone does: you upload the exact same photo again, just to see.

It comes back a 71.

So which one is real?

That's the question that brings most people here, and it deserves a straight answer before any reframe. Short version: neither number is "real," and the gap between them is the most honest thing the app will ever show you. Let's walk through why.

Key numbers

  • Umax has reported 7M+ total downloads and roughly $500,000/month in subscription revenue (per founder Blake Anderson, in Fortune's reporting).
  • The subscription runs about $3.99/week, and the full score is gated behind a paywall after you've already uploaded and scanned.
  • Users repeatedly report the same selfie returning different numbers — one App Store review describes submitting the same picture three times and getting "a different number" almost every time.
  • A first-impression judgment in real life forms in about 100 milliseconds (Willis & Todorov, 2006) — faster than any app finishes loading.
  • A meta-analysis of 919 studies found human attractiveness ratings agree far more than the "beauty is subjective" cliché predicts (Langlois et al., 2000). There's no evidence Umax's number is benchmarked against that agreement.

First, the direct answer: is Umax consistent?

No — and the inconsistency is well documented, not a fluke on your end.

The most common complaint about these apps, across App Store reviews and third-party roundups, is exactly the one you noticed. "Submitted the same picture 3 times, got a different number." Reviewers call results "completely inaccurate," describe the scan as "easy to bug out and crash," and note that the score they got didn't match the glow-up the TikTok ad promised. (We're quoting reviewers here, not asserting it ourselves; your own experience may differ, and a single screenshot of a low score doesn't prove much either way.)

But here's the trap worth naming early: people see "different number every time" and conclude the app just needs to be more consistent. As if a version that returned 67 every single time would finally be trustworthy.

It wouldn't. And understanding why is the whole point.

What the algorithm is actually measuring

Here's the mechanism, plainly.

Umax doesn't have a model of your face. It can't. It has a model that takes the pixels in one image and maps them to a number it learned to associate with "attractive-looking photos" in its training set. Those are not the same thing.

A face is a stable 3D object. A photo is a flat projection of that object under one specific set of conditions — and almost none of those conditions are your face:

  • Light. Window light versus overhead light versus phone flash redraws every shadow on your jaw, your under-eyes, your cheekbones. The "structure" the app scores is mostly shadow.
  • Angle. A camera held at chest height versus eye height versus slightly above changes your apparent jaw, your nose projection, your forehead-to-chin ratio. Same skull, different geometry on the sensor.
  • Crop and distance. Lens distortion at 30cm is brutal and at 80cm is gentle. The app reads the distortion as your face.
  • The model's own randomness. Re-run the same file and users report these systems still wobble, which points to some randomness in how the model or its pipeline produces an output.
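To put a rough number on the crop-and-distance point, here is a minimal sketch using a simple pinhole-camera model, where apparent size scales with one over distance. The function name and the 10 cm nose-to-ear depth are assumptions chosen for illustration, not measurements of any real face or anything Umax computes.

```python
def relative_magnification(camera_to_nose_cm: float,
                           nose_to_ear_depth_cm: float = 10.0) -> float:
    """How much larger the nose tip is drawn relative to the ears.

    Pinhole model: apparent size is proportional to 1 / distance, so the
    ratio of nose size to ear size equals ear distance / nose distance.
    The 10 cm default depth is an illustrative assumption.
    """
    if camera_to_nose_cm <= 0:
        raise ValueError("camera distance must be positive")
    ear_distance_cm = camera_to_nose_cm + nose_to_ear_depth_cm
    return ear_distance_cm / camera_to_nose_cm


for distance_cm in (30, 80):
    ratio = relative_magnification(distance_cm)
    print(f"{distance_cm} cm: nose drawn {100 * (ratio - 1):.1f}% larger than the ears")
# 30 cm: nose drawn 33.3% larger than the ears
# 80 cm: nose drawn 12.5% larger than the ears
```

Same skull, same nose, and the photo's proportions shift by a wide margin just from where the phone was held.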

So when the same photo scores 67 then 71, the app isn't being indecisive about you. It's giving you two readings of an image — and on a fresh upload, even tiny re-compression or a re-roll of internal randomness moves the needle. Change the light or the angle and it swings far more. It tracks the photograph; you care about your face; the app quietly conflates them, and the wobble is the seam showing. (Caveat: not every score change is noise — a genuinely better-lit photo will read better. That's real information, just not about your bone structure.)
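The wobble described above is easy to reproduce with a toy model. The sketch below is not Umax's algorithm (that isn't public); `toy_score` and `recompress` are hypothetical stand-ins that assume the score depends on pixel values plus a little sampling noise, and that each upload nudges the pixels slightly.

```python
import random
import statistics


def toy_score(pixels, rng, noise_sd=2.0):
    """Hypothetical scorer: mean brightness mapped to 0-100, plus sampling noise."""
    base = 100 * statistics.fmean(pixels) / 255
    return max(0.0, min(100.0, base + rng.gauss(0, noise_sd)))


def recompress(pixels, rng, max_shift=2):
    """Crude stand-in for lossy re-encoding: nudge every pixel by a few levels."""
    return [max(0, min(255, p + rng.randint(-max_shift, max_shift))) for p in pixels]


rng = random.Random(0)  # fixed seed so the demo is repeatable
photo = [rng.randint(120, 220) for _ in range(1000)]  # one made-up "selfie"

# "Upload" the same photo five times: each upload is re-compressed and re-scored.
scores = [toy_score(recompress(photo, rng), rng) for _ in range(5)]
print([round(s, 1) for s in scores])
print("spread across uploads:", round(max(scores) - min(scores), 1))
```

Nothing about the underlying "face" changed between runs. Only the image bytes and the model's dice did.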

Consistency is not the same as accuracy

Now the part that actually matters.

Imagine Umax fixed the wobble tomorrow. Same photo, same number, every time, locked. Would the number be true?

No. Because consistency and validity are two different properties, and the app only ever gestures at the first one.

A bathroom scale that always reads 12 pounds heavy is perfectly consistent. It is also wrong, every single time, in the same direction. Consistency just means an instrument repeats itself. Validity means it measures the thing it claims to measure. A score can be rock-steady and completely disconnected from reality.
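The scale example can be made concrete in a few lines. This is a minimal sketch with made-up readings: spread across repeated readings stands in for consistency, and average offset from the true value stands in for validity. The 160-pound true weight and the noise level are assumptions for illustration.

```python
import random
import statistics


def summarize(readings, true_value):
    """Return (spread, bias): spread measures consistency, bias measures validity."""
    spread = statistics.pstdev(readings)
    bias = statistics.fmean(readings) - true_value
    return spread, bias


rng = random.Random(1)
true_weight = 160.0  # assumed true weight in pounds

# Scale A: always reads exactly 12 pounds heavy (consistent, not valid).
biased_scale = [true_weight + 12.0 for _ in range(10)]
# Scale B: centered on the truth but jittery (valid on average, not consistent).
noisy_scale = [true_weight + rng.gauss(0, 3.0) for _ in range(10)]

for name, readings in (("biased", biased_scale), ("noisy", noisy_scale)):
    spread, bias = summarize(readings, true_weight)
    print(f"{name}: spread={spread:.2f} lb, bias={bias:+.2f} lb")
```

The biased scale shows zero spread and a bias of 12 pounds: perfectly repeatable, wrong every time.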

So here's the question nobody asks the app: what was that 67 ever calibrated against?

For the number to mean "67 out of 100 on attractiveness," someone would have had to take real faces, collect real human ratings of them, and tune the model so its output matched what people actually find attractive. There's no evidence these apps do that. The score is the model's internal opinion about an image, dressed up as a measurement. A consistent 67 wouldn't be more accurate. Just a more confident guess.

The inconsistency, ironically, is the more honest signal. It's the system accidentally admitting it doesn't know.
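For a sense of what calibration would involve, here is a rough sketch: collect human ratings for a set of photos and check how well the app's scores track them. Every number below is a made-up placeholder, not data from Umax or from any study, and `pearson` is a plain textbook correlation rather than anyone's published validation method.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (norm_x * norm_y)


# Placeholder values only: hypothetical app scores (0-100) and hypothetical
# mean human ratings (1-10) for the same five photos.
app_scores = [67, 71, 55, 80, 62]
human_ratings = [5.1, 4.8, 5.5, 5.0, 6.2]

r = pearson(app_scores, human_ratings)
print(f"correlation between app score and human ratings: {r:.2f}")
```

A value near 1 would mean the score tracks what people actually report. Without a published check like this, there is no way to know where the app lands.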

What attraction actually keys off — and why a single number can't hold it

Step back to how attraction works in the real world, because that's the thing you actually want to improve.

The judgment is real and it's fast. Willis & Todorov (2006) showed people form impressions of a face (trustworthiness, competence, attractiveness) after about 100 milliseconds of exposure, and longer looks mostly just increase confidence in that snap read. Ambady & Rosenthal (1992) found that thin slices of behavior, brief observations lasting from seconds to a few minutes, predict outcomes startlingly well. First impressions are not noise.

But what's getting judged is not a geometry score. It's a gestalt — and a sizable chunk of it isn't even fixed facial structure:

  • Expression and eyes carry enormous weight. Todorov's work shows tiny shifts in expression move perceived trustworthiness and warmth dramatically — and warmth feeds directly into attraction.
  • The halo effect (Dion, Berscheid & Walster, 1972) means a face read as warm and open gets credited with competence and likability it never had to earn — and the reverse drags an "objectively symmetrical" but cold face down.
  • Context bleeds in. Dutton & Aron (1974) — the shaky-bridge study — showed arousal from the environment gets misattributed as attraction to the person standing there. None of that lives in your selfie.
  • Agreement is real, subjectivity is overstated. Langlois et al.'s 2000 meta-analysis of 919 studies found people agree on attractiveness far more than the "it's all subjective" line suggests — and that agreement is about whole faces in context, not isolated jaw angles. (Caveat: agreement is strong, not total — culture and individual taste still move the edges, which is exactly why one absolute number is the wrong unit.)

A single 0-100 score has to flatten all of that into one number and throw away the part that's actually movable. That's not a precision problem you fix with a better algorithm. It's a category error.

How we approach it differently

We built Real World Appeal because the honest version of this is more useful than the magic-number version — and frankly less harmful. (Psychologists quoted by Fortune and Yahoo Finance have warned that looks-rating apps can feed body dysmorphia in younger users; a number with no context and a paywall behind it is a genuinely risky thing to hand a 15-year-old.)

So we don't do the magic number. Specifically:

  • No PSL-style absolute "out of 100." Perceived attraction isn't linear and it isn't a leaderboard — it's a set of thresholds, and past a band, more "geometry" buys you almost nothing. We dig into why ranking faces on one axis is the wrong model in PAS vs. objective beauty.
  • Feedback grounded in perception research, not bone-geometry mysticism. The report speaks the language of what women actually find attractive — expression, warmth, the first-impression window — the levers that actually move, not the ones you can't change.
  • It's free and there's no paywall after the upload. You see the read before deciding anything.
  • If Umax handed you a number that stung, read this first: the score that gutted you was a reading of one badly-lit photo by a system that gives the same photo a different verdict on the next try. That's covered in Umax score vs. real life and what a low Umax score actually means. It is not a verdict on you.

If you want something concrete to compare, like the facial-tilt geometry these apps lean on, the canthal tilt test runs entirely in your browser — your photo never leaves your device — and we're upfront about how little that one measurement predicts.

So — is Umax accurate?

It's consistent enough to feel scientific and inconsistent enough to expose itself, and it was never calibrated against the thing it claims to score. The same photo gets a different number because the model reads images, not faces, and there's no anchor tying any of those numbers to how real people respond to you in a real room.

Your face doesn't have a score. It has an effect on people — and that effect is faster, warmer, and far more changeable than a frozen decimal can hold.

Take the free test. No paywall after the upload, no leaderboard, no number pretending to be the truth — just a read on what's actually working and what's actually movable.


Studies referenced:

  • Willis, J., & Todorov, A. (2006). First impressions: Making up your mind after a 100-ms exposure to a face. Psychological Science, 17(7), 592-598.
  • Langlois, J. H., Kalakanis, L., Rubenstein, A. J., Larson, A., Hallam, M., & Smoot, M. (2000). Maxims or myths of beauty? A meta-analytic and theoretical review. Psychological Bulletin, 126(3), 390-423.
  • Dion, K., Berscheid, E., & Walster, E. (1972). What is beautiful is good. Journal of Personality and Social Psychology, 24(3), 285-290.
  • Ambady, N., & Rosenthal, R. (1992). Thin slices of expressive behavior as predictors of interpersonal consequences. Psychological Bulletin, 111(2), 256-274.
  • Dutton, D. G., & Aron, A. P. (1974). Some evidence for heightened sexual attraction under conditions of high anxiety. Journal of Personality and Social Psychology, 30(4), 510-517.

Umax figures (downloads, revenue, pricing) as reported by Fortune (July 2024) and Yahoo Finance.
