“Size M” is not a measurement. It’s a label that one specific brand applied to one specific garment, based on whatever internal conventions they used when they designed their size chart. It does not mean the same thing across brands, product categories, countries, or even across a single brand’s catalog over time.
This is the size chart normalization problem, and it’s the technical challenge at the center of any multi-brand sizing application. Here’s how to think about it and build around it.
Why size labels aren’t interchangeable
The root cause is that clothing sizes were never standardized in any enforceable way. The US government attempted it in the 1940s and 1950s (a national measurement survey that fed into the 1958 Commercial Standard CS 215-58; ASTM D5585 is a later successor), but the standards were voluntary and compliance was inconsistent. By the 1990s, “vanity sizing” — systematically labeling garments with smaller size numbers than their actual measurements to flatter buyers — had eroded whatever consistency existed.
The result in 2026:
- A “Medium” shirt from Brand A has a chest measurement of 96–101cm
- A “Medium” shirt from Brand B has a chest measurement of 100–106cm
- A “Medium” shirt from Brand C is cut in European sizing and corresponds to chest 98–102cm
- All three are correctly labeled “M” by their own internal logic
For any system recommending sizes across these brands, the only reliable source of truth is body dimensions in millimeters, not size labels.
The normalization architecture
A size chart normalization layer has three components:
- A size chart registry — a database of brand-specific size chart data, keyed by brand + category + size label, with body dimension ranges for each label
- A body measurement source — either user-provided measurements or predicted dimensions from a prediction API
- A matching algorithm — given a user’s body measurements and a brand’s size chart, determine the best-fit label
```python
from dataclasses import dataclass

@dataclass
class SizeEntry:
    label: str
    chest_min_mm: int | None
    chest_max_mm: int | None
    waist_min_mm: int | None
    waist_max_mm: int | None
    hip_min_mm: int | None
    hip_max_mm: int | None

# Example: a minimal size chart for a brand's women's tops
BRAND_CHART: dict[str, list[SizeEntry]] = {
    "brand_a_womens_top": [
        SizeEntry("XS", 800, 855, 610, 660, None, None),
        SizeEntry("S", 856, 900, 661, 710, None, None),
        SizeEntry("M", 901, 960, 711, 760, None, None),
        SizeEntry("L", 961, 1020, 761, 820, None, None),
        SizeEntry("XL", 1021, 1090, 821, 890, None, None),
    ]
}
```
The matching algorithm
Naive matching — “find the size where the user’s measurement falls within the range” — fails in the common case where a user’s chest falls in size M but their waist falls in size L. Real bodies don’t conform to proportional grading.
A better approach is dimension-weighted scoring:
```python
def score_size_entry(
    entry: SizeEntry,
    chest_mm: int | None,
    waist_mm: int | None,
    hip_mm: int | None,
    weights: dict[str, float] | None = None,
) -> float:
    """
    Returns a fit score for a size entry given body dimensions.
    Lower score = better fit. Returns infinity if no dimension
    could be compared against the chart.
    """
    if weights is None:
        weights = {"chest": 0.5, "waist": 0.35, "hip": 0.15}

    total_score = 0.0
    total_weight = 0.0
    pairs = [
        (chest_mm, entry.chest_min_mm, entry.chest_max_mm, weights["chest"]),
        (waist_mm, entry.waist_min_mm, entry.waist_max_mm, weights["waist"]),
        (hip_mm, entry.hip_min_mm, entry.hip_max_mm, weights["hip"]),
    ]
    for value, low, high, weight in pairs:
        if value is None or low is None or high is None:
            continue
        total_weight += weight
        if low <= value <= high:
            # Perfectly within range: score is 0
            total_score += 0.0
        elif value < low:
            # Too small: penalize by how far below
            total_score += weight * (low - value)
        else:
            # Too large: penalize by how far above
            total_score += weight * (value - high)

    if total_weight == 0:
        return float("inf")
    return total_score / total_weight


def recommend_size(
    chart_key: str,
    chest_mm: int | None = None,
    waist_mm: int | None = None,
    hip_mm: int | None = None,
) -> tuple[str, str]:
    """
    Returns (recommended_size_label, confidence_level).
    """
    entries = BRAND_CHART[chart_key]
    scores = [(e, score_size_entry(e, chest_mm, waist_mm, hip_mm)) for e in entries]
    scores.sort(key=lambda x: x[1])

    best_entry, best_score = scores[0]
    second_score = scores[1][1] if len(scores) > 1 else float("inf")

    # Confidence: how much better is the best vs second-best
    margin = second_score - best_score
    if best_score == 0.0:
        confidence = "HIGH"
    elif margin > 20:
        confidence = "MEDIUM"
    else:
        confidence = "LOW"  # User is between sizes

    return best_entry.label, confidence
```
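To see the margin-based confidence in action, here is a condensed, self-contained replay of the scorer against the Brand A chart ranges from above, for a user whose chest lands in L but whose waist lands in M. The `Entry`, `penalty`, and `score` names are shorthand for this sketch (hips dropped), not part of the API above:

```python
from dataclasses import dataclass

@dataclass
class Entry:
    label: str
    chest: tuple[int, int]  # (min_mm, max_mm)
    waist: tuple[int, int]

CHART = [
    Entry("S", (856, 900), (661, 710)),
    Entry("M", (901, 960), (711, 760)),
    Entry("L", (961, 1020), (761, 820)),
]

def penalty(value: int, low: int, high: int) -> int:
    """Millimeters outside the range; 0 when the value is inside."""
    return max(low - value, value - high, 0)

def score(entry: Entry, chest_mm: int, waist_mm: int,
          w_chest: float = 0.5, w_waist: float = 0.35) -> float:
    weighted = (w_chest * penalty(chest_mm, *entry.chest)
                + w_waist * penalty(waist_mm, *entry.waist))
    return weighted / (w_chest + w_waist)

# Chest 970 mm (inside L), waist 740 mm (inside M):
scored = sorted((score(e, 970, 740), e.label) for e in CHART)
best_score, best_label = scored[0]
margin = scored[1][0] - best_score
# best_label == "M", best_score > 0, and the margin to "L" is well under 20,
# so the confidence rule above reports LOW: the user is between sizes.
```

No single label fits both dimensions, so the weighted scorer picks M by a narrow margin — exactly the LOW-confidence case the UI has to handle.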
Where body measurement APIs fit
The matching algorithm above requires body dimensions in millimeters. Getting those dimensions is the other half of the problem.
Users rarely know their chest circumference in millimeters. But they do know their height and weight. A prediction API fills the gap — taking height and weight as input and returning predicted chest, waist, and hip circumferences (among 130+ other dimensions) with associated confidence intervals.
```python
import requests

def get_user_dimensions(
    gender: str,
    height_cm: float,
    weight_kg: float,
    region: str = "GLOBAL",
) -> dict:
    response = requests.post(
        "https://dimensionspot-bodysize-engine.p.rapidapi.com/v1/predict",
        json={
            "input_data": {
                "input_unit_system": "metric",
                "subject": {
                    "gender": gender,
                    "input_origin_region": region
                },
                "anchors": {
                    "body_height": int(height_cm * 10),  # cm → mm
                    "body_mass": weight_kg
                }
            },
            "output_settings": {
                "calculation": {"target_region": region, "body_build_type": "CIVILIAN"},
                "requested_dimensions": {
                    "specific_dimensions": [
                        "chest_circumference",
                        "waist_circumference_natural",
                        "hip_circumference"
                    ]
                },
                "output_format": {"include_range_95": True, "confidence_score_threshold": 60}
            }
        },
        headers={
            "X-RapidAPI-Key": "YOUR_API_KEY",
            "X-RapidAPI-Host": "dimensionspot-bodysize-engine.p.rapidapi.com"
        },
    )
    return response.json()


def extract_dimensions(api_response: dict) -> dict[str, float | None]:
    return {
        dim_id: dim.get("value")
        for dim_id, dim in api_response.get("body_dimensions", {}).items()
    }
```
Handling the between-sizes case
The LOW confidence case — where a user falls between two sizes — is the most important edge case to handle explicitly in your UI.
Options:
- Size up by default — correct for most garments; better to return a garment that’s slightly loose than one that doesn’t close
- Size down by default — correct for compressive garments (athletic wear, shapewear)
- Show both sizes — let the user choose, with an explanation (“Your measurements are between M and L. If you prefer a more fitted look, choose M. If you prefer more room, choose L.”)
- Use the 95% prediction interval — if the user’s predicted chest is 950mm with a 95% range of 900–1000mm, and M covers 901–960mm while L covers 961–1020mm, the interval spans both sizes. Surface this as genuine uncertainty.
The third and fourth options produce the best user experience when implemented cleanly. Most sizing tools default to the first option silently — which is fine but misses an opportunity to build trust by being transparent about uncertainty.
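The fourth option boils down to an interval-overlap check. A minimal sketch, using the Brand A chest ranges from the chart above (treating a touch at a single endpoint as no overlap):

```python
def sizes_spanned(
    interval: tuple[int, int],
    ranges: dict[str, tuple[int, int]],
) -> list[str]:
    """Return every size label whose range genuinely overlaps the
    95% prediction interval (touching at one endpoint doesn't count)."""
    lo, hi = interval
    return [label for label, (rlo, rhi) in ranges.items() if rlo < hi and lo < rhi]

# Brand A chest ranges in mm, from the chart above
chest_ranges_mm = {"S": (856, 900), "M": (901, 960), "L": (961, 1020)}

# Predicted chest 950 mm with a 95% interval of 900-1000 mm:
spanned = sizes_spanned((900, 1000), chest_ranges_mm)
# spanned == ["M", "L"] → surface both sizes to the user
```

When the list has more than one entry, that's the signal to show both sizes with an explanation rather than silently picking one.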
Building the size chart registry
The practical bottleneck in a multi-brand sizing feature is collecting accurate size chart data. Options:
Manual entry: Scrape or manually input size charts from brand websites. This works for a fixed catalog but requires ongoing maintenance as brands update charts seasonally.
Brand integration: Some brands provide size data via API or structured data feeds (especially in fashion B2B ecosystems). This is the gold standard but requires brand partnerships.
User feedback loop: Collect fit feedback from users who purchased based on your recommendation, then use that feedback to adjust your chart data. This is the only way to correct charts that are inaccurate in practice.
Store size charts with version dates — brands change charts without announcement, and version history lets you debug why recommendations were wrong for historical orders.
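One way to sketch that versioning — the `ChartVersion` shape, dates, and revised ranges here are all hypothetical:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class ChartVersion:
    effective_from: date
    sizes: dict[str, tuple[int, int]]  # label → (chest_min_mm, chest_max_mm)

# Hypothetical registry: each entry keeps every chart version ever observed.
REGISTRY: dict[str, list[ChartVersion]] = {
    "brand_a_womens_top": [
        ChartVersion(date(2024, 1, 1), {"M": (901, 960), "L": (961, 1020)}),
        # Silent mid-year revision shifts every range up by 10 mm
        ChartVersion(date(2025, 8, 1), {"M": (911, 970), "L": (971, 1030)}),
    ]
}

def chart_as_of(chart_key: str, when: date) -> ChartVersion:
    """Return the latest version whose effective date is on or before `when`."""
    versions = sorted(REGISTRY[chart_key], key=lambda v: v.effective_from)
    applicable = [v for v in versions if v.effective_from <= when]
    if not applicable:
        raise LookupError(f"No chart version for {chart_key} as of {when}")
    return applicable[-1]
```

Replaying an old order against `chart_as_of(key, order_date)` instead of the current chart is what makes "why did we recommend M in June?" answerable.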
The size label normalization problem doesn’t have a perfect solution because the underlying data — brand size charts — is inherently inconsistent. What you can do is build a system that converts the problem from “matching labels” to “matching dimensions,” which is mathematically tractable, and surface uncertainty honestly when a user falls between sizes.