Data Sources

Reference datasets used to train and validate the DimensionsPot API: ANSUR II, NHANES, CDC Growth Charts, NASA-STD-3001, and the regional calibration methodology.

DimensionsPot is built on publicly archived, peer-reviewed anthropometric datasets. No proprietary data. No scraped measurements. Every training and validation dataset is citable.


Adult engine — ANSUR II (2012)

Primary training dataset.

The 2012 Anthropometric Survey of U.S. Army Personnel (ANSUR II) is the most comprehensive modern anthropometric survey of the adult human body available in the public domain. 6,068 US soldiers (4,082 male, 1,986 female) were measured across 93 body dimensions under standardised, controlled conditions at 108 military installations. Measurements were taken by certified anthropometric technicians using Martin & Saller landmarks, the same landmark protocol referenced by ISO 7250-1.

Gordon, C.C., et al. (2014). 2012 Anthropometric Survey of U.S. Army Personnel: Methods and Summary Statistics. Natick Technical Report TR-15/007. US Army Research Laboratory / Natick Soldier Research, Development & Engineering Center.

ANSUR II is publicly available via the OpenData portal of the US Army Research Laboratory.

Why ANSUR II: Scale, measurement precision, and dimensional completeness. No other publicly available dataset provides this combination of sample size, dimensional breadth, and measurement rigor for adult human body anthropometry.


Civilian validation — NHANES (2001–2018)

Independent civilian validation set.

The National Health and Nutrition Examination Survey (NHANES) is a continuous cross-sectional survey of the non-institutionalized US civilian population, conducted by the CDC/NCHS in mobile examination centers. The Body Measures (BMX) component covers height and weight plus waist, hip, arm, and thigh circumferences for approximately 10,000 subjects per 2-year cycle, with stratified sampling to represent the full US BMI distribution.

NHANES is used as the out-of-distribution validation set: ANSUR II skews toward a lean, athletic population; NHANES covers the full civilian BMI range, including obese and morbidly obese subjects. Passing both validates generalization beyond the training distribution.
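
One way to check generalization across the BMI range is to report prediction error separately for each BMI band. The sketch below is illustrative only: the record layout, the function names, and the choice of mean absolute error are assumptions, not the validation procedure actually used by the API. The band cut-offs are the standard WHO adult BMI thresholds.

```python
from collections import defaultdict

def bmi_band(height_cm, weight_kg):
    """Classify a subject using the standard WHO adult BMI cut-offs."""
    bmi = weight_kg / (height_cm / 100.0) ** 2
    if bmi < 18.5:
        return "underweight"
    if bmi < 25.0:
        return "normal"
    if bmi < 30.0:
        return "overweight"
    if bmi < 40.0:
        return "obese"
    return "morbidly_obese"

def mae_by_bmi_band(records):
    """Mean absolute error per BMI band.

    Each record is a hypothetical tuple:
    (height_cm, weight_kg, measured_value, predicted_value).
    """
    errors = defaultdict(list)
    for height_cm, weight_kg, measured, predicted in records:
        errors[bmi_band(height_cm, weight_kg)].append(abs(predicted - measured))
    return {band: sum(v) / len(v) for band, v in errors.items()}

# Made-up sample rows, not NHANES data.
sample = [
    (170.0, 60.0, 80.0, 82.0),     # BMI about 20.8
    (170.0, 62.0, 78.0, 77.0),     # BMI about 21.5
    (170.0, 120.0, 125.0, 121.0),  # BMI about 41.5
]
report = mae_by_bmi_band(sample)
```

A per-band report like this makes it visible if error grows in the obese and morbidly obese bands, which a single pooled error figure could hide.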

National Health and Nutrition Examination Survey. CDC National Center for Health Statistics. Public-use data files available at cdc.gov/nchs/nhanes.


Pediatric engine — CDC Growth Charts + WHO MGRS

Pediatric engine training parameters.

The LMS (Lambda-Mu-Sigma) parameters used by the pediatric engine are sourced directly from two published tables:

  • CDC 2000 Growth Charts: 218 age-sex-specific LMS parameter sets covering height, weight, head circumference, and BMI from birth to age 20, in monthly increments. The US clinical standard for pediatric anthropometric assessment.
  • WHO Multicentre Growth Reference Study (MGRS) 2006: LMS parameters for the 0–5 year age range, based on data from 6 countries (Brazil, Ghana, India, Norway, Oman, United States), providing the international normative reference for infant and toddler growth.

Kuczmarski, R.J., et al. (2000). CDC Growth Charts: United States. Advance Data, No. 314. National Center for Health Statistics.

WHO Multicentre Growth Reference Study Group (2006). WHO Child Growth Standards based on length/height, weight and age. Acta Paediatrica Supplement 450, 76–85.

The API stores 1,555 age-sex-specific LMS parameter points, interpolated at monthly intervals.
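
For readers unfamiliar with the LMS method, the sketch below shows how a measurement is converted to a z-score from a set of L, M, S parameters, and how parameters for an age between two tabulated rows could be obtained. The z-score formula is the standard LMS definition. The linear interpolation, the function names, and the sample rows are assumptions for illustration, not the API's internals or real table values.

```python
import math

def lms_zscore(x, L, M, S):
    """Z-score of measurement x under the LMS model.

    L is the Box-Cox power, M the median, S the coefficient of variation.
    """
    if abs(L) < 1e-12:
        # Limiting case as L approaches 0: the transform becomes a logarithm.
        return math.log(x / M) / S
    return ((x / M) ** L - 1.0) / (L * S)

def interpolate_lms(age_months, lower_row, upper_row):
    """Linearly interpolate (L, M, S) between two rows of (age, L, M, S)."""
    a0, *p0 = lower_row
    a1, *p1 = upper_row
    t = (age_months - a0) / (a1 - a0)
    return tuple(v0 + t * (v1 - v0) for v0, v1 in zip(p0, p1))

# Illustrative rows only, not taken from the CDC or WHO tables.
row_24 = (24.0, -0.2, 86.0, 0.04)
row_25 = (25.0, -0.1, 87.0, 0.04)

L, M, S = interpolate_lms(24.5, row_24, row_25)
z = lms_zscore(88.0, L, M, S)  # positive, since 88.0 is above the median
```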


Biological limits — NASA-STD-3001

Hard output bounds reference.

Every prediction passes through a biological limits gate before being returned. Adult limits are sourced from NASA-STD-3001, Volume 2 (Human Factors, Habitability, and Environmental Health), which defines P1–P99 ranges across combined ANSUR II, DLR, and international anthropometric data. This is the most rigorously maintained human factors engineering reference available.

Pediatric limits are derived from the same CDC 2000 LMS tables used for prediction, using the inverse Box-Cox transform to generate age-sex-specific P1–P99 bounds.
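
The inverse transform can be sketched as follows. Given LMS parameters for one age and sex, the measurement at a chosen z-score is M(1 + L·S·z)^(1/L), or M·exp(S·z) when L is 0; P1 and P99 correspond to z ≈ ±2.326. The function names and sample parameters are hypothetical and are not the API's actual code or table values.

```python
import math
from statistics import NormalDist

def lms_value_at_z(z, L, M, S):
    """Inverse LMS (Box-Cox) transform: the measurement at z-score z.

    Assumes 1 + L*S*z > 0, which holds for typical growth-chart parameters
    at moderate z-scores.
    """
    if abs(L) < 1e-12:
        return M * math.exp(S * z)
    return M * (1.0 + L * S * z) ** (1.0 / L)

def p1_p99_bounds(L, M, S):
    """Return the (P1, P99) bounds for one set of LMS parameters."""
    z99 = NormalDist().inv_cdf(0.99)  # about 2.326
    return lms_value_at_z(-z99, L, M, S), lms_value_at_z(z99, L, M, S)

# Illustrative parameters only, not taken from the CDC tables.
lower, upper = p1_p99_bounds(L=1.0, M=100.0, S=0.05)
```

With L = 1 the transform is linear, so the example bounds sit symmetrically around the median; for other values of L the bounds are skewed.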

Predictions outside these bounds are clamped to the boundary value and flagged with biological_limit_status: "OUT_OF_BOUNDS" in the response.
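
The clamping behaviour described above can be illustrated with a short sketch. Only the field name biological_limit_status and the value "OUT_OF_BOUNDS" come from this page; the function name, the "value" key, and the "WITHIN_BOUNDS" label for the in-range case are placeholders chosen for the example.

```python
def apply_biological_limits(value, lower, upper):
    """Clamp a prediction to [lower, upper] and flag it if it was clamped."""
    clamped = min(max(value, lower), upper)
    # "WITHIN_BOUNDS" is a hypothetical label; only "OUT_OF_BOUNDS" is documented.
    status = "OUT_OF_BOUNDS" if clamped != value else "WITHIN_BOUNDS"
    return {"value": clamped, "biological_limit_status": status}

in_range = apply_biological_limits(95.0, lower=88.4, upper=111.6)
too_high = apply_biological_limits(130.0, lower=88.4, upper=111.6)
```

Here too_high is returned with its value set to the upper bound and the out-of-bounds flag, while in_range passes through unchanged.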


Regional calibration

Regional calibration parameters are derived from peer-reviewed anthropometric surveys and published academic literature on population-level body proportion variation. Studies covering European, East Asian, South Asian, Latin American, Middle Eastern, and Sub-Saharan African populations were used to derive population-mean offsets and proportion coefficients applied by the Universal Translator.

Regional coverage notes (from the API response meta_warnings):

Region         Coverage status
GLOBAL         Full (both genders, ANSUR II baseline)
EUROPE         Full (both genders)
ASIA_PACIFIC   Full (both genders)
LATAM          Full (both genders)
INDIA          Female falls back to ASIA_PACIFIC
AFRICA         Male-only validated; SD proxied from global with −10 confidence penalty
MIDDLE_EAST    Validated for males 18–30; female uses global baseline

Regional limitations are always reflected in the API’s confidence scores: reduced-coverage regions produce lower confidence scores and meta_warnings entries.
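
The fallback and penalty rules in the coverage notes can be pictured as a small lookup. This is a sketch under stated assumptions, not the API's implementation: the function name, the gender strings, the 0–100 confidence scale, the warning wording, and applying the AFRICA penalty to every request for that region are all guesses made for illustration. Only the region codes, the fallback targets, and the −10 penalty come from the notes above.

```python
# Fallbacks stated in the coverage notes: (region, gender) -> region actually used.
FALLBACKS = {
    ("INDIA", "female"): "ASIA_PACIFIC",
    ("MIDDLE_EAST", "female"): "GLOBAL",
}

# Confidence penalties stated in the coverage notes.
CONFIDENCE_PENALTY = {"AFRICA": 10}

def resolve_region(region, gender, base_confidence):
    """Return the effective region, adjusted confidence, and warnings."""
    effective = FALLBACKS.get((region, gender), region)
    warnings = []
    if effective != region:
        warnings.append(f"{region} {gender} falls back to {effective}")
    penalty = CONFIDENCE_PENALTY.get(region, 0)
    if penalty:
        warnings.append(f"{region} SD proxied from global; confidence reduced by {penalty}")
    return {
        "effective_region": effective,
        "confidence": max(0, base_confidence - penalty),  # assumes a 0-100 scale
        "meta_warnings": warnings,
    }

india = resolve_region("INDIA", "female", base_confidence=90)
africa = resolve_region("AFRICA", "male", base_confidence=90)
europe = resolve_region("EUROPE", "female", base_confidence=90)
```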


Training vs. serving

The API does not store, cache, or serve any content from its training datasets. Input parameters (height, weight, gender) are processed in-memory and discarded after the response. The model’s outputs are statistical estimates derived from trained coefficients — they are not retrieved records, look-ups, or reproductions of training data.

Height and weight are numeric facts, not biometric data under GDPR Recital 26.