Samantar
Methodology
This page explains the principles behind Samantar's education ROI model, the data sources it draws from, and the design decisions that shape it.
Core Approach
Samantar uses an incremental ROI framework rather than a sticker-price or simple break-even model. Instead of asking "how much does a degree cost?", the model asks "what is the financial difference, year by year, between pursuing this degree and not pursuing it?"
Both the degree path and the no-degree path are modeled in parallel across a multi-year horizon. The model accounts for income, education costs, and living expenses for each path independently, then compares the cumulative difference. The year in which the degree path's cumulative net position overtakes that of the no-degree path (the break-even year) is surfaced as one key signal, though not the only one.
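The parallel-path comparison can be sketched in a few lines of Python. This is a minimal illustration, not Samantar's implementation: the function names `cumulative_net` and `break_even_year` and every dollar figure are hypothetical, and the sketch assumes flat annual amounts with no discounting or wage growth.

```python
from itertools import accumulate
from typing import Optional, Sequence


def cumulative_net(income: Sequence[float],
                   education_cost: Sequence[float],
                   living_cost: Sequence[float]) -> list[float]:
    """Running total of (income - education cost - living cost), one entry per year."""
    yearly = [i - e - l for i, e, l in zip(income, education_cost, living_cost)]
    return list(accumulate(yearly))


def break_even_year(degree: Sequence[float],
                    no_degree: Sequence[float]) -> Optional[int]:
    """First 1-based year in which the degree path's cumulative net exceeds
    the no-degree path's, or None if that never happens within the horizon."""
    for year, (d, n) in enumerate(zip(degree, no_degree), start=1):
        if d > n:
            return year
    return None


# Illustrative inputs only: two years of school, then work, over six years.
degree_path = cumulative_net(
    income=[0, 0, 60_000, 60_000, 60_000, 60_000],
    education_cost=[10_000, 10_000, 0, 0, 0, 0],
    living_cost=[20_000] * 6,          # living costs apply to both paths
)
no_degree_path = cumulative_net(
    income=[35_000] * 6,
    education_cost=[0] * 6,
    living_cost=[20_000] * 6,
)

# Year-by-year gap between the two paths (degree minus no-degree).
gap = [d - n for d, n in zip(degree_path, no_degree_path)]
crossover = break_even_year(degree_path, no_degree_path)
```

With these made-up inputs the degree path trails for five years and pulls ahead in year 6. The `gap` list shows how the two paths converge before that point, which is why the break-even year is treated as one signal among several.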
This design reflects Samantar's core belief: the right question is not "is college worth it?" but "is this degree, at this school, for this career, in this market, worth it for you?"
Data Sources
Samantar draws from multiple authoritative public datasets to build each projection. No single source is treated as complete; the model layers the sources and cross-checks them against one another.
- U.S. Census Bureau (ACS): Provides the counterfactual earnings baseline — what someone in the same demographic profile typically earns without the selected degree.
- College Scorecard (U.S. Dept. of Education): School-level cost, completion, retention, and earnings outcomes for degree completers.
- Texas Occupational Employment and Wage Statistics (OEWS): Primary source for occupation-level wage data, with preference for Texas labor markets.
- Texas Workforce Commission: Statewide wage benchmarks used when occupation-level data requires supplementation.
- HUD and TWC living-cost data: Regional housing and living expense defaults.
- IPEDS: Institutional tuition data used as a supplementary cost reference.
Each data source is tagged in the output with its origin, vintage year, and geographic scope so users understand exactly what is driving each number.
How the Model Works
The model evaluates two parallel financial trajectories, a degree path and a no-degree path, and compares them year by year. Key design principles:
- Both paths carry living costs. Housing and living expenses are applied to both trajectories so the comparison focuses purely on what education adds, not on ordinary cost of living.
- Education cost is isolated. Tuition and required fees are accounted for separately from living expenses, using school-specific data where available.
- Debt is modeled from school-specific data and adjusted based on user inputs such as savings and in-school earnings.
- Income assumptions are conservative by default and can be refined by the user to reflect their specific situation.
- The model is multi-year. Rather than reporting only a single break-even year, it shows how the two paths diverge or converge across an extended time horizon.
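The debt principle above can be illustrated with a small sketch. The exact adjustment Samantar applies is not described on this page, so the rule below (subtract user resources from a school-level debt figure and floor the result at zero) is an assumption, as are the function and parameter names.

```python
def estimated_debt(school_typical_debt: float,
                   savings: float = 0.0,
                   in_school_earnings: float = 0.0) -> float:
    """Hypothetical adjustment: offset a school-level debt figure by the
    user's savings and in-school earnings, never going below zero."""
    offset = savings + in_school_earnings
    return max(0.0, float(school_typical_debt) - offset)


# Illustrative figures only.
partial_offset = estimated_debt(24_000, savings=5_000, in_school_earnings=6_000)
fully_covered = estimated_debt(24_000, savings=30_000)
```

In this sketch a student with enough savings to cover the school-level figure is modeled as borrowing nothing, rather than carrying a negative balance.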
Wage Data and Confidence Levels
Occupation wage data varies in availability and specificity by source, year, and geography. Samantar uses a proprietary multi-tier resolution process to find the most relevant and reliable wage figure for each occupation-geography combination.
Texas labor market data is prioritized throughout, reflecting the platform's current focus on Texas schools and career markets. When a highly specific match is not available, the model steps through progressively broader sources rather than returning an error or an unqualified national average.
Every wage figure shown to the user is labeled with its source, vintage, geographic scope, and a confidence indicator. When estimation is involved — for example, deriving an entry wage from a central wage — that step is disclosed in plain language in the method note. The goal is transparency: Samantar does not hide data gaps behind a single opaque number.
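One way to picture a labeled wage figure is as a small record that carries its provenance with it. The sketch below is illustrative: the `WageFigure` fields mirror the labels named above, but the class, the `estimate_entry_wage` helper, the 70% ratio, and the sample values are all assumptions rather than Samantar's actual schema or estimation method.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class WageFigure:
    annual_wage: float
    source: str                 # dataset the number came from
    vintage: int                # data year
    geography: str              # geographic scope
    confidence: str             # label values here are assumed
    method_note: Optional[str] = None


def estimate_entry_wage(central: WageFigure, entry_ratio: float) -> WageFigure:
    """Derive an entry wage from a central wage and record the step in plain
    language. The ratio is a hypothetical input, not a published constant."""
    note = (f"Entry wage estimated as {entry_ratio:.0%} of the "
            f"{central.vintage} central wage from {central.source}.")
    return replace(central,
                   annual_wage=round(central.annual_wage * entry_ratio, 2),
                   method_note=note)


# Made-up example values.
central = WageFigure(62_000.0, "Texas OEWS", 2023, "Texas statewide", "high")
entry = estimate_entry_wage(central, entry_ratio=0.70)
```

Because the note travels with the number, any derived figure shown to a user can disclose how it was produced without a separate lookup.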
Education Cost Definition
`Education Cost` is the direct school cost used in the ROI model over the full schooling period.
It includes annual tuition and, when available, required books and supplies from College Scorecard. It falls back to tuition-only if direct-cost Scorecard fields are unavailable.
It does not include ordinary living expenses such as rent, groceries, insurance, or utilities, because those are modeled separately year by year for both the degree path and the no-degree path.
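The definition above, including the tuition-only fallback, can be sketched as follows. The function and parameter names are hypothetical, and the sketch assumes the annual amounts stay constant across the schooling period.

```python
from typing import Optional


def education_cost(annual_tuition: float,
                   annual_books_supplies: Optional[float],
                   years_in_school: int) -> float:
    """Direct school cost over the full schooling period.

    Falls back to tuition-only when the books-and-supplies field is missing.
    Living expenses are deliberately excluded; they are modeled separately."""
    books = 0.0 if annual_books_supplies is None else annual_books_supplies
    return (annual_tuition + books) * years_in_school


# Illustrative figures only.
with_books = education_cost(11_000, 1_200, years_in_school=4)
tuition_only = education_cost(11_000, None, years_in_school=4)
```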
Why These Choices
- Texas-specific wage data is preferred because the calculator is modeling Texas schools and Texas labor markets.
- Prior-year Texas wages are preferred over data from a broader geography because they preserve local labor-market structure even when a current-year row is missing.
- TWC statewide wages are used before low-confidence proxies because they are still Texas labor-market data, even though they may only publish a mean wage.
- Low-confidence proxies are kept as a last resort so users can still inspect pathways while clearly seeing that the number is directional rather than occupation-specific.
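The preference order described in this list can be sketched as a simple fallback chain. Samantar's actual resolution process is proprietary and not shown here; the tier names, confidence labels, occupation keys, and wage values below are all hypothetical stand-ins.

```python
from typing import Callable, Optional

# Each lookup returns an annual wage, or None when its source has no matching row.
Lookup = Callable[[str], Optional[float]]


def resolve_wage(occupation: str,
                 tiers: list[tuple[str, str, Lookup]]) -> Optional[dict]:
    """Return the first tier that yields a wage, labeled with its tier name
    and confidence, instead of raising an error when a match is missing."""
    for tier_name, confidence, lookup in tiers:
        wage = lookup(occupation)
        if wage is not None:
            return {"wage": wage, "tier": tier_name, "confidence": confidence}
    return None


# Toy in-memory tables standing in for real datasets (values are made up).
current_tx = {"occ_a": 84_000.0}
prior_tx = {"occ_a": 81_000.0, "occ_b": 110_000.0}
twc_statewide = {"occ_c": 79_000.0}


def proxy(occupation: str) -> Optional[float]:
    """Last-resort directional figure, not occupation-specific."""
    return 50_000.0


tiers = [
    ("Texas occupation wage, current year", "high", current_tx.get),
    ("Texas occupation wage, prior year", "medium", prior_tx.get),
    ("TWC statewide wage", "medium", twc_statewide.get),
    ("Low-confidence proxy", "low", proxy),
]

result_b = resolve_wage("occ_b", tiers)   # falls through to the prior-year tier
result_z = resolve_wage("occ_z", tiers)   # only the proxy tier matches
```

Keeping the tiers as an ordered list makes the preference order explicit and lets the label attached to each result tell the user how far down the chain the model had to go.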
Notes
College Scorecard completion and retention metrics are displayed as quality signals for the selected school. ACS currently provides the counterfactual earnings baseline. The methodology can be expanded further with program-level Scorecard earnings and more geography-specific ACS baselines.