**Fast. Accurate. Easy to use. Stata is a complete, integrated software package that provides all your data science needs—data manipulation, visualization, statistics, and reproducible reporting.**

**Fast, accurate, and easy to use**

With both a point-and-click interface and a powerful, intuitive command syntax, Stata is fast, accurate, and easy to use. All analyses can be reproduced and documented for publication and review. Version control ensures statistical programs will continue to produce the same results no matter when you wrote them.

**Broad suite of statistical features**

Stata puts hundreds of statistical tools at your fingertips:

Basic statistics

summaries • cross-tabulations • correlations • z and t tests • equality-of-variance tests • tests of proportions • confidence intervals • factor variables • more

Linear models

regression • censored outcomes • endogenous regressors • bootstrap, jackknife, and robust and cluster–robust variance • instrumental variables • three-stage least squares • constraints • quantile regression • GLS • more

Panel/longitudinal data

random and fixed effects with robust standard errors • linear mixed models • random-effects probit • GEE • random- and fixed-effects Poisson • dynamic panel-data models • instrumental variables • panel unit-root tests • more

Multilevel mixed-effects models

continuous, binary, count, and survival outcomes • two-, three-, and higher-level models • generalized linear models • nonlinear models • random intercepts • random slopes • crossed random effects • BLUPs of effects and fitted values • hierarchical models • residual error structures • DDF adjustments • support for survey data • more

Binary, count, and limited outcomes

logistic, probit, tobit • Poisson and negative binomial • conditional, multinomial, nested, ordered, rank-ordered, and stereotype logistic • multinomial probit • zero-inflated and left-truncated count models • selection models • marginal effects • more

Extended regression models (ERMs)

combine endogenous covariates, sample selection, and nonrandom treatment in models for continuous, interval-censored, binary, and ordinal outcomes • more

Generalized linear models (GLMs)

ten link functions • user-defined links • seven distributions • ML and IRLS estimation • nine variance estimators • seven residuals • more

Finite mixture models (FMMs)

fmm: prefix for 17 estimators • mixtures of a single estimator • mixtures combining multiple estimators or distributions • continuous, binary, count, ordinal, categorical, censored, truncated, and survival outcomes • more

Spatial autoregressive models

spatial lags of dependent variable, independent variables, and autoregressive errors • fixed and random effects in panel data • endogenous covariates • analyze spillover effects • more

ANOVA/MANOVA

balanced and unbalanced designs • factorial, nested, and mixed designs • repeated measures • marginal means • contrasts • more

Exact statistics

exact logistic and Poisson regression • exact case–control statistics • binomial tests • Fisher’s exact test for r × c tables • more

Linearized DSGE models

specify models algebraically • solve models • estimate parameters • identification diagnostics • policy and transition matrices • IRFs • dynamic forecasts • more

Tests, predictions, and effects

Wald tests • LR tests • linear and nonlinear combinations • predictions and generalized predictions • marginal means • least-squares means • adjusted means • marginal and partial effects • forecast models • Hausman tests • more

Contrasts, pairwise comparisons, and margins

compare means, intercepts, or slopes • compare with reference category, adjacent category, grand mean, etc. • orthogonal polynomials • multiple-comparison adjustments • graph estimated means and contrasts • interaction plots • more

Simple maximum likelihood

specify likelihood using simple expressions • no programming required • survey data • standard, robust, bootstrap, and jackknife SEs • matrix estimators • more

Programmable maximum likelihood

user-specified functions • NR, DFP, BFGS, BHHH • OIM, OPG, robust, bootstrap, and jackknife SEs • Wald tests • survey data • numeric or analytic derivatives • more

Resampling and simulation methods

bootstrap • jackknife • Monte Carlo simulation • permutation tests • more

Time series

ARIMA • ARFIMA • ARCH/GARCH • VAR • VECM • multivariate GARCH • unobserved-components model • dynamic factors • state-space models • Markov-switching models • business calendars • tests for structural breaks • threshold regression • forecasts • impulse–response functions • unit-root tests • filters and smoothers • rolling and recursive estimation • more

Survival analysis

Kaplan–Meier and Nelson–Aalen estimators, • Cox regression (frailty) • parametric models (frailty, random effects) • competing risks • hazards • time-varying covariates • left-, right-, and interval-censoring • Weibull, exponential, and Gompertz models • more

Bayesian analysis

thousands of built-in models • univariate and multivariate models • linear and nonlinear models • multilevel models • continuous, binary, ordinal, and count outcomes • bayes: prefix for 45 estimation commands • continuous univariate, multivariate, and discrete priors • add your own models • convergence diagnostics • posterior summaries • hypothesis testing • model comparison • more

Power and sample size

power • sample size • effect size • minimum detectable effect • means • proportions • variances • correlations • ANOVA • regression • cluster randomized designs • case–control studies • cohort studies • contingency tables • survival analysis • balanced or unbalanced designs • results in tables or graphs • more

Treatment effects/Causal inference

inverse probability weight (IPW) • doubly robust methods • propensity-score matching • regression adjustment • covariate matching • multilevel treatments • endogenous treatments • average treatment effects (ATEs) • ATEs on the treated (ATETs) • potential-outcome means (POMs) • continuous, binary, count, fractional, and survival outcomes • more

SEM (structural equation modeling)

graphical path diagram builder • standardized and unstandardized estimates • modification indices • direct and indirect effects • continuous, binary, count, ordinal, and survival outcomes • multilevel models • random slopes and intercepts • factor scores, empirical Bayes, and other predictions • groups and tests of invariance • goodness of fit • handles MAR data by FIML • correlated data • survey data • more

Latent class analysis

binary, ordinal, continuous, count, categorical, fractional, and survival items • add covariates to model class membership • combine with SEM path models • expected class proportions • goodness of fit • predictions of class membership • more

Multiple imputation

nine univariate imputation methods • multivariate normal imputation • chained equations • explore pattern of missingness • manage imputed datasets • fit model and pool results • transform parameters • joint tests of parameter estimates • predictions • more

Survey methods

multistage designs • bootstrap, BRR, jackknife, linearized, and SDR variance estimation • poststratification • DEFF • predictive margins • means, proportions, ratios, totals • summary tables • almost all estimators supported • more

Cluster analysis

hierarchical clustering • kmeans and kmedian nonhierarchical clustering • dendrograms • stopping rules • user-extensible analyses • more

IRT (item response theory)

binary (1PL, 2PL, 3PL), ordinal, and categorical response models • item characteristic curves • test characteristic curves • item information functions • test information functions • differential item functioning (DIF) • more

Multivariate methods

factor analysis • principal components • discriminant analysis • rotation • multidimensional scaling • Procrustean analysis • correspondence analysis • biplots • dendrograms • user-extensible analyses • more

Data management

data transformations • match-merge • import/export data • ODBC • SQL • Unicode • by-group processing • append files • sort • row–column transposition • labeling • save results • more

Graphics

lines • bars • areas • ranges • contours • confidence intervals • interaction plots • survival plots • publication quality • customize anything • Graph Editor • more

Graphical user interface

menus and dialogs for all features • Data Editor • Variables Manager • Graph Editor • Project Manager • Do-file Editor • Clipboard Preview Tool • multiple preference sets • more

Basic statistics

summaries • cross-tabulations • correlations • z and t tests • equality-of-variance tests • tests of proportions • confidence intervals • factor variables • more

Nonparametric methods

nonparametric regression • Wilcoxon–Mann–Whitney, Wilcoxon signed ranks, and Kruskal–Wallis tests • Spearman and Kendall correlations • Kolmogorov–Smirnov tests • exact binomial CIs • survival data • ROC analysis • smoothing • bootstrapping • more

Epidemiology

standardization of rates • case–control • cohort • matched case–control • Mantel–Haenszel • pharmacokinetics • ROC analysis • ICD-10 • more

GMM and nonlinear regression

generalized method of moments (GMM) • nonlinear regression • more

Other statistical methods

kappa measure of interrater agreement • Cronbach's alpha • stepwise regression • tests of normality • more

Functions

statistical • random-number • mathematical • string • date and time • more

Internet capabilities

ability to install new commands • web updating • web file sharing • latest Stata news • more

Community-contributed commands

community-written commands for meta-analysis, data management, survival, econometrics, more

Programming features

adding new commands • command scripting • object-oriented programming • menu and dialog-box programming • dynamic documents • Markdown • Project Manager • plugins • more

Matrix programming—Mata

interactive sessions • large-scale development projects • optimization • matrix inversions • decompositions • eigenvalues and eigenvectors • LAPACK engine • real and complex numbers • string matrices • interface to Stata datasets and matrices • numerical derivatives • object-oriented programming • more

Documentation

31 manuals • 15,000+ pages • seamless navigation • thousands of worked examples • quick starts • methods and formulas • references • more

**Complete data-management facilities**

Stata's data-management features give you complete control of all types of data.

**Publication-quality graphics**

Stata makes it easy to generate publication-quality, distinctly styled graphs.

You can write scripts to produce hundreds or thousands of graphs in a reproducible manner and export them to EPS or TIF for publication, to PNG or SVG for the web, or to PDF for viewing. With the integrated Graph Editor you click to change anything about your graph or to add titles, notes, lines, arrows, and text.

**Extensible**

Stata is so programmable that developers and users add new features every day to respond to the growing demands of today's researchers.

With Stata's Internet capabilities, new features and official updates can be installed over the Internet with a single click.

**Matrix programming with Mata**

Mata is a full-blown programming language that compiles what you type into bytecode, optimizes it, and executes it fast.

Though you don't need to program to use Stata, it is comforting to know that a fast and complete matrix programming language is an integral part of Stata. Mata is both an interactive environment for manipulating matrices and a full development environment that can produce compiled and optimized code. It includes special features for processing panel data, performs operations on real or complex matrices, provides complete support for object-oriented programming, and is fully integrated with every aspect of Stata.

**Cross-platform compatible**

Stata will run on Windows, Mac, and Linux/Unix computers; however, our licenses are not platform specific.

That means if you have a Mac laptop and a Windows desktop, you don't need two separate licenses to run Stata. You can install your Stata license on any of the supported platforms. Stata datasets, programs, and other data can be shared across platforms without translation. You can also quickly and easily import datasets from other statistical packages, spreadsheets, and databases.

**Trusted**

**We don't just program statistical methods, we validate them.**

The results you see from a Stata estimator rest on comparisons with other estimators, Monte Carlo simulations of consistency and coverage, and extensive testing by our statisticians. Every Stata we ship has passed a certification suite that includes 3.2 million lines of testing code that produces 4.9 million lines of output. We certify every number and piece of text from those 4.9 million lines of output.