Treatment indicator:

Dataset information:

Handling missing values:

If units have missing values for variables in the propensity score model,

Propensity score model:

Variables in the dataset:

Preliminary syntax check:

Variable-name check:

Model-fitting check:

Estimated propensity score distributions


The plots on the next pages depend on the estimated propensity scores. To view the plots without developing a propensity score model, type 1 (the numeral one, without quotes) in the formula box above; a model containing only an intercept will then be fit.

Variables to view and restrict:

View numeric variables as discrete if they have fewer than __ distinct values in the original dataset:

Note that this may take a few minutes for larger datasets.

Preferences for graphs:

Point/histogram opacity ('alpha')
Symbol size for scatterplots

Current sample size

Estimated propensity score distribution (brushable)

Legend for this plot applies to all plots on page.

(If making the plots was slow the first time, expect a delay after clicking either button.)


The thin black lines in the stripcharts indicate the mean; in the scatterplots, the thin black lines are loess curves.

After pruning, the pruning limits you specified for continuous variables will be moved inward to the nearest sample value.
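
As a rough illustration of what moving the limits inward means, the base R sketch below keeps the values inside the requested limits and then reports the smallest and largest kept values as the new limits. The function name `snap_limits_inward`, the inclusive boundaries, and the toy `bmi` values are assumptions made for this example; they are not taken from the Visual Pruner source.

```r
# Hypothetical helper: move user-specified pruning limits inward
# to the nearest values actually present in the sample.
snap_limits_inward <- function(x, lower, upper) {
  # Keep non-missing values inside the requested limits (inclusive; an assumption)
  kept <- x[!is.na(x) & x >= lower & x <= upper]
  if (length(kept) == 0) {
    return(c(lower = NA_real_, upper = NA_real_))
  }
  c(lower = min(kept), upper = max(kept))
}

# Toy data, made up for illustration
bmi <- c(17.2, 19.8, 23.5, 27.1, 31.4, 38.9)
new_limits <- snap_limits_inward(bmi, lower = 18, upper = 35)
new_limits
```

With these toy values, requested limits of 18 and 35 become 19.8 and 31.4, the nearest sample values inside the requested range.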

The upper subplots for each covariate include all points in the (pruned) dataset, even if those points are missing from the subplots immediately below because the propensity score is missing. This can happen if some variables have missing values and only complete cases are used to estimate the propensity score.

Show the following weightings in the SMD plot:

Note that each one may take several minutes.

Note that for larger datasets, the plot may take a few minutes to refresh.


For information about how the absolute standardized mean differences shown in the plot above are calculated, see the documentation for the tableone package.

The dotted vertical line at 0.1 marks a degree of imbalance that many researchers consider to be unacceptable.
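
For orientation, the sketch below computes an unweighted absolute standardized mean difference for a single continuous covariate and two treatment groups, using the common definition (difference in group means divided by the square root of the average of the two group variances). The helper name `abs_smd` and the toy data are assumptions for this example; the tableone documentation remains the reference for the exact calculation, including categorical variables and weighted samples.

```r
# Hypothetical helper: absolute standardized mean difference for one
# continuous covariate and exactly two treatment groups (unweighted).
abs_smd <- function(x, treat) {
  groups <- unique(treat[!is.na(treat)])
  stopifnot(length(groups) == 2)
  x1 <- x[treat == groups[1]]
  x0 <- x[treat == groups[2]]
  # Square root of the average of the two group variances
  pooled_sd <- sqrt((var(x1, na.rm = TRUE) + var(x0, na.rm = TRUE)) / 2)
  abs(mean(x1, na.rm = TRUE) - mean(x0, na.rm = TRUE)) / pooled_sd
}

# Toy data, made up for illustration
age   <- c(50, 55, 60, 65, 40, 45, 50, 55)
treat <- c(1, 1, 1, 1, 0, 0, 0, 0)
smd <- abs_smd(age, treat)
smd > 0.1  # TRUE means the covariate falls to the right of the 0.1 line
```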

Visual Pruner currently displays in the SMD plot only those variables selected for viewing on the 'Prune' page. In general it is important to consider standardized mean differences for squared terms and interactions, as well as for missingness indicators. We hope to add automatic generation of these variables in the future, but in the meantime we recommend adding them to your dataset before importing so that you can select them for viewing.

The following R expression can be copied to select rows to KEEP:

Download inclusion criteria as .txt file

Current propensity score formula:

Download PS formula as .txt file


Visual Pruner uses rms::lrm() to fit the propensity score model, after first imputing missing values with Hmisc::impute() if imputation is selected on the Specify tab. Missingness indicator variables are then created using Hmisc::is.imputed(). See the R tab for more details.
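
The workflow described above can be sketched in base R as follows. This is a stand-in, not the app's code: median imputation is used in place of Hmisc::impute() (whose default fill is the median), a logical flag is used in place of Hmisc::is.imputed(), and glm() with a binomial family is used in place of rms::lrm(). The data frame `d`, its column names, and the simulated values are all hypothetical.

```r
# Simulated toy data (hypothetical names and values)
set.seed(1)
n <- 200
d <- data.frame(age = rnorm(n, 60, 10), female = rbinom(n, 1, 0.5))
d$treat <- rbinom(n, 1, plogis(-3 + 0.05 * d$age))
d$age[sample(n, 20)] <- NA  # introduce some missing covariate values

# Record which values are missing, then fill them with the observed median
# (base R stand-ins for Hmisc::is.imputed() and Hmisc::impute())
d$age_imputed <- is.na(d$age)
d$age[d$age_imputed] <- median(d$age, na.rm = TRUE)

# Logistic regression propensity score model, including the missingness
# indicator (glm() is a base R stand-in for rms::lrm())
fit <- glm(treat ~ age + female + age_imputed, data = d, family = binomial)
d$ps <- fitted(fit)  # estimated propensity scores
summary(d$ps)
```

Including the missingness indicator in the model lets units with imputed values keep an estimated propensity score instead of being dropped as incomplete cases.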

Visual Pruner is a study-design tool for use with observational studies.

Instructions for running locally and additional information can be found at






Lauren R. Samuels and Robert A. Greevy, Jr.

We welcome bug reports, suggestions, and requests.


Visual Pruner is built using the R Shiny framework, with CSS from Bootswatch (slightly modified).
Many thanks to Meira Epplein, Qi Liu, Dale Plummer, Bryan Shepherd, and Matt Shotwell for their valuable suggestions.

You can ignore this tab if you are not interested in the R packages or source code used in making this app.

R session information: