These functions extract various elements from a workflow object. If the requested element does not exist yet (for example, the model fit of a workflow that has not been trained), an error is thrown.

  • extract_preprocessor() returns the formula, recipe, or variable expressions used for preprocessing.

  • extract_spec_parsnip() returns the parsnip model specification.

  • extract_fit_parsnip() returns the parsnip model fit object.

  • extract_fit_engine() returns the engine-specific fit embedded within a parsnip model fit. For example, when using parsnip::linear_reg() with the "lm" engine, this returns the underlying lm object.

  • extract_mold() returns the preprocessed "mold" object returned from hardhat::mold(). It contains information about the preprocessing, including either the prepped recipe, the formula terms object, or variable selectors.

  • extract_recipe() returns the recipe. The estimated argument specifies whether the fitted or original recipe is returned.
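
As a minimal sketch of the "error is thrown" behavior described above, the following builds an untrained workflow and tries to extract elements from it. The object names (wf, spec, result, wf_fit, fit_obj) and the use of mtcars are illustrative assumptions, not part of the package.

```r
library(workflows)
library(parsnip)

# An untrained workflow: it has a model spec and a formula, but no fit yet.
wf <- workflow()
wf <- add_model(wf, set_engine(linear_reg(), "lm"))
wf <- add_formula(wf, mpg ~ cyl + disp)

# Elements that already exist can be extracted right away.
spec <- extract_spec_parsnip(wf)

# Elements that require a fit cannot; capture the error instead of stopping.
result <- tryCatch(
  extract_fit_parsnip(wf),
  error = function(e) e
)
inherits(result, "error")

# After fitting, the same call succeeds.
wf_fit <- fit(wf, data = mtcars)
fit_obj <- extract_fit_parsnip(wf_fit)
```

Wrapping the call in tryCatch() is only one way to inspect the failure; in interactive use the error message itself is usually enough to show which step is missing.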

# S3 method for workflow
extract_spec_parsnip(x, ...)

# S3 method for workflow
extract_recipe(x, ..., estimated = TRUE)

# S3 method for workflow
extract_fit_parsnip(x, ...)

# S3 method for workflow
extract_fit_engine(x, ...)

# S3 method for workflow
extract_mold(x, ...)

# S3 method for workflow
extract_preprocessor(x, ...)

Arguments

x

A workflow

...

Not currently used.

estimated

A logical. If TRUE (the default), the fitted recipe is returned; if FALSE, the original (unfit) recipe is returned. This argument should be named.
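
The effect of estimated can be sketched as follows. The helper is_trained() is hypothetical and written only for this illustration; it inspects the trained flag stored on each recipe step, which is an assumption about the internal layout of recipe objects.

```r
library(workflows)
library(parsnip)
library(recipes)

# A recipe with one step, so there is something to estimate.
rec <- step_log(recipe(mpg ~ cyl + disp, data = mtcars), disp)

wf <- workflow()
wf <- add_model(wf, set_engine(linear_reg(), "lm"))
wf <- add_recipe(wf, rec)
wf_fit <- fit(wf, data = mtcars)

# Default: the fitted (trained) recipe.
trained_rec <- extract_recipe(wf_fit)

# Named argument: the original, unfit recipe.
original_rec <- extract_recipe(wf_fit, estimated = FALSE)

# Hypothetical helper: TRUE when every step reports that it has been trained.
is_trained <- function(r) {
  all(vapply(r$steps, function(s) isTRUE(s$trained), logical(1)))
}

is_trained(trained_rec)
is_trained(original_rec)
```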

Value

The extracted value from the object, x, as described in the description section.

Details

Extracting the underlying engine fit can be helpful for describing the model (via print(), summary(), plot(), etc.) or for variable importance/explainers.
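
A short sketch of this kind of inspection, assuming an "lm" engine and the mtcars data (the object names are illustrative):

```r
library(workflows)
library(parsnip)

wf <- workflow()
wf <- add_model(wf, set_engine(linear_reg(), "lm"))
wf <- add_formula(wf, mpg ~ cyl + log(disp))
wf_fit <- fit(wf, data = mtcars)

# The engine fit is the object created by the underlying modeling function.
engine_fit <- extract_fit_engine(wf_fit)
class(engine_fit)

# The usual methods for that class are then available for description.
summary(engine_fit)
coef(engine_fit)
```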

However, users should not invoke the predict() method on an extracted model. There may be preprocessing operations that workflows has executed on the data prior to giving it to the model. Bypassing these can lead to errors or to silently incorrect predictions.

Good:

workflow_fit %>% predict(new_data)

Bad:

workflow_fit %>% extract_fit_engine()  %>% predict(new_data)
# or
workflow_fit %>% extract_fit_parsnip() %>% predict(new_data)
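
To make the risk concrete, here is a minimal sketch of the silent failure mode, assuming a recipe that log-transforms one predictor. The names good and bad are illustrative. Predicting from the extracted lm object skips the recipe, so the model receives disp on its original scale instead of the log scale it was trained on.

```r
library(workflows)
library(parsnip)
library(recipes)

rec <- step_log(recipe(mpg ~ cyl + disp, data = mtcars), disp)

wf <- workflow()
wf <- add_model(wf, set_engine(linear_reg(), "lm"))
wf <- add_recipe(wf, rec)
wf_fit <- fit(wf, data = mtcars)

# Good: the workflow applies the recipe to new data before predicting.
good <- predict(wf_fit, new_data = mtcars)$.pred

# Bad: the raw data goes straight to lm; no error is raised,
# but disp has not been log-transformed.
bad <- unname(predict(extract_fit_engine(wf_fit), newdata = mtcars))

# The two sets of predictions do not agree.
isTRUE(all.equal(good, bad))
```

With a formula preprocessor such as mpg ~ cyl + log(disp), the same shortcut would more likely fail with an error, because the engine fit expects a column that the raw data does not contain.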

Examples

library(workflows)
library(parsnip)
library(recipes)
library(magrittr)

model <- linear_reg() %>%
  set_engine("lm")

recipe <- recipe(mpg ~ cyl + disp, mtcars) %>%
  step_log(disp)

base_wf <- workflow() %>%
  add_model(model)

recipe_wf <- add_recipe(base_wf, recipe)
formula_wf <- add_formula(base_wf, mpg ~ cyl + log(disp))
variable_wf <- add_variables(base_wf, mpg, c(cyl, disp))

fit_recipe_wf <- fit(recipe_wf, mtcars)
fit_formula_wf <- fit(formula_wf, mtcars)

# The preprocessor is a recipe, formula, or a list holding the
# tidyselect expressions identifying the outcomes/predictors
extract_preprocessor(recipe_wf)
#> Recipe
#>
#> Inputs:
#>
#>       role #variables
#>    outcome          1
#>  predictor          2
#>
#> Operations:
#>
#> Log transformation on disp

extract_preprocessor(formula_wf)
#> mpg ~ cyl + log(disp)
#> <environment: 0x563f697d8bf8>

extract_preprocessor(variable_wf)
#> $outcomes
#> <quosure>
#> expr: ^mpg
#> env:  0x563f697d8bf8
#>
#> $predictors
#> <quosure>
#> expr: ^c(cyl, disp)
#> env:  0x563f697d8bf8
#>
#> attr(,"class")
#> [1] "workflow_variables"

# The `spec` is the parsnip spec before it has been fit.
# The `fit` is the fitted parsnip model.
extract_spec_parsnip(fit_formula_wf)
#> Linear Regression Model Specification (regression)
#>
#> Computational engine: lm
#>

extract_fit_parsnip(fit_formula_wf)
#> parsnip model object
#>
#> Fit time:  1ms
#>
#> Call:
#> stats::lm(formula = ..y ~ ., data = data)
#>
#> Coefficients:
#> (Intercept)          cyl  `log(disp)`
#>     67.6674      -0.1755      -8.7971
#>

extract_fit_engine(fit_formula_wf)
#>
#> Call:
#> stats::lm(formula = ..y ~ ., data = data)
#>
#> Coefficients:
#> (Intercept)          cyl  `log(disp)`
#>     67.6674      -0.1755      -8.7971
#>

# The mold is returned from `hardhat::mold()`, and contains the
# predictors, outcomes, and information about the preprocessing
# for use on new data at `predict()` time.
extract_mold(fit_recipe_wf)
#> $predictors
#> # A tibble: 32 × 2
#>      cyl  disp
#>    <dbl> <dbl>
#>  1     6  5.08
#>  2     6  5.08
#>  3     4  4.68
#>  4     6  5.55
#>  5     8  5.89
#>  6     6  5.42
#>  7     8  5.89
#>  8     4  4.99
#>  9     4  4.95
#> 10     6  5.12
#> # … with 22 more rows
#>
#> $outcomes
#> # A tibble: 32 × 1
#>      mpg
#>    <dbl>
#>  1  21
#>  2  21
#>  3  22.8
#>  4  21.4
#>  5  18.7
#>  6  18.1
#>  7  14.3
#>  8  24.4
#>  9  22.8
#> 10  19.2
#> # … with 22 more rows
#>
#> $blueprint
#> Recipe blueprint:
#>
#> # Predictors: 2
#>   # Outcomes: 1
#>    Intercept: FALSE
#> Novel Levels: FALSE
#>  Composition: tibble
#>
#> $extras
#> $extras$roles
#> NULL
#>
#>

# A useful shortcut is to extract the fitted recipe from the workflow
extract_recipe(fit_recipe_wf)
#> Recipe
#>
#> Inputs:
#>
#>       role #variables
#>    outcome          1
#>  predictor          2
#>
#> Training data contained 32 data points and no missing data.
#>
#> Operations:
#>
#> Log transformation on disp [trained]

# That is identical to
identical(
  extract_mold(fit_recipe_wf)$blueprint$recipe,
  extract_recipe(fit_recipe_wf)
)
#> [1] TRUE