vivi.Rd
Creates a matrix displaying variable importance on the diagonal and variable interaction on the off-diagonal.
vivi( data, fit, response, gridSize = 50, importanceType = "agnostic", nmax = 500, reorder = TRUE, class = 1, predictFun = NULL, normalized = FALSE, numPerm = 4, showVimpError = FALSE )
Argument | Description |
---|---|
data | Data frame used for fit. |
fit | A supervised machine learning model that condvis2::CVpredict can generate predictions from. |
response | The name of the response for the fit. |
gridSize | The size of the grid for evaluating the predictions. |
importanceType | Selects the importance metric. The default, "agnostic", computes an agnostic importance measure and ignores any embedded importance measures in the fit. If the fit provides an embedded metric, set this argument to the name of that metric to use its values in the vivid-matrix instead. See the examples for illustration. |
nmax | Maximum number of data rows to consider. Default is 500. Use all rows if NULL. |
reorder | If TRUE (default) uses DendSer to reorder the matrix of interactions and variable importances. |
class | Category for classification, a factor level, or a number indicating which factor level. |
predictFun | Function of (fit, data) to extract numeric predictions from fit. Uses condvis2::CVpredict by default, which works for many fit classes. |
normalized | Should Friedman's H-statistic be normalized or not. Default is FALSE. |
numPerm | Number of permutations to perform for agnostic importance. Default is 4. |
showVimpError | Logical. If TRUE, and |
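The predictFun argument is described above as a function of (fit, data) that returns numeric predictions. A minimal sketch of such a function for a linear model follows; the name lm_predict is hypothetical, and the call to vivi is guarded so the snippet still runs when the vivid package is not installed.

```r
# Hypothetical wrapper with the (fit, data) signature that predictFun expects.
# It must return one numeric prediction per row of `data`.
lm_predict <- function(fit, data) {
  as.numeric(predict(fit, newdata = data))
}

aq <- na.omit(airquality)
f <- lm(Ozone ~ ., data = aq)

# Check the wrapper on its own before handing it to vivi().
preds <- lm_predict(f, aq)
length(preds) == nrow(aq)

# Pass the wrapper to vivi() only if the package is available.
if (requireNamespace("vivid", quietly = TRUE)) {
  m <- vivid::vivi(fit = f, data = aq, response = "Ozone",
                   predictFun = lm_predict)
}
```

A custom function like this is mainly useful when the default condvis2::CVpredict does not support the class of the fit.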
A matrix of interaction values, with importance on the diagonal.
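As an illustration of how the returned matrix can be read, the sketch below rebuilds a plain matrix from the agnostic iris output printed in the examples and separates the two kinds of value. It uses base R only and does not reproduce the "vivid" class attribute; the object names are illustrative.

```r
# Values copied from the printed agnostic-importance example for iris.
vars <- c("Petal.Width", "Petal.Length", "Sepal.Length", "Sepal.Width")
m <- matrix(c(0.3243165, 7.7598903, 7.18419678, 5.70678283,
              7.7598903, 0.3098453, 7.32373953, 5.81933853,
              7.1841968, 7.3237395, 0.02232394, 4.50969747,
              5.7067828, 5.8193385, 4.50969747, 0.01147973),
            nrow = 4, byrow = TRUE, dimnames = list(vars, vars))

# Diagonal: one importance value per variable.
importance <- diag(m)

# Off-diagonal: pairwise interaction values (the matrix is symmetric).
interactions <- m
diag(interactions) <- NA

# Locate the strongest interaction and report the pair of variables.
top <- which(interactions == max(interactions, na.rm = TRUE), arr.ind = TRUE)[1, ]
top_pair <- vars[top]
```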
If the argument importanceType = 'agnostic', then an agnostic permutation importance (1) is calculated.
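The general idea of permutation importance can be sketched in base R: shuffle one predictor, re-predict, and record how much the prediction error grows. This is a simplified illustration and not the package's implementation; the function perm_importance, its arguments, and the choice of RMSE as the loss are assumptions made for the example.

```r
# Hypothetical helper: permutation importance as the mean increase in RMSE
# after shuffling each predictor `num_perm` times.
perm_importance <- function(fit, data, response, num_perm = 4,
                            predict_fun = function(fit, data) predict(fit, data)) {
  rmse <- function(d) sqrt(mean((d[[response]] - predict_fun(fit, d))^2))
  baseline <- rmse(data)
  predictors <- setdiff(names(data), response)
  vapply(predictors, function(v) {
    permuted <- replicate(num_perm, {
      d <- data
      d[[v]] <- sample(d[[v]])  # break the link between v and the response
      rmse(d)
    })
    mean(permuted) - baseline
  }, numeric(1))
}

set.seed(1)
aq <- na.omit(airquality)
f <- lm(Ozone ~ ., data = aq)
imp <- perm_importance(f, aq, "Ozone")
sort(imp, decreasing = TRUE)
```

Averaging over several permutations, as the numPerm argument does, reduces the noise introduced by any single shuffle.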
Friedman's H statistic (2) is used for measuring the interactions. This measure is based on partial dependence curves
and relates the interaction strength of a pair of variables to the total effect strength of that variable pair.
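To make the definition concrete, the following base R sketch computes a squared H-statistic for one pair of variables from centred partial dependence values evaluated at the observed data points. It is a brute-force illustration under stated assumptions, not the package's implementation: the helpers pd_at_rows and h_squared are hypothetical, no grid or row subsampling is used, and only the normalized form is shown.

```r
# Hypothetical helper: partial dependence on `vars`, evaluated at each row's
# observed values of those variables and averaged over all other columns.
pd_at_rows <- function(fit, data, vars) {
  vapply(seq_len(nrow(data)), function(i) {
    d <- data
    for (v in vars) d[[v]] <- rep(data[[v]][i], nrow(d))
    mean(predict(fit, d))
  }, numeric(1))
}

# Hypothetical helper: squared H-statistic for the pair (j, k).
h_squared <- function(fit, data, j, k) {
  centre <- function(x) x - mean(x)
  pd_jk <- centre(pd_at_rows(fit, data, c(j, k)))
  pd_j <- centre(pd_at_rows(fit, data, j))
  pd_k <- centre(pd_at_rows(fit, data, k))
  # Share of the joint effect that the two single-variable effects
  # cannot explain additively.
  sum((pd_jk - pd_j - pd_k)^2) / sum(pd_jk^2)
}

aq <- na.omit(airquality)
additive <- lm(Ozone ~ Wind + Temp, data = aq)  # no interaction term
crossed <- lm(Ozone ~ Wind * Temp, data = aq)   # includes Wind:Temp

h_add <- h_squared(additive, aq, "Wind", "Temp")
h_int <- h_squared(crossed, aq, "Wind", "Temp")
```

For the additive model the statistic is zero up to floating-point error, which is consistent with the comment in the examples that a plain linear model shows no interactions.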
1: Fisher A., Rudin C., Dominici F. (2018). All Models are Wrong but many are Useful: Variable Importance for Black-Box, Proprietary, or Misspecified Prediction Models, using Model Class Reliance. arXiv.
2: Friedman, J. H. and Popescu, B. E. (2008). “Predictive learning via rule ensembles.” The Annals of Applied Statistics. JSTOR, 916–54.
    aq <- na.omit(airquality)
    f <- lm(Ozone ~ ., data = aq)
    m <- vivi(fit = f, data = aq, response = "Ozone") # as expected all interactions are zero
    #> Agnostic variable importance method used.
    #> Calculating interactions...
    viviHeatmap(m)

    # Select importance metric
    library(randomForest)
    #> randomForest 4.7-1.1
    #> Type rfNews() to see new features/changes/bug fixes.
    #>
    #> Attaching package: ‘randomForest’
    #> The following object is masked from ‘package:ranger’:
    #>
    #>     importance
    rf1 <- randomForest(Ozone ~ ., data = aq, importance = TRUE)
    m2 <- vivi(fit = rf1, data = aq, response = "Ozone",
               importanceType = "%IncMSE") # select %IncMSE as the importance measure
    #> %IncMSE importance selected.
    #> Calculating interactions...
    viviHeatmap(m2)

    # \donttest{
    library(ranger)
    rf <- ranger(Species ~ ., data = iris, importance = "impurity", probability = TRUE)
    vivi(fit = rf, data = iris, response = "Species") # returns agnostic importance
    #> Agnostic variable importance method used.
    #> Calculating interactions...
    #>              Petal.Width Petal.Length Sepal.Length Sepal.Width
    #> Petal.Width    0.3243165    7.7598903   7.18419678  5.70678283
    #> Petal.Length   7.7598903    0.3098453   7.32373953  5.81933853
    #> Sepal.Length   7.1841968    7.3237395   0.02232394  4.50969747
    #> Sepal.Width    5.7067828    5.8193385   4.50969747  0.01147973
    #> attr(,"class")
    #> [1] "vivid"  "matrix" "array"
    vivi(fit = rf, data = iris, response = "Species",
         importanceType = "impurity") # returns selected 'impurity' importance
    #> Embedded impurity variable importance method used.
    #> Calculating interactions...
    #>              Petal.Width Petal.Length Sepal.Length Sepal.Width
    #> Petal.Width    43.152277     7.983278     7.365552    5.845194
    #> Petal.Length    7.983278    42.290158     7.471649    6.013182
    #> Sepal.Length    7.365552     7.471649     8.967825    4.598689
    #> Sepal.Width     5.845194     6.013182     4.598689    1.181052
    #> attr(,"class")
    #> [1] "vivid"  "matrix" "array"
    # }