Compute the logarithmic loss of a classification model.
mn_log_loss(data, ...)

# S3 method for data.frame
mn_log_loss(data, truth, ..., na_rm = TRUE, sum = FALSE)

mn_log_loss_vec(truth, estimate, na_rm = TRUE, sum = FALSE, ...)
data  A 

...  A set of unquoted column names or one or more

truth  The column identifier for the true class results
(that is a 
na_rm  A 
sum  A 
estimate  If 
A tibble with columns .metric, .estimator, and .estimate and 1 row of values.
For grouped data frames, the number of rows returned will be the same as the number of groups.
For mn_log_loss_vec(), a single numeric value (or NA).
Log loss is a measure of the performance of a classification model. A
perfect model has a log loss of 0.
Compared with accuracy(), log loss takes into account the uncertainty in
the prediction and gives a more detailed view of the actual performance.
For example, given two input probabilities of .6 and .9 where both are
classified as predicting a positive value, say, "Yes", the accuracy metric
would interpret them as having the same value. If the true output is "Yes",
log loss penalizes .6 more heavily because it is "less sure" of its result
than the probability of .9.
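To make the comparison concrete, the base R sketch below computes the per-observation penalty for the two probabilities in the example. It does not call yardstick; the names `p_yes` and `threshold` are illustrative, and the 0.5 classification cutoff is an assumption.

```r
# Predicted probabilities of "Yes" for two observations whose true class is "Yes"
p_yes <- c(0.6, 0.9)

# Assumed 0.5 cutoff: both observations are classified as "Yes",
# so accuracy treats them identically
threshold <- 0.5
predicted_class <- ifelse(p_yes > threshold, "Yes", "No")
accuracy_each <- as.numeric(predicted_class == "Yes")

# Per-observation log loss: negative log of the probability
# assigned to the true class
penalty <- -log(p_yes)

print(accuracy_each)      # 1 1
print(round(penalty, 3))  # 0.511 0.105
```

Both predictions count as correct for accuracy, but the .6 prediction contributes a penalty roughly five times larger than the .9 prediction.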
Log loss has a known multiclass extension, and is simply the sum of the log loss values for each class prediction. Because of this, no averaging types are supported.
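The multiclass calculation can be sketched in base R as follows. `log_loss_sketch()` is a hypothetical helper written for illustration, not the yardstick implementation. It assumes that the columns of the probability matrix follow the factor level order, that `sum = FALSE` averages the per-observation contributions while `sum = TRUE` returns their total, and that probabilities are clipped at machine epsilon to avoid `log(0)`.

```r
# Hypothetical helper for illustration; not the yardstick implementation
log_loss_sketch <- function(truth, probs, sum = FALSE) {
  # truth: factor of true classes
  # probs: numeric matrix, one column per level of truth, in level order
  stopifnot(is.factor(truth), is.matrix(probs), ncol(probs) == nlevels(truth))

  # Pick the probability assigned to each observation's true class
  idx <- cbind(seq_along(truth), as.integer(truth))
  p_true <- probs[idx]

  # Guard against log(0); the clipping value is an assumption
  p_true <- pmax(p_true, .Machine$double.eps)

  contributions <- -log(p_true)
  if (sum) sum(contributions) else mean(contributions)
}

truth <- factor(c("a", "b", "c"), levels = c("a", "b", "c"))
probs <- matrix(
  c(0.7, 0.2, 0.1,
    0.1, 0.8, 0.1,
    0.2, 0.3, 0.5),
  ncol = 3, byrow = TRUE
)

mean_loss  <- log_loss_sketch(truth, probs)
total_loss <- log_loss_sketch(truth, probs, sum = TRUE)
```

Only the probability given to the true class enters each contribution, which is why the same function handles two classes or many without any averaging scheme across classes.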
Other class probability metrics: average_precision(), gain_capture(), pr_auc(), roc_auc(), roc_aunp(), roc_aunu()
#> # A tibble: 1 x 3
#>   .metric     .estimator .estimate
#>   <chr>       <chr>          <dbl>
#> 1 mn_log_loss binary         0.328

# Multiclass
library(dplyr)
data(hpc_cv)

# You can use the col1:colN tidyselect syntax
hpc_cv %>%
  filter(Resample == "Fold01") %>%
  mn_log_loss(obs, VF:L)
#> # A tibble: 1 x 3
#>   .metric     .estimator .estimate
#>   <chr>       <chr>          <dbl>
#> 1 mn_log_loss multiclass     0.734

#> # A tibble: 10 x 4
#>    Resample .metric     .estimator .estimate
#>    <chr>    <chr>       <chr>          <dbl>
#>  1 Fold01   mn_log_loss multiclass     0.734
#>  2 Fold02   mn_log_loss multiclass     0.808
#>  3 Fold03   mn_log_loss multiclass     0.705
#>  4 Fold04   mn_log_loss multiclass     0.747
#>  5 Fold05   mn_log_loss multiclass     0.799
#>  6 Fold06   mn_log_loss multiclass     0.766
#>  7 Fold07   mn_log_loss multiclass     0.927
#>  8 Fold08   mn_log_loss multiclass     0.855
#>  9 Fold09   mn_log_loss multiclass     0.861
#> 10 Fold10   mn_log_loss multiclass     0.821

# Vector version
# Supply a matrix of class probabilities
fold1 <- hpc_cv %>%
  filter(Resample == "Fold01")

mn_log_loss_vec(
  truth = fold1$obs,
  matrix(
    c(fold1$VF, fold1$F, fold1$M, fold1$L),
    ncol = 4
  )
)
#> [1] 0.7338423

# Supply `...` with quasiquotation
prob_cols <- levels(two_class_example$truth)

mn_log_loss(two_class_example, truth, Class1)
#> # A tibble: 1 x 3
#>   .metric     .estimator .estimate
#>   <chr>       <chr>          <dbl>
#> 1 mn_log_loss binary         0.328

mn_log_loss(two_class_example, truth, !! prob_cols[1])
#> # A tibble: 1 x 3
#>   .metric     .estimator .estimate
#>   <chr>       <chr>          <dbl>
#> 1 mn_log_loss binary         0.328