Matthews correlation coefficient

mcc(data, ...)
# S3 method for data.frame
mcc(data, truth, estimate, na_rm = TRUE, ...)
mcc_vec(truth, estimate, na_rm = TRUE, ...)

## Arguments

| Argument | Description |
|---|---|
| `data` | Either a `data.frame` containing the `truth` and `estimate` columns, or a `table`/`matrix` where the true class results should be in the columns of the table. |
| `...` | Not currently used. |
| `truth` | The column identifier for the true class results (that is a `factor`). This should be an unquoted column name, although this argument is passed by expression and supports quasiquotation (you can unquote column names). For `_vec()` functions, a `factor` vector. |
| `estimate` | The column identifier for the predicted class results (that is also a `factor`). As with `truth`, this can be specified in different ways, but the primary method is to use an unquoted variable name. For `_vec()` functions, a `factor` vector. |
| `na_rm` | A `logical` value indicating whether `NA` values should be stripped before the computation proceeds. |

## Value

A `tibble` with columns `.metric`, `.estimator`, and `.estimate` and 1 row of values.

For grouped data frames, the number of rows returned will be the same as
the number of groups.

For `mcc_vec()`, a single `numeric` value (or `NA`).
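
As a small illustration of the vector interface, the sketch below calls `mcc_vec()` on two hand-made factors. The data are invented for this example, and it assumes the `yardstick` package is installed. With 2 true positives, 2 true negatives, 1 false positive and 1 false negative, the usual binary formula `(TP*TN - FP*FN) / sqrt((TP+FP)*(TP+FN)*(TN+FP)*(TN+FN))` works out by hand to 3/9.

```r
library(yardstick)

# Hypothetical toy data: both factors share the same levels, "yes" first.
lvls <- c("yes", "no")
truth    <- factor(c("yes", "yes", "yes", "no", "no", "no"),  levels = lvls)
estimate <- factor(c("yes", "yes", "no",  "no", "no", "yes"), levels = lvls)

# Returns a single numeric value rather than a tibble.
toy_mcc <- mcc_vec(truth, estimate)
toy_mcc
```

The data frame method returns the same number in the `.estimate` column of a one-row tibble.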

## Relevant Level

There is no common convention on which factor level should
automatically be considered the "event" or "positive" result.
In `yardstick`, the default is to use the *first* level. This behavior is
controlled by a global option, `yardstick.event_first`, which is set to
`TRUE` when the package is loaded. Change it to `FALSE` if the *last*
level of the factor is the level of interest by running
`options(yardstick.event_first = FALSE)`.
For multiclass extensions involving one-vs-all
comparisons (such as macro averaging), this option is ignored and
the "one" level is always the relevant result.
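
One way to change the option temporarily and then restore it is sketched below. It uses only base R's `options()` and `getOption()`; the variable names are illustrative, and whether a given `yardstick` version still honors this option is an assumption based on the description above.

```r
# Switch the "event" level to the last factor level, remembering the old value.
old_opts <- options(yardstick.event_first = FALSE)

# The option now reads FALSE for any metric computed in this session.
during_change <- getOption("yardstick.event_first")
during_change

# ... compute metrics here ...

# Restore whatever value was in place before.
options(old_opts)
```

Saving the return value of `options()` avoids hard-coding the previous setting when restoring it.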

## Multiclass

`mcc()` has a known multiclass generalization, which is computed
automatically if a factor with more than 2 levels is provided. Because
of this, no averaging methods are provided.
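
To make the generalization concrete, here is a minimal base R sketch of the commonly used multiclass form of the coefficient, computed from a square confusion matrix with the true classes in the columns. The function name `mcc_from_table()` is hypothetical and not part of `yardstick`, and it is an assumption that the package's internal computation matches this form exactly.

```r
# Hypothetical helper: multiclass MCC from a K x K confusion matrix
# (rows = predicted class, columns = true class).
mcc_from_table <- function(cm) {
  cm <- as.matrix(cm)
  storage.mode(cm) <- "double"       # avoid integer overflow on large counts
  stopifnot(nrow(cm) == ncol(cm))

  n            <- sum(cm)            # total number of samples
  correct      <- sum(diag(cm))      # correctly classified samples
  pred_totals  <- rowSums(cm)        # how often each class was predicted
  truth_totals <- colSums(cm)        # how often each class truly occurs

  num <- correct * n - sum(pred_totals * truth_totals)
  den <- sqrt((n^2 - sum(pred_totals^2)) * (n^2 - sum(truth_totals^2)))
  if (den == 0) return(NA_real_)     # undefined when one margin is degenerate
  num / den
}

# Invented 2 x 2 example: TP = 6, FP = 1, FN = 2, TN = 3.
binary_cm <- matrix(c(6, 2, 1, 3), nrow = 2)
binary_value <- mcc_from_table(binary_cm)

# Invented 3-class example with perfect agreement.
perfect_value <- mcc_from_table(diag(c(3, 4, 5)))
```

For two classes this expression reduces to the familiar binary formula, so a single code path can serve both cases.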

## References

Jurman, G., Riccadonna, S., and Furlanello, C. (2012). "A Comparison of MCC and CEN Error
Measures in Multi-Class Prediction". *PLOS ONE*. Vol 7, Iss 8, e41882.

## See also

Other class metrics:
`accuracy()`, `bal_accuracy()`, `detection_prevalence()`, `f_meas()`,
`j_index()`, `kap()`, `npv()`, `ppv()`, `precision()`, `recall()`,
`sens()`, `spec()`

## Examples
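
The calls that produced the printed results in this section are not shown on this page. The sketch below is a plausible reconstruction, not the original example code. It assumes the `two_class_example` and `hpc_cv` datasets bundled with `yardstick` (the `Resample` column and fold labels in the output are consistent with `hpc_cv`) and that `dplyr` is installed for `group_by()`.

```r
library(yardstick)
library(dplyr)

# Assumed datasets shipped with yardstick.
data("two_class_example")
data("hpc_cv")

# Two-class case: one row, with a "binary" estimator.
binary_result <- mcc(two_class_example, truth, predicted)
binary_result

# Grouped multiclass case: one row per resample.
grouped_result <- hpc_cv %>%
  group_by(Resample) %>%
  mcc(obs, pred)
grouped_result
```
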

#> # A tibble: 1 x 3
#>   .metric .estimator .estimate
#>   <chr>   <chr>          <dbl>
#> 1 mcc     binary         0.677

#> # A tibble: 10 x 4
#>    Resample .metric .estimator .estimate
#>    <chr>    <chr>   <chr>          <dbl>
#>  1 Fold01   mcc     multiclass     0.542
#>  2 Fold02   mcc     multiclass     0.521
#>  3 Fold03   mcc     multiclass     0.602
#>  4 Fold04   mcc     multiclass     0.519
#>  5 Fold05   mcc     multiclass     0.520
#>  6 Fold06   mcc     multiclass     0.494
#>  7 Fold07   mcc     multiclass     0.461
#>  8 Fold08   mcc     multiclass     0.538
#>  9 Fold09   mcc     multiclass     0.459
#> 10 Fold10   mcc     multiclass     0.498