Mean signed deviation (also known as mean signed difference, or mean signed
error) computes the average difference between `truth` and `estimate`. A
related metric is the mean absolute error (`mae()`).

    msd(data, ...)

    # S3 method for data.frame
    msd(data, truth, estimate, na_rm = TRUE, ...)

    msd_vec(truth, estimate, na_rm = TRUE, ...)

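| Argument | Description |
|---|---|
| data | A `data.frame` containing the columns specified by `truth` and `estimate`. |
| ... | Not currently used. |
| truth | The column identifier for the true results (that is `numeric`). |
| estimate | The column identifier for the predicted results (that is also `numeric`). |
| na_rm | A `logical` value indicating whether `NA` values should be removed before the computation. |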

A `tibble` with columns `.metric`, `.estimator`, and `.estimate` and 1 row of values.

For grouped data frames, the number of rows returned will be the same as the number of groups.

For `msd_vec()`, a single `numeric` value (or `NA`).
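
How `na_rm` can lead to either a number or `NA` is easiest to see with a small base R stand-in. This is only a sketch: `msd_sketch()` is a hypothetical helper, not part of yardstick, and it assumes that `na_rm = TRUE` drops every pair in which either value is missing.

```r
# Hypothetical helper, not part of yardstick: a base R stand-in for msd_vec()
msd_sketch <- function(truth, estimate, na_rm = TRUE) {
  if (na_rm) {
    # Assumption: drop every pair in which either value is missing
    keep <- !is.na(truth) & !is.na(estimate)
    truth <- truth[keep]
    estimate <- estimate[keep]
  }
  mean(truth - estimate)
}

truth    <- c(1, 2, NA, 4)
estimate <- c(1.5, 2.5, 3, NA)

# Only the first two pairs are complete, so only they are averaged
with_removal <- msd_sketch(truth, estimate)

# With na_rm = FALSE the missing values propagate and the result is NA
without_removal <- msd_sketch(truth, estimate, na_rm = FALSE)
```

Under these assumptions `with_removal` is `-0.5` and `without_removal` is `NA`.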

Mean signed deviation is rarely used, since positive and negative errors
cancel each other out. For example, `msd_vec(c(100, -100), c(0, 0))` would
return a seemingly "perfect" value of `0`, even though `estimate` is wildly
different from `truth`. `mae()` attempts to remedy this by taking the
absolute value of the differences before computing the mean.
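
The cancellation can be checked with base R alone. The following is a minimal sketch that recomputes both quantities from their definitions rather than calling yardstick; the variable names are illustrative.

```r
# Two errors of equal size and opposite sign
truth    <- c(100, -100)
estimate <- c(0, 0)

# Mean signed deviation: the +100 and -100 errors cancel
signed_dev <- mean(truth - estimate)

# Mean absolute error: absolute values are taken first, so nothing cancels
abs_err <- mean(abs(truth - estimate))

signed_dev
abs_err
```

Here `signed_dev` is `0` while `abs_err` is `100`, which is why the signed metric is best read as a measure of bias rather than of overall accuracy.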

This metric is computed as `mean(truth - estimate)`, following the convention
that an "error" is computed as `observed - predicted`. If you expected this
metric to be computed as `mean(estimate - truth)`, reverse the sign of the
result.
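
The two sign conventions differ only by a factor of `-1`. As a small base R sketch (the data and variable names are made up for illustration):

```r
truth    <- c(3, 5, 8)
estimate <- c(4, 7, 9)   # every prediction is too high

# Convention used by this metric: observed - predicted
msd_obs_minus_pred <- mean(truth - estimate)

# Opposite convention: predicted - observed
msd_pred_minus_obs <- mean(estimate - truth)

msd_obs_minus_pred   # negative, because the predictions run high
msd_pred_minus_obs   # same magnitude, opposite sign
```

Under the `observed - predicted` convention, a negative value therefore indicates predictions that are too high on average.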

Other numeric metrics:
`ccc()`, `huber_loss_pseudo()`, `huber_loss()`, `iic()`, `mae()`, `mape()`, `mase()`, `mpe()`, `rmse()`, `rpd()`, `rpiq()`, `rsq_trad()`, `rsq()`, `smape()`

Other accuracy metrics:
`ccc()`, `huber_loss_pseudo()`, `huber_loss()`, `iic()`, `mae()`, `mape()`, `mase()`, `mpe()`, `rmse()`, `smape()`

Thomas Bierhance

    library(yardstick)  # provides msd()

    # Supply truth and predictions as bare column names
    msd(solubility_test, solubility, prediction)
    #> # A tibble: 1 x 3
    #>   .metric .estimator .estimate
    #>   <chr>   <chr>          <dbl>
    #> 1 msd     standard     -0.0143

    library(dplyr)

    set.seed(1234)
    size <- 100
    times <- 10

    # create 10 resamples
    solubility_resampled <- bind_rows(
      replicate(
        n = times,
        expr = sample_n(solubility_test, size, replace = TRUE),
        simplify = FALSE
      ),
      .id = "resample"
    )

    # Compute the metric by group
    metric_results <- solubility_resampled %>%
      group_by(resample) %>%
      msd(solubility, prediction)

    metric_results
    #> # A tibble: 10 x 4
    #>    resample .metric .estimator .estimate
    #>    <chr>    <chr>   <chr>          <dbl>
    #>  1 1        msd     standard   -0.0119
    #>  2 10       msd     standard   -0.0424
    #>  3 2        msd     standard    0.0111
    #>  4 3        msd     standard   -0.0906
    #>  5 4        msd     standard   -0.0859
    #>  6 5        msd     standard   -0.0301
    #>  7 6        msd     standard   -0.0132
    #>  8 7        msd     standard   -0.00640
    #>  9 8        msd     standard   -0.000697
    #> 10 9        msd     standard   -0.0399

    # Average the metric across the resamples
    metric_results %>%
      summarise(avg_estimate = mean(.estimate))
    #> # A tibble: 1 x 1
    #>   avg_estimate
    #>          <dbl>
    #> 1      -0.0310