Mean signed deviation (also known as mean signed difference, or mean signed error) computes the average difference between `truth` and `estimate`. A related metric is the mean absolute error (`mae()`).
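
A minimal sketch of what this means using the vector interface (assuming the yardstick package is attached; the data here are made up for illustration):

```
library(yardstick)

truth    <- c(2.0, 4.0, 6.0)   # observed values
estimate <- c(2.5, 3.5, 7.0)   # predicted values

# Average signed difference, observed minus predicted:
# mean(c(-0.5, 0.5, -1.0)) = -0.333...
msd_vec(truth, estimate)
```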

## Usage

```
msd(data, ...)
# S3 method for data.frame
msd(data, truth, estimate, na_rm = TRUE, case_weights = NULL, ...)
msd_vec(truth, estimate, na_rm = TRUE, case_weights = NULL, ...)
```

## Arguments

- `data`: A `data.frame` containing the columns specified by the `truth` and `estimate` arguments.

- `...`: Not currently used.

- `truth`: The column identifier for the true results (that is `numeric`). This should be an unquoted column name although this argument is passed by expression and supports quasiquotation (you can unquote column names). For `_vec()` functions, a `numeric` vector.

- `estimate`: The column identifier for the predicted results (that is also `numeric`). As with `truth` this can be specified different ways but the primary method is to use an unquoted variable name. For `_vec()` functions, a `numeric` vector.

- `na_rm`: A `logical` value indicating whether `NA` values should be stripped before the computation proceeds.

- `case_weights`: The optional column identifier for case weights. This should be an unquoted column name that evaluates to a numeric column in `data`. For `_vec()` functions, a numeric vector, `hardhat::importance_weights()`, or `hardhat::frequency_weights()`.
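
As a hedged sketch of supplying case weights to the vector interface (the data and weights below are illustrative, not from the package examples), a `hardhat::frequency_weights()` object can be passed directly:

```
library(yardstick)
library(hardhat)

truth    <- c(1.5, 2.0, 3.5, 4.0)
estimate <- c(1.0, 2.5, 3.0, 4.5)
wts      <- frequency_weights(c(1, 2, 1, 3))  # rows counted 1, 2, 1, and 3 times

# Case-weighted mean signed deviation
msd_vec(truth, estimate, case_weights = wts)
```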

## Value

A `tibble` with columns `.metric`, `.estimator`, and `.estimate` and 1 row of values.

For grouped data frames, the number of rows returned will be the same as the number of groups.

For `msd_vec()`, a single `numeric` value (or `NA`).
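
A small sketch (not from the package examples) of when the `NA` return can occur with `msd_vec()`, assuming missing values are present and `na_rm = FALSE`:

```
library(yardstick)

truth    <- c(1.2, 3.4, NA, 5.6)
estimate <- c(1.0, 3.0, 2.0, 6.0)

msd_vec(truth, estimate)                 # NA pair stripped first (na_rm = TRUE is the default)
msd_vec(truth, estimate, na_rm = FALSE)  # returns NA
```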

## Details

Mean signed deviation is rarely used, since positive and negative errors cancel each other out. For example, `msd_vec(c(100, -100), c(0, 0))` would return a seemingly "perfect" value of `0`, even though `estimate` is wildly different from `truth`. `mae()` attempts to remedy this by taking the absolute value of the differences before computing the mean.

This metric is computed as `mean(truth - estimate)`, following the convention that an "error" is computed as `observed - predicted`. If you expected this metric to be computed as `mean(estimate - truth)`, reverse the sign of the result.
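
The convention can be checked by hand. A brief sketch (assuming yardstick is attached) comparing `msd_vec()` with a manual `mean(truth - estimate)` and with `mae_vec()` on the cancelling example from the paragraph above:

```
library(yardstick)

truth    <- c(100, -100)
estimate <- c(0, 0)

# Signed errors cancel out
msd_vec(truth, estimate)    # 0
mean(truth - estimate)      # 0, same observed - predicted convention

# Absolute errors do not cancel
mae_vec(truth, estimate)    # 100
```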

## See also

Other numeric metrics: `ccc()`, `huber_loss_pseudo()`, `huber_loss()`, `iic()`, `mae()`, `mape()`, `mase()`, `mpe()`, `poisson_log_loss()`, `rmse()`, `rpd()`, `rpiq()`, `rsq_trad()`, `rsq()`, `smape()`

Other accuracy metrics: `ccc()`, `huber_loss_pseudo()`, `huber_loss()`, `iic()`, `mae()`, `mape()`, `mase()`, `mpe()`, `poisson_log_loss()`, `rmse()`, `smape()`

## Examples

```
# Supply truth and predictions as bare column names
msd(solubility_test, solubility, prediction)
#> # A tibble: 1 × 3
#>   .metric .estimator .estimate
#>   <chr>   <chr>          <dbl>
#> 1 msd     standard     -0.0143

library(dplyr)

set.seed(1234)
size <- 100
times <- 10

# create 10 resamples
solubility_resampled <- bind_rows(
  replicate(
    n = times,
    expr = sample_n(solubility_test, size, replace = TRUE),
    simplify = FALSE
  ),
  .id = "resample"
)

# Compute the metric by group
metric_results <- solubility_resampled %>%
  group_by(resample) %>%
  msd(solubility, prediction)

metric_results
#> # A tibble: 10 × 4
#>    resample .metric .estimator .estimate
#>    <chr>    <chr>   <chr>          <dbl>
#>  1 1        msd     standard   -0.0119
#>  2 10       msd     standard   -0.0424
#>  3 2        msd     standard    0.0111
#>  4 3        msd     standard   -0.0906
#>  5 4        msd     standard   -0.0859
#>  6 5        msd     standard   -0.0301
#>  7 6        msd     standard   -0.0132
#>  8 7        msd     standard   -0.00640
#>  9 8        msd     standard   -0.000697
#> 10 9        msd     standard   -0.0399

# Resampled mean estimate
metric_results %>%
  summarise(avg_estimate = mean(.estimate))
#> # A tibble: 1 × 1
#>   avg_estimate
#>          <dbl>
#> 1      -0.0310
```