Calculate the coefficient of determination using the traditional
sum-of-squares definition of R squared. For a measure of R squared that is
guaranteed to lie between 0 and 1, see rsq().
rsq_trad(data, ...)

# S3 method for data.frame
rsq_trad(data, truth, estimate, na_rm = TRUE, ...)

rsq_trad_vec(truth, estimate, na_rm = TRUE, ...)
Argument | Description |
---|---|
data | A |
... | Not currently used. |
truth | The column identifier for the true results (that is |
estimate | The column identifier for the predicted results (that is also |
na_rm | A |
A tibble with columns .metric, .estimator, and .estimate and 1 row of values.
For grouped data frames, the number of rows returned will be the same as the number of groups.
For rsq_trad_vec(), a single numeric value (or NA).
The two estimates for the coefficient of determination, rsq() and
rsq_trad(), differ by their formula. The former guarantees a value between
0 and 1, while the latter can produce misleading values, including negative
ones, when the model is non-informative (see the examples). Both are
measures of consistency/correlation and not of accuracy.
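The difference between the two formulas can be sketched in base R. This is an illustration only, not the package's implementation: it assumes that rsq() corresponds to the squared Pearson correlation and that rsq_trad() corresponds to one minus the ratio of the residual to the total sum of squares. The vectors and the names rsq_manual and rsq_trad_manual are hypothetical.

```r
# Hypothetical toy data: predictions that carry little information
truth    <- c(1, 2, 3, 4, 5)
estimate <- c(5, 1, 4, 2, 3)

# Traditional definition (assumed form): 1 - SS_residual / SS_total
ss_res <- sum((truth - estimate)^2)      # 26
ss_tot <- sum((truth - mean(truth))^2)   # 10
rsq_trad_manual <- 1 - ss_res / ss_tot   # -1.6

# Squared-correlation definition (assumed form), always within [0, 1]
rsq_manual <- cor(truth, estimate)^2     # 0.09

print(c(rsq_trad = rsq_trad_manual, rsq = rsq_manual))
```

With these values the residual sum of squares exceeds the total sum of squares, so the traditional estimate drops below zero while the squared correlation stays small but non-negative.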
Kvalseth. Cautionary note about \(R^2\). American Statistician (1985) vol. 39 (4) pp. 279-285.
Other numeric metrics: ccc(), huber_loss_pseudo(), huber_loss(), iic(), mae(), mape(), mase(), mpe(), msd(), rmse(), rpd(), rpiq(), rsq(), smape()
Max Kuhn
# Supply truth and predictions as bare column names
rsq_trad(solubility_test, solubility, prediction)
#> # A tibble: 1 x 3
#>   .metric  .estimator .estimate
#>   <chr>    <chr>          <dbl>
#> 1 rsq_trad standard       0.879

library(dplyr)

set.seed(1234)
size <- 100
times <- 10

# create 10 resamples
solubility_resampled <- bind_rows(
  replicate(
    n = times,
    expr = sample_n(solubility_test, size, replace = TRUE),
    simplify = FALSE
  ),
  .id = "resample"
)

# Compute the metric by group
metric_results <- solubility_resampled %>%
  group_by(resample) %>%
  rsq_trad(solubility, prediction)

metric_results
#> # A tibble: 10 x 4
#>    resample .metric  .estimator .estimate
#>    <chr>    <chr>    <chr>          <dbl>
#>  1 1        rsq_trad standard       0.870
#>  2 10       rsq_trad standard       0.878
#>  3 2        rsq_trad standard       0.891
#>  4 3        rsq_trad standard       0.913
#>  5 4        rsq_trad standard       0.889
#>  6 5        rsq_trad standard       0.857
#>  7 6        rsq_trad standard       0.872
#>  8 7        rsq_trad standard       0.852
#>  9 8        rsq_trad standard       0.915
#> 10 9        rsq_trad standard       0.883

# Resampled mean estimate
metric_results %>%
  summarise(avg_estimate = mean(.estimate))
#> # A tibble: 1 x 1
#>   avg_estimate
#>          <dbl>
#> 1        0.882

# With uninformative data, the traditional version of R^2 can return
# negative values.
set.seed(2291)
solubility_test$randomized <- sample(solubility_test$prediction)

rsq(solubility_test, solubility, randomized)
#> # A tibble: 1 x 3
#>   .metric .estimator .estimate
#>   <chr>   <chr>          <dbl>
#> 1 rsq     standard     0.00199

rsq_trad(solubility_test, solubility, randomized)
#> # A tibble: 1 x 3
#>   .metric  .estimator .estimate
#>   <chr>    <chr>          <dbl>
#> 1 rsq_trad standard       -1.01