Title: | Bootstrap-Based Goodness-of-Fit Tests for Parametric Regression |
---|---|
Description: | Provides statistical methods to check if a parametric family of conditional density functions fits to some given dataset of covariates and response variables. Different test statistics can be used to determine the goodness-of-fit of the assumed model, see Andrews (1997) <doi:10.2307/2171880>, Bierens & Wang (2012) <doi:10.1017/S0266466611000168>, Dikta & Scheer (2021) <doi:10.1007/978-3-030-73480-0> and Kremling & Dikta (2024) <doi:10.48550/arXiv.2409.20262>. As proposed in these papers, the corresponding p-values are approximated using a parametric bootstrap method. |
Authors: | Gitte Kremling [aut, cre, cph] |
Maintainer: | Gitte Kremling <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.0.0 |
Built: | 2025-03-04 04:33:34 UTC |
Source: | https://github.com/gkremling/gofreg |
This class inherits from TestStatistic and implements a function to calculate the test statistic (and x-y-values that can be used to plot the underlying process).
The process underlying the test statistic is defined by
gofreg::TestStatistic
-> CondKolmbXY
calc_stat()
Calculate the value of the test statistic for given data and a model to test for.
CondKolmbXY$calc_stat(data, model)
data
data.frame() with columns x and y containing the data
model
ParamRegrModel to test for, already fitted to the data
The modified object (self), allowing for method chaining.
clone()
The objects of this class are cloneable with this method.
CondKolmbXY$clone(deep = FALSE)
deep
Whether to make a deep clone.
# Create an example dataset
n <- 100
x <- cbind(runif(n), rbinom(n, 1, 0.5))
model <- NormalGLM$new()
y <- model$sample_yx(x, params=list(beta=c(2,3), sd=1))
data <- dplyr::tibble(x = x, y = y)
# Fit the correct model
model$fit(data, params_init=list(beta=c(1,1), sd=3), inplace = TRUE)
# Print value of test statistic and plot corresponding process
ts <- CondKolmbXY$new()
ts$calc_stat(data, model)
print(ts)
plot(ts)
# Fit a wrong model
model2 <- NormalGLM$new(linkinv = function(u) {u+10})
model2$fit(data, params_init=list(beta=c(1,1), sd=3), inplace = TRUE)
# Print value of test statistic and plot corresponding process
ts2 <- CondKolmbXY$new()
ts2$calc_stat(data, model2)
print(ts2)
plot(ts2)
This class inherits from TestStatistic and implements a function to calculate the test statistic (and x-y-values that can be used to plot the underlying process).
The process underlying the test statistic is given in Andrews (1997) doi:10.2307/2171880 and defined by
gofreg::TestStatistic
-> CondKolmXY
calc_stat()
Calculate the value of the test statistic for given data and a model to test for.
CondKolmXY$calc_stat(data, model)
data
data.frame() with columns x and y containing the data
model
ParamRegrModel to test for, already fitted to the data
The modified object (self), allowing for method chaining.
clone()
The objects of this class are cloneable with this method.
CondKolmXY$clone(deep = FALSE)
deep
Whether to make a deep clone.
# Create an example dataset
n <- 100
x <- cbind(runif(n), rbinom(n, 1, 0.5))
model <- NormalGLM$new()
y <- model$sample_yx(x, params=list(beta=c(2,3), sd=1))
data <- dplyr::tibble(x = x, y = y)
# Fit the correct model
model$fit(data, params_init=list(beta=c(1,1), sd=3), inplace = TRUE)
# Print value of test statistic and plot corresponding process
ts <- CondKolmXY$new()
ts$calc_stat(data, model)
print(ts)
plot(ts)
# Fit a wrong model
model2 <- NormalGLM$new(linkinv = function(u) {u+10})
model2$fit(data, params_init=list(beta=c(1,1), sd=3), inplace = TRUE)
# Print value of test statistic and plot corresponding process
ts2 <- CondKolmXY$new()
ts2$calc_stat(data, model2)
print(ts2)
plot(ts2)
This class inherits from TestStatistic and implements a function to calculate the test statistic (and x-y-values that can be used to plot the underlying process).
The process underlying the test statistic is given in Kremling & Dikta (2024) https://arxiv.org/abs/2409.20262 and defined by
gofreg::TestStatistic
-> CondKolmY
calc_stat()
Calculate the value of the test statistic for given data and a model to test for.
CondKolmY$calc_stat(data, model)
data
data.frame() with columns x and y containing the data
model
ParamRegrModel to test for, already fitted to the data
The modified object (self), allowing for method chaining.
clone()
The objects of this class are cloneable with this method.
CondKolmY$clone(deep = FALSE)
deep
Whether to make a deep clone.
# Create an example dataset
n <- 100
x <- cbind(runif(n), rbinom(n, 1, 0.5))
model <- NormalGLM$new()
y <- model$sample_yx(x, params=list(beta=c(2,3), sd=1))
data <- dplyr::tibble(x = x, y = y)
# Fit the correct model
model$fit(data, params_init=list(beta=c(1,1), sd=3), inplace = TRUE)
# Print value of test statistic and plot corresponding process
ts <- CondKolmY$new()
ts$calc_stat(data, model)
print(ts)
plot(ts)
# Fit a wrong model
model2 <- NormalGLM$new(linkinv = function(u) {u+10})
model2$fit(data, params_init=list(beta=c(1,1), sd=3), inplace = TRUE)
# Print value of test statistic and plot corresponding process
ts2 <- CondKolmY$new()
ts2$calc_stat(data, model2)
print(ts2)
plot(ts2)
This class inherits from TestStatistic and implements a function to calculate the test statistic (and x-y-values that can be used to plot the underlying process).
The process underlying the test statistic is defined by
gofreg::TestStatistic
-> CondKolmY_RCM
calc_stat()
Calculate the value of the test statistic for given data and a model to test for.
CondKolmY_RCM$calc_stat(data, model)
data
data.frame() with columns x, z and delta containing the data
model
ParamRegrModel to test for, already fitted to the data
The modified object (self), allowing for method chaining.
clone()
The objects of this class are cloneable with this method.
CondKolmY_RCM$clone(deep = FALSE)
deep
Whether to make a deep clone.
# Create an example dataset
n <- 100
x <- cbind(runif(n), rbinom(n, 1, 0.5))
model <- NormalGLM$new()
y <- model$sample_yx(x, params=list(beta=c(2,3), sd=1))
c <- rnorm(n, mean(y)*1.2, sd(y)*0.5)
data <- dplyr::tibble(x = x, z = pmin(y,c), delta = as.numeric(y <= c))
# Fit the correct model
model$fit(data, params_init=list(beta=c(1,1), sd=3), inplace = TRUE, loglik = loglik_xzd)
# Print value of test statistic and plot corresponding process
ts <- CondKolmY_RCM$new()
ts$calc_stat(data, model)
print(ts)
plot(ts)
# Fit a wrong model
model2 <- NormalGLM$new(linkinv = function(u) {u+10})
model2$fit(data, params_init=list(beta=c(1,1), sd=3), inplace = TRUE, loglik = loglik_xzd)
# Print value of test statistic and plot corresponding process
ts2 <- CondKolmY_RCM$new()
ts2$calc_stat(data, model2)
print(ts2)
plot(ts2)
This class represents a generalized linear model with exponential distribution. It inherits from GLM and implements its functions that, for example, evaluate the conditional density and distribution functions.
gofreg::ParamRegrModel
-> gofreg::GLM
-> ExpGLM
fit()
Calculates the maximum likelihood estimator for the model parameters based on given data.
ExpGLM$fit( data, params_init = private$params, loglik = loglik_xy, inplace = FALSE )
data
tibble containing the data to fit the model to
params_init
initial value of the model parameters to use for the optimization (defaults to the fitted parameter values)
loglik
function(data, model, params), defaults to loglik_xy()
inplace
logical; if TRUE, default model parameters are set accordingly and parameter estimator is not returned
MLE of the model parameters for the given data, same shape as params_init
f_yx()
Evaluates the conditional density function.
ExpGLM$f_yx(t, x, params = private$params)
t
value(s) at which the conditional density shall be evaluated
x
matrix of covariates, each row representing one sample
params
model parameters to use (list() with tag beta), defaults to the fitted parameter values
value(s) of the conditional density function, same shape as t
F_yx()
Evaluates the conditional distribution function.
ExpGLM$F_yx(t, x, params = private$params)
t
value(s) at which the conditional distribution shall be evaluated
x
matrix of covariates, each row representing one sample
params
model parameters to use (list() with tag beta), defaults to the fitted parameter values
value(s) of the conditional distribution function, same shape as t
F1_yx()
Evaluates the conditional quantile function.
ExpGLM$F1_yx(t, x, params = private$params)
t
value(s) at which the conditional quantile function shall be evaluated
x
matrix of covariates, each row representing one sample
params
model parameters to use (list() with tag beta), defaults to the fitted parameter values
value(s) of the conditional quantile function, same shape as t
sample_yx()
Generates a new sample of response variables with the same conditional distribution.
ExpGLM$sample_yx(x, params = private$params)
x
matrix of covariates, each row representing one sample
params
model parameters to use (list() with tag beta), defaults to the fitted parameter values
vector of sampled response variables, same length as nrow(x)
clone()
The objects of this class are cloneable with this method.
ExpGLM$clone(deep = FALSE)
deep
Whether to make a deep clone.
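To illustrate how the density, distribution, quantile and sampling methods described above relate to each other, here is a minimal base-R sketch. It assumes an exponential distribution whose conditional mean is the inverse link applied to the linear predictor (identity inverse link here); this parametrization is an assumption and is not confirmed by the text above. The names mu, rate, dens, prob and quant are hypothetical and not part of the package.

```r
# Sketch only: base-R analogue of f_yx, F_yx, F1_yx and sample_yx
# under the ASSUMPTION that Y | X = x is exponential with mean x %*% beta.
set.seed(123)
x <- cbind(c(1, 2, 4))        # covariate matrix, one row per sample
beta <- 3                     # assumed model parameter
mu <- as.vector(x %*% beta)   # conditional mean (identity inverse link)
rate <- 1 / mu                # rate of the exponential distribution

t <- c(2, 5, 10)
dens  <- dexp(t, rate = rate)      # conditional density at t
prob  <- pexp(t, rate = rate)      # conditional distribution function at t
quant <- qexp(prob, rate = rate)   # quantile function inverts the distribution function

y_smpl <- rexp(nrow(x), rate = rate)  # one sampled response per row of x
```

The quantile function is evaluated at probabilities, so applying it to prob recovers the original values in t.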
# Use the built-in cars dataset
x <- datasets::cars$speed
y <- datasets::cars$dist
data <- dplyr::tibble(x=x, y=y)
# Create an instance of ExpGLM
model <- ExpGLM$new()
# Fit an Exponential GLM to the cars dataset
model$fit(data, params_init = list(beta=3), inplace=TRUE)
params_opt <- model$get_params()
# Plot the resulting regression function
plot(datasets::cars)
abline(a = 0, b = params_opt$beta)
# Generate a sample for y for given x following the same distribution
x.new <- seq(min(x), max(x), by=2)
y.smpl <- model$sample_yx(x.new)
points(x.new, y.smpl, col="red")
# Evaluate the conditional density, distribution, quantile and regression
# function at given values
model$f_yx(y.smpl, x.new)
model$F_yx(y.smpl, x.new)
model$F1_yx(y.smpl, x.new)
y.pred <- model$mean_yx(x.new)
points(x.new, y.pred, col="blue")
This class represents a generalized linear model with Gamma distribution. It inherits from GLM and implements its functions that, for example, evaluate the conditional density and distribution functions.
gofreg::ParamRegrModel
-> gofreg::GLM
-> GammaGLM
fit()
Calculates the maximum likelihood estimator for the model parameters based on given data.
GammaGLM$fit( data, params_init = private$params, loglik = loglik_xy, inplace = FALSE )
data
tibble containing the data to fit the model to
params_init
initial value of the model parameters to use for the optimization (defaults to the fitted parameter values)
loglik
function(data, model, params), defaults to loglik_xy()
inplace
logical; if TRUE, default model parameters are set accordingly and parameter estimator is not returned
MLE of the model parameters for the given data, same shape as params_init
f_yx()
Evaluates the conditional density function.
GammaGLM$f_yx(t, x, params = private$params)
t
value(s) at which the conditional density shall be evaluated
x
matrix of covariates, each row representing one sample
params
model parameters to use (list() with tags beta and shape), defaults to the fitted parameter values
value(s) of the conditional density function, same shape as t
F_yx()
Evaluates the conditional distribution function.
GammaGLM$F_yx(t, x, params = private$params)
t
value(s) at which the conditional distribution shall be evaluated
x
matrix of covariates, each row representing one sample
params
model parameters to use (list() with tags beta and shape), defaults to the fitted parameter values
value(s) of the conditional distribution function, same shape as t
F1_yx()
Evaluates the conditional quantile function.
GammaGLM$F1_yx(t, x, params = private$params)
t
value(s) at which the conditional quantile function shall be evaluated
x
matrix of covariates, each row representing one sample
params
model parameters to use (list() with tags beta and shape), defaults to the fitted parameter values
value(s) of the conditional quantile function, same shape as t
sample_yx()
Generates a new sample of response variables with the same conditional distribution.
GammaGLM$sample_yx(x, params = private$params)
x
matrix of covariates, each row representing one sample
params
model parameters to use (list() with tags beta and shape), defaults to the fitted parameter values
vector of sampled response variables, same length as nrow(x)
clone()
The objects of this class are cloneable with this method.
GammaGLM$clone(deep = FALSE)
deep
Whether to make a deep clone.
# Use the built-in cars dataset
x <- datasets::cars$speed
y <- datasets::cars$dist
data <- dplyr::tibble(x=x, y=y)
# Create an instance of GammaGLM
model <- GammaGLM$new()
# Fit a Gamma GLM to the cars dataset
model$fit(data, params_init = list(beta=3, shape=1), inplace=TRUE)
params_opt <- model$get_params()
# Plot the resulting regression function
plot(datasets::cars)
abline(a = 0, b = params_opt$beta)
# Generate a sample for y for given x following the same distribution
x.new <- seq(min(x), max(x), by=2)
y.smpl <- model$sample_yx(x.new)
points(x.new, y.smpl, col="red")
# Evaluate the conditional density, distribution, quantile and regression
# function at given values
model$f_yx(y.smpl, x.new)
model$F_yx(y.smpl, x.new)
model$F1_yx(y.smpl, x.new)
y.pred <- model$mean_yx(x.new)
points(x.new, y.pred, col="blue")
This class specializes ParamRegrModel. It is the abstract base class for parametric generalized linear model objects with specific distribution family such as NormalGLM and handles the (inverse) link function.
gofreg::ParamRegrModel
-> GLM
new()
Initialize an object of class GLM.
GLM$new(linkinv = identity, params = NA)
linkinv
inverse link function, defaults to identity function
params
model parameters to use as default (optional)
a new instance of the class
mean_yx()
Evaluates the regression function or in other terms the expected value of Y given X=x.
GLM$mean_yx(x, params = private$params)
x
vector of covariates
params
model parameters to use, defaults to the fitted parameter values
value of the regression function
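The role of the inverse link function in the regression function can be sketched in base R as follows. The helper mean_yx_sketch is hypothetical and only mirrors the idea of mean_yx(); it assumes the regression function is the inverse link applied to the linear predictor x %*% beta, which is the standard GLM form but is not spelled out in the text above.

```r
# Hypothetical helper (not part of the package): regression function of a GLM,
# ASSUMING E[Y | X = x] = linkinv(x %*% beta)
mean_yx_sketch <- function(x, beta, linkinv = identity) {
  x <- as.matrix(x)                 # one row per sample
  as.vector(linkinv(x %*% beta))    # apply inverse link to linear predictor
}

x <- cbind(c(0, 1, 2), c(1, 0, 1))
beta <- c(2, 3)

m_identity <- mean_yx_sketch(x, beta)                  # identity inverse link
m_log      <- mean_yx_sketch(x, beta, linkinv = exp)   # log link, inverse is exp
```

With the identity inverse link the result is just the linear predictor; any other inverse link transforms it elementwise.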
clone()
The objects of this class are cloneable with this method.
GLM$clone(deep = FALSE)
deep
Whether to make a deep clone.
This constructor function can be used to create an instance of a parametric GLM with specific distribution family, returning a new object of NormalGLM, ExpGLM, WeibullGLM or GammaGLM, depending on the value of distr.
GLM.new(distr, linkinv = identity, params = NA)
distr | distribution family |
linkinv | inverse link function, defaults to identity function |
params | model parameters to use as default (optional) |
a new instance of a GLM-subclass
model <- GLM.new(distr = "normal")
# see examples of GLM-subclasses (e.g. NormalGLM) for how to use such models
This class implements functions to calculate the test statistic for the original data as well as the statistics for bootstrap samples. It also offers the possibility to compute the corresponding bootstrap p-value.
new()
Initialize an instance of class GOFTest.
GOFTest$new( data, model_fitted, test_stat, nboot, resample = resample_param, loglik = loglik_xy )
data
data.frame() containing the data
model_fitted
object of class ParamRegrModel with fitted parameters
test_stat
object of class TestStatistic
nboot
number of bootstrap iterations
resample
function(data, model) used to resample data in bootstrap iterations, defaults to resample_param()
loglik
function(data, model, params) negative log-likelihood function used to fit model to resampled data in bootstrap iterations, defaults to loglik_xy()
a new instance of the class
get_stat_orig()
Calculates the test statistic for the original data and model.
GOFTest$get_stat_orig()
object of class TestStatistic
get_stats_boot()
Calculates the test statistics for the resampled data and corresponding models.
GOFTest$get_stats_boot()
vector of length nboot containing objects of class TestStatistic
get_pvalue()
Calculates the bootstrap p-value for the given model.
GOFTest$get_pvalue()
p-value for the null hypothesis that y was generated according to model
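The parametric bootstrap idea behind the p-value can be sketched in base R for a simple unconditional case. This is an illustration under stated assumptions, not the package's implementation: it tests an exponential model with a Kolmogorov-Smirnov distance, and it assumes the p-value is the share of bootstrap statistics that are at least as large as the original one. The function ks_stat and all variable names are hypothetical.

```r
# Sketch only: parametric bootstrap p-value for an exponential model
set.seed(42)
y <- rexp(100, rate = 2)

# Kolmogorov-Smirnov distance between the data and the fitted exponential
ks_stat <- function(y) {
  rate_hat <- 1 / mean(y)                 # maximum likelihood estimate of the rate
  n <- length(y)
  Fy <- pexp(sort(y), rate = rate_hat)    # fitted distribution at the order statistics
  max(pmax(abs(Fy - (1:n) / n), abs(Fy - (0:(n - 1)) / n)))
}

stat_orig <- ks_stat(y)                   # statistic for the original data

nboot <- 200
stats_boot <- replicate(nboot, {
  # resample from the fitted model, then refit and recompute the statistic
  y_boot <- rexp(length(y), rate = 1 / mean(y))
  ks_stat(y_boot)
})

# ASSUMED definition: share of bootstrap statistics at least as large as the original
pvalue <- mean(stats_boot >= stat_orig)
```

Refitting the model on every bootstrap sample matters: it reproduces the effect of parameter estimation on the distribution of the statistic.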
plot_procs()
Plots the processes underlying the bootstrap test statistics (gray) and the original test statistic (red).
GOFTest$plot_procs( title = sprintf("Test Statistic: %s, p-value: %s", class(private$test_stat)[1], self$get_pvalue()), subtitle = ggplot2::waiver(), color_boot = "gray40", color_orig = "red", x_lab = "plot.x", y_lab = "plot.y" )
title
text to be displayed as title of the plot; defaults to "Test Statistic: xxx, p-value: xxx"
subtitle
text to be displayed as subtitle of the plot; default is no subtitle
color_boot
color used to plot bootstrap test statistics; default is "gray40"
color_orig
color used to plot original test statistic; default is "red"
x_lab
label to use for the x-axis; default is "plot.x"
y_lab
label to use for the y-axis; default is "plot.y"
The object (self), allowing for method chaining.
clone()
The objects of this class are cloneable with this method.
GOFTest$clone(deep = FALSE)
deep
Whether to make a deep clone.
# Create an example dataset
n <- 100
x <- cbind(runif(n), rbinom(n, 1, 0.5))
model <- NormalGLM$new()
y <- model$sample_yx(x, params=list(beta=c(2,3), sd=1))
data <- dplyr::tibble(x = x, y = y)
# Fit the correct model
model$fit(data, params_init=list(beta=c(1,1), sd=3), inplace = TRUE)
# Calculate the bootstrap p-value and plot the corresponding processes
goftest <- GOFTest$new(data, model, test_stat = CondKolmY$new(), nboot = 10)
goftest$get_pvalue()
goftest$plot_procs()
# Fit a wrong model
model2 <- NormalGLM$new(linkinv = function(u) {u+10})
model2$fit(data, params_init=list(beta=c(1,1), sd=3), inplace = TRUE)
# Calculate the bootstrap p-value and plot the corresponding processes
goftest2 <- GOFTest$new(data, model2, test_stat = CondKolmY$new(), nboot = 10)
goftest2$get_pvalue()
goftest2$plot_procs()
The log-likelihood function for a parametric regression model with data (x,y) is given by the sum of the logarithm of the conditional density of Y given X=x evaluated at y.
This function is one option that can be used to fit a ParamRegrModel. It returns the negative log-likelihood value because optim() minimizes by default, so minimizing this value maximizes the likelihood.
loglik_xy(data, model, params)
data | data.frame() with columns x and y containing the data |
model | ParamRegrModel to use for the likelihood function |
params | vector with model parameters to compute likelihood function for |
Value of the negative log-likelihood function
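Why a negative log-likelihood is passed to optim() can be seen in a small base-R sketch. It does not use the package: negloglik is a hypothetical stand-in for a normal linear model without intercept, with the standard deviation parametrized on the log scale (an assumption made here to keep it positive).

```r
# Sketch only: maximum likelihood via minimizing a negative log-likelihood
set.seed(1)
n <- 200
x <- runif(n)
y <- 2 * x + rnorm(n, sd = 1)    # true beta = 2, true sd = 1

# par = c(beta, log_sd); returns the NEGATIVE log-likelihood
negloglik <- function(par) {
  -sum(dnorm(y, mean = par[1] * x, sd = exp(par[2]), log = TRUE))
}

# optim() minimizes by default, so this maximizes the likelihood
fit <- optim(par = c(1, 0), fn = negloglik)
beta_hat <- fit$par[1]
sd_hat <- exp(fit$par[2])
```

The value at the optimum is lower than at the starting point, which mirrors the "should be higher" remark for wrong parameters in the example.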
# Create an example dataset
n <- 100
x <- cbind(runif(n), rbinom(n, 1, 0.5))
model <- NormalGLM$new()
params.true <- list(beta = c(2,3), sd = 1)
y <- model$sample_yx(x, params = params.true)
data <- dplyr::tibble(x = x, y = y)
# Compute negative log likelihood for true parameters
loglik_xy(data, model, params.true)
# Compute negative log likelihood for wrong parameters (should be higher)
loglik_xy(data, model, params = list(beta = c(1,2), sd = 0.5))
The log-likelihood function for a parametric regression model under random censorship with data (x,z,delta) is given by the sum of the logarithm of the conditional density of Y given X=x evaluated at z if z was uncensored or the logarithm of the conditional survival of Y given X=x evaluated at z if z was censored.
This function is one option that can be used to fit a ParamRegrModel. It returns the negative log-likelihood value because optim() minimizes by default, so minimizing this value maximizes the likelihood.
loglik_xzd(data, model, params)
data | data.frame() with columns x, z and delta containing the data |
model | ParamRegrModel to use for the likelihood function |
params | vector with model parameters to compute likelihood function for |
Value of the negative log-likelihood function
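The censored likelihood described above can be written out in base R as a sketch. Uncensored observations contribute the log density at z and censored ones the log survival function at z. The function negloglik_cens, the normal model without intercept and the log-scale standard deviation are assumptions for illustration and are not taken from the package.

```r
# Sketch only: negative log-likelihood under random right censoring
set.seed(7)
n <- 300
x <- runif(n)
y <- 2 * x + rnorm(n)                   # latent response, true beta = 2, sd = 1
cens <- rnorm(n, mean = 2, sd = 1)      # censoring variable
z <- pmin(y, cens)                      # observed value
delta <- as.numeric(y <= cens)          # 1 = uncensored, 0 = censored

negloglik_cens <- function(par) {
  mu <- par[1] * x
  s <- exp(par[2])                      # sd on the log scale stays positive
  ll_unc  <- dnorm(z, mean = mu, sd = s, log = TRUE)          # log density
  ll_cens <- pnorm(z, mean = mu, sd = s,
                   lower.tail = FALSE, log.p = TRUE)          # log survival
  -sum(ifelse(delta == 1, ll_unc, ll_cens))
}

fit_cens <- optim(par = c(1, 0), fn = negloglik_cens)
beta_hat_cens <- fit_cens$par[1]
```

Using ifelse() rather than multiplying by delta avoids NaN values if one of the two terms happens to be infinite for an observation it does not apply to.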
# Create an example dataset
n <- 100
x <- cbind(runif(n), rbinom(n, 1, 0.5))
model <- NormalGLM$new()
params.true <- list(beta = c(2,3), sd = 1)
y <- model$sample_yx(x, params = params.true)
c <- rnorm(n, mean(y) * 1.2, sd(y) * 0.5)
data <- dplyr::tibble(x = x, z = pmin(y, c), delta = as.numeric(y <= c))
# Compute negative log likelihood for true parameters
loglik_xzd(data, model, params.true)
# Compute negative log likelihood for wrong parameters (should be higher)
loglik_xzd(data, model, params = list(beta = c(1,2), sd = 0.5))
This class inherits from TestStatistic and implements a function to calculate the test statistic (and x-y-values that can be used to plot the underlying process).
The process underlying the test statistic is given in Dikta & Scheer (2021) doi:10.1007/978-3-030-73480-0 and defined by
gofreg::TestStatistic
-> MEP
calc_stat()
Calculate the value of the test statistic for given data and a model to test for.
MEP$calc_stat(data, model)
data
data.frame() with columns x and y containing the data
model
ParamRegrModel to test for
The modified object (self), allowing for method chaining.
clone()
The objects of this class are cloneable with this method.
MEP$clone(deep = FALSE)
deep
Whether to make a deep clone.
# Create an example dataset
n <- 100
x <- cbind(runif(n), rbinom(n, 1, 0.5))
model <- NormalGLM$new()
y <- model$sample_yx(x, params=list(beta=c(2,3), sd=1))
data <- dplyr::tibble(x = x, y = y)
# Fit the correct model
model$fit(data, params_init=list(beta=c(1,1), sd=3), inplace = TRUE)
# Print value of test statistic and plot corresponding process
ts <- MEP$new()
ts$calc_stat(data, model)
print(ts)
plot(ts)
# Fit a wrong model
model2 <- NormalGLM$new(linkinv = function(u) {u+10})
model2$fit(data, params_init=list(beta=c(1,1), sd=3), inplace = TRUE)
# Print value of test statistic and plot corresponding process
ts2 <- MEP$new()
ts2$calc_stat(data, model2)
print(ts2)
plot(ts2)
This class represents a generalized linear model with negative binomial distribution. It inherits from GLM and implements its functions that, for example, evaluate the conditional density and distribution functions.
gofreg::ParamRegrModel
-> gofreg::GLM
-> NegBinomGLM
fit()
Calculates the maximum likelihood estimator for the model parameters based on given data.
NegBinomGLM$fit( data, params_init = private$params, loglik = loglik_xy, inplace = FALSE )
data
tibble containing the data to fit the model to
params_init
initial value of the model parameters to use for the optimization (defaults to the fitted parameter values)
loglik
function(data, model, params), defaults to loglik_xy()
inplace
logical; if TRUE, default model parameters are set accordingly and parameter estimator is not returned
MLE of the model parameters for the given data, same shape as params_init
f_yx()
Evaluates the conditional density function.
NegBinomGLM$f_yx(t, x, params = private$params)
t
value(s) at which the conditional density shall be evaluated
x
matrix of covariates, each row representing one sample
params
model parameters to use (list() with tags beta and shape), defaults to the fitted parameter values
value(s) of the conditional density function, same shape as t
F_yx()
Evaluates the conditional distribution function.
NegBinomGLM$F_yx(t, x, params = private$params)
t
value(s) at which the conditional distribution shall be evaluated
x
matrix of covariates, each row representing one sample
params
model parameters to use (list() with tags beta and shape), defaults to the fitted parameter values
value(s) of the conditional distribution function, same shape as t
F1_yx()
Evaluates the conditional quantile function.
NegBinomGLM$F1_yx(t, x, params = private$params)
t
value(s) at which the conditional quantile function shall be evaluated
x
matrix of covariates, each row representing one sample
params
model parameters to use (list() with tags beta and shape), defaults to the fitted parameter values
value(s) of the conditional quantile function, same shape as t
sample_yx()
Generates a new sample of response variables with the same conditional distribution.
NegBinomGLM$sample_yx(x, params = private$params)
x
matrix of covariates, each row representing one sample
params
model parameters to use (list() with tags beta and shape), defaults to the fitted parameter values
vector of sampled response variables, same length as nrow(x)
clone()
The objects of this class are cloneable with this method.
NegBinomGLM$clone(deep = FALSE)
deep
Whether to make a deep clone.
# Use the built-in cars dataset
x <- datasets::cars$speed
y <- datasets::cars$dist
data <- dplyr::tibble(x=x, y=y)

# Create an instance of a NegBinomGLM
model <- NegBinomGLM$new()

# Fit a Negative Binomial GLM to the cars dataset
model$fit(data, params_init = list(beta=3, shape=2), inplace=TRUE)
params_opt <- model$get_params()

# Plot the resulting regression function
plot(datasets::cars)
abline(a = 0, b = params_opt$beta)

# Generate a sample for y for given x following the same distribution
x.new <- seq(min(x), max(x), by=2)
y.smpl <- model$sample_yx(x.new)
points(x.new, y.smpl, col="red")

# Evaluate the conditional density, distribution, quantile and regression
# function at given values
model$f_yx(y.smpl, x.new)
model$F_yx(y.smpl, x.new)
model$F1_yx(y.smpl, x.new)
y.pred <- model$mean_yx(x.new)
points(x.new, y.pred, col="blue")
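To make the roles of beta and shape more concrete, the following base-R sketch evaluates a negative binomial conditional distribution without using the package. It assumes a log link and R's mu/size parametrization, with size playing the role of shape. Both choices, and the parameter values, are illustrative assumptions and not a statement about how NegBinomGLM is implemented.

```r
set.seed(1)
x <- cbind(runif(5), rbinom(5, 1, 0.5))   # 5 samples, 2 covariates
beta <- c(1, 0.5)                          # hypothetical coefficients
shape <- 2                                 # hypothetical dispersion parameter

# Conditional mean under the assumed log link
mu <- as.vector(exp(x %*% beta))

y <- rnbinom(nrow(x), size = shape, mu = mu)    # analogue of sample_yx()
dens <- dnbinom(y, size = shape, mu = mu)       # analogue of f_yx()
dist <- pnbinom(y, size = shape, mu = mu)       # analogue of F_yx()
quant <- qnbinom(dist, size = shape, mu = mu)   # analogue of F1_yx(), takes probabilities
```

Each row of x yields its own conditional mean, so all four results have one entry per sample.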
This class represents a generalized linear model with normal distribution. It inherits from GLM and implements its functions that, for example, evaluate the conditional density and distribution functions.
gofreg::ParamRegrModel
-> gofreg::GLM
-> NormalGLM
fit()
Calculates the maximum likelihood estimator for the model parameters based on given data.
NormalGLM$fit(data, params_init = private$params, loglik = loglik_xy, inplace = FALSE)
data
tibble containing the data to fit the model to
params_init
initial value of the model parameters to use for the optimization (defaults to the fitted parameter values)
loglik
function(data, model, params), defaults to loglik_xy()
inplace
logical; if TRUE, default model parameters are set accordingly and parameter estimator is not returned
MLE of the model parameters for the given data, same shape as params_init
f_yx()
Evaluates the conditional density function.
NormalGLM$f_yx(t, x, params = private$params)
t
value(s) at which the conditional density shall be evaluated
x
matrix of covariates, each row representing one sample
params
model parameters to use (list() with tags beta and sd), defaults to the fitted parameter values
value(s) of the conditional density function, same shape as t
F_yx()
Evaluates the conditional distribution function.
NormalGLM$F_yx(t, x, params = private$params)
t
value(s) at which the conditional distribution shall be evaluated
x
matrix of covariates, each row representing one sample
params
model parameters to use (list() with tags beta and sd), defaults to the fitted parameter values
value(s) of the conditional distribution function, same shape as t
F1_yx()
Evaluates the conditional quantile function.
NormalGLM$F1_yx(t, x, params = private$params)
t
value(s) at which the conditional quantile function shall be evaluated
x
matrix of covariates, each row representing one sample
params
model parameters to use (list() with tags beta and sd), defaults to the fitted parameter values
value(s) of the conditional quantile function, same shape as t
sample_yx()
Generates a new sample of response variables with the same conditional distribution.
NormalGLM$sample_yx(x, params = private$params)
x
matrix of covariates, each row representing one sample
params
model parameters to use (list() with tags beta and sd), defaults to the fitted parameter values
vector of sampled response variables, same length as nrow(x)
clone()
The objects of this class are cloneable with this method.
NormalGLM$clone(deep = FALSE)
deep
Whether to make a deep clone.
# Use the built-in cars dataset
x <- datasets::cars$speed
y <- datasets::cars$dist
data <- dplyr::tibble(x=x, y=y)

# Create an instance of a NormalGLM
model <- NormalGLM$new()

# Fit a Normal GLM to the cars dataset
model$fit(data, params_init = list(beta=3, sd=2), inplace=TRUE)
params_opt <- model$get_params()

# Plot the resulting regression function
plot(datasets::cars)
abline(a = 0, b = params_opt$beta)

# Generate a sample for y for given x following the same distribution
x.new <- seq(min(x), max(x), by=2)
y.smpl <- model$sample_yx(x.new)
points(x.new, y.smpl, col="red")

# Evaluate the conditional density, distribution, quantile and regression
# function at given values
model$f_yx(y.smpl, x.new)
model$F_yx(y.smpl, x.new)
model$F1_yx(y.smpl, x.new)
y.pred <- model$mean_yx(x.new)
points(x.new, y.pred, col="blue")
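The fit() method is described as computing the maximum likelihood estimator. As a rough illustration of what that involves for a normal model, the sketch below maximizes the log-likelihood with optim() in base R. The identity link, the hand-written negloglik function and the log-scale treatment of the standard deviation are assumptions made for this example; they are not taken from the package's loglik_xy().

```r
set.seed(42)
n <- 500
x <- cbind(runif(n), rbinom(n, 1, 0.5))
y <- as.vector(x %*% c(2, 3)) + rnorm(n, sd = 1)

# Negative log-likelihood of a normal model with identity link.
# The standard deviation is optimised on the log scale to keep it positive.
negloglik <- function(par) {
  beta <- par[1:2]
  sd <- exp(par[3])
  -sum(dnorm(y, mean = as.vector(x %*% beta), sd = sd, log = TRUE))
}

# Start from the same initial values as in the example above
opt <- optim(c(1, 1, log(3)), negloglik, method = "BFGS")
beta_hat <- opt$par[1:2]
sd_hat <- exp(opt$par[3])
```

With data simulated from beta = c(2, 3) and sd = 1, the estimates beta_hat and sd_hat should land close to those values.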
This is the abstract base class for parametric regression model objects like NormalGLM.
Parametric regression models are built around the following key tasks:
A method fit() to fit the model to given data, i.e. compute the MLE for the model parameters
Methods f_yx(), F_yx() and mean_yx() to evaluate the conditional density, distribution and regression function
A method sample_yx() to generate a random sample of response variables following the model given a vector of covariates
set_params()
Set the value of the model parameters used as default for the class functions.
ParamRegrModel$set_params(params)
params
model parameters to use as default
The modified object (self), allowing for method chaining.
get_params()
Returns the value of the model parameters used as default for the class functions.
ParamRegrModel$get_params()
model parameters used as default
fit()
Calculates the maximum likelihood estimator for the model parameters based on given data.
ParamRegrModel$fit(data, params_init = private$params, loglik = loglik_xy)
data
list containing the data to fit the model to
params_init
initial value of the model parameters to use for the optimization (defaults to the fitted parameter values)
loglik
function(data, model, params), defaults to loglik_xy()
MLE of the model parameters for the given data, same shape as params_init
f_yx()
Evaluates the conditional density function.
ParamRegrModel$f_yx(t, x, params = private$params)
t
value(s) at which the conditional density shall be evaluated
x
vector of covariates
params
model parameters to use, defaults to the fitted parameter values
value(s) of the conditional density function, same shape as t
F_yx()
Evaluates the conditional distribution function.
ParamRegrModel$F_yx(t, x, params = private$params)
t
value(s) at which the conditional distribution shall be evaluated
x
vector of covariates
params
model parameters to use, defaults to the fitted parameter values
value(s) of the conditional distribution function, same shape as t
F1_yx()
Evaluates the conditional quantile function.
ParamRegrModel$F1_yx(t, x, params = private$params)
t
value(s) at which the conditional quantile function shall be evaluated
x
vector of covariates
params
model parameters to use, defaults to the fitted parameter values
value(s) of the conditional quantile function, same shape as t
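Since F1_yx() is the quantile function, its first argument is a probability in [0, 1], and it inverts F_yx() for continuous models. The base-R sketch below checks this round trip for a normal model. The identity link and the parameter values are assumptions chosen for illustration only.

```r
x <- c(0.2, 0.5, 0.9)   # three covariate values
beta <- 2               # hypothetical coefficient
sd <- 1.5               # hypothetical standard deviation
mu <- beta * x          # conditional mean under the assumed identity link

p <- c(0.1, 0.5, 0.9)                    # probabilities at which to evaluate
q <- qnorm(p, mean = mu, sd = sd)        # analogue of F1_yx(p, x)
p_back <- pnorm(q, mean = mu, sd = sd)   # analogue of F_yx(q, x)
```

Here p_back reproduces p up to floating-point error.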
sample_yx()
Generates a new sample of response variables with the same conditional distribution.
ParamRegrModel$sample_yx(x, params = private$params)
x
vector of covariates
params
model parameters to use, defaults to the fitted parameter values
vector of sampled response variables, same length as x
mean_yx()
Evaluates the regression function or in other terms the expected value of Y given X=x.
ParamRegrModel$mean_yx(x, params = private$params)
x
vector of covariates
params
model parameters to use, defaults to the fitted parameter values
value of the regression function
clone()
The objects of this class are cloneable with this method.
ParamRegrModel$clone(deep = FALSE)
deep
Whether to make a deep clone.
Generate a new, resampled dataset of the same shape as data following the given model. The covariates are kept the same and the response variables are drawn according to model$sample_yx().
resample_param(data, model)
data
data.frame() with columns x and y containing the original data
model
ParamRegrModel to use for the resampling
data.frame() with columns x and y containing the resampled data
# Create an example dataset
n <- 10
x <- cbind(runif(n), rbinom(n, 1, 0.5))
model <- NormalGLM$new()
params <- list(beta = c(2, 3), sd = 1)
y <- model$sample_yx(x, params = params)
data <- dplyr::tibble(x = x, y = y)

# Fit the model to the data
model$fit(data, params_init = params, inplace = TRUE)

# Resample from the model given data
resample_param(data, model)
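The resampling scheme described above (covariates fixed, responses redrawn from the model) can be sketched in base R as follows. The helper resample_fixed_x is hypothetical, and the normal model with identity link and the values beta_hat and sd_hat merely stand in for a fitted model; this is not the package's implementation.

```r
set.seed(7)
n <- 10
x <- cbind(runif(n), rbinom(n, 1, 0.5))
beta_hat <- c(2, 3)   # stands in for fitted coefficients
sd_hat <- 1           # stands in for the fitted standard deviation

data <- list(x = x, y = as.vector(x %*% beta_hat) + rnorm(n, sd = sd_hat))

# Hypothetical helper: keep x, redraw y from the fitted conditional distribution
resample_fixed_x <- function(data, beta, sd) {
  mu <- as.vector(data$x %*% beta)
  list(x = data$x, y = rnorm(nrow(data$x), mean = mu, sd = sd))
}

data_b <- resample_fixed_x(data, beta_hat, sd_hat)
```

The resampled dataset shares its covariate matrix with the original, while the responses are a fresh draw.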
Generate a new, resampled dataset of the same shape as data following the given model. The covariates X are kept the same. Survival times Y are drawn according to model$sample_yx() and censoring times C according to the Kaplan-Meier (KM) estimator.
resample_param_cens(data, model)
data
data.frame() with columns x, z and delta containing the original data
model
ParamRegrModel to use for the resampling
data.frame() with columns x, z and delta containing the resampled data
# Create an example dataset
n <- 10
x <- cbind(runif(n), rbinom(n, 1, 0.5))
model <- NormalGLM$new()
params <- list(beta = c(2, 3), sd = 1)
y <- model$sample_yx(x, params = params)
c <- rnorm(n, mean(y) * 1.2, sd(y) * 0.5)
z <- pmin(y, c)
delta <- as.numeric(y <= c)
data <- dplyr::tibble(x = x, z = z, delta = delta)

# Fit the model to the data
model$fit(data, params_init = params, inplace = TRUE, loglik = loglik_xzd)

# Resample from the model given data
resample_param_cens(data, model)
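For the censored case, one way to draw censoring times from a Kaplan-Meier estimate is to treat the jumps of the estimated curve as sampling probabilities. The base-R sketch below does this with a hand-written estimator. The helper km_censoring, the handling of leftover probability mass and the normal model used to redraw survival times are all assumptions for illustration; the package may proceed differently.

```r
set.seed(3)
n <- 50
x <- runif(n)
y <- 2 * x + rnorm(n)                   # survival times (identity link assumed)
cens <- rnorm(n, mean = 1.5, sd = 1)    # censoring times
z <- pmin(y, cens)
delta <- as.numeric(y <= cens)

# Kaplan-Meier estimator of the censoring distribution:
# a censoring "event" is an observation with delta == 0.
km_censoring <- function(z, delta) {
  times <- sort(unique(z))
  d <- vapply(times, function(t) sum(z == t & delta == 0), numeric(1))
  r <- vapply(times, function(t) sum(z >= t), numeric(1))
  surv <- cumprod(1 - d / r)
  mass <- -diff(c(1, surv))   # jump sizes of the KM curve
  # Assumption: any mass left beyond the largest time is put on that time
  mass[length(mass)] <- mass[length(mass)] + surv[length(surv)]
  list(times = times, mass = mass)
}

km <- km_censoring(z, delta)
c_new <- sample(km$times, n, replace = TRUE, prob = km$mass)
y_new <- 2 * x + rnorm(n)               # redraw survival times from the model
resampled <- data.frame(x = x, z = pmin(y_new, c_new),
                        delta = as.numeric(y_new <= c_new))
```

The covariates are unchanged, while z and delta are recomputed from the new survival and censoring times.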
Generate a new, resampled dataset of the same shape as data following the given model. The covariates are resampled from data$x and the response variables are drawn according to model$sample_yx().
resample_param_rsmplx(data, model)
data
data.frame() with columns x and y containing the original data
model
ParamRegrModel to use for the resampling
data.frame() with columns x and y containing the resampled data
# Create an example dataset
n <- 10
x <- cbind(runif(n), rbinom(n, 1, 0.5))
model <- NormalGLM$new()
params <- list(beta = c(2, 3), sd = 1)
y <- model$sample_yx(x, params = params)
data <- dplyr::tibble(x = x, y = y)

# Fit the model to the data
model$fit(data, params_init = params, inplace = TRUE)

# Resample covariates and responses from the model given data
resample_param_rsmplx(data, model)
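The difference from the fixed-covariate scheme is that the rows of x are themselves drawn with replacement before the responses are generated. A minimal base-R sketch follows, assuming a normal model with identity link and placeholder parameter values beta_hat and sd_hat in place of a fitted model.

```r
set.seed(11)
n <- 10
x <- cbind(runif(n), rbinom(n, 1, 0.5))
beta_hat <- c(2, 3)   # stands in for fitted coefficients
sd_hat <- 1           # stands in for the fitted standard deviation

idx <- sample.int(n, n, replace = TRUE)   # bootstrap row indices
x_b <- x[idx, , drop = FALSE]             # resampled covariates
y_b <- as.vector(x_b %*% beta_hat) + rnorm(n, sd = sd_hat)
data_b <- list(x = x_b, y = y_b)
```

Every row of x_b is a row of the original x, possibly repeated.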
This class inherits from TestStatistic and implements a function to calculate the test statistic (and x-y-values that can be used to plot the underlying process).
The process underlying the test statistic is given in Bierens & Wang (2012) doi:10.1017/S0266466611000168.
gofreg::TestStatistic
-> SICM
new()
Initialize an instance of class SICM.
SICM$new(
  c,
  transx = function(values) {
    tvals <- atan(scale(values))
    tvals[, apply(values, 2, sd) == 0] <- 0
    return(tvals)
  },
  transy = function(values, data) {
    array(atan(scale(values, center = mean(data$y), scale = sd(data$y))))
  }
)
c
chosen value for integral boundaries (see Bierens & Wang (2012))
transx
function(values) used to transform x-values to be standardized and bounded; default is standardization by subtracting the mean and dividing by the standard deviation and then applying arctan
transy
function(values, data) used to transform y-values to be standardized and bounded (same method is used for simulated y-values); default is standardization by subtracting the mean and dividing by the standard deviation and then applying arctan
a new instance of the class
calc_stat()
Calculate the value of the test statistic for given data and a model to test for.
SICM$calc_stat(data, model)
data
data.frame() with columns x and y containing the data
model
ParamRegrModel to test for
The modified object (self), allowing for method chaining.
clone()
The objects of this class are cloneable with this method.
SICM$clone(deep = FALSE)
deep
Whether to make a deep clone.
# Create an example dataset
n <- 100
x <- cbind(runif(n), rbinom(n, 1, 0.5))
model <- NormalGLM$new()
y <- model$sample_yx(x, params=list(beta=c(2,3), sd=1))
data <- dplyr::tibble(x = x, y = y)

# Fit the correct model
model$fit(data, params_init=list(beta=c(1,1), sd=3), inplace = TRUE)

# Print value of test statistic and plot corresponding process
ts <- SICM$new(c = 5)
ts$calc_stat(data, model)
print(ts)
plot(ts)

# Fit a wrong model
model2 <- NormalGLM$new(linkinv = function(u) {u+10})
model2$fit(data, params_init=list(beta=c(1,1), sd=3), inplace = TRUE)

# Print value of test statistic and plot corresponding process
ts2 <- SICM$new(c = 5)
ts2$calc_stat(data, model2)
print(ts2)
plot(ts2)
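The default transx standardizes each covariate column, applies arctan to bound it, and sets zero-variance columns to 0. The base-R sketch below replays those steps on a small, made-up matrix to show why the last step matters: scale() divides by a standard deviation of zero for a constant column and returns NaN.

```r
values <- cbind(c(1, 2, 3, 10), c(5, 5, 5, 5))   # second column is constant

# Same steps as the default transx in the usage of SICM$new()
tvals <- atan(scale(values))
is_const <- apply(values, 2, sd) == 0
had_nan <- anyNA(tvals)      # NaN entries come from the zero-variance column
tvals[, is_const] <- 0       # replace them by 0, as the default does
```

After the replacement, all transformed values are finite and lie strictly between -pi/2 and pi/2.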
This is the abstract base class for test statistic objects like CondKolmY or MEP.
Test statistics are built around the key method calc_stat(), which calculates the particular test statistic (and x-y-values that can be used to plot the underlying process).
get_value()
Returns the value of the test statistic.
TestStatistic$get_value()
value of the test statistic
calc_stat()
Calculate the value of the test statistic for given data and a model to test for.
TestStatistic$calc_stat(data, model)
data
list() containing the data
model
ParamRegrModel to test for
The modified object (self), allowing for method chaining.
get_plot_xy()
Returns vectors of x and y that can be used to plot the process corresponding to the test statistic.
TestStatistic$get_plot_xy()
list with plot.x and plot.y being vectors of the same length
print()
Overrides the print-method for objects of type TestStatistic to only print its value.
TestStatistic$print()
The object (self), allowing for method chaining.
geom_ts_proc()
Creates a line plot showing the underlying process of the test statistic.
TestStatistic$geom_ts_proc(...)
...
Other arguments passed on to ggplot2::geom_line(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or size = 3.
A ggplot2 layer representing a line plot.
plot()
Creates a new ggplot showing the underlying process of the test statistic.
TestStatistic$plot(...)
...
Other arguments passed on to ggplot2::geom_line(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or size = 3.
A ggplot2 object representing the complete plot, including a line geometry.
clone()
The objects of this class are cloneable with this method.
TestStatistic$clone(deep = FALSE)
deep
Whether to make a deep clone.
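A test statistic on its own does not yield a decision; the package description states that p-values are approximated with a parametric bootstrap. The base-R sketch below walks through that loop: fit, compute the statistic, resample from the fitted model, refit and recompute. The helpers fit_model and calc_stat, the normal model through the origin and the simple Kolmogorov-type statistic are assumptions chosen for illustration and are not claimed to match any class of the package.

```r
set.seed(123)
n <- 100
x <- runif(n)
y <- 2 * x + rnorm(n)

# Hypothetical helper: closed-form MLE of a normal linear model through the origin
fit_model <- function(x, y) {
  beta <- sum(x * y) / sum(x^2)
  sd <- sqrt(mean((y - beta * x)^2))
  list(beta = beta, sd = sd)
}

# Hypothetical helper: compare the empirical distribution of y with the
# model-implied marginal distribution, evaluated at the observed y
calc_stat <- function(x, y, par) {
  emp <- vapply(y, function(t) mean(y <= t), numeric(1))
  mod <- vapply(y, function(t) mean(pnorm(t, par$beta * x, par$sd)), numeric(1))
  sqrt(length(y)) * max(abs(emp - mod))
}

par_hat <- fit_model(x, y)
stat <- calc_stat(x, y, par_hat)

# Parametric bootstrap: redraw y from the fitted model, refit, recompute
B <- 200
stat_boot <- replicate(B, {
  y_b <- par_hat$beta * x + rnorm(n, sd = par_hat$sd)
  calc_stat(x, y_b, fit_model(x, y_b))
})
p_value <- mean(stat_boot >= stat)
```

Refitting inside the loop matters: the bootstrap statistic has to be computed with parameters estimated from the resampled data, so that it mimics the estimation step applied to the original data.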
This class represents a generalized linear model with Weibull distribution. It inherits from GLM and implements its functions that, for example, evaluate the conditional density and distribution functions.
gofreg::ParamRegrModel
-> gofreg::GLM
-> WeibullGLM
fit()
Calculates the maximum likelihood estimator for the model parameters based on given data.
WeibullGLM$fit(data, params_init = private$params, loglik = loglik_xy, inplace = FALSE)
data
tibble containing the data to fit the model to
params_init
initial value of the model parameters to use for the optimization (defaults to the fitted parameter values)
loglik
function(data, model, params), defaults to loglik_xy()
inplace
logical; if TRUE, default model parameters are set accordingly and parameter estimator is not returned
MLE of the model parameters for the given data, same shape as params_init
f_yx()
Evaluates the conditional density function.
WeibullGLM$f_yx(t, x, params = private$params)
t
value(s) at which the conditional density shall be evaluated
x
matrix of covariates, each row representing one sample
params
model parameters to use (list() with tags beta and shape), defaults to the fitted parameter values
value(s) of the conditional density function, same shape as t
F_yx()
Evaluates the conditional distribution function.
WeibullGLM$F_yx(t, x, params = private$params)
t
value(s) at which the conditional distribution shall be evaluated
x
matrix of covariates, each row representing one sample
params
model parameters to use (list() with tags beta and shape), defaults to the fitted parameter values
value(s) of the conditional distribution function, same shape as t
F1_yx()
Evaluates the conditional quantile function.
WeibullGLM$F1_yx(t, x, params = private$params)
t
value(s) at which the conditional quantile function shall be evaluated
x
matrix of covariates, each row representing one sample
params
model parameters to use (list() with tags beta and shape), defaults to the fitted parameter values
value(s) of the conditional quantile function, same shape as t
sample_yx()
Generates a new sample of response variables with the same conditional distribution.
WeibullGLM$sample_yx(x, params = private$params)
x
matrix of covariates, each row representing one sample
params
model parameters to use (list() with tags beta and shape), defaults to the fitted parameter values
vector of sampled response variables, same length as nrow(x)
clone()
The objects of this class are cloneable with this method.
WeibullGLM$clone(deep = FALSE)
deep
Whether to make a deep clone.
# Use the built-in cars dataset
x <- datasets::cars$speed
y <- datasets::cars$dist
data <- dplyr::tibble(x=x, y=y)

# Create an instance of WeibullGLM
model <- WeibullGLM$new()

# Fit a Weibull GLM to the cars dataset
model$fit(data, params_init = list(beta=3, shape=1), inplace=TRUE)
params_opt <- model$get_params()

# Plot the resulting regression function
plot(datasets::cars)
abline(a = 0, b = params_opt$beta)

# Generate a sample for y for given x following the same distribution
x.new <- seq(min(x), max(x), by=2)
y.smpl <- model$sample_yx(x.new)
points(x.new, y.smpl, col="red")

# Evaluate the conditional density, distribution, quantile and regression
# function at given values
model$f_yx(y.smpl, x.new)
model$F_yx(y.smpl, x.new)
model$F1_yx(y.smpl, x.new)
y.pred <- model$mean_yx(x.new)
points(x.new, y.pred, col="blue")
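In a GLM the covariates determine the conditional mean, whereas R's Weibull functions are parametrized by shape and scale. One way to connect the two is to solve the mean formula for the scale, as in the base-R sketch below. Whether WeibullGLM uses exactly this parametrization is not stated in this section, so treat it, and the values of shape and mu, as assumptions.

```r
set.seed(5)
shape <- 2
mu <- 10                              # target conditional mean
scale <- mu / gamma(1 + 1 / shape)    # from E[Y] = scale * gamma(1 + 1/shape)

y <- rweibull(1e5, shape = shape, scale = scale)      # analogue of sample_yx()
p <- pweibull(y[1:5], shape = shape, scale = scale)   # analogue of F_yx()
q <- qweibull(p, shape = shape, scale = scale)        # analogue of F1_yx()
```

The sample mean of y should be close to mu, and q recovers the first five sampled values up to floating-point error.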