Title: | Observed Fisher Information Matrix for Finite Mixture Model |
---|---|
Description: | Developed for the following tasks: (1) simulating realizations from the canonical, restricted, and unrestricted finite mixture models; (2) Monte Carlo approximation of the density function of the finite mixture models; (3) Monte Carlo approximation of the observed Fisher information matrix, asymptotic standard error, and the corresponding confidence intervals for parameters of the mixture models using the method proposed by Basford et al. (1997) <https://espace.library.uq.edu.au/view/UQ:57525>. |
Authors: | Mahdi Teimouri [aut, cre, cph, ctb] |
Maintainer: | Mahdi Teimouri <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.2.3 |
Built: | 2024-10-25 04:01:06 UTC |
Source: | https://github.com/cran/mixbox |
The AIS data set contains recorded body factors of 202 athletes (100 women and 102 men); see Cook and Weisberg (2009). Among these factors, two variables, body mass index (BMI) and body fat percentage (Bfat), are chosen for cluster analysis.
data(AIS)
A text file with 3 columns.
R. D. Cook and S. Weisberg, (2009). An Introduction to Regression Graphics, John Wiley & Sons, New York.
data(AIS)
The bankruptcy data set contains the ratio of retained earnings (RE) to total assets and the ratio of earnings before interest and taxes (EBIT) to total assets for 66 American firms; see Altman (1969).
data(bankruptcy)
A text file with 3 columns.
E. I. Altman, 1969. Financial ratios, discriminant analysis and the prediction of corporate bankruptcy, The Journal of Finance, 23(4), 589-609.
data(bankruptcy)
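The density function of a $G$-component finite mixture model can be represented as
$$g(\boldsymbol{y} \mid \Psi) = \sum_{g=1}^{G} \omega_g f_{\boldsymbol{Y}}(\boldsymbol{y} \mid \Theta_g),$$
where $\Psi = (\omega_1, \ldots, \omega_G, \Theta_1, \ldots, \Theta_G)^{\top}$ with $\Theta_g = (\boldsymbol{\mu}_g, \Sigma_g, \boldsymbol{\lambda}_g, \boldsymbol{\theta}_g)^{\top}$. Herein, $f_{\boldsymbol{Y}}(\boldsymbol{y} \mid \Theta_g)$ accounts for the density function of random vector $\boldsymbol{Y}$ within each component. In the restricted case, $\boldsymbol{Y}$ admits the representation given by
$$\boldsymbol{Y} \overset{d}{=} \boldsymbol{\mu}_g + \sqrt{W}\, \boldsymbol{\lambda}_g |Z_0| + \sqrt{W}\, \Sigma_g^{1/2} \boldsymbol{Z}_1,$$
where $\boldsymbol{\mu}_g \in \mathbb{R}^{d}$ is the location vector, $\boldsymbol{\lambda}_g \in \mathbb{R}^{d}$ is the skewness vector, and $\Sigma_g$ is a positive definite symmetric dispersion matrix for $g = 1, \ldots, G$. Further, $W$ is a positive random variable with mixing density function $f_W(w \mid \boldsymbol{\theta}_g)$, $Z_0 \sim N(0, 1)$, and $\boldsymbol{Z}_1 \sim N_d(\boldsymbol{0}, I_d)$. We note that $W$, $Z_0$, and $\boldsymbol{Z}_1$ are mutually independent. In the canonical or unrestricted case, $\boldsymbol{Y}$ admits the representation
$$\boldsymbol{Y} \overset{d}{=} \boldsymbol{\mu}_g + \sqrt{W}\, \Lambda_g |\boldsymbol{Z}_0| + \sqrt{W}\, \Sigma_g^{1/2} \boldsymbol{Z}_1,$$
where $\Lambda_g$ is the skewness matrix and random vector $\boldsymbol{Z}_0$ follows a zero-mean normal random vector truncated to the positive hyperplane $\mathbb{R}_{+}^{d}$ whose independent marginals have variance unity. We note that in the unrestricted case $\Lambda_g$ is a $d \times d$ diagonal matrix, whereas in the canonical case it is a $d \times q$ matrix, and so random vector $\boldsymbol{Z}_0$ follows a zero-mean normal random vector truncated to the positive hyperplane $\mathbb{R}_{+}^{q}$.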
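The Monte Carlo idea behind approximating such a density can be sketched in base R for the simplest setting. This is a minimal illustration under stated assumptions, not the package's implementation: the model is univariate and symmetric (no skewness), the mixing variable is taken to be inverse-gamma with shape and rate both equal to `nu/2` (which makes the exact density a Student t), and `mc_density` is a hypothetical helper name.

```r
# Monte Carlo approximation of a scale-mixture density (univariate, symmetric case).
# Assumption: W ~ inverse-gamma(nu/2, nu/2), so Y = mu + sqrt(W) * sigma * Z
# follows a Student t distribution with nu degrees of freedom.
mc_density <- function(y, mu, sigma, nu, N = 3000) {
  # Draws of the mixing variable W.
  w <- 1 / rgamma(N, shape = nu / 2, rate = nu / 2)
  # Average the conditional normal density of Y given W = w over the draws.
  vapply(y, function(yi) mean(dnorm(yi, mean = mu, sd = sqrt(w) * sigma)),
         numeric(1))
}

set.seed(1)
y <- c(-1, 0.5, 2)
approx_val <- mc_density(y, mu = 0, sigma = 1, nu = 5)  # Monte Carlo values
exact_val <- dt(y, df = 5)                              # closed-form reference
print(cbind(y, approx_val, exact_val))
```

Comparing the two columns gives a feel for how the number of draws `N` controls the accuracy of the approximation.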
dmix(Y, G, weight, model = "restricted", mu, sigma, lambda, family = "constant", skewness = "FALSE", param = NULL, theta = NULL, tick = NULL, N = 3000, log = "FALSE")
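Y | an $n \times d$ matrix of observations. |
G | number of components. |
weight | a vector of weight parameters (or mixing proportions). |
model | it must be "canonical", "restricted", or "unrestricted". By default model = "restricted". |
mu | a list of location vectors of the G components. |
sigma | a list of dispersion matrices of the G components. |
lambda | a list of skewness vectors of the G components, given in matrix form for the canonical and unrestricted models. |
family | name of mixing distribution. By default family = "constant". |
skewness | a logical statement. By default skewness = "FALSE". |
param | name of the elements of theta. |
theta | a list of maximum likelihood estimators for the mixing distribution parameters of each component. |
tick | a binary vector whose length depends on type of family. The elements of tick are either 0 or 1. |
N | an integer number of Monte Carlo draws for approximating the density function. By default N = 3000. |
log | if log = "TRUE", the logarithm of the density is returned. By default log = "FALSE". |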
Monte Carlo approximated values of mixture model density function.
Mahdi Teimouri
Y <- c(1, 2)
G <- 2
weight <- rep( 0.5, 2 )
mu1 <- rep( -5, 2 )
mu2 <- rep( 5, 2 )
sigma1 <- matrix( c( 0.4, -0.20, -0.20, 0.5 ), nrow = 2, ncol = 2 )
sigma2 <- matrix( c( 0.5, 0.20, 0.20, 0.4 ), nrow = 2, ncol = 2 )
lambda1 <- c( 5, -5 )
lambda2 <- c(-5, 5 )
mu <- list( mu1, mu2 )
sigma <- list( sigma1 , sigma2 )
lambda <- list( lambda1, lambda2)
out <- dmix(Y, G, weight, model = "restricted", mu, sigma, lambda, family = "constant",
            skewness = "TRUE", param = NULL, theta = NULL, tick = NULL, N = 3000)
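This function computes the observed Fisher information matrix for a given restricted finite mixture model. For this, we use the method of Basford et al. (1997). The density function of a $G$-component finite mixture model is given by
$$g(\boldsymbol{y} \mid \Psi) = \sum_{g=1}^{G} \omega_g f_{\boldsymbol{Y}}(\boldsymbol{y} \mid \Theta_g),$$
where $\Psi = (\omega_1, \ldots, \omega_G, \Theta_1, \ldots, \Theta_G)^{\top}$ with $\Theta_g = (\boldsymbol{\mu}_g, \Sigma_g, \boldsymbol{\lambda}_g, \boldsymbol{\theta}_g)^{\top}$. Herein, $f_{\boldsymbol{Y}}(\boldsymbol{y} \mid \Theta_g)$ accounts for the density function of random vector $\boldsymbol{Y}$ within the $g$-th component, which admits the representation given by
$$\boldsymbol{Y} \overset{d}{=} \boldsymbol{\mu}_g + \sqrt{W}\, \boldsymbol{\lambda}_g |Z_0| + \sqrt{W}\, \Sigma_g^{1/2} \boldsymbol{Z}_1,$$
where $\boldsymbol{\mu}_g \in \mathbb{R}^{d}$ is the location vector, $\boldsymbol{\lambda}_g \in \mathbb{R}^{d}$ is the skewness vector, and $\Sigma_g$ is a positive definite symmetric dispersion matrix for $g = 1, \ldots, G$. Further, $W$ is a positive random variable with mixing density function $f_W(w \mid \boldsymbol{\theta}_g)$, $Z_0 \sim N(0, 1)$, and $\boldsymbol{Z}_1 \sim N_d(\boldsymbol{0}, I_d)$. We note that $W$, $Z_0$, and $\boldsymbol{Z}_1$ are mutually independent. For approximating the observed Fisher information matrix of the finite mixture models, we use the method of Basford et al. (1997). Based on this method, using observations $\boldsymbol{y}_1, \ldots, \boldsymbol{y}_n$, an approximation of the expected information
$$-E\left\{ \frac{\partial^2 \log L(\Psi)}{\partial \Psi\, \partial \Psi^{\top}} \right\}$$
is given by the observed information as
$$\sum_{i=1}^{n} \hat{\boldsymbol{h}}_i \hat{\boldsymbol{h}}_i^{\top},$$
where
$$\hat{\boldsymbol{h}}_i = \left. \frac{\partial \log L_i(\Psi)}{\partial \Psi} \right|_{\Psi = \hat{\Psi}}$$
and
$$\log L(\hat{\Psi}) = \sum_{i=1}^{n} \log L_i(\hat{\Psi}) = \sum_{i=1}^{n} \log \left\{ \sum_{g=1}^{G} \hat{\omega}_g f_{\boldsymbol{Y}}(\boldsymbol{y}_i \mid \hat{\Theta}_g) \right\}.$$
Herein $\hat{\omega}_g$ and $\hat{\Theta}_g$ denote the maximum likelihood estimators of $\omega_g$ and $\Theta_g$, for $g = 1, \ldots, G$, respectively.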
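The sum of outer products of per-observation score vectors can be illustrated with a small base R sketch. It assumes a univariate two-component normal mixture, central finite differences for the scores, and parameter values that merely stand in for fitted ML estimates; `loglik_i` and `score_matrix` are hypothetical helper names and not part of the package.

```r
# Log-likelihood contribution of each observation under a two-component
# univariate normal mixture; psi = (omega1, mu1, mu2, sd1, sd2).
loglik_i <- function(psi, y) {
  log(psi[1] * dnorm(y, psi[2], psi[4]) +
      (1 - psi[1]) * dnorm(y, psi[3], psi[5]))
}

# Central-difference score vectors, one row per observation (n x p matrix).
score_matrix <- function(psi, y, h = 1e-4) {
  sapply(seq_along(psi), function(j) {
    e <- replace(numeric(length(psi)), j, h)   # perturb the j-th parameter only
    (loglik_i(psi + e, y) - loglik_i(psi - e, y)) / (2 * h)
  })
}

set.seed(2)
y <- c(rnorm(150, -2, 1), rnorm(150, 3, 1.5))
psi_hat <- c(0.5, -2, 3, 1, 1.5)   # assumed stand-in for ML estimates (not fitted)
H <- score_matrix(psi_hat, y)
info <- crossprod(H)               # sum over i of h_i h_i^T
print(round(info, 2))
```

Using `crossprod(H)` avoids an explicit loop over observations, since $H^{\top} H$ equals the sum of the outer products of the rows of $H$.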
ofim1(Y, G, weight, mu, sigma, lambda, family = "constant", skewness = "FALSE", param = NULL, theta = NULL, tick = NULL, h = 0.001, N = 3000, level = 0.05, PDF = NULL )
Y | an $n \times d$ matrix of observations. |
G | number of components. |
weight | a vector of weight parameters (or mixing proportions). |
mu | a list of location vectors of the G components. |
sigma | a list of dispersion matrices of the G components. |
lambda | a list of skewness vectors of the G components. |
family | name of the mixing distribution. By default family = "constant". |
skewness | logical statement. By default skewness = "FALSE". |
param | name of the elements of theta. |
theta | a list of maximum likelihood estimators for the mixing distribution parameters of each component. |
tick | a binary vector whose length depends on type of family. The elements of tick are either 0 or 1. |
h | a small positive value used for computing numerical derivatives. By default h = 0.001. |
N | an integer number for approximating the posterior expected values within the E-step of the EM algorithm through the Monte Carlo method. By default N = 3000. |
level | significance level of the asymptotic confidence intervals. By default level = 0.05. |
PDF | mathematical expression for the mixing density function. By default PDF = NULL. |
A two-part list whose first part is the observed Fisher information matrix for finite mixture model.
Mahdi Teimouri
K. E. Basford, D. R. Greenway, G. J. McLachlan, and D. Peel, (1997). Standard errors of fitted means under normal mixture, Computational Statistics, 12, 1-17.
n <- 100
G <- 2
weight <- rep( 0.5, 2 )
mu1 <- rep(-5 , 2 )
mu2 <- rep( 5 , 2 )
sigma1 <- matrix( c(0.4, -0.20, -0.20, 0.5 ), nrow = 2, ncol = 2 )
sigma2 <- matrix( c(0.5, 0.20, 0.20, 0.4 ), nrow = 2, ncol = 2 )
lambda1 <- c( 5, -5 )
lambda2 <- c(-5, 5 )
mu <- list( mu1, mu2 )
lambda <- list( lambda1, lambda2 )
sigma <- list( sigma1 , sigma2 )
PDF <- quote( (b/2)^(a/2)*x^(-a/2 - 1)/gamma(a/2)*exp( -b/(x*2) ) )
param <- c( "a","b")
theta1 <- c( 10, 12 )
theta2 <- c( 10, 20 )
theta <- list( theta1, theta2 )
tick <- c( 1, 1 )
Y <- rmix(n, G, weight, model = "restricted", mu, sigma, lambda, family = "igamma", theta)
out <- ofim1(Y[, 1:2], G, weight, mu, sigma, lambda, family = "igamma", skewness = "TRUE",
             param, theta, tick, h = 0.001, N = 3000, level = 0.05, PDF)
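This function computes the observed Fisher information matrix for a given unrestricted or canonical finite mixture model. For this, we use the method of Basford et al. (1997). The density function of a $G$-component finite mixture model is given by
$$g(\boldsymbol{y} \mid \Psi) = \sum_{g=1}^{G} \omega_g f_{\boldsymbol{Y}}(\boldsymbol{y} \mid \Theta_g),$$
where $\Psi = (\omega_1, \ldots, \omega_G, \Theta_1, \ldots, \Theta_G)^{\top}$ with $\Theta_g = (\boldsymbol{\mu}_g, \Sigma_g, \Lambda_g, \boldsymbol{\theta}_g)^{\top}$. Herein, $f_{\boldsymbol{Y}}(\boldsymbol{y} \mid \Theta_g)$ accounts for the density function of random vector $\boldsymbol{Y}$ within the $g$-th component, which admits the representation given by
$$\boldsymbol{Y} \overset{d}{=} \boldsymbol{\mu}_g + \sqrt{W}\, \Lambda_g |\boldsymbol{Z}_0| + \sqrt{W}\, \Sigma_g^{1/2} \boldsymbol{Z}_1,$$
where $\boldsymbol{\mu}_g \in \mathbb{R}^{d}$ is the location vector, $\Lambda_g$ is the skewness matrix (a $d \times d$ diagonal matrix in the unrestricted case and a $d \times q$ matrix in the canonical case), and $\Sigma_g$ is a positive definite symmetric dispersion matrix for $g = 1, \ldots, G$. Further, $W$ is a positive random variable with mixing density function $f_W(w \mid \boldsymbol{\theta}_g)$, $\boldsymbol{Z}_0$ is a zero-mean normal random vector truncated to the positive hyperplane whose independent marginals have variance unity, and $\boldsymbol{Z}_1 \sim N_d(\boldsymbol{0}, I_d)$. We note that $W$, $\boldsymbol{Z}_0$, and $\boldsymbol{Z}_1$ are mutually independent. For approximating the observed Fisher information matrix of the finite mixture models, we use the method of Basford et al. (1997). Based on this method, using observations $\boldsymbol{y}_1, \ldots, \boldsymbol{y}_n$, an approximation of the expected information
$$-E\left\{ \frac{\partial^2 \log L(\Psi)}{\partial \Psi\, \partial \Psi^{\top}} \right\}$$
is given by the observed information as
$$\sum_{i=1}^{n} \hat{\boldsymbol{h}}_i \hat{\boldsymbol{h}}_i^{\top},$$
where
$$\hat{\boldsymbol{h}}_i = \left. \frac{\partial \log L_i(\Psi)}{\partial \Psi} \right|_{\Psi = \hat{\Psi}}$$
and
$$\log L(\hat{\Psi}) = \sum_{i=1}^{n} \log L_i(\hat{\Psi}) = \sum_{i=1}^{n} \log \left\{ \sum_{g=1}^{G} \hat{\omega}_g f_{\boldsymbol{Y}}(\boldsymbol{y}_i \mid \hat{\Theta}_g) \right\}.$$
Herein $\hat{\omega}_g$ and $\hat{\Theta}_g$ denote the maximum likelihood estimators of $\omega_g$ and $\Theta_g$, for $g = 1, \ldots, G$, respectively.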
ofim2(Y, G, weight, model, mu, sigma, lambda, family = "constant", skewness = "FALSE", param = NULL, theta = NULL, tick = NULL, h = 0.001, N = 3000, level = 0.05, PDF = NULL )
Y | an $n \times d$ matrix of observations. |
G | number of components. |
weight | a vector of weight parameters (or mixing proportions). |
model | it must be "canonical" or "unrestricted". |
mu | a list of location vectors of the G components. |
sigma | a list of dispersion matrices of the G components. |
lambda | a list of skewness vectors of the G components, given in matrix form. |
family | name of the mixing distribution. By default family = "constant". |
skewness | logical statement. By default skewness = "FALSE". |
param | name of the elements of theta. |
theta | a list of maximum likelihood estimators for the mixing distribution parameters of each component. |
tick | a binary vector whose length depends on type of family. The elements of tick are either 0 or 1. |
h | a small positive value used for computing numerical derivatives. By default h = 0.001. |
N | an integer number for approximating the posterior expected values within the E-step of the EM algorithm through the Monte Carlo method. By default N = 3000. |
level | significance level of the asymptotic confidence intervals. By default level = 0.05. |
PDF | mathematical expression for the mixing density function. By default PDF = NULL. |
A two-part list whose first part is the observed Fisher information matrix for finite mixture model.
Mahdi Teimouri
K. E. Basford, D. R. Greenway, G. J. McLachlan, and D. Peel, (1997). Standard errors of fitted means under normal mixture, Computational Statistics, 12, 1-17.
n <- 100
G <- 2
weight <- rep( 0.5, 2 )
mu1 <- rep(-5 , 2 )
mu2 <- rep( 5 , 2 )
sigma1 <- matrix( c(0.4, -0.20, -0.20, 0.5 ), nrow = 2, ncol = 2 )
sigma2 <- matrix( c(0.5, 0.20, 0.20, 0.4 ), nrow = 2, ncol = 2 )
lambda1 <- diag( c( 5, -5 ) )
lambda2 <- diag( c(-5, 5 ) )
mu <- list( mu1, mu2 )
lambda <- list( lambda1, lambda2 )
sigma <- list( sigma1 , sigma2 )
PDF <- quote( (b/2)^(a/2)*x^(-a/2 - 1)/gamma(a/2)*exp( -b/(x*2) ) )
param <- c( "a","b")
theta1 <- c( 10, 12 )
theta2 <- c( 10, 20 )
theta <- list( theta1, theta2 )
tick <- c( 1, 1 )
Y <- rmix(n, G, weight, model = "unrestricted", mu, sigma, lambda, family = "igamma", theta)
out <- ofim2(Y[, 1:2], G, weight, model = "unrestricted", mu, sigma, lambda,
             family = "igamma", skewness = "TRUE", param, theta, tick, h = 0.001, N = 3000,
             level = 0.05, PDF)
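The density function of a restricted $G$-component finite mixture model can be represented as
$$g(\boldsymbol{y} \mid \Psi) = \sum_{g=1}^{G} \omega_g f_{\boldsymbol{Y}}(\boldsymbol{y} \mid \Theta_g),$$
where positive constants $\omega_1, \ldots, \omega_G$ are called weight (or mixing proportion) parameters with the property that $\sum_{g=1}^{G} \omega_g = 1$, and $\Psi = (\omega_1, \ldots, \omega_G, \Theta_1, \ldots, \Theta_G)^{\top}$ with $\Theta_g = (\boldsymbol{\mu}_g, \Sigma_g, \boldsymbol{\lambda}_g, \boldsymbol{\theta}_g)^{\top}$. Herein, $f_{\boldsymbol{Y}}(\boldsymbol{y} \mid \Theta_g)$ accounts for the density function of random vector $\boldsymbol{Y}$ within the $g$-th component, which admits the representation given by
$$\boldsymbol{Y} \overset{d}{=} \boldsymbol{\mu}_g + \sqrt{W}\, \boldsymbol{\lambda}_g |Z_0| + \sqrt{W}\, \Sigma_g^{1/2} \boldsymbol{Z}_1,$$
where $\boldsymbol{\mu}_g \in \mathbb{R}^{d}$ is the location vector, $\boldsymbol{\lambda}_g \in \mathbb{R}^{d}$ is the skewness vector, and $\Sigma_g$ is a positive definite symmetric dispersion matrix for $g = 1, \ldots, G$. Further, $W$ is a positive random variable with mixing density function $f_W(w \mid \boldsymbol{\theta}_g)$, $Z_0 \sim N(0, 1)$, and $\boldsymbol{Z}_1 \sim N_d(\boldsymbol{0}, I_d)$. We note that $W$, $Z_0$, and $\boldsymbol{Z}_1$ are mutually independent.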
rmix(n, G, weight, model = "restricted", mu, sigma, lambda, family = "constant", theta = NULL)
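n | number of realizations. |
G | number of components. |
weight | a vector of weight parameters (or mixing proportions). |
model | it must be "canonical", "restricted", or "unrestricted". By default model = "restricted". |
mu | a list of location vectors of the G components. |
sigma | a list of dispersion matrices of the G components. |
lambda | a list of skewness vectors of the G components, given in matrix form for the canonical and unrestricted models. |
family | name of mixing distribution. By default family = "constant". |
theta | a list of maximum likelihood estimator(s) for the mixing distribution parameters of each component. By default theta = NULL. |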
A matrix with $n$ rows and $d + 1$ columns. The first $d$ columns constitute $n$ realizations from random vector $\boldsymbol{Y}$ and the last column is the label of realization $\boldsymbol{y}_i$ (for $i = 1, \ldots, n$) indicating the component that $\boldsymbol{y}_i$ is coming from.
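The restricted stochastic representation translates almost line by line into a simulator. The sketch below is a base R illustration under simplifying assumptions: the mixing variable is fixed at $W = 1$ (a skew-normal mixture), the Cholesky factor is used as the square root of each dispersion matrix, and `rmix_sketch` is a hypothetical helper, not the package's `rmix`.

```r
# Draw n realizations from a G-component restricted skew-normal mixture (W = 1).
# Hypothetical helper written for illustration; it is not the package's rmix().
rmix_sketch <- function(n, weight, mu, sigma, lambda) {
  d <- length(mu[[1]])
  # Component label of each realization, drawn with the mixing proportions.
  label <- sample(seq_along(weight), n, replace = TRUE, prob = weight)
  out <- matrix(NA_real_, nrow = n, ncol = d + 1)
  for (i in seq_len(n)) {
    g <- label[i]
    root <- t(chol(sigma[[g]]))   # lower-triangular factor with root %*% t(root) = Sigma_g
    z0 <- abs(rnorm(1))           # half-normal |Z_0|
    z1 <- rnorm(d)                # standard normal vector Z_1
    out[i, 1:d] <- mu[[g]] + lambda[[g]] * z0 + drop(root %*% z1)
  }
  out[, d + 1] <- label           # last column stores the component label
  out
}

set.seed(3)
mu <- list(rep(-5, 2), rep(5, 2))
sigma <- list(matrix(c(0.4, -0.2, -0.2, 0.5), 2, 2),
              matrix(c(0.5, 0.2, 0.2, 0.4), 2, 2))
lambda <- list(c(5, -5), c(-5, 5))
Y <- rmix_sketch(100, weight = c(0.5, 0.5), mu, sigma, lambda)
print(head(Y))
```

The output has the same layout as described above: $d$ data columns followed by one label column.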
Mahdi Teimouri
weight <- rep( 0.5, 2 )
mu1 <- rep(-5 , 2 )
mu2 <- rep( 5 , 2 )
sigma1 <- matrix( c( 0.4, -0.20, -0.20, 0.4 ), nrow = 2, ncol = 2 )
sigma2 <- matrix( c( 0.4, 0.10, 0.10, 0.4 ), nrow = 2, ncol = 2 )
lambda1 <- matrix( c( -4, -2, 2, 5 ), nrow = 2, ncol = 2 )
lambda2 <- matrix( c( 4, 2, -2, -5 ), nrow = 2, ncol = 2 )
theta1 <- c( 10, 10 )
theta2 <- c( 20, 20 )
mu <- list( mu1, mu2 )
sigma <- list( sigma1 , sigma2 )
lambda <- list( lambda1, lambda2)
theta <- list( theta1 , theta2 )
Y <- rmix( n = 100, G = 2, weight, model = "canonical", mu, sigma, lambda, family = "igamma", theta )
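The density function of a $G$-component finite mixture model can be represented as
$$g(\boldsymbol{y} \mid \Psi) = \sum_{g=1}^{G} \omega_g f_{\boldsymbol{Y}}(\boldsymbol{y} \mid \Theta_g),$$
where positive constants $\omega_1, \ldots, \omega_G$ are called weight (or mixing proportion) parameters with the property that $\sum_{g=1}^{G} \omega_g = 1$, and $\Psi = (\omega_1, \ldots, \omega_G, \Theta_1, \ldots, \Theta_G)^{\top}$ with $\Theta_g = (\boldsymbol{\mu}_g, \Sigma_g, \boldsymbol{\lambda}_g, \boldsymbol{\theta}_g)^{\top}$. Herein, $f_{\boldsymbol{Y}}(\boldsymbol{y} \mid \Theta_g)$ accounts for the density function of random vector $\boldsymbol{Y}$ within the $g$-th component, which admits the representation given by
$$\boldsymbol{Y} \overset{d}{=} \boldsymbol{\mu}_g + \sqrt{W}\, \boldsymbol{\lambda}_g |Z_0| + \sqrt{W}\, \Sigma_g^{1/2} \boldsymbol{Z}_1,$$
where $\boldsymbol{\mu}_g \in \mathbb{R}^{d}$ is the location vector, $\boldsymbol{\lambda}_g \in \mathbb{R}^{d}$ is the skewness vector, and $\Sigma_g$ is a positive definite symmetric dispersion matrix for $g = 1, \ldots, G$. Further, $W$ is a positive random variable with mixing density function $f_W(w \mid \boldsymbol{\theta}_g)$, $Z_0 \sim N(0, 1)$, and $\boldsymbol{Z}_1 \sim N_d(\boldsymbol{0}, I_d)$. We note that $W$, $Z_0$, and $\boldsymbol{Z}_1$ are mutually independent. For approximating the asymptotic standard error for parameters of the finite mixture model based on the observed Fisher information matrix, we use the method of Basford et al. (1997). In fact, the covariance matrix of the maximum likelihood (ML) estimator $\hat{\Psi}$ can be approximated by the inverse of the observed information matrix as
$$\mathrm{cov}(\hat{\Psi}) \approx \left( \sum_{i=1}^{n} \hat{\boldsymbol{h}}_i \hat{\boldsymbol{h}}_i^{\top} \right)^{-1},$$
where
$$\hat{\boldsymbol{h}}_i = \left. \frac{\partial \log L_i(\Psi)}{\partial \Psi} \right|_{\Psi = \hat{\Psi}}$$
and
$$\log L(\hat{\Psi}) = \sum_{i=1}^{n} \log L_i(\hat{\Psi}) = \sum_{i=1}^{n} \log \left\{ \sum_{g=1}^{G} \hat{\omega}_g f_{\boldsymbol{Y}}(\boldsymbol{y}_i \mid \hat{\Theta}_g) \right\}.$$
Herein $\hat{\omega}_g$ and $\hat{\Theta}_g$, for $g = 1, \ldots, G$, denote the ML estimators of $\omega_g$ and $\Theta_g$, respectively.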
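How the inverse of the observed information turns into standard errors and confidence bounds can be shown with a deliberately simple base R sketch. It assumes a single normal sample with parameters (mean, standard deviation) and analytic scores rather than a mixture model, so it only illustrates the last step of the procedure; all object names are illustrative.

```r
set.seed(3)
y <- rnorm(500, mean = 10, sd = 2)
n <- length(y)
mu_hat <- mean(y)
sd_hat <- sqrt(mean((y - mu_hat)^2))   # ML estimate of the standard deviation

# Analytic score of log N(y_i; mu, sd) with respect to (mu, sd), one row per observation.
H <- cbind((y - mu_hat) / sd_hat^2,
           ((y - mu_hat)^2 - sd_hat^2) / sd_hat^3)
info <- crossprod(H)                   # observed information: sum over i of h_i h_i^T
se <- sqrt(diag(solve(info)))          # asymptotic standard errors

level <- 0.05                          # significance level, giving 95 percent intervals
z <- qnorm(1 - level / 2)
est <- c(mu_hat, sd_hat)
ci <- cbind(estimate = est, se = se, lower = est - z * se, upper = est + z * se)
rownames(ci) <- c("mu", "sd")
print(ci)
```

For the mean, the resulting standard error should be close to the familiar `sd_hat / sqrt(n)`, which is a quick sanity check on the construction.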
sefm(Y, G, weight, model = "restricted", mu, sigma, lambda, family = "constant", skewness = "FALSE", param = NULL, theta = NULL, tick = NULL, h = 0.001, N = 3000, level = 0.05, PDF = NULL)
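Y | an $n \times d$ matrix of observations. |
G | number of components. |
weight | a vector of weight parameters (or mixing proportions). |
model | it must be "canonical", "restricted", or "unrestricted". By default model = "restricted". |
mu | a list of location vectors of the G components. |
sigma | a list of dispersion matrices of the G components. |
lambda | a list of skewness vectors of the G components, given in matrix form for the canonical and unrestricted models. |
family | name of mixing distribution. By default family = "constant". |
skewness | a logical statement. By default skewness = "FALSE". |
param | name of the elements of theta. |
PDF | mathematical expression for the mixing density function. By default PDF = NULL. |
theta | a list of maximum likelihood estimators for the mixing distribution parameters of each component. |
tick | a binary vector whose length depends on type of family. The elements of tick are either 0 or 1. |
h | a small positive value used for computing numerical derivatives. By default h = 0.001. |
N | an integer number for approximating the posterior expected values within the E-step of the EM algorithm through the Monte Carlo method. By default N = 3000. |
level | significance level of the asymptotic confidence intervals. By default level = 0.05. |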
The mixing distributions with density function $f_W(w \mid \boldsymbol{\theta})$ available through family are "bs" (for Birnbaum-Saunders), "burriii" (for Burr type iii), "chisq" (for chi-square), "exp" (for exponential), "f" (for Fisher), "gamma" (for gamma), "gig" (for generalized inverse-Gaussian), "igamma" (for inverse-gamma), "igaussian" (for inverse-Gaussian), "lindley" (for Lindley), "loglog" (for log-logistic), "lognorm" (for log-normal), "lomax" (for Lomax), "pstable" (for positive $\alpha$-stable), "ptstable" (for polynomially tilted $\alpha$-stable), "rayleigh" (for Rayleigh), and "weibull" (for Weibull). We note that the density functions of the "pstable" and "ptstable" families have no closed form and so are not represented here. The parameters of the remaining families are as follows.
Family | Parameters |
---|---|
"bs" | first and second parameters |
"burriii" | first and second parameters |
"chisq" | degrees of freedom |
"exp" | rate |
"f" | first and second degrees of freedom (the density involves the ordinary beta function) |
"gamma" | shape and rate |
"gig" | first, second, and third parameters (the density involves the modified Bessel function of the third kind) |
"igamma" | shape and scale |
"igaussian" | first (mean) and second (shape) parameters |
"lindley" | a single parameter |
"loglog" | shape and scale (median) |
"lognorm" | first and second parameters |
"lomax" | shape and rate |
"rayleigh" | scale |
"weibull" | shape and scale |
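A mixing density can be written as a quoted R expression in the mixing variable together with a vector of parameter names, which is the pattern the package examples follow for `PDF` and `param`. The sketch below uses a gamma density with shape `a` and rate `b` purely as an assumed illustration, and `eval_pdf` is a hypothetical helper; it shows one way to evaluate such an expression and to check that it integrates to one before passing it on.

```r
# Mixing density written as a quoted expression in w, with named parameters.
# Assumption: a gamma density with shape a and rate b, used only as an illustration.
PDF <- quote(b^a / gamma(a) * w^(a - 1) * exp(-b * w))
param <- c("a", "b")
theta <- c(2, 3)

# Evaluate the expression at w by binding the parameter names to their values.
eval_pdf <- function(w, PDF, param, theta) {
  env <- c(list(w = w), setNames(as.list(theta), param))
  eval(PDF, env)
}

# A valid density should integrate to one over the positive half-line.
total <- integrate(eval_pdf, lower = 0, upper = Inf,
                   PDF = PDF, param = param, theta = theta)$value
print(total)
```

A check of this kind catches sign or normalizing-constant mistakes in a hand-written expression early.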
In what follows, we give four examples. In the first, second, and third examples, three mixture models, namely the three-component normal, two-component restricted skew $t$, and two-component restricted skew sub-Gaussian $\alpha$-stable (SSG) mixture models, are fitted to iris, AIS, and bankruptcy data, respectively. In order to approximate the asymptotic standard error of the model parameters, the ML estimators for parameters of the skew $t$ and SSG mixture models have been computed through the R packages EMMIXcskew (developed by Lee and McLachlan (2018) for skew $t$) and mixSSG (developed by Teimouri (2022) for skew sub-Gaussian $\alpha$-stable). To avoid running package mixSSG, we use the ML estimators corresponding to bankruptcy data provided by Teimouri (2022). The package mixSSG is available at https://CRAN.R-project.org/package=mixSSG. In the fourth example, we apply a three-component generalized hyperbolic mixture model to wheat data. The ML estimators of this mixture model have been obtained using the R package MixGHD (Tortora et al., 2021), available at https://cran.r-project.org/package=MixGHD. Finally, we note that if parameter h is very small (less than 0.001, say), then the approximated observed Fisher information matrix may not be invertible.
A list consisting of the maximum likelihood estimator, the approximated asymptotic standard error, and the upper and lower bounds of the asymptotic confidence interval for the parameters of the finite mixture model.
Mahdi Teimouri
K. E. Basford, D. R. Greenway, G. J. McLachlan, and D. Peel, (1997). Standard errors of fitted means under normal mixture, Computational Statistics, 12, 1-17.
S. X. Lee and G. J. McLachlan, (2018). EMMIXcskew: An R package for the fitting of a mixture of canonical fundamental skew t-distributions, Journal of Statistical Software, 83(3), 1-32, doi:10.18637/jss.v083.i03.
M. Teimouri, (2022). Finite mixture of skewed sub-Gaussian stable distributions, https://arxiv.org/abs/2205.14067.
C. Tortora, R. P. Browne, A. ElSherbiny, B. C. Franczak, and P. D. McNicholas, (2021). Model-based clustering, classification, and discriminant analysis using the generalized hyperbolic distribution: MixGHD R package. Journal of Statistical Software, 98(3), 1-24, doi:10.18637/jss.v098.i03.
# Example 1: Approximating the asymptotic standard error and 95 percent confidence interval
# for the parameters of fitted three-component normal mixture model to iris data.
Y <- as.matrix( iris[, 1:4] )
colnames(Y) <- NULL
rownames(Y) <- NULL
G <- 3
weight <- c( 0.334, 0.300, 0.366 )
mu1 <- c( 5.0060, 3.428, 1.462, 0.246 )
mu2 <- c( 5.9150, 2.777, 4.204, 1.298 )
mu3 <- c( 6.5468, 2.949, 5.482, 1.985 )
sigma1 <- matrix( c( 0.133, 0.109, 0.019, 0.011,
                     0.109, 0.154, 0.012, 0.010,
                     0.019, 0.012, 0.028, 0.005,
                     0.011, 0.010, 0.005, 0.010 ), nrow = 4, ncol = 4 )
sigma2 <- matrix( c( 0.225, 0.076, 0.146, 0.043,
                     0.076, 0.080, 0.073, 0.034,
                     0.146, 0.073, 0.166, 0.049,
                     0.043, 0.034, 0.049, 0.033 ), nrow = 4, ncol = 4 )
sigma3 <- matrix( c( 0.429, 0.107, 0.334, 0.065,
                     0.107, 0.115, 0.089, 0.061,
                     0.334, 0.089, 0.364, 0.087,
                     0.065, 0.061, 0.087, 0.086 ), nrow = 4, ncol = 4 )
mu <- list( mu1, mu2, mu3 )
sigma <- list( sigma1, sigma2, sigma3 )
lambda <- list( rep(0, 4), rep(0, 4), rep(0, 4) )
out1 <- sefm( Y, G, weight, model = "restricted", mu, sigma, lambda, family = "constant",
              skewness = "FALSE")

# Example 2: Approximating the asymptotic standard error and 95 percent confidence interval
# for the parameters of fitted two-component restricted skew t mixture model to
# AIS data.
data( AIS )
Y <- as.matrix( AIS[, 2:3] )
G <- 2
weight <- c( 0.5075, 0.4925 )
mu1 <- c( 19.9827, 17.8882 )
mu2 <- c( 21.7268, 5.7518 )
sigma1 <- matrix( c(3.4915, 8.3941, 8.3941, 28.8113 ), nrow = 2, ncol = 2 )
sigma2 <- matrix( c(2.2979, 0.0622, 0.0622, 0.0120 ), nrow = 2, ncol = 2 )
lambda1 <- c( 2.5186, -0.2898 )
lambda2 <- c( 2.1681, 3.5518 )
theta1 <- c( 68.3088 )
theta2 <- c( 3.8159 )
mu <- list( mu1, mu2 )
sigma <- list( sigma1, sigma2 )
lambda <- list( lambda1, lambda2 )
theta <- list( theta1, theta2 )
param <- c( "nu" )
PDF <- quote( (nu/2)^(nu/2)*w^(-nu/2 - 1)/gamma(nu/2)*exp( -nu/(w*2) ) )
tick <- c( 1, 1 )
out2 <- sefm( Y, G, weight, model = "restricted", mu, sigma, lambda, family = "igamma",
              skewness = "TRUE", param, theta, tick, h = 0.001, N = 3000, level = 0.05, PDF )

# Example 3: Approximating the asymptotic standard error and 95 percent confidence interval
# for the parameters of fitted two-component restricted skew sub-Gaussian
# alpha-stable mixture model to bankruptcy data.
data( bankruptcy )
Y <- as.matrix( bankruptcy[, 2:3] ); colnames(Y) <- NULL; rownames(Y) <- NULL
G <- 2
weight <- c( 0.553, 0.447 )
mu1 <- c( -3.649, -0.085 )
mu2 <- c( 40.635, 19.042 )
sigma1 <- matrix( c(1427.071, -155.356, -155.356, 180.991 ), nrow = 2, ncol = 2 )
sigma2 <- matrix( c( 213.938, 9.256, 9.256, 74.639 ), nrow = 2, ncol = 2 )
lambda1 <- c( -41.437, -21.750 )
lambda2 <- c( -3.666, -1.964 )
theta1 <- c( 1.506 )
theta2 <- c( 1.879 )
mu <- list( mu1, mu2 )
sigma <- list( sigma1, sigma2 )
lambda <- list( lambda1, lambda2 )
theta <- list( theta1, theta2 )
param <- c( "alpha" )
tick <- c( 1 )
out3 <- sefm( Y, G, weight, model = "restricted", mu, sigma, lambda, family = "pstable",
              skewness = "TRUE", param, theta, tick, h = 0.01, N = 3000, level = 0.05 )

# Example 4: Approximating the asymptotic standard error and 95 percent confidence interval
# for the parameters of fitted three-component unrestricted generalized hyperbolic
# mixture model to wheat data.
data( wheat )
Y <- as.matrix( wheat[, 1:7] ); colnames(Y) <- NULL; rownames(Y) <- NULL
G <- 3
weight <- c( 0.325, 0.341, 0.334 )
mu1 <- c( 18.8329, 16.2235, 0.9001, 6.0826, 3.8170, 1.6604, 6.0260 )
mu2 <- c( 11.5607, 13.1160, 0.8446, 5.1873, 2.7685, 4.9884, 5.2203 )
mu3 <- c( 13.8071, 14.0720, 0.8782, 5.5016, 3.1513, 0.6575, 4.9111 )
lambda1 <- diag( c( 0.1308, 0.2566,-0.0243, 0.2625,-0.1259, 3.3111, 0.1057) )
lambda2 <- diag( c( 0.7745, 0.3084, 0.0142, 0.0774, 0.1989,-1.0591,-0.2792) )
lambda3 <- diag( c( 2.0956, 0.9718, 0.0042, 0.2137, 0.2957, 3.9484, 0.6209) )
theta1 <- c( -3.3387, 4.2822 )
theta2 <- c( -3.6299, 4.5249 )
theta3 <- c( -3.9131, 5.8562 )
sigma1 <- matrix( c(
   1.2936219, 0.5841467,-0.0027135, 0.2395983, 0.1271193, 0.2263583, 0.2105204,
   0.5841467, 0.2952009,-0.0045937, 0.1345133, 0.0392849, 0.0486487, 0.1222547,
  -0.0027135,-0.0045937, 0.0003672,-0.0033093, 0.0016788, 0.0056345,-0.0033742,
   0.2395983, 0.1345133,-0.0033093, 0.0781141, 0.0069283,-0.0500718, 0.0747912,
   0.1271193, 0.0392849, 0.0016788, 0.0069283, 0.0266365, 0.0955757, 0.0002497,
   0.2263583, 0.0486487, 0.0056345,-0.0500718, 0.0955757, 1.9202036,-0.0455763,
   0.2105204, 0.1222547,-0.0033742, 0.0747912, 0.0002497,-0.0455763, 0.0893237 ),
  nrow = 7, ncol = 7 )
sigma2 <- matrix( c(
   0.9969975, 0.4403820, 0.0144607, 0.1139573, 0.1639597,-0.2216050, 0.0499885,
   0.4403820, 0.2360065, 0.0010769, 0.0817149, 0.0525057,-0.0320012, 0.0606147,
   0.0144607, 0.0010769, 0.0008914,-0.0023864, 0.0049263,-0.0122188,-0.0042375,
   0.1139573, 0.0817149,-0.0023864, 0.0416206, 0.0030268, 0.0490919, 0.0407972,
   0.1639597, 0.0525057, 0.0049263, 0.0030268, 0.0379771,-0.0384626,-0.0095661,
  -0.2216050,-0.0320012,-0.0122188, 0.0490919,-0.0384626, 4.0868766, 0.1459766,
   0.0499885, 0.0606147,-0.0042375, 0.0407972,-0.0095661, 0.1459766, 0.0661900 ),
  nrow = 7, ncol = 7 )
sigma3 <- matrix( c(
   1.1245716, 0.5527725,-0.0005064, 0.2083688, 0.1190222,-0.4491047, 0.2494994,
   0.5527725, 0.3001219,-0.0036794, 0.1295874, 0.0419470,-0.1926131, 0.1586538,
  -0.0005064,-0.0036794, 0.0004159,-0.0034247, 0.0019652,-0.0026687,-0.0044963,
   0.2083688, 0.1295874,-0.0034247, 0.0715283, 0.0055925,-0.0238820, 0.0867129,
   0.1190222, 0.0419470, 0.0019652, 0.0055925, 0.0243991,-0.0715797, 0.0026836,
  -0.4491047,-0.1926131,-0.0026687,-0.0238820,-0.0715797, 1.5501246,-0.0048728,
   0.2494994, 0.1586538,-0.0044963, 0.0867129, 0.0026836,-0.0048728, 0.1509183 ),
  nrow = 7, ncol = 7 )
mu <- list( mu1, mu2, mu3 )
sigma <- list( sigma1 , sigma2, sigma3 )
lambda <- list( lambda1, lambda2, lambda3 )
theta <- list( theta1 , theta2, theta3 )
tick <- c( 1, 1, 0 )
param <- c( "a", "b" )
PDF <- quote( 1/( 2*besselK( b, a ) )*w^(a - 1)*exp( -b/2*(1/w + w) ) )
out4 <- sefm( Y, G, weight, model = "unrestricted", mu, sigma, lambda, family = "gigaussian",
              skewness = "TRUE", param, theta, tick, h = 0.001, N = 3000, level = 0.05, PDF )
These data concern 210 wheat grains belonging to three different varieties (Kama, Rosa, and Canadian), on which 7 quantitative variables related to the kernel structure, detected using a soft X-ray visualization technique, have been measured. These variables are: area, perimeter, compactness, length of kernel, width of kernel, asymmetry coefficient, length of kernel groove, and the class label variable variety.
data(wheat)
A text file with 8 columns.
P. Giordani, M. B. Ferraro and F. Martella, (2020). An Introduction to Clustering with R, Springer, Singapore.
data(wheat)