Package 'randomMachines'

Title: Ensemble Modeling Using Random Machines
Description: A novel ensemble method employing Support Vector Machines (SVMs) as base learners. This powerful ensemble model is designed for both classification (Ara A., et al., 2021) <doi:10.6339/21-JDS1014> and regression (Ara A., et al., 2022) <doi:10.1016/j.eswa.2022.117107> problems, offering versatility and robust performance across different datasets, and has been compared with other consolidated methods such as Random Forests (Maia M., et al., 2021) <doi:10.6339/21-JDS1025>.
Authors: Mateus Maia [aut, cre], Anderson Ara [ctb], Gabriel Ribeiro [ctb]
Maintainer: Mateus Maia <[email protected]>
License: MIT + file LICENSE
Version: 0.1.0
Built: 2025-02-13 05:43:19 UTC
Source: https://github.com/mateusmaiads/randommachines

Help Index


Bolsa Família Dataset

Description

The 'bolsafam' dataset contains information about the utilization rate of the Bolsa Família program in Brazilian municipalities. The utilization rate $y_i$ is defined as the number of people benefiting from the assistance divided by the total population of the city.

Usage

data(bolsafam)

Format

A data frame with 5564 rows and 11 columns.

Details

This dataset includes the following columns:

y

Rate of use of the social assistance program by municipality.

COD_UF

Code to identify the Brazilian state to which the city belongs.

T_DENS

Percentage of the population living in households with a density greater than 2 people per bedroom.

TRABSC

Percentage of employed persons aged 18 or over who are employed without a formal contract.

PPOB

Proportion of people vulnerable to poverty.

T_NESTUDA_NTRAB_MMEIO

Percentage of people aged 15 to 24 who do not study or work and are vulnerable to poverty.

T_FUND15A17

Percentage of the population aged 15 to 17 with complete primary education.

RAZDEP

Dependency ratio.

T_ATRASO_0_BASICO

Percentage of the population aged 6 to 17 attending basic education who do not have an age-grade delay.

T_AGUA

Percentage of the population living in households with running water.

REGIAO

Aggregation of states according to the regions defined by IBGE.

Source

The 'bolsafam' dataset is sourced from the Brazilian government's transparency website, the Transparency Portal (Portal da Transparência).

References

Mateus Maia & Anderson Ara (2023). rmachines: Random Machines: a package for a support vector ensemble based on random kernel space. R package version 0.1.0.

Examples

data(bolsafam)
head(bolsafam)
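
A hedged sketch of a regression fit on this dataset (the predictor subset is illustrative only; column names are as documented above):

library(randomMachines)
data(bolsafam)
# Regression Random Machines on a few documented predictors
rm_bolsa <- randomMachines(y ~ T_DENS + TRABSC + PPOB + T_AGUA, train = bolsafam)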

Brier Score function

Description

Calculate the Brier Score for a set of predicted probabilities and observed outcomes. The Brier Score is a measure of the accuracy of probabilistic predictions. It is commonly used in the evaluation of predictive models.

Usage

brier_score(prob, observed, levels)

Arguments

prob

predicted probabilities

observed

observed values of $y$ (it is assumed that the positive class is coded as 1 and the negative class as 0)

levels

A string vector with the original levels from the target variable

Value

Returns the Brier Score, a numeric value indicating the accuracy of the predictions.
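
Examples

A minimal sketch of the classical computation behind the score; the direct call to brier_score() is commented out because the expected level coding is an assumption (positive class coded as 1, per the Arguments above):

prob <- c(0.9, 0.1, 0.8, 0.3)   # predicted probabilities of the positive class
observed <- c(1, 0, 1, 1)       # observed 0/1 outcomes
mean((prob - observed)^2)       # classical Brier score by hand: 0.1375
# Assumed equivalent call (level order is an assumption):
# brier_score(prob, observed, levels = c("0", "1"))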


Ionosphere Dataset

Description

The 'ionosphere' dataset contains radar data for the classification of radar returns as either 'good' or 'bad'.

Usage

data(ionosphere)

Format

A data frame with 351 rows and 35 columns.

Details

This dataset includes the following columns:

X1-X34

Features extracted from radar signals.

y

Class label indicating whether the radar return is 'g' (good) or 'b' (bad).

Source

The 'ionosphere' dataset is sourced from the UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/ionosphere

Examples

data(ionosphere)
head(ionosphere)
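
A hedged sketch of a classification fit on this dataset (illustrative only; uses the documented target column 'y' and the default settings):

library(randomMachines)
data(ionosphere)
rm_ion <- randomMachines(y ~ ., train = ionosphere)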

Prediction function for the rm_class_model

Description

This function predicts the outcome for an RM model object using new data.

Usage

## S4 method for signature 'rm_class'
predict(object, newdata)

Arguments

object

A fitted RM model object of class rm_class.

newdata

A data frame or matrix containing the new data to be predicted.

Value

A vector of predicted outcomes: probabilities in case of 'prob_model = TRUE' and classes in case of 'prob_model = FALSE'.

Examples

# Generating a sample for the simulation
library(randomMachines)
sim_data <- sim_class(n = 75)
sim_new <- sim_class(n = 25)
rm_mod <- randomMachines(y~., train = sim_data)
y_hat <- predict(rm_mod, newdata = sim_new)

Prediction function for the rm_reg_model

Description

This function predicts the outcome for an RM model object using new data, for a continuous target $y$.

Usage

## S4 method for signature 'rm_reg'
predict(object, newdata)

Arguments

object

A fitted RM model object of class rm_reg.

newdata

A data frame or matrix containing the new data to be predicted.

Value

Predicted values for the newdata object from the Random Machines model.

Examples

# Generating a sample for the simulation
library(randomMachines)
sim_data <- sim_reg1(n = 75)
sim_new <- sim_reg1(n = 25)
rm_mod_reg <- randomMachines(y~., train = sim_data)
y_hat <- predict(rm_mod_reg, newdata = sim_new)

Random Machines

Description

Random Machines is an ensemble model that combines different kernel functions to increase the diversity of the bagging approach, improving the predictions in general. Random Machines was developed for classification and regression problems by bagging support vector models built with multiple kernel functions.

Random Machines uses SVMs (Cortes and Vapnik, 1995) as base learners in the bagging procedure with a random sample of kernel functions to build them.

Let a training sample be given by $(\boldsymbol{x}_i, y_i)$, with $i = 1, \dots, n$ observations, where $\boldsymbol{x}_i$ is the vector of independent variables and $y_i$ the dependent one. The kernel bagging method is initialized by training the $r$-th single learner, for $r = 1, \dots, R$, where $R$ is the total number of different kernel functions that could be used in support vector models. In this implementation the default value is $R = 4$ (Gaussian, polynomial, Laplacian and linear). See more details below.

Each single learner is internally validated and the weights $\lambda_r$ are calculated proportionally to the strength of its predictive performance.

Afterwards, $B$ bootstrap samples are drawn from the training set. A support vector machine model $g_b$ is trained on each bootstrap sample, $b = 1, \dots, B$, and the kernel function used for $g_b$ is chosen at random with probability $\lambda_r$. The final weight $w_b$ in the bagging procedure is calculated from the out-of-bag samples.

The final model $G(\boldsymbol{x}_i)$ for a new $\boldsymbol{x}_i$ is given below.

The weights $\lambda_r$ and $w_b$ are calculated differently for each task (classification, probabilistic classification and regression). See more details in the references.

  • For a binary classification problem, $G(\boldsymbol{x}_i) = \mathrm{sgn}\left(\sum_{b=1}^{B} w_b g_b(\boldsymbol{x}_i)\right)$, where the $g_b$ are single binary classification outputs;

  • For a probabilistic binary classification problem, $G(\boldsymbol{x}_i) = \sum_{b=1}^{B} w_b g_b(\boldsymbol{x}_i)$, where the $g_b$ are single probabilistic classification outputs;

  • For a regression problem, $G(\boldsymbol{x}_i) = \sum_{b=1}^{B} w_b g_b(\boldsymbol{x}_i)$, where the $g_b$ are single regression outputs. A toy numerical illustration of this aggregation follows.
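
As a toy illustration of the weighted aggregation above (not the package's internal code), assuming the single learners output class labels coded as -1/+1:

w <- c(0.5, 0.3, 0.2)   # bootstrap weights w_b
g <- c(+1, -1, +1)      # single-learner outputs g_b(x_i)
sign(sum(w * g))        # ensemble prediction G(x_i): here +1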

Usage

randomMachines(
     formula,
     train,
     validation,
     B = 25,
     cost = 1,
     automatic_tuning = FALSE,
     gamma_rbf = 1,
     gamma_lap = 1,
     degree = 2,
     poly_scale = 1,
     offset = 0,
     gamma_cau = 1,
     d_t = 2,
     kernels = c("rbfdot", "polydot", "laplacedot", "vanilladot"),
     prob_model = TRUE,
     loss_function = RMSE,
     epsilon = 0.1,
     beta = 2
)

Arguments

formula

an object of class formula: it should contain a symbolic description of the model to be fitted, indicating the dependent variable and all predictors that should be included.

train

the training data $\{(\mathbf{x}_i, y_i)\}_{i=1}^{n}$ used to train the model.

validation

the validation data $\{(\mathbf{x}_i, y_i)\}_{i=1}^{V}$ used to calculate the probabilities $\lambda_r$. If validation = NULL, a 0.25 partition of the training data is selected as the validation set, and the remaining partition is used as the new training sample.

B

number of bootstrap samples. The default value is B=25.

cost

the $C$ constant of the regularization term on soft margins in support vector models. The default value is cost=1.

automatic_tuning

boolean to define whether the kernel hyperparameters will be selected using sigest() from the kernlab package (as used by the ksvm function). The default value is FALSE.

gamma_rbf

the hyperparameter $\gamma_g$ used in the RBF kernel. The default value is gamma_rbf=1.

gamma_lap

the hyperparameter $\gamma_\ell$ used in the Laplacian kernel. The default value is gamma_lap=1.

degree

the degree used in the Polynomial kernel. The default value is degree=2.

poly_scale

the scale parameter from the Polynomial kernel. The default value is poly_scale=1.

offset

the offset parameter from the Polynomial kernel. The default value is offset=0.

gamma_cau

the hyperparameter $\gamma_c$ used in the Cauchy kernel. The default value is gamma_cau=1.

d_t

the $d_t$-norm of the t-Student kernel. The default value is d_t=2.

kernels

a vector with the names of the kernel functions to be used in the Random Machines model. The default includes the kernel functions c("rbfdot", "polydot", "laplacedot", "vanilladot"). The additional kernel functions "cauchydot" and "tdot" are exclusive to the binary classification setting.

prob_model

a boolean to define whether the algorithm uses a probabilistic approach to define the predictions (default = TRUE).

loss_function

Defines which loss function is used in the regression approach. The default is the RMSE function, but others can be added.

epsilon

The $\varepsilon$ parameter of the loss function used by the SVR implementation. The default value is epsilon=0.1.

beta

The correlation parameter $\beta$, which calibrates the penalisation of each kernel's performance in regression tasks. The default value is beta=2.

Details

Random Machines is an ensemble method which combines the bagging procedure proposed by Breiman (1996) with Support Vector Machine models as base learners, jointly with a random selection of kernel functions that adds diversity to the ensemble without harming its predictive performance. The kernel functions $k(x, y)$ are described by the functions below (a short usage sketch follows the list),

  • Linear Kernel: $k(x,y) = (x \cdot y)$

  • Polynomial Kernel: $k(x,y) = \left(\mathrm{scale}\,(x \cdot y) + \mathrm{offset}\right)^{\mathrm{degree}}$

  • Gaussian Kernel: $k(x,y) = e^{-\gamma_g \|x-y\|^2}$

  • Laplacian Kernel: $k(x,y) = e^{-\gamma_\ell \|x-y\|}$

  • Cauchy Kernel: $k(x,y) = \dfrac{1}{1 + \|x-y\|^2 / \gamma_c}$

  • Student's t Kernel: $k(x,y) = \dfrac{1}{1 + \|x-y\|^{d_t}}$
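
As a usage sketch of the kernel choices above (hyperparameter values are illustrative, not tuned), the call below restricts the ensemble to the Gaussian and Laplacian kernels and sets their hyperparameters explicitly:

library(randomMachines)
sim_data <- sim_class(n = 100)
# Only the Gaussian and Laplacian kernels enter the random kernel choice
rm_two_kernels <- randomMachines(y ~ ., train = sim_data, B = 50,
                                 kernels = c("rbfdot", "laplacedot"),
                                 gamma_rbf = 0.5, gamma_lap = 0.5)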

Value

randomMachines() returns an object of class "rm_class" for classification tasks, or of class "rm_reg" if the target variable is a continuous numerical response. See predict.rm_class or predict.rm_reg for details on how to obtain predictions from each model, respectively.

Author(s)

Mateus Maia: [email protected], Gabriel Felipe Ribeiro: [email protected], Anderson Ara: [email protected]

References

Ara, Anderson, et al. "Regression random machines: An ensemble support vector regression model with free kernel choice." Expert Systems with Applications 202 (2022): 117107.

Ara, Anderson, et al. "Random machines: A bagged-weighted support vector model with free kernel choice." Journal of Data Science 19.3 (2021): 409-428.

Breiman, L. (1996). Bagging predictors. Machine learning, 24, 123-140.

Cortes, C., and Vapnik, V. (1995). Support-vector networks. Machine learning, 20, 273-297.

Maia, Mateus, Arthur R. Azevedo, and Anderson Ara. "Predictive comparison between random machines and random forests." Journal of Data Science 19.4 (2021): 593-614.

Examples

library(randomMachines)

# Simulation from a binary output context
sim_data <- sim_class(n = 75)

## Generating a new sample to be predicted
sim_new <- sim_class(n = 75)

# Modelling Random Machines (probabilistic output)
rm_mod_prob <- randomMachines(y~., train = sim_data)

## Modelling Random Machines (binary class output)
rm_mod_label <- randomMachines(y ~ ., train = sim_data, prob_model = FALSE)

## Predicting for new data
y_hat <- predict(rm_mod_label, sim_new)

S4 class for RM classification

Description

S4 class for RM classification

Details

For more details see Ara, Anderson, et al. "Random machines: A bagged-weighted support vector model with free kernel choice." Journal of Data Science 19.3 (2021): 409-428.

Slots

train

a data.frame corresponding to the training data used in the model

class_name

a string with the name of the target variable used in the model

kernel_weight

a numeric vector corresponding to the weights for each bootstrap model contribution

lambda_values

a named list with the vector of $\boldsymbol{\lambda}$ sampling probabilities associated with each kernel function

model_params

a list with all used model specifications

bootstrap_models

a list with all ksvm objects for each bootstrap sample

bootstrap_samples

a list with all bootstrap samples used to train each base model of the ensemble

prob

a boolean indicating whether a probabilistic approach was used in the classification Random Machines


S4 class for RM regression

Description

S4 class for RM regression

Details

For more details see Ara, Anderson, et al. "Regression random machines: An ensemble support vector regression model with free kernel choice." Expert Systems with Applications 202 (2022): 117107.

Slots

y_train_hat

a numeric corresponding to the predictions $\hat{y}_i$ for the training set

lambda_values

a named list with the vector of $\boldsymbol{\lambda}$ sampling probabilities associated with each kernel function

model_params

a list with all used model specifications

bootstrap_models

a list with all ksvm objects for each bootstrap sample

bootstrap_samples

a list with all bootstrap samples used to train each base model of the ensemble

kernel_weight_norm

a numeric vector corresponding to the normalised weights for each bootstrap model contribution


Root Mean Squared Error (RMSE) Function

Description

Computes the Root Mean Squared Error (RMSE), a widely used metric for evaluating the accuracy of predictions in regression tasks. The formula is given by

$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$

Usage

RMSE(predicted, observed)

Arguments

predicted

A vector of predicted values $\hat{\mathbf{y}}$.

observed

A vector of observed values $\mathbf{y}$.

Value

The Root Mean Squared Error calculated by the formula in the description.
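
Examples

A minimal sketch with small illustrative vectors; the hand computation mirrors the formula in the description:

predicted <- c(2.5, 0.0, 2.1)
observed  <- c(3.0, -0.5, 2.0)
RMSE(predicted, observed)
sqrt(mean((predicted - observed)^2))   # same value by hand, approximately 0.412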


Generate a binary classification data set from normal distribution

Description

Simulation used as an example of a classification task based on the separation of two multivariate normal distributions with different mean vectors and different covariance matrices. For label $A$ the samples $\mathbf{X}_A$ are drawn from a normal distribution $MVN(\mu_A \mathbf{1}_p, \sigma_A^2 \mathbf{I}_p)$, while for label $B$ the samples $\mathbf{X}_B$ are drawn from a normal distribution $MVN(\mu_B \mathbf{1}_p, \sigma_B^2 \mathbf{I}_p)$. For more details see Ara et al. (2021) and Breiman (1998).

Usage

sim_class(
  n,
  p = 2,
  ratio = 0.5,
  mu_a = 0,
  sigma_a = 1,
  mu_b = 1,
  sigma_b = 1
)

Arguments

n

Sample size

p

Number of predictors

ratio

Ratio between class A and class B

mu_a

Mean $\mu_A$ of the predictors for label A.

sigma_a

Standard deviation $\sigma_A$ of the predictors for label A.

mu_b

Mean $\mu_B$ of the predictors for label B.

sigma_b

Standard deviation $\sigma_B$ of the predictors for label B.

Value

A simulated data.frame with p predictors (two by default) for a binary classification problem

Author(s)

Mateus Maia: [email protected], Anderson Ara: [email protected]

References

Ara, Anderson, et al. "Random machines: A bagged-weighted support vector model with free kernel choice." Journal of Data Science 19.3 (2021): 409-428.

Breiman, L. (1998). Arcing classifier (with discussion and a rejoinder by the author). The annals of statistics, 26(3), 801-849.

Examples

library(randomMachines)
sim_data <- sim_class(n = 100)
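
A further sketch using the non-default arguments documented above to draw unbalanced classes with more separated means (values are illustrative):

sim_data_hard <- sim_class(n = 200, p = 4, ratio = 0.3,
                           mu_a = 0, sigma_a = 1,
                           mu_b = 2, sigma_b = 1)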

Simulation for a regression toy example from Random Machines Regression 1

Description

Simulation toy example initially found in Scornet (2016), and used and described by Ara et al. (2022). Inputs are 2 independent variables uniformly distributed on the interval $[-1, 1]$. Outputs are generated following the equation

$Y = X_1^2 + e^{-X_2^2} + \varepsilon$, with $\varepsilon \sim \mathcal{N}(0, \sigma^2)$

Usage

sim_reg1(n, sigma)

Arguments

n

Sample size

sigma

Standard deviation of residual noise

Value

A simulated data.frame with two predictors and the target variable.

Author(s)

Mateus Maia: [email protected], Anderson Ara: [email protected]

References

Ara, Anderson, et al. "Regression random machines: An ensemble support vector regression model with free kernel choice." Expert Systems with Applications 202 (2022): 117107.

Scornet, E. (2016). Random forests and kernel methods. IEEE Transactions on Information Theory, 62(3), 1485-1500.

Examples

library(randomMachines)
sim_data <- sim_reg1(n=100)
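
For reference, a hand-rolled version of the same generating process, written directly from the equation above (illustrative only; prefer sim_reg1() in practice):

n <- 100
sigma <- 1   # assumed noise level for this sketch
x1 <- runif(n, -1, 1)
x2 <- runif(n, -1, 1)
y  <- x1^2 + exp(-x2^2) + rnorm(n, sd = sigma)
sim_by_hand <- data.frame(x1, x2, y)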

Simulation for a regression toy example from Random Machines Regression 2

Description

Simulation toy example initially found in Scornet (2016), and used and described by Ara et al. (2022). Inputs are 8 independent variables uniformly distributed on the interval $[-1, 1]$. Outputs are generated following the equation

$Y = X_1 X_2 + X_3^2 - X_4 X_7 + X_5 X_8 - X_6^2 + \varepsilon$, with $\varepsilon \sim \mathcal{N}(0, \sigma^2)$

Usage

sim_reg2(n, sigma)

Arguments

n

Sample size

sigma

Standard deviation of residual noise

Value

A simulated data.frame with eight predictors and the target variable.

Author(s)

Mateus Maia: [email protected], Anderson Ara: [email protected]

References

Ara, Anderson, et al. "Regression random machines: An ensemble support vector regression model with free kernel choice." Expert Systems with Applications 202 (2022): 117107.

Scornet, E. (2016). Random forests and kernel methods. IEEE Transactions on Information Theory, 62(3), 1485-1500.

Examples

library(randomMachines)
sim_data <- sim_reg2(n=100)

Simulation for a regression toy example from Random Machines Regression 3

Description

Simulation toy example initially found in Scornet (2016), and used and described by Ara et al. (2022). Inputs are 4 independent variables uniformly distributed on the interval $[-1, 1]$. Outputs are generated following the equation

$Y = -\sin(X_1) + X_2^2 + X_3 - e^{-X_4^2} + \varepsilon$, with $\varepsilon \sim \mathcal{N}(0, 0.5)$

Usage

sim_reg3(n, sigma)

Arguments

n

Sample size

sigma

Standard deviation of residual noise

Value

A simulated data.frame with four predictors and the target variable.

Author(s)

Mateus Maia: [email protected], Anderson Ara: [email protected]

References

Ara, Anderson, et al. "Regression random machines: An ensemble support vector regression model with free kernel choice." Expert Systems with Applications 202 (2022): 117107.

Scornet, E. (2016). Random forests and kernel methods. IEEE Transactions on Information Theory, 62(3), 1485-1500.

Examples

library(randomMachines)
sim_data <- sim_reg3(n=100)

Simulation for a regression toy example from Random Machines Regression 4

Description

Simulation toy example initially found in Van der Laan et al. (2007), and used and described by Ara et al. (2022). Inputs are 6 independent variables uniformly distributed on the interval $[-1, 1]$. Outputs are generated following the equation

$Y = X_1^2 + X_2^2 X_3 e^{-|X_4|} + X_6 - X_5 + \varepsilon$, with $\varepsilon \sim \mathcal{N}(0, \sigma^2)$

Usage

sim_reg4(n, sigma)

Arguments

n

Sample size

sigma

Standard deviation of residual noise

Value

A simulated data.frame with six predictors and the target variable.

Author(s)

Mateus Maia: [email protected], Anderson Ara: [email protected]

References

Ara, Anderson, et al. "Regression random machines: An ensemble support vector regression model with free kernel choice." Expert Systems with Applications 202 (2022): 117107.

Van der Laan, M. J., Polley, E. C., & Hubbard, A. E. (2007). Super learner. Statistical applications in genetics and molecular biology, 6(1).

Examples

library(randomMachines)
sim_data <- sim_reg4(n=100)

Simulation for a regression toy example from Random Machines Regression 5

Description

Simulation toy example initially found in Roy and Larocque (2012), and used and described by Ara et al. (2022). Inputs are 6 independent variables sampled from $N(0, 1)$. Outputs are generated following the equation

$Y = X_1 + 0.707 X_2^2 + 2 \cdot 1_{(X_3 > 0)} + 0.873 \log(X_1)|X_3| + 0.894 X_2 X_4 + 2 \cdot 1_{(X_5 > 0)} + 0.464 e^{X_6} + \varepsilon$, with $\varepsilon \sim \mathcal{N}(0, \sigma^2)$

Usage

sim_reg5(n, sigma)

Arguments

n

Sample size

sigma

Standard deviation of residual noise

Value

A simulated data.frame with six predictors and the target variable.

Author(s)

Mateus Maia: [email protected], Anderson Ara: [email protected]

References

Ara, Anderson, et al. "Regression random machines: An ensemble support vector regression model with free kernel choice." Expert Systems with Applications 202 (2022): 117107.

Roy, M. H., & Larocque, D. (2012). Robustness of random forests for regression. Journal of Nonparametric Statistics, 24(4), 993-1006.

Examples

library(randomMachines)
sim_data <- sim_reg5(n=100)

Wholesale Dataset

Description

The 'whosale' dataset contains information about wholesale customers' annual spending on various product categories.

Usage

data(whosale)

Format

A data frame with 440 rows and 8 columns.

Details

This dataset includes the following columns:

y

Type of customer: either 'Horeca' (Hotel/Restaurant/Café), coded as 1, or 'Retail', coded as 2.

Region

Geographic region of the customer, either 'Lisbon', 'Oporto', or 'Other'. Coded as {1,2,3}, respectively.

Fresh

Annual spending (in monetary units) on fresh products.

Milk

Annual spending on milk products.

Grocery

Annual spending on grocery products.

Frozen

Annual spending on frozen products.

Detergents Paper

Annual spending on detergents and paper products.

Delicassen

Annual spending on delicatessen products.

Source

The 'whosale' dataset is sourced from the UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/wholesale+customers

Examples

data(whosale)
head(whosale)