Title: | Functions for Statistics Classes at Carleton College |
---|---|
Description: | Includes commands for bootstrapping and permutation tests, a command for creating grouped bar plots, and a demo of the quantile-normal plot for data drawn from different distributions. |
Authors: | Laura Chihara [aut], Adam Loy [aut, cre] |
Maintainer: | Adam Loy <[email protected]> |
License: | GPL-2 |
Version: | 2.2 |
Built: | 2025-01-23 04:55:40 UTC |
Source: | https://github.com/aloy/carletonstats |
ANOVA F test when given summarized data (sample sizes, means and standard deviations).
anovaSummarized(N, mn, stdev)
N |
a vector with the sample sizes |
mn |
a vector of means, one for each group in the sample |
stdev |
a vector of standard deviations, one for each group in the sample |
Perform an ANOVA F test when presented with summarized data: sample sizes, sample means and sample standard deviations.
Returns invisibly a list
Treatment SS |
The treatment sum of squares (also called the "between sum of squares"). |
Residual SS |
Residual sum of squares (also called the "within sum of squares"). |
Degrees of Freedom |
a vector with the numerator and denominator degrees of freedom. |
Treatment Mean Square |
Treatment SS/numerator DF |
Residual Mean Square |
Residual SS/denominator DF |
Residual Standard Error |
Square root of Residual Mean Square |
F |
the F statistic |
P-value |
p-value |
...
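The quantities listed above can be reproduced in base R from the summaries alone. The following is a minimal sketch, not the package's implementation; the helper name anova_from_summary and its return names are hypothetical. It uses the standard identities: treatment SS is the sum of n_i (mean_i - grand mean)^2, and residual SS is the sum of (n_i - 1) sd_i^2.

```r
# Hypothetical helper (not part of CarletonStats): ANOVA F test from summaries.
anova_from_summary <- function(n, means, sds) {
  n <- as.numeric(n); means <- as.numeric(means); sds <- as.numeric(sds)
  k <- length(n)
  grand_mean <- sum(n * means) / sum(n)          # weighted mean of group means
  ss_treat <- sum(n * (means - grand_mean)^2)    # between sum of squares
  ss_resid <- sum((n - 1) * sds^2)               # within sum of squares
  df <- c(k - 1, sum(n) - k)                     # numerator, denominator DF
  f_stat <- (ss_treat / df[1]) / (ss_resid / df[2])
  list(ss_treat = ss_treat, ss_resid = ss_resid, df = df, F = f_stat,
       p_value = pf(f_stat, df[1], df[2], lower.tail = FALSE))
}

n  <- tapply(chickwts$weight, chickwts$feed, length)
mn <- tapply(chickwts$weight, chickwts$feed, mean)
s  <- tapply(chickwts$weight, chickwts$feed, sd)
res <- anova_from_summary(n, mn, s)
res$F
```

The F statistic should agree with anova(lm(weight ~ feed, data = chickwts)), which works from the raw data.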
Laura Chihara
# use the data set chickwts from base R
head(chickwts)
N <- table(chickwts$feed)
stdev <- tapply(chickwts$weight, chickwts$feed, sd)
mn <- tapply(chickwts$weight, chickwts$feed, mean)
anovaSummarized(N, mn, stdev)
Bootstrap a single variable or a grouped variable
boot(x, ...)

## Default S3 method:
boot(x, group = NULL, statistic = mean, conf.level = 0.95, B = 10000,
     plot.hist = TRUE, plot.qq = FALSE, x.name = deparse(substitute(x)),
     xlab = NULL, ylab = NULL, title = NULL, seed = NULL, ...)

## S3 method for class 'formula'
boot(formula, data, subset, ...)
x |
a numeric vector |
... |
further arguments to be passed to or from methods. |
group |
an optional grouping variable (vector), usually a factor variable. If it is a binary numeric variable, it will be coerced to a factor. |
statistic |
function that computes the statistic of interest. Default is the mean. |
conf.level |
confidence level for the bootstrap percentile interval. Default is 95%. |
B |
number of times to resample (positive integer greater than 2). |
plot.hist |
logical value. If |
plot.qq |
Logical value. If |
x.name |
Label for variable name |
xlab |
an optional character string for the x-axis label |
ylab |
an optional character string for the y-axis label |
title |
an optional character string giving the plot title |
seed |
optional argument to |
formula |
a formula |
data |
a data frame that contains the variables given in the formula. |
subset |
an optional expression indicating what observations to use. |
Perform a bootstrap of a statistic applied to a single variable, or to the
difference of the statistic computed on two samples (using the grouping
variable). If x is a binary vector of 0's and 1's and the function is
the mean, then the statistic of interest is the proportion.
Observations with missing values are removed.
A vector with the resampled statistics is returned invisibly.
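The resampling idea can be sketched in base R without the package. This is an illustration under assumptions, not the code behind boot(): it resamples with replacement, recomputes the mean, and takes the 2.5% and 97.5% quantiles as a percentile interval. For two groups it assumes each group is resampled independently. All variable names are illustrative.

```r
set.seed(42)                                   # reproducible resamples
B <- 1000

# One variable: bootstrap distribution of the mean
x <- ToothGrowth$len
boot_means <- replicate(B, mean(sample(x, replace = TRUE)))
ci_mean <- quantile(boot_means, c(0.025, 0.975))   # percentile interval

# Two groups: resample within each group, then take the difference
oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]
boot_diff <- replicate(B, mean(sample(oj, replace = TRUE)) -
                          mean(sample(vc, replace = TRUE)))
ci_diff <- quantile(boot_diff, c(0.025, 0.975))

c(mean = mean(boot_means), se = sd(boot_means))    # bootstrap mean and SE
ci_mean
ci_diff
```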
boot(default)
: Bootstrap a single variable or a grouped variable
boot(formula)
: Bootstrap a single variable or a grouped variable
Laura Chihara
Tim Hesterberg's website https://www.timhesterberg.net/bootstrap-and-resampling
# ToothGrowth data (supplied by R)
# bootstrap mean of a single numeric variable
boot(ToothGrowth$len)

# bootstrap difference in mean of tooth length for two groups.
boot(ToothGrowth$len, ToothGrowth$supp, B = 1000)

# same as above using formula syntax
boot(len ~ supp, data = ToothGrowth, B = 1000)
Bootstrap the correlation of two numeric variables.
bootCor(x, ...)

## Default S3 method:
bootCor(x, y, conf.level = 0.95, B = 10000, plot.hist = TRUE,
        xlab = NULL, ylab = NULL, title = NULL, plot.qq = FALSE,
        x.name = deparse(substitute(x)), y.name = deparse(substitute(y)),
        seed = NULL, ...)

## S3 method for class 'formula'
bootCor(formula, data, subset, ...)
x |
a numeric vector. |
... |
further arguments to be passed to or from methods. |
y |
a numeric vector. |
conf.level |
confidence level for the bootstrap percentile interval. |
B |
number of times to resample (positive integer greater than 2). |
plot.hist |
a logical value. If |
xlab |
an optional character string for the x-axis label |
ylab |
an optional character string for the y-axis label |
title |
an optional character string giving the plot title |
plot.qq |
a logical value. If |
x.name |
Label for variable x |
y.name |
Label for variable y |
seed |
optional argument to |
formula |
a formula of the form y ~ x where y and x are both numeric variables. |
data |
an optional data frame containing the variables in the formula. By default the variables are taken from environment(formula). |
subset |
an optional vector specifying a subset of observations to be used. |
Bootstrap the correlation of two numeric variables. The bootstrap mean and standard error are printed as well as a bootstrap percentile confidence interval.
Observations with missing values are removed.
The command returns the correlations of the resampled observations.
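A base R sketch of the same idea, assuming that whole rows are resampled so each (x, y) pair stays together. It uses the built-in cars data rather than states03 so that it runs without the package, and it is not the package's own code.

```r
set.seed(42)
B <- 1000
n <- nrow(cars)

boot_cor <- replicate(B, {
  idx <- sample(n, replace = TRUE)        # resample row indices, keeping pairs intact
  cor(cars$speed[idx], cars$dist[idx])
})

c(mean = mean(boot_cor), se = sd(boot_cor))   # bootstrap mean and standard error
quantile(boot_cor, c(0.025, 0.975))           # 95% percentile interval
```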
bootCor(default)
: Bootstrap the correlation of two numeric variables.
bootCor(formula)
: Bootstrap the correlation of two numeric variables.
Laura Chihara
Tim Hesterberg's website https://www.timhesterberg.net/bootstrap-and-resampling
plot(states03$ColGrad, states03$InfMortality)
bootCor(InfMortality ~ ColGrad, data = states03, B = 1000)
bootCor(states03$ColGrad, states03$InfMortality, B = 1000)
Perform a bootstrap of two paired variables.
bootPaired(x, ...)

## Default S3 method:
bootPaired(x, y, conf.level = 0.95, B = 10000, plot.hist = TRUE,
           xlab = NULL, ylab = NULL, title = NULL, plot.qq = FALSE,
           x.name = deparse(substitute(x)), y.name = deparse(substitute(y)),
           seed = NULL, ...)

## S3 method for class 'formula'
bootPaired(formula, data, subset, ...)
x |
a numeric vector. |
... |
further arguments to be passed to or from methods. |
y |
a numeric vector. |
conf.level |
confidence level for the bootstrap percentile interval. |
B |
number of resamples (positive integer greater than 2). |
plot.hist |
logical. If |
xlab |
an optional character string for the x-axis label |
ylab |
an optional character string for the y-axis label |
title |
an optional character string giving the plot title |
plot.qq |
logical. If |
x.name |
Label for variable x |
y.name |
Label for variable y |
seed |
optional argument to |
formula |
a formula |
data |
a data frame that contains the variables given in the formula. |
subset |
an optional expression indicating what observations to use. |
The command will compute the difference of x and y and
bootstrap the difference. The mean and standard error of the bootstrap
distribution will be printed as well as a bootstrap percentile interval.
Observations with missing values are removed.
The command returns a vector with the replicates of the statistic being bootstrapped.
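As a rough base R illustration, a paired bootstrap reduces to a one-sample bootstrap of the within-pair differences. The sketch below uses the built-in sleep data and assumes the difference is taken as x - y; the sign convention used by bootPaired is not stated here, so treat that choice as an assumption.

```r
set.seed(42)
x <- sleep$extra[sleep$group == 1]     # first measurement for each subject
y <- sleep$extra[sleep$group == 2]     # second measurement, same subjects
d <- x - y                             # within-pair differences (assumed sign)

boot_diff <- replicate(1000, mean(sample(d, replace = TRUE)))

c(mean = mean(boot_diff), se = sd(boot_diff))
quantile(boot_diff, c(0.025, 0.975))   # 95% percentile interval
```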
bootPaired(default)
: Perform a bootstrap of two paired variables.
bootPaired(formula)
: Perform a bootstrap of two paired variables.
Laura Chihara
Tim Hesterberg's website https://www.timhesterberg.net/bootstrap-and-resampling
# Bootstrap the mean difference of fat content in vanilla and chocolate ice
# cream. Data are paired because ice cream from the same manufacturer will
# have similar content.
Icecream
bootPaired(ChocFat ~ VanillaFat, data = Icecream)
bootPaired(Icecream$VanillaFat, Icecream$ChocFat)
Bootstrap the slope of a simple linear regression line. The bootstrap mean and standard error are printed as well as a bootstrap percentile confidence interval.
bootSlope(x, ...)

## Default S3 method:
bootSlope(x, y, conf.level = 0.95, B = 10000, plot.hist = TRUE,
          xlab = NULL, ylab = NULL, title = NULL, plot.qq = FALSE,
          x.name = deparse(substitute(x)), y.name = deparse(substitute(y)),
          seed = NULL, ...)

## S3 method for class 'formula'
bootSlope(formula, data, subset, ...)
x |
a numeric vector. |
... |
further arguments to be passed to or from methods. |
y |
a numeric vector. |
conf.level |
confidence level for the bootstrap percentile interval. |
B |
number of times to resample (positive integer greater than 2). |
plot.hist |
a logical value. If |
xlab |
an optional character string for the x-axis label |
ylab |
an optional character string for the y-axis label |
title |
an optional character string giving the plot title |
plot.qq |
a logical value. If |
x.name |
Label for variable x |
y.name |
Label for variable y |
seed |
optional argument to |
formula |
a formula of the form y ~ x where y is the numeric response variable and x is the numeric predictor. |
data |
an optional data frame containing the variables in the formula. By default the variables are taken from environment(formula). |
subset |
an optional vector specifying a subset of observations to be used. |
Observations with missing values are removed.
The command returns the slopes of the resampled observations.
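A minimal base R sketch of bootstrapping a regression slope, assuming rows (cases) are resampled and the line is refit each time. It uses the built-in cars data and is meant as an illustration, not as the package's implementation.

```r
set.seed(42)
n <- nrow(cars)

boot_slope <- replicate(1000, {
  idx <- sample(n, replace = TRUE)                     # resample cases
  coef(lm(dist ~ speed, data = cars[idx, ]))[["speed"]]
})

c(mean = mean(boot_slope), se = sd(boot_slope))
quantile(boot_slope, c(0.025, 0.975))                  # 95% percentile interval
```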
bootSlope(default)
: Bootstrap the slope of a simple linear regression line
bootSlope(formula)
: Bootstrap the slope of a simple linear regression line
Adam Loy, Laura Chihara
Tim Hesterberg's website https://www.timhesterberg.net/bootstrap-and-resampling
plot(states03$ColGrad, states03$InfMortality)
bootSlope(InfMortality ~ ColGrad, data = states03, B = 1000)
bootSlope(states03$ColGrad, states03$InfMortality, B = 1000)
Calculate percentile confidence intervals for a carlboot object.
## S3 method for class 'carlboot'
confint(object, parm = NULL, level = 0.95, ...)
object |
The carlboot object for which to compute confidence intervals. |
parm |
not used in CarletonStats, just for generic consistency |
level |
the confidence level |
... |
not used |
Draw many random samples and compute confidence interval. How many intervals capture the true mean?
confIntDemo(distr = "normal", size = 20, conf.level = 0.95)
distr |
distribution of the population to be sampled. Options include
|
size |
sample size |
conf.level |
confidence level. |
This simulation will draw 100 random samples from a given population distribution and compute the corresponding confidence intervals. The 100 intervals will be drawn with an indication of the ones that missed the true mean. A histogram of the population will also be created.
The command invisibly returns the fraction of intervals that capture the true mean.
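The coverage calculation can be sketched in base R. This version assumes a standard normal population and one-sample t intervals; the kind of interval confIntDemo computes is not stated here, so that is an assumption, and no plots are drawn.

```r
set.seed(42)
mu <- 0            # true mean of the population being sampled
n <- 20            # sample size
conf.level <- 0.95

covers <- replicate(100, {
  ci <- t.test(rnorm(n, mean = mu), conf.level = conf.level)$conf.int
  ci[1] <= mu && mu <= ci[2]           # TRUE if the interval captures mu
})

mean(covers)       # fraction of the 100 intervals that capture the true mean
```

Over many repetitions this fraction should settle near the chosen confidence level.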
Laura Chihara
confIntDemo()
confIntDemo(distr = "exponential", size = 40)
For a given r, create a scatterplot of two variables with that correlation.
corDemo(r = 0)
r |
a number between -1 and 1. Enter any number r, |
Demonstrate the concept of correlation by inputting a number between -1 and
1 and seeing a scatter plot of two variables with that correlation. Once you
invoke this command, you can continue to enter values for r. Type any number
) to exit.
Laura Chihara
## Not run:
corDemo()
## End(Not run)
Create a bar chart of a single categorical variable or a grouped bar chart of two categorical variables.
groupedBar(resp, ...)

## Default S3 method:
groupedBar(resp, condvar = NULL, percent = TRUE, print = TRUE,
           cond.name = deparse(substitute(condvar)),
           resp.name = deparse(substitute(resp)), ...)

## S3 method for class 'formula'
groupedBar(formula, data = parent.frame(), subset, ...)
resp |
a factor variable. If |
... |
further arguments to be passed to or from methods. |
condvar |
a factor variable to condition on. If |
percent |
a logical value. Should the y-axis give percent or counts? |
print |
a logical value. If |
cond.name |
Label for variable |
resp.name |
Label for variable |
formula |
a formula of the form |
data |
a data frame that contains the variables in the formula. |
subset |
an optional vector specifying a subset of observations to be used. |
For a single factor variable, a bar plot. If two factor variables are given,
then a bar plot of resp conditioned by condvar. This command
uses R's table command so missing values are automatically removed.
Returns invisibly a table of the variable(s).
groupedBar(default)
: Grouped bar chart
groupedBar(formula)
: Grouped bar chart
Laura Chihara
groupedBar(states03$Region)

## Not run:
groupedBar(states03$DeathPenalty, states03$Region, legend.loc = "topleft")

# Using a formula syntax:
groupedBar(~Region, data = states03)
groupedBar(DeathPenalty ~ Region, data = states03, legend.loc = "topleft")
## End(Not run)
Nutritional information on vanilla and chocolate ice cream from a sample of companies.
A data frame with 39 observations on the following 7 variables.
Brand name
Calories per serving in vanilla
Fat per serving (g) in vanilla
Sugar per serving (g) in vanilla
Calories per serving in chocolate
Fat per serving (g) in chocolate
Sugar per serving (g) in chocolate
Data collected by Carleton student Ann Butkowski (2008).
head(Icecream)
t.test(Icecream$VanillaCalories, Icecream$ChocCalories, paired = TRUE)
Milkshakes (chocolate). Nutritional information on chocolate milkshakes from a sample of restaurants.
A data frame with 29 observations on the following 11 variables.
Names of restaurants
Type of restaurant, Dine In or Fast Food
Calories per serving
Fat per serving (g)
Sodium per serving (mg)
Carbohydrates per serving (g)
Size of milkshake (ounces)
Calories per ounce
Fat per ounce
Carbohydrates per ounce
Data collected by Carleton students Yoni Blumberg (2013) and Lindsay Guthrie (2013).
In data frames with factor variables, convert any observation with "" into <NA>.
missingLevel(data)
data |
a data frame with factor variables. |
In a factor variable with the level "", this command will convert this to an <NA>.
Returns the same data frame with "" replaced by <NA> in factor variables.
When importing data from comma-separated files (for example), missing values in a categorical variable are often denoted by "". We often do not want R to treat this as a level of a factor variable.
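The conversion can be illustrated in base R on a toy data frame. This is a sketch of the idea rather than the package's code, and the data frame df and its columns are made up for the example. Setting a factor level to NA removes the level and turns the matching entries into <NA>.

```r
# Toy data: the factor g has "" as one of its levels
df <- data.frame(g = factor(c("a", "", "b", "")), v = 1:4)
levels(df$g)                               # "" appears as a level

is_fac <- vapply(df, is.factor, logical(1))
df[is_fac] <- lapply(df[is_fac], function(f) {
  levels(f)[levels(f) == ""] <- NA         # drop the "" level; entries become NA
  f
})

levels(df$g)
df
```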
Laura Chihara
Permutation test to test a hypothesis involving two samples.
permTest(x, ...)

## Default S3 method:
permTest(x, group, statistic = mean, B = 9999, alternative = "two.sided",
         plot.hist = TRUE, plot.qq = FALSE, xlab = NULL, ylab = NULL,
         title = NULL, seed = NULL, ...)

## S3 method for class 'formula'
permTest(formula, data = parent.frame(), subset, ...)
x |
a numeric vector. If the function is the mean ( |
... |
further arguments to be passed to or from methods. |
group |
a factor variable with two levels. If |
statistic |
the statistic of interest. |
B |
the number of resamples (positive integer greater than 2). |
alternative |
the alternative hypothesis. Options are
|
plot.hist |
a logical value. If |
plot.qq |
a logical value. If |
xlab |
an optional character string for the x-axis label |
ylab |
an optional character string for the y-axis label |
title |
an optional character string giving the plot title |
seed |
optional argument to |
formula |
a formula of the form |
data |
a data frame with the variables in the formula. |
subset |
an optional expression specifying which observations to keep. |
Permutation test to see if a population parameter is the same for two
populations. For instance, test $H_0: \mu_1 = \mu_2$, where $\mu$
denotes the population mean. The values of the numeric
variable are randomly assigned to the two groups and the difference of the
statistic for each group is calculated. The command will print the mean and
standard error of the distribution of the test statistic as well as a
P-value.
Observations with missing values are removed.
Returns invisibly a vector of the replicates of the test statistic.
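The procedure described above can be sketched in base R with the built-in ToothGrowth data. This is an illustration, not the package's code. In particular, the P-value convention (adding 1 to both numerator and denominator, and doubling the smaller tail for the two-sided case) is an assumption.

```r
set.seed(42)
x <- ToothGrowth$len
g <- ToothGrowth$supp
observed <- mean(x[g == "OJ"]) - mean(x[g == "VC"])

B <- 9999
perm <- replicate(B, {
  g_star <- sample(g)                              # shuffle the group labels
  mean(x[g_star == "OJ"]) - mean(x[g_star == "VC"])
})

p_upper <- (sum(perm >= observed) + 1) / (B + 1)   # alternative "greater"
p_lower <- (sum(perm <= observed) + 1) / (B + 1)   # alternative "less"
p_two   <- min(1, 2 * min(p_upper, p_lower))       # two-sided (assumed convention)

c(mean = mean(perm), se = sd(perm))
p_two
```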
permTest(default)
: Permutation test
permTest(formula)
: Permutation test
Laura Chihara
Tim Hesterberg's website: https://www.timhesterberg.net/bootstrap-and-resampling
permTest(states03$ViolentCrime, states03$DeathPenalty)

# using formula syntax
permTest(ViolentCrime ~ DeathPenalty, data = states03, alt = "less")
Permutation test to see if the population mean is the same for two or more
populations. For instance, test $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$, where $\mu$
denotes the population mean. The values of the numeric
variable are randomly assigned to the groups and the ANOVA F statistic is
calculated. The command will print the mean and
standard error of the distribution of the test statistic as well as a
P-value.
permTestAnova(x, ...)

## Default S3 method:
permTestAnova(x, group, B = 9999, plot.hist = TRUE, plot.qq = FALSE,
              xlab = NULL, ylab = NULL, title = NULL, seed = NULL, ...)

## S3 method for class 'formula'
permTestAnova(formula, data = parent.frame(), subset, ...)
x |
a numeric vector. |
... |
further arguments to be passed to or from methods. |
group |
a factor variable with two or more levels. If |
B |
the number of resamples (positive integer greater than 2). |
plot.hist |
a logical value. If |
plot.qq |
a logical value. If |
xlab |
an optional character string for the x-axis label |
ylab |
an optional character string for the y-axis label |
title |
an optional character string giving the plot title |
seed |
optional argument to |
formula |
a formula of the form |
data |
a data frame with the variables in the formula. |
subset |
an optional expression specifying which observations to keep. |
Observations with missing values are removed.
Returns invisibly a vector of the replicates of the test statistic.
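A base R sketch of a permutation test built on the ANOVA F statistic, using the built-in chickwts data. The helper f_stat is hypothetical and the P-value convention is an assumption; the package may compute these differently.

```r
set.seed(42)

# Hypothetical helper: one-way ANOVA F statistic for response y and groups g
f_stat <- function(y, g) anova(lm(y ~ g))[["F value"]][1]

observed <- f_stat(chickwts$weight, chickwts$feed)

B <- 499
perm_f <- replicate(B, f_stat(chickwts$weight, sample(chickwts$feed)))

# Large F values are evidence against equal means, so use the upper tail
p_value <- (sum(perm_f >= observed) + 1) / (B + 1)

c(mean = mean(perm_f), se = sd(perm_f))
p_value
```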
permTestAnova(default)
: Permutation test for ANOVA F-test
permTestAnova(formula)
: Permutation test for ANOVA F-test
Adam Loy, Laura Chihara
Tim Hesterberg's website: https://www.timhesterberg.net/bootstrap-and-resampling
permTestAnova(states03$ViolentCrime, states03$Region, B = 499)

# using formula syntax
## Not run:
permTestAnova(ViolentCrime ~ Region, data = states03, B = 9999)
## End(Not run)
Hypothesis test for a correlation of two variables. The null hypothesis is that the population correlation is 0.
permTestCor(x, ...)

## Default S3 method:
permTestCor(x, y, B = 999, alternative = "two.sided", plot.hist = TRUE,
            plot.qq = FALSE, x.name = deparse(substitute(x)),
            y.name = deparse(substitute(y)), xlab = NULL, ylab = NULL,
            title = NULL, seed = NULL, ...)

## S3 method for class 'formula'
permTestCor(formula, data, subset, ...)
x |
a numeric vector. |
... |
further arguments to be passed to or from methods. |
y |
a numeric vector. |
B |
the number of resamples to draw (positive integer greater than 2). |
alternative |
alternative hypothesis. Options are |
plot.hist |
a logical value. If |
plot.qq |
a logical value. If |
x.name |
Label for variable x |
y.name |
Label for variable y |
xlab |
an optional character string for the x-axis label |
ylab |
an optional character string for the y-axis label |
title |
an optional character string giving the plot title |
seed |
optional argument to |
formula |
a formula |
data |
a data frame that contains the variables given in the formula. |
subset |
an optional expression indicating what observations to use. |
Perform a permutation test to test $H_0: \rho = 0$, where
$\rho$ is the population correlation. The rows of the second
variable are permuted and the correlation is re-computed.
The mean and standard error of the permutation distribution are printed as well as a P-value.
Observations with missing values are removed.
Returns invisibly a vector of the correlations obtained by the randomization.
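The permutation step can be sketched in base R with the built-in cars data: shuffle one variable to break the pairing, then recompute the correlation. This is illustrative only, and the two-sided P-value convention shown is an assumption.

```r
set.seed(42)
x <- cars$speed
y <- cars$dist
observed <- cor(x, y)

B <- 999
perm_cor <- replicate(B, cor(x, sample(y)))       # permute the second variable

p_upper <- (sum(perm_cor >= observed) + 1) / (B + 1)
p_lower <- (sum(perm_cor <= observed) + 1) / (B + 1)
p_two   <- min(1, 2 * min(p_upper, p_lower))      # assumed two-sided convention

c(mean = mean(perm_cor), se = sd(perm_cor))
p_two
```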
permTestCor(default)
: Permutation test for the correlation of two variables.
permTestCor(formula)
: Permutation test for the correlation of two variables.
Laura Chihara
Tim Hesterberg's website: https://www.timhesterberg.net/bootstrap-and-resampling
plot(states03$HSGrad, states03$TeenBirths)
cor(states03$HSGrad, states03$TeenBirths)
permTestCor(states03$HSGrad, states03$TeenBirths)
permTestCor(TeenBirths ~ HSGrad, data = states03)
Permutation test for paired data.
permTestPaired(x, ...)

## Default S3 method:
permTestPaired(x, y, B = 9999, alternative = "two.sided", plot.hist = TRUE,
               plot.qq = FALSE, x.name = deparse(substitute(x)),
               y.name = deparse(substitute(y)), xlab = NULL, ylab = NULL,
               title = NULL, seed = NULL, ...)

## S3 method for class 'formula'
permTestPaired(formula, data, subset, ...)
x |
a numeric vector. |
... |
further arguments to be passed to or from methods. |
y |
a numeric vector. |
B |
the number of resamples. |
alternative |
the alternative hypothesis. Options are
|
plot.hist |
a logical value. If |
plot.qq |
a logical value. If |
x.name |
Label for x variable |
y.name |
Label for y variable |
xlab |
an optional character string for the x-axis label |
ylab |
an optional character string for the y-axis label |
title |
an optional character string giving the plot title |
seed |
optional argument to |
formula |
a formula of the form |
data |
an optional data frame containing the variables in the formula. By default the variables are taken from environment(formula). |
subset |
an optional vector specifying a subset of observations to be used. |
For two paired numeric variables with n rows, randomly select k of the n
rows (k is also random), switch the entries in those rows,
and then compute the mean of the difference of the two variables (y-x).
Observations with missing values are removed.
Returns invisibly a vector of the replicates of the test statistic (e.g., the mean of the difference of the resampled variables).
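Switching the two entries in a row changes the sign of that row's difference, so one common way to realise this test is to flip the sign of each difference at random. The base R sketch below does that with the built-in sleep data. It assumes each row is switched independently with probability 1/2, which may differ from how the package draws k.

```r
set.seed(42)
x <- sleep$extra[sleep$group == 1]
y <- sleep$extra[sleep$group == 2]
d <- y - x                                   # differences, as y - x
observed <- mean(d)

B <- 9999
perm_mean <- replicate(B, {
  flip <- sample(c(-1, 1), length(d), replace = TRUE)  # -1 means the pair was switched
  mean(flip * d)
})

# Upper-tail P-value (alternative "greater"); the +1 convention is an assumption
p_greater <- (sum(perm_mean >= observed) + 1) / (B + 1)
p_greater
```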
permTestPaired(default)
: Permutation test for paired data.
permTestPaired(formula)
: Permutation test for paired data.
Laura Chihara
Tim Hesterberg's website: https://www.timhesterberg.net/bootstrap-and-resampling
# Does chocolate ice cream have more calories than vanilla ice cream, on average?
# H0: mean number of calories is the same
# HA: mean number of calories is greater in chocolate ice cream
permTestPaired(Icecream$VanillaCalories, Icecream$ChocCalories, alternative = "less")
permTestPaired(ChocCalories ~ VanillaCalories, data = Icecream, alternative = "greater")
Hypothesis test for a slope of a simple linear regression model. The null hypothesis is that the population slope is 0.
permTestSlope(x, ...)

## Default S3 method:
permTestSlope(x, y, B = 999, alternative = "two.sided", plot.hist = TRUE,
              plot.qq = FALSE, x.name = deparse(substitute(x)),
              y.name = deparse(substitute(y)), xlab = NULL, ylab = NULL,
              title = NULL, seed = NULL, ...)

## S3 method for class 'formula'
permTestSlope(formula, data, subset, ...)
x |
a numeric vector. |
... |
further arguments to be passed to or from methods. |
y |
a numeric vector. |
B |
the number of resamples to draw (positive integer greater than 2). |
alternative |
alternative hypothesis. Options are |
plot.hist |
a logical value. If |
plot.qq |
a logical value. If |
x.name |
Label for variable x |
y.name |
Label for variable y |
xlab |
an optional character string for the x-axis label |
ylab |
an optional character string for the y-axis label |
title |
an optional character string giving the plot title |
seed |
optional argument to |
formula |
a formula |
data |
a data frame that contains the variables given in the formula. |
subset |
an optional expression indicating what observations to use. |
Perform a permutation test to test $H_0: \beta = 0$, where
$\beta$ is the population slope. The rows of the second
variable are permuted and the slope is re-computed.
The mean and standard error of the permutation distribution are printed as well as a P-value.
Observations with missing values are removed.
Returns invisibly a vector of the slopes obtained by the randomization.
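A base R sketch of the same permutation idea for a slope, using the built-in cars data. The helper slope is hypothetical; it uses the identity that the least-squares slope equals cov(x, y) / var(x). The two-sided P-value based on absolute values is an assumed convention, not necessarily the package's.

```r
set.seed(42)

# Hypothetical helper: least-squares slope of y on x
slope <- function(x, y) cov(x, y) / var(x)

observed <- slope(cars$speed, cars$dist)

B <- 999
perm_slope <- replicate(B, slope(cars$speed, sample(cars$dist)))  # permute y

p_two <- (sum(abs(perm_slope) >= abs(observed)) + 1) / (B + 1)

c(mean = mean(perm_slope), se = sd(perm_slope))
p_two
```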
permTestSlope(default)
: Permutation test for the slope
permTestSlope(formula)
: Permutation test for the slope
Adam Loy, Laura Chihara
Tim Hesterberg's website: https://www.timhesterberg.net/bootstrap-and-resampling
plot(states03$HSGrad, states03$TeenBirths)
lm(TeenBirths ~ HSGrad, data = states03)
permTestSlope(states03$HSGrad, states03$TeenBirths)
permTestSlope(TeenBirths ~ HSGrad, data = states03)
Plot the bootstrap distribution returned as a carlboot object.
## S3 method for class 'carlboot'
plot(x, bins = 15, size = 5, xlab = NULL, ylab = NULL, title = NULL, ...)

## S3 method for class 'carlperm'
plot(x, bins = 15, size = 5, xlab = NULL, ylab = NULL, title = NULL, ...)
x |
The carlboot object to plot. |
bins |
number of bins in histogram. |
size |
size of points. |
xlab |
an optional character string for the x-axis label |
ylab |
an optional character string for the y-axis label |
title |
an optional character string giving the plot title |
... |
not used |
boot_dist <- boot(ToothGrowth$len, ToothGrowth$supp, B = 1000)
plot(boot_dist)

perm_dist <- permTest(states03$ViolentCrime, states03$DeathPenalty, B = 999)
plot(perm_dist)
Print summary statistics and confidence intervals for a carlboot object.
## S3 method for class 'carlboot'
print(x, ...)

## S3 method for class 'carlperm'
print(x, ...)
x |
The carlboot object to print. |
... |
not used |
Demonstrate the normal quantile-quantile plot for samples drawn from different populations.
qqPlotDemo(n = 25, distribution = "normal", mu = 0, sigma = 1, df = 10,
           lambda = 10, numdf = 10, dendf = 16, shape1 = 40, shape2 = 5)
n |
sample size |
distribution |
population distribution. Options are |
mu |
mean for the normal distribution. |
sigma |
(positive) standard deviation for the normal distribution. |
df |
(positive) degrees of freedom for the t-distribution. |
lambda |
positive rate for the exponential distribution. |
numdf |
(positive) numerator degrees of freedom for the F distribution. |
dendf |
(positive) denominator degrees of freedom for the F distribution. |
shape1 |
positive parameter for the beta distribution (shape1 = a). |
shape2 |
positive parameter for the beta distribution (shape2 = b). |
Draw a random sample from the chosen distribution and display its normal quantile-quantile plot alongside a histogram of the sample.
Returns invisibly the random sample.
Laura Chihara
qqPlotDemo(n = 30, distr = "exponential", lambda = 1/3)
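The same idea can be reproduced with base R alone. This is a rough sketch, assuming an exponential population with rate 1/3 as in the example above; it does not reproduce the function's actual layout or styling, and the variable name `x` is illustrative.

```r
# Sketch only: histogram and normal QQ plot of a skewed sample, side by side.
set.seed(1)                          # arbitrary seed, for reproducibility
x <- rexp(30, rate = 1/3)            # 30 draws from an exponential distribution

op <- par(mfrow = c(1, 2))           # two panels in one row
hist(x, main = "Histogram of sample", xlab = "x")
qqnorm(x)                            # sample quantiles vs. normal quantiles
qqline(x)                            # reference line through the quartiles
par(op)                              # restore the previous graphics settings
```

With a right-skewed sample such as this one, the points in the QQ plot typically bend away from the reference line in the upper tail.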
Census data on the 50 states from 2003.
A data frame with 50 observations on the following 24 variables.
the 50 states
a factor with levels Midwest, Northeast, South, West
Population in thousands
Number of births
Number of deaths
Percent of population 18 years of age or younger
Percent of population 65 years of age or older
Percent of population 25 years of age or older with a high school degree
Percent of population 25 years of age or older with a college degree
Average teacher salary in dollars
Infant mortality per 1000 live births
Live births per 1000 15-19 year old females
Violent crime per 100000 population
Property crime per 100000 population
State has death penalty?
Number of executions 1977-2003
Percent of population below the poverty level
Percent unemployed (of population 16 years or older)
Percent uninsured (3 year average)
Median household income in 1998 dollars
Average hourly earnings of production workers in manufacturing
Deaths by heart disease per 100000 population
Deaths by motor vehicle accidents per 100000 population
Home ownership rate
United States Census Bureau https://www.census.gov/
Stem and leaf plot. Will accept a factor variable as a second argument to create stem plots for each of the levels.
stemPlot(x, ...) ## Default S3 method: stemPlot(x, grpvar = NULL, varname = NULL, grpvarname = NULL, ...) ## S3 method for class 'formula' stemPlot(formula, data = parent.frame(), subset, ...)
x |
a numeric variable. |
... |
further arguments to be passed to or from methods. |
grpvar |
a factor variable. A stem plot of |
varname |
name of the numeric variable. This is for printing the output only. Change if you want to print out a name different from the actual variable name. |
grpvarname |
name of the factor variable. This is for printing the output only. Change if you want to print out a name different from the actual variable name. |
formula |
a formula of the form |
data |
a data frame with the variables in the formula. |
subset |
an optional expression specifying which observations to keep. |
This command is an enhanced version of R's stem command. It allows the user to create the stem plot for a numeric variable grouped by the levels of a factor variable.
stemPlot(default): Stem and leaf plot
stemPlot(formula): Stem and leaf plot
Laura Chihara
stemPlot(states03$Births, states03$Region) stemPlot(Births ~ Region, data = states03)
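The grouping behavior described above can be approximated with base R's `stem()` by splitting the numeric variable on the factor. This is a minimal sketch, not the package's code; it uses the built-in ToothGrowth data so that it runs without the package, and the name `groups` is illustrative.

```r
# Sketch only: one base-R stem plot per level of a grouping factor.
groups <- split(ToothGrowth$len, ToothGrowth$supp)   # list with one vector per level

for (level in names(groups)) {
  cat("supp =", level, "\n")     # label each plot with its factor level
  stem(groups[[level]])          # base R stem-and-leaf display
}
```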
carlboot object
Print summary statistics and confidence intervals, if desired, for a carlboot object.
## S3 method for class 'carlboot' summary(object, ...) ## S3 method for class 'carlperm' summary(object, ...)
object |
The carlboot or carlperm object to summarize. |
... |
not used |
boot_dist <- boot(ToothGrowth$len, ToothGrowth$supp, B = 1000) summary(boot_dist) perm_dist <- permTest(states03$ViolentCrime, states03$DeathPenalty, B = 999) summary(perm_dist)
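To see what kind of quantities such a summary reports, a grouped bootstrap can be written out by hand in base R. This sketch assumes a difference in means as the statistic and a percentile interval at the 95% level; whether the package computes its interval this way is an assumption, and the names `boot_diffs`, `observed`, `se` and `ci` are illustrative.

```r
# Sketch only: bootstrap the difference in group means and summarize it.
set.seed(42)                                   # arbitrary seed, for reproducibility
x <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
y <- ToothGrowth$len[ToothGrowth$supp == "VC"]

B <- 1000
# Resample each group with replacement and record the difference in means
boot_diffs <- replicate(B, mean(sample(x, replace = TRUE)) -
                           mean(sample(y, replace = TRUE)))

observed <- mean(x) - mean(y)                  # statistic on the original data
se       <- sd(boot_diffs)                     # bootstrap standard error
ci       <- quantile(boot_diffs, c(0.025, 0.975))  # percentile interval (assumed method)

print(c(observed = observed, se = se))
print(ci)
```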