An R Introduction to Statistics

Support Vector Machine with GPU, Part II

In our last tutorial on SVM training with GPU, we mentioned that the data must first be pre-scaled with rpusvm-scale, and that the scaling must then be reversed on the prediction outcome. The latest RPUSVM simplifies this cumbersome procedure.

Support Vector Machine with GPU

Most elementary statistical inference algorithms assume that the data can be modeled by linear parameters with a normally distributed error component. A newer class of algorithms called support vector machines (SVM) removes this constraint.

Kendall Rank Coefficient

The correlation coefficient is a measure of correlation between two random variables. While its computation is straightforward, it is not readily applicable to non-parametric statistics.
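
As a rough illustration of the rank-based alternative named in the heading, the sketch below computes Kendall's coefficient with the base R function cor. The choice of the mtcars columns mpg and wt is our own assumption, made only to have data at hand.

```r
# Kendall rank coefficient versus the ordinary (Pearson) correlation
# coefficient. The columns mpg and wt of the built-in mtcars data set
# are an arbitrary choice for illustration.
x <- mtcars$mpg
y <- mtcars$wt

pearson <- cor(x, y)                      # default method is "pearson"
tau     <- cor(x, y, method = "kendall")  # rank-based coefficient

# The Kendall coefficient depends only on the ordering of the values,
# so a strictly increasing transformation such as log leaves it unchanged.
tau_log <- cor(log(x), y, method = "kendall")

print(c(pearson = pearson, kendall = tau, kendall_log = tau_log))
```

Because only ranks enter the calculation, no assumption about the shape of the underlying distributions is needed.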

Hierarchical Cluster Analysis

With the distance matrix found in the previous tutorial, we can use various techniques of cluster analysis for relationship discovery. For example, in the data set mtcars, we can pass the distance matrix to hclust and plot a dendrogram that displays a hierarchical relationship among the vehicles.
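
A minimal sketch of that workflow follows. It assumes the distance matrix is the Euclidean one returned by dist with default settings; the cut into three groups at the end is our own addition for illustration, not part of the tutorial.

```r
# Distance matrix between the vehicles in mtcars (Euclidean by default)
d <- dist(as.matrix(mtcars))

# Hierarchical clustering; hclust uses complete linkage by default
hc <- hclust(d)

# Dendrogram showing the hierarchical relationship among the vehicles
plot(hc)

# Illustrative extra step (an assumption, not from the tutorial):
# cut the tree into 3 groups and count the vehicles in each
groups <- cutree(hc, k = 3)
print(table(groups))
```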

GPU Computing with R

Statistics is computationally intensive. Routine statistical tasks such as data extraction, graphical summary, and technical interpretation all require pervasive use of modern computing machinery. These tasks can benefit greatly from a parallel computing environment where extensive calculations are performed simultaneously.

Type II Error

In hypothesis testing, a type II error is the failure to reject a false null hypothesis. If β denotes the probability of a type II error, the probability of avoiding it is called the power of the hypothesis test and is given by the quantity 1 - β.
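
To make the relation between β and the power concrete, here is a small sketch for a lower-tail test of a population mean with known standard deviation. Every number in it (mu0, mu, sigma, n, alpha) is a hypothetical value chosen for illustration.

```r
# Hypothetical setting: test H0: mu >= mu0 against H1: mu < mu0
mu0   <- 10000   # mean claimed by the null hypothesis (assumed)
mu    <- 9950    # actual population mean (assumed)
sigma <- 120     # known population standard deviation (assumed)
n     <- 30      # sample size (assumed)
alpha <- 0.05    # significance level (assumed)

sem <- sigma / sqrt(n)   # standard error of the sample mean

# The null hypothesis is rejected when the sample mean falls below q
q <- qnorm(alpha, mean = mu0, sd = sem)

# Type II error: the sample mean stays above q although the true mean is mu
beta  <- pnorm(q, mean = mu, sd = sem, lower.tail = FALSE)
power <- 1 - beta

print(c(beta = beta, power = power))
```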

Multiple Linear Regression

A multiple linear regression (MLR) model that describes a dependent variable y by independent variables x1, x2, ..., xp (p > 1) is expressed by the following equation, where the numbers α and βk (k = 1, 2, ..., p) are the parameters, and ϵ is the error term.

$$y = \alpha + \sum_{k=1}^{p} \beta_k x_k + \epsilon$$
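
As a sketch of how such a model can be estimated, the code below fits an MLR with the base R function lm. Using the built-in stackloss data set, with stack.loss as y and the other three columns as x1, x2, x3, is our assumption for the example.

```r
# Fit y = alpha + sum_k beta_k * x_k + error by least squares.
# Data set and choice of variables are assumptions for illustration.
fit <- lm(stack.loss ~ Air.Flow + Water.Temp + Acid.Conc., data = stackloss)

# Estimated parameters: the intercept is alpha, the rest are the beta_k
est <- coef(fit)
print(est)

# The residuals are the estimates of the error term for each observation
res <- residuals(fit)
```

The fitted values plus the residuals reproduce the observed y exactly, which mirrors the decomposition in the equation.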

Analysis of Variance

In an experimental study, various treatments are applied to test subjects and the response data are gathered for analysis. A critical tool for carrying out the analysis is the Analysis of Variance (ANOVA). It enables a researcher to differentiate treatment results based on easily computed statistical quantities from the treatment outcome.