An R Introduction to Statistics

GPU Computing with R

Tutorials on GPU computing with R

Bayesian Classification with Gaussian Process

  Despite the prowess of the support vector machine, it is not specifically designed to extract features relevant to the prediction. For example, in network intrusion detection, we need to learn which network statistics are relevant to the network defense. In consumer credit rating, we would like to determine which financial records are relevant to the credit score. In medical genetics research, we aim to identify the genes relevant to an illness.

Significance Test for Kendall's Tau-b

  A variation of the standard definition of the Kendall correlation coefficient is necessary in order to deal with data samples with tied ranks. It is known as the Kendall’s tau-b coefficient and is more effective in determining whether two non-parametric data samples with ties are correlated.
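
  As a minimal sketch of how tied ranks enter the coefficient, the base R code below computes tau-b from its pairwise definition: the number of concordant pairs minus the number of discordant pairs, divided by the geometric mean of the pair counts not tied in x and not tied in y. The helper name kendall_tau_b and the sample vectors are hypothetical, chosen only for illustration. The call to cor.test uses exact = FALSE because an exact p-value cannot be computed when ties are present.

```r
# Hypothetical helper: Kendall's tau-b from the pairwise definition.
kendall_tau_b <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) >= 2)
  idx <- combn(length(x), 2)               # all index pairs (i, j) with i < j
  sx <- sign(x[idx[1, ]] - x[idx[2, ]])    # 0 marks a pair tied in x
  sy <- sign(y[idx[1, ]] - y[idx[2, ]])    # 0 marks a pair tied in y
  s <- sum(sx * sy)                        # concordant minus discordant pairs
  n_x <- sum(sx != 0)                      # pairs not tied in x
  n_y <- sum(sy != 0)                      # pairs not tied in y
  s / sqrt(n_x * n_y)
}

# Illustrative data with one tie in each sample
x <- c(1, 2, 2, 3)
y <- c(1, 2, 3, 3)

tau_b <- kendall_tau_b(x, y)

# Significance test in base R; with ties, the p-value relies on a
# normal approximation rather than the exact distribution.
test <- cor.test(x, y, method = "kendall", exact = FALSE)
print(tau_b)
print(test$p.value)
```

  With only four observations the test has very little power; the sample is kept small so that the pair counts can be checked by hand.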

Support Vector Machine with GPU, Part II

  In our last tutorial on SVM training with GPU, we mentioned the necessary steps of pre-scaling the data with rpusvm-scale and reversing the scaling of the prediction outcome. This cumbersome procedure is now simplified with the latest RPUSVM.

Hierarchical Cluster Analysis

  With the distance matrix found in the previous tutorial, we can use various techniques of cluster analysis for relationship discovery. For example, in the data set mtcars, we can pass the distance matrix to hclust, and plot a dendrogram that displays a hierarchical relationship among the vehicles.
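
  The mtcars example can be sketched in base R as follows. Here the distance matrix comes from the CPU function dist with its default Euclidean metric, and hclust runs with its default linkage method; both defaults are assumptions made for this illustration and may differ from the settings used in the full tutorial.

```r
# Distance matrix of the mtcars observations (CPU, Euclidean by default)
d <- dist(as.matrix(mtcars))

# Agglomerative hierarchical clustering on the distance matrix
hc <- hclust(d)

# Dendrogram showing the hierarchical relationship among the vehicles
plot(hc)
```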

Installing CUDA Toolkit 5.0 on Fedora 16 Linux

A discussion on how to install CUDA Toolkit on Fedora Linux.

Installing CUDA Toolkit 5.0 on Ubuntu 11.10 Linux

A discussion on how to install CUDA Toolkit on Ubuntu Linux.

Support Vector Machine with GPU

  Most elementary statistical inference algorithms assume that the data can be modeled by linear parameters with a normally distributed error component. A class of algorithms called support vector machines (SVM) removes this constraint.

Kendall Rank Coefficient

  The correlation coefficient is a measure of the correlation between two random variables. While its computation is straightforward, it is not readily applicable to non-parametric statistics.

Installing GPU Packages

A tutorial on how to install the rpud package and the rpudplus add-on in R.

Distance Matrix by GPU

A comparison of computing the distance matrix on the CPU with the dist function in core R, and on the GPU with rpuDist in rpud.
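
A rough sketch of such a comparison is shown below. The random test matrix and its size are arbitrary choices for illustration. The GPU branch runs only if the rpud package is installed, and it assumes that rpuDist accepts a numeric matrix in the same way as dist; check the package documentation for the exact arguments.

```r
# Illustrative data: 200 random points in 10 dimensions
set.seed(1)
m <- matrix(runif(200 * 10), nrow = 200)

# CPU: distance matrix with dist from core R
cpu_time <- system.time(d_cpu <- dist(m))

if (requireNamespace("rpud", quietly = TRUE)) {
  # GPU: assumes rpuDist takes the matrix directly, as dist does
  gpu_time <- system.time(d_gpu <- rpud::rpuDist(m))
  print(rbind(cpu = cpu_time, gpu = gpu_time))
} else {
  # rpud is not installed, so only the CPU timing is available
  print(cpu_time)
}
```

On a matrix this small the timings are dominated by overhead; a much larger input is needed before any difference between CPU and GPU becomes meaningful.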