# Introduction

The objective of this analysis is to determine the relationship between maximal expiratory pressure, a measure of the strength of the abdominal muscles and other expiratory muscles, and several other variables related largely to body size and lung function in these patients. We will do so by applying Principal Components Analysis (PCA), Sparse PCA, and Sliced Inverse Regression to a dataset and comparing the loadings. Then we will walk through fitting Principal Components Regression, Ordinary Least Squares based on SIR directions, Partial Least Squares, and Sparse Partial Least Squares (SPLS), and compare their (predictive) performance with Best Subset Selection, the Lasso, and Ridge Regression.

# Data

The data set we will be using is called ‘cystfibr’ and is from a study of 25 patients aged 7-23 years with cystic fibrosis. It is a part of the ISwR library and the data frame contains the following columns:

age: age in years

sex: 0 male, 1 female

height: height (cm)

weight: weight (kg)

bmp: body mass (% of normal)

fev1: forced expiratory volume

rv: residual volume

frc: functional residual capacity

tlc: total lung capacity

pemax: maximum expiratory pressure

The cystfibr data frame has 25 rows and 10 columns. It contains lung function data for cystic fibrosis patients (7-23 years old).

# Exploratory Data Analysis

Let us import the data set from the ISwR library:

library(ISwR)

Let us take a look at our data set:

summary(cystfibr)
##       age             sex           height          weight
##  Min.   : 7.00   Min.   :0.00   Min.   :109.0   Min.   :12.9
##  1st Qu.:11.00   1st Qu.:0.00   1st Qu.:139.0   1st Qu.:25.1
##  Median :14.00   Median :0.00   Median :156.0   Median :37.2
##  Mean   :14.48   Mean   :0.44   Mean   :152.8   Mean   :38.4
##  3rd Qu.:17.00   3rd Qu.:1.00   3rd Qu.:174.0   3rd Qu.:51.1
##  Max.   :23.00   Max.   :1.00   Max.   :180.0   Max.   :73.8
##       bmp             fev1             rv             frc
##  Min.   :64.00   Min.   :18.00   Min.   :158.0   Min.   :104.0
##  1st Qu.:68.00   1st Qu.:26.00   1st Qu.:188.0   1st Qu.:127.0
##  Median :71.00   Median :33.00   Median :225.0   Median :139.0
##  Mean   :78.28   Mean   :34.72   Mean   :255.2   Mean   :155.4
##  3rd Qu.:90.00   3rd Qu.:44.00   3rd Qu.:305.0   3rd Qu.:183.0
##  Max.   :97.00   Max.   :57.00   Max.   :449.0   Max.   :268.0
##       tlc          pemax
##  Min.   : 81   Min.   : 65.0
##  1st Qu.:101   1st Qu.: 85.0
##  Median :113   Median : 95.0
##  Mean   :114   Mean   :109.1
##  3rd Qu.:128   3rd Qu.:130.0
##  Max.   :147   Max.   :195.0

We see that there are no NAs for any of the variables. In addition, the five-number summaries do not suggest any obvious outliers.
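To go beyond eyeballing the summary, a 1.5 × IQR rule (the same rule boxplot() uses for points beyond the whiskers) can flag candidate outliers. The helper below is a hypothetical sketch, not part of the original analysis, shown on toy data; on cystfibr one would run sapply(cystfibr, flag_outliers).

```r
# Hypothetical helper: flag values more than 1.5 * IQR beyond the quartiles,
# the same rule boxplot() uses to draw points past the whiskers.
flag_outliers <- function(x) {
  q <- quantile(x, c(0.25, 0.75))
  iqr <- q[2] - q[1]
  x < q[1] - 1.5 * iqr | x > q[2] + 1.5 * iqr
}

# Toy example: only the extreme value is flagged.
flag_outliers(c(10, 12, 11, 13, 95))
## [1] FALSE FALSE FALSE FALSE  TRUE
```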

What is the structure of our data:

str(cystfibr)
## 'data.frame':    25 obs. of  10 variables:
##  $ age   : int  7 7 8 8 8 9 11 12 12 13 ...
##  $ sex   : int  0 1 0 1 0 0 1 1 0 1 ...
##  $ height: int  109 112 124 125 127 130 139 150 146 155 ...
##  $ weight: num  13.1 12.9 14.1 16.2 21.5 17.5 30.7 28.4 25.1 31.5 ...
##  $ bmp   : int  68 65 64 67 93 68 89 69 67 68 ...
##  $ fev1  : int  32 19 22 41 52 44 28 18 24 23 ...
##  $ rv    : int  258 449 441 234 202 308 305 369 312 413 ...
##  $ frc   : int  183 245 268 146 131 155 179 198 194 225 ...
##  $ tlc   : int  137 134 147 124 104 118 119 103 128 136 ...
##  $ pemax : int  95 85 100 85 95 80 65 110 70 95 ...

We have 25 observations and 10 variables.

# Analysis

We will first perform PCA but we will need to exclude the response variable pemax.

cystfibr2 = cystfibr[c(1:9)]

Let us see if our data needs to be centered and scaled by examining the means and variances of each variable.

apply(cystfibr, 2, mean)
##     age     sex  height  weight     bmp    fev1      rv     frc     tlc
##  14.480   0.440 152.800  38.404  78.280  34.720 255.200 155.400 114.000
##   pemax
## 109.120
apply(cystfibr, 2, var)
##          age          sex       height       weight          bmp
##   25.5933333    0.2566667  462.2500000  320.3429000  144.1266667
##         fev1           rv          frc          tlc        pemax
##  125.3766667 7398.9166667 1911.3333333  287.9166667 1118.0266667

It looks like both the means and the variances vary a lot among the variables, so the data would normally need to be centered and scaled. However, since we are comparing the loadings to SPCA, we will not standardize them here.
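For reference, centering and scaling in R is a one-line call to scale(); a minimal sketch on a toy matrix (not applied to cystfibr here, since we deliberately leave it unstandardized):

```r
# scale() centers each column (subtracts its mean) and scales it
# (divides by its standard deviation).
m <- matrix(c(1, 2, 3, 10, 20, 30), ncol = 2)
m.std <- scale(m)

colMeans(m.std)       # each column now has mean 0
apply(m.std, 2, sd)   # and standard deviation 1
```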

## PCA

We now perform principal components analysis using the prcomp() function, which is one of several functions in R that perform PCA.

pca = prcomp(cystfibr2, scale = FALSE)

pca$rotation
##                 PC1          PC2          PC3           PC4         PC5
## age     0.031498125  0.162167605 -0.019191699  0.0531173158 -0.04009464
## sex    -0.001341311  0.001216805  0.005088485 -0.0008596744 -0.01344699
## height  0.136851862  0.715140996 -0.188024665  0.1882259388 -0.23208225
## weight  0.121319041  0.526536249 -0.308001233 -0.0910015795  0.18856161
## bmp     0.071014909  0.078589166 -0.291137250 -0.5449591468  0.36629408
## fev1    0.077727137 -0.028177612  0.042128161 -0.2143474148  0.66693772
## rv     -0.874080901  0.321415168  0.300717070 -0.0022861672  0.18507319
## frc    -0.422264176 -0.219654367 -0.695184739 -0.2781576842 -0.33457960
## tlc    -0.110228154 -0.162810122 -0.457317716  0.7301816969  0.42851029
##                 PC6          PC7          PC8          PC9
## age     0.001562442 -0.333651057  0.917039439 -0.125001531
## sex     0.047514196  0.001078915 -0.135177913 -0.989574371
## height -0.271821645  0.519734234  0.031493463 -0.014069684
## weight  0.117755863 -0.668054322 -0.327196098  0.046037307
## bmp     0.534703619  0.398115211  0.168687869 -0.002936077
## fev1   -0.702883403  0.036846699  0.052600008 -0.049693842
## rv      0.085552698  0.025196202 -0.002581493  0.005101294
## frc    -0.312281375 -0.051700351  0.019789740 -0.016238121
## tlc     0.158727916  0.095364465  0.033103142 -0.005656267

Plot of the components:

biplot(pca, scale = 0)

Using the summary function, we can see the proportion of variance explained by each component and the cumulative variance explained:

summary(pca)
## Importance of components:
##                            PC1      PC2      PC3      PC4     PC5     PC6
## Standard deviation     97.8464 22.36648 17.92796 11.67362 9.74219 6.04798
## Proportion of Variance  0.8968  0.04686  0.03011  0.01276 0.00889 0.00343
## Cumulative Proportion   0.8968  0.94362  0.97372  0.98649 0.99538 0.99881
##                            PC7     PC8     PC9
## Standard deviation     3.38355 1.09428 0.33308
## Proportion of Variance 0.00107 0.00011 0.00001
## Cumulative Proportion  0.99988 0.99999 1.00000

It seems that 2 components do a great job of explaining a significant amount (94.4%) of the variation.
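The "Proportion of Variance" row of summary(pca) can be reproduced by hand: squaring pca$sdev gives each component's variance, and dividing by the total gives the proportion explained. A self-contained sketch using the built-in USArrests data (any prcomp fit works the same way):

```r
# Proportion of variance explained, computed directly from prcomp() output.
pca.toy <- prcomp(USArrests, scale = TRUE)
pve <- pca.toy$sdev^2 / sum(pca.toy$sdev^2)

round(pve, 4)          # matches the "Proportion of Variance" row of summary()
round(cumsum(pve), 4)  # matches the "Cumulative Proportion" row
```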
## Sparse PCA

Now let us perform SPCA on the same data set with the 9 predictor variables. We will need to utilize the elasticnet library.

library(elasticnet)
## Loading required package: lars
## Loaded lars 1.2

We will perform SPCA using only 3 principal components, including 3, 4, and 5 non-zero loadings in the first, second, and third components, respectively.

spca = spca(cystfibr2, K = 3, type = "predictor", sparse = "varnum", para = c(3, 4, 5))
spca$loadings
##                PC1         PC2         PC3
## age     0.00000000 0.000000000  0.00000000
## sex     0.00000000 0.000000000  0.00000000
## height  0.00000000 0.733126213  0.05697345
## weight  0.00000000 0.665813010  0.00000000
## bmp     0.00000000 0.138344180 -0.24381875
## fev1    0.00000000 0.000000000  0.00000000
## rv     -0.93549166 0.008937508  0.31811819
## frc    -0.35153603 0.000000000 -0.76860451
## tlc    -0.03574612 0.000000000 -0.49533259
spca$pev
## [1] 0.84624458 0.04468700 0.03166664

## Sliced Inverse Regression

Now let us fit an SIR model to our data and see how this dimension reduction technique compares to PCA and SPCA. We will need to use the dr library.

library(dr)
## Loading required package: MASS

We will use the original data set that contains our response variable pemax.

sir = dr(pemax~., data = cystfibr)
summary(sir)
##
## Call:
## dr(formula = pemax ~ ., data = cystfibr)
##
## Method:
## sir with 9 slices, n = 25.
##
## Slice Sizes:
## 2 2 4 5 2 2 2 2 4
##
## Estimated Basis Vectors for Central Subspace:
##             Dir1     Dir2      Dir3     Dir4
## age     0.522790  0.13917 -0.531206  0.76443
## sex    -0.679561 -0.97698 -0.825360 -0.31549
## height  0.126367  0.05554  0.000646 -0.39742
## weight -0.424208 -0.04995  0.159870  0.18742
## bmp     0.246745  0.03737 -0.065381 -0.02942
## fev1   -0.061155  0.11955 -0.064642  0.26879
## rv     -0.020168 -0.01347  0.002343  0.03810
## frc     0.062581  0.06497 -0.006314  0.03418
## tlc    -0.004537  0.02182 -0.050434 -0.21749
##
##               Dir1   Dir2   Dir3   Dir4
## Eigenvalues 0.8253 0.7382 0.6084 0.3913
## R^2(OLS|dr) 0.7819 0.9345 0.9346 0.9397
##
## Large-sample Marginal Dimension Tests:
##              Stat df p.value
## 0D vs >= 1D 72.94 72  0.4469
## 1D vs >= 2D 52.31 56  0.6155
## 2D vs >= 3D 33.85 42  0.8106
## 3D vs >= 4D 18.64 30  0.9472

Now let us view the loadings, or in this case the eigenvectors:

sir.evectors = sir$evectors
sir.evectors
##                Dir1        Dir2          Dir3        Dir4         Dir5
## age     0.522789883  0.13916550 -0.5312061369  0.76443327  0.124255045
## sex    -0.679560934 -0.97697688 -0.8253598603 -0.31549008 -0.982604183
## height  0.126366947  0.05554471  0.0006460316 -0.39742335  0.029900283
## weight -0.424207544 -0.04995398  0.1598702513  0.18741860 -0.071722831
## bmp     0.246745406  0.03736733 -0.0653813858 -0.02941728  0.091509179
## fev1   -0.061155224  0.11955479 -0.0646415518  0.26879368 -0.051821942
## rv     -0.020168151 -0.01346782  0.0023431571  0.03809973  0.022407252
## frc     0.062581459  0.06497255 -0.0063135937  0.03418011 -0.037352849
## tlc    -0.004536585  0.02181833 -0.0504343009 -0.21748896  0.007399244
##               Dir6        Dir7        Dir8          Dir9
## age    -0.06266279 -0.28583693  0.39600027 -0.0046510466
## sex     0.96497115 -0.92752570  0.89962738 -0.9992979127
## height  0.12483776 -0.08743813  0.03279579 -0.0058666896
## weight -0.17368912  0.12791833 -0.15141353 -0.0146497012
## bmp     0.08238622  0.10318285  0.07818079  0.0038776186
## fev1    0.10578833 -0.13001849  0.03988550 -0.0299053464
## rv      0.01117421  0.01665266 -0.01469515  0.0006603060
## frc    -0.01625916 -0.06830404  0.03604349 -0.0149395486
## tlc     0.02799616  0.03860755 -0.02519067  0.0001607534

# Comparison

Now that we have the loadings for PCA, SPCA, and SIR - let us compare the loadings of the three methods for the first 3 components:

pca$rotation[,1:3]
##                 PC1          PC2          PC3
## age     0.031498125  0.162167605 -0.019191699
## sex    -0.001341311  0.001216805  0.005088485
## height  0.136851862  0.715140996 -0.188024665
## weight  0.121319041  0.526536249 -0.308001233
## bmp     0.071014909  0.078589166 -0.291137250
## fev1    0.077727137 -0.028177612  0.042128161
## rv     -0.874080901  0.321415168  0.300717070
## frc    -0.422264176 -0.219654367 -0.695184739
## tlc    -0.110228154 -0.162810122 -0.457317716
spca$loadings
##                PC1         PC2         PC3
## age     0.00000000 0.000000000  0.00000000
## sex     0.00000000 0.000000000  0.00000000
## height  0.00000000 0.733126213  0.05697345
## weight  0.00000000 0.665813010  0.00000000
## bmp     0.00000000 0.138344180 -0.24381875
## fev1    0.00000000 0.000000000  0.00000000
## rv     -0.93549166 0.008937508  0.31811819
## frc    -0.35153603 0.000000000 -0.76860451
## tlc    -0.03574612 0.000000000 -0.49533259
sir.evectors[,1:3]
##                Dir1        Dir2          Dir3
## age     0.522789883  0.13916550 -0.5312061369
## sex    -0.679560934 -0.97697688 -0.8253598603
## height  0.126366947  0.05554471  0.0006460316
## weight -0.424207544 -0.04995398  0.1598702513
## bmp     0.246745406  0.03736733 -0.0653813858
## fev1   -0.061155224  0.11955479 -0.0646415518
## rv     -0.020168151 -0.01346782  0.0023431571
## frc     0.062581459  0.06497255 -0.0063135937
## tlc    -0.004536585  0.02181833 -0.0504343009

Looking at the first component of each method, we see that SPCA has six zero coefficients (age, sex, height, weight, bmp, and fev1). The PCA and SIR loadings are quite different: for age, the PCA loading is quite small while the SIR loading is large, and for sex, the PCA loading is near zero while SIR's is quite high at -0.68. The only variable with similar loadings between PCA and SIR is height. No variable has similar loadings across all three methods.

For the second component, no variable has similar loadings across all three methods. PCA and SIR are similar for age, and PCA and SPCA have similar loadings for height, weight, and bmp. SPCA and SIR do not have any similar variable loadings.

For the third component, again no variable has similar loadings across all three methods. Between PCA and SPCA, the similar loadings are for bmp, rv, frc, and tlc. SPCA and SIR do not have any similar variable loadings.
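The similarity judgments above can be made quantitative with cosine similarity between loading vectors (1 means the same direction, 0 means orthogonal). A sketch using the rounded PC1 loadings of PCA and SPCA reported above:

```r
# Cosine similarity between two loading vectors.
cos_sim <- function(a, b) sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))

# PC1 loadings (rounded) from pca$rotation and spca$loadings above.
pc1.pca  <- c(0.0315, -0.0013, 0.1369, 0.1213, 0.0710, 0.0777,
              -0.8741, -0.4223, -0.1102)
pc1.spca <- c(0, 0, 0, 0, 0, 0, -0.9355, -0.3515, -0.0357)

cos_sim(pc1.pca, pc1.spca)  # roughly 0.97: the sparse PC1 points nearly the same way
```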

# Part 2

Now we will try out principal components regression, ordinary least squares based on SIR directions, partial least squares, and sparse partial least squares (SPLS), and compare their (predictive) performance with the best subset selection, the lasso, and ridge regression.

Split our data into a test and train set:

set.seed(2)
train.size = dim(cystfibr)[1] / 2
train = sample(1:dim(cystfibr)[1], train.size)
test = -train
cystfibr.train = cystfibr[train, ]
cystfibr.test = cystfibr[test, ]

## Principal Components Regression

Let us import the library necessary for PCR:

library(pls)
##
## Attaching package: 'pls'
## The following object is masked from 'package:stats':
##
##     loadings

Now to fit our PCR model to the data:

pcr = pcr(pemax~., data = cystfibr, scale = T, validation = "CV")
validationplot(pcr, val.type="MSEP")

We see that there is a significant drop in the error after the first component of PCR. What would a reasonable number of components be? It seems to be 3.

Let us try making a prediction using PCR and with the number of components at 3:

pcr.pred = predict(pcr, cystfibr.test, ncomp=3)
mean((cystfibr.test$pemax - pcr.pred)^2)
## [1] 728.2685

We obtain a test error of 728.2685.

## Ordinary Least Squares based on SIR Directions

We first need to multiply the data matrix by the SIR directions to obtain our predictors. This is a matrix multiplication of the nine predictor columns with the first three direction vectors, keeping pemax as the response:

cystfibr.sir = data.frame(as.matrix(cystfibr[, 1:9]) %*% sir.evectors[, 1:3], pemax = cystfibr$pemax)

Split it into test and train sets:

set.seed(4)
train.size = dim(cystfibr.sir)[1] / 2
train = sample(1:dim(cystfibr.sir)[1], train.size)
test = -train
cystfibr.sir.train = cystfibr.sir[train, ]
cystfibr.sir.test = cystfibr.sir[test, ]

Now that we have our data set with the SIR directions incorporated, we can build an OLS model.

ols = lm(pemax~., data = cystfibr.sir.train)
summary(ols)
##
## Call:
## lm(formula = pemax ~ ., data = cystfibr.sir.train)
##
## Residuals:
##       5      17      14       4      20      19       3      16       8
##   2.829  -3.409 -28.540  -1.620  -9.353  10.286  23.901  21.435   5.748
##       9      18      22
##  15.641 -17.767 -19.152
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) -46.92597   28.93353  -1.622    0.246
## age           0.55341    2.70764   0.204    0.857
## sex         -77.66920   92.88606  -0.836    0.491
## height        0.75658    0.69617   1.087    0.391
## weight       -2.26885    1.65041  -1.375    0.303
## bmp          -1.62399    0.74081  -2.192    0.160
## fev1         -2.55864    1.92482  -1.329    0.315
## rv           -0.04472    0.26655  -0.168    0.882
## frc          -0.34382    0.50536  -0.680    0.566
## tlc          -0.74816    0.46767  -1.600    0.251
##
## Residual standard error: 38.86 on 2 degrees of freedom
## Multiple R-squared:  0.7579, Adjusted R-squared:  -0.3314
## F-statistic: 0.6958 on 9 and 2 DF,  p-value: 0.7127

Now let us see what our test error is:

ols.pred = predict(ols, cystfibr.sir.test)
mean((cystfibr.sir.test$pemax - ols.pred)^2)
## [1] 11255.69

We obtain a test error of 11255.69, which is much larger than our PCR test error.

## Partial Least Squares

Now let us fit PLS to our data.

pls = plsr(pemax~., data=cystfibr.train, scale=T, validation="CV")
validationplot(pls, val.type="MSEP")

According to our plot, the number of components with the least amount of error is 2.

Now let us fit a PLS model with 2 components and see what the test error would be:

pls.pred = predict(pls, cystfibr.test, ncomp=2)
mean((cystfibr.test$pemax - pls.pred)^2)
## [1] 787.2785

We obtain a test error of 787.2785, which is a little larger than PCR's.

## Sparse Partial Least Squares

Let us import the necessary library:

library(spls)
## Sparse Partial Least Squares (SPLS) Regression and
## Classification (version 2.2-2)

We will first use cross-validation to pick the optimal values of 'eta' and 'K'. This also plots a heatmap of the v-fold cross-validated mean squared prediction error and returns the optimal eta (thresholding parameter) and K (number of hidden components).

cv <- cv.spls(cystfibr.train[, c(1:9)], cystfibr.train[, 10], eta = seq(0.1, 0.9, 0.1), K = c(1:5))
## eta = 0.1
## eta = 0.2
## eta = 0.3
## eta = 0.4
## eta = 0.5
## eta = 0.6
## eta = 0.7
## eta = 0.8
## eta = 0.9
##
## Optimal parameters: eta = 0.4, K = 2

Our optimal parameters are eta = 0.4 and K = 2. Now we can fit our SPLS model using the optimal parameters.

spls = spls(cystfibr.train[, c(1:9)], cystfibr.train[, 10], eta = cv$eta.opt, K = cv$K.opt)
spls.pred = predict(spls, cystfibr.test[, c(1:9)])
mean((cystfibr.test$pemax - spls.pred)^2)
## [1] 809.5094

We see that the test error for SPLS is 809.5094.

## Best Subsets Selection

We will use a 10-fold cross-validation approach, which entails randomly dividing the dataset into 10 groups of roughly equal size. Nine of the groups are used to train the model while the remaining group is used as a test set. The process repeats until each of the 10 groups has been used as the test set exactly once.

Fit the Best Subsets Selection Model:

library(leaps)

Create the folds for our cross-validation:

k=10
set.seed(1)
folds=sample(1:k,nrow(cystfibr),replace=TRUE)
cv.errors=matrix(NA,k,9, dimnames=list(NULL, paste(1:9)))

Create a predict function for best subsets selection:

predict.regsubsets = function(object, newdata, id, ...){
  form = as.formula(object$call[[2]])
  mat = model.matrix(form, newdata)
  coefi = coef(object, id = id)
  xvars = names(coefi)
  mat[, xvars] %*% coefi
}

Create a for loop for cross-validation:

for(j in 1:k){
  best.fit = regsubsets(pemax~., data = cystfibr[folds != j, ], nvmax = 9)
  for(i in 1:9){
    pred = predict(best.fit, cystfibr[folds == j, ], id = i)
    cv.errors[j, i] = mean((cystfibr$pemax[folds == j] - pred)^2)
  }
}

This gives us a matrix of cross-validation errors for the best subset model of each size.

Now let us average the errors across folds for each model size.

mean.cv.errors=apply(cv.errors ,2,mean)
mean.cv.errors
##         1         2         3         4         5         6         7
##  841.6223  732.7934  884.7167  906.5749 1155.4890 1299.2207 1284.3679
##         8         9
## 1269.6000 1342.6793

A plot of the errors for all the best models for each number of predictors.

par(mfrow=c(1,1))
plot(mean.cv.errors ,type="b")

We see that the model with 2 predictors has the lowest cross-validation error at 732.7934. Let us see more information about this model.

reg.best=regsubsets(pemax~.,data=cystfibr ,nvmax=9)
coef(reg.best,2)
## (Intercept)      weight         bmp
##  124.829749    1.640256   -1.005393

We see that the variables that are in this model are weight and bmp.

## Ridge Regression

Next up, let us see how Ridge Regression performs.

library(glmnet)
## Loading required package: Matrix
## Loading required package: foreach
## Loaded glmnet 2.0-16

We will fit the ridge regression to the training data and also will pick lambda based on cross-validation:

train.mat = model.matrix(pemax~., data=cystfibr.train)
test.mat = model.matrix(pemax~., data=cystfibr.test)
grid = 10 ^ seq(4, -2, length=100)
mod.ridge = cv.glmnet(train.mat, cystfibr.train[, "pemax"], alpha=0, lambda=grid, thresh=1e-12)
## Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations
## per fold
lambda.best = mod.ridge$lambda.min
lambda.best
## [1] 18.73817

The best lambda is 18.73817, or ~19. Now we will predict using the best lambda.

ridge.pred = predict(mod.ridge, newx=test.mat, s=lambda.best)
mean((cystfibr.test[, "pemax"] - ridge.pred)^2)
## [1] 722.0995

Our test error (MSE) is 722.0995.

## Lasso

Lasso regression should shrink some of our coefficients to exactly zero. Let us fit the lasso to the training set.

mod.lasso = cv.glmnet(train.mat, cystfibr.train[, "pemax"], alpha=1, lambda=grid, thresh=1e-12)
## Warning: from glmnet Fortran code (error code -100); Convergence for 100th
## lambda value not reached after maxit=100000 iterations; solutions for
## larger lambdas returned
## Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations
## per fold
lambda.best = mod.lasso$lambda.min
lambda.best
## [1] 9.326033

The best lambda is 9.326033, or rounded to ~9. Now let us predict using this lambda.

lasso.pred = predict(mod.lasso, newx=test.mat, s=lambda.best)
mean((cystfibr.test[, "pemax"] - lasso.pred)^2)
## [1] 913.5076

Our test error is 913.5076, which is worse than Ridge.

Let us take a look at the coefficients:

mod.lasso = glmnet(model.matrix(pemax~., data=cystfibr), cystfibr[, "pemax"], alpha=1)
lasso.coef = predict(mod.lasso, s=lambda.best, type="coefficients")
lasso.coef
## 11 x 1 sparse Matrix of class "dgCMatrix"
##                       1
## (Intercept) 79.80908669
## (Intercept)  .
## age          0.50967601
## sex          .
## height       .
## weight       0.50342614
## bmp          .
## fev1         0.07480493
## rv           .
## frc          .
## tlc          .

How many non-zero coefficients are there?

lasso.coef[lasso.coef!=0]
## <sparse>[ <logic> ] : .M.sub.i.logical() maybe inefficient
## [1] 79.80908669  0.50967601  0.50342614  0.07480493

There are three non-zero coefficients (excluding the intercept): age, weight, and fev1.
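The sparse-matrix subsetting warning above can be avoided by working with a plain named vector; a sketch using the (rounded) coefficient values reported above:

```r
# Lasso coefficients from above as a plain named vector (intercept included).
coefs <- c("(Intercept)" = 79.809, age = 0.510, sex = 0, height = 0,
           weight = 0.503, bmp = 0, fev1 = 0.075, rv = 0, frc = 0, tlc = 0)

# Names of the selected (non-zero) predictors, excluding the intercept.
selected <- names(coefs)[coefs != 0 & names(coefs) != "(Intercept)"]
selected          # "age" "weight" "fev1"
length(selected)  # 3
```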

# Conclusion

Now that we have performed a variety of different methods on the cystfibr dataset, we can summarize the performances of the models.

We have fit a model to the cystfibr data using the following methods:
-Principal Component Regression (Test Error = 728.2685) -Ordinary Least Squares Regression using SIR directions (Test Error = 11255.69)
-Partial Least Squares (Test Error = 787.2785)
-Sparse Partial Least Squares (Test Error = 809.5094)
-Best Subsets Selection (Test Error = 732.7934)
-Ridge Regression (Test Error = 722.0995)
-Lasso Regression (Test Error = 982.6509)

To summarize, we performed a variety of methods (Best Subsets, Ridge, Lasso, PCR, PLS, SPLS, OLS with SIR directions) and, judged on test error alone, the best method was Ridge Regression, followed closely by PCR. Sparse Partial Least Squares performed slightly better than average overall. The worst method was OLS on the SIR directions. All in all, which method to use depends on the type of data you have and what you are looking to achieve; there is no one-size-fits-all method.
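For a compact side-by-side view, the test errors reported above can be collected into a single sorted vector (values copied from the sections above):

```r
# Test errors from the sections above, sorted from best to worst.
test.errors <- c(Ridge = 722.0995, PCR = 728.2685, BestSubsets = 732.7934,
                 PLS = 787.2785, SPLS = 809.5094, Lasso = 913.5076,
                 OLS.SIR = 11255.69)
sort(test.errors)
```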