Welcome to the PracticalMachineLearning wiki!
library(AppliedPredictiveModeling)
library(caret)
library(randomForest)
trainData <- read.csv("D:/aa/pml-training.csv", na.strings=c("NA",""), header=TRUE)
testData <- read.csv("D:/aa/pml-testing.csv", na.strings=c("NA",""), header=TRUE)
yModel <- train(classe ~ ., method="rf", trControl=trainControl(method = "cv", number = 10), data=trainData, importance = TRUE)
I got this error:
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels
trainData <- na.omit(trainData)
yModel <- train(classe ~ ., method="rf", trControl=trainControl(method = "cv", number = 10), data=trainData, importance = TRUE)
Same error:
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels
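A likely cause of this error (an assumption, not confirmed in the log above) is that at least one categorical column is left with a single distinct value, either from the start or once incomplete rows are dropped, so R cannot build contrasts for it. The sketch below uses a hypothetical toy data frame and a hypothetical helper, `singleLevelColumns`, to show how such columns could be spotted:

```r
# Hypothetical toy data: 'window' has two levels overall,
# but only one level survives once incomplete rows are dropped.
toy <- data.frame(
  window = factor(c("yes", "no", "yes", "no")),
  sensor = c(1.2, NA, 3.4, NA),
  classe = factor(c("A", "B", "B", "A"))
)

# Hypothetical helper: names of factor/character columns
# with fewer than 2 distinct non-NA values.
singleLevelColumns <- function(df) {
  isCategorical <- vapply(df, function(col) is.factor(col) || is.character(col), logical(1))
  nDistinct <- vapply(df, function(col) length(unique(col[!is.na(col)])), integer(1))
  names(df)[isCategorical & nDistinct < 2]
}

completeRows <- na.omit(toy)

print(singleLevelColumns(toy))           # checked before dropping rows
print(singleLevelColumns(completeRows))  # checked after dropping rows
```

Running the same helper on the real training data, before and after `na.omit`, would show whether this explanation applies.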
trainColumns <- colnames(trainData)
validColumns <- function(x) {
  as.vector(apply(x, 2, function(col) sum(!is.na(col))))  # non-NA count per column
}
cols <- validColumns(trainData)
columnsToRemove <- c()
for (i in seq_along(cols)) {
  if (cols[i] < nrow(trainData)) {  # column has at least one NA
    columnsToRemove <- c(columnsToRemove, trainColumns[i])
  }
}
trainData <- trainData[, !(names(trainData) %in% columnsToRemove)]
trainData <- trainData[, 8:ncol(trainData)]  # drop the first 7 columns
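The NA-column filter above can also be written without a loop. This is a minimal sketch on a hypothetical toy data frame, using base R's `colSums` and `is.na`; the names `toy`, `naCounts`, `columnsWithNA` and `cleaned` are illustrative only:

```r
# Hypothetical toy data: columns 'b' and 'c' each contain one NA.
toy <- data.frame(
  a = c(1, 2, 3),
  b = c(NA, 5, 6),
  c = c("x", NA, "z")
)

# Count the NAs in every column in one vectorized step.
naCounts <- colSums(is.na(toy))

# Names of the columns that contain at least one NA.
columnsWithNA <- names(toy)[naCounts > 0]

# Keep only the fully populated columns.
cleaned <- toy[, naCounts == 0, drop = FALSE]

print(columnsWithNA)
print(names(cleaned))
```

Keeping the list of removed column names, as the loop version does, is still useful because the same columns have to be dropped from the test data later.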
yModel <- train(classe ~ ., method="rf", trControl=trainControl(method = "cv", number = 10), data=trainData, importance = TRUE)
This ran without the error, but it took so long that I gave up waiting before I could see the result.
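The long run time is plausible: with 10 folds and caret's default of three candidate `mtry` values, `train` fits 10 × 3 + 1 = 31 forests (the last one is the final refit), each with randomForest's default of 500 trees. One way to shorten the wait, sketched here on the built-in `iris` data as a stand-in for the real training set, is to use fewer folds, a single fixed `mtry`, and fewer trees. The specific values (3 folds, `mtry = 2`, 100 trees) and the names `quickControl`, `quickGrid` and `quickModel` are assumptions for illustration, not tuned settings:

```r
library(caret)

set.seed(1)

# Fewer folds: 3 instead of 10.
quickControl <- trainControl(method = "cv", number = 3)

# A single fixed mtry instead of the default grid of three candidates.
quickGrid <- data.frame(mtry = 2)

# ntree is passed through to randomForest; 100 instead of the default 500.
quickModel <- train(Species ~ ., data = iris, method = "rf",
                    trControl = quickControl, tuneGrid = quickGrid,
                    ntree = 100)

print(quickModel$results)
```

With these settings `train` fits 3 × 1 + 1 = 4 smaller forests, at the cost of a noisier accuracy estimate.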
inTrain <- createDataPartition(y=trainData$classe, p=0.1, list=FALSE)
trainData <- trainData[inTrain,]
myModel <- train(classe ~ ., method="rf", trControl=trainControl(method = "cv", number = 10), data=trainData, importance = TRUE)
print(myModel, digits=3)
It worked and I could see the result. The accuracy was promising, so I decided to use the test data and submit.
Summary of sample sizes: 1471, 1473, 1473, 1475
Resampling results across tuning parameters:
mtry  Accuracy  Kappa  Accuracy SD  Kappa SD
2     0.926     0.906  0.0181       0.0230
27    0.926     0.906  0.0218       0.0277
52    0.918     0.896  0.0215       0.0273
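For reference, the Accuracy and Kappa columns can be reproduced from a confusion table. This is a minimal sketch with hypothetical label vectors (`actual` and `predicted` are made up for illustration and are not taken from the model above); Kappa here is Cohen's unweighted kappa, which compares the observed agreement with the agreement expected by chance:

```r
# Hypothetical true and predicted labels.
actual    <- factor(c("A", "A", "B", "B", "C", "C"), levels = c("A", "B", "C"))
predicted <- factor(c("A", "B", "B", "B", "C", "A"), levels = c("A", "B", "C"))

# Rows are predictions, columns are true classes.
confusion <- table(predicted, actual)
n <- sum(confusion)

# Observed agreement: share of cases on the diagonal.
accuracy <- sum(diag(confusion)) / n

# Agreement expected by chance, from the row and column totals.
expectedAgreement <- sum(rowSums(confusion) * colSums(confusion)) / n^2

# Cohen's kappa: agreement beyond chance, rescaled so that 1 is perfect.
cohenKappa <- (accuracy - expectedAgreement) / (1 - expectedAgreement)

print(accuracy)
print(cohenKappa)
```

Because only 10% of the rows were used for training, the same calculation could be applied to predictions on the remaining 90% to get an out-of-sample estimate.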
testData <- testData[, !(names(testData) %in% columnsToRemove)]
testData <- testData[, 8:ncol(testData)]
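Before predicting, it can help to confirm that every predictor the model was trained on is still present in the test data. This is a minimal sketch using hypothetical column-name vectors as stand-ins for `names(trainData)` and `names(testData)`:

```r
# Hypothetical stand-ins for names(trainData) and names(testData).
trainCols <- c("roll_belt", "pitch_belt", "yaw_belt", "classe")
testCols  <- c("roll_belt", "pitch_belt", "yaw_belt", "problem_id")

# Predictors are all training columns except the outcome.
predictors <- setdiff(trainCols, "classe")

# Any predictor missing from the test data would make predict() fail.
missingInTest <- setdiff(predictors, testCols)

if (length(missingInTest) == 0) {
  message("All predictors are present in the test data.")
} else {
  message("Missing from test data: ", paste(missingInTest, collapse = ", "))
}
```

Extra columns in the test data that are not predictors do no harm, since `predict` only uses the variables named in the model.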
predictions <- predict(myModel, newdata=testData)
print(predictions)
The output was: [1] ..... (removed for privacy) Levels: A B C D E
Only one prediction was incorrect, so I resubmitted that one with the next letter of the alphabet, and it was correct.