tayfuntuna/PracticalMachineLearning

Welcome to the PracticalMachineLearning wiki!

1. I loaded the libraries and the data files (downloaded manually beforehand) and tried to train a random forest (RF)

library(AppliedPredictiveModeling)
library(caret)
library(randomForest)
# Forward slashes avoid backslash escapes inside R string literals.
trainData <- read.csv("D:/aa/pml-training.csv", na.strings = c("NA", ""), header = TRUE)
testData <- read.csv("D:/aa/pml-testing.csv", na.strings = c("NA", ""), header = TRUE)
# Name the outcome by column (classe) so that `.` expands to the other columns only.
yModel <- train(classe ~ ., method = "rf", trControl = trainControl(method = "cv", number = 10), data = trainData, importance = TRUE)

I got this error: Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels
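The message says that some factor column in the formula has fewer than two levels. A minimal sketch for locating such columns is shown here on a small made-up data frame; `toy` and its column names are hypothetical and are not taken from the pml files.

```r
# Hypothetical toy data: `constant_col` has a single level, as can happen
# when a column holds one repeated value plus NA.
toy <- data.frame(
  classe       = factor(c("A", "B", "A", "B")),
  roll         = c(1.2, 3.4, 2.2, 0.7),
  constant_col = factor(c("x", NA, "x", "x"))
)

# Count the distinct non-NA values in each column.
n_distinct <- sapply(toy, function(col) length(unique(col[!is.na(col)])))

# Columns with fewer than 2 distinct values cannot be given contrasts.
single_level_cols <- names(n_distinct)[n_distinct < 2]
print(single_level_cols)
```

Running the same check on `trainData` would be one way to see which columns are likely to trigger the error before calling `train`.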

2. I tried to remove empty rows/columns and unnecessary columns before training the RF

# na.omit() drops every row that contains at least one NA; columns are kept.
trainData <- na.omit(trainData)
yModel <- train(classe ~ ., method = "rf", trControl = trainControl(method = "cv", number = 10), data = trainData, importance = TRUE)

Same error: Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels
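One likely reason the error persists is that `na.omit` works on rows, not columns: it discards incomplete rows but keeps every column, including any problematic ones. The sketch below illustrates this on a made-up data frame (`toy` and `mostly_na` are hypothetical names).

```r
# Hypothetical toy data with one column that is mostly NA.
toy <- data.frame(
  classe    = factor(c("A", "B", "C", "A")),
  roll      = c(1.2, 3.4, 2.2, 0.7),
  mostly_na = c(NA, NA, 5.1, NA)
)

# na.omit() removes rows containing any NA; the column set is unchanged.
cleaned <- na.omit(toy)
print(dim(toy))      # rows and columns before
print(dim(cleaned))  # rows and columns after

# Fraction of missing values per column, useful for deciding which
# columns to drop instead of dropping rows.
na_fraction <- colMeans(is.na(toy))
print(na_fraction)
```

Looking at the per-column fraction of missing values first makes it easier to decide whether to remove columns rather than rows.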

3. I removed the empty rows/columns and unnecessary columns in another way before training the RF

# Count the non-NA values in each column.
validColumns <- function(x) {
  as.vector(colSums(!is.na(x)))
}
cols <- validColumns(trainData)
# Mark every column that has at least one missing value for removal.
columnsToRemove <- colnames(trainData)[cols < nrow(trainData)]
trainData <- trainData[, !(names(trainData) %in% columnsToRemove)]
# Drop the first 7 columns (treated here as unnecessary), keeping column 8 onward.
trainData <- trainData[, 8:ncol(trainData)]
yModel <- train(classe ~ ., method = "rf", trControl = trainControl(method = "cv", number = 10), data = trainData, importance = TRUE)

It ran without errors, but training took so long that I stopped waiting before I could see the result.
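A rough count of the work involved helps explain the long run time. The arithmetic below is a sketch under stated assumptions: three `mtry` candidates (caret's default grid size for `"rf"`, matching the three values in the results printed in step 4) and 500 trees per forest (the `randomForest` default for `ntree`). The variable names are illustrative only.

```r
n_folds           <- 10   # from trainControl(method = "cv", number = 10)
n_mtry_candidates <- 3    # assumed: caret's default tuning grid size for "rf"
n_trees           <- 500  # assumed: randomForest's default ntree

# One forest per fold and mtry candidate, plus one final refit on all rows.
n_forests <- n_folds * n_mtry_candidates + 1
n_trees_total <- n_forests * n_trees

print(n_forests)
print(n_trees_total)
```

Setting `importance = TRUE` adds further cost on top of this, so reducing the number of rows, folds, or trees are all plausible ways to shorten the run.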

4. I took only a portion of the training set and trained on that

# Sample 10% of the rows, stratified by classe.
inTrain <- createDataPartition(y = trainData$classe, p = 0.1, list = FALSE)
trainData <- trainData[inTrain, ]
myModel <- train(classe ~ ., method = "rf", trControl = trainControl(method = "cv", number = 10), data = trainData, importance = TRUE)
print(myModel, digits = 3)

It worked and I could see the result. The accuracy was promising, so I decided to run the model on the test data and submit.

Summary of sample sizes: 1471, 1473, 1473, 1475

Resampling results across tuning parameters:

mtry  Accuracy  Kappa  Accuracy SD  Kappa SD
2     0.926     0.906  0.0181       0.0230
27    0.926     0.906  0.0218       0.0277
52    0.918     0.896  0.0215       0.0273
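To make the Accuracy and Kappa columns concrete, here is a small sketch that computes both from a confusion matrix using the standard definitions (Cohen's kappa compares observed agreement with the agreement expected by chance). The `truth` and `pred` vectors are made up for illustration and are not results from this model.

```r
# Hypothetical labels for 8 cases.
lv    <- c("A", "B", "C")
truth <- factor(c("A", "A", "A", "B", "B", "C", "C", "C"), levels = lv)
pred  <- factor(c("A", "A", "B", "B", "B", "C", "C", "A"), levels = lv)

# Rows are predictions, columns are true classes.
conf <- table(pred, truth)

# Accuracy: share of cases on the diagonal.
accuracy <- sum(diag(conf)) / sum(conf)

# Chance agreement: product of the row and column marginals per class.
expected <- sum(rowSums(conf) * colSums(conf)) / sum(conf)^2

# Cohen's kappa: agreement beyond chance, scaled to a maximum of 1.
kappa <- (accuracy - expected) / (1 - expected)

print(round(c(accuracy = accuracy, kappa = kappa), 3))
```

Kappa is lower than accuracy because part of the raw agreement would occur by chance alone.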

5. I ran my model on the test file and submitted the results

# Apply the same column removal to the test set as was applied to the training set.
testData <- testData[, !(names(testData) %in% columnsToRemove)]
testData <- testData[, 8:ncol(testData)]
predictions <- predict(myModel, newdata = testData)
print(predictions)

The output was: [1] ..... (removed for privacy) Levels: A B C D E

Only one prediction was incorrect. I resubmitted that case with the next letter of the alphabet, and it was accepted. (RStudio screenshot)
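The test set has to carry the same predictor columns as the training set. Selecting those columns by name, rather than by position, is one way to guard against a mismatch. The sketch below uses made-up data frames (`train_toy`, `test_toy`, and their columns are hypothetical).

```r
# Hypothetical training and test frames with partly different columns.
train_toy <- data.frame(
  roll   = c(1.2, 3.4),
  pitch  = c(0.1, 0.2),
  classe = factor(c("A", "B"))
)
test_toy <- data.frame(
  X          = 1:2,
  roll       = c(2.0, 1.1),
  pitch      = c(0.3, 0.4),
  problem_id = 1:2
)

# Predictors are all training columns except the outcome.
predictor_names <- setdiff(names(train_toy), "classe")

# Stop early if the test set lacks a predictor the model was trained on.
missing_in_test <- setdiff(predictor_names, names(test_toy))
stopifnot(length(missing_in_test) == 0)

# Keep only the predictors, in the same order as in training.
test_aligned <- test_toy[, predictor_names, drop = FALSE]
print(names(test_aligned))
```

With the real data, `predictor_names` would come from the cleaned `trainData`, and `test_aligned` would be passed to `predict` as `newdata`.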
