Add a simple NaiveBayes model #80
Comments
I still do not want to include functions that are unrelated to hashing a matrix. However, there is a compromise that would put NaiveBayes into the package. I did include an ML algorithm, FTPRL, in this package, but it is not an official function of the package; it is there for the vignette. So I suggest you implement NaiveBayes the way I did FTPRL and write a vignette showing how to do an analysis with both FeatureHashing and NaiveBayes. What do you think? Best,
My concern is that vignette content is not well indexed by most R documentation search engines. Another solution would be to put the two algorithms in a kind of FeatureHashing "accessory" package, as many packages do for datasets. IMO Naive Bayes is too simple to deserve its own package, but together with FTPRL it would make sense... However, it means maintaining two packages, which may not be such a good idea. Btw, I have no strong feelings against putting NB in a vignette either :-) Kind regards,
I agree with your point that Naive Bayes is too simple to deserve its own package. However, writing it up in a vignette makes sense to me. Advanced users can write their own implementation. Beginners need a vignette to understand why and when they should use Naive Bayes with FeatureHashing. I think we could put the implementation under Wush
It sounds ok to me.
@wush978, I understand that this package has a very specific purpose: hashing of a matrix.
It is not a general ML package at all, and should not try to mimic caret or scikit-learn.
However, it will mainly be used in an ML context.
I was thinking of adding a simple NaiveBayes function to the package.
The idea is to provide a very intuitive model with very few parameters, which many people could use as a baseline for their tests.
I have noticed that none of the existing packages supports `dgCMatrix` matrices, and even if you convert to the spam format, which is sometimes supported, the matrix is converted to a dense data.frame before computation! So you almost always end up with a memory allocation error. I have already written basic code for sparse matrices that works well on `dgCMatrix` matrices. Moreover, this basic model could be used in the examples instead of a more advanced one like XGBoost.
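To give an idea of what such a model could look like, here is a minimal sketch of a multinomial Naive Bayes that keeps the input sparse throughout. It is an illustration only, not the code mentioned above: the function names `sparse_nb_fit` and `sparse_nb_predict` and the `alpha` smoothing parameter are assumptions made for this example, and it relies only on the Matrix package.

```r
library(Matrix)

# Hypothetical helper: fit a multinomial Naive Bayes model.
# x:     dgCMatrix, rows = observations, columns = features (non-negative values)
# y:     class labels, one per row of x
# alpha: Laplace smoothing constant (assumed default of 1)
sparse_nb_fit <- function(x, y, alpha = 1) {
  y <- as.factor(y)
  # Sparse class-indicator matrix (classes x observations)
  ind <- sparseMatrix(i = as.integer(y), j = seq_along(y), x = 1,
                      dims = c(nlevels(y), length(y)))
  # Per-class feature totals via a sparse product; only this small
  # classes x features result is made dense, never x itself
  counts <- as.matrix(ind %*% x)
  smoothed <- counts + alpha
  list(
    levels    = levels(y),
    log_prior = log(as.vector(table(y)) / length(y)),
    log_lik   = log(smoothed / rowSums(smoothed))
  )
}

# Hypothetical helper: predict the most likely class for each row of x.
sparse_nb_predict <- function(model, x) {
  # Sparse-by-dense product gives an observations x classes score matrix
  scores <- as.matrix(x %*% t(model$log_lik))
  scores <- sweep(scores, 2, model$log_prior, "+")
  best <- max.col(scores, ties.method = "first")
  factor(model$levels[best], levels = model$levels)
}

# Toy data: 4 observations, 4 features, 2 classes
x <- sparseMatrix(
  i = c(1, 1, 2, 3, 3, 4),
  j = c(1, 2, 1, 3, 4, 3),
  x = c(2, 1, 1, 3, 1, 2),
  dims = c(4, 4)
)
y <- factor(c("a", "a", "b", "b"))

model <- sparse_nb_fit(x, y)
pred  <- sparse_nb_predict(model, x)
```

The only dense objects here have one row or column per class, so memory use should scale with the number of non-zero entries in `x` rather than with its full dimensions.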
Can you tell me if you want me to make a PR?
Kind regards,
Michael