classif-text-ining

Text Mining used for a classification task using PubMed unstructured data.

Text mining tools were used for a classification task. We considered unstructured medical data on two topics, human immunodeficiency virus (HIV) and human papillomavirus (HPV), retrieved from the National Center for Biotechnology Information (NCBI) databases using the R package RISmed. Text mining processing strategies were applied. We represented the documents as a document-term matrix and performed dimensionality reduction using information gain. Results show an accuracy of 81.3%-94.6% when predicting the class of documents.
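To make the document-term matrix and information gain steps concrete, here is a minimal sketch in plain Python. It is not the code from this repository, which is built on R and RISmed. The four toy documents, their labels, and all function names (`document_term_matrix`, `information_gain`, `select_terms`) are made up for illustration, and the sketch assumes information gain is computed on term presence versus absence, which may differ from the weighting used in the actual analysis.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def document_term_matrix(docs):
    """Return (vocabulary, matrix), where matrix[i][j] counts term j in document i."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    matrix = [[toks.count(term) for term in vocab] for toks in tokenized]
    return vocab, matrix


def information_gain(matrix, labels, col):
    """Entropy reduction from splitting documents on presence of the term in column col."""
    present = [lab for row, lab in zip(matrix, labels) if row[col] > 0]
    absent = [lab for row, lab in zip(matrix, labels) if row[col] == 0]
    n = len(labels)
    conditional = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(labels) - conditional


def select_terms(docs, labels, k):
    """Keep the k terms with the highest information gain (ties broken alphabetically)."""
    vocab, matrix = document_term_matrix(docs)
    scored = [(information_gain(matrix, labels, j), term)
              for j, term in enumerate(vocab)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:k]


# Hypothetical toy corpus standing in for PubMed abstracts.
docs = [
    "hiv antiretroviral therapy outcomes",
    "hiv viral load patients",
    "hpv vaccine cervical cancer",
    "hpv screening cervical patients",
]
labels = ["HIV", "HIV", "HPV", "HPV"]

top_terms = select_terms(docs, labels, 3)
for gain, term in top_terms:
    print(f"{term}: {gain:.3f}")
```

Terms that occur in only one class separate the two topics perfectly and receive the highest score, while a term spread evenly across both classes (such as "patients" in the toy data) scores zero and would be dropped. The reduced matrix would then be passed to a classifier.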