This notebook compares machine learning models on the ILPD (Indian Liver Patient Dataset), which contains 10 variables: age, gender, total bilirubin, direct bilirubin, total proteins, albumin, A/G ratio, SGPT, SGOT, and Alkphos.
This notebook implements the following machine learning methods:
- Logistic Regression
- Decision Trees
- Random Forest
- Neural Networks: we will use a multi-layer perceptron.
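As a rough sketch of how the four methods above could be compared side by side, the snippet below fits each one with scikit-learn and reports test accuracy. It assumes scikit-learn as the library, and it uses a synthetic data set from `make_classification` as a stand-in for ILPD. The hyperparameters (`max_depth`, `n_estimators`, `hidden_layer_sizes`, and so on) are illustrative choices, not values taken from this notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for ILPD (assumption): 583 rows, 10 features, imbalanced classes.
X, y = make_classification(
    n_samples=583, n_features=10, weights=[0.29, 0.71], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Logistic regression and the MLP are scale-sensitive, so they get a scaler.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "Neural Network (MLP)": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
    ),
}

# Fit every model on the same split and record its test accuracy.
accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracy[name] = model.score(X_test, y_test)
    print(f"{name}: {accuracy[name]:.3f}")
```

Keeping the models in one dictionary means every method is trained and scored on the same split, which makes the comparison fair.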
Response Variable: field used to split the data into two sets (patient with liver disease (value = 1), or no disease (value = 2))
This data set contains 416 liver patient records and 167 non-liver patient records. The data set was collected from the north east of Andhra Pradesh, India. Selector is a class label used to divide the records into groups (liver patient or not). The data set contains 441 male patient records and 142 female patient records. Any patient whose age exceeded 89 is listed as being of age "90".
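The record counts above imply that the classes are imbalanced, which matters when reading accuracy figures later. A minimal sketch of the arithmetic, using only the counts quoted in the description:

```python
# Record counts quoted in the data set description.
liver, non_liver = 416, 167
male, female = 441, 142

total = liver + non_liver
assert total == male + female  # both splits cover the same set of records

# Accuracy of a trivial model that always predicts the majority class.
majority_baseline = max(liver, non_liver) / total

print(f"records: {total}")
print(f"liver patients: {liver / total:.1%}")
print(f"majority-class baseline accuracy: {majority_baseline:.3f}")
print(f"male records: {male / total:.1%}")
```

A classifier that always predicts "liver patient" would already be right on roughly 71% of the records, so any model in the comparison should be judged against that baseline and not against 50%.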
- Age: Age of the patient
- Gender: Gender of the patient
- TB: Total Bilirubin
- DB: Direct Bilirubin
- Alkphos: Alkaline Phosphatase
- Sgpt: Alanine Aminotransferase
- Sgot: Aspartate Aminotransferase
- TP: Total Proteins
- ALB: Albumin
- A/G Ratio: Albumin and Globulin Ratio
- Selector: field used to split the data into two sets (labeled by the experts)
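The attribute list above can be turned into column names when loading the data. The sketch below assumes pandas, a headerless CSV whose columns appear in the listed order, and a numeric 0/1 encoding for `Gender`. The two rows are made-up values in that layout, used only so the snippet runs without the real file, and `AG_Ratio` is a hypothetical column name chosen to avoid the slash in "A/G Ratio".

```python
import io

import pandas as pd

# Column names follow the attribute list; "AG_Ratio" stands for "A/G Ratio".
COLUMNS = [
    "Age", "Gender", "TB", "DB", "Alkphos", "Sgpt",
    "Sgot", "TP", "ALB", "AG_Ratio", "Selector",
]

# Illustrative rows only (made-up values), standing in for the real file.
sample_csv = io.StringIO(
    "45,Male,1.2,0.4,210,35,40,6.9,3.4,0.95,1\n"
    "38,Female,0.8,0.2,180,20,22,7.1,3.9,1.2,2\n"
)

# header=None because the file is assumed to have no header row.
df = pd.read_csv(sample_csv, header=None, names=COLUMNS)

# Encode the only categorical feature as 0/1 so every model can use it.
df["Gender"] = df["Gender"].map({"Female": 0, "Male": 1})

# Separate the ten predictors from the class label.
X = df.drop(columns="Selector")
y = df["Selector"]
print(X.shape, y.tolist())
```

To load the actual data, `sample_csv` would be replaced by the path to the downloaded file.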
- Bendi Venkata Ramana, Prof. M. S. Prasad Babu and Prof. N. B. Venkateswarlu, “A Critical Comparative Study of Liver Patients from USA and INDIA: An Exploratory Analysis”, International Journal of Computer Science Issues, ISSN: 1694-0784, May 2012.
- Bendi Venkata Ramana, Prof. M. S. Prasad Babu and Prof. N. B. Venkateswarlu, “A Critical Study of Selected Classification Algorithms for Liver Disease Diagnosis”, International Journal of Database Management Systems (IJDMS), Vol. 3, No. 2, ISSN: 0975-5705, pp. 101-114, May 2011.