The purpose of this challenge is to analyze how effectively various drugs reduce tumor size in mice with a certain type of skin cancer, with a focus on the drug Capomulin. The results are displayed in several graphs that visualize the number of tracked timepoints per drug, the sex distribution of the mice, the final tumor volume at the end of treatment for the top four performing drugs, and mouse weight versus average tumor volume for the Capomulin treatment group.
This section includes a summary table showing the following tumor metrics for each drug treatment:
- mean tumor volume
- median tumor volume
- tumor volume variance
- standard deviation of tumor volume
- standard error of the mean of tumor volume
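
As a rough illustration of how such a summary table can be built, the sketch below groups a small made-up dataset by drug and aggregates the five metrics with pandas. The column names `Drug Regimen` and `Tumor Volume (mm3)` and all values are assumptions for the example, not taken from the study data.

```python
import pandas as pd

# Toy data: column names and values are assumptions for illustration only.
df = pd.DataFrame({
    "Drug Regimen": ["Capomulin"] * 3 + ["Ramicane"] * 3,
    "Tumor Volume (mm3)": [45.0, 43.0, 41.0, 45.0, 44.0, 40.0],
})

# One row per drug, one column per metric.
# "var" and "std" are the sample versions (ddof=1); "sem" is the
# standard error of the mean.
summary = (
    df.groupby("Drug Regimen")["Tumor Volume (mm3)"]
      .agg(["mean", "median", "var", "std", "sem"])
)
print(summary)
```

Passing a list of aggregation names to `agg` keeps the whole table in a single expression, which tends to be easier to read than computing each metric separately and concatenating the results.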
The bar charts created in this section both display the total number of timepoints for each drug treatment over the course of the study. One chart was created using the Pandas DataFrame.plot() method, while the other was created using Matplotlib's pyplot methods.
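
The two plotting routes can be sketched side by side as follows. This is a minimal example on made-up rows, assuming a `Drug Regimen` column in which each row is one recorded timepoint; the `Agg` backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to show windows
import matplotlib.pyplot as plt
import pandas as pd

# Toy data: each row stands for one recorded timepoint (assumed layout).
df = pd.DataFrame({
    "Drug Regimen": ["Capomulin"] * 3 + ["Ramicane"] * 2 + ["Infubinol"],
})

# Number of rows (timepoints) per drug, sorted from most to fewest.
counts = df["Drug Regimen"].value_counts()

# Route 1: the pandas plotting interface.
fig1, ax1 = plt.subplots()
counts.plot(kind="bar", ax=ax1)
ax1.set_xlabel("Drug Regimen")
ax1.set_ylabel("Number of timepoints")

# Route 2: Matplotlib's pyplot interface directly.
fig2, ax2 = plt.subplots()
ax2.bar(counts.index, counts.values)
ax2.set_xlabel("Drug Regimen")
ax2.set_ylabel("Number of timepoints")
ax2.tick_params(axis="x", rotation=90)
```

Both routes draw from the same `counts` series, so the two charts should show identical bar heights.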
In this section, the final tumor volume (that is, the volume of the tumor at the end of the 45-day treatment regimen) was calculated for the four most promising drugs: Capomulin, Ramicane, Infubinol, and Ceftamin. These data were plotted as a box plot, with any potential outliers displayed.
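
One way to obtain each mouse's final tumor volume and to flag potential outliers is sketched below. It assumes columns named `Mouse ID`, `Drug Regimen`, `Timepoint`, and `Tumor Volume (mm3)`, uses made-up values, and applies the common 1.5 × IQR rule; `iqr_outliers` is a hypothetical helper introduced for this example.

```python
import pandas as pd

# Toy data: column names and values are assumptions for illustration only.
df = pd.DataFrame({
    "Mouse ID": ["a1", "a1", "a2", "a2", "b1", "b1"],
    "Drug Regimen": ["Capomulin"] * 4 + ["Ramicane"] * 2,
    "Timepoint": [0, 45, 0, 45, 0, 45],
    "Tumor Volume (mm3)": [45.0, 38.0, 45.0, 40.0, 45.0, 36.0],
})

# Find the last recorded timepoint for each mouse, then merge back
# to pick up the tumor volume measured at that timepoint.
last_tp = df.groupby("Mouse ID")["Timepoint"].max().reset_index()
final = last_tp.merge(df, on=["Mouse ID", "Timepoint"], how="left")


def iqr_outliers(values: pd.Series) -> pd.Series:
    """Return the values lying outside 1.5 * IQR of the quartiles."""
    q1, q3 = values.quantile([0.25, 0.75])
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return values[(values < lower) | (values > upper)]


# Demonstrate the helper on a made-up series with one extreme value.
sample = pd.Series([30.0, 31.0, 32.0, 33.0, 34.0, 60.0])
print(final[["Mouse ID", "Drug Regimen", "Tumor Volume (mm3)"]])
print(iqr_outliers(sample))
```

Matplotlib's `boxplot` uses the same 1.5 × IQR whisker length by default, so points flagged this way should match the outlier markers drawn on a default box plot.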
In this section, a line plot of tumor volume versus time point (over the course of the 45-day treatment regimen) was plotted for a single mouse in the Capomulin treatment group. Each time the code is run, it selects a random mouse from the group and displays its data. Next, a scatter plot was created that displays the mouse weight versus average observed tumor volume for every mouse included in the Capomulin treatment group.
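
The data preparation behind both plots might look like the sketch below: pick one mouse at random for the line plot, then reduce the group to one row per mouse for the scatter plot. The column names (including `Weight (g)`) and the values are assumptions for the example.

```python
import pandas as pd

# Toy Capomulin subset: column names and values are assumptions.
capomulin = pd.DataFrame({
    "Mouse ID": ["a1", "a1", "a2", "a2"],
    "Timepoint": [0, 5, 0, 5],
    "Weight (g)": [17, 17, 21, 21],
    "Tumor Volume (mm3)": [45.0, 43.0, 45.0, 46.0],
})

# Pick a different mouse on each run (no fixed random seed).
mouse_id = capomulin["Mouse ID"].drop_duplicates().sample(n=1).iloc[0]
one_mouse = capomulin[capomulin["Mouse ID"] == mouse_id].sort_values("Timepoint")

# One row per mouse: its weight and its average tumor volume.
per_mouse = capomulin.groupby("Mouse ID").agg(
    weight=("Weight (g)", "first"),
    avg_volume=("Tumor Volume (mm3)", "mean"),
)

print(f"Selected mouse: {mouse_id}")
print(one_mouse[["Timepoint", "Tumor Volume (mm3)"]])
print(per_mouse)
```

`one_mouse` supplies the x and y values for the line plot, and the `weight` and `avg_volume` columns of `per_mouse` supply the scatter plot. Passing `random_state` to `sample` would make the selection reproducible if that is ever preferred.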
In this section, the correlation coefficient and linear regression model were calculated for mouse weight versus average observed tumor volume for every mouse included in the Capomulin treatment group. The correlation coefficient was found to be 0.84, indicating a strong positive correlation, and the linear regression model was plotted on the scatter plot created in the previous section.
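
For reference, a Pearson correlation coefficient and a least-squares line can be computed as in the sketch below. It uses NumPy on made-up weights and volumes, so it will not reproduce the 0.84 reported above; `scipy.stats.pearsonr` and `scipy.stats.linregress` are common alternatives, and which one the project's code uses is not stated here.

```python
import numpy as np

# Made-up per-mouse values for illustration only.
weight = np.array([15.0, 17.0, 19.0, 21.0, 23.0])       # grams
avg_volume = np.array([36.0, 38.5, 40.0, 42.5, 44.0])   # mm3

# Pearson correlation coefficient: off-diagonal entry of the 2x2 matrix.
r = np.corrcoef(weight, avg_volume)[0, 1]

# Least-squares straight line: avg_volume is approximated by
# slope * weight + intercept.
slope, intercept = np.polyfit(weight, avg_volume, deg=1)
predicted = slope * weight + intercept

print(f"r = {r:.2f}")
print(f"y = {slope:.2f}x + {intercept:.2f}")
```

Plotting `predicted` against `weight` on the same axes as the scatter plot overlays the regression line on the data.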
Data generated by Mockaroo, LLC (2022). Realistic Data Generator. Data for this dataset was generated by edX Boot Camps LLC, and is intended for educational purposes only.