This project analyzes student funding and students' standardized test scores to inform decisions about school budgets and priorities. We use Python and Jupyter Notebook along with the pandas library.
After the analysis of the school district student data was completed, we learned that the grades of the ninth graders at Thomas High School had been altered. While administrators do not know the full extent of this academic integrity violation, they want to uphold the standards of state testing and have turned to us for help. After assessing the situation with the school superintendent and Maria, we decided the best approach is to:
- Replace the ninth-grade math and reading scores from Thomas High School.
- Keep all other data associated with the ninth-grade students and Thomas High School intact.
The analysis is done in the Jupyter notebook PyCitySchools_Challenge.ipynb.
Objectives
The goals of this challenge are:
- Filter DataFrames using logical operators.
- Replace the incorrect values with NaN.
- Explain how our PyCitySchools analysis changes after we handle the incorrect data.
- After creating a duplicate of PyCitySchools.ipynb and renaming it PyCitySchools_Challenge_testing.ipynb, we cleaned the student names so that they contain no professional prefixes or suffixes. We then replaced the reading and math scores for ninth graders at Thomas High School with NaN, using pandas.DataFrame.loc to set those scores to np.nan. Using np.nan requires importing NumPy.
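The replacement step can be sketched as follows. This is a minimal illustration on toy data, not the notebook's actual code: the DataFrame name `student_data_df`, the column names (`school_name`, `grade`, `reading_score`, `math_score`), and all values are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Toy data; names, columns, and values are assumed for illustration only.
# Scores are stored as floats so that NaN fits without changing the dtype.
student_data_df = pd.DataFrame({
    "student_name": ["A", "B", "C", "D"],
    "school_name": ["Thomas High School", "Thomas High School",
                    "Huang High School", "Thomas High School"],
    "grade": ["9th", "10th", "9th", "9th"],
    "reading_score": [90.0, 85.0, 70.0, 60.0],
    "math_score": [88.0, 92.0, 65.0, 75.0],
})

# Boolean mask combining two conditions with the logical AND operator (&).
is_ths_ninth = (
    (student_data_df["school_name"] == "Thomas High School")
    & (student_data_df["grade"] == "9th")
)

# Replace both score columns with NaN for the selected rows only;
# every other row and column is left untouched.
student_data_df.loc[is_ths_ninth, ["reading_score", "math_score"]] = np.nan

print(student_data_df)
```

Selecting rows with a mask and columns with a list in a single `.loc` call keeps the assignment to one statement and avoids chained indexing.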
After we replaced the reading and math scores for ninth graders at Thomas High School with NaN, the DataFrame looks as follows:
After merging the cleaned student data with the school dataset, and checking that the column order and number formatting of all the DataFrames matched what was covered in this module, we started the analysis. With the reading and math scores replaced and the district and school summary DataFrames recreated, we answer the questions for each step below.
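As a rough sketch of the merge step, the cleaned student records can be joined to the school records on a shared key. The DataFrame names, the `school_name` key, the other column names, and the values are all assumptions for this example rather than the notebook's actual contents.

```python
import pandas as pd

# Toy student and school tables; all names and numbers are made up.
student_data_df = pd.DataFrame({
    "student_name": ["A", "B", "C"],
    "school_name": ["Thomas High School", "Huang High School",
                    "Thomas High School"],
    "math_score": [88.0, 65.0, 75.0],
})
school_data_df = pd.DataFrame({
    "school_name": ["Thomas High School", "Huang High School"],
    "type": ["Charter", "District"],
    "budget": [1000, 2000],
})

# A left join keeps one row per student and attaches that student's
# school attributes to it.
school_data_complete_df = pd.merge(
    student_data_df, school_data_df, how="left", on="school_name"
)

print(school_data_complete_df)
```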
- Before data cleanup: Average Math Score 79.0, Average Reading Score 81.9, % Passing Math 75, % Passing Reading 86, % Overall Passing 65
- After data cleanup: Average Math Score 78.9, Average Reading Score 81.9, % Passing Math 74.8, % Passing Reading 85.7, % Overall Passing 64.9
- Before data cleanup: Thomas High School's % Overall Passing = 91, placing second
- After data cleanup: Thomas High School's % Overall Passing = 65, placing eighth
Observation: The overall ranking order changes because Thomas High School slips from second to eighth place.
Next, we analyze how replacing the ninth graders' math and reading scores affects Thomas High School's performance relative to the other schools.
- Before cleanup: original math and reading scores were
- After cleanup: Thomas HS 9th-grade math and reading scores are set to NaN.

The "percentage passing" scores drop because the total number of students (the denominator) is unchanged, while the number of passing students (the numerator) loses every passing score among the 461 removed 9th-grade scores.

The passing totals for math and reading across grades drop because all 9th-grade scores are now equivalent to failing.

The average score calculations are not significantly affected by removing the 9th-grade scores, apparently because the count() function does not include scores that are NaN.

We calculated the number of students with a math grade:
- Before cleanup: Thomas High School 1635
- After cleanup: Thomas High School 1174
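The effect described above can be reproduced on a toy example. This is a sketch under stated assumptions: the four-student table, the column names, and the passing threshold of 70 are made up for illustration and are not taken from the notebook. The second percentage, which divides by the number of students who still have a score, is shown only for comparison.

```python
import numpy as np
import pandas as pd

# Toy school of four students; the two ninth graders have NaN scores.
scores = pd.DataFrame({
    "grade": ["9th", "9th", "10th", "11th"],
    "math_score": [np.nan, np.nan, 80.0, 60.0],
})

PASSING_SCORE = 70  # assumed threshold for this example

total_students = len(scores)                         # NaN rows still counted
students_with_score = scores["math_score"].count()   # count() skips NaN
passing = (scores["math_score"] >= PASSING_SCORE).sum()  # NaN >= 70 is False

# Percentage passing with all students in the denominator.
pct_passing_all = passing / total_students * 100

# Percentage passing among students who still have a score (comparison only).
pct_passing_scored = passing / students_with_score * 100

# mean() also skips NaN, so the average uses the remaining scores only.
average_score = scores["math_score"].mean()

print(total_students, students_with_score, passing)
print(pct_passing_all, pct_passing_scored, average_score)
```

Because a comparison against NaN evaluates to False, students with missing scores fall out of the numerator but stay in the denominator whenever the full student count is used, which lowers the percentage without changing the average much.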
Thomas HS is in the spending bucket "$630-644".
Removing the 9th-grade scores reduces the "% Passing Math", "% Passing Reading" and "% Overall Passing" scores for the spending bucket "$630-644" as follows:
- Before:
- After:
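One way to assign schools to spending buckets and aggregate per bucket is `pandas.cut` followed by a `groupby`. The sketch below is hypothetical: only the "$630-644" label comes from this analysis, while the bin edges, the other labels, the column names, the schools "School B" and "School C", and every number are assumptions chosen for illustration.

```python
import pandas as pd

# Toy per-school summary; all values are made up for this example.
per_school = pd.DataFrame({
    "school_name": ["Thomas High School", "School B", "School C"],
    "per_student_budget": [640.0, 635.0, 600.0],
    "pct_overall_passing": [65.0, 75.0, 90.0],
})

# Assumed bin edges and labels. With right=False each bin includes its
# left edge and excludes its right edge, e.g. [630, 645).
spending_bins = [585, 630, 645, 675]
group_labels = ["$585-629", "$630-644", "$645-675"]

per_school["spending_range"] = pd.cut(
    per_school["per_student_budget"],
    bins=spending_bins,
    labels=group_labels,
    right=False,
)

# Average the passing percentage within each bucket that contains schools.
by_spending = per_school.groupby("spending_range", observed=True)[
    "pct_overall_passing"
].mean()

print(by_spending)
```

A lower value for one school pulls down the mean of its bucket only, so the size of the change depends on how many other schools share that bucket.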
Thomas HS is in the "Medium (1000-2000)" size bucket.
Removing the 9th-grade scores reduces the "% Passing Math", "% Passing Reading" and "% Overall Passing" scores for the size bucket "Medium (1000-2000)" as follows:
- Before:
- After:
Thomas HS is in the "Charter" type bucket.
Removing the 9th-grade scores reduces the "% Passing Math", "% Passing Reading" and "% Overall Passing" scores for the type bucket "Charter" as follows:
- Before:
- After:
Summary: Summarize four changes in the updated school district analysis after reading and math scores for the ninth grade at Thomas High School have been replaced with NaNs.
The overall ranking order changes because Thomas High School slips from second to eighth place. For the math and reading scores, because the 9th-grade scores are NaN, the analyses of the school records by school spending, school size, and school type are not substantially affected.