Introduction:
In this project, I classify, compare, and study 135 private colleges in the U.S. based on 14 variables such as admission rate, average SAT score, percentages of white and non-white students, and faculty salary. To do this, I apply two unsupervised learning techniques: Principal Component Analysis (PCA) and k-means clustering.
Main goals of this project:
- Implement PCA to identify qualitative features of U.S. colleges
- Determine candidates for the optimal number of clusters using an elbow plot
- Use k-means clustering on top of PCA to classify, compare, and observe any patterns among colleges
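The three goals above can be sketched in base R as follows. This is a minimal illustration, not the code in CollegeScoreboard.Rmd: it runs on randomly generated data with the same shape as the real dataset (135 rows, 14 numeric columns), and the number of retained components (2), the range of k (1 to 10), and the final k (3) are placeholder assumptions.

```r
set.seed(42)  # reproducible synthetic data and k-means starts

# Synthetic stand-in for the real data: 135 rows, 14 numeric columns.
# These values are random and are NOT taken from Colleges2015.csv.
colleges <- as.data.frame(matrix(rnorm(135 * 14), nrow = 135, ncol = 14))

# PCA on centered, unit-variance variables
pca <- prcomp(colleges, center = TRUE, scale. = TRUE)
scores <- pca$x[, 1:2]  # keeping 2 components is an illustrative choice

# Elbow plot: total within-cluster sum of squares for k = 1..10
wss <- sapply(1:10, function(k) {
  kmeans(scores, centers = k, nstart = 20)$tot.withinss
})
plot(1:10, wss, type = "b",
     xlab = "Number of clusters k",
     ylab = "Total within-cluster sum of squares")

# k-means on the PCA scores; k = 3 is a placeholder, not a result
km <- kmeans(scores, centers = 3, nstart = 20)
table(km$cluster)  # cluster sizes
```

With real data, the value of k would be read off the bend in the elbow plot rather than fixed in advance.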
How can users get started with the project?
- CollegeScoreboard.Rmd is the main file
- Download all the files and update the data path in CollegeScoreboard.Rmd where Colleges2015.csv is read in
- Run CollegeScoreboard.Rmd
- The final report is CollegeScoreboard.pdf
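The steps above could look roughly like this from an R console. This is a sketch under assumptions: `project_dir` is a hypothetical placeholder for wherever you saved the files, and rendering relies on the rmarkdown package being installed. The path inside CollegeScoreboard.Rmd itself still has to be edited by hand.

```r
# Hypothetical location of the downloaded files; replace with your own path.
project_dir <- "path/to/project"
csv_path <- file.path(project_dir, "Colleges2015.csv")
rmd_path <- file.path(project_dir, "CollegeScoreboard.Rmd")

if (file.exists(csv_path) && file.exists(rmd_path) &&
    requireNamespace("rmarkdown", quietly = TRUE)) {
  colleges <- read.csv(csv_path)  # quick check that the data loads
  str(colleges)
  rmarkdown::render(rmd_path)     # output format comes from the Rmd's YAML header
} else {
  message("Check project_dir and make sure the rmarkdown package is installed.")
}
```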