      Kshitij Gurung


College Scoreboard: PCA and k-means clustering

06 May 2020


Introduction:

In this project, I attempt to classify, compare, and study 135 private colleges in the U.S. based on 14 variables such as admission rate, average SAT score, percentage of white and non-white students, and faculty salary. I use two unsupervised learning techniques, Principal Component Analysis (PCA) and k-means clustering, to do so.

Main goals of this project:

  • Implement PCA to identify qualitative features of the U.S. colleges
  • Identify a candidate for the optimal number of clusters using an elbow plot
  • Use k-means clustering on top of PCA to classify, compare, and observe any patterns among colleges
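
The three goals above can be sketched end to end. The project itself is written in R Markdown (CollegeScoreboard.Rmd); what follows is only a minimal pure-Python illustration, not the project's code. The helper names (`standardize`, `principal_components`, `kmeans`) and the tiny `toy` dataset are made up for this example, and PCA is approximated here with power iteration on the covariance matrix rather than a library routine.

```python
import math
import random

def standardize(rows):
    """Center each column and scale it to unit sample standard deviation.
    Assumes no column is constant (a zero standard deviation would divide by zero)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def principal_components(z, n_components=2, iters=500, seed=0):
    """Leading eigenvectors of the covariance matrix of centered data z,
    found by power iteration with deflation."""
    n, p = len(z), len(z[0])
    cov = [[sum(r[a] * r[b] for r in z) / (n - 1) for b in range(p)] for a in range(p)]
    rng = random.Random(seed)
    comps = []
    for _ in range(n_components):
        v = [rng.gauss(0, 1) for _ in range(p)]  # random start vector
        for _ in range(iters):
            w = matvec(cov, v)
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        lam = sum(a * b for a, b in zip(v, matvec(cov, v)))  # eigenvalue estimate
        comps.append(v)
        # Deflate: subtract the component just found before searching for the next one.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(p)] for a in range(p)]
    return comps

def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(pt, centers):
    return min(range(len(centers)), key=lambda c: dist2(pt, centers[c]))

def kmeans(points, k, n_starts=10, iters=100, seed=0):
    """Lloyd's algorithm with several random starts.
    Returns (labels, within-cluster sum of squares) for the best start."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_starts):
        centers = rng.sample(points, k)
        for _ in range(iters):
            labels = [nearest(pt, centers) for pt in points]
            new = []
            for c in range(k):
                members = [pt for pt, lab in zip(points, labels) if lab == c]
                # Keep the old center if a cluster ends up empty.
                new.append([sum(col) / len(members) for col in zip(*members)] if members else centers[c])
            if new == centers:
                break
            centers = new
        labels = [nearest(pt, centers) for pt in points]
        wss = sum(dist2(pt, centers[lab]) for pt, lab in zip(points, labels))
        if best is None or wss < best[1]:
            best = (labels, wss)
    return best

# Toy data (NOT Colleges2015.csv): 8 rows, 3 made-up variables, two obvious groups.
toy = [[1.0, 2.0, 0.5], [1.2, 1.9, 0.4], [0.9, 2.2, 0.6], [1.1, 2.1, 0.5],
       [5.0, 8.0, 3.0], [5.2, 7.9, 3.1], [4.9, 8.2, 2.9], [5.1, 8.1, 3.2]]

z = standardize(toy)
comps = principal_components(z, n_components=2)
scores = [[sum(a * b for a, b in zip(row, c)) for c in comps] for row in z]

# Elbow plot data: within-cluster sum of squares for each candidate k.
wss_by_k = {k: kmeans(scores, k)[1] for k in range(1, 5)}
labels, _ = kmeans(scores, 2)
print(wss_by_k)
print(labels)
```

Plotting `wss_by_k` against k gives the elbow plot: the candidate k is where the curve stops dropping sharply. Standardizing first matters because variables such as SAT score and admission rate are on very different scales, and both PCA and k-means are sensitive to scale.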

How can users get started with the project?

  • CollegeScoreboard.Rmd is the main file
  • Download all the files and update the file path used to read in Colleges2015.csv so that it points to your local copy.
  • Run the CollegeScoreboard.Rmd
  • The final report is CollegeScoreboard.pdf

Source code


fig: Clustering results

