Yu Carl

Anything & Everything Data science


Comparison of data dimensionality reduction methods

October 5, 2021 by yucarl

Today I am going to compare three dimensionality reduction methods: PCA (principal component analysis), PLS (partial least squares), and UMAP (uniform manifold approximation and projection). The dataset is the Billboard Top 100 songs data from #TidyTuesday.

Explore data

Our goal is to reduce the dimensionality of the features of Billboard Top 100 songs, connecting each song's chart position with the features, mostly audio features, available from Spotify.

library(tidyverse)

## billboard ranking data
billboard <- readr::read_csv("billboard.csv")

## spotify feature data
audio_features <- readr::read_csv("audio_features.csv")

Importing .csv files with the data.table package is faster, but the Billboard Top 100 songs dataset is not very large, so readr is sufficient.
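For readers who do want the faster import, here is a minimal sketch using data.table::fread. It writes a tiny stand-in file to a temporary path so that it runs on its own; the file contents and the name billboard_dt are illustrative assumptions, not part of the original data.

```r
library(data.table)

# Write a tiny stand-in CSV to a temporary file (illustrative rows only,
# NOT the real billboard.csv)
tmp <- tempfile(fileext = ".csv")
writeLines(
  c("song_id,week_position",
    "song_a,34",
    "song_a,22"),
  tmp
)

# fread() guesses the delimiter and column types and returns a data.table
billboard_dt <- data.table::fread(tmp)

print(billboard_dt)
```

For the real file, the same call would take "billboard.csv" as its path. Note that fread returns a data.table rather than a tibble, so tidyverse code downstream may need as_tibble() first.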

|   | url | week_id | week_position | song | performer | song_id | instance | previous_week_position | peak_position | weeks_on_chart |
| 1 | http://www.billboard.com/charts/hot-100/1965-07-17 | 7/17/1965 | 34.00 | Don’t Just Stand There | Patty Duke | Don’t Just Stand TherePatty Duke | 1.00 | 45.00 | 34.00 | 4.00 |
| 2 | http://www.billboard.com/charts/hot-100/1965-07-24 | 7/24/1965 | 22.00 | Don’t Just Stand There | Patty Duke | Don’t Just Stand TherePatty Duke | 1.00 | 34.00 | 22.00 | 5.00 |
| 3 | http://www.billboard.com/charts/hot-100/1965-07-31 | 7/31/1965 | 14.00 | Don’t Just Stand There | Patty Duke | Don’t Just Stand TherePatty Duke | 1.00 | 22.00 | 14.00 | 6.00 |
| 4 | http://www.billboard.com/charts/hot-100/1965-08-07 | 8/7/1965 | 10.00 | Don’t Just Stand There | Patty Duke | Don’t Just Stand TherePatty Duke | 1.00 | 14.00 | 10.00 | 7.00 |
| 5 | http://www.billboard.com/charts/hot-100/1965-08-14 | 8/14/1965 | 8.00 | Don’t Just Stand There | Patty Duke | Don’t Just Stand TherePatty Duke | 1.00 | 10.00 | 8.00 | 8.00 |
| 6 | http://www.billboard.com/charts/hot-100/1965-08-21 | 8/21/1965 | 8.00 | Don’t Just Stand There | Patty Duke | Don’t Just Stand TherePatty Duke | 1.00 | 8.00 | 8.00 | 9.00 |
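The billboard table has one row per song per week, so connecting positions to audio features needs one row per song first. The sketch below shows one possible way to do that, not necessarily the approach used later in this analysis. It assumes that audio_features also carries a song_id key and columns such as danceability and tempo. The toy tables and their values are made up so that the example runs on its own.

```r
library(dplyr)

# Toy stand-ins for the two tables; all values are made up for illustration
billboard_toy <- tibble(
  song_id        = c("song_a", "song_a", "song_a", "song_b"),
  week_position  = c(34, 22, 14, 8),
  weeks_on_chart = c(4, 5, 6, 9)
)

# Assumed shape of the Spotify table: one row per song_id
audio_toy <- tibble(
  song_id      = c("song_a", "song_b"),
  danceability = c(0.5, 0.7),
  tempo        = c(120, 98)
)

# Collapse the weekly rows to one row per song
song_positions <- billboard_toy %>%
  group_by(song_id) %>%
  summarise(
    best_position  = min(week_position),   # lower number = higher on the chart
    weeks_on_chart = max(weeks_on_chart),
    .groups = "drop"
  )

# Keep only songs that appear in both tables
songs_joined <- song_positions %>%
  inner_join(audio_toy, by = "song_id")

print(songs_joined)
```

An inner join drops songs that have no Spotify features. A left_join would keep them with NA values instead, and those rows would then need handling before any dimensionality reduction step.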
