Using Principal Component Analysis Combined with K-Nearest Neighbors and Logistic Regression
This project explores music genre classification using machine learning techniques applied to Spotify audio features. By combining Principal Component Analysis (PCA) with K-Nearest Neighbors (KNN) and Logistic Regression, we reached 89% accuracy with KNN and 90% with Logistic Regression across five musical genres.
We analyzed five genres, each with audio characteristics distinct enough to make classification feasible: Heavy Metal, Latino Pop, Neo Soul, Punk Pop, and Study Music.
Our approach combined dimensionality reduction with supervised learning techniques:
Gathered 100 songs per genre using the Spotify Web API
Extracted 12 audio features, including energy, danceability, and valence
Standardized features and removed irrelevant variables
Reduced dimensions from 10 to 6 with PCA while retaining about 91% of the variance
Trained KNN (k=7) and Logistic Regression classifiers
Evaluated both models with cross-validation and confusion matrix analysis
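
The steps above can be sketched with scikit-learn roughly as follows. This is a minimal illustration, not the project's actual code: the data is synthetic (generated with `make_classification` as a stand-in for the Spotify features), and the 80/20 split, 5-fold cross-validation, `random_state` values, and the helper name `make_model` are all assumptions. Only the 10 input features, 6 principal components, and k=7 come from the description above.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic placeholder for the real data: 500 "songs", 10 features, 5 "genres".
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=6, n_redundant=2,
    n_classes=5, n_clusters_per_class=1, random_state=0,
)

# Assumed 80/20 split, stratified so every class appears in the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)


def make_model(classifier):
    """Hypothetical helper: standardize, project onto 6 PCs, then classify."""
    return make_pipeline(StandardScaler(), PCA(n_components=6), classifier)


models = {
    "knn": make_model(KNeighborsClassifier(n_neighbors=7)),
    "logreg": make_model(LogisticRegression(max_iter=1000)),
}

results = {}
for name, model in models.items():
    # Cross-validate on the training split only, then evaluate on held-out data.
    cv_scores = cross_val_score(model, X_train, y_train, cv=5)
    model.fit(X_train, y_train)
    cm = confusion_matrix(y_test, model.predict(X_test))
    results[name] = {"cv_mean": cv_scores.mean(), "confusion": cm}
    print(name, "mean CV accuracy:", round(cv_scores.mean(), 3))
    print(cm)
```

Putting the scaler and PCA inside the pipeline means they are refit on each cross-validation fold, so no information from the validation fold leaks into the preprocessing.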
PCA reduced our dataset's dimensionality while preserving most of the variance.
Key Finding: The first 6 principal components captured 91.3% of the total variance, allowing us to reduce computational complexity while maintaining predictive power. The first component alone explains 40.21% of the variance, so no single direction dominates the feature space. This suggests that the audio features are not strongly redundant, which is beneficial for classification.
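
As a rough illustration of how the number of components can be chosen from the cumulative explained variance, the sketch below fits PCA on standardized data and picks the smallest number of components that reaches a threshold. The data here is random and synthetic, so its variance figures will not match the 40.21% and 91.3% reported above; the 0.90 threshold and all variable names are assumptions for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic placeholder: 500 rows, 10 partially correlated features.
latent = rng.normal(size=(500, 4))
mixing = rng.normal(size=(4, 10))
X = latent @ mixing + 0.5 * rng.normal(size=(500, 10))

# Standardize first so that no feature dominates because of its scale.
X_scaled = StandardScaler().fit_transform(X)

pca = PCA().fit(X_scaled)
ratios = pca.explained_variance_ratio_  # per-component share of variance
cumulative = np.cumsum(ratios)          # running total, ends at 1.0

# Smallest number of components whose cumulative share reaches the threshold.
threshold = 0.90  # assumed cutoff, chosen for illustration
n_components = int(np.searchsorted(cumulative, threshold) + 1)

for i, (r, c) in enumerate(zip(ratios, cumulative), start=1):
    print(f"PC{i}: {r:.2%} (cumulative {c:.2%})")
print("components needed:", n_components)
```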
K-Nearest Neighbors confusion matrix (89 of 100 test songs on the diagonal):

| | Heavy Metal | Latino Pop | Neo Soul | Punk Pop | Study Music |
|---|---|---|---|---|---|
| Heavy Metal | 25 | 0 | 1 | 2 | 0 |
| Latino Pop | 0 | 11 | 3 | 0 | 0 |
| Neo Soul | 0 | 1 | 9 | 0 | 0 |
| Punk Pop | 4 | 0 | 0 | 20 | 0 |
| Study Music | 0 | 0 | 0 | 0 | 24 |

Logistic Regression confusion matrix (90 of 100 test songs on the diagonal):

| | Heavy Metal | Latino Pop | Neo Soul | Punk Pop | Study Music |
|---|---|---|---|---|---|
| Heavy Metal | 25 | 0 | 2 | 1 | 0 |
| Latino Pop | 0 | 12 | 2 | 0 | 0 |
| Neo Soul | 0 | 1 | 9 | 0 | 0 |
| Punk Pop | 2 | 2 | 0 | 20 | 0 |
| Study Music | 0 | 0 | 0 | 0 | 24 |

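
The overall accuracy of each model can be recovered from the two confusion matrices above by dividing the diagonal sum by the total count. The short sketch below does this with NumPy; the function and variable names are illustrative. Treating each row as one genre's songs in the test set is an assumption, since the tables do not say which axis holds the true labels.

```python
import numpy as np

genres = ["Heavy Metal", "Latino Pop", "Neo Soul", "Punk Pop", "Study Music"]

# Values copied from the two tables, in the order they appear.
first_table = np.array([
    [25, 0, 1, 2, 0],
    [0, 11, 3, 0, 0],
    [0, 1, 9, 0, 0],
    [4, 0, 0, 20, 0],
    [0, 0, 0, 0, 24],
])
second_table = np.array([
    [25, 0, 2, 1, 0],
    [0, 12, 2, 0, 0],
    [0, 1, 9, 0, 0],
    [2, 2, 0, 20, 0],
    [0, 0, 0, 0, 24],
])


def overall_accuracy(cm):
    """Correct predictions (diagonal) divided by all predictions."""
    return np.trace(cm) / cm.sum()


def row_hit_rate(cm):
    """Diagonal entry divided by its row total, one value per genre.

    This equals per-genre recall only if rows hold the true labels (assumed).
    """
    return np.diag(cm) / cm.sum(axis=1)


for label, cm in [("first table", first_table), ("second table", second_table)]:
    print(label, "accuracy:", overall_accuracy(cm))
    for genre, rate in zip(genres, row_hit_rate(cm)):
        print(f"  {genre}: {rate:.3f}")
```

The first table gives 89/100 = 0.89 and the second gives 90/100 = 0.90, which matches the accuracies reported for KNN and Logistic Regression respectively.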
Study Music was classified perfectly by both models (24 of 24 songs), likely because of its distinct instrumental characteristics and high acousticness values.
Heavy Metal and Punk Pop were confused with each other in both models, which is consistent with their similar instrumentation, loudness, and tempo characteristics.
Logistic Regression slightly outperformed KNN (90% vs 89%). On the 100-song test set this is a difference of one song, so its advantage in handling multi-class boundaries is modest.
Energy, Acousticness, and Instrumentalness were the most distinguishing features across genres.
Several enhancements could further improve the classification accuracy:
Increasing the number of songs per genre could improve model generalization and reduce overfitting, especially for underperforming genres like Neo Soul.
Implementing deep learning approaches like CNNs, RNNs, or LSTMs could capture more complex patterns in audio features.
Including rhythm patterns, temporal dynamics, and artist metadata could provide richer feature representations.
Developing a web application for real-time genre prediction would demonstrate practical applications of the model.