data, data

Data, data is a deep dive into the sonic and lyrical universe of Jorge Drexler — exploring his work through the lens of data science and music theory.


I gathered all of Drexler’s officially released songs through APIs, scraped their lyrics, and analyzed them with tools from natural language processing, emotion modeling, and music theory. The result? A multi-layered view of the artist’s evolution, themes, and creative patterns.


  • For data collection, I used pandas, BeautifulSoup, Spotipy, the Genius API, and Spotify's Web API
  • For analysis and modeling, I used pandas, NumPy, Matplotlib, Seaborn, scikit-learn, SciPy, NLTK, wordcloud, and py-lex
  • All wrapped in Python 3, Jupyter Notebook, and the occasional PyCharm pass

Beyond the data, what made this special was that Jorge Drexler himself acknowledged the work — in a tweet that completely made my year.

visuals

Emotions through time

🧠 Emotional trends in Drexler's lyrics over time

Tempo by albums

🎵 Tempo patterns by album — from slow ballads to faster experiments

Musical keys

🎼 Most common keys used across his discography
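
As a rough illustration of how a key chart like this could be tallied, here is a minimal sketch. It assumes the key data comes from Spotify's audio features, where `key` is a pitch class from 0 to 11 (-1 when no key was detected) and `mode` is 1 for major and 0 for minor. The helper names and the sample rows are hypothetical and are not taken from the project's notebooks.

```python
from collections import Counter

# Pitch-class names indexed 0-11, matching the audio-features `key` field.
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def key_name(key, mode):
    """Turn a (key, mode) pair into a label such as 'A minor'."""
    if not 0 <= key <= 11:
        return "unknown"  # -1 means no key was detected
    return f"{PITCH_CLASSES[key]} {'major' if mode == 1 else 'minor'}"


def most_common_keys(tracks, top=3):
    """Count key labels over an iterable of dicts with 'key' and 'mode'."""
    counts = Counter(key_name(t["key"], t["mode"]) for t in tracks)
    return counts.most_common(top)


# Hypothetical sample rows, not real values from the discography.
sample = [
    {"key": 9, "mode": 0},
    {"key": 9, "mode": 0},
    {"key": 0, "mode": 1},
    {"key": -1, "mode": 1},
]
print(most_common_keys(sample))
```

Counting labels rather than raw integers keeps major and minor versions of the same tonic apart, which is usually what a chart of musical keys needs.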

Top songs by word count

📝 The songs with the highest word counts

Wordcloud

☁️ Wordcloud of Drexler’s most frequent words

Lyrical vs lexical density

📚 Comparing lyrical and lexical complexity
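
The two measures are not defined on this page, so the sketch below rests on assumed definitions: lexical density as the share of distinct words among all words in a song, and lyrical density as words per second of track length. The function names and the sample text are hypothetical, and the project itself may have computed these differently (for example with NLTK tokenization).

```python
import re


def tokenize(lyrics):
    """Lowercase the text and keep runs of letters, accents included."""
    return re.findall(r"[^\W\d_]+", lyrics.lower())


def lexical_density(lyrics):
    """Share of distinct words among all words (assumed definition)."""
    words = tokenize(lyrics)
    return len(set(words)) / len(words) if words else 0.0


def lyrical_density(lyrics, duration_seconds):
    """Words per second of track length (assumed definition)."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return len(tokenize(lyrics)) / duration_seconds


# Placeholder text, not real lyrics.
sample = "la la la canción canción nueva"
print(lexical_density(sample))        # 3 distinct words out of 6
print(lyrical_density(sample, 12.0))  # 6 words over 12 seconds
```

Under these definitions a repetitive chorus lowers lexical density without changing lyrical density, which is why the two are worth plotting against each other.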

Correlation between emotions

🧩 Correlation matrix between emotional markers
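
For readers curious about what sits behind a matrix like this, here is a minimal sketch of a Pearson correlation matrix in plain Python. The emotion names and scores are hypothetical placeholders, and this is an assumption about the method: with the pandas stack listed above, `DataFrame.corr()` would give the same Pearson values in one call.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return float("nan")  # undefined for a constant column
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Nested dict of pairwise correlations for a dict of named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical per-song emotion scores, not values from the project.
scores = {
    "joy": [0.1, 0.4, 0.3, 0.8],
    "sadness": [0.9, 0.6, 0.7, 0.2],
    "trust": [0.2, 0.5, 0.1, 0.6],
}
matrix = correlation_matrix(scores)
print(round(matrix["joy"]["sadness"], 3))
```

In this made-up sample, sadness is exactly one minus joy, so the pair is perfectly negatively correlated; real emotion scores would rarely line up that cleanly.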

from the artist himself

press

  • El Observador – one of Uruguay's major newspapers (🇺🇾 Spanish)
  • Redacción – Argentinean digital media spotlight (🇦🇷 Spanish)