Video Summary and Transcription
The video introduces Max Martin, a Swedish producer and songwriter, and discusses how song repetition has increased over the years. It explains how MyPart serves as a matchmaker between songwriters and singers by analyzing lyrical, structural, harmonic, and melodic features of songs. Repetition analysis in songwriting is highlighted, using data compression techniques to measure repetitiveness. The video describes how the Levenshtein distance measures the similarity between two lyric segments, helping to determine if they should be considered similar. Unique music features such as melodic range and the percentage of unique chords are also analyzed. MyPart uses the extracted data to train models that predict how well a song matches a singer's preferences, providing a personalized list of relevant songs.
1. Introduction to Repetition in Songwriting
Welcome, everyone. Meet Max Martin, a Swedish producer and songwriter. We're the intelligent matchmaker between songwriters and singers. Let's talk about repetition. Songs are getting more repetitive over the years. The best songs are more repetitive than the rest. We analyze segments and find similarities between strings.
Welcome, everyone. There's someone I'd like you to meet. This is Max Martin, a Swedish producer and songwriter who won the Songwriter of the Year award. In fact, he won the Songwriter of the Year award almost every year in the last decade. He wrote hit songs for everyone, from Snoop Dogg to Ariana Grande. Apparently, it's hard to find good songwriters out there. That's where we come in.
My name is Yamaa. I'm a musician and a data scientist, and I'm part of the MyPart core team. We're the intelligent matchmaker between independent songwriters and performing singers. For each singer, we build a benchmark of reference songs that represent what the singer is looking for. From these songs, we extract lyrical, structural, harmonic and melodic features on which we train our models. According to the model's predictions, we prioritize all the songs submitted to the singer, presenting them with the songs that are most relevant to them. My goal in the next few minutes is to give you scientifically proven tools so that you too can become the next songwriter of the year.
The first feature we're going to talk about is repetition. This graph was taken from an essay written by Colin Morris. Morris analyzed top-charting songs from the last 60 years. That's the x-axis. The blue line represents the average of the top 100 songs, and the orange one represents the top 10. Colin measured repetitiveness by compressing the lyrics file and looking at the ratio of the size of the file before and after the compression. These are the percentages on the y-axis. Compression is a nice measure because it uses repeated sequences, so it's not only about which words repeat, but also about their order.
We can see that the songs are getting more repetitive throughout the years, and you might have heard that before, usually as criticism. But we can also see that the best songs, those that reached the top 10, were on average more repetitive than the rest in every year, and that the gap is widening.
So what else can we do with repetition? In this song, Shape of You by Ed Sheeran, we can see six segments in the lyrics. Two of them repeat twice. In that case, we call the first one the pre-chorus and the second one the chorus. We'll consider the rest to be verses. In this song, there's one sentence that repeats the most, and that is "I'm in love with your body". So we call it the refrain.
But what happens if the chorus, in one of its appearances, ends with a few more "oh I"s? It should still be recognized as the chorus, right? So we need to find an algorithm that checks the similarity of two strings, and a threshold that says when we call two strings similar.
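The compression measure described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used in Morris's essay or by MyPart: it assumes zlib as the compressor and assumes the plotted percentage is the size reduction (one minus compressed size over original size). The helper name size_reduction is hypothetical.

```python
import zlib


def size_reduction(lyrics: str) -> float:
    """Return the fraction by which zlib shrinks the UTF-8 encoded lyrics.

    Hypothetical helper: values near 1 mean highly repetitive text,
    values near 0 (or below) mean almost nothing repeats.
    """
    raw = lyrics.encode("utf-8")
    if not raw:
        return 0.0
    compressed = zlib.compress(raw, 9)  # 9 = strongest compression level
    return 1 - len(compressed) / len(raw)


# One line sung twenty times versus a string with no repeated characters.
repetitive = "I'm in love with your body\n" * 20
varied = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

print(f"repetitive: {size_reduction(repetitive):.0%}")
print(f"varied:     {size_reduction(varied):.0%}")
```

On very short inputs the compressor's fixed overhead can make the result negative, so the measure is only meaningful for full lyric sheets compared against each other.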
2. String Metric and Similarity Threshold
Levenshtein distance measures the changes needed to transform one string into another. For example, transforming Justin Timberlake into Justin Bieber requires replacing some letters and deleting others. The Timberlake-Bieber distance is 8, making them 73% similar. We consider segments similar if they are 70% alike.
We ended up with Levenshtein distance, which is a string metric that measures the changes you need to make to transform one string into the other. For example, if you would like to transform Justin Timberlake into Justin Bieber, all you have to do is replace the T with a B to get Justin Bimberlake, replace the M with an E to get Justin Bieberlake, and then delete the letters L, A, K and E to get Justin Bieber. If we count all those replacements and deletions, we get that the Timberlake-Bieber distance is 8, and in relation to the length of the strings, Justin Bieber and Justin Timberlake are 73% similar. The threshold we got to is 70%, meaning if two segments or sentences are 70% alike, we would consider them similar. So for that matter, we consider the two Justins related.
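The Timberlake-Bieber example can be reproduced with a short Python sketch. With the textbook definition, where each replacement, insertion or deletion costs 1, the two replacements and four deletions give a distance of 6. The quoted figures of 8 and 73% match a common variant in which a replacement costs 2 (a deletion plus an insertion) and similarity is normalized by the combined length of both strings: (30 - 8) / 30 ≈ 0.73. That MyPart uses this variant is an assumption inferred from the numbers, not something the talk states, and the function names below are hypothetical.

```python
def edit_distance(a: str, b: str, replace_cost: int = 1) -> int:
    """Levenshtein distance via row-by-row dynamic programming.

    replace_cost=1 is the textbook definition; replace_cost=2 treats a
    replacement as a deletion plus an insertion.
    """
    previous = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            replace = previous[j - 1] + (0 if char_a == char_b else replace_cost)
            delete = previous[j] + 1
            insert = current[j - 1] + 1
            current.append(min(replace, delete, insert))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1], normalized by the combined length (assumed variant)."""
    total = len(a) + len(b)
    if total == 0:
        return 1.0
    return (total - edit_distance(a, b, replace_cost=2)) / total


def is_similar(a: str, b: str, threshold: float = 0.70) -> bool:
    """Apply the 70% threshold mentioned in the talk."""
    return similarity(a, b) >= threshold


print(edit_distance("Justin Timberlake", "Justin Bieber"))                  # 6
print(edit_distance("Justin Timberlake", "Justin Bieber", replace_cost=2))  # 8
print(round(similarity("Justin Timberlake", "Justin Bieber"), 2))           # 0.73
print(is_similar("Justin Timberlake", "Justin Bieber"))                     # True
```

The same is_similar check could be applied to pairs of lyric segments, so that a chorus with a few extra trailing syllables still groups with its other appearances.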