Ten Million Words* About Pop Vocal Delivery (And Nick Jonas' Jealous)

A while back on the internet, I mentioned that I liked Nick Jonas’ “nuanced” vocal delivery on Jealous, and I got a response asking me if I could expand a bit on this.

This is a good question. Writers often use a lot of language which can be literarily satisfying, but also kinda meaningless, musically… and the word “nuanced” definitely leans in this direction. One of the things I’m working on is bringing more technical detail into my writing — and as a musician, vocal performance is hands-down one of my favorite technical topics.

In order to properly answer this question about “nuance”, we should start by breaking down the individual features of a vocal performance, so that we can discuss the subject with a certain degree of specificity. If you’ve ever taken music classes or sung in a choir, you might be familiar with some or most of this already — though these topics aren’t usually approached with an eye towards the particulars of pop music.

Like Drake, we’re starting from the bottom:

The most basic features of singing are time and pitch. In time, a vocalist can sing right on, or either lean into or lay back on the beat (i.e. anticipating or lagging behind the exact rhythmic center). The consonant and vowel sounds which form a word provide additional options: for example, the length of the transition from the closed N to the open O in the word “Now” can be used to shift a note’s timing. When the backing track has a strong or unusual groove, the vocalist needs to make a decision about how closely to sing with the groove.¹
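For the numerically inclined, here is a minimal Python sketch of what “leaning in” and “laying back” look like if you measure where a sung note starts relative to the beat. The function names and the 10 ms tolerance are illustrative assumptions only, not any kind of standard, and real grooves are much fuzzier than a single threshold.

```python
def beat_length_ms(bpm: float) -> float:
    """Length of one beat in milliseconds at the given tempo."""
    return 60_000.0 / bpm


def describe_timing(onset_ms: float, beat_ms: float, tolerance_ms: float = 10.0) -> str:
    """Label a note onset relative to the beat it belongs to.

    onset_ms     -- when the note actually starts
    beat_ms      -- where the exact rhythmic center of the beat falls
    tolerance_ms -- hypothetical window that still counts as "right on"
    """
    offset = onset_ms - beat_ms
    if offset < -tolerance_ms:
        return "leaning in"    # anticipating the beat
    if offset > tolerance_ms:
        return "laying back"   # lagging behind the beat
    return "right on"


if __name__ == "__main__":
    beat = beat_length_ms(120)  # 500.0 ms per beat at 120 bpm
    for onset in (470.0, 503.0, 540.0):
        print(onset, describe_timing(onset, beat))
```

The same note can land in any of the three buckets depending on how long the singer takes to open the vowel, which is the “Now” example above in numeric form.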

For pitch, a vocalist can sing a note exactly on, or either rise or fall into the pitch. Once on pitch, a vocalist can use vibrato or other pitch inflections.² Transitioning between notes is another decision point: a vocalist can slide between notes (portamento), and the length of the slide is variable. Embellishments like runs (melismas) and turns can be added. Singing off-pitch is a valid decision… though you don’t hear it much on the Top 10 after around 2000.

All of this applies to the ends of notes as well. If you pay attention (particularly with vocalists who do a lot of multitracked harmony), the way in which a singer ends her notes is usually a notable feature of her delivery. From obvious weirdo stuff like the rising yodels on the chorus of Zombie,³ to more subtle variations in the “tightness” or “looseness” of harmonized lines, or creating subtle pauses to separate notes, there’s a lot of options here. Experience in formal settings tends to make singers more cognizant and particular about note endings.

For singers, dynamics dovetail somewhat with tone and timbre. In the realm of dynamics, a vocalist can sing at a constant volume, crescendo or decrescendo, or use less common dynamic techniques like tremolo or glottal stops.⁴ A vocalist can sing in full voice, a scream, a breathy tone, a whisper,⁵ etc. etc. and freely transition between these. Less common techniques like the vocal fry⁶ occasionally pop up. In certain styles of singing, often the use of tone becomes the predominant feature of the performance (eg. extreme metal,⁷ sexy R&B, etc).

Because your voice is a part of your body, each individual will have a different “instrument”. Using (or not using) the diaphragm to support the voice makes a huge difference, and in the context of recording this can be used intentionally to achieve a desired tone. Subtle physical adjustments can direct the voice to resonate with different parts of the skull, which affects tone in various ways.

There’s also decisions about pronunciation. In pop, this mostly boils down to the shape of vowels. During the 2010s, I noticed many younger vocalists started bending their vowels in unusual ways (eg. O into OU or OE), particularly in acoustic/folky pop. This is just a guess, but I think it might come from increased exposure to the voices and vowels of international artists via the internet.⁸ Some pop artists (eg. Nicki, Gaga, etc) explicitly use extreme or theatrical pronunciation shifts.

There’s also some technical considerations when working with microphones, particularly with plosives (B, K, T) and sibilants (S, Sh). Experienced recording vocalists will often intentionally soften these by using slightly different mouth and tongue positions or unvoiced plosives.⁹ Engineers love this because it saves us time during editing and mixing.

Phrasing is the broad term for how you deal with these technical elements over “a line” of music. In pop, lines are generally sentence-like in verses, and choruses are a linguistic free-for-all. Pop vocal phrasing tends to be a little more varied in verses. Pre-choruses are normally phrased to build into a big chorus, though sometimes a sudden dynamic contrast works great, too.¹⁰

But perhaps the single most important aspect of pop singing is affect, which is a bit harder to define. You might think of this as the attitude or general expression which is overlaid on top of all these musical features.

Having a particular affect can make a pop vocalist truly iconic… even if they ignore all the technical stuff we just talked about. This is the main thing that sets a pop vocalist apart from (for example) a theater, classical, or jazz vocalist. In those worlds, affect is important too, but you need to come correct with your technique. In the 90s, many vocalists were making bizarre or sloppy technical decisions, but delivered iconic performances because they nailed the affect.¹¹

Back when I was doing more studio work, “flat affect” vocalists were quite popular. These days (thankfully), affect in pop music is much more variable, and singers with very extreme or unusual affects (eg. Future) can top the charts. One of the things I love about 80s pop is the trend of dramatic, theatrical, or filmic affects (particularly with New Romantic artists).

A quick non-technical term for this whole bag of stuff is Delivery. A “great delivery” also relates to how the vocalist’s decisions interact with the song itself. For example, a Desiigner-esque delivery on a trap song is probably great, but a Desiigner-esque delivery on something like Vanessa Carlton’s “A Thousand Miles” is probably weird. Of course, whether a delivery is considered bad or good or normal or weird changes over time and with varying trends.

As with all forms of communication, there is a semiotic layer associated with many decisions a vocalist can make.¹² For example, a Bro Country singer who uses the pronunciation and affect of an old-money New Yorker will have a hard time connecting with the Bro Country audience, because of what these vocal expressions signify. The links between a musical or performative style and its culture of origin can likewise lead to overt intersectional conflict¹³ when that music begins to influence other cultures.

And finally, the post-internet age of accessible archives has resulted in a trend towards stylistic heterogeneity in pop music, as vocalists and musicians are now drawing elements from a broader range of temporally and aesthetically disparate styles.

In the late 2010s, it’s now commonplace to have (for example) The 1975’s late-80s-INXS-replicating¹⁴ debut album sharing the charts with Meghan Trainor’s updated 50s Phil Spector sound and Kanye West’s avant-garde hip hop. Popular music trends used to be more monolithic — think Grunge supplanting Hair Metal in the 90s — but now, all bets are basically off.

What a time to be alive! And what a time to be a pop vocalist.

On “Nuance”

When I wrote that Nick Jonas’ vocals on Jealous were “nuanced”, what I meant specifically was that he’s making a lot of complicated performance choices, but limiting their extremity.

Let’s begin by examining a small part of a more extreme vocal performance:

On the opening verse of Lana Del Rey’s “Born To Die”, she consistently begins her phrases laying way back — but tightens up the timing slightly as the stanza progresses. She uses this shifting time feel to emphasize certain words, such as “Gates” and “Home” by (unexpectedly) nailing them right on the beat.

She uses exaggerated, slow slides to fall or rise into the pitches (which sometimes last for over half the length of the note). She occasionally adds vibrato to notes at the ends of phrases, and adds a small pitch inflection to “mistake”. Many of her longer notes end with a slight falling pitch if she’s leaving a rest before the next lyric.

She uses a light Midatlantic accent,¹⁵ slow/de-emphasized formation of words (i.e. long transitions between consonants and vowels), unvoiced plosives, and softens her sibilants on the 2nd stanza. These timing and pronunciation features coalesce into a drunk / druggy / wasted affect, and the Midatlantic accent adds sociopolitical connotations of white 20th-century wealth and glamour. These potent semiotic signals set the stage for high drama and forbidden lust.

Her phrasal dynamics are relatively subtle. She makes an interesting decision by singing the first line “Feet don’t fail me now” slightly stronger, then dropping into the mp which holds for the rest of the verse. This decision lends a subtle shade of “giving up” to the affect. She brings the lyric “on the Friday nights” out dynamically, before once again pulling back into mp.

So, even in a performance with a lot of extremity in the timing and pitch decisions (along with some unique affectation choices), there’s still a fair amount of subtle things going on, too.

Nick Jonas’ Vocal Delivery on Jealous

To circle back around: Non-vocal-nerds might really like Jonas’ delivery on Jealous, but have a hard time explaining exactly why. That’s because (with the exception of the “I still get jealous!” hook) his lack of extremity makes its most characteristic features harder to pinpoint, as compared to (for example) Lana Del Rey’s mile-wide slides and “movie star voice”.

I’m going to be referencing two performances of this song: the studio version, and a live concert recording from 2016. In the concert setting, many of the delivery’s features have been refined by two years of repetition, and he pushes some of them to a greater degree of extremity, which is illuminating.

Note: I couldn’t find an audio-only version of this live video, so remember not to “listen with your eyeballs” 😤

The First Verse:

The most noticeable feature of Jonas’ delivery is that he’s sliding up or down into almost every note, which is partly a personal style (see the first verse of Bacon¹⁶). Coupled with the narrow melodic ranges in the first two phrases, and some indistinct pronunciation of the conversationally-worded lyrics, “I don’t like the way he’s lookin’ at you / I’m startin’ to think you want him, too” (with unvoiced Ks and softened H sounds), right off the bat he begins to form shades of a postwar working-class affect. Think Brando in Streetcar or Dean in Rebel: virile (and mumbly) men at the mercy of their uncontrollable passions.

Along with the sliding pitches, he’s also keeping the timing loose and leaning into the beats, most obviously at the beginning of phrases, and on words which begin with softer consonants like “way” and “look”. The slight decrescendo and hesitation on “lookin’ at you…” adds a sort of musical ellipsis, engaging the listener in anticipation of the narrative continuation.

The ascending melodies in the second half of the verse cover more intervallic ground, and the short slides become more pronounced. He uses variation in these slides and shortens notes to assist in the phrasing of each line. For example, he begins “Am I crazy” and “Even though” right on pitch and beat before sliding up the melodic line (also note how he emphasizes the starting consonants of “crazy” and “though” which land on beat 1).

This second half of the verse builds both vocally and in the production, so the listener is prepared for the lead-in to the chorus when he nails the “Can’t” in “Can’t help it” both right on pitch and beat. This change reverses the flow of the previous two melodic lines, signaling to the listener that we’re about to hit a new section (coupled with the brief production breakdown immediately following). In the live recording, he emphasizes the shape and rhythm of these lines further by shortening the ends of notes.

The Chorus

Jonas’ delivery in the first half of this chorus is animated primarily by timing. He consistently leans into the beat, primarily by anticipating with the consonants and opening the vowels right on or just beforehand. He pushes this timing a bit more extremely in the live performance, and you can hear it very clearly in the opening “Chin Music Up”. The effect of this timing is a bouncy or rounded feel, like keeping time by swinging your arms while walking rather than tapping your foot while sitting.

Also critical to maintaining this feel is the way he fades and shortens individual notes. For example, notice how he decrescendos “fault”, and then adds an explicit cutoff to “Hover”, which he pronounces colloquially as “Huh-vuh”. These features again play up the affect of masculine bravado, and are further heightened by the anachronistic idioms used in the lyrics.

He loosens timing even further in the delivery of “Right to be hellish”, delaying the “to” into a heavy swing feel. This also emphasizes the song’s very first transition into the falsetto range, preparing us for the solid gold hook in the next measure. His handling of the break out of falsetto on “hellish” is likewise a great touch, adding an extra oomph to an interesting and unusual word.

We reach the full-on singalong moment in the second half of the chorus. He tightens up the timing on “You’re too sexy beautiful”, which helps to clarify what might otherwise be an awkward line to sing (back-to-back syllables beginning with the S and X). Along with this, he leaves little gaps after the notes in “too / se / xy”, and adds a nice decrescendo to the back half of “beautiful”. On the line leading up to the hook, Jonas pushes the slide in “taste” out to almost a full 16th note, and uses strong, specific note cutoffs after “taste” and “that’s.”

These two short notes on 3 and 4 are followed by an extremely satisfying bit of melodic work over the changes: After a D to F#m turnaround, the accompaniment hits the downbeat on an A chord, and we get a surprising falling 5th in the melody from F# to B on “why…”. This melody resolves down to the tonic a beat later: harmonically, forming an A+9 to A. You can tell Jonas digs this fancy resolution, too: in the live version he adds an ornament to the resolution.
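If you want to check the interval math in that passage, here is a small Python sketch. It assumes plain 12-tone equal temperament and sharps-only note names, and the helper names are hypothetical ones made up for this illustration.

```python
# Pitch classes numbered in semitones above C.
NOTE_TO_SEMITONE = {
    "C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
    "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11,
}


def semitones_down(start: str, end: str) -> int:
    """Semitones travelled when falling from `start` to the nearest `end` below it."""
    return (NOTE_TO_SEMITONE[start] - NOTE_TO_SEMITONE[end]) % 12


def semitones_above_root(root: str, note: str) -> int:
    """Distance of `note` above a chord root, folded into one octave."""
    return (NOTE_TO_SEMITONE[note] - NOTE_TO_SEMITONE[root]) % 12


if __name__ == "__main__":
    # F# falling to B spans 7 semitones, which is a perfect 5th.
    print("F# down to B:", semitones_down("F#", "B"))
    # Over an A chord, B sits 2 semitones above the root (the 9th),
    # and the melody then settles onto A itself (0 semitones).
    print("B over A:", semitones_above_root("A", "B"))
    print("A over A:", semitones_above_root("A", "A"))
```

Seven semitones is the perfect 5th of the “falling 5th”, and the two-semitone gap above the root is the added 9th that resolves when the melody steps down to A.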

The two beats of rest after “why” are a great heads-up for listeners that we’re about to hit the hook, and they also allow the vocalist to take a nice deep breath before launching upwards. Besides being fun to sing, approaching this falsetto hook by an intervallic leap rather than stepwise motion allows the singer to nail the falsetto notes mid-register rather than navigating the break range. Besides landing on the second-highest note in the song, Jonas also adds an exaggerated vibrato to “Jealous” and shifts the note into a slightly breathier tone as it progresses.

These melodic lines in the chorus overlap in the studio recording, so in the concert setting he shortens “Jealous” to give himself some time to drop back into his normal register and eliminates the subtle decrescendo to keep the vocal energy up. It’s probably worth mentioning that in this live recording, this song was likely at the end of the set, after an hour’s worth of singing, so in this context some vocal tiredness is to be expected and completely excusable… and in any event, he sounds solid.

Second Verse

There’s a lot of variety in the delivery on the second verse, and each line gets a slight melodic or arrangement twist as well. The first line plays with the rhythm by leaning into words which start on softer sounds like “h” and “a,” and nailing the harder sounds like “d,” “t,” and “p” right on the beat. After sliding way up into the “it,” we get the song’s only specified melisma (there’s a few rando ones in the ad-libbing at the end).

The first two lines build dynamically up to the harmonized “just for me”, which he emphasizes by shortening the preceding “bit” to a 16th note. He nails the “for” aggressively (bending the O into an UH), and the “Me” includes both a whole step slide in and out, and finishes with a short exhale leading into 2. He delivers this line with the slurred, blustery affect of bravado established in the chorus.

The contrast between the studio and live versions of “protective or possessive, yeah” is helpful. Once again, he’s using the harder consonants “tec” and “tive” to nail the beats, and brings out the slides slightly on the sibilants “se” and “sive.” In the live version he sings the “or” louder, and dips lower into his chest voice for the final “Yeah” (which he pronounces kind of like “ö-eh” … yeah, no clue on that one).

The rhythm of the final line diverges from the first verse. He specifically places a pause after “call it”, and (like the first verse) nails the “pass” right on the beat rather than sliding into it. In the live version he places each syllable more cleanly on the beat, and adds a bonus melisma to the “I turn my.”

The Bridge

At the bridge, Jonas makes the unusual and interesting decision to use repeated upward slides into notes during the first half of each phrase (rather than sliding between each note). This also allows the length of the upward slide to control the perception of a note’s timing. In the live version, he pushes both the start of the slides and their lengths out slightly, which keeps the bouncy/round timefeel from the chorus going strong.

The bridge contains a few specific ornaments (technically, an inverted turn on “else” and a mordent on “know”¹⁷), and for the live version he throws in a bonus mordent on “only”. Tbh I’m not crazy about these, because I feel like the specificity with which he performs these ornaments runs semiotically counter to the established salt-of-the-earth affect… but this is a relatively minor complaint.

The bridge also features a few prominent vocal harmonies. The chorus is also harmonized, but these parts are mixed way back. The bridge harmonies feature some unusual parallel 4ths, and when the vocal melody jumps up to “no one else”, this satisfyingly shifts to more consonant diatonic 3rds.

The final measure has some notable harmonic things going on. The bass walks up to a G, and the harmonized vocals sustain an A on top and resolve an F# to E natural below, creating a kind of Em/G or G+6+9 feel (or a B quartal/G if you roll that way). In the live version, the band spices this up with some unexpected NFL Theme vibes,¹⁸ which helps to create momentum leading into the breakdown before the final choruses.

The Breakdown & Ad-Libs

Breakdowns usually provide a pop vocalist with some space to step out and nudge their deliveries into more adventurous territory, and Jonas takes the opportunity to do so. With the production stripped down, you can more clearly hear the slight upward slides into notes — though the audio edit between “music up / and I’m” is a bit exposed here. You can also hear more clearly which words he’s adding vibrato to (“face”, “no”, and “be”), as well as the slight vocal fries on the end of “obsessed” and the beginning of “I mean”.

Jonas shifts “It’s not your fault that they hover” up an octave into falsetto, and stretches the rhythm of the line way out, displacing “I mean” by an eighth note and condensing it to a pair of sixteenth notes, followed by “no” arriving on the sixteenth after the downbeat. The transition back to full voice is tricky, placing the “it’s” in falsetto and the “my” in full voice. Take a moment to sing this yourself, and you’ll see why he adds a small rest in the live version to handle this in a more comfortable way. On the final “I still get jealous” hook, he triumphantly kicks the high D up to an E.

The ad-libs during the final chorus are all in Bm pentatonic, providing a comfortable and familiar harmonic space for melodic interjections. For a singer with Jonas’ level of chops, they’re comparatively reserved, with the exception of a blink-and-you’ll-miss-it 16th triplet run crossing the bar line for “That’s why”.
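As a quick reference for what “Bm pentatonic” contains, here is a minimal Python sketch that spells out the scale. It assumes the usual minor pentatonic interval pattern (root, minor 3rd, 4th, 5th, minor 7th) and sharps-only note names; the function name is a hypothetical one for illustration.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Semitone offsets from the root: root, minor 3rd, 4th, 5th, minor 7th.
MINOR_PENTATONIC_STEPS = [0, 3, 5, 7, 10]


def minor_pentatonic(root: str) -> list[str]:
    """Spell the minor pentatonic scale starting on `root`."""
    start = NOTE_NAMES.index(root)
    return [NOTE_NAMES[(start + step) % 12] for step in MINOR_PENTATONIC_STEPS]


if __name__ == "__main__":
    scale = minor_pentatonic("B")
    print(scale)
    # The hook's high D and the E he swaps in at the end both belong to the scale.
    print("D" in scale, "E" in scale)
```

With only five notes to choose from, and none of them clashing hard against the chords, it is easy to see why this is a comfortable space for ad-libs.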

This decision to hold back is (in a sense) another “nuanced” one: a younger vocalist looking to prove his mettle might instead opt for several measures of pyroclastic melismas — and that sort of decision has its own charm as well! But, perhaps having already weathered the rolling seas of teenage stardom, Jonas felt comfortable sitting back and letting the unrelenting hookiness of the song’s chorus do the heavy lifting; his ad-libs function more as subtle punctuation (IMO the most interesting of which is the “You / Hey” which places the “Hey” on the offbeat of 4, where the chord changes without accent in the drums).

This deferential approach is carried over into the live version, where he sings the hook almost unadorned, and allows the band to elevate the energy level of the final choruses with a thundering flurry of activity (those bass fills 😫👌💯).

A Matter Of Taste

On the few occasions I’ve been wine tasting, the day often begins with the best of intentions, only to devolve into a freewheeling bacchanalia as the afternoon rolls along. This is, in part, because I lack a certain amount of self-control (and generally, class); but more pertinently, because I haven’t really acquired the skill of tasting wine.

After a generous sip of a red which I can identify only in the roughest sense as “dry” or “tannin-y”, hearing a professional Sommelier break down the specific components of the wine’s flavor is a fascinating experience. And while that activity can be as much performative as informative, even when the Sommelier is clearly leaning towards the former, I find myself (at least momentarily) inspired to approach the activity of tasting with greater care and clarity.

For writers with a certain amount of technical experience in music, pop occupies an unusual position: roughly comparable to the unavoidable discussion of Budweiser in New Yorker articles about the world of artisanal craft brewing.¹⁹ As the post-internet critical consensus tipped inexorably towards Poptimism in heated (and necessary) debates over inclusivity, aesthetics, and authenticity policing, one smaller issue which got lost in the shuffle is that from a technical perspective, almost anything that hits the Top 40, regardless of style, is roughly comparable in terms of form and complexity.

The act of declaring in favor of pop music positions a technical music writer — like the Orval Brewmeister Jean-Marie Rock boldly declaring Budweiser the “best” American beer — within a particular frame of contrarian-chic… though I think this perception might be slightly skewed.

It’s easy to sing the praises of pop artists who indulge in exciting leftfield tweaks to the standard formulas (like the adventurous harmony in Britney’s “Toxic”, oddball prog-pop acts like Ice Choir, or artists who transition beyond categorical pop entirely like Prince, The Beatles, or Kanye) because they tend to make decisions which are overtly unusual and in-and-of-themselves musically interesting. What takes a little longer to appreciate is that even the most infuriatingly generic, banal, market-driven, or disposable pop music contains a staggering tree of musical decisions which can be observed.

As to whether these decisions require extensive comment… well, on the one hand: my gut tells me there’s only so many 5,000-word taste profiles of Budweiser one can write in a lifetime — but on the other: certainly the omnipresence of America’s favorite wallpaper of a lager at the weddings, wakes, and the millions of ephemeral moments which pepper our lives is, in itself, one of its most notable features.

I am writing all of this not because I think Jealous is an overlooked masterpiece due for a serious critical re-evaluation — don’t get me wrong, I think it’s a solid pop tune with a great delivery — but rather because someone on the internet asked me about the word “nuance”. And right now, I happen to be in a place where I have the time and motivation to consider this question fully… and perhaps, like in the woozy afternoon following a heady day of wine tasting, in far too great detail and length, and with just a touch of embarrassing sincerity.


* Article may contain less than ten million words.

[1] Here’s two really extreme examples: Prince’s “The Ladder” and Brandy’s “What About Us?”.

[2] See the repeated appoggiatura motif in Jeremih’s “Birthday Sex”.

[3] The Cranberries’ “Zombie”.

[4] See the “Heart / Hea-ah-art” hook in Regina Spektor’s “Fidelity”.

[5] You already knew this footnote was going to be the Ying Yang Twins.

[6] Here’s a decent explainer video. Personally, I think vocal fries are fine and the hermeneutics surrounding the argument itself are more interesting.

[7] Metal vocal technique is fascinating and will lead you down some highly rewarding internet rabbit holes.

[8] For example, take a moment to listen closely to Australian artist Troye Sivan’s seductive vowels.

[9] Unvoiced Plosives.

[10] Compare the equally effective transitions into the chorus on Alanis Morissette’s “You Oughta Know” and Miley Cyrus’ “Wrecking Ball”.

[11] Back in the mid 90s, my local pop/alternative station played The Violent Femmes’ “Blister In The Sun” on heavy rotation.

[12] Semiotics: the final frontier.

[13] Katy Perry covering “Paris” live on BBC1.

[14] I was honestly a little surprised that more reviews of The 1975 didn’t go into greater depth on its primary stylistic template: INXS’ “Kick”.

[15] Midatlantic Accent. Here’s an Atlantic article on the topic.

[16] Nick Jonas’ “Bacon”, live 2016.

[17] Inverted Turn, Mordent.

[18] “King of the sports jingle” Scott Schreer’s indomitable NFL Theme.

[19] The New Yorker, “A Better Brew”, Nov. 24, 2008.