Data Limitations and Double-Barrel Names

Those of us who study names tend to rely on public data produced by our countries of interest. I primarily study American names; hence, I use the popularity data that the Social Security Administration publishes and updates each May. But I’m also very much interested in British naming practices, so if I want to look at data for England and Wales, I can turn to the figures published by the Office for National Statistics. Scotland, oddly enough, produces its data separately from those two, as does Northern Ireland. Unfortunately, I must admit that I know comparatively little about trends outside of the U.S. and Britain, and even my knowledge of British name popularity is limited. My expertise revolves around the American national data.

I was spurred to write this post after seeing another blogger’s post, which mentioned what I call compound names. I’m not sure if that’s the right term for them, but I understand compound names as those like Annabeth, Maryanne, and Rosemarie. These are mostly accepted as names in their own right, but you may notice that each is composed of two separate names: Anna+Beth, Mary+Anne, Rose+Marie. But let me ask you this: is Maryanne just that, or is it perhaps Mary-Anne?

If you look at American data, Mary-Anne doesn’t technically exist. Sure, there are likely lots of Mary-Annes out there. However, the data doesn’t include hyphens. This also means that Lily-Rose will appear as Lilyrose, Mary-Elizabeth as Maryelizabeth, and so on. Interestingly enough, English and Welsh data does make the distinction. From what I hear, hyphenated names (also known as “double-barrel” names) are common enough in the UK that Olivia-Rose is listed as Olivia-Rose.

The American data also doesn’t show apostrophes or diacritical marks, yet many of us have met people with apostrophes in their names. The data also capitalizes only the first letter of a name, so RosaLinda actually appears as Rosalinda. What the American data does differentiate is spelling; as a result, one can expect every single variation of Caitlin to appear as a separate name.

Another limitation applies when a name consists of only one letter. If and when those names occur, they won’t show up in the public data. We’ll never know how many people there are just named “A” or “E.” It takes two to tango, and it takes two letters to appear. Numbers are utterly verboten.
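For the curious, here is a rough Python sketch of the formatting rules described above. It is only my approximation of how a name ends up looking in the American lists, not the SSA's actual processing. The function names `ssa_style` and `would_appear` are my own inventions, and how digits are really handled is an assumption on my part (here they are simply dropped).

```python
import re
import unicodedata


def ssa_style(name: str) -> str:
    """Approximate how a name might look in the American data (assumed rules)."""
    # Split accented letters into base letter + accent mark, e.g. "é" -> "e" + accent.
    decomposed = unicodedata.normalize("NFKD", name)
    # Keep plain A-Z letters only. This drops hyphens, apostrophes, spaces,
    # accent marks, and (by assumption) digits.
    letters = re.sub(r"[^A-Za-z]", "", decomposed)
    # Only the first letter is capitalized; everything else is lowercased.
    return letters.capitalize()


def would_appear(name: str) -> bool:
    """One-letter names never show up, so require at least two letters."""
    return len(ssa_style(name)) >= 2


for example in ["Mary-Anne", "Lily-Rose", "RosaLinda", "D'Arcy", "A"]:
    print(example, "->", ssa_style(example), "| appears:", would_appear(example))
```

Note that spelling is left alone: Caitlin, Kaitlyn, and Katelynn would each pass through unchanged and remain separate names.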

The biggest limitation is that, excepting a few years in the 1880s, the SSA only publishes data for names receiving five or more uses in a year. This is considered a privacy protection. Regarding historical research, another major limitation revolves around card distribution. Many people born before 1937 died before they could apply for a Social Security card, and for a long time certain occupations (e.g., agriculture and domestic work) were excluded from the program.
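To make the five-use cutoff concrete, here is a small Python sketch. The birth list is invented purely for illustration, and `published_counts` is a hypothetical helper of mine, not anything the SSA provides.

```python
from collections import Counter

MIN_USES = 5  # the publication threshold mentioned above


def published_counts(births):
    """Count each name and keep only those used at least MIN_USES times."""
    counts = Counter(births)
    return {name: n for name, n in counts.items() if n >= MIN_USES}


# Made-up example data: one year's worth of (already formatted) names.
births = ["Olivia"] * 7 + ["Caitlin"] * 5 + ["Kaitlynn"] * 4
print(published_counts(births))
```

In this toy example Kaitlynn, with four uses, is left out entirely, which is why rare spellings can exist in real life without ever showing up in the public lists.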
