The Database Journal
03/10/2006
Adjusting Autonumber Increments

03/10/2006
Moving Access Data to Microsoft Excel

03/10/2006
Exporting Access Data with TransferSpreadsheet

03/10/2006
Export Access Data with VBA

03/13/2006
Data Layout in Microsoft Excel

03/17/2006
Excel Text to Number Conversion

03/21/2006
Transposing Excel Ranges

03/25/2006
Access VBA Timer Event

03/29/2006
What is MySQL?

04/02/2006
Microsoft Access Parameter Query Tutorial Video

04/06/2006
How does PHP Work?

04/10/2006
Microsoft Access VBA Tools - References

Click here to view all articles.

Database Design
Website Design
Search Engine Marketing
Home
Contact Us
About Us
Database Journal
Newsletters
Database Newsletter

Separating First and Last Names in Microsoft Excel

As someone who’s worked on thousands of Excel Spreadsheets fixing problems ranging from data layout, to charting, to programmatically adjusting pivot table properties through code, there’s one problem I’ve seen resurface constantly. The dreaded “Name” problem.

As much as we’d all like to think that every person in the known universe is an expert in Microsoft Excel and has years of background in application design and programming, it’s just not the case. It’s not their fault, they just don’t know.

It usually starts with a simple email, “These are the people that are going to the picnic...” or some such thing is the only sentence in the body of the email, and attached is an Excel spreadsheet called, “These are the people that are going to the picnic.xls”. Calmly, you detach the file, rename it to something more appropriate like, “Picnic.xls” and open it up to find a large list of names in no discernible format or order. Your job is to mail merge these names into form letters and you don’t have the time to fix them by hand, so what do you do?

You have a few options within Microsoft Excel, including some tools that are readily available to split names, and the use of some formulas to separate them for import into Microsoft Access.

Microsoft Excel has made available a wonderful tool called, “Text to Columns”. This handy-dandy menu option is available under your Data menu, and is a simple wizard that walks you through splitting data that has spaces or commas as delimiters, or has a fixed width. Basically, names like “John Smith” in cell A1 and “Bob Jones” in A2 can be split so John and Bob are in column A, and Smith and Jones are in column B. Whenever possible, this is your best method to split data. The issue with Text to Columns is that sometimes you’ve got more variables, so in some cases there may be two spaces, and in some cases one. Other times there may be one period in the middle of the field, and sometimes no periods. Figure 1 is an example of data that might be more difficult to split successfully with “Text to Columns”, thus requiring a formula:

Figure 1

In this example, if we used Text to Columns, Stephen, Mark, and Arthur would appear in cells A1, A2, and A3 respectively. Our problem is that since Arthur’s middle initial of “C.” is included, column B would hold “King”, “Twain”, and “C.”. Finally, column C would hold “Clarke” all by it’s lonesome.

This is the general problem splitting up fields that have a variable number of spaces. Text to Columns is great for consistent data, but change it up a little bit and you could be asking for it, especially with thousands of records.

So what we need to come up with here is a formula that will split the names successfully. The formula should give us “Arthur C.” in one column, and “Clarke” in another, but it should also cover regular names like “Mark Twain” and split it up appropriately. By having all of our last names in one column, we can easily sort and merge. First let’s look at a basic formula to split a first name from a full name:

=LEFT(A1,FIND(" ",A1,1)-1)

This formula is saying, “Start from the Left in cell A1, give me as many characters as there are to the first space you encounter, minus one.”

This formula used on our above example would give the result “Stephen”. Since the space in “Stephen King” is the 8th character, it simply starts from the first character, counts eight characters over, subtracts one (so we don’t include the space), and gives you that result.

In cell C2, we need a formula to capture “King” This one is a little trickier:

=RIGHT(A1,LEN(A1)-FIND(" ",A1,1))

This formula is saying, “Start from the Right in cell A1, subtract the position of the space in A1 from the length of the entire string, and return that many characters from the right.”

This formula gives the result of “King”. The space is the 8th character in the string, and the length of “Stephen King” is 12 characters which is a difference of four. Four characters starting from the right is “King”.

Now we are confronted with the problem of “Arthur C. Clarke”. You can approach this in a couple of ways. If the list contains other names with middle initials, it’s beneficial to do a quick scan to see if all of the middle initials are followed by a period. In the interest of time, we’re going to assume that our list has middle initials followed by periods. The following formula will pull the first name and middle initial:

=LEFT(A3,FIND(".",A3,1))

This formula looks for a period instead of a space to provide an anchor point for the Left function. Additionally, we are not subtracting one, because we don’t want to get rid of the period like we wanted to get rid of the space.

Our function to pull the last name works similarly:

=RIGHT(A3,LEN(A3)-FIND(".",A3,1)-1)

This is the same example as the above formula used to split “Stephen King”. The only differences are that we are looking for the period instead of a space (as there are two), and we are subtracting one because we don’t want “Clarke” with a space before it.

Now our task is to create one simple formula to split the first names (and middle initials) in one column, and last names in another. Since we’ll need some logic built in, we’ll use a simple IF statement to determine whether the field has a middle initial or not (by hunting for a period), and then decide on which of the two formulas to used based on that criteria.

Here’s the first part of the IF statement:

=IF(ISERROR(FIND(".",A3,1)),

IF statements are broken down into three parts. There’s what’s called a “logical test” which is where the decision is made, then there’s “Value if True”, and “Value if False”. This part of our IF statement is a test. It wants to know if the function “FIND(“.”,A3,1))” is an error (thus including it in an “ISERROR” function). If there is no period in the name field, it will return a #Value error. So our IF statement’s test is to see if that error would occur. The next part of the IF statement is what will happen if that is the case:

LEFT(A3,FIND(" ",A3,1)-1),

If the IF statement’s logical test is TRUE (in other words, it tries to find a period and can’t, so would produce an error), then we need to use this formula to pull the first name. This formula is the first we used in our example to set “Stephen” apart from “Stephen King”.

If the IF statement doesn’t find an error with the test, or in other words it finds a period somewhere within the name, then we know it has a middle initial, and the last part of the IF statement will be used:

LEFT(A3,FIND(".",A3,1)))

This will capture the first name AND middle initial like, “Arthur C.”. The reason this formula gets used is because the IF statements logical test tells us that finding a period wouldn’t produce an error, therefore it’s a name with a middle initial. Here’s the entire formula:

=IF(ISERROR(FIND(".",A3,1)),LEFT(A3,FIND(" ",A3,1)-1),LEFT(A3,FIND(".",A3,1)))

The formula to pull the right names is exactly the same structure, however instead of using two LEFT functions, we’re using the two RIGHT functions that we came up with previously. Here is that entire formula:

=IF(ISERROR(FIND(".",A1,1)),RIGHT(A1,LEN(A1)-FIND(" ",A1,1)),RIGHT(A1,LEN(A1)-FIND(".",A1,1)-1))

If you are unfamiliar with IF statements, they are an absolute lifesaver when working with Excel and are well worth the small amount of time spent picking up the syntax and use. Also, if you are unfamiliar with the string functions like LEFT, RIGHT, MID, and LEN, they are essential learning when working with strange data, or just working with data in general.

So are these the only two ways to split data apart? Of course not. The IF statement I gave you is likely the most common way, but far from the only way. You can use the MID formula to check for spaces or characters in the middle of the string, or use a simple VBA procedure with some of the same formulas to make it a lot easier on yourself. All in all, using an IF statement with a combined LEFT and RIGHT function is a very solid way to split up a name!


©2014 Blue Moose Technology, LLC

David Badurina, President of Blue Moose Technology, LLC, is a relational database design expert. David's unique ability to easily explain virtually any technical concept has allowed him to work with companies such as AMD, Motorola, the American Heart Association, and countless small businesses. Learn more about database design right now at http://www.bluemoosetech.com.

Article Publishing and Re-Print Guidelines

You can easily link to this article!

If you'd like to link to this article from your own website, copy and paste the following HTML code onto your page:

On your page, the code will appear like this:

Separating First and Last Names in Microsoft Excel