2 easy ways to parse data in Microsoft Excel


become microsoft excel expert
Image: IB Photography/Adobe Stock

If you work with imported or foreign data in Microsoft Excel, chances are you have to clean the data a bit before you can use it. You might have to remove spaces at the beginning or end of values or reformat a column of values from text to numbers or vice versa. It’s also common to parse values when a column stores more than one value.

A good example of this is names. You’ll often find the entire name in a single column and that makes those values difficult to work with. In this tutorial, I’ll show you two quick ways to parse data using Excel’s Flash Fill and Power Query features.

I’m using Microsoft 365 on a Windows 10 64-bit system. You can use string functions in older versions including the .xls menu format. Flash Fill is available in Excel 2013 and later, and Power Query is available in Excel 2010 and later. In addition, Excel for the web supports Flash Fill. You can also open existing Power Query queries in Excel for the web, but you can’t create them in the browser edition.

SEE: Windows, Linux, and Mac commands everyone needs to know (free PDF) (TechRepublic)

How to parse using Microsoft Excel string functions

Before Flash Fill, most parsing required combining string functions such as RIGHT(), LEFT(), MID(), FIND() and LEN(). For example, the expression in column D of Figure A.

=IFERROR(RIGHT(B3,LEN(B3)-IFERROR(FIND(” “,B3,FIND(” “,B3)+1),FIND(” “,B3))),B3)

The above expression returns the last name from the values in column B. IFERROR() handles the error when there’s no second space. Fortunately, our name values are consistent in format, but that won’t always be the case. Even if IFERROR() is removed, it’s still a complex and tedious expression to work through.

using a string function to parse data in Microsoft Excel
Figure A: You can use string functions to parse data.

Flash Fill can complete many parsing tasks in more recent versions of Microsoft Excel.

How to use Flash Fill to parse data in Microsoft Excel

Excel’s Flash Fill feature has been around since Excel 2013, so you may be familiar with it. When using it to parse, enter the first value in the target column, so Excel can evaluate the pattern. Sometimes, Flash Fill requires only one value. Because our data is consistent, this parsing task will be easy.

Let’s use Flash Fill to parse the last names from the name values in column B.

First, enter Harkins in C3 and press Ctrl + Enter. Most likely, Flash Fill won’t fill the column.

Then, enter the first few characters in Smyth, as shown in Figure B. As you can see, Flash Fill now has the pattern and offers to fill the entire column.

Flash Fill autofills the last names in the new column
Figure B: Keep entering last names until Flash Fill picks up the pattern.

Press Enter to fill the remaining cells, as shown in Figure C.

the last names have been filled in the new column using Flash Fill
Figure C: Once Flash Fill has the pattern, press Enter to fill the remaining cells.

Excel’s Flash Fill is a lot faster and easier than combining string functions. Even when the data isn’t consistent, this feature does a good job of finding the pattern.

Excel for the web supports Flash Fill. Look for it on the Data tab. Instead of pressing Ctrl + Enter, click Flash Fill in the Data Tools group.

SEE: 6 ways to save time using Flash Fill in Microsoft Excel (TechRepublic)

How to use Power Query to parse data in Microsoft Excel

Power Query is available to Microsoft Excel 2010 users and later. Its purpose is to retrieve and clean data, but it’s full of features that can do much more. Using Power Query, we’ll add a column and parse the last names into that column.

First, we need to load the data into Power Query as follows:

  1. Click anywhere inside the table that you want to load into Power Query.
  2. Click the Data tab.
  3. In the Get & Transform Data group, click From Table/Range.
  4. If you haven’t formatted the data as a Table, Power Query will prompt you to do so. The demonstration Table is TablePersonnel. You don’t need to name the Table, but a named Table is a bit easier to work with.

Figure D shows TablePersonnel in Power Query.

menu option to load the data into Power Query
Figure D: Load the data into Power Query.

Once the data is in Power Query, add a column based on the Personnel column.

First, click the Personnel header to select the column, and click the Add Column tab.

In the General Group, choose From Selection from the Column From Examples dropdown (Figure E). Power Query adds a new blank column. To enter the new last name values for this column, we’ll use Power Query’s own version of Flash Fill.

menu option to name the new column based on the existing Personnel column
Figure E: Base the new column on the existing Personnel column.

Into the first cell of the new column, Column 1, enter Harkins (Figure F). Power Query responds by offering a set of last name values for the entire column. Look them over a bit. If they’re not correct, enter another last name. However, Power Query returns the correct list with just one name to set the pattern for our data.

filling in the fields of the new column
Figure F: Enter the first last name.

Click OK to create and populate the new column with last names, as shown in Figure G.

data is populated using Power Query based on existing data
Figure G: Power Query populates the new column.

All that’s left to do is to save the modified data to Microsoft Excel, as follows:

  1. Click the Home tab.
  2. In the Close group, click Close and Load from the Close & Load dropdown.

Figure H shows the parsed last names in an Excel sheet. Power Query uses the name TablePersonnel to name the new sheet. At this point, you can rename the new column. You can also do so in Power Query before saving the data.

the Power Query data that has been saved in Microsoft Excel
Figure H: Save the data to Excel.

Flash Fill was definitely quicker but knowing the Power Query method will help when you’re already working with data in Power Query.



Source link