15 Excel Data Analysis Functions You Need to Know
If you’ve ever used Excel, then you’ve probably experienced the agony of choosing the wrong formula to analyze a data set. Maybe you worked on it for hours and finally gave up because the output was wrong or the function was too complicated, and it seemed easier to just count the data manually. If that sounds like you, then this top 15 list for data analysis in Excel is for you.
There are hundreds of functions in Excel and it can be overwhelming trying to match the right formula with the right kind of data analysis. The most useful functions don’t have to be complicated. In fact, there are 15 simple functions that will improve your ability to analyze data, making you wonder how you ever lived without them.
Whether you dabble in Excel or use it heavily at your job, there is a function for everyone in this list.
=CONCATENATE is one of the easiest formulas to learn but one of the most powerful when conducting data analysis. It combines text, numbers, dates and more from multiple cells into one. This is a great function for creating API endpoints, product SKUs, and Java queries.
Formula: =CONCATENATE(SELECT CELLS YOU WANT TO COMBINE)
In this example: =CONCATENATE(A2,B2)
=LEN quickly provides the number of characters in a given cell. As in the example above, you can identify two different kinds of product Stock Keeping Units (SKUs) using the =LEN formula to see how many characters the cell contains. LEN is especially useful when trying to determine the differences between different Unique Identifiers (UIDs), which are often lengthy and not in the right order.
Formula: =LEN(SELECT CELL)
In this example: =LEN(A2)
=COUNTA counts the cells in a range that are not empty. Pointed at a single cell, it returns 1 if the cell contains anything and 0 if it is empty. In the life of a data analyst, you’re going to run into incomplete data sets daily. COUNTA will allow you to evaluate any gaps the dataset might have without having to reorganize the data.
Formula: =COUNTA(SELECT CELL OR RANGE)
In the example: =COUNTA(A10)
=DAYS is exactly what it implies. This function determines the number of calendar days between two dates, with the end date given first and the start date second. This is a useful tool for determining the lifecycle of products and contracts, and for run-rating revenue depending on service length – a data analysis essential.
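=NETWORKDAYS is slightly more robust and useful. This formula determines the number of “workdays” between two dates (counting both the start and end date) and can optionally account for holidays. Even workaholics need a break every now and then! Using these two formulas to compare time frames is especially helpful for project management.
Formulas: =DAYS(END_DATE, START_DATE) OR =NETWORKDAYS(START_DATE, END_DATE, [holidays]) note: [holidays] is optional and is a cell range (or list) of holiday dates to exclude, not a count of holidays
In the example: =DAYS(C8,B8) OR =NETWORKDAYS(B7,C7,E2:E4) note: E2:E4 is an assumed range holding the holiday dates; passing a bare number such as 3 would be read as a date serial rather than as three holidays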
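For readers who want to see the counting logic spelled out, here is a minimal Python sketch of what the two functions compute. The function names and sample dates are illustrative assumptions, not part of Excel, and the sketch assumes the start date is on or before the end date and that the weekend is Saturday and Sunday.

```python
from datetime import date, timedelta

def days(end, start):
    """Calendar days between two dates, mirroring =DAYS(end_date, start_date)."""
    return (end - start).days

def networkdays(start, end, holidays=()):
    """Count Monday-Friday dates from start to end inclusive, skipping holidays."""
    skip = set(holidays)
    count = 0
    current = start
    while current <= end:
        # weekday() is 0 for Monday through 4 for Friday
        if current.weekday() < 5 and current not in skip:
            count += 1
        current += timedelta(days=1)
    return count

start = date(2024, 1, 1)   # a Monday
end = date(2024, 1, 12)    # the Friday of the following week
print(days(end, start))                              # 11
print(networkdays(start, end))                       # 10
print(networkdays(start, end, [date(2024, 1, 1)]))   # 9
```

Note that the holiday argument is a list of dates, which is why a plain number does not work in its place.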
=SUMIFS is one of the “must know” formulas for a data analyst. The common formula used is =SUM, but what if you need to sum values based on one or more criteria? SUMIFS is it. In the example below, SUMIFS is used to determine how much each product is contributing to top-line revenue.
Formula: =SUMIFS(SUM_RANGE,CRITERIA_RANGE1,CRITERIA1,[CRITERIA_RANGE2,CRITERIA2],...) note: additional criteria range/criteria pairs are optional
In the example: =SUMIFS($B$2:$B$28,$A$2:$A$28,$F2)
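The row-by-row logic behind a conditional sum can be sketched in a few lines of Python. This is an illustration under simplifying assumptions: the function name, column contents, and figures are made up, and only exact-match criteria are handled (Excel also accepts comparison operators and wildcards).

```python
def sumifs(sum_range, *criteria_pairs):
    """Add up sum_range values on rows where every criteria range equals its criterion."""
    total = 0
    for i, value in enumerate(sum_range):
        if all(rng[i] == crit for rng, crit in criteria_pairs):
            total += value
    return total

products = ["Widget", "Gadget", "Widget", "Gadget"]   # hypothetical product column
regions = ["East", "East", "West", "East"]            # hypothetical second criteria column
revenue = [100, 250, 150, 50]                         # hypothetical revenue column

print(sumifs(revenue, (products, "Widget")))                      # 250
print(sumifs(revenue, (products, "Gadget"), (regions, "East")))   # 300
```

The sum range comes first and each criteria range is paired with its criterion, which is the same argument order SUMIFS uses.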
Much like SUMIFS, =AVERAGEIFS allows you to take an average based on one or more criteria.
Formula: =AVERAGEIFS(AVERAGE_RANGE,CRITERIA_RANGE1,CRITERIA1,[CRITERIA_RANGE2,CRITERIA2],...) note: additional criteria range/criteria pairs are optional
In the example: =AVERAGEIFS($C:$C,$A:$A,$F2)
=VLOOKUP is one of the most useful and recognizable data analysis functions. As an Excel user, you’ll probably need to “marry” data together at some point. For example, accounts receivable might know how much each product costs, but the shipping department can only provide units shipped. This is the perfect use case for VLOOKUP.
In the image below we use reference data (A2) combined with the pricing table to have Excel look up the matching value in the first column of the table and return a value from another column in the same row.
Formula: =VLOOKUP(LOOKUP_VALUE,TABLE_ARRAY,COL_INDEX_NUM, [RANGE_LOOKUP])
In the example: =VLOOKUP($A2,$G$1:$H$5,2,0)
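An exact-match lookup (the 0 in the last argument) behaves roughly like the Python sketch below. The function name and the price table are hypothetical, and the sketch returns None where Excel would display #N/A.

```python
def vlookup_exact(lookup_value, table, col_index):
    """Return the col_index-th column (1-based) of the first row whose first column matches."""
    for row in table:
        if row[0] == lookup_value:
            return row[col_index - 1]
    return None  # Excel would show #N/A here

# Hypothetical pricing table: product name in column 1, unit price in column 2
price_table = [("Apple", 1.20), ("Banana", 0.50), ("Cherry", 3.00)]

print(vlookup_exact("Banana", price_table, 2))  # 0.5
print(vlookup_exact("Durian", price_table, 2))  # None
```

The lookup only ever searches the first column of the table, which is why the key column has to sit to the left of the value you want returned.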
=FIND/=SEARCH are powerful functions for isolating certain text within a data set. Both are listed here because =FIND is case-sensitive, i.e. if you use FIND to query for “Big”, only cells containing “Big” with that exact capitalization will match. A =SEARCH for “Big” will match Big or big, making the query a bit broader. Both return the position at which the text starts. This is particularly useful for looking for anomalies or unique identifiers.
Formula: =FIND(TEXT,WITHIN_TEXT,[START_NUMBER]) OR =SEARCH(TEXT,WITHIN_TEXT,[START_NUMBER]) note: [start_number] is optional and is used to indicate the character position in the text at which to start searching
In the example: =FIND("Big",A2,1)
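The difference between the two functions comes down to case handling, which the following Python sketch illustrates. The function names and sample strings are assumptions for illustration; the sketch ignores the wildcard support that SEARCH has, and returns None where Excel would show #VALUE!.

```python
def find(text, within_text, start_num=1):
    """Case-sensitive 1-based position, like =FIND."""
    pos = within_text.find(text, start_num - 1)
    return pos + 1 if pos >= 0 else None

def search(text, within_text, start_num=1):
    """Case-insensitive 1-based position, like =SEARCH without wildcards."""
    pos = within_text.lower().find(text.lower(), start_num - 1)
    return pos + 1 if pos >= 0 else None

print(find("Big", "The Big Sale"))    # 5
print(find("Big", "the big sale"))    # None
print(search("Big", "the big sale"))  # 5
```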
=IFERROR is something that any analyst who actively presents data should take advantage of. Using the previous example, looking for specific text/values in a dataset won’t always return a match. A missing match causes a #VALUE! error, and while harmless, it is distracting and an eyesore.
Use =IFERROR to replace the #VALUE! errors with any text/value. In the example above the cell is left blank so that data consumers can easily pick out which rows returned a matching value.
Formula: =IFERROR(VALUE,VALUE_IF_ERROR)
In the example: =IFERROR(FIND("BIG",A6,1),"")
=COUNTIFS is the easiest way to count the number of times a dataset meets a set of criteria. In the example above the product name is used to determine which product was the best seller. COUNTIFS is powerful because it accepts many criteria at once (up to 127 range/criteria pairs).
Formula: =COUNTIFS(CRITERIA_RANGE1,CRITERIA1,[CRITERIA_RANGE2,CRITERIA2],...)
In the example: =COUNTIFS($A:$A,$F9)
=LEFT, =RIGHT are simple and efficient methods for extracting static data out of cells. =LEFT will return the “x” number of characters from the beginning of the cell, while =RIGHT will return the “x” number of characters from the end of the cell. In the example below, =LEFT is used to extract the consumer’s area code from their phone number, while =RIGHT is used to extract the last 4 digits.
Formula: =LEFT(SELECT CELL,NUMBER) OR =RIGHT(SELECT CELL,NUMBER)
In this example: =LEFT(A6, 3) AND =RIGHT(A6,4)
=RANK is an old Excel function, but that doesn’t diminish its effectiveness for data analysis. =RANK allows you to quickly show how values rank in a dataset in ascending or descending order. In the example, RANK is being used to determine which clients order the most product.
Formula: =RANK(SELECT CELL,RANGE_TO_RANK_AGAINST,[ORDER]) note: [order] is optional
In the example: =RANK($B7,$B$2:$B$7,0) note: 0 (or omitting the argument) ranks the largest value as #1; any nonzero value ranks the smallest value as #1
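How ties and the order argument behave is easiest to see with a small worked case. The Python sketch below is an illustration only; the function name and the order quantities are made up.

```python
def rank(value, values, order=0):
    """1-based rank of value within values; tied values share a rank, like =RANK."""
    if order == 0:
        # Descending: rank is one more than the count of strictly larger values
        return 1 + sum(1 for v in values if v > value)
    # Ascending: rank is one more than the count of strictly smaller values
    return 1 + sum(1 for v in values if v < value)

orders = [120, 300, 300, 80]  # hypothetical units ordered per client
print([rank(v, orders) for v in orders])      # [3, 1, 1, 4]
print([rank(v, orders, 1) for v in orders])   # [2, 3, 3, 1]
```

Because the two clients with 300 units tie for first, the next client is ranked third rather than second.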
=MINIFS is very similar to the =MIN function, except that it takes the minimum of only those values that match your criteria. In the example, =MINIFS is used to find the lowest price each product sold for.
Formula: =MINIFS(MIN_RANGE,CRITERIA_RANGE1,CRITERIA1,...)
In this example: =MINIFS($B:$B,$A:$A,$E5)
=MAXIFS, like its counterpart =MINIFS, allows you to match on criteria, but this time it looks for the maximum number.
Formula: =MAXIFS(MAX_RANGE,CRITERIA_RANGE1,CRITERIA1,...)
In this example: =MAXIFS($B:$B,$A:$A,$E5)
=SUMPRODUCT is a great function for calculating average returns, price points, and margins. SUMPRODUCT multiplies each value in one range by its counterpart in the same row of another range and adds up the results. It’s data analysis gold. In the example below, we calculate the average selling price of all our products by using SUMPRODUCT to multiply Price by Quantity row by row, sum the results, and then divide by the total volume sold.
Formula: =SUMPRODUCT(RANGE1,RANGE2)/SELECT CELL
In this example: =SUMPRODUCT(B2:B9,C2:C9)/C10
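The arithmetic behind that weighted average can be checked by hand with a short Python sketch. The function name, prices, and quantities below are made-up illustrations, and the sketch raises an error for ranges of different lengths, where Excel would return #VALUE!.

```python
def sumproduct(range1, range2):
    """Multiply the two ranges row by row and add up the products."""
    if len(range1) != len(range2):
        raise ValueError("ranges must be the same size")
    return sum(a * b for a, b in zip(range1, range2))

prices = [10.0, 20.0, 5.0]      # hypothetical price column
quantities = [3, 1, 6]          # hypothetical quantity column
total_volume = sum(quantities)  # plays the role of the total cell, here 10

average_price = sumproduct(prices, quantities) / total_volume
print(average_price)  # 8.0
```

Here the products are 30, 20, and 30, so 80 in revenue over 10 units gives an average selling price of 8.0, noticeably lower than the simple average of the three listed prices.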
We hope you found that useful. If you’re interested in Data Analysis in Excel, take a look at the Excel course that has helped hundreds of thousands of people master Excel.