The VLOOKUP function is probably one of the most dreaded functions by beginner Excel users and also one of the most appreciated by the more advanced. VLOOKUP is used to lookup a row in a certain table, based on a value and return a corresponding value in a certain column with that row.
VLOOKUP is easily used in Excel. Today, however, we will explore how to do lookup operations in VBA (VBA VLOOKUP) or how to VLOOKUP in VBA if you prefer.
For this example table:
A simple VLOOKUP operation for say to lookup Roberts age might look like this:
What about the same in Visual Basic for Applications (VBA)? The below is the equivalent to the VLOOKUP operation in Excel. We can use the WorksheetFunction object to run a VLOOKUP operation as such:
The lookup value – the value which is searched for within the first column of the provided Table_array parameter
A VBA Range variable. A table of at least 2 columns or more. The first column is the lookup column.
The number of the column in the Table_array table VBA Range from which the matching value must be returned.
An optional parameter. A logical parameter (True or False) that specifies whether you want the VLookup method to find an exact match or an approximate match. True – approximate match, False – exact match.
VBA VLOOKUP Dictionary
Another approach is to use the VBA Dictionary Object. For the same table as above we can use the following code:
What happens above? First we loop through all the rows and create a VBA Dictionary object with a Key-Value pair of all rows. The Dictionary contains only the lookup and matching columns. Next we lookup our desired Key and return the matching Value.
Right. So if both ways provide the same result what is the benefit of building a Dictionary upfront? Well, imagine wanting to lookup multiple values within your VBA code. In the first approach you would have to reintroduce the entire arguments of the Vlookup function. In the second you can run additional simple calls:
The Dictionary approach will in this case prove MUCH FASTER. This is because the Dictionary is built once and then just reused to lookup a certain key as opposed to doing the entire lookup operation on the entire table as a Vlookup operation would do.
The Excel VLOOKUP function (Vertical Lookup) is one of need-to-know features in Excel! I can’t stress enough how many articles/posts out there have been published on using VLOOKUP. Probably rightly so. I couldn’t however resist to make an my attempt at consolidating all the knowledge around how to use VLOOKUP and all the bits and pieces of information you need to know to be able to fully leverage the vertical lookup function in Excel! Let’s start with a VLOOKUP example…
The VLOOKUP Phonebook analogy
To quickly explain what VLOOKUP does I usually like to use the common Phonebook analogy. In the example below imagine we have a friend called John White. Let’s say we want to call John. What we need to do is to locate John in our Phonebook by his Name and Surname and call him on his corresponding phone number located within the same row.
VLOOKUP returns a cell located at a row corresponding to your lookup value. In a phonebook analogy you would be looking-up a person’s phonenumber by his name and surname
Everyday we deal with similar lookup problems e.g. restaurant menu (looking up the dish price), grocery lists (how many bottles of milk did I need to buy?) etc. Hence the usefulness of using VLOOKUP. Let us move on to learn how to use VLOOKUP.
How to use VLOOKUP?
Considering the above VLOOKUP example let’s take a closer look at the parameters of the VLOOKUP function:
Let’s look at the arguments of VLOOKUP in more detail:
lookup_value – value you want to look up. Must be in the first column of the range of cells you specify in table-array
table_array – range of cells VLOOKUP will
Search FIRST column to find the ROW containing the lookup_value
Search and return corresponding value in column col_index_num of the table_array
col_index_num – column number within the table_array that contains return values. The first left-most column in the table_array starts numbering with 1
range_lookup – an OPTIONAL boolean (True/False) parameter that defines how the VLOOKUP function will behave:
FALSE – search for the first EXACT MATCH in the first column of the table_array
TRUE – Default option. assume the FIRST COLUMN in table_array is sorted either numerically or alphabetically and return an APPROXIMATE MATCH. This parameter works correctly ONLY IF THE FIRST COLUMN OF THE table_array IS SORTED!
FALSE VLOOKUP Example
Nothing beats a hands-on VLOOKUP example! See the below animation to witness for yourself how VLOOKUP works!
Notice that I am using the range_lookup set to FALSE. This means that the VLOOKUP function will always look for an EXACT MATCH.
TRUE VLOOKUP Example
In the previous example the range_lookup parameter was set to FALSE. This example will equally work with range_lookup equal to TRUE as long as we sort our FIRST COLUMN alphabetically (or numerically in other cases)!
So what’s the difference you will ask?PERFORMANCE. The TRUE (approximated match) VLOOKUP is significantly faster than the FALSE (exact match) VLOOKUP. Read here more on VLOOKUP Performance.
Important! If you got hooked up already on the performance bit of the TRUE VLOOKUP beware! The TRUE VLOOKUP will always return a result… even if it didn’t find an exact match. That is a setback. Luckily Excel experts have found a way around this with a DOUBLE TRUE VLOOKUP. Read on.
DOUBLE TRUE VLOOKUP
To exemplify the DOUBLE TRUE VLOOKUP let’s consider this table:
We want to match the Animal Monkey against a Category. If we use a standard FALSE VLOOKUP like this:
We will get the following results:
"Mammal" if found
"#N/A" if not found
Great! But what happens if we use a simple TRUE VLOOKUP?:
We will get the following results:
"Mammal" if found
"Amphibian" (or similar) if not found
Well that’s not very appropriate huh? How can we know for sure if the TRUE VLOOKUP cross-referenced the Animal correctly? Well.. why not use a second VLOOKUP for that? Consider the DOUBLE VLOOKUP below:
"Mammal" if found
"" if not found (replace "" with #N/A if needed)
See what happens here? The first VLOOKUP validates whether there is an EXACT MATCH in the first column. If so the second VLOOKUP will return the corresponding result. See below:
Are 2 DOUBLE TRUE VLOOKUPs really worth the trouble? If you are looking for performance than YES! Read more here.
INDEX MATCH vs VLOOKUP – Do’s and Don’t
If you’ve been using VLOOKUP long enough you probably stumbled across MANY articles/posts on why many consider a certain combination of two Excel functions i.e. INDEX MATCH, better than using a regular VLOOKUP. In fact MOST Google results for VLOOKUP vs INDEX MATCH won’t mention the significant advantage VLOOKUP has over INDEX MATCH which I hinted in the previous section.
Let’s compare the pro’s and con’s of using INDEX MATCH (instead of VLOOKUP):
More flexible – allows you to match both against rows and columns
Less error prone – adding/removing columns/rows from the lookup table should not crash the INDEX-MATCH combo
Both vertical and horizontal lookups – VLOOKUP and HLOOKUP address either only vertial or horizontal lookups, whereas with the INDEX MATCH you can easily do both
Harder to use – VLOOKUP is a little easier to understand than INDEX-MATCH and I know some people have difficulty with this two step approach
In some cases you would like to run a VLOOKUP against MORE THAN ONE COLUMN i.e. lookup a certain set of values against several columns instead of just one column. There is a simple way to achieve this by introducing a HELPER COLUMN with a concatenation of the lookup columns. Let’s consider the example below:
As you can see each month is specified in two separate columns. A simple one-column VLOOKUP will not do. So how to use VLOOKUP to get the result for multiple criteria? Using a HELPER COLUMN! See the solution below:
The above is the simplest approach to a multiple criteria VLOOKUP. There are more elegant approaches that need not a Helper Column e.g. check-out Chandoo’s example here.
If you are impatient for the answer – for best performance always use DOUBLE TRUE VLOOKUPS. Now for a more thorough explanation let’s start from the beginning and summarize what we know in bullet points:
The VLOOKUP function can be well replaced with other functions/features in Excel
A common practice it to replace a VLOOKUP with a INDEX MATCH function combo. This does not affect performance (much) but solve a lot of typical usability issues
I have hinted above that a TRUE (approximated) VLOOKUP on a sorted table_array will have better performance that a regular FALSE (exact match) VLOOKUP
The TRUE (approximate) VLOOKUP function seems to be best candidate in terms of performance. However, approximating the result may render certain issues – a TRUE VLOOKUP will ALWAYS return a result! Even if there is no exact match! This creates a certain issue for us. Luckily there is a certain trick to use 2 TRUE VLOOKUPs in a combo to replace a regular FALSE VLOOKUP and expect similar results. This is commonly called a DOUBLE TRUE VLOOKUP.
Performance: VLOOKUPs VS INDEX MATCH VS SQL
The below performance comparison was carried out by me in my separate post on VLOOKUP vs INDEX-MATCH vs SQL. Follow to my post to read more. In the meantime let’s look at the comparison:
What do the various categories mean?:
VLOOKUP (sorted) – a regular FALSE (exact match) VLOOKUP against a sorted table_array
DOUBLE TRUE VLOOKUP (sorted) – 2 combined TRUE (approximate match) VLOOKUPs against a sorted table_array
INDEX-MATCH (sorted) – a combination of the INDEX and MATCH functions against a sorted table_array
SQL (Sorted) – an Excel MS Query (SQL) executed against a sorted table_array
How to use VLOOKUP recap
Well I do hope this exhausts the subject of using VLOOKUP in Excel. VLOOKUP is a common used function in Excel sometimes wrongly which, I hope from my performance stats above, can seriously tamper with your Workbook performance causing Excel Workbooks to recalculate in matters of minutes instead of seconds. Feel free to share your comments/thoughts below or on the AnalystCave forum.
Ever been in a situation where your Excel workbook runs so slow you could make yourself a coffee before it finishes recalculating? Fighting to reduce the file size of your latest Excel report? Or just looking to optimize your Excel workbook a bit? Welcome to the AnalystCave Excel Optimizer the AddIn that will optimize Excel for you! Stay tuned for updates on this post!
I must say this idea has been popping in and out of my head for some time now – although it turned out the first working version of the Optimizer took me only a couple of hours to build. Often coming along Excel files that literally called for wrath from the Gods, I yearned for a time that Excel would include a “Fix Me” button – doing away with all the wrongs of the Workbook I was working on. Unfortunately this is (still) not the case and building your Excel file many of us have to be constantly aware of the dangers that lie ahead. What is more, many Excel users waste a lot of time not appreciating simple tips and tricks that can seriously enhance their Excel experience (like simply saving files in XLSB file format).
Optimize Excel with the Excel Optimizer
The Excel Optimizer is a simple AddIn to Excel that runs certain rules across your Workbook and points out the main pain-points or suggests certain improvements – all with performance in mind. Some rules can be implemented (I am hoping making all as such) while others will simply point out what needs to get done.
This is still a beta-work-in-progress-prototype so keep in mind that there may be some error/bugs here and there. What is more the Excel Optimizer current will only be able to suggest certain fixes (such as replacing some Excel functions with others). Depending on how popular this AddIn becomes I am planning to extend it to be able to actually implement most of the recommended tweaks. Optimizing Excel for best performance has become much easier.
See this video for a quick showcase of how the Excel Optimizer works.
The Excel Optimizer consists of 2 features:
Optimize Excel – the main feature. Opens a window (screen below) that will allow you to run a rule-analysis on your current Workbook. Each rule with validate against different performance standards, rules also have different scopes. There are rules validating the format of the Workbook, validating worksheets and then there are the low-level rules that validate each and every formula. Using the settings button you can turn rules on/off or configure them to enhance your experience
Timer Full Calculation Rebuild – a simple macro that calculates the time needed to do a Full Rebuild Calculation of your Excel workbook. This comes in handy when wanting to set a baseline for the improvements you might want to implement
Below a screenshot of how the main window looks like:
Rules are what drive the Excel Optimizer. The AddIn has been built in a way to make adding/removing/enabling/disabling rules as easy as possible. Currently the Excel Optimizer introduces the following types of rules – many of most still cannot be implemented automatically (expect this in the near future):
Checks if workbooks is in XLSB format if not suggests so
Validates the Used Range of each Worksheet to see if the Used Range is excessive and can be reduced for better performance and optimum file size
Looks for exact (FALSE) VLOOKUPs. If found suggest replacing them with double approximate (TRUE) VLOOKUPs for best performance
Looks for exact repetitions of certain functions and suggest replacing them with a certain cell to minimize the need to recalculate the same function multiple times
Looks for volatile native Excel functions and suggests removing or replacing them
Looks for any external links used within cells in the entire Workbook and suggests removing them or reducing to a bare minimum
Looks for Array Formulas within cells in the entire Workbook and suggests replacing them with regular Excel functions if possible
Looks errors within cells in the entire Workbook and suggests correcting them
You can download the latest version here:
Do you have ideas for new rules? Any suggestions / wishes for additional features? Put your comment below or on my Facebook / Twitter page.
VLOOKUP vs INDEX MATCH vs SQL vs VBA – today you are in for the ultimate Excel Showdown. The VLOOKUP Excel function is one of the most popular functions, around which there has always been much debate. You will most definitely find an article about this function on almost every Excel blog site out there that matters. Similarly there has been much argue about how efficient this function is, when compared to other combos like INDEX MATCH or DOUBLE TRUE VLOOKUPS. I have always wanted to put the dot over the “i” in at least the discussion around performance when using the VLOOKUP and the INDEX MATCH combo (VLOOKUP vs INDEX MATCH).
What to do with your VLOOKUPs to significant gain in performance, and what to replace them with if you are looking to make the workbook more maintainable? How much performance will you actually gain? Hopefully here you will find answers to these questions.
One additional thing I always wanted to bring into this discussion was the MS Query. Many Excel experts often forget to mention that when you need to lookup a lot of values within a certain table there is an approach almost as effective as any Excel trick out there – Microsoft Query (SQL). Excel features so call Query Tables which can execute OLEDB SQL queries on Excel data (worksheets treated like separate SQL Tables).
This means that instead of doing a lookup of a certain value cell-by-cell you can do it within a single query. This query can always be refreshed at the push of a button (or macro), instead of dealing with uncontrollable automatic recalculations. I felt the urge to include this approach in this post as it can challenge face on all the other approaches out there. But let us start from the beginning…
What VLOOKUP does is lookup a certain key (in the example below a “Dog”) within a column of keys in a certain table. Then it takes a value corresponding to the row in which the key was located and returns a corresponding value from another column.
Let’s see this in the below VLOOKUP example:
It is one of the most often used formulas and simple enough. However, the VLOOKUP function has several setbacks:
Hard to maintain when columns are added/removed to/from the lookup table
Key column needs to be first in the lookup table
Little flexibility – cannot be used to match against both rows and columns of a lookup table. Although it can be replaced with HLOOKUP (the forgotten twin brother of VLOOKUP) this can be a nuisance if you want to create a table lookuping up both columns and rows
Why INDEX MATCH?
There are many decent posts on why to consider using INDEX MATCH against the common VLOOKUP. But before we go into the pros and cons let’s understand how the INDEX MATCH combo works.
In short we can replace a VLOOKUP with a combo consisting of 2 functions:
INDEX – returning the value of an element in a table or an array, selected by the row and/or column number indexes)
MATCH – returning the relative position of an item in a specified range
In the example below the MATCH function will first return the relative position (the row number) of the Dog in the A column. Next the INDEX function will return a corresponding value from the same row in column B.
Not really much complicated than the VLOOKUP but the INDEX MATCH combo certainly handles all of the setbacks of the VLOOKUP pretty well (see section above).
But why use INDEX MATCH instead of VLOOKUP, especially if we get the exactly same result? Short summary of the pros and cons of the INDEX MATCH vs. the VLOOKUP:
More flexible – allows you to match both against rows and columns
Less error prone – adding/removing columns/rows from the lookup table should not crash the INDEX MATCH combo
Can be split to match multiple columns – by splitting the INDEX from the MATCH we can in fact match several column off a single INDEX column which points us to the result row
Both vertical and horizontal lookups – VLOOKUP and HLOOKUP address either only vertial or horizontal lookups, whereas with the INDEX MATCH you can easily do both
Harder to use – VLOOKUP is a little easier to understand than INDEX MATCH and I know some people have difficulty with this two step approach
What about SQL?
As promised a quick example as to how SQL (MS Query) can fit in. The query below will do the same lookup operation as the VLOOKUP and the INDEX-MATCH.
SELECTCategory FROM[Sheet1$]WHERE Animal="Dog"
Returning the value
A Query to lookup a whole column of values would look more like this:
How to create an SQL query in Excel? Just go to:DATAFrom Other SourcesFrom Microsoft Query or check out my Excel SQL Add-In.
Now SQL will not prove much useful for just a single lookup operation. Its benefits appear when needing to carry out A LOT of lookup operations. As you will see there are certain tricks you can use in Excel to get better performance than SQL can provide you, however, I would encourage learning SQL as in most cases it can easily replace the need to create complex queries or Array Formulas and still provide awesome performance.
What about VBA?
Couldn’t miss out on an opportunity to check how VBA compares to the other approaches, although it seems pretty obvious it wouldn’t be a fair match – as VBA is singlethreaded as opposed to Excel’s native formulas, which are recalculated concurrently in multiple threads. To challenge the other approaches I devised a simple VBA procedure using the VBA Dictionary object. How does the VBA lookup procedure work? It loads the entire lookup table to a VBA Dictionary object and then looks-up the entire lookup values against the VBA Dictionary.
The VLOOKUP allows you to either approximate a match (by providing range_lookup value to TRUE) or select an exact match. Strangely enough Excel defaults to the approximate options which is obviously a nuisance for most of us… but maybe that was actually a hint from Microsoft that this is the option to go with? It turn out that using the TRUE option that approximates the lookup result will return the lookup value providing a significant performance boost (scroll down to see how significant).
However, a VLOOKUP using the TRUE option will always return a result – not necessarily the one you were looking for. The trick therefore socialized by Charles Williams here shows how using 2 TRUE VLOOKUPS you can get accurate results with great performance. How does it work? In short what Charles suggests is using an IF function with the condition being the FIRST TRUE VLOOKUP to match for the lookup value (the key). If the result matches against the original lookup value (the key) the IF functions returns the SECOND TRUE VLOOKUP which returns the matched value in the right column. If not it returns any default value set. See below:
Notice that the condition in the formula verifies if the VLOOKUP located the right lookup value. If so, the second VLOOKUP will return the associated lookup result. Might seem strange at first that 2 VLOOKUPs are better than one. Let’s look at the statistics to see how it will compare against the other approaches…
There is one CONSIDERABLE setback to keep in mind. The lookup table has to be SORTED by the lookup column (key column)! Otherwise the query will return inconclusive results.
Let’s now dive right into the performances stats around these different approach.
Before we start I want to level set a couple of things. I am running these tests with Excel 2013 installed so keep in mind that if you are using a different version you may see slightly different results (even the INDEX MATCH being quicker then the simple VLOOKUP as I am told to believe). As all things changes, so should the discussion around VLOOKUP need updating – especially performance-wise. Apart from that some additional things to mention:
Tests run on a large dataset (200k lookup table) and assuming a large number of lookups (>25k) to diffuse the problem of accurate performance measurements (25k – 200k lookup operations). Results should give a good approximation of the actual metrics
Lookups on UNSORTED data
In most common cases you are carrying out lookup operations on a UNSORTED lookup table. The chart below present results of the following alternatives:
VLOOKUP (UnSorted) – a simple VLOOKUP on an unsorted lookup table
INDEX MATCH (UnSorted) – a simple INDEX MATCH on an unsorted lookup table
SQL (UnSorted) – an simple SELECT query matching against the lookup values (keys) of the VLOOKUP (returns same results in same order)
VBA (Sorted) – a VBA procedure that creates a dictionary of the lookup table and matches the lookups using the VBA Dictionary. You can find the source code here: source code
So what do we see? It seems that the INDEX MATCH combo performs consistently slightly worse then the simple VLOOKUP. There are no large differences when increasing the amounts of operations performed which seems like there is no reason to jump to an INDEX MATCH combo in the need for just performance. SQL (and VBA) on the other hand wiped out the competition being almost 19x faster when executed against 200k lookups. It seems like the MS Query did not increase it’s execution time considerably probably leading to think that it may perform equally well on much larger lookup tables. VBA was also quite efficient when executed on less than 100k lookup operations. Nevertheless, I consider SQL the winner as clearly it performed better for more operations.
Lookups on SORTED data
Let’s now consider the ideal situation where we have a SORTED lookup table. The chart below will now present results of the following alternatives:
VLOOKUP (Sorted) – a simple VLOOKUP on a sorted lookup table
DOUBLE TRUE VLOOKUP (Sorted) – a DOUBLE TRUE VLOOKUP on a sorted lookup table
INDEX MATCH (Sorted) – a simple INDEX MATCH on a sorted lookup table
SQL (Sorted) – an simple SELECT query matching against the lookup values (keys) of the
VLOOKUP (returns same results in same order)
VBA (Sorted) – a VBA procedure that creates a dictionary of the lookup table and matches the lookups using the VBA Dictionary, you can find the source code here: source code
As we can see the DOUBLE TRUE VLOOKUP rules the stage with an astonishing 0.22 seconds vs. the aweful 110 seconds of a regular VLOOKUP. An astonishing improvement! SQL comes second with a score just slightly above 5 seconds which seems reasonable (although using only 1 core due to being single threaded). What comes as strange is that both the VLOOKUP and the INDEX MATCH actually performed worse when executed against a sorted lookup table. Not something you might expect, but broadly explained by Excel-guru Bill Jelen in this podcast really worth watching if you want to know more on the subject.
The RIGHT approach
Although the DOUBLE TRUE VLOOKUP proved superior to any other method, VLOOKUP is a function that is far from perfect (read section on INDEX MATCH above). In short VLOOKUP is less immune to changes than the INDEX MATCH. Ideally we would like to have an INDEX MATCH formula that is just as efficient as the DOUBLE TRUE VLOOKUP… Well in fact there is a way. At least if we mix a little of both worlds – by combining a TRUE VLOOKUP with a APPROXIMATE INDEX MATCH.
So what’s happening here? We are using the first TRUE VLOOKUP to check whether the lookup_value is present in the lookup table. Then once we have that confirmed we can do an APPROXIMATE INDEX MATCH (less than or equal in this case) to efficiently search for our corresponding value in the result_column.
See below how the MATCH function is defined in Excel:
A simple example below:
The APPROXIMATE INDEX MATCH is similarly as efficient as the TRUE VLOOKUP hence both approaches are equivalent in terms of performance! Fantastic right!?
Keep in mind that the lookup table has to be SORTED by the lookup column (key column)! Otherwise the query will return inconclusive results.
Conclusions on VLOOKUP vs INDEX MATCH
For me these results mean at least 4 things:
On a daily basis swapping your VLOOKUPs for INDEX-MATCH combos will not affect your Excel workbook performance, although may provide you with more flexibility and reduce the number of errors when working on the lookup table, which I personally appreciate
If performance is key sort your lookup table and swap those VLOOKUPS with DOUBLE TRUE VLOOKUPS or rather the TRUE VLOOKUP APPROXIMATE INDEX MATCH combo. Alternatively, swap the VLOOKUPs with a Microsoft Query (SQL) for more simplicity and control
Unless you are using the DOUBLE TRUE VLOOKUP (or TRUE VLOOKUP APPROXIMATE INDEX MATCH combo) don’t sort your lookup data table if you want to gain performance – this will have the opposite effect
When working with very large datasets (>100k rows) consider MS Query. Although it may be just a little slower (consistently a couple of sec) than the DOUBLE TRUE VLOOKUP, for lookup operations, it is more flexible (performs well against other Excel functions) and provides you more control over your query. This also releases you out of the mercy of the Excel Automatic Calculation feature which may cause cells to recalculate when source data is modified (you can refresh the query via macro or by clicking refresh)
Don’t resort to VBA for performance. It is an overkill for this exercise, although it performed almost as well as SQL, it introduces unnecessary complexity and requires saving files in XLSM/XLSB/XLS file format. Other than for amusement, VBA has no justification for this scenario
The VLOOKUP vs INDEX MATCH topic is one of the most popular Excel debates, but I hope that this post will shed more light and provide a measurable comparison of the options out there. I also feel I need to encourage the use of MS Queries as having their own place in this debate.
Do you agree? What do you think about the comparison of these approaches?
Today on Top 10 Excel features you need to know. I am a frequent visitor to StackOverflow to see what is troubling the Excel community (and other developer/analyst communities). It is nothing incredible that many users have issues due to not being aware of some of the most significant features in Excel like PivotTables, Array Formulas, Tables and other. I have seen more than once users reaching out to use VBA macros instead of much easier PivotTables/Charts. Although, I value VBA it should also be the solution of last resort compare to “native” Excel features used more common.
I have decided therefore to list some of the top 10 features regarded often as the most significant and useful to know. So let’s start with the 10 Top Excel Features…
No. 1: PivotTables
You can’t call yourself an advanced Excel user without knowing about PivotTables! There is no other feature in Excel I being used more often and with success. Almost the first thing I always do when analyzing data is pivot the data in multiple manners and analyze the patterns/results. Pivots can process a large amount of data in a short period of time and are optimized in the Excel back-end for performance. This is definitely one of the 10 top Excel features!
PivotTables allow you to transform and analyze data in a structure manner. Just select a range of data (data in columns with headers) and select the ROWS, COLUMNS and VALUES for your Pivot Table! You can also create custom columns (based on formulas), summarize data by groups/rows/columns etc. There is almost no limit in the possibilities.
How to find it in Excel? Insert->Tables->PivotTables
Filtering and sorting your data is just as useful as using PivotTables. Excel is meant to transform and analyze data and filtering/sorting is one of the key elements. When provided with a table of data you will probably want to sort the data in a descending/ascending manner or filter out rows based on some features (values in certain columns). This is a must-know feature.
If you want your data tables to be neat and structured you need to use Excel Data Tables. What do you get when using Tables in Excel? A consistent structure and formatting of your entire data table, automated copied formulas (across columns), non-repeating column headers and more. It is always good to resort to Excel Data Tables as you will have less work managing your data table and can focus on more interesting work like data transformation/analysis.
How to find it in Excel? Insert->Tables->Table
No. 4: Conditional formatting
Analyzing/transforming data is important, but it is just as useful to be able to identify variances in a range of values using graphics like colors, bars or icons. Conditional formatting can allow you to notice patterns in data values which might not be obvious when looking at raw numbers.
How to find it in Excel? Home->Styles->Conditional Formatting
No. 5: Lookup Excel functions
Although these functions are not exactly a separate feature they are considered to be one of the most useful and most often used when analyzing data. I can’t stress how often I stumbled upon articles/posts about these functions. These functions are also frequently used by recruiters for Excel jobs.
VLOOKUP – search the first column of a range of cells, and then return a value from any cell on the same row of the range
HLOOKUP – as above but for columns instead of rows
INDEX – returns the value of an element in a table or an array, selected by provided index
MATCH – searches for a specified item in a range of cells, and then returns the relative position of that item in the range
How to use these functions?
The VLOOKUP function will return a corresponding value from another cell of the same row of a value found in the first column of the data table.
The INDEX and MATCH functions are best to be used together (see the links section below). Why? They allow you to basically achieve the same result as the LOOKUP functions HOWEVER these are more flexible. I encourage you to read the links below.
Array Formulas are one of the greatest knowledge gaps in the Excel community in my opinion. I see so often questions which can be easily answered if someone at least made an effort to learn them. Many Excel users fall into the trap of writing a lot of custom VBA just because they are not aware or are too lazy to use a neat Array Formula.
How to use Array Functions?
Go to the links section for a decent tutorial. However, the process itself is quite simple:
Create a function using an Excel range e.g. A1:A10
So easy and yet so powerful! Let’s jump into a simple example:
Excel Array Formula Example
Let’s assume we have a range of value for a certain period of time. We want to get the maximum value for dates after the 1st of March 2015. We can get this in a single Array Formula! See below.
How does it work?
'MAX ( returns all cells from A2:A8 for which B2:B8 are older than 2015-03-01)
See the logic? You can also multiply/divide/sum and do other cool thing with Array Formulas. See the tutorial link below.
All tools needed in doing basic data cleansing can be found in the Data Tools section of the Data ribbon. Working on data often? You need to know how and when to use Text to Columns, Data Validation and all the other neat tools.
Naming cells/ranges comes in handy when you repeatedly reference certain cells or arrays e.g. USD/EUR currency field, interest rate used in formulas etc. This is a nice and clean way to manage all reference to those fields and allows you to easily relocate these cells or ranges.
How to set a Defined Name to an Excel cell/range?
Click on the text field in the upper-left with the cell/range reference
Type in Defined Name for this cell/range – it must not have whitespaces
Now when trying to reference the cell/range in a formula simply type the new Defined Name.
How to find it in Excel? Formulas->Defined Names
No. 9: VBA macros and recording macros
The reason VBA is placed before last on the 10 Top Excel features list is because, in my opinion, it is so often abused but users who refuse to learn well the remaining Excel features. VBA macros fills the gap of all the missing features/functions in Excel. Macros let’s you program almost anything in Excel, you name it – forms, database connectivity, analytics, web browsing etc. You can’t basically consider yourself an Excel Pro without being able to program macros in VBA. However, it is important to stress that the problem with VBA is that once learned it tends to make analysts lazy – instead of Excel Array Formulas you will see custom macros or hideous UDF-functions. VBA should be used as a tool of last resort!
Typical applications of VBA
Custom algorithms (custom analysis of data sets)
Custom Excel UDF functions (User Defined Functions)
Excel Forms (making custom user forms to input data or create a custom UI)
How to find it in Excel? Developer->Code->Visual Basic Recording macros
The other fantastic thing with Excel is that it enables you to record macros – by recording your actions in Excel and translating them into VBA code. In some cases you need not even understand the code to be able to reuse it e.g. applying custom formatting to selected cells. This is definitely a great feature on it’s own.
How to find it in Excel? Developer->Code->Record Macro
No. 10: Microsoft Power Add-In’s
Last but not least on the 10 Top Excel features list are the PowerPivot, PowerQuery and PowerMap powerful Microsoft developed Add-Ins for Excel. Harness the power of Big Data, SQL, complex pivots and charts with these fantastic Add-ins! The PowerMap is a relatively new member of the family delivering nice bells and whistles to your Workbooks!
PowerPivot basically pumps Excel with more Analytics features by extend the PivotTable with summarization, cross-tabulation, expanded data capacity, advanced calculations, ability to import data from multiple sources, and the ability to publish the workbooks as interactive web applications.
PowerQuery will allow you to easily harness data and access to external data sources such as files, the Web, databases etc. and more easily manipulate and cleanse data. PowerQuery enables you to process enormous data sources/tables counting millions of records (more than an Excel Worksheet can contain).
PowerMap (as quoted on the official MS site) is a three-dimensional (3-D) data visualization tool that lets you look at information in new ways. A power map lets you discover insights you might not see in traditional two-dimensional (2-D) tables and charts.
Both these tools complete one another. If you want to do Business Intelligence in Excel you need to be able to use both these Add-Ins. Microsoft seems to have great plans for them and I would anticipate that both these Add-Ins become a “native” part of Excel in the following versions of Microsoft Excel.