Ever been in a situation where your Excel workbook runs so slow you could make yourself a coffee before it finishes recalculating? Fighting to reduce the file size of your latest Excel report? Or just looking to optimize your Excel workbook a bit? Welcome to the AnalystCave Excel Optimizer the AddIn that will optimize Excel for you! Stay tuned for updates on this post!
I must say this idea has been popping in and out of my head for some time now – although it turned out the first working version of the Optimizer took me only a couple of hours to build. Often coming along Excel files that literally called for wrath from the Gods, I yearned for a time that Excel would include a “Fix Me” button – doing away with all the wrongs of the Workbook I was working on. Unfortunately this is (still) not the case and building your Excel file many of us have to be constantly aware of the dangers that lie ahead. What is more, many Excel users waste a lot of time not appreciating simple tips and tricks that can seriously enhance their Excel experience (like simply saving files in XLSB file format).
Optimize Excel with the Excel Optimizer
The Excel Optimizer is a simple AddIn to Excel that runs certain rules across your Workbook and points out the main pain-points or suggests certain improvements – all with performance in mind. Some rules can be implemented (I am hoping making all as such) while others will simply point out what needs to get done.
This is still a beta-work-in-progress-prototype so keep in mind that there may be some error/bugs here and there. What is more the Excel Optimizer current will only be able to suggest certain fixes (such as replacing some Excel functions with others). Depending on how popular this AddIn becomes I am planning to extend it to be able to actually implement most of the recommended tweaks. Optimizing Excel for best performance has become much easier.
See this video for a quick showcase of how the Excel Optimizer works.
The Excel Optimizer consists of 2 features:
Optimize Excel – the main feature. Opens a window (screen below) that will allow you to run a rule-analysis on your current Workbook. Each rule with validate against different performance standards, rules also have different scopes. There are rules validating the format of the Workbook, validating worksheets and then there are the low-level rules that validate each and every formula. Using the settings button you can turn rules on/off or configure them to enhance your experience
Timer Full Calculation Rebuild – a simple macro that calculates the time needed to do a Full Rebuild Calculation of your Excel workbook. This comes in handy when wanting to set a baseline for the improvements you might want to implement
Below a screenshot of how the main window looks like:
Rules are what drive the Excel Optimizer. The AddIn has been built in a way to make adding/removing/enabling/disabling rules as easy as possible. Currently the Excel Optimizer introduces the following types of rules – many of most still cannot be implemented automatically (expect this in the near future):
Checks if workbooks is in XLSB format if not suggests so
Validates the Used Range of each Worksheet to see if the Used Range is excessive and can be reduced for better performance and optimum file size
Looks for exact (FALSE) VLOOKUPs. If found suggest replacing them with double approximate (TRUE) VLOOKUPs for best performance
Looks for exact repetitions of certain functions and suggest replacing them with a certain cell to minimize the need to recalculate the same function multiple times
Looks for volatile native Excel functions and suggests removing or replacing them
Looks for any external links used within cells in the entire Workbook and suggests removing them or reducing to a bare minimum
Looks for Array Formulas within cells in the entire Workbook and suggests replacing them with regular Excel functions if possible
Looks errors within cells in the entire Workbook and suggests correcting them
You can download the latest version here:
Do you have ideas for new rules? Any suggestions / wishes for additional features? Put your comment below or on my Facebook / Twitter page.
VLOOKUP vs INDEX MATCH vs SQL vs VBA – today you are in for the ultimate Excel Showdown. The VLOOKUP Excel function is one of the most popular functions, around which there has always been much debate. You will most definitely find an article about this function on almost every Excel blog site out there that matters. Similarly there has been much argue about how efficient this function is, when compared to other combos like INDEX MATCH or DOUBLE TRUE VLOOKUPS. I have always wanted to put the dot over the “i” in at least the discussion around performance when using the VLOOKUP and the INDEX MATCH combo (VLOOKUP vs INDEX MATCH).
What to do with your VLOOKUPs to significantly gain performance, and what to replace them with if you are looking to make your workbook more maintainable? How much performance will you actually gain? Hopefully here you will find answers to these questions.
One additional thing I always wanted to bring into this discussion was the MS Query. Many Excel experts often forget to mention that when you need to lookup a lot of values within a certain table there is an approach almost as effective as any Excel trick out there – Microsoft Query (SQL). Excel features so call Query Tables which can execute OLEDB SQL queries on Excel data (worksheets treated like separate SQL Tables).
This means that instead of doing a lookup of a certain value cell-by-cell you can do it within a single query. This query can always be refreshed at the push of a button (or macro), instead of dealing with uncontrollable automatic recalculations. I felt the urge to include this approach in this post as it can challenge face on all the other approaches out there. But let us start from the beginning…
What VLOOKUP does is lookup a certain key (in the example below a “Dog”) within a column of keys in a certain table. Then it takes a value corresponding to the row in which the key was located and returns a corresponding value from another column.
Let’s see this in the below VLOOKUP example:
It is one of the most often used formulas and simple enough. However, the VLOOKUP function has several setbacks:
Hard to maintain when columns are added/removed to/from the lookup table
Key column needs to be first in the lookup table
Little flexibility – cannot be used to match against both rows and columns of a lookup table. Although it can be replaced with HLOOKUP (the forgotten twin brother of VLOOKUP) this can be a nuisance if you want to create a table lookuping up both columns and rows
Why INDEX MATCH?
There are many decent posts on why to consider using INDEX MATCH against the common VLOOKUP. But before we go into the pros and cons let’s understand how the INDEX MATCH combo works.
In short we can replace a VLOOKUP with a combo consisting of 2 functions:
INDEX – returning the value of an element in a table or an array, selected by the row and/or column number indexes)
MATCH – returning the relative position of an item in a specified range
In the example below the MATCH function will first return the relative position (the row number) of the Dog in the A column. Next the INDEX function will return a corresponding value from the same row in column B.
Not really much complicated than the VLOOKUP but the INDEX MATCH combo certainly handles all of the setbacks of the VLOOKUP pretty well (see section above).
But why use INDEX MATCH instead of VLOOKUP, especially if we get the exactly same result? Short summary of the pros and cons of the INDEX MATCH vs. the VLOOKUP:
More flexible – allows you to match both against rows and columns
Less error prone – adding/removing columns/rows from the lookup table should not crash the INDEX MATCH combo
Can be split to match multiple columns – by splitting the INDEX from the MATCH we can in fact match several column off a single INDEX column which points us to the result row
Both vertical and horizontal lookups – VLOOKUP and HLOOKUP address either only vertial or horizontal lookups, whereas with the INDEX MATCH you can easily do both
Harder to use – VLOOKUP is a little easier to understand than INDEX MATCH and I know some people have difficulty with this two step approach
What about SQL?
As promised a quick example as to how SQL (MS Query) can fit in. The query below will do the same lookup operation as the VLOOKUP and the INDEX-MATCH.
SELECTCategory FROM[Sheet1$]WHERE Animal="Dog"
Returning the value
A Query to lookup a whole column of values would look more like this:
How to create an SQL query in Excel? Just go to:DATAFrom Other SourcesFrom Microsoft Query or check out my Excel SQL Add-In.
Now SQL will not prove much useful for just a single lookup operation. Its benefits appear when needing to carry out A LOT of lookup operations. As you will see there are certain tricks you can use in Excel to get better performance than SQL can provide you, however, I would encourage learning SQL as in most cases it can easily replace the need to create complex queries or Array Formulas and still provide awesome performance.
What about VBA?
Couldn’t miss out on an opportunity to check how VBA compares to the other approaches, although it seems pretty obvious it wouldn’t be a fair match – as VBA is singlethreaded as opposed to Excel’s native formulas, which are recalculated concurrently in multiple threads. To challenge the other approaches I devised a simple VBA procedure using the VBA Dictionary object. How does the VBA lookup procedure work? It loads the entire lookup table to a VBA Dictionary object and then looks-up the entire lookup values against the VBA Dictionary.
The VLOOKUP allows you to either approximate a match (by providing range_lookup value to TRUE) or select an exact match. Strangely enough Excel defaults to the approximate options which is obviously a nuisance for most of us… but maybe that was actually a hint from Microsoft that this is the option to go with? It turn out that using the TRUE option that approximates the lookup result will return the lookup value providing a significant performance boost (scroll down to see how significant).
However, a VLOOKUP using the TRUE option will always return a result – not necessarily the one you were looking for. The trick therefore socialized by Charles Williams here shows how using 2 TRUE VLOOKUPS you can get accurate results with great performance. How does it work? In short what Charles suggests is using an IF function with the condition being the FIRST TRUE VLOOKUP to match for the lookup value (the key). If the result matches against the original lookup value (the key) the IF functions returns the SECOND TRUE VLOOKUP which returns the matched value in the right column. If not it returns any default value set. See below:
Notice that the condition in the formula verifies if the VLOOKUP located the right lookup value. If so, the second VLOOKUP will return the associated lookup result. Might seem strange at first that 2 VLOOKUPs are better than one. Let’s look at the statistics to see how it will compare against the other approaches…
There is one CONSIDERABLE setback to keep in mind. The lookup table has to be SORTED by the lookup column (key column)! Otherwise the query will return inconclusive results.
Let’s now dive right into the performances stats around these different approach.
Before we start I want to level set a couple of things. I am running these tests with Excel 2013 installed so keep in mind that if you are using a different version you may see slightly different results (even the INDEX MATCH being quicker then the simple VLOOKUP as I am told to believe). As all things changes, so should the discussion around VLOOKUP need updating – especially performance-wise. Apart from that some additional things to mention:
Tests run on a large dataset (200k lookup table) and assuming a large number of lookups (>25k) to diffuse the problem of accurate performance measurements (25k – 200k lookup operations). Results should give a good approximation of the actual metrics
Lookups on UNSORTED data
In most common cases you are carrying out lookup operations on a UNSORTED lookup table. The chart below present results of the following alternatives:
VLOOKUP (UnSorted) – a simple VLOOKUP on an unsorted lookup table
INDEX MATCH (UnSorted) – a simple INDEX MATCH on an unsorted lookup table
SQL (UnSorted) – an simple SELECT query matching against the lookup values (keys) of the VLOOKUP (returns same results in same order)
VBA (Sorted) – a VBA procedure that creates a dictionary of the lookup table and matches the lookups using the VBA Dictionary. You can find the source code here: source code
So what do we see? It seems that the INDEX MATCH combo performs consistently slightly worse then the simple VLOOKUP. There are no large differences when increasing the amounts of operations performed which seems like there is no reason to jump to an INDEX MATCH combo in the need for just performance. SQL (and VBA) on the other hand wiped out the competition being almost 19x faster when executed against 200k lookups. It seems like the MS Query did not increase it’s execution time considerably probably leading to think that it may perform equally well on much larger lookup tables. VBA was also quite efficient when executed on less than 100k lookup operations. Nevertheless, I consider SQL the winner as clearly it performed better for more operations.
Lookups on SORTED data
Let’s now consider the ideal situation where we have a SORTED lookup table. The chart below will now present results of the following alternatives:
VLOOKUP (Sorted) – a simple VLOOKUP on a sorted lookup table
DOUBLE TRUE VLOOKUP (Sorted) – a DOUBLE TRUE VLOOKUP on a sorted lookup table
INDEX MATCH (Sorted) – a simple INDEX MATCH on a sorted lookup table
SQL (Sorted) – an simple SELECT query matching against the lookup values (keys) of the
VLOOKUP (returns same results in same order)
VBA (Sorted) – a VBA procedure that creates a dictionary of the lookup table and matches the lookups using the VBA Dictionary, you can find the source code here: source code
As we can see the DOUBLE TRUE VLOOKUP rules the stage with an astonishing 0.22 seconds vs. the aweful 110 seconds of a regular VLOOKUP. An astonishing improvement! SQL comes second with a score just slightly above 5 seconds which seems reasonable (although using only 1 core due to being single threaded). What comes as strange is that both the VLOOKUP and the INDEX MATCH actually performed worse when executed against a sorted lookup table. Not something you might expect, but broadly explained by Excel-guru Bill Jelen in this podcast really worth watching if you want to know more on the subject.
The RIGHT approach
Although the DOUBLE TRUE VLOOKUP proved superior to any other method, VLOOKUP is a function that is far from perfect (read section on INDEX MATCH above). In short VLOOKUP is less immune to changes than the INDEX MATCH. Ideally we would like to have an INDEX MATCH formula that is just as efficient as the DOUBLE TRUE VLOOKUP… Well in fact there is a way. At least if we mix a little of both worlds – by combining a TRUE VLOOKUP with a APPROXIMATE INDEX MATCH.
So what’s happening here? We are using the first TRUE VLOOKUP to check whether the lookup_value is present in the lookup table. Then once we have that confirmed we can do an APPROXIMATE INDEX MATCH (less than or equal in this case) to efficiently search for our corresponding value in the result_column.
See below how the MATCH function is defined in Excel:
A simple example below:
The APPROXIMATE INDEX MATCH is similarly as efficient as the TRUE VLOOKUP hence both approaches are equivalent in terms of performance! Fantastic right!?
Keep in mind that the lookup table has to be SORTED by the lookup column (key column)! Otherwise the query will return inconclusive results.
Conclusions on VLOOKUP vs INDEX MATCH
For me these results mean at least 4 things:
On a daily basis swapping your VLOOKUPs for INDEX-MATCH combos will not affect your Excel workbook performance, although may provide you with more flexibility and reduce the number of errors when working on the lookup table, which I personally appreciate
If performance is key sort your lookup table and swap those VLOOKUPS with DOUBLE TRUE VLOOKUPS or rather the TRUE VLOOKUP APPROXIMATE INDEX MATCH combo. Alternatively, swap the VLOOKUPs with a Microsoft Query (SQL) for more simplicity and control
Unless you are using the DOUBLE TRUE VLOOKUP (or TRUE VLOOKUP APPROXIMATE INDEX MATCH combo) don’t sort your lookup data table if you want to gain performance – this will have the opposite effect
When working with very large datasets (>100k rows) consider MS Query. Although it may be just a little slower (consistently a couple of sec) than the DOUBLE TRUE VLOOKUP, for lookup operations, it is more flexible (performs well against other Excel functions) and provides you more control over your query. This also releases you out of the mercy of the Excel Automatic Calculation feature which may cause cells to recalculate when source data is modified (you can refresh the query via macro or by clicking refresh)
Don’t resort to VBA for performance. It is an overkill for this exercise, although it performed almost as well as SQL, it introduces unnecessary complexity and requires saving files in XLSM/XLSB/XLS file format. Other than for amusement, VBA has no justification for this scenario
The VLOOKUP vs INDEX MATCH topic is one of the most popular Excel debates, but I hope that this post will shed more light and provide a measurable comparison of the options out there. I also feel I need to encourage the use of MS Queries as having their own place in this debate.
Do you agree? What do you think about the comparison of these approaches?
VBA easily allows you to get your hand on your PCs performance metrics which often can prove useful when wanting to do some performance statistics. Having been playing with the VBA Multithreading Tool I wanted a way to see how well all my processor cores are utilized – especially when wanting to optimize the amount of threads needed to be spawned.
Windows Management Instrumentation – the single source of your performance metrics
Windows Management Instrumentation (WMI) is the infrastructure for management data and operations on Windows-based operating systems. It is useful when wanting to extract performance data, when automating administrative tasks both on local and remote workstation. Read more here.
WMI can be used in all Windows-based applications, and can be easily accessed using VBA, VBscript, C# or VB.NET.
Extracting performance data in Excel VBA
Walking you through this process I will want to extract CPU usage measured as percentage processor time.
Fortunately, Microsoft has issued a fantastic tool for generating code needed to extract any needed performance metrics using WMI.
Download the WMI Code Creator and select your metrics
Although you don’t need to it is much easier to utilize the Microsoft WMI Code Creator to generate the necessary performance metrics code e.g. CPU usage.
Use the link below to download from Microsoft: Link to Microsoft WMI Code Generator
Open the attached WMICodeCreator.exe file to open the WMI Code Creator window. You should see something like this:
To configure it properly:
Select Visual Basic Script from the Code Language drop down menu.
Leave the namespace as rootcimv2 if you want general workstation performance metrics e.g. CPU usage, memory usage etc. Feel free to read more on the WMI Reference.
Select an appropriate class for your desired metrics. E.g for CPU usage I will select Win32_PerfFormattedData_PerfOS_Processor.
The code should generate automatically. You can execute it to see if it works by clicking Execute code.
Copy the code to your VBA Project and adjust it to VBA
Now just copy the above mentioned code into your VBA project. VBA and VBscript are very similar programming languages (or scripting languages as preferred by others). VBA does not identify the Wscript.Echo method which in VBA is represented by the Debug.Print equivalent. See an example below of the VBscript vs. VBA code.
VBscript (original from WMI Code Creator):
strComputer = "."
Set objWMIService = GetObject("winmgmts:\" & strComputer & "rootcimv2")
Set colItems = objWMIService.ExecQuery( _
"SELECT * FROM Win32_PerfFormattedData_PerfOS_Processor",,48)
For Each objItem in colItems
Wscript.Echo "Win32_PerfFormattedData_PerfOS_Processor instance"
Wscript.Echo "PercentProcessorTime: " & objItem.PercentProcessorTime
strComputer = "." 'Local machine, can be adjusted to access remote workstations
Set objWMIService = GetObject("winmgmts:\" & strComputer & "rootcimv2")
Set colItems = objWMIService.ExecQuery( _
"SELECT * FROM Win32_PerfFormattedData_PerfOS_Processor",,48)
For Each objItem in colItems
Debug.Print "Win32_PerfFormattedData_PerfOS_Processor instance"
Debug.Print "PercentProcessorTime: " & objItem.PercentProcessorTime
Example: Measure CPU usage in VBA
Using WMI and the code above I quickly coded an Excel VBA project workbook that allows you to measure CPU performance in real time (1-2 second intervals :)).
The macro works as follows:
When hitting start the macro generates as many cores as identified using WMI
Every 2 seconds the metrics will refresh showing both CPU percentage processor time and the core utilization
Today on Top 10 Excel features you need to know. I am a frequent visitor to StackOverflow to see what is troubling the Excel community (and other developer/analyst communities). It is nothing incredible that many users have issues due to not being aware of some of the most significant features in Excel like PivotTables, Array Formulas, Tables and other. I have seen more than once users reaching out to use VBA macros instead of much easier PivotTables/Charts. Although, I value VBA it should also be the solution of last resort compare to “native” Excel features used more common.
I have decided therefore to list some of the top 10 features regarded often as the most significant and useful to know. So let’s start with the 10 Top Excel Features…
No. 1: PivotTables
You can’t call yourself an advanced Excel user without knowing about PivotTables! There is no other feature in Excel I being used more often and with success. Almost the first thing I always do when analyzing data is pivot the data in multiple manners and analyze the patterns/results. Pivots can process a large amount of data in a short period of time and are optimized in the Excel back-end for performance. This is definitely one of the 10 top Excel features!
PivotTables allow you to transform and analyze data in a structure manner. Just select a range of data (data in columns with headers) and select the ROWS, COLUMNS and VALUES for your Pivot Table! You can also create custom columns (based on formulas), summarize data by groups/rows/columns etc. There is almost no limit in the possibilities.
How to find it in Excel? Insert->Tables->PivotTables
Filtering and sorting your data is just as useful as using PivotTables. Excel is meant to transform and analyze data and filtering/sorting is one of the key elements. When provided with a table of data you will probably want to sort the data in a descending/ascending manner or filter out rows based on some features (values in certain columns). This is a must-know feature.
If you want your data tables to be neat and structured you need to use Excel Data Tables. What do you get when using Tables in Excel? A consistent structure and formatting of your entire data table, automated copied formulas (across columns), non-repeating column headers and more. It is always good to resort to Excel Data Tables as you will have less work managing your data table and can focus on more interesting work like data transformation/analysis.
How to find it in Excel? Insert->Tables->Table
No. 4: Conditional formatting
Analyzing/transforming data is important, but it is just as useful to be able to identify variances in a range of values using graphics like colors, bars or icons. Conditional formatting can allow you to notice patterns in data values which might not be obvious when looking at raw numbers.
How to find it in Excel? Home->Styles->Conditional Formatting
No. 5: Lookup Excel functions
Although these functions are not exactly a separate feature they are considered to be one of the most useful and most often used when analyzing data. I can’t stress how often I stumbled upon articles/posts about these functions. These functions are also frequently used by recruiters for Excel jobs.
VLOOKUP – search the first column of a range of cells, and then return a value from any cell on the same row of the range
HLOOKUP – as above but for columns instead of rows
INDEX – returns the value of an element in a table or an array, selected by provided index
MATCH – searches for a specified item in a range of cells, and then returns the relative position of that item in the range
How to use these functions?
The VLOOKUP function will return a corresponding value from another cell of the same row of a value found in the first column of the data table.
The INDEX and MATCH functions are best to be used together (see the links section below). Why? They allow you to basically achieve the same result as the LOOKUP functions HOWEVER these are more flexible. I encourage you to read the links below.
Array Formulas are one of the greatest knowledge gaps in the Excel community in my opinion. I see so often questions which can be easily answered if someone at least made an effort to learn them. Many Excel users fall into the trap of writing a lot of custom VBA just because they are not aware or are too lazy to use a neat Array Formula.
How to use Array Functions?
Go to the links section for a decent tutorial. However, the process itself is quite simple:
Create a function using an Excel range e.g. A1:A10
So easy and yet so powerful! Let’s jump into a simple example:
Excel Array Formula Example
Let’s assume we have a range of value for a certain period of time. We want to get the maximum value for dates after the 1st of March 2015. We can get this in a single Array Formula! See below.
How does it work?
'MAX ( returns all cells from A2:A8 for which B2:B8 are older than 2015-03-01)
See the logic? You can also multiply/divide/sum and do other cool thing with Array Formulas. See the tutorial link below.
All tools needed in doing basic data cleansing can be found in the Data Tools section of the Data ribbon. Working on data often? You need to know how and when to use Text to Columns, Data Validation and all the other neat tools.
Naming cells/ranges comes in handy when you repeatedly reference certain cells or arrays e.g. USD/EUR currency field, interest rate used in formulas etc. This is a nice and clean way to manage all reference to those fields and allows you to easily relocate these cells or ranges.
How to set a Defined Name to an Excel cell/range?
Click on the text field in the upper-left with the cell/range reference
Type in Defined Name for this cell/range – it must not have whitespaces
Now when trying to reference the cell/range in a formula simply type the new Defined Name.
How to find it in Excel? Formulas->Defined Names
No. 9: VBA macros and recording macros
The reason VBA is placed before last on the 10 Top Excel features list is because, in my opinion, it is so often abused but users who refuse to learn well the remaining Excel features. VBA macros fills the gap of all the missing features/functions in Excel. Macros let’s you program almost anything in Excel, you name it – forms, database connectivity, analytics, web browsing etc. You can’t basically consider yourself an Excel Pro without being able to program macros in VBA. However, it is important to stress that the problem with VBA is that once learned it tends to make analysts lazy – instead of Excel Array Formulas you will see custom macros or hideous UDF-functions. VBA should be used as a tool of last resort!
Typical applications of VBA
Custom algorithms (custom analysis of data sets)
Custom Excel UDF functions (User Defined Functions)
Excel Forms (making custom user forms to input data or create a custom UI)
How to find it in Excel? Developer->Code->Visual Basic Recording macros
The other fantastic thing with Excel is that it enables you to record macros – by recording your actions in Excel and translating them into VBA code. In some cases you need not even understand the code to be able to reuse it e.g. applying custom formatting to selected cells. This is definitely a great feature on it’s own.
How to find it in Excel? Developer->Code->Record Macro
No. 10: Microsoft Power Add-In’s
Last but not least on the 10 Top Excel features list are the PowerPivot, PowerQuery and PowerMap powerful Microsoft developed Add-Ins for Excel. Harness the power of Big Data, SQL, complex pivots and charts with these fantastic Add-ins! The PowerMap is a relatively new member of the family delivering nice bells and whistles to your Workbooks!
PowerPivot basically pumps Excel with more Analytics features by extend the PivotTable with summarization, cross-tabulation, expanded data capacity, advanced calculations, ability to import data from multiple sources, and the ability to publish the workbooks as interactive web applications.
PowerQuery will allow you to easily harness data and access to external data sources such as files, the Web, databases etc. and more easily manipulate and cleanse data. PowerQuery enables you to process enormous data sources/tables counting millions of records (more than an Excel Worksheet can contain).
PowerMap (as quoted on the official MS site) is a three-dimensional (3-D) data visualization tool that lets you look at information in new ways. A power map lets you discover insights you might not see in traditional two-dimensional (2-D) tables and charts.
Both these tools complete one another. If you want to do Business Intelligence in Excel you need to be able to use both these Add-Ins. Microsoft seems to have great plans for them and I would anticipate that both these Add-Ins become a “native” part of Excel in the following versions of Microsoft Excel.
There are many different articles and post on improving Excel performance. What I personally was missing was a simple, comprehensive, short overview of how the performance of VBA macros can be improved without needing to read through long articles on how the Excel compiler works.
That is why I gathered all VBA performance guidelines into this single post which contains all the known ways of Improving VBA Performance, organized by impact on VBA Performance.
Turn Off Automatic Calculation
To increase VBA speed of execution be sure to set the Calculation mode to xlCalculationManual so that no Excel formula calculations are carried out within the Excel Workbook until the Calculation mode is changed back to xlCalculationAutomatic or by running Application.Calculate:
'Set calculation mode to manual (on demand). Default value is xlCalculationAutomatic.
Many case you will see a boost in VBA speed.
Turn Off Screen Updating
Prevent Excel from updating the Excel screen until this option is changed to True:
'Turn off automatic ScreenUpdating (on demand). Run ApplicationDoEvents procedure to manually update the screen
Avoid Range Copy/Paste Functions
The Excel Range.Copy procedure is a memory hog. Unfortunately in VBA we have limited control over the Garbage Collector – the mechanizm by which Excel will free up internal memory (declared unused variables, tables, objects etc.). The Excel Garbage Collecter unfortunately runs seemingly at specific intervals which means that if you create a lot of objects in a short amount of time you may run out of memory resulting in slowing down of Excel or simply crashing the application altogether
'This uses up a lot of memory as it copies the everything associated with the Range
'e.g. the value, formatting, data validation etc. Avoid copy at all costs!
'Doing Range.Copy in a large loop will surely quickly increase Excel RAM Memory usage or
'result in crashing Excel altogether
'Instead if you want to copy just values do it directly
Use For Each on Collections
The For and For Each Loops are not equivalent and should be used in different situations. ALWAYS use the For Each Loop when looping through Collections, Ranges etc.
It is not so much the presence of this macro that benefits your performance but the impact of the behavior it requires from you. Option Explicit requires that you declare ALL VARIABLES. Add this option at the very beginning of your VBA project module or class. This will force you to explicitly declare the types of your variables. VBA is a dynamic typed language therefore you do not normally have to declare the type of your variable or declare variables in that matter. This causes some additional overhead when your VBA code is executed especially if this variable is used often in your code:
'...your module code below...
y=20'Compile ERROR! Variable not declared using Dim statement
The XLSB format vs. XLSM
A known way to improve Excel VBA speed and efficiency, especially fore large Excel files, is to save your Workbooks in binary XLSB format. To read more on the benefits read my post on the XLSB format.
Use Early binding
When working with OLE objects remember that it is more efficient to declare the OLE object type explicitly. Use:
Use vbNullString vs “”
Instead of comparing an empty string use thge vbNullString constant:
Disable VBA Events
Prevent Excel from firing Excel Events while processing your macro:
'Disables Excel events during the runtime of the VBA Macro
Use With…End Clause
Use the With…End Clause when working with Object such as Ranges, Worksheets, Workbooks, Collections etc. It is not only more efficient but also much easier to read, and simplifies your code.
'This is better:
With…End Clause on this MSDN Article here
Disable Excel PageBreaks
Disabling PageBreaks in Excel can increase Excel and VBA Performance. Read here why this helps your VBA performance: MSDN PageBreaks and Performance. You can disable PageBreaks like this:
'Disable page breaks in Excel spreadsheets to save time screen refreshing
Use VBA Constants
Constants are precompiled variables that can’t be modified during VBA untime. Because of this they are more efficient than regular variables defined with the Dim statement.
'This is slower:
Increasing VBA performance
It is good to handle turning off some of the optimizations mentioned above. I usually use a procedure I defined which I named “OptimizeVBA”. I turn optimizations before running my macro and then turn them off after completion.
Turning ScreenUpdating and Calculations off although improves VBA performance can sometimes causes some issues.
In case calculations are turned off do not expect that Excel with recalculate its formulas based on any changes made to your Worksheets. In case you want to control these recalculations use (in this order):
ActiveSheet.Calculate – this will recalculate only the “dirty” cells within your ActiveSheet
ActiveWorkbook.Calculate – this will recalculate all the “dirty” cells within your ActiveWorkbook
Application.CalculateFull – this will recalculate all open Excel Workbooks
Application.CalculateFullRebuild – this will recalculate all open Excel Workbooks and rebuild all dependencies
When turning ScreenUpdating off you will not be able to access or see what is happening on the Excel Worksheet. In some rare cases Excel may crash assuming that the Application is not responding. So how to respond to this? How to control when the Screen is Updated and when it is not? Use the DoEvents procedure. See the example below
'Your code here
DoEvents'Update screen etc.
'Your other code here
If you still aren’t satisfied with your VBA performance try Compiling your VBA to a .NET COM DLL or multithreading VBA. VBA code can be mulithreaded although this makes sense for macros running more than approx. 10 seconds (as there is an overhead).
Find links to the tools that can further improve your VBA code performance below:
When working on improving VBA performance you will certainly want to verify how some optimization will reduce your total execution time. There are several known methods for measuring VBA code execution time. The Performance Module of the VBA Time Saver contains at least 2 methods for measuring execution time as well as other way for Improving VBA Performance. There is also a hack to Create an Additional VBA Thread.
The Performance Module contains:
2 Methods for Calculating VBA Execution Time (up to 1/1’000’000 of a Second)
Turn Optimizations On/Off
Create Additional VBA Thread (unstable on some environments)