Tag Archives: PowerQuery

merge worksheets

Merging worksheets / tables in Excel

1 Star2 Stars3 Stars4 Stars5 Stars (4 votes, average: 4.75 out of 5)
Loading...

Often we need to merge worksheets / workbooks containing lots of data. Imagine receiving periodical daily reports and wanting to quickly consolidate them to generate a weekly or monthly report… seems like a lot of work. Merging worksheets does not necessarily need to be hard… as long as you read through today’s post.

Merging worksheets with VBA

To exemplify the issue let’s consider a Workbook consisting of 2 worksheets (below named Sheet1 and Sheet2) with identical columns.

consolidate worksheets
Consolidating worksheets: Example data

What we want to do is merge these 2 worksheets into one consolidated worksheet. We may of course want to do this in multiple ways e.g. by:

  • Simply appending the worksheets to each other
  • Merging the data sets while removing duplicates

Here is the example result we want to achieve (Sheet2 appended to Sheet1 on a third separate Worksheet):

merging worksheets
Merging worksheets: Result

In today’s Tip of the Day I will exemplify how to merge worksheets / tables (any number) with an option of eliminating duplicates.

Merging worksheets with VBA

Let’s first consider a simple piece of VBA code. The procedure below can handle any number of Worksheets – including Worksheets from external Workbooks as long as you have the Workbook open and provide it within the array of Worksheets passed to the procedure.

VBA Code

Sub Merge(ws() As Worksheet, destWs As Worksheet, headerInFirstRow As Boolean, removeDuplicates As Boolean)
    'Clear destination worksheet
    destWs.UsedRange.EntireRow.Delete
    Dim pasteRange As Range, header As Range, firstFreeRow As Range, w As Variant, copyRange As Range
    'Paste header
    If headerInFirstRow Then
        Set header = ws(0).UsedRange.Cells(1).EntireRow:
        header.Copy destWs.Cells(1).EntireRow
    End If
    Set firstFreeRow = destWs.UsedRange.Rows(destWs.UsedRange.Rows.Count).Offset(1).EntireRow
    'Paste worksheets
    For Each w In ws
        Set copyRange = w.UsedRange.Rows("" & _
            IIf(headerInFirstRow, 2, 1) & ":" & w.UsedRange.Rows.Count)
        copyRange.Copy firstFreeRow.Cells(1, 1)
        Set copyRange = Nothing
        Set firstFreeRow = destWs.UsedRange.Rows(destWs.UsedRange.Rows.Count).Offset(1).EntireRow
    Next w
    'Remove duplicates
    If removeDuplicates Then
        Dim colArr As Variant, col As Long: ReDim colArr(0 To destWs.UsedRange.Columns.Count - 1)
        For col = 1 To destWs.UsedRange.Columns.Count
            colArr(col - 1) = col
        Next col
        destWs.UsedRange.removeDuplicates Columns:=(colArr), header:=IIf(headerInFirstRow, xlYes, xlNo)
    End If
    'Clean
    Set firstFreeRow = Nothing: Set w = Nothing: Set header = Nothing
End Sub

  

A now a simple example of this procedure being used assuming the example data I referenced above. Let’s assume data on is on worksheets Sheet1 and Sheet2, the destination Worksheet will be Sheet3.

Sub TestMerge()
    Dim ws(0 To 1) As Worksheet
    Set ws(0) = Sheet1
    Set ws(1) = Sheet2
    Merge ws, Sheet3, True, False
End Sub

Notice that the Merge procedure accepts the following parameters:

  • ws – an array of Worksheets
  • destWs – the destination Worksheet
  • headerInFirstRow – if true assumes that the header is in the first row of each source Worksheet
  • removeDuplicates – if true remove all duplicates after merging the Worksheets

Simple enough right? Not too long. If you want to optimize it for best performance read this.

Merging worksheets with MS Query (SQL)

Now let’s consider a second approach – using MS Query (SQL). This is my personal favorite as MS Query SQL code is short, does not require VBA and the destination Worksheet can be refresh by simply right clicking on the table and hitting refresh.

To create a MS Query feel free to use my free Excel SQL AddIn or follow the steps below:

merge worksheets excel
SQL AddIn: Merging worksheets

Follow the instructions below if you don’t have my free Excel SQL AddIn:

Open the From Microsoft Query Wizard

Data ribbon and then selecting From Other Sources and From Microsoft Query -> till the end be sure to hit View Data or Edit Query in Microsoft Query to be able to provide your own SQL query.

Proceed according to Wizard to edit SQL Query

As we want to merge several Worksheets within our current Excel Workbook we need to select Excel Files as the designated data source for our MS Query:

Select From Excel Files data source
Select From Excel Files data source

Next we need to select our Current Excel Workbook from the File Dialog:
Select your current Excel Workbook
Select your current Excel Workbook

Select the first Excel Worksheet you want to merge and click the right arrow > to drag it to the Columns in your Query section.
Select any Worksheet you want to merge to proceed
Select any Worksheet you want to merge to proceed

Next instead of Returning the Data hit the View data or edit Query in Microsoft Query radio button and proceed:
Proceed to edit the MS Query
Proceed to edit the MS Query

A new Window should pop-up. Look for the SQL button to modify the existing MS Query:
Hit SQL to edit MS Query
Hit SQL to edit MS Query

Modify the SQL Query to merge Worksheets

A new SQL Window should pop-up. You can modify any number of Worksheets using this approach. Below I am assuming I want to merge two Worksheets – Sheet1 and Sheet2. The MS Query need to therefore look like this:

SELECT * FROM [Sheet1$] UNION ALL SELECT * FROM [Sheet2$]

Simply modify the existing MS Query in the SQL Window and hit OK.

What if you need to merge 3 Worksheets not just 1? Simply append an additional UNION ALL like this:

SELECT * FROM [Sheet1$] UNION ALL SELECT * FROM [Sheet2$] UNION ALL SELECT * FROM [Sheet3$]

For additional Worksheets repeat this pattern

Return Data to selected Worksheet

Return Data to Excel Workbook
Return Data to Excel Workbook
Now the pleasant part – loading the Query data. Look for the Return Data button, select a cell where the MS Query is supposed to load and hit OK.

Comment on using the MS Query approach

Simple and quick – this approach is more efficient than any VBA code we may develop as it is handled by the OLEDB driver. MS Query (SQL) is also more efficient in handling large chunks of data as compared to other approaches.

Merging worksheets with PowerQuery

If you are not familiar with PowerQuery feel free to download it from Microsoft here. PowerQuery features a lot of neat data-crunching / BI features. Let’s see how can we leverage PowerQuery to do our job.

Create tables from each source Worksheet

First what we need to do is link our source tables to PowerQuery so we can create the output query.
consolidate2
Select the whole range of each source worksheet (CTRL + A)
and select the From Table option from the PowerQuery ribbon. For this example I have named Sheet1 as Table1 and similarly with Sheet2.

Merge the Worksheets

So now that we have our Tables configured in PowerQuery we can append the 2 (or more). Hit the Append button from the Combine group in the PowerQuery ribbon.

PowerQuery - Combine
PowerQuery – Combine

Next select Table1 from the first drop-down and Table2 from the second drop-down:

PowerQuery - Append
PowerQuery – Append

and hit OK

Note: When wanting to append multiple (>2) worksheets simply repeat this step. The first Table will need to be the Appended one from the remaining tables

That’s it! Similarly as with the MS Query example you can always easily refresh your query by right-clicking and hitting Refresh!

Summary

I hope you found these methods useful. On one last note I would like to share my thoughts on how these methods compare.

Personally I prefer to use MS Query (SQL) via my SQL AddIn. This sentiment may of course be partially as I used to work a lot with databases and am used to coding SQL. I know for most Excel users out there this is not the case. In such cases and if you only have a couple of Worksheets you need to merge on a regular basis I do suggest downloading the PowerQuery AddIn. It’s easy to use and also allows you to refresh your query easily. In all other scenarios feel free to use the VBA approach and simply customize your code to work best for you!

Let me know if you found this useful! And be sure to subscribe to more Tips on my Twitter or Facebook page!

Excel Top 10 Features

Top 10 Excel Features

1 Star2 Stars3 Stars4 Stars5 Stars (6 votes, average: 3.83 out of 5)
Loading...

Today on Top 10 Excel features you need to know. I am a frequent visitor to StackOverflow to see what is troubling the Excel community (and other developer/analyst communities). It is nothing incredible that many users have issues due to not being aware of some of the most significant features in Excel like PivotTables, Array Formulas, Tables and other. I have seen more than once users reaching out to use VBA macros instead of much easier PivotTables/Charts. Although, I value VBA it should also be the solution of last resort compare to “native” Excel features used more common.

I have decided therefore to list some of the top 10 features regarded often as the most significant and useful to know. So let’s start with the 10 Top Excel Features…

No. 1: PivotTables

You can’t call yourself an advanced Excel user without knowing about PivotTables! There is no other feature in Excel I being used more often and with success. Almost the first thing I always do when analyzing data is pivot the data in multiple manners and analyze the patterns/results. Pivots can process a large amount of data in a short period of time and are optimized in the Excel back-end for performance. This is definitely one of the 10 top Excel features!

PivotTables allow you to transform and analyze data in a structure manner. Just select a range of data (data in columns with headers) and select the ROWS, COLUMNS and VALUES for your Pivot Table! You can also create custom columns (based on formulas), summarize data by groups/rows/columns etc. There is almost no limit in the possibilities.

How to find it in Excel?
Insert->Tables->PivotTables

No. 2: Filtering and sorting data


Filtering and sorting your data is just as useful as using PivotTables. Excel is meant to transform and analyze data and filtering/sorting is one of the key elements. When provided with a table of data you will probably want to sort the data in a descending/ascending manner or filter out rows based on some features (values in certain columns). This is a must-know feature.

How to find it in Excel?
Data->Sort->Filter

No. 3: Excel Tables

If you want your data tables to be neat and structured you need to use Excel Data Tables. What do you get when using Tables in Excel? A consistent structure and formatting of your entire data table, automated copied formulas (across columns), non-repeating column headers and more. It is always good to resort to Excel Data Tables as you will have less work managing your data table and can focus on more interesting work like data transformation/analysis.

How to find it in Excel?
Insert->Tables->Table

No. 4: Conditional formatting

Analyzing/transforming data is important, but it is just as useful to be able to identify variances in a range of values using graphics like colors, bars or icons. Conditional formatting can allow you to notice patterns in data values which might not be obvious when looking at raw numbers.

How to find it in Excel?
Home->Styles->Conditional Formatting

No. 5: Lookup Excel functions

Although these functions are not exactly a separate feature they are considered to be one of the most useful and most often used when analyzing data. I can’t stress how often I stumbled upon articles/posts about these functions. These functions are also frequently used by recruiters for Excel jobs.

  • VLOOKUP – search the first column of a range of cells, and then return a value from any cell on the same row of the range
  • HLOOKUP – as above but for columns instead of rows
  • INDEX – returns the value of an element in a table or an array, selected by provided index
  • MATCH – searches for a specified item in a range of cells, and then returns the relative position of that item in the range

How to use these functions?
The VLOOKUP function will return a corresponding value from another cell of the same row of a value found in the first column of the data table.

The INDEX and MATCH functions are best to be used together (see the links section below). Why? They allow you to basically achieve the same result as the LOOKUP functions HOWEVER these are more flexible. I encourage you to read the links below.

No. 6: Array Formulas

Array Formulas are one of the greatest knowledge gaps in the Excel community in my opinion. I see so often questions which can be easily answered if someone at least made an effort to learn them. Many Excel users fall into the trap of writing a lot of custom VBA just because they are not aware or are too lazy to use a neat Array Formula.

How to use Array Functions?
Go to the links section for a decent tutorial. However, the process itself is quite simple:

  1. Create a function using an Excel range e.g. A1:A10
  2. Hit CTRL+SHIFT+ENTER

So easy and yet so powerful! Let’s jump into a simple example:

Excel Array Formula Example
Let’s assume we have a range of value for a certain period of time. We want to get the maximum value for dates after the 1st of March 2015. We can get this in a single Array Formula! See below.

10 Top excel features: Excel Array Formula Example
Excel Array Formula Example

How does it work?

'MAX ( returns all cells from A2:A8 for which B2:B8 are older than 2015-03-01)
=MAX(IF(B$2:B$8>DATE(2015;3;1);A$2:A$8))

See the logic? You can also multiply/divide/sum and do other cool thing with Array Formulas. See the tutorial link below.

No. 7: Data Analysis Excel Tools

All tools needed in doing basic data cleansing can be found in the Data Tools section of the Data ribbon. Working on data often? You need to know how and when to use Text to Columns, Data Validation and all the other neat tools.

How to find it in Excel?
Data->Data Tools

No. 8: Naming fields / NameManager

Naming cells/ranges comes in handy when you repeatedly reference certain cells or arrays e.g. USD/EUR currency field, interest rate used in formulas etc. This is a nice and clean way to manage all reference to those fields and allows you to easily relocate these cells or ranges.

How to set a Defined Name to an Excel cell/range?

  • Click on the text field in the upper-left with the cell/range reference
  • Type in Defined Name for this cell/range – it must not have whitespaces
  • Hit ENTER

Now when trying to reference the cell/range in a formula simply type the new Defined Name.

10 Top excel features: Setting a Defined Name to a Cell
Setting a Defined Name to a Cell

How to find it in Excel?
Formulas->Defined Names

No. 9: VBA macros and recording macros

The reason VBA is placed before last on the 10 Top Excel features list is because, in my opinion, it is so often abused but users who refuse to learn well the remaining Excel features. VBA macros fills the gap of all the missing features/functions in Excel. Macros let’s you program almost anything in Excel, you name it – forms, database connectivity, analytics, web browsing etc. You can’t basically consider yourself an Excel Pro without being able to program macros in VBA. However, it is important to stress that the problem with VBA is that once learned it tends to make analysts lazy – instead of Excel Array Formulas you will see custom macros or hideous UDF-functions. VBA should be used as a tool of last resort!

Typical applications of VBA

  • Cleansing/filtering/sorting/copying data
  • Custom algorithms (custom analysis of data sets)
  • Custom Excel UDF functions (User Defined Functions)
  • Excel Forms (making custom user forms to input data or create a custom UI)

How to find it in Excel?
Developer->Code->Visual Basic
Recording macros
The other fantastic thing with Excel is that it enables you to record macros – by recording your actions in Excel and translating them into VBA code. In some cases you need not even understand the code to be able to reuse it e.g. applying custom formatting to selected cells. This is definitely a great feature on it’s own.

How to find it in Excel?
Developer->Code->Record Macro

No. 10: Microsoft Power Add-In’s

Last but not least on the 10 Top Excel features list are the PowerPivot, PowerQuery and PowerMap powerful Microsoft developed Add-Ins for Excel. Harness the power of Big Data, SQL, complex pivots and charts with these fantastic Add-ins! The PowerMap is a relatively new member of the family delivering nice bells and whistles to your Workbooks!

PowerPivot basically pumps Excel with more Analytics features by extend the PivotTable with summarization, cross-tabulation, expanded data capacity, advanced calculations, ability to import data from multiple sources, and the ability to publish the workbooks as interactive web applications.

PowerQuery will allow you to easily harness data and access to external data sources such as files, the Web, databases etc. and more easily manipulate and cleanse data. PowerQuery enables you to process enormous data sources/tables counting millions of records (more than an Excel Worksheet can contain).

PowerMap (as quoted on the official MS site) is a three-dimensional (3-D) data visualization tool that lets you look at information in new ways. A power map lets you discover insights you might not see in traditional two-dimensional (2-D) tables and charts.

Both these tools complete one another. If you want to do Business Intelligence in Excel you need to be able to use both these Add-Ins. Microsoft seems to have great plans for them and I would anticipate that both these Add-Ins become a “native” part of Excel in the following versions of Microsoft Excel.