Tutorial: Import Data into Excel, and Create a Data Model

Abstract: This is the first tutorial in a series designed to get you acquainted and comfortable using Excel and its built-in data mash-up and analysis features. These tutorials build and refine an Excel workbook from scratch, build a data model, then create amazing interactive reports using Power View. The tutorials are designed to demonstrate Microsoft Business Intelligence features and capabilities in Excel, PivotTables, Power Pivot, and Power View.

Note: This article describes data models in Excel 2013. However, the same data modeling and Power Pivot features introduced in Excel 2013 also apply to Excel 2016.

In these tutorials you learn how to import and explore data in Excel, build and refine a data model using Power Pivot, and create interactive reports with Power View that you can publish, protect, and share.

The tutorials in this series are the following:

In this tutorial, you start with a blank Excel workbook.

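The sections in this tutorial are the following:

• Import data from a database

• Explore data using a PivotTable

• Import data from a spreadsheet

• Import data using copy and paste

• Create a relationship between imported data

• Checkpoint and Quiz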

At the end of this tutorial is a quiz you can take to test your learning.

This tutorial series uses data describing Olympic Medals, hosting countries, and various Olympic sporting events. We suggest you go through each tutorial in order. The tutorials use Excel 2013 with Power Pivot enabled.

Import data from a database

We start this tutorial with a blank workbook. The goal in this section is to connect to an external data source, and import that data into Excel for further analysis.

Let’s start by downloading some data from the Internet. The data describes Olympic Medals, and is a Microsoft Access database.

1. Click the following links to download files we use during this tutorial series. Download each of the four files to a location that’s easily accessible, such as Downloads or My Documents, or to a new folder you create:
> OlympicMedals.accdb Access database
> OlympicSports.xlsx Excel workbook
> Population.xlsx Excel workbook
> DiscImage_table.xlsx Excel workbook

2. In Excel 2013, open a blank workbook.

3. Click DATA > Get External Data > From Access. The ribbon adjusts dynamically based on the width of your workbook, so the commands on your ribbon may look slightly different from the following screens. The first screen shows the ribbon when a workbook is wide; the second shows a workbook that has been resized to take up only a portion of the screen.

4. Select the OlympicMedals.accdb file you downloaded and click Open. The following Select Table window appears, displaying the tables found in the database. Tables in a database are similar to worksheets or tables in Excel. Check the Enable selection of multiple tables box, and select all the tables. Then click OK.

5. The Import Data window appears.

Note: Notice the checkbox at the bottom of the window that allows you to Add this data to the Data Model, shown in the following screen. A Data Model is created automatically when you import or work with two or more tables simultaneously. A Data Model integrates the tables, enabling extensive analysis using PivotTables, Power Pivot, and Power View. When you import tables from a database, the existing database relationships between those tables are used to create the Data Model in Excel. The Data Model is transparent in Excel, but you can view and modify it directly using the Power Pivot add-in. The Data Model is discussed in more detail later in this tutorial.

Select the PivotTable Report option, which imports the tables into Excel and prepares a PivotTable for analyzing the imported tables, and click OK.

6. Once the data is imported, a PivotTable is created using the imported tables.

With the data imported into Excel, and the Data Model automatically created, you’re ready to explore the data.
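To make the idea of "existing database relationships" more concrete, here is a small sketch outside Excel. It is only an illustration, not what Excel runs: it uses Python's standard sqlite3 module, a few made-up rows, and hypothetical Athlete and DisciplineID columns (the actual key columns in OlympicMedals.accdb may differ). It shows how a relationship lets rows in one table find their match in another.

```python
import sqlite3

# In-memory database with two related tables. The rows, the Athlete column,
# and the DisciplineID key column are invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Disciplines (
        DisciplineID TEXT PRIMARY KEY,
        Discipline   TEXT
    );
    CREATE TABLE Medals (
        Athlete           TEXT,
        NOC_CountryRegion TEXT,
        Medal             TEXT,
        DisciplineID      TEXT REFERENCES Disciplines(DisciplineID)
    );
    INSERT INTO Disciplines VALUES ('D1', 'Archery'), ('D2', 'Diving');
    INSERT INTO Medals VALUES
        ('Athlete A', 'USA', 'Gold',   'D1'),
        ('Athlete B', 'KOR', 'Silver', 'D1'),
        ('Athlete C', 'CHN', 'Gold',   'D2');
""")

# The relationship lets each medal row look up its discipline name.
medals_with_discipline = conn.execute("""
    SELECT m.NOC_CountryRegion, m.Medal, d.Discipline
    FROM Medals AS m
    JOIN Disciplines AS d ON d.DisciplineID = m.DisciplineID
    ORDER BY m.Athlete
""").fetchall()

for row in medals_with_discipline:
    print(row)
```

Because the key column ties the two tables together, fields from both can appear side by side, which is roughly what the Data Model makes possible in a PivotTable.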

Explore data using a PivotTable

Exploring imported data is easy using a PivotTable. In a PivotTable, you drag fields (similar to columns in Excel) from tables (like the tables you just imported from the Access database) into different areas of the PivotTable to adjust how it presents your data. A PivotTable has four areas: FILTERS, COLUMNS, ROWS, and VALUES.

It might take some experimenting to determine which area a field should be dragged to. You can drag as many or few fields from your tables as you like, until the PivotTable presents your data how you want to see it. Feel free to explore by dragging fields into different areas of the PivotTable; the underlying data is not affected when you arrange fields in a PivotTable.
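If it helps to think of the four areas in code, the following Python sketch mimics them with a few made-up records. The pivot_count function is a hypothetical helper written for this illustration, not an Excel feature: one field supplies the rows, another the columns, the value is a count of records, and an optional filter restricts which records are counted.

```python
from collections import Counter

# Made-up medal records; each dict is one row, each key one field.
medals = [
    {"Discipline": "Archery", "NOC_CountryRegion": "KOR", "Medal": "Gold"},
    {"Discipline": "Archery", "NOC_CountryRegion": "USA", "Medal": "Silver"},
    {"Discipline": "Archery", "NOC_CountryRegion": "KOR", "Medal": "Bronze"},
    {"Discipline": "Diving", "NOC_CountryRegion": "USA", "Medal": "Gold"},
    {"Discipline": "Diving", "NOC_CountryRegion": "CHN", "Medal": "Gold"},
]


def pivot_count(records, row_field, column_field, filters=None):
    """Count records per (row, column) pair, keeping only records that match filters."""
    filters = filters or {}
    counts = Counter()
    for record in records:
        if all(record[field] == value for field, value in filters.items()):
            counts[(record[row_field], record[column_field])] += 1
    return counts


# ROWS = Discipline, COLUMNS = NOC_CountryRegion, VALUES = count of records.
by_discipline = pivot_count(medals, "Discipline", "NOC_CountryRegion")
# Same layout with a FILTERS-style restriction to gold medals.
gold_only = pivot_count(medals, "Discipline", "NOC_CountryRegion", {"Medal": "Gold"})

print(by_discipline[("Archery", "KOR")])  # 2
print(gold_only[("Archery", "KOR")])      # 1
print(len(medals))                        # 5: the source records are untouched
```

Changing which field plays which role only changes the summary; the list of records stays exactly as it was, which mirrors how arranging fields leaves the underlying data alone.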

Let’s explore the Olympic Medals data in the PivotTable, starting with Olympic medalists organized by discipline, medal type, and the athlete’s country or region.

1. In PivotTable Fields, expand the Medals table by clicking the arrow beside it. Find the NOC_CountryRegion field in the expanded Medals table, and drag it to the COLUMNS area. NOC stands for National Olympic Committee, which is the organizational unit for a country or region.

2. Next, from the Disciplines table, drag Discipline to the ROWS area.

3. Let’s filter Disciplines to display only five sports: Archery, Diving, Fencing, Figure Skating, and Speed Skating. You can do this from within the PivotTable Fields area, or from the Row Labels filter in the PivotTable itself.

1. Click anywhere in the PivotTable to ensure the Excel PivotTable is selected. In the PivotTable Fields list, where the Disciplines table is expanded, hover over its Discipline field and a dropdown arrow appears to the right of the field. Click the dropdown, click (Select All) to remove all selections, then scroll down and select Archery, Diving, Fencing, Figure Skating, and Speed Skating. Click OK.

2. Or, in the Row Labels section of the PivotTable, click the dropdown next to Row Labels in the PivotTable, click (Select All) to remove all selections, then scroll down and select Archery, Diving, Fencing, Figure Skating, and Speed Skating. Click OK.

4. In PivotTable Fields, from the Medals table, drag Medal to the VALUES area. Since Values must be numeric, Excel automatically changes Medal to Count of Medal.

5. From the Medals table, select Medal again and drag it into the FILTERS area.

6. Let’s filter the PivotTable to display only those countries or regions with more than 90 total medals. Here’s how.

1. In the PivotTable, click the dropdown to the right of Column Labels.

2. Select Value Filters and select Greater Than….

3. Type 90 in the last field (on the right). Click OK.

Your PivotTable looks like the following screen.
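The value filter in the last step can be pictured with a short Python sketch. It assumes the filter compares each country's total across the displayed disciplines against the threshold; the counts below are invented, and columns_above is a hypothetical helper, not part of Excel.

```python
from collections import Counter

# (country, discipline) -> medal count; numbers are invented for illustration.
medal_counts = {
    ("USA", "Diving"): 70,
    ("USA", "Archery"): 30,
    ("KOR", "Archery"): 40,
    ("CHN", "Diving"): 95,
    ("ITA", "Fencing"): 60,
}


def columns_above(counts, threshold):
    """Return the countries whose total across all disciplines exceeds threshold."""
    totals = Counter()
    for (country, _discipline), count in counts.items():
        totals[country] += count
    return sorted(country for country, total in totals.items() if total > threshold)


kept = columns_above(medal_counts, 90)
print(kept)  # ['CHN', 'USA']
```

In this sketch USA is kept because its two disciplines together exceed 90, even though neither does on its own.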

With little effort, you now have a basic PivotTable that includes fields from three different tables. What made this task so simple were the pre-existing relationships among the tables. Because table relationships existed in the source database, and because you imported all the tables in a single operation, Excel could recreate those table relationships in its Data Model.

But what if your data originates from different sources, or is imported at a later time? Typically, you can create relationships with new data based on matching columns. In the next step, you import additional tables, and learn how to create new relationships.

Import data from a spreadsheet

Now let’s import data from another source, this time from an existing workbook, then specify the relationships between our existing data and the new data. Relationships let you analyze collections of data in Excel, and create interesting and immersive visualizations from the data you import.

Let’s start by creating a blank worksheet, then importing data from an Excel workbook.

1. Insert a new Excel worksheet, and name it Sports.

2. Browse to the folder that contains the downloaded sample data files, and open OlympicSports.xlsx.

3. Select and copy the data in Sheet1. If you select a cell with data, such as cell A1, you can press Ctrl + A to select all adjacent data. Close the OlympicSports.xlsx workbook.

4. On the Sports worksheet, place your cursor in cell A1 and paste the data.

5. With the data still highlighted, press Ctrl + T to format the data as a table. You can also format the data as a table from the ribbon by selecting HOME > Format as Table. Since the data has headers, select My table has headers in the Create Table window that appears, as shown here.

Formatting the data as a table has many advantages. You can assign a name to a table, which makes it easy to identify. You can also establish relationships between tables, enabling exploration and analysis in PivotTables, Power Pivot, and Power View.

6. Name the table. In TABLE TOOLS > DESIGN > Properties, locate the Table Name field and type Sports. The workbook looks like the following screen.

7. Save the workbook.

Import data using copy and paste

Now that we’ve imported data from an Excel workbook, let’s import data from a table we find on a web page, or any other source from which we can copy and paste into Excel. In the following steps, you add the Olympic host cities from a table.

1. Insert a new Excel worksheet, and name it Hosts.

2. Select and copy the following table, including the table headers.

| City | NOC_CountryRegion | Alpha-2 Code | Edition | Season |
| --- | --- | --- | --- | --- |
| Melbourne / Stockholm | AUS | AS | 1956 | Summer |
| Sydney | AUS | AS | 2000 | Summer |
| Innsbruck | AUT | AT | 1964 | Winter |
| Innsbruck | AUT | AT | 1976 | Winter |
| Antwerp | BEL | BE | 1920 | Summer |
| Antwerp | BEL | BE | 1920 | Winter |
| Montreal | CAN | CA | 1976 | Summer |
| Lake Placid | CAN | CA | 1980 | Winter |
| Calgary | CAN | CA | 1988 | Winter |
| St. Moritz | SUI | SZ | 1928 | Winter |
| St. Moritz | SUI | SZ | 1948 | Winter |
| Beijing | CHN | CH | 2008 | Summer |
| Berlin | GER | GM | 1936 | Summer |
| Garmisch-Partenkirchen | GER | GM | 1936 | Winter |
| Barcelona | ESP | SP | 1992 | Summer |
| Helsinki | FIN | FI | 1952 | Summer |
| Paris | FRA | FR | 1900 | Summer |
| Paris | FRA | FR | 1924 | Summer |
| Chamonix | FRA | FR | 1924 | Winter |
| Grenoble | FRA | FR | 1968 | Winter |
| Albertville | FRA | FR | 1992 | Winter |
| London | GBR | UK | 1908 | Summer |
| London | GBR | UK | 1908 | Winter |
| London | GBR | UK | 1948 | Summer |
| Munich | GER | DE | 1972 | Summer |
| Athens | GRC | GR | 2004 | Summer |
| Cortina d'Ampezzo | ITA | IT | 1956 | Winter |
| Rome | ITA | IT | 1960 | Summer |
| Turin | ITA | IT | 2006 | Winter |
| Tokyo | JPN | JA | 1964 | Summer |
| Sapporo | JPN | JA | 1972 | Winter |
| Nagano | JPN | JA | 1998 | Winter |
| Seoul | KOR | KS | 1988 | Summer |
| Mexico | MEX | MX | 1968 | Summer |
| Amsterdam | NED | NL | 1928 | Summer |
| Oslo | NOR | NO | 1952 | Winter |
| Lillehammer | NOR | NO | 1994 | Winter |
| Stockholm | SWE | SW | 1912 | Summer |
| St Louis | USA | US | 1904 | Summer |
| Los Angeles | USA | US | 1932 | Summer |
| Lake Placid | USA | US | 1932 | Winter |
| Squaw Valley | USA | US | 1960 | Winter |
| Moscow | URS | RU | 1980 | Summer |
| Los Angeles | USA | US | 1984 | Summer |
| Atlanta | USA | US | 1996 | Summer |
| Salt Lake City | USA | US | 2002 | Winter |
| Sarajevo | YUG | YU | 1984 | Winter |

3. In Excel, place your cursor in cell A1 of the Hosts worksheet and paste the data.

4. Format the data as a table. As described earlier in this tutorial, press Ctrl + T to format the data as a table, or select HOME > Format as Table. Since the data has headers, select My table has headers in the Create Table window that appears.

5. Name the table. In TABLE TOOLS > DESIGN > Properties, locate the Table Name field and type Hosts.

6. Select the Edition column, and from the HOME tab, format it as Number with 0 decimal places.

7. Save the workbook. Your workbook looks like the following screen.
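For readers who think in code, the paste, header, and number-format steps above have a rough equivalent in Python. This is only an analogy under stated assumptions: the snippet assumes the copied table arrives as tab-separated text, uses three rows from the Hosts data, and relies solely on the standard csv module.

```python
import csv
import io

# A few rows of the Hosts table as tab-separated text, the form a copied
# table often takes on the clipboard (an assumption for this sketch).
pasted = (
    "City\tNOC_CountryRegion\tAlpha-2 Code\tEdition\tSeason\n"
    "Sydney\tAUS\tAS\t2000\tSummer\n"
    "Innsbruck\tAUT\tAT\t1964\tWinter\n"
    "Beijing\tCHN\tCH\t2008\tSummer\n"
)

# DictReader treats the first line as headers, similar in spirit to
# selecting "My table has headers".
hosts = list(csv.DictReader(io.StringIO(pasted), delimiter="\t"))

# Store Edition as a whole number instead of text, comparable to formatting
# the column as Number with 0 decimal places.
for row in hosts:
    row["Edition"] = int(row["Edition"])

print(hosts[0])
```

Treating Edition as a number rather than text matters later, when values from different tables need to be compared or matched.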

Now that you have an Excel workbook with tables, you can create relationships between them. Creating relationships between tables lets you mash up the data from the two tables.

Create a relationship between imported data

You can immediately begin using fields in your PivotTable from the imported tables. If Excel can’t determine how to incorporate a field into the PivotTable, a relationship must be established with the existing Data Model. In the following steps, you learn how to create a relationship between data you imported from different sources.

1. On Sheet1, at the top of PivotTable Fields, click All to view the complete list of available tables, as shown in the following screen.

2. Scroll through the list to see the new tables you just added.

3. Expand Sports and select Sport to add it to the PivotTable. Notice that Excel prompts you to create a relationship, as seen in the following screen.

This notification occurs because you used fields from a table that’s not part of the underlying Data Model. One way to add a table to the Data Model is to create a relationship to a table that’s already in the Data Model. To create the relationship, one of the tables must have a column of unique, non-repeated values. In the sample data, the Disciplines table imported from the database contains a field with sports codes, called SportID. Those same sports codes are present as a field in the Excel data we imported. Let’s create the relationship.

4. Click CREATE... in the highlighted PivotTable Fields area to open the Create Relationship dialog, as shown in the following screen.

5. In Table, choose Disciplines from the drop-down list.

6. In Column (Foreign), choose SportID.

7. In Related Table, choose Sports.

8. In Related Column (Primary), choose SportID.

9. Click OK.

The PivotTable changes to reflect the new relationship. But the PivotTable doesn’t look right quite yet, because of the ordering of fields in the ROWS area. Discipline is a subcategory of a given sport, but since we arranged Discipline above Sport in the ROWS area, it’s not organized properly. The following screen shows this unwanted ordering.

10. In the ROWS area, move Sport above Discipline. That’s much better, and the PivotTable displays the data how you want to see it, as shown in the following screen.
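Why the order of fields in ROWS matters can be seen in a small Python sketch. The medal counts are invented, the Sport and Discipline pairs are only examples, and group_nested is a hypothetical helper: the field listed first becomes the outer grouping level, and the second field is nested inside it.

```python
from collections import defaultdict

# Invented counts; the Sport/Discipline pairs are examples only.
rows = [
    {"Sport": "Aquatics", "Discipline": "Diving", "Medals": 3},
    {"Sport": "Aquatics", "Discipline": "Swimming", "Medals": 5},
    {"Sport": "Skating", "Discipline": "Figure Skating", "Medals": 2},
    {"Sport": "Skating", "Discipline": "Speed Skating", "Medals": 4},
]


def group_nested(records, outer_field, inner_field, value_field):
    """Sum value_field into a two-level dict: outer -> inner -> total."""
    nested = defaultdict(lambda: defaultdict(int))
    for record in records:
        nested[record[outer_field]][record[inner_field]] += record[value_field]
    return {outer: dict(inner) for outer, inner in nested.items()}


# Sport above Discipline: each sport lists its disciplines.
sport_first = group_nested(rows, "Sport", "Discipline", "Medals")
# Discipline above Sport: each discipline repeats its single parent sport.
discipline_first = group_nested(rows, "Discipline", "Sport", "Medals")

print(sport_first["Aquatics"])     # {'Diving': 3, 'Swimming': 5}
print(discipline_first["Diving"])  # {'Aquatics': 3}
```

With Sport as the outer level, each group collects several disciplines; with the order reversed, every discipline sits above a single repeated sport, which is the awkward layout the step above corrects.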

Behind the scenes, Excel is building a Data Model that can be used throughout the workbook, in any PivotTable, PivotChart, in Power Pivot, or any Power View report. Table relationships are the basis of a Data Model, and they determine navigation and calculation paths.
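The requirement that one side of a relationship holds unique values can be sketched in Python. This is a simplified illustration, not how Excel implements the Data Model: is_unique and relate are hypothetical helpers, the sample rows and SportID codes are invented, and only the field names (Discipline, SportID, Sport) come from this tutorial.

```python
def is_unique(rows, column):
    """Return True when no value in the column repeats."""
    values = [row[column] for row in rows]
    return len(values) == len(set(values))


def relate(many_rows, many_key, one_rows, one_key):
    """Attach the matching lookup-side row to each row on the repeating side."""
    if not is_unique(one_rows, one_key):
        raise ValueError(f"{one_key} must be unique in the related table")
    lookup = {row[one_key]: row for row in one_rows}
    return [
        {**row, **lookup[row[many_key]]}
        for row in many_rows
        if row[many_key] in lookup
    ]


# Sample rows are invented; only the field names come from the tutorial.
disciplines = [
    {"Discipline": "Diving", "SportID": "S1"},
    {"Discipline": "Swimming", "SportID": "S1"},
    {"Discipline": "Archery", "SportID": "S2"},
]
sports = [
    {"SportID": "S1", "Sport": "Aquatics"},
    {"SportID": "S2", "Sport": "Archery"},
]

related = relate(disciplines, "SportID", sports, "SportID")
print(related[0])  # {'Discipline': 'Diving', 'SportID': 'S1', 'Sport': 'Aquatics'}
```

Here SportID repeats in the disciplines rows but is unique in the sports rows, so each discipline finds exactly one sport. Swapping the two tables raises an error in this sketch, because a repeated key cannot serve as the lookup side.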

In the next tutorial, Extend Data Model relationships using Excel 2013, Power Pivot, and DAX, you build on what you learned here, and step through extending the Data Model using a powerful and visual Excel add-in called Power Pivot. You also learn how to calculate columns in a table, and use that calculated column so that an otherwise unrelated table can be added to your Data Model.

Checkpoint and Quiz

Review What You’ve Learned

You now have an Excel workbook that includes a PivotTable accessing data in multiple tables, several of which you imported separately. You learned to import from a database, from another Excel workbook, and from copying data and pasting it into Excel.

To make the data work together, you had to create a table relationship that Excel used to correlate the rows. You also learned that having columns in one table that correlate to data in another table is essential for creating relationships, and for looking up related rows.

You’re ready for the next tutorial in this series: Extend Data Model relationships using Excel 2013, Power Pivot, and DAX.

QUIZ

Want to see how well you remember what you learned? Here’s your chance. The following quiz highlights features, capabilities, or requirements you learned about in this tutorial. At the bottom of the page, you’ll find the answers. Good luck!

Question 1: Why is it important to convert imported data into tables?

A: You don’t have to convert them into tables, because all imported data is automatically turned into tables.

B: If you convert imported data into tables, they will be excluded from the Data Model. Only when they’re excluded from the Data Model are they available in PivotTables, Power Pivot, and Power View.

C: If you convert imported data into tables, they can be included in the Data Model, and be made available to PivotTables, Power Pivot, and Power View.

D: You cannot convert imported data into tables.

Question 2: Which of the following data sources can you import into Excel, and include in the Data Model?

A: Access Databases, and many other databases as well.

B: Existing Excel files.

C: Anything you can copy and paste into Excel and format as a table, including data tables in websites, documents, or anything else that can be pasted into Excel.

D: All of the above

Question 3: In a PivotTable, what happens when you reorder fields in the four PivotTable Fields areas?

A: Nothing – you cannot reorder fields once you place them in the PivotTable Fields areas.

B: The PivotTable format is changed to reflect the layout, but underlying data is unaffected.

C: The PivotTable format is changed to reflect the layout, and all underlying data is permanently changed.

D: The underlying data is changed, resulting in new data sets.

Question 4: When creating a relationship between tables, what is required?

A: Neither table can have any column that contains unique, non-repeated values.

B: One table must not be part of the Excel workbook.

C: The columns must not be converted to tables.

D: None of the above is correct.

1. Correct answer: C

2. Correct answer: D

3. Correct answer: B

4. Correct answer: D

Notes: Data and images in this tutorial series are based on the following:

• Olympics Dataset from Guardian News & Media Ltd.

• Flag images from CIA Factbook (cia.gov)

• Population data from The World Bank (worldbank.org)

• Olympic Sport Pictograms by Thadius856 and Parutakupiu
