Move data from Excel to Access

This article shows you how to move your data from Excel to Access and convert your data to relational tables so that you can use Microsoft Office Excel and Access together. To summarize, Access is best for capturing, storing, querying, and sharing data, and Excel is best for calculating, analyzing, and visualizing data.

Two articles, Using Access or Excel to manage your data and Top 10 reasons to use Access with Excel, discuss which program is best suited for a particular task and how to use Excel and Access together to create a practical solution.

In this article

The essence of relational databases and data modeling

The components of a relational table

What is Normalization?

Tables can be in different normal forms

Relationships and keys

Data integrity and validity

Summing up

The basic steps to moving data from Excel to Access

Step 1: Import data from Excel to Access

Automatically append data the easy way

Step 2: Normalize data by using the Table Analyzer Wizard

Step 3: Connect to Access data from Excel

Get your data into Access

The essence of relational databases and data modeling

Many data files, including Excel, are called flat files. These files are often large and contain redundant data, columns that are rarely used, and many blank values. You may inherit these files from another system or user, or they might have evolved this way as columns were added over time to satisfy changing requirements. Although the organization of the data in a flat file works for a specific purpose, they are not flexible, and you can find it difficult to answer unanticipated questions about your data.

a flat file

The time-tested solution to a flat file is a relational database. Access is a relational database program and works best when tables are well designed with relationships that conform to the relational database model.

Top of Page

The components of a relational table

In a well-designed relational database, each table is a collection of named columns and multiple rows that stores information about a single subject, such as employees. Each column of the table is uniquely named and contains information about the subject, such as an employee's first name and address. The rows of the table contain occurrences of the subject, such as all the current employees in the company. A single value is stored in the intersection of a row and column, and is a single fact, such as "Seattle." Finally, you can re-order rows and columns without changing the meaning of the table.

a relational table

1. A table represents a single subject — a person, place, thing, event, or concept

2. Each row is unique and has a primary key, such as a badge number

3. Each column has a unique, short, and meaningful name

4. All the values in a column are similar in meaning and format

5. Each value in the table (The equivalent of a cell in Excel) represents a single fact

Top of Page

What is Normalization?

Unfortunately, a well-designed relational table does not just automatically happen in an Access database. You must use a method to analyze data in a flat file and to rearrange data from one table into two or more related tables. This method is called normalization. In a step-by-step process, you split one table into two or more smaller tables by removing repeating values in a column, removing redundant data from rows, and adding primary keys (a field that uniquely identifies each record in a table) and foreign keys (a field that contains values found in a primary key field in a related table) to define relationships between the new tables.

a relational diagram

1. A relational table

2. Column names

3. A Primary Key

4. A Foreign Key

5. Relational lines and symbols

Top of Page

Tables can be in different normal forms

A table can be in one of four different normal forms: zero, first, second, and third. Each form describes the degree to which data in a table is organized and can be successfully used in a relational database. Zero normal form is the least organized and third normal form is the most efficiently organized.

Zero normal form    One sign that a table is in the least organized state, known as "zero normal form," is when one or more columns contain "non-atomic" values, which means that multiple values are contained in a single cell. For example, a customer address might consist of a street address (such as 2302 Harvard Ave), a city, a state, and a postal (ZIP) code. Ideally, each of these elements of the address are stored in separate columns. As another example, a column that contains full names, such as "Li, Yale" or "Ellen Adams" should be split into separate columns for first and last names. Storing first and last names in separate columns is a good practice that helps you quickly find and sort your data.

Another sign of data in zero normal form is when it contains information about different subjects, such as salespersons, products, customers, and orders. Whenever possible, data should be separated into separate tables for each subject.

First normal form    A table is in first normal form when every column contains atomic values, but one or more columns contain some redundant data, such as salesperson or customer information for each portion of an order. For example, Adams, Ellen is repeated five times in the worksheet because she has two different orders (one with three products and one with two products).

Second normal form    A table is in second normal form when redundant data has been removed, but one or more columns are either not based on the primary key or contain calculated values (such as Price * Discount).

Third normal form    A table is in third normal form when all columns in the table are based solely on the primary key. As shown in the following picture, product and supplier information are stored in separate tables and are joined on a lookup to the Supplier ID field in the Suppliers table.

Tables in zero, first, and second normal form can present problems when you want to change data. For example, updating values that are repeated frequently is an extremely time consuming process. Each time you update a value, you need to check to see if every other row contains that same value. This can waste your time and be an error-prone operation. Furthermore, it's difficult to effectively sort and filter columns that contain repeating values. Tables in first and second normal form are a big improvement over zero normal form, but they can still be a problem when you insert, update or delete data.

When you go through the process of normalizing your data, you convert a table from a lower form to a higher form until all tables are in third normal form. In most cases, third normal form is ideal because:

  • Modification problems can be eliminated when data is inserted, deleted, or updated.

  • The integrity of the data can be maintained with data constraints and business rules.

  • You can query the data in a variety of ways to answer your questions.

Top of Page

Relationships and keys

A well-defined relational database contains several tables, each in third normal form, but there are also relationships between these tables that help bring the data together. For example, Employees belong to Departments and are assigned to Projects, Projects have Subtasks, Subtasks are owned by Employees, and Departments manage Projects. In this scenario, a relational database would have four tables defined: Employees, Projects, Subtasks, and Departments, with each of these key relationships defined: belong to, are assigned to, have, are owned by, and manage.

There are three types of relationships:

  • One-to-one (1:1)    For example, each employee has a unique badge ID and each badge ID refers to has a unique employee.

  • One-to-many (1:M)    For example, each employee is assigned to one department, but a department has many employees. This is also called a parent-child relationship.

  • Many-to-many (M:M)    For example, an employee can be assigned to many projects, and each project can have many employees assigned. Note that a special table, called a junction table, is often used to create a one-to-many relationship between each table in third normal form for a total of three tables that together form the many-to-many relationship.

You create a relationship between two or more tables based on primary and foreign keys. A primary key is a column in a table whose values uniquely identify each row in a table, such as a Badge Number or Department Code. A foreign key is a column in a table whose values are the same as the primary key of another table. You can think of a foreign key as a copy of primary key from another relational table. The relationship between two tables is made by matching the values of the foreign key in one table with the values of the primary key in another.

primary and foreign key relationships

Top of Page

Data integrity and validity

After you create a relational database with all tables in third normal form and the correct relationships defined, you want to ensure the integrity of that data. Data integrity means that you can correctly and consistently navigate relationships and manipulate the tables in the database over time as the database is updated. There are two basic rules in relational databases that help to ensure data integrity.

The entity rule    Each row in a table must have a primary key and that primary key must have a value. This rule makes sure that every row in a table can be uniquely identified and is never inadvertently lost. Furthermore, whenever you insert, update, or delete data, the uniqueness and existence of all primary keys can be maintained.

The referential integrity rule    This rule controls insertion and deletion rules of the one-to-many relationship. If a table has a foreign key, every value of the foreign key must either be null (no value) or must match the values in the relational table in which that foreign key is a primary key.

editing relationships

You can also further ensure data integrity in a relational database by using various data validation rules, including data type (such as Integer), data length (such as 15 characters or less), data format (such as currency), default value (such as 10), and constraints (such as Inventory_Amt > ReOrder_Amt). These data validation rules help ensure that the database has quality data and also conforms to established business rules.

It's worth noting that data entry is an important difference between an Access database and an Excel workbook. Entering data in an Excel worksheet is "free-form." You can enter data just about anywhere and you can easily undo a change. However, an Access database is much more structured and constrained. Furthermore, when you enter data in a table, the change is committed to the database. You cannot undo the data entry in the same way as Excel, although you can delete or update the data to correct any mistakes.

Top of Page

Summing up

After you normalize your data into relational tables with well-defined relationships and define its data integrity, it becomes much easier to:

  • Save space and improve performance, because repeating and redundant data is physically removed.

  • Update data accurately and maintain the integrity of the data.

  • Sort, filter, create calculated columns, aggregate, and summarize data.

  • Query data in a variety of ways to answer expected and unanticipated questions.

To be sure, there are advanced aspects to relational database design, such as composite keys (a key that consists of values from two or more columns), additional normal forms (fourth normal form — a multivalued dependency), and denormalization. But for most simple to moderate database needs, you have the essential information about database design that you need to understand the following case studies in this article.

Top of Page

The basic steps to moving data from Excel to Access

When you move data from Excel to Access, there are three basic steps to the process.

three basic steps

Top of Page

Step 1: Import data from Excel to Access

Importing data is an operation that can go a lot more smoothly if you take some time to prepare and clean your data. Importing data is like moving to a new home. If you clean out and organize your possessions before you move, settling into your new home is much easier.

Clean your data before you import

Before you import data into Access, in Excel it's a good idea to:

  • Convert cells that contain non-atomic data (that is, multiple values in one cell) to multiple columns. For example, a cell in a "Skills" column that contains multiple skill values, such as "C# programming," "VBA programming," and "Web design" should be broken out to separate columns that each contain only one skill value.

  • Use the TRIM command to remove leading, trailing, and multiple embedded spaces.

  • Remove non-printing characters.

  • Find and fix spelling and punctuation errors.

  • Remove duplicate rows or duplicate fields.

  • Ensure that columns of data do not contain mixed formats, especially numbers formatted as text or dates formatted as numbers.

Watch the following demo to learn how to clean up your data in Excel.

Show me Watch this

                             Video created by Office Online staff writers

For more information, see the following Excel help topics:

Note   If your data cleaning needs are complex, or you don't have the time or resources to automate the process on your own, you might consider using a third-party vendor. For more information, search for "data cleansing software" or "data quality" by your favorite search engine in your Web browser.

Choose the best data type when you import

During the import operation in Access, you want to make good choices so that you receive few (if any) conversion errors that will require manual intervention. The following table summarizes how Excel number formats and Access data types are converted when you import data from Excel to Access, and offers some tips on the best data types to choose in the Import Spreadsheet Wizard.

Excel number format

Access data type

Comments

Best practice

Text

Text, Memo

The Access Text data type stores alphanumeric data up to 255 characters. The Access Memo data type stores alphanumeric data up to 65,535 characters.

Choose Memo to avoid truncating any data.

Number, Percentage, Fraction, Scientific

Number

Access has one Number data type that varies based on a Field Size property (Byte, Integer, Long Integer, Single, Double, Decimal).

Choose Double to avoid any data conversion errors.

Date

Date

Access and Excel both use the same serial date number to store dates. In Access, the date range is larger: from -657,434 (January 1, 100 A.D.) to 2,958,465 (December 31, 9999 A.D.).

Because Access does not recognize the 1904 date system (used in Excel for the Macintosh), you need to convert the dates either in Excel or Access to avoid confusion.

For more information, see Change the date system, format, or two-digit year interpretation and Import or link to data in an Excel workbook

.

Choose Date.

Time

Time

Access and Excel both store time values by using the same data type.

Choose Time, which is usually the default.

Currency, Accounting

Currency

In Access, the Currency data type stores data as 8-byte numbers with precision to four decimal places, and is used to store financial data and prevent rounding of values.

Choose Currency, which is usually the default.

Boolean

Yes/No

Access uses -1 for all Yes values and 0 for all No values, whereas Excel uses 1 for all TRUE values and 0 for all FALSE values.

Choose Yes/No, which automatically converts underlying values.

Hyperlink

Hyperlink

A hyperlink in Excel and Access contains a URL or Web address that you can click and follow.

Choose Hyperlink, otherwise Access may use the Text data type by default.

Once the data is in Access, you can delete the Excel data. Don't forget to backup the original Excel workbook first before deleting it.

For more information, see the Access help topic, Import or link to data in an Excel workbook.

Top of Page

Automatically append data the easy way

A common problem Excel users have is appending data with the same columns into one large worksheet. For example, you may have an asset tracking solution that started out in Excel but now has grown to include files from many workgroups and departments. This data may be in different worksheets and workbooks, or in text files that are data feeds from other systems. There is no user interface command or easy way to append similar data in Excel. There is a work-around explained in the following article:

Microsoft Excel Blog: Append multiple text files into a worksheet without code

But even this workaround has limitations.

The best solution is to use Access, where you can easily import and append data into one table by using the Import Spreadsheet Wizard. Furthermore, you can append a lot of data into one table. You can save the import operations, add them as scheduled Microsoft Office Outlook tasks, and even use macros to automate the process.

Top of Page

Step 2: Normalize data by using the Table Analyzer Wizard

At first glance, stepping through the process of normalizing your data may seem a daunting task. Fortunately, normalizing tables in Access is a process that is much easier, thanks to the Table Analyzer Wizard.

the table analyzer wizard

1. Drag selected columns to a new table and automatically create relationships

2. Use button commands to rename a table, add a primary key, make an existing column a primary key, and undo the last action

You can use this wizard to do the following:

  • Convert a table into a set of smaller tables and automatically create a primary and foreign key relationship between the tables.

  • Add a primary key to an existing field that contains unique values, or create a new ID field that uses the AutoNumber data type.

  • Automatically create relationships to enforce referential integrity with cascading updates. Cascading deletes are not automatically added to prevent accidently deleting data, but you can easily add cascading deletes later.

  • Search new tables for redundant or duplicate data (such as the same customer with two different phone numbers) and update this as desired.

  • Back up the original table and rename it by appending "_OLD" to its name. Then, you create a query that reconstructs the original table, with the original table name so that any existing forms or reports based on the original table will work with the new table structure.

Top of Page

Step 3: Connect to Access data from Excel

After the data has been normalized in Access and a query or table has been created that reconstructs the original data, it's a simple matter of connecting to the Access data from Excel. Your data is now in Access as an external data source, and so can be connected to the workbook through a data connection, which is a container of information that is used to locate, log on to, and access the external data source. Connection information is stored in the workbook and can also be stored in a connection file, such as an Office Data Connection (ODC) file (.odc file name extension) or a Data Source Name file (.dsn extension). After you connect to external data, you can also automatically refresh (or update) your Excel workbook from Access whenever the data is updated in Access.

For more information, see Overview of connecting (importing) data and Exchange (copy, import, export) data between Excel and Access.

Top of Page

Get your data into Access

This section walks you through the following phases of normalizing your data: Breaking values in the Salesperson and Address columns into their most atomic pieces, separating related subjects into their own tables, copying and pasting those tables from Excel into Access, creating key relationships between the newly created Access tables, and creating and running a simple query in Access to return information.

Example data in non-normalized form

The following worksheet contains non-atomic values in the Salesperson column and the Address column. Both columns should be split into two or more separate columns. This worksheet also contains information about salespersons, products, customers, and orders. This information should also be split further, by subject, into separate tables.

Salesperson

Order ID

Order Date

Product ID

Qty

Price

Customer Name

Address

Phone

Li, Yale

2348

3/2/09

J-558

4

$8.50

Contoso, Ltd.

2302 Harvard Ave Bellevue, WA 98227

425-555-0222

Li, Yale

2348

3/2/09

B-205

2

$4.50

Contoso, Ltd.

2302 Harvard Ave Bellevue, WA 98227

425-555-0222

Li, Yale

2348

3/2/09

D-4420

5

$7.25

Contoso, Ltd.

2302 Harvard Ave Bellevue, WA 98227

425-555-0222

Li, Yale

2349

3/4/09

C-789

3

$7.00

Fourth Coffee

7007 Cornell St Redmond, WA 98199

425-555-0201

Li, Yale

2349

3/4/09

C-795

6

$9.75

Fourth Coffee

7007 Cornell St Redmond, WA 98199

425-555-0201

Adams, Ellen

2350

3/4/09

A-2275

2

$16.75

Adventure Works

1025 Columbia Circle Kirkland, WA 98234

425-555-0185

Adams, Ellen

2350

3/4/09

F-198

6

$5.25

Adventure Works

1025 Columbia Circle Kirkland, WA 98234

425-555-0185

Adams, Ellen

2350

3/4/09

B-205

1

$4.50

Adventure Works

1025 Columbia Circle Kirkland, WA 98234

425-555-0185

Hance, Jim

2351

3/4/09

C-795

6

$9.75

Contoso, Ltd.

2302 Harvard Ave Bellevue, WA 98227

425-555-0222

Hance, Jim

2352

3/5/09

A-2275

2

$16.75

Adventure Works

1025 Columbia Circle Kirkland, WA 98234

425-555-0185

Hance, Jim

2352

3/5/09

D-4420

3

$7.25

Adventure Works

1025 Columbia Circle Kirkland, WA 98234

425-555-0185

Koch, Reed

2353

3/7/09

A-2275

6

$16.75

Fourth Coffee

7007 Cornell St Redmond, WA 98199

425-555-0201

Koch, Reed

2353

3/7/09

C-789

5

$7.00

Fourth Coffee

7007 Cornell St Redmond, WA 98199

425-555-0201

Sousa, Luis

2354

3/7/09

A-2275

3

$16.75

Contoso, Ltd.

2302 Harvard Ave Bellevue, WA 98227

425-555-0222

Adams, Ellen

2355

3/8/09

D-4420

4

$7.25

Adventure Works

1025 Columbia Circle Kirkland, WA 98234

425-555-0185

Adams, Ellen

2355

3/8/09

C-795

3

$9.75

Adventure Works

1025 Columbia Circle Kirkland, WA 98234

425-555-0185

Li, Yale

2356

3/10/09

C-789

6

$7.00

Contoso, Ltd.

2302 Harvard Ave Bellevue, WA 98227

425-555-0222

Information in its smallest parts: atomic data

Working with the data in this example, you can use the Text to Column command in Excel to separate the "atomic" parts of a cell (such as street address, city, state, and postal code) into discrete columns.

The following table shows the new columns in the same worksheet after they have been split to make all values atomic. Note that the information in the Salesperson column has been split into Last Name and the First Name columns and that the information in the Address column has been split into Street Address, City, State, and ZIP Code columns. This data is in "first normal form."

Last Name

First Name

 

Street Address

City

State

ZIP Code

Li

Yale

2302 Harvard Ave

Bellevue

WA

98227

Adams

Ellen

1025 Columbia Circle

Kirkland

WA

98234

Hance

Jim

2302 Harvard Ave

Bellevue

WA

98227

Koch

Reed

7007 Cornell St Redmond

Redmond

WA

98199

Sousa

Luis

2302 Harvard Ave

Bellevue

WA

98227

Breaking data out into organized subjects in Excel

The several tables of example data that follow show the same information from the Excel worksheet after it has been split into tables for salespersons, products, customers, and orders. The table design isn't final, but it's on the right track.

Watch the following demo to learn how to break your Excel data out into tables.

Show me Watch this

                             Video created by Office Online staff writers

The Salespersons table contains only information about sales personnel. Note that each record has a unique ID (SalesPerson ID). The SalesPerson ID value will be used in the Orders table to connect orders to salespeople.

Salespersons

Salesperson ID

Last Name

First Name

101

Li

Yale

103

Adams

Ellen

105

Hance

Jim

107

Koch

Reed

109

Sousa

Luis

The Products table contains only information about products. Note that each record has a unique ID (Product ID). The Product ID value will be used to connect product information to the Order Details table.

Products

Product ID

Price

A-2275

16.75

B-205

4.50

C-789

7.00

C-795

9.75

D-4420

7.25

F-198

5.25

J-558

8.50

The Customers table contains only information about customers. Note that each record has a unique ID (Customer ID). The Customer ID value will be used to connect customer information to the Orders table.

Customers

Customer ID

Name

Street Address

City

State

ZIP Code

Phone

1001

Contoso, Ltd.

2302 Harvard Ave

Bellevue

WA

98227

425-555-0222

1003

Adventure Works

1025 Columbia Circle

Kirkland

WA

98234

425-555-0185

1005

Fourth Coffee

7007 Cornell St

Redmond

WA

98199

425-555-0201

The Orders table contains information about orders, salespersons, customers, and products. Note that each record has a unique ID (Order ID). Some of the information in this table needs to be split into an additional table that contains order details so that the Orders table contains only four columns — the unique order ID, the order date, the salesperson ID, and the customer ID. The table shown here has not yet been split into the Order Details table.

Orders

Order ID

Order Date

SalesPerson ID

Customer ID

Product ID

Qty

2348

3/2/09

101

1001

J-558

4

2348

3/2/09

101

1001

B-205

2

2348

3/2/09

101

1001

D-4420

5

2349

3/4/09

101

1005

C-789

3

2349

3/4/09

101

1005

C-795

6

2350

3/4/09

103

1003

A-2275

2

2350

3/4/09

103

1003

F-198

6

2350

3/4/09

103

1003

B-205

1

2351

3/4/09

105

1001

C-795

6

2352

3/5/09

105

1003

A-2275

2

2352

3/5/09

105

1003

D-4420

3

2353

3/7/09

107

1005

A-2275

6

2353

3/7/09

107

1005

C-789

5

2354

3/7/09

109

1001

A-2275

3

2355

3/8/09

103

1003

D-4420

4

2355

3/8/09

103

1003

C-795

3

2356

3/10/09

101

1001

C-789

5

Order details, such as the product ID and quantity are moved out of the Orders table and stored in a table named Order Details. Keep in mind that there are 9 orders, so it makes sense that there are 9 records in this table. Note that the Orders table has a unique ID (Order ID), which will referred to from the Order Details table.

The final design of the Orders table should look like the following:

Orders

Order ID

Order Date

SalesPerson ID

Customer ID

2348

3/2/09

101

1001

2349

3/4/09

101

1005

2350

3/4/09

103

1003

2351

3/4/09

105

1001

2352

3/5/09

105

1003

2353

3/7/09

107

1005

2354

3/7/09

109

1001

2355

3/8/09

103

1003

2356

3/10/09

101

1001

The Order Details table contains no columns that require unique values (that is, there is no primary key), so it is okay for any or all columns to contain "redundant" data. However, no two records in this table should be completely identical (this rule applies to any table in a database). In this table, there should be 17 records — each corresponding to a product in an individual order. For example, in order 2349, three C-789 products comprise one of the two parts of the entire order.

The Order Details table should, therefore, look like the following:

Order Details

Order ID

Product ID

Qty

2348

J-558

4

2348

B-205

2

2348

D-4420

5

2349

C-789

3

2349

C-795

6

2350

A-2275

2

2350

F-198

6

2350

B-205

1

2351

C-795

6

2352

A-2275

2

2352

D-4420

3

2353

A-2275

6

2353

C-789

5

2354

A-2275

3

2355

D-4420

4

2355

C-795

3

2356

C-789

5

Copying and pasting data from Excel into Access

Now that the information about salespersons, customers, products, orders, and order details has been broken out into separate subjects in Excel, you can copy that data directly into Access, where it will become tables.

Watch the following demo to learn how to copy your Excel data into new tables in Access.

Show me Watch this

                             Video created by Office Online staff writers

Creating relationships between the Access tables and running a query

After you have moved your data to Access, you can create relationships between tables and then create queries to return information about various subjects. For example, you can create a query that returns the Order ID and the names of the salespersons for orders entered between 3/05/09 and 3/08/09.

In addition, you can create forms and reports to make data entry and sales analysis easier.

Watch the following demo to learn how to create relationships in Access and build and run a query.

Show me

Show me Watch this
Video created by Office Online staff writers

Top of Page

Applies To: Excel 2007



Was this information helpful?

Yes No

How can we improve it?

255 characters remaining

To protect your privacy, please do not include contact information in your feedback. Review our privacy policy.

Thank you for your feedback!

Support resources

Change language