Calculated Columns in Power Pivot
With calculated columns, you can add new data to a table in your Power Pivot Data Model. But instead of pasting or importing values into the column, you create a Data Analysis Expressions (DAX)formula that defines the column values. For example, you want to add sales profit values to each row in a factSales table. By adding a new calculated column, and by using the formula =[SalesAmount]-[TotalCost]–[ReturnAmount], values are calculated by subtracting values from each row in the TotalCost and ReturnAmount columns from values in each row of the SalesAmount column. The Profit column can then be used in a PivotTable, PivotChart, or Power View report as you would any other column.
Calculated column in Power Pivot
Understanding Calculated Columns
The formulas in calculated columns are much like the formulas you create in Excel. Unlike in Excel, however, you cannot create different formulas for different rows in a table; instead, the DAX formula is automatically applied to the entire column.
When a column contains a formula, the value is computed for each row. The results are calculated for the column as soon as you enter the formula. Column values are then recalculated as necessary, such as when the underlying data is refreshed.
You can create calculated columns based on calculated fields and other calculated columns. For example, you might create one calculated column to extract a number from a string of text, and then use that number in another calculated column.
A calculated column can be based on data that you have added into an existing table. For example, you might choose to concatenate values, perform addition, extract substrings, or compare the values in other fields. To add a calculated column, you should have already added at least one table in Power Pivot.
Let’s look at the following formula;
Using the Contoso sample data, this formula extracts the month from the StartDate column in the Promotion table. It then calculates the end of the month value for each row in the Promotion table. The second parameter specifies the number of months before or after the month in StartDate; in this case, 0 means the same month. For example, if the value in the StartDate column is 6/1/2001, the value in the calculated column will be 6/30/2001.
For more information about the Contoso sample data, see Get sample data for DAX and Data Model tutorials.
Naming Calculated Columns
By default, new calculated columns are added to the right of other columns, and the column is automatically assigned the default name of CalculatedColumn1, CalculatedColumn2, and so forth. You can rearrange and rename columns after they are created. However, there are some restrictions on changes to calculated columns:
Each column name should be unique within a table.
Avoid names that have already been used for calculated fields within the same workbook. Although it is possible for a calculated field and a calculated column to have the same name, if names are not unique you can easily get calculation errors. To avoid accidentally invoking a calculated field, when referring to a column always use a fully qualified column reference.
When you rename a calculated column, any formulas that rely on the existing column must be updated. Unless you are in manual update mode, updating the results of formulas takes place automatically. However, this operation might take some time.
There are some characters that cannot be used within the names of columns, or in the names of other objects in Power Pivot. For more information, see "Naming Requirements" "in DAX Syntax Specification for Power Pivot.
To rename or edit an existing calculated column
Changing the Data Type
You can change the data type for a calculated column in the same way you can change the data type for non-calculated columns. You cannot make the following data-type changes: from text to decimal, from text to integer, from text to currency, and from text to date. You can make a change from text to Boolean.
Performance of Calculated Columns
The formula for a calculated column can be more resource-intensive than the formula used for a calculated field. One reason is that the result for a calculated column is always calculated for each row in a table, whereas a calculated field is only calculated for the cells that are used in the PivotTable or PivotChart.
For example, a table with a million rows will always have a calculated column with a million results, and a corresponding effect on performance. However, a PivotTable generally filters data by applying row and column headings; therefore, the calculated field is calculated only for the subset of data in each cell of the PivotTable.
A formula has dependencies on the objects that are referenced in the formula, such as other columns or expressions that evaluate values. For example, a calculated column that is based on another column, or a calculation that contains an expression with a column reference, cannot be evaluated until the other column is evaluated. By default, automatic refresh is enabled; therefore, all such dependencies can affect performance while values are updated and formulas refreshed.
To avoid performance issues when you create calculated columns, follow these guidelines:
Rather than create a single formula that contains many complex dependencies, create the formulas in steps, with results saved to columns, so that you can validate the results and assess performance.
Modification of data will often require that calculated columns be recalculated. You can prevent this by setting the recalculation mode to manual; however, if any values in the calculated column are incorrect the column will be grayed out, until you refresh and recalculate the data.
If you change or delete relationships between tables, formulas that use columns in those tables will become invalid.
If you create a formula that contains a circular or self-referencing dependency, an error will occur.
The following articles contain additional information about working with calculated columns.