What is the Latest changes to extracts in tableau?

Extracts in the web

Beginning with version 2020.4, extracts are available in web authoring and content server. Now, you no longer have to use Tableau Desktop to extract your data sources. For more information, see Create Extracts on the Web.

Logical and physical table extracts

With the introduction of logical tables and physical tables in the Tableau data model in version 2020.2, extract storage options have changed from Single Table and Multiple Tables, to Logical Tables and Physical Tables. These options better describe how extracts will be stored. For more information, see Decide how the extract data should be stored.

Beginning with version 10.5, when you create a new extract it uses the .hyper format. Extracts in the .hyper format take advantage of the improved data engine, which supports faster analytical and query performance for larger data sets.

Similarly, when an extract-related task is performed on a .tde extract using version 10.5 and later, the extract is upgraded to a .hyper extract. After a .tde extract is upgraded to a .hyper extract, it can’t be reverted back to .tde extract. For more information, see Extract Upgrade to .hyper Format.

Changes to values and marks in the view

To improve extract efficiency and scalability, values in extracts can be computed differently in versions 10.5 and later compared to versions 10.4 and earlier. Changes to how the values are computed can affect the way marks in your view are populated. In some rare cases, the changes can cause your view to change shape or become blank. These changes can also apply to multi-connection data sources, data sources that use live connections to filed-based data, data sources that connect to Google Sheets data, cloud-based data sources, extract-only data sources, and WDC data sources.

To get an idea of some of the differences you might see in your view using version 2021.2, see the sections below.

Format of date and date time values

In versions 10.5 and later, extracts are subject to more consistent and stricter rules around how date strings are interpreted through the DATE, DATETIME, and DATEPARSE functions. This affects how dates are parsed, or the date formats and patterns that are allowed for these functions. More specifically, the rules can be generalized as the following:

  1. Dates are evaluated and then parsed by column, not by row.
  2. Dates are evaluated and then parsed based on the locale of where the workbook was created, not on locale of the computer where the workbook is opened.

These new rules allow extracts to be more efficient and to produce results that are consistent with commercial databases.

However, because of these rules, particularly in international scenarios where the workbook is created in a locale different from the locale that the workbook is opened in or the server that the workbook is published to, you might see that 1.) date and datetime values change to different date and datetime values or 2.) date and datetime values change to Null. When your date and datetime values change to different date and datetime values or become Null, it’s often an indication that there are issues with the underlying data.

Here are some common reasons why you might see changes to your date and datetime values in your extract data source using version 10.5 and later.

Common causes of changes to date/datetime values Common causes of null values
* When a function has to parse multiple date formats in a single column. Where the date is ambiguous and can be interpreted in several different ways, the date will be interpreted based on the format Tableau has determined for that column. For some examples, see Date scenario 1 and Date scenario 2 below.
  • When a function has to parse a YYYY-MM-DD (ISO) format. For an example, see Date scenario 3.
  • When a function doesn’t have enough information to derive the time, it can interpret a value as “00:00:00.0”, using “0” for hour, minute, second, and millisecond.
  • When a function doesn’t have enough information to derive the day, it can interpret a value as “1” or “January” for month.
  • When a function parses years, it is interpreted as the following:
    • Year “07” is interpreted as “2007”
    • Year “17” is interpreted as “2017.”
    • Year “30” is interpreted as “2030.”
    • Year “69” interpreted as “2069.”
    • Year “70” is interpreted as “1970.”|* When a function has to parse multiple date formats in a single column. After Tableau determines the date format, all other dates in the column that deviate from the format become null values. For some examples, see Date scenario 1 and Date scenario 2 below.
  • When a function has to parse a YYYY-MM-DD (ISO) format. Values that exceed what is allowed for “YYYY,” or “MM,” or “DD” cause null values. For an example, see Date scenario 3.
  • When a function has to parse date values that contain trailing characters. For example, time zone and daylight savings suffixes and keywords, such as “midnight” cause null values.
  • When a function has to parse an invalid date or time. For example, 32/3/2012 causes a null value. In another example, 25:01:61 causes a null value.
  • When a function has to parse contradicting inputs. For example, suppose the pattern is ‘dd.MM (MMMM) y’ and the input string is ‘1.09 (August) 2017’, where both “9” and “August” are months. The result is a null value because the month values are not the same.
  • When a function has to parse contradicting patterns. For example, a pattern that specifies a mix of Gregorian year (y) and ISO week (ww) causes null values.|
Date scenario 1

Suppose you have a workbook created in an English locale that uses .tde extract data source. The table below shows a column of string data contained in the extract data source.

10/31/2018
31/10/2018
12/10/2018

Based on the particular English locale, the format of the date column was determined to follow the MDY (month, day, and year) format. The following tables show what Tableau displays based on this locale when the DATE function is used to convert string values into date values.

October 31, 2018
October 31, 2018
December 10, 2018

If the extract is opened in a German locale, you see the following:

31 Oktober 2018
31 Oktober 2018
12 Oktober 2018

However, after the extract is opened in a German locale using version 10.5 and later, the DMY (day, month, and year) format of the German locale is strictly enforced and causes a Null value because one of the values doesn’t follow DMY format.

Null
October 31, 2018
October 12, 2018

Date scenario 2

Suppose you have another workbook created in an English locale that uses a .tde extract data source. The table below shows a column of numeric date data contained in the extract data source.

1112018
1212018
1312018
1412018

Based on the particular English locale, the format of the date column was determined to follow the MDY (month, day, and year) format. The following tables show what Tableau displays based on this locale when the DATE function is used to convert the numeric values into date values.

11/1/2018
12/1/2018
Null
Null

Date scenario 3

Suppose you have a workbook that uses a .tde extract data source. The table below shows a column of string data contained in the extract data source.

2018-10-31
2018-31-10
2018-12-10
2018-10-12

Because the date uses the ISO format, the date column always follows the YYYY-MM-DD format. The following tables show what Tableau displays when the DATE function is used to convert string values into date values.

October 10, 2018
Null
December 10, 2018
October 12, 2018

Note: In versions 10.4 (and earlier), ISO format and other date formats could have produced differing results depending on the locale of where the workbook was created. In an English locale for example, both 2018-12-10 and 2018/12/10 could produce December 12, 2o18. However, in a German locale 2018-12-10 could produce December 12, 2018 and 2018/12/10 could produce October 12, 2018.

Sort order and case sensitivity

Extracts have collation support and therefore can more appropriately sort string values that have accents or are cased differently.

For example, suppose you have a table of string values. In terms of sort order, this means that a string value like Égypte is now appropriately listed after Estonie and before Fidji.

About Excel data:

With regard to casing, this means that how Tableau stores values have changed between version 10.4 (and earlier) and version 10.5 (and later). However, the rules for sorting and comparing values haven’t. In version 10.4 (and earlier), string values like “House,” “HOUSE,” and “houSe” are treated the same and stored with one representative value. In version 10.5 (and later), the same string values are considered unique and therefore stored as individual values. For more information, see Changes to the way values are computed.

Breaking ties in Top N queries

When a Top N query in your extract produces duplicate values for a specific position in a rank, the position that breaks the tie can be different when using version 10.5 and later. For example, suppose you create a top 3 filter. Positions 3, 4, and 5 have the same values. When using version 10.4 and earlier, the top filter can return 1, 2, and 3 positions. However, when using version 10.5 and later, the top filter can return 1, 2, and 5 positions.

Precision of floating-point values

Extracts are better at taking advantage of the available hardware resources on a computer and therefore able to perform mathematical operations in a highly parallel way. Because of this, real numbers can be aggregated by .hyper extracts in different order. When numbers are aggregated in different order, you might see different values in your view after the decimal point each time the aggregation is computed. This is because floating-point addition and multiplication is not necessarily associative. That is, (a + b) + c is not necessarily the same as a + (b + c). Also, real numbers can be aggregated in different order because floating-point multiplication is not necessarily distributive. That is, (a x b) x c is not necessarily the same as a x b x c. This type of floating-point rounding behavior in .hyper extracts resemble that of floating-point rounding behavior in commercial databases.

For example, suppose your workbook contains a slider filter on an aggregated field comprised of floating point values. Because the precision of floating-point values have changed, the filter might now exclude a mark that defines the upper or lower bound of the filter range. The absence of these numbers could cause a blank view. To resolve this issue, move the slider on the filter or remove and add the filter again.

Accuracy of aggregations

Extracts optimize for large data sets by taking better advantage of the available hardware resources on a computer and therefore able to compute aggregations in a highly parallel way. Because of this, aggregations performed by .hyper extracts can resemble the results from commercial databases more than the results from software that specializes in statistical computations. If you’re working with a small data set or need a higher level of accuracy, consider performing aggregations through reference lines, summary card statistics, or table calculation functions like variance, standard deviation, correlation, or covariance.

About the Compute Calculations Now option for extracts

If the Compute Calculations Now option was used in a .tde extract using an earlier version of Tableau Desktop, certain calculated fields were materialized and therefore computed in advance and stored in the extract. If you upgrade the extract from a .tde extract to a .hyper extract, the previously materialized calculations in your extract are not included. You must use the Compute Calculations Now option again to ensure that materialized calculations are a part of the extract after the extract upgrade. For more information, see Materialize Calculations in Your Extracts.

New Extract API

You can use the Extract API 2.0 to create .hyper extracts. For tasks that you previously performed using the Tableau SDK, such as publishing extracts, you can use the Tableau Server REST API or the Tableau Server Client (Python) library. For refresh tasks, you can use the Tableau Server REST API as well. For more information, see Tableau Hyper API.