Different Connectivity Modes in Power BI

Understanding Power BI’s various connectivity modes is essential to realise its full potential: whether you use Power BI Desktop or the Power BI service, how you connect to your data sources has a significant impact on your analytical capabilities.

 

In this article, we’ll look at the different connectivity modes in Power BI, how they work, and when they’re most useful. From the versatile Import mode to the real-time access provided by DirectQuery through the flexibility of Composite mode, we’ll walk you through the different options so you can make informed decisions based on your specific requirements.

 

But before we get into the modes, let’s look at the semantic models. These models are more than just structures – they are a method of organising data in a logical and meaningful manner. Semantic models, which abstract and transform physical database objects into logical dimensions, lay the groundwork for effective analysis in Power BI.

 

Once we’ve understood the concepts behind the semantic models, we’ll look at Power BI’s four primary connectivity modes:

  • Import: Versatile and efficient, suitable for most scenarios.
  • DirectQuery: Ideal for real-time access but with certain limitations.
  • Composite: Offering the best of both worlds by combining Import and DirectQuery.
  • Live Connection: Facilitating direct access to data sources without importing them into Power BI.

 

Furthermore, Power BI offers a wide range of data sources to connect to, catering to a variety of user needs. Power BI integrates seamlessly with online services like Salesforce and Dynamics 365, databases like SQL Server and Amazon Redshift, as well as simple files in Excel and JSON formats. What’s more, you can connect to other data sources such as Spark, websites, and Microsoft Exchange, broadening the scope of your analytical efforts.

 

Let’s get down to the intricacies of Power BI’s connectivity modes!

 

Figure 1: Power BI connection types

 

 

Import Mode

 

The most common connectivity for Power BI is Import mode, where Power BI Desktop stores the data inside Power BI cache.

 

If the data size is less than 1 GB, or if the data is not continually changing, then we can use Import mode. All the data’s interactions and filters will be applied to this compressed cache source rather than the original data source.

 

To import the data from our source, follow the steps in the screenshot below (these steps may vary depending on the data source, be it Excel files, SQL databases, or queries).

 

Screenshot of Setting up an Import mode dataset

Figure 2: Setting up an Import mode dataset

Use Cases

Before diving deeper into Import mode and its characteristics, we need to explore its most common use cases, which depend on the needs of the user and the project.

 

The most common use cases are:

  • The need for high-performance data visualisations.
  • The dataset is not extremely large or complex.
  • Data needs to be transformed or enriched before analysis.
  • The need for offline access to the reports.

 

Advantages

Using this mode provides several significant advantages. First, its in-memory querying enables exceptionally fast performance, allowing for swift query execution. You also have full access to all M and DAX functions, enabling extensive data manipulation and calculation capabilities. The mode also allows for a wide range of field formatting options and offers complete freedom in data modelling, regardless of the data source types or variations.

 

Characteristics

As we already know, Import mode is the most commonly used mode in Power BI, boasting a lot of distinguishing characteristics:

  • In Import mode, Power BI imports data from the data source into its own internal model. This means that data is copied and stored within the Power BI file (.pbix).
  • It can perform data transformation tasks during the import process using Power Query, a powerful ETL tool integrated into Power BI Desktop. This allows us to clean, reshape and enrich the data before it’s stored in Power BI.
  • It delivers optimal performance for data visualisations since the data is preloaded into memory, ensuring a fast response when interacting with visualisations. It also supports the execution of complex calculations with ease.
  • The amount of data you can import is limited by the machine’s memory and your licensing tier (i.e. Power BI Pro or Power BI Premium).
  • Import mode enables offline access to your data, meaning you can create, modify and refresh reports even when you’re not connected to the data source.
  • You can ensure data security and compliance through Power BI’s security settings and access controls.
  • The data in Import models is only as up-to-date as the latest refresh.

 

Note: Power BI Pro has limitations on the size of datasets you can publish to the cloud, while Power BI Premium offers more scalability.

 

The figure below shows the different development modes and sources for Import mode:

Figure 3: Different development modes for Import mode

 

Disadvantages

We have already explored the advantages and characteristics of Import mode, but what about its disadvantages? While Import mode can indeed offer fast execution, it does come with some drawbacks. One significant issue is the potential for resource pressure, as the entire model must be loaded into memory before any querying can take place, which can strain available system resources and impact overall capacity.

 

Additionally, the refresh frequency of models in Import mode may pose a challenge, as they require scheduled refreshes to ensure that the data remains up-to-date.

 

 

DirectQuery

As the name implies, DirectQuery is a method of retrieving data directly from the data source. It allows you to directly connect with a live database and it does not load or copy data into the Power BI model: data comes directly from the data source. The dashboard queries the data source in real time during runtime, meaning that every single request is sent directly to the data source, which then pulls the relevant data into Power BI.

 

Use Cases

When you create Power BI reports, the users expect the reports to be refreshed periodically and the data to be constantly updated. This can be done manually, but DirectQuery offers a more efficient solution: scheduled refresh. With DirectQuery, you can connect directly to a dataset, allowing your project to access live data.

 

The most common use cases are when:

  • Real-time or near-real-time access to data is a requirement.
  • The dataset is too large to import into Power BI.
  • The data is already well-prepared in the data source.
  • Security policies necessitate enforcing data security at the source level.

 

However, DirectQuery is only supported by a limited number of data sources, including Amazon Redshift, Azure HDInsight Spark, Azure SQL Database, Azure SQL Data Warehouse (now Azure Synapse Analytics), IBM Netezza, Impala, Oracle Database, SAP Business Warehouse, SAP HANA, Snowflake, Spark, SQL Server, and Teradata Database.

 

Below we can see how to set up DirectQuery:

 

Setting up DirectQuery in Power BI

Figure 4: Setting up DirectQuery in Power BI

Advantages

Accessing data directly from the source offers several advantages. This approach eliminates the size limitation, as no data is stored in the Power BI file, ensuring you never encounter issues related to data volume. It also reduces both cache usage and storage requirements for the file.

 

However, it’s important to note that changes to the underlying data are not immediately reflected in existing visuals and a refresh is still necessary. Power BI Desktop will resend the required queries for each visual and update them accordingly.

 

The figure below shows the development mode and the supported data sources for DirectQuery:

Figure 5: Development mode and supported data sources for DirectQuery

 

Characteristics

Let’s now explore the different characteristics of DirectQuery mode:

  • DirectQuery mode establishes a live connection to the data source, such as an SQL database or analysis services. Instead of importing data, Power BI sends queries directly to the data source whenever a user interacts with a visualisation.
  • Data is accessed from the source in real time, ensuring that the reports reflect the most up-to-date information.
  • Data transformation is performed within the data source itself, not within Power BI, meaning that the data should be made suitable for analysis before using DirectQuery.
  • DirectQuery mode can handle very large datasets since it doesn’t store data internally.
  • It’s important to ensure that your data source has appropriate access controls and security measures in place.

 

Disadvantages

While it’s true that DirectQuery mode offers several advantages, it also comes with some drawbacks. This mode provides limited functionality, meaning fewer Power Query operations are available, particularly in terms of visualisations. Neither does DirectQuery offer the full range of Power BI features, as it only provides two tabs in Power BI Desktop: Report and Relationship. Although you can modify relationships in this mode, it may be a slower connection compared to other types of connections. Finally, performance issues may arise, especially with complex calculations or aggregations.

 

Note: When you open a dashboard, the tiles reflect the data at the time of the last refresh, not necessarily the latest changes made to the underlying source. You can refresh an open dashboard to ensure that it’s up-to-date.

 

Below we can see the model structure for DirectQuery:

Figure 6: Model structure for DirectQuery

 

 

Composite Mode

 

Now that we’ve covered Import mode and DirectQuery mode, let’s explore the one that combines the best of both: Composite mode, which allows you to mix and integrate multiple DirectQuery data sources within a single report. This feature enables a report to have two or more data connections, meaning greater flexibility. With Composite models, you can combine data from multiple DirectQuery sources and also mix data from both DirectQuery and Import data sources.

 

Here we can see the different dataset sources for the dual mode:

 

Figure 7: Development modes and supported data sources for the dual mode

Use Case

Composite models are typically used in the following scenarios:

  • Performance Enhancement: Boost performance by setting up appropriate storage modes for each table.
  • Combining DirectQuery and Import Data: Seamlessly combine different data sources.
  • Linking to Enterprise Semantic Models: Connect and leverage enterprise-level semantic models.

 

Advantages

Combining two connectivity modes naturally brings significant advantages. Composite mode offers the best of both Import and DirectQuery modes. Let’s take a look at the benefits of using Composite models in Power BI:

  • Fast Performance: Import-mode partitions provide fast query responses.
  • Real-Time Data Updates: DirectQuery requests fetch the latest data changes from the data source.
  • Resource Utilisation: Fewer data refreshes are needed, especially when working with large datasets.
  • Many-to-Many Relationships: Enables complex data modelling.
  • Combining Data Sources: Can combine data from different source groups.

 

Note: Once you’ve chosen Import mode, you can’t go back to DirectQuery.

 

Note: Import mode is generally faster than DirectQuery because all data is retrieved from the Power BI Desktop cache.

 

In the figure below, we can see how different data sources are represented in Power BI across various modes:

 

Composite mode in Power BI

Figure 8: Composite mode in Power BI

 

Table Storage Modes

In Composite models, you can specify the storage mode for each table depending on factors such as usage, data size, and other considerations. The available storage modes are:

  • DirectQuery: Ideal for tables with large data volumes or those which need near-real-time results. Data will never be imported into these tables.
  • Import: Great for tables that aren’t used for filtering and grouping fact tables. It’s the only option for tables based on sources that do not support DirectQuery mode.
  • Dual: Typically used for dimension-type tables that might be queried together with DirectQuery fact-type tables from the same source.
  • Hybrid: This mode allows you to add both an import partition and a DirectQuery partition to a fact table, capture the latest data changes in real time, or provide fast access to frequently used data through import partitions, leaving infrequently used data in the data warehouse.

 

Disadvantages

Using a mixed connectivity type in Power BI does limit some capabilities. Tables in the Power BI report that use DirectQuery still face limitations with time-intelligence DAX functions and many of the data transformations in Power Query.

 

Hybrid Tables

The most used source in Composite mode is the hybrid table, a large table that combines both Import mode partitions and a DirectQuery partition.

The import-mode partitions deliver very fast query performance, while the DirectQuery partition ensures that the latest data updates are included.

Example of a hybrid table

Figure 9: Example of a hybrid table

The advantages of using a hybrid table are:

  • Fast Performance: Import-mode partitions provide fast query responses.
  • Real-Time Data Updates: DirectQuery ensures that the latest data changes are fetched from the data source.
  • Resource Utilisation: Fewer data refreshes are needed, especially for large datasets.

 

In the example below, we can see that in Composite mode, the tables originating from a DirectQuery are highlighted in light blue, which means that the data is real-time. The tables shown in dark blue are loaded using Import mode.

 

Figure 10: Relation tab in Composite mode

 

Power BI Queries in Composite Mode

In Composite mode, we can use different queries to retrieve data, depending on the result we want to achieve:

  • Queries using only Import or Dual table(s): Power BI retrieves all data from the model cache. This scenario is common for dimension-type tables queried by filters or slicer visuals.
  • Queries using Dual table(s) or DirectQuery table(s) from the same source: Power BI retrieves all data by sending one or more native queries to the DirectQuery source.
  • Queries using Dual table(s) or Hybrid table(s) from the same source: This scenario is a combination of the previous two. Power BI retrieves data from the model cache when available and sends queries to the DirectQuery source when necessary.
  • All other queries: These queries involve cross-source group relationships, such as when an Import table relates to a DirectQuery table or when a Dual table relates to a DirectQuery table from a different source.

 

 

 

Live Connection

 

In Live Connection, only the connection string is retrieved. Neither the data nor the schema is brought into Power BI. As a result, we cannot make any changes to the data model within Power BI.

 

Live Connection and DirectQuery might seem similar because neither stores data in the Power BI model, but they are fundamentally different connectivity types and cannot be used interchangeably.

 

Figure 11: SQL Server Analysis Services Tabular as the source for Live Connection

 

Using Live Connection means that no data is stored in the Power BI model, so all interactions with a report will directly query the existing Analysis Services model.

 

Note: There are some limitations regarding DAX measures in Live Connection. Only Report Level Measures are available, meaning these measures are stored within the Power BI report and are not written back to the Analysis Services model. However, all reporting capabilities in Power BI remain available when using Live Connection.

 

Can Live Connection be Used in a Composite Connectivity Type?

Live Connection establishes a connection to a semantic layer such as Analysis Services or a Power BI dataset. In an enterprise scenario, having a semantic layer between your data source systems and Power BI is crucial, as it simplifies the data in the data warehouse, making it more accessible and usable for the business.

 

This approach aligns complex data with familiar business terms, removing the need for additional data preparation. As a result, it offers a complete and consolidated view across the organisation:

 

Figure 12: Live Connection mode in Power BI

 

To conclude, we cannot use Live Connection to a centralised version of the truth and extend it with our own data through an Import connectivity type. However, this capability has been frequently requested by the Power BI community.

 

 

Direct Lake

 

Direct Lake mode allows you to analyse data directly from a data lake without the need to query a lakehouse or warehouse endpoint. It eliminates the need to import or duplicate data into a Power BI model.

 

his mode is based on loading Parquet-formatted files (a crucial component in the world of big data storage and analytics, playing a key role in data lakes by improving performance when working with large datasets) directly from a data lake.

 

Using OneLake to connect to Direct Lake in Power BI

Figure 13: Using OneLake to connect to Direct Lake in Power BI

 

Before we dive deeper into this connectivity mode, let’s look at the difference between a lakehouse and a data lake. A data lake is a storage repository that holds vast amounts of raw data in its native format, such as logs, images, and sensor data. It’s designed for scalability and flexibility, allowing data to be ingested without prior transformation. In contrast, a lakehouse combines the benefits of a data lake with structured data management. It organises data into tables and schemas, making it easier to query and analyse using tools like SQL. Essentially, a lakehouse bridges the gap between raw data lakes and traditional data warehouses.

 

Use case

Direct Lake is ideal for analysing both very large models and models with frequent updates at the data source:

 

Creating a Lakehouse in data services

Figure 14: Creating a Lakehouse in data services

 

Note: Direct Lake is supported only on Power BI Premium P and Microsoft Fabric F SKUs.

 

Lakehouse Requirements

Before using Direct Lake, you must provision a lakehouse (or warehouse) with one or more Delta tables in a workspace hosted on a supported Power BI or Microsoft Fabric capacity. The lakehouse provides the storage location for your Parquet-formatted files in OneLake, and also serves as an access point to launch the web modelling feature to create a Direct Lake model.

 

The following prerequisites are needed to be able to connect Power BI to Databricks:

 

  1. Power BI Desktop 2.85.681.0 or above.
  2. Access token for Databricks or AAD access to Databricks.
  3. An Azure Databricks cluster or Databricks SQL warehouse.

 

Previously, a data warehouse was needed to expose your data in the data lake to front-end tools like Power BI. This often resulted in data duplication, as illustrated in the schema below:

 

Figure 15: Requirements for a Power BI lakehouse

 

SQL Endpoint

When provisioning a lakehouse, an SQL endpoint for SQL querying and a default model for reporting are automatically created and updated with any tables added to the lakehouse. Although Direct Lake mode doesn’t query the SQL endpoint when loading data directly from OneLake, the SQL endpoint is necessary when a Direct Lake model must seamlessly fall back to DirectQuery mode. This fallback occurs when the data source uses specific features, such as advanced security or views, that Direct Lake cannot process. Direct Lake mode also queries the SQL endpoint for schema- and security-related information.

Data Warehouse

As an alternative to a lakehouse with an SQL endpoint, you can provision a warehouse and add tables using SQL statements or data pipelines. The procedure for provisioning a standalone data warehouse is almost identical to that of a lakehouse.

 

The Difference Between the Power BI Connectivity Modes

 

Now that we’ve reviewed each connectivity mode, let’s summarise and identify the differences between them. The table below highlights the distinctions between the four connectivity modes we’ve covered:

 

Figure 16: The differences between Connectivity modes in Power BI

 

The Difference Between Real-Time, Live & DirectQuery

In a real-time dashboard, new data rows can be ingested from various methods or sources, such as Power BI’s REST API, Stream Analytics, or other streaming services like PubNub. The main difference is that a real-time dataset uploads data directly without any user interaction, whereas Live Connection refreshes data whenever requested.

 

Can we use Live Connection with a Composite Connectivity Type?

Using Live Connection is quite common in enterprise scenarios. Live Connection establishes a direct connection to a semantic layer, such as Analysis Services or a Power BI dataset. In an enterprise setting, having a semantic layer between your data source systems and Power BI is imperative, as it simplifies the data in the data warehouse, making it more accessible and usable for the business.

 

However, it is currently not possible to use Live Connection to centralise your data through a Composite connectivity type. This functionality has been repeatedly requested by the Power BI community.

  

Direct Lake vs. Import Mode

To learn more about how Direct Lake in Microsoft Fabric performs as compared to Import Mode, you can watch our on-demand webinar on the topic.

 

 

Conclusion

 

In this article, we’ve explored and gained a solid understanding of all the available connectivity types that Power BI provides, including the newest and most up-to-date option, Direct Lake. We’ve also examined the advantages, disadvantages, and use cases of each connectivity mode to help determine which is most suitable, depending on the data.

 

The choice of data model implementation also depends on the specific needs of the project. To summarise the differences between the models we’ve seen:

  • In Import mode, Power BI loads all data into the data model.
  • In DirectQuery mode, Power BI only loads metadata, such as column headers, into the data model.
  • In Live Connection mode, Power BI connects to the data source in real time without loading any data into the data model, requiring the data model to be created directly within the data source.
  • In Direct Lake mode, Power BI accesses data stored in OneLake directly, without loading it into the model or relying on an SQL endpoint, making it ideal for large datasets with frequent updates.
  • Lastly, Composite mode allows you to combine Import and DirectQuery connections in a single model, offering flexibility by enabling both direct and cached data access within the same dataset.

 

Are you looking to optimise your Power BI connectivity and unlock the full potential of your data? Whether you need to understand the best mode for your specific use case, streamline your data processes, or explore advanced features like Direct Lake and Composite models, our team is here to help.

 

Monia L
monia.lahsoumi@clearpeaks.com