03 Jun 2020 Data Lineage in Power BI Service
1. What is Data Lineage?
In modern BI projects, understanding how data flows from its source to its destination is a challenge, and an even greater one in advanced analytical projects that span multiple data sources, artifacts, and dependencies. Questions like “What happens if I change this data?” or “Why isn’t this report up to date?” can be hard to answer and may require a team of experts. Microsoft designed the data lineage view in Power BI Service to help users answer these questions.
Data Lineage was introduced to Power BI Service in 2018, and since then it has undergone multiple enhancements.
Power BI Service has several artifact types, such as dashboards, reports, datasets, and dataflows. Many datasets and dataflows connect to external data sources such as SQL Server, or to external datasets in other workspaces, which makes it harder to know where the data comes from. Lineage view shows the lineage relationships between all the artifacts in a workspace, including dataflows, together with all their external dependencies, both upstream and downstream.
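The upstream and downstream relationships described above can be thought of as a directed graph. The following is a minimal sketch of that idea, not how Power BI Service implements it; the `LineageGraph` class and all artifact names are hypothetical and used only for illustration.

```python
from collections import defaultdict, deque


class LineageGraph:
    """Directed graph in which an edge (a, b) means data flows from a to b."""

    def __init__(self):
        self._downstream = defaultdict(set)  # artifact -> direct consumers
        self._upstream = defaultdict(set)    # artifact -> direct sources

    def add_flow(self, source, target):
        """Record that `target` gets its data from `source`."""
        self._downstream[source].add(target)
        self._upstream[target].add(source)

    @staticmethod
    def _reach(start, neighbours):
        """Breadth-first walk returning every artifact reachable from `start`."""
        seen = set()
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbours.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    def downstream_of(self, artifact):
        """Everything that could be affected if `artifact` changes."""
        return self._reach(artifact, self._downstream)

    def upstream_of(self, artifact):
        """Everything that `artifact` depends on, directly or indirectly."""
        return self._reach(artifact, self._upstream)


# Hypothetical workspace: source -> dataflow -> dataset -> report -> dashboard
graph = LineageGraph()
graph.add_flow("sql:SalesDb", "dataflow:Sales Staging")
graph.add_flow("dataflow:Sales Staging", "dataset:Sales Model")
graph.add_flow("dataset:Sales Model", "report:Sales Overview")
graph.add_flow("report:Sales Overview", "dashboard:Executive Summary")

impacted = sorted(graph.downstream_of("dataflow:Sales Staging"))
sources = sorted(graph.upstream_of("report:Sales Overview"))
```

Here `impacted` answers “What happens if I change this data?” for the staging dataflow, and `sources` lists the places to check when asking “Why isn’t this report up to date?”.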
2. Data Lineage View
Data Lineage can be viewed from a workspace in Power BI Service. All workspaces, both new and classic, offer Data Lineage view for their artifacts by default. To view the data lineage in a workspace, a user must have at least the “Contributor” role.
The concept of dataflows is significant in the Data Lineage view in Power BI Service workspaces. In addition to seeing the dataflows and the dependencies between them, you can click on any dataflow to see the entities it contains; you can also jump directly to editing the dataflow from this view. Similarly, selecting a dataset shows the tables it contains, and selecting a dashboard or report takes the user directly to the visualization. But the real power of this view comes from the relationships between objects, where Data Lineage really comes to the forefront.
Figure 1: Generalized Data Lineage view
Figure 2: Data Lineage view of a dataflow
3. Exploring Data Lineage View
We are going to explore the various workspace artifacts used in the Data Lineage view and how the data flows from one artifact to another.
3.1. Data Sources
The first card in the Data Lineage view shows the data sources from which the datasets and dataflows get their data. The data source cards show additional information that helps identify the source. For example, for an Azure SQL server you also see the database name.
Figure 3: Data source card in Data Lineage view
3.2. Data Gateways
If a data source is connected via an on-premises gateway, the gateway information is added to the data source card. If you have permissions, either as a gateway administrator or as a data source user, you see more information, such as the gateway name.
Figure 4: Data source card with Gateway details in Data Lineage view
3.3. Datasets and Dataflows
On datasets and dataflows, you see the last refresh time, as well as whether the dataset or dataflow is certified or promoted.
Figure 5: Data source card with last update time and data category type (Promoted / Certified)
3.4. Cross-workspace relationships
When a report in one workspace is built on a dataset in another workspace, the data source card displays the name of that other workspace in its details.
3.5. More Artifact Details
To see more information about an artifact, click the ellipsis and select the information you are looking for. To see more of the metadata of an artifact, click the artifact itself; the details appear in the side pane of the window.
Figure 6: Artifact metadata details displayed in the side pane
4. Getting started with Data Lineage view
- Every workspace, whether new or classic, automatically has a lineage view, except ‘My Workspace’.
- To access lineage view, go to the workspace list view, select the arrow next to List view, and select Lineage view.
Figure 7: Artifact metadata details displayed in the side pane
Figure 8: Data Lineage page preview
Conclusion
- A Power BI Pro license is required to see Data Lineage view.
- Data Lineage view is available only to users with access to the workspace. Users must have an admin, member, or contributor role in the workspace; users with a viewer role can’t switch to Data Lineage view.
- One limitation of Data Lineage view is that it cannot be displayed in Internet Explorer.
- Users can build their own Data Lineage view using the Power BI REST APIs.
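As a rough illustration of the last point, the sketch below derives lineage edges for one workspace from two Power BI REST API responses. It is a starting point under stated assumptions, not a complete lineage view: it assumes the `groups/{groupId}/reports` and `groups/{groupId}/datasets/upstreamDataflows` endpoints return a `value` list with the field names used here (`id`, `datasetId`, `datasetObjectId`, `dataflowObjectId`), which should be checked against the current API reference. The helper names `fetch_json`, `lineage_edges`, and `workspace_lineage` are hypothetical, and acquiring the Azure AD access token is left out.

```python
import json
import urllib.request

API_ROOT = "https://api.powerbi.com/v1.0/myorg"


def fetch_json(path, access_token):
    """GET a Power BI REST endpoint and return the parsed JSON body."""
    request = urllib.request.Request(
        f"{API_ROOT}/{path}",
        headers={"Authorization": f"Bearer {access_token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def lineage_edges(reports, dataflow_links):
    """Turn two API payloads into (upstream, downstream) pairs.

    Assumed payload shapes (verify against the API reference):
      reports:        {"value": [{"id": ..., "datasetId": ...}, ...]}
      dataflow_links: {"value": [{"datasetObjectId": ...,
                                  "dataflowObjectId": ...}, ...]}
    """
    edges = []
    for report in reports.get("value", []):
        dataset_id = report.get("datasetId")
        if dataset_id:  # skip reports that carry no dataset reference
            edges.append((f"dataset:{dataset_id}", f"report:{report['id']}"))
    for link in dataflow_links.get("value", []):
        edges.append(
            (f"dataflow:{link['dataflowObjectId']}",
             f"dataset:{link['datasetObjectId']}")
        )
    return edges


def workspace_lineage(group_id, access_token):
    """Fetch both payloads for one workspace and return its lineage edges."""
    reports = fetch_json(f"groups/{group_id}/reports", access_token)
    links = fetch_json(
        f"groups/{group_id}/datasets/upstreamDataflows", access_token
    )
    return lineage_edges(reports, links)
```

Calling `workspace_lineage(group_id, access_token)` with a real workspace ID and token returns a list of edges that can be fed into any graph or diagramming tool. Keeping `lineage_edges` separate from the network calls makes the mapping easy to test with saved responses.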