How to Improve Data Quality in Power BI

Column quality, Column distribution, and Column profile

Data profiling feature in Power BI

In this post, you will learn about the Power BI data profiling tools (Column quality, Column distribution, and Column profile).

The data profiling tools give you a visual way to understand your data. With them you can inspect, clean, and transform data in Power Query Editor.

Column quality, Column distribution & Column profile are the main parts of the data profiling process.

The data profiling options are located in the Power Query Editor window. Follow these steps to enable them:

Open Power Query Editor > go to the View tab > check the following options under Data Preview:

Column quality, Column distribution & Column profile.

Enable data profiling

Note:

By default, Power Query performs data profiling over the first 1,000 rows of your data.

To profile the entire dataset, use the status message in the lower-left corner of the editor window to change how column profiling is performed.

Data profiling with 1000 rows
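
To see why the row limit matters, here is a minimal Python sketch (it is not Power BI code, and the column contents and the `empty_share` helper are made up for illustration). It compares the share of empty values in the first 1,000 rows with the share in the whole column:

```python
from itertools import islice

def empty_share(values, limit=None):
    """Return the fraction of empty (None) values, optionally over the first `limit` rows only."""
    rows = list(values if limit is None else islice(values, limit))
    if not rows:
        return 0.0
    return sum(v is None for v in rows) / len(rows)

# Hypothetical column: 1,000 clean rows followed by 500 rows with missing values.
column = ["Warehouse"] * 1000 + [None] * 500

top_1000 = empty_share(column, limit=1000)  # profile of the first 1,000 rows only
full = empty_share(column)                  # profile of the entire column

print(f"first 1,000 rows: {top_1000:.0%} empty")
print(f"entire dataset:   {full:.0%} empty")
```

In this made-up case the first 1,000 rows look perfectly clean, while a third of the full column is empty, which is the kind of issue that profiling the entire dataset would reveal.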

Understand data profiling in detail:

Monospaced:

Displays the data preview in a monospaced font. It changes the font family used in Power Query Editor.

Show Whitespace:

Shows whitespace and newline characters, if they exist in your data.

Data-profiling Show whitespace

Column quality:

Column quality classifies the values in each column as Valid, Error, or Empty, and displays the percentage of values in each category for every column of the selected table.

  • Valid: shown in green
  • Error: shown in red
  • Empty: shown in dark grey

Column-quality-data-profiling

Hovering over a column's quality bar shows the numerical distribution of the quality of its values. Selecting the ellipsis icon (…) opens quick action buttons for operations on the values.
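
The Valid, Error, and Empty percentages can be illustrated outside Power BI with a short Python sketch. This is only an approximation under stated assumptions: `None` stands in for an empty value, a value counts as an error when converting it to the expected type fails, and the `column_quality` helper and the sample column are hypothetical.

```python
def column_quality(values, convert):
    """Classify each value as valid, error, or empty and return percentages.

    Assumptions for this sketch: None counts as empty, and a value counts
    as an error when `convert` raises ValueError or TypeError.
    """
    counts = {"valid": 0, "error": 0, "empty": 0}
    for value in values:
        if value is None:
            counts["empty"] += 1
            continue
        try:
            convert(value)
            counts["valid"] += 1
        except (ValueError, TypeError):
            counts["error"] += 1
    total = len(values)
    return {k: (100 * n / total if total else 0.0) for k, n in counts.items()}

# Hypothetical quantity column that should contain whole numbers.
quantity = ["10", "25", None, "n/a", "7", "12", None, "3", "40", "8"]
quality = column_quality(quantity, int)
print(quality)
```

For this sample, seven of the ten values convert cleanly, one fails, and two are missing, so the sketch reports 70% valid, 10% error, and 20% empty.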

Column distribution:

This section shows a histogram of the values in each column, together with the number of distinct and unique values. Distinct counts the different values in the column; unique counts the values that appear exactly once.

When you hover over the column and select the ellipsis icon (…), it displays suggested operations on the values.

Data-profiling-column-distribution
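
The difference between distinct and unique is easy to mix up, so here is a small Python sketch of the two counts (not Power BI code; the `distinct_and_unique` helper and the sample values are made up for illustration):

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique): how many different values exist,
    and how many of those values occur exactly once."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique

# Hypothetical location column.
location = ["Warehouse", "Store", "Warehouse", "Ware House", "Store", "Office"]
distinct, unique = distinct_and_unique(location)
print(f"{distinct} distinct, {unique} unique")
```

Here there are four different values (Warehouse, Store, Ware House, Office), but only two of them (Ware House and Office) appear exactly once, so the column has 4 distinct and 2 unique values.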

Column profile:

This is arguably the most useful of the three features: it provides a more in-depth look at the data in the selected column.

Column Statistics: displays the Count, Error, Empty, Distinct, Unique, Empty String, Min, and Max of the selected column.

Value Distribution: shows how often each value occurs in the column, as a bar chart.

Data-profiling-Column-profile
Notice the data quality issue: there are two labels for warehouse (Warehouse, and the misspelled Ware House)
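
As a rough illustration of how a value distribution exposes this kind of problem, the Python sketch below counts each label and then groups labels that become identical once whitespace and letter case are ignored. It is not Power BI code; the helper names, the normalization rule, and the sample data are assumptions made for this example.

```python
from collections import Counter, defaultdict

def value_distribution(values):
    """Return (value, count) pairs, most frequent first."""
    return Counter(values).most_common()

def near_duplicate_labels(values):
    """Group labels that match after removing whitespace and ignoring case."""
    groups = defaultdict(set)
    for value in values:
        key = "".join(value.split()).casefold()
        groups[key].add(value)
    # Only groups with more than one spelling are suspicious.
    return [sorted(group) for group in groups.values() if len(group) > 1]

# Hypothetical location column containing the misspelled label.
location = ["Warehouse", "Store", "Warehouse", "Ware House", "Store", "Warehouse"]

distribution = value_distribution(location)
for label, count in distribution:
    print(f"{label:<12}{'#' * count} {count}")  # simple text bar chart

suspects = near_duplicate_labels(location)
print(suspects)
```

The text bar chart plays the role of the Value Distribution pane, and the second function flags Warehouse and Ware House as two spellings of the same label that should probably be merged.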

Below is another good article on this topic:

Profile data in Power BI – Learn | Microsoft Docs

Ref: https://powerbidocs.com/2021/03/02/column-quality-column-distribution-column-profile/