kohera-logo-regular.svg

World Cup Russia dashboard in Power BI: importing the teams from multiple web pages

Vector_BG.png

World Cup Russia dashboard in Power BI: importing the teams from multiple web pages

Vector_BG.png

While working on our World Cup dashboard my colleague Philippe Van Rillaer and I wanted to import data from tables spread out over multiple Wikipedia pages. One example was importing players that are currently selected for the teams that qualified for the upcoming World Cup. We could navigate to each country’s page separately, but of course Power BI offers more powerful options than this.

Creating our function

Usually this mass importing is quite easy to do quickly, following the correct steps:

We start with a new source (web) and enter the link (E.G. https://en.wikipedia.org/wiki/Belgium_national_football_team). After selecting the table we want to import, we have the data in our query editor. The steps we took to get to this point are on the right, under ‘Query Settings’. We can go one level deeper though, by opening our advanced editor (under ‘Home’ in the ribbon). We’re met with the code behind the steps, which is in the language ‘M’:

Next, we want to transform the importing of this single table into a function and apply it to a list of the qualified countries. In order to do this, we first create a parameter that will hold the country name: under manage parameters pick ‘New Parameter’. I fill out ‘Country’ for its name, set the type to ‘Text’ and add ‘Belgium’ as current value to start out.

Returning to the advanced editor of the table we already imported, you can now replace ‘Belgium’ and reference the parameter instead. The entire link then becomes:

https://en.wikipedia.org/wiki/” & Country & “_national_football_team

At this point we right click the table and choose ‘Create function’. This leaves us with a function we can then apply to another table, effectively giving us a method of importing the same table for a list of countries.

Next, we go to a table that enumerates the qualified countries. In this table we invoke our custom function (you can find this under ‘Add Column’). I named mine ‘GetCurrent’, so this is the function query I pick from the list. Power BI also asks us which column we want to use for the parameter Country, an easy choice given that it is a single column table.

The result is the table above. Clicking on the two arrows in the top right corner gives us a choice of which columns to import from our data source, after which each of the countries is expanded to cover multiple rows in our table, one for each player in their respective current squads.

Referencing by table name

Or at least, this was wat we expected. In reality the numbers of columns to choose from was overly large and the tables didn’t contain the data we expected to find.

What happened? It turns out that we didn’t reference the table by name (‘Current squad[edit]’ for the Belgian page), but by number instead. You can see this in the code in the advanced editor, it reads “Source{1}”. Other Wikipedia pages had different numbers and so we imported an amalgam of tables containing all sorts of information we didn’t need.

The solution seemed simple enough: reference by table name instead of number. In our custom function “Source{1}” was replaced by “Source{[Caption=”Current squad[edit]”]}” in the advanced editor and all seemed well for a few seconds.

Exception handling

Reapplying the custom function to our list of participating countries resulted in a few error lines however. Not only do the webpages have different numbering, it appears that there are also variations in the caption of this table.

For Belgium and a few other countries it is ‘Current Squad[edit]’. A variation however was simply ‘Current’ (looking at you, Peru). Once again we went back to the advanced editor of our custom function.

This time we added some logic that checks for the alternative caption of the table:

Third time’s the charm, we were greeted by a beautiful table filled with the data we wanted. Sometimes it’s the simple pleasures in life.

Want to see our final result? Click on this link and it’ll take you to the dashboard.

 

Photo of successful woman coder hacker web creator sitting armchair comfortable workspace workstation indoors.
The hurdles and pitfalls of moving or migrating a System-versioned temporal table cross database
Maybe you already have your own way of doing this and are wondering about alternative methods, or maybe you are...
Group of computer programmers working in the office. Focus is on blond woman showing something to her colleague on PC.
Updating your Azure SQL server OAuth2 credentials in Power BI via PowerShell for automation purposes
The better way to update OAuth2 credentials in Power BI is by automating the process of updating Azure SQL Server...
2401-under-memory-pressure-featured-image
Under (memory) pressure
A few weeks ago, a client asked me if they were experiencing memory pressure and how they could monitor it...
2402-fabric-lakehouse-featured-image
Managing files from other devices in a Fabric Lakehouse using the Python Azure SDK
In this blogpost, you’ll see how to manage files in OneLake programmatically using the Python Azure SDK. Very little coding...
2319-blog-database-specific-security-featured-image
Database specific security in SQL Server
There are many different ways to secure your database. In this blog post we will give most of them a...
kohera-2312-blog-sql-server-level-security-featured-image
SQL Server security made easy on the server level
In this blog, we’re going to look at the options we have for server level security. In SQL Server we...