While working on our World Cup dashboard my colleague Philippe Van Rillaer and I wanted to import data from tables spread out over multiple Wikipedia pages. One example was importing players that are currently selected for the teams that qualified for the upcoming World Cup. We could navigate to each country’s page separately, but of course Power BI offers more powerful options than this.
Usually this mass importing is quite easy to do quickly, following the correct steps:
We start with a new source (web) and enter the link (E.G. https://en.wikipedia.org/wiki/Belgium_national_football_team). After selecting the table we want to import, we have the data in our query editor. The steps we took to get to this point are on the right, under ‘Query Settings’. We can go one level deeper though, by opening our advanced editor (under ‘Home’ in the ribbon). We’re met with the code behind the steps, which is in the language ‘M’:
Next, we want to transform the importing of this single table into a function and apply it to a list of the qualified countries. In order to do this, we first create a parameter that will hold the country name: under manage parameters pick ‘New Parameter’. I fill out ‘Country’ for its name, set the type to ‘Text’ and add ‘Belgium’ as current value to start out.
Returning to the advanced editor of the table we already imported, you can now replace ‘Belgium’ and reference the parameter instead. The entire link then becomes:
“https://en.wikipedia.org/wiki/” & Country & “_national_football_team“
At this point we right click the table and choose ‘Create function’. This leaves us with a function we can then apply to another table, effectively giving us a method of importing the same table for a list of countries.
Next, we go to a table that enumerates the qualified countries. In this table we invoke our custom function (you can find this under ‘Add Column’). I named mine ‘GetCurrent’, so this is the function query I pick from the list. Power BI also asks us which column we want to use for the parameter Country, an easy choice given that it is a single column table.
The result is the table above. Clicking on the two arrows in the top right corner gives us a choice of which columns to import from our data source, after which each of the countries is expanded to cover multiple rows in our table, one for each player in their respective current squads.
Or at least, this was wat we expected. In reality the numbers of columns to choose from was overly large and the tables didn’t contain the data we expected to find.
What happened? It turns out that we didn’t reference the table by name (‘Current squad[edit]’ for the Belgian page), but by number instead. You can see this in the code in the advanced editor, it reads “Source{1}”. Other Wikipedia pages had different numbers and so we imported an amalgam of tables containing all sorts of information we didn’t need.
The solution seemed simple enough: reference by table name instead of number. In our custom function “Source{1}” was replaced by “Source{[Caption=”Current squad[edit]”]}” in the advanced editor and all seemed well for a few seconds.
Reapplying the custom function to our list of participating countries resulted in a few error lines however. Not only do the webpages have different numbering, it appears that there are also variations in the caption of this table.
For Belgium and a few other countries it is ‘Current Squad[edit]’. A variation however was simply ‘Current’ (looking at you, Peru). Once again we went back to the advanced editor of our custom function.
This time we added some logic that checks for the alternative caption of the table:
Third time’s the charm, we were greeted by a beautiful table filled with the data we wanted. Sometimes it’s the simple pleasures in life.
Want to see our final result? Click on this link and it’ll take you to the dashboard.
© 2023 Kohera
Crafted by
© 2022 Kohera
Crafted by
Cookie | Duration | Description |
---|---|---|
ARRAffinity | session | ARRAffinity cookie is set by Azure app service, and allows the service to choose the right instance established by a user to deliver subsequent requests made by that user. |
ARRAffinitySameSite | session | This cookie is set by Windows Azure cloud, and is used for load balancing to make sure the visitor page requests are routed to the same server in any browsing session. |
cookielawinfo-checkbox-advertisement | 1 year | Set by the GDPR Cookie Consent plugin, this cookie records the user consent for the cookies in the "Advertisement" category. |
cookielawinfo-checkbox-analytics | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics". |
cookielawinfo-checkbox-functional | 11 months | The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional". |
cookielawinfo-checkbox-necessary | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary". |
cookielawinfo-checkbox-others | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other. |
cookielawinfo-checkbox-performance | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance". |
CookieLawInfoConsent | 1 year | CookieYes sets this cookie to record the default button state of the corresponding category and the status of CCPA. It works only in coordination with the primary cookie. |
elementor | never | The website's WordPress theme uses this cookie. It allows the website owner to implement or change the website's content in real-time. |
viewed_cookie_policy | 11 months | The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data. |
Cookie | Duration | Description |
---|---|---|
__cf_bm | 30 minutes | Cloudflare set the cookie to support Cloudflare Bot Management. |
pll_language | 1 year | Polylang sets this cookie to remember the language the user selects when returning to the website and get the language information when unavailable in another way. |
Cookie | Duration | Description |
---|---|---|
_ga | 1 year 1 month 4 days | Google Analytics sets this cookie to calculate visitor, session and campaign data and track site usage for the site's analytics report. The cookie stores information anonymously and assigns a randomly generated number to recognise unique visitors. |
_ga_* | 1 year 1 month 4 days | Google Analytics sets this cookie to store and count page views. |
_gat_gtag_UA_* | 1 minute | Google Analytics sets this cookie to store a unique user ID. |
_gid | 1 day | Google Analytics sets this cookie to store information on how visitors use a website while also creating an analytics report of the website's performance. Some of the collected data includes the number of visitors, their source, and the pages they visit anonymously. |
ai_session | 30 minutes | This is a unique anonymous session identifier cookie set by Microsoft Application Insights software to gather statistical usage and telemetry data for apps built on the Azure cloud platform. |
CONSENT | 2 years | YouTube sets this cookie via embedded YouTube videos and registers anonymous statistical data. |
vuid | 1 year 1 month 4 days | Vimeo installs this cookie to collect tracking information by setting a unique ID to embed videos on the website. |
Cookie | Duration | Description |
---|---|---|
ai_user | 1 year | Microsoft Azure sets this cookie as a unique user identifier cookie, enabling counting of the number of users accessing the application over time. |
VISITOR_INFO1_LIVE | 5 months 27 days | YouTube sets this cookie to measure bandwidth, determining whether the user gets the new or old player interface. |
YSC | session | Youtube sets this cookie to track the views of embedded videos on Youtube pages. |
yt-remote-connected-devices | never | YouTube sets this cookie to store the user's video preferences using embedded YouTube videos. |
yt-remote-device-id | never | YouTube sets this cookie to store the user's video preferences using embedded YouTube videos. |
yt.innertube::nextId | never | YouTube sets this cookie to register a unique ID to store data on what videos from YouTube the user has seen. |
yt.innertube::requests | never | YouTube sets this cookie to register a unique ID to store data on what videos from YouTube the user has seen. |
Cookie | Duration | Description |
---|---|---|
WFESessionId | session | No description available. |