Some of my customers want me to import their data from a standard on-premises or cloud-based (Azure) SQL-database, but others want me to import their data from a REST API. If ever, you’ll have to import data from a REST API, you’ll surely encounter some obstacles. That’s why I write this article to prevent you from this. Based on a real-life example, I’ll talk about the most common problems when using REST API’s, where for some of them still no solution has been found by Microsoft itself (although not that I’m aware of).
My customer wants me to import data in Power BI, coming from the Zendesk API. Zendesk is known as a support ticketing system, which is mainly used by the IT helpdesk where tickets are created and updated when IT-related problems arise in a company. To give you a little example: an employee of a company sees that the printer doesn’t work anymore, so he sends a request to the IT helpdesk. The IT helpdesk creates a ticket with all the information needed and will try to solve the problem or send another more skilled employee to solve this problem. In the meantime, the ticket has to be updated with all the information, such as the name of the requester, the name of the agent (IT helpdesk employee), a description of the problem, when the ticket has been created, when the ticket has been solved and so on.
To summarize briefly what I had to do, is to import data about all the agents in the company, as well as all the tickets that have been created, so my customer can analyze this data afterwards.
Optional information: A big plus about Zendesk is that there is a lot of documentation (which can be found here if interested) of how to use the REST API’s, which is in general a good start of importing the desired data in a correct way to Power BI. Unfortunately, this still doesn’t solve the problems that occur with the function Web.Contents() and Power BI.
It all starts in Power Query M (Power BI) where I want to call REST API’s from the Zendesk API with the function Web.Contents(). This function is very useful and has a lot of parameters that can be used to call simple web pages such as Wikipedia, as creating new data with a REST API. If you don’t have any experience with calling REST API’s or how this all works internally, talking about HTTP-headers, GET, POST and such, then you’d struggle with the function Web.Contents() because it doesn’t have the essential information to use it correctly in Power Query M. It’s still a very powerful function if you use it correctly and can bypass it in certain ways.
The first problem that I crossed, is when I published my Power BI report (with the usage of Web.Contents() in it) from desktop to the service, to schedule an automatic refresh. Underneath is a screenshot that appeared before I wanted to schedule a refresh in Power BI service.
When I saw this screenshot, it told me nothing about how to solve this problem…
To find out how the problem occurred, I started from the bottom with a simple connection to the REST API. The following code in Power Query M consists of the main function Web.Contents(). It requires one parameter, namely the URL you want to call, as in my case: https://example.zendesk.com/api/v2/incremental/tickets.json?include=metric_sets&start_time=0. This URL will call the REST API to get all the tickets that have been created. You’ll also see that there are extra parameters added after the question mark, these “include=metric_sets” and “start_time=0”.
Important to notice is that I’ve hardcoded my URL in the function (I’ll come back to it later). As an extra, I’ve added an optional parameter to the function in which I create a HTTP-header with an access token added to it. To get the data from the Zendesk API, it is obligated to use an access token. The result of this query will give me the first 1.000 tickets created starting from the beginning.
Optional information: “include=metric_sets” will give me some extra data about the metrics of a ticket, and “start_time=0” is a UNIX-based value and is linked to the datetime of the first created ticket. If changing this to another UNIX-based value like 1292834834 (which is Monday 20 December 2010 08:47:14 as datetime) it will give me all the created tickets equal and after this UNIX-based value. More information about the REST API can be found here.
When running this query for the first time, it will ask me how I want to authenticate. As I will authenticate as anonymous, because my credentials are the access token stored in the header. Secondly, it asks me on which level I want to set my authentication. If I’m setting anonymous as authentication on the root level https://example.zendesk.com, it means that all the API calls underneath will be as anonymous.
It is also possible to set the authentication on the level of the API call, for example, if you make a call to https://example.zendesk.com/api/v2/incremental/tickets.json?include=metric_sets&start_time=0 with basic authentication, you can set different credentials then when you call the API https://example.zendesk.com/api/v2/users.json?page=1 (I won’t give a detailed explanation about this, but it is good to know that these settings have a role in the usage of the function Web.Contents(). I’ll give some examples in the chapters).
When the credentials are all set, everything works fine in Power BI desktop, I get the desired data. When publishing my Power BI report to the service, it gives me my first error as seen in the following picture. Do you notice that it is different than the first error? This is because the URL that I’m trying to call is a valid URL, but the credentials aren’t valid to get the desired data. Though why doesn’t it raise an error in Power BI desktop and in the service it does?
Many questions left unanswered… All of these problems have to do with the fact that Power BI desktop (in which refreshes succeed) uses a different method of analyzing the code in Power Query M, to determine the right data sources and the corresponding credentials, than in Power BI service. In Power BI service, static analysis is being used. A difficult term, isn’t it? Based on my example, I’ll explain what it does:
There’s a way to bypass this with a feature introduced by Microsoft which is called skip test connection. If you look back at the picture with the error message saying ‘invalid credentials’, at the bottom you’ll see a checkbox with the name skip test connection.
There are also cases in which this feature can be very useful, but here in my example it is not relevant to write about it. Though, there is a guy called Chris Webb whom is a Power Query M guru that speaks about the feature in more detail. (I must confess that this article is also based on the articles that he has been writing about the function Web.Contents(), but by using a real live example, it is easier to understand and getting more in detail about it.)
If you’ve been given the job to import data from public web contents such as the following URL: https://ckan.publishing.service.gov.uk/api/3/action/package_search?q=apples then you don’t have to worry about invalid credentials. When opening the URL in a web browser, it will give you the status code 200, which is OK and as expected also the right data. Therefore, as authentication, you can use anonymous and a scheduled refresh will succeed. You can try this by following the steps below:
As configured in Power BI destop, in Power BI service, the authentication method is also set to anonymous.
Instead of ignoring the test connection as discussed as the first problem solver, I could also add extra properties in the second parameter of the function Web.Contents(). In the example that raised the error in Power BI service, I used the full URL like shown below and because I used anonymous as authentication to call the API of Zendesk where credentials are actually needed, the test connection failed.
Because only the first parameter of the function Web.Contents() is being evaluated during the test connection or static analysis, I only need to give a valid URL as the first parameter, which in my case in the following example is https://example.zendesk.com. By opening this URL in a web browser, it gives me a valid result without giving me the error UNAUTHORIZED. Though, this doesn’t give me the API call to Zendesk… In the code, the call to the API of Zendesk as the extra parameters are given as extra properties in the second parameter of the function Web.Contents(). Only when the evaluation during the test connection or static analysis succeeds, then the rest of the function Web.Contents() will be executed. In my case, only when https://example.zendesk.com is valid as URL, then my properties in the second parameter of the function Web.Contents() will be executed.
To understand what is happening is that:
If possible, always use a different authentication method than anonymous. If for example, as in my case, the access token is stored hardcoded in the header, then it’s not secure. Next to using an access token, it is also possible to use basic authentication (other method than anonymous). To call the API of Zendesk, the agent of the IT helpdesk can use its login credentials, and in Power BI, it is also possible to use it as authentication method. In the following example, I’ll call the same URL like I did in the first example but instead of using anonymous I’ll use basic authentication. Giving the right credentials in Power BI, will lead to a successful scheduled refresh afterwards.
An extra remark: if I’m using different credentials than anonymous as authentication method in the user interface, it is not possible to combine it with the property ‘Authorization’ which is used in the second parameter of the function Web.Contents(), otherwise I get this error like this one.
That’s because the credentials that are passed during the authentication in the user interface are internally passed in the ‘Authorization’-property, so if I’m entering my credentials using basic authentication in the user interface, Power BI will convert these in base64 and pass them in the ‘Authorization’-property.
In the first code, I’m using basic authentication as authentication method, with the usage of the second parameter but without the ‘Authorization’-property.
In the second code, the username and password are parameters that the user can fill in. Here I’ll use anonymous as authentication method because the parameters are a replacement for the basic authentication method in the user interface.
If you have read the part of the test connection or static analysis, you’ll probably know how the piece of codes will be evaluated and executed.
Now I know how the function Web.Contents() works and I managed to resolve the problem with the invalid credentials. In my previous example, I always worked with static URL’s. But what if I want to work with parameters like in the following piece of codes. What if I want to include another set than the metric set I included in my previous examples? This can be done by parameters. In the following piece of codes which are functions, I can add an URL or a parameter, which will give me the wanted data. When I publish my Power BI report to the service, then I’ll get THE ERROR, the one that causes problems.
The problem has to do with the fact that Power BI service doesn’t recognize the data source. There’s not much information about why Power BI service falls over that, because if you give the exact URL as a parameter this should be the same as hardcoding the URL in the function Web.Contents()? I guess its logic that if you try to make it dynamic based on functions and parameters, that this can always be a variable value. In Power BI service, the test connection or static analysis will not know what the data source will be so it can’t be evaluated. In Power BI desktop, it also states that the used data source is not listed because the usage of a function or parameter. If Power BI desktop doesn’t recognize the data source, then Power BI service will certainly not recognize it.
The following code will finally solve the error. It is important that the root URL stays static if possible. In my previous experiences I know that the URL’s as the parameters of the API’s can change but that the root URL normally stays the same. For example, if I want to get the tickets of the company ‘example’ then I’ll use another path than I want to get the agents. If the test connection or static analysis succeeded with the root URL, then the rest of the function Web.Contents() will be executed anyway and then you can use as much parameters you want.
In the second part of the article I’ll show the full real-life case wherein functions as pagination will be used to call the API of Zendesk.