Process Azure Analysis Services through Azure Data Factory & Logic Apps

My first blog was a detailed explanation of how to use Azure Data Factory and Logic Apps together to improve your ETL process. Find it here. That post accounts for the majority of my traffic, so I decided it was time for a follow-up!

Azure Analysis Services is often used as an enterprise datamodel. Azure Analysis Services can be compared with Power BI (Premium), as they share the same underlying architecture. Microsoft's vision is to make Power BI Premium a superset of Azure Analysis Services features, which might influence the choice when selecting where to host your datamodel (Power BI Premium or Analysis Services). SQLBI has written a nice blog about choosing Analysis Services for large datasets. Find it here.

Azure Analysis Services supports two different methods of storing data: DirectQuery and import. In a DirectQuery setting the data resides in the source database and is queried at runtime by the Analysis Services engine; this requires the standard pricing tier. The import mode loads the data into the datamodel, and this loading is called processing.

Processing Azure analysis services

There are multiple ways of processing Azure Analysis Services. Besides processing manually, you can automate model processing via Azure Functions, Azure Automation, Logic Apps and even from within SSIS running on the integration runtime in Azure Data Factory. Important to know is that processing requires a service principal (app registration) or an actual username and password for authentication.

This blog describes how to set up the processing with a combination of Azure Data Factory and Logic Apps.

Getting started

To get started, provision an Azure Analysis Services instance and an Azure Data Factory instance, and prepare a datamodel in Visual Studio that sources its data from, for example, an Azure SQL Database. For this blog I will re-use the datamodel from my previous blog about slicer dependencies.

In a classic Business Intelligence pipeline, Azure Data Factory functions as an orchestrator: it calls Logic Apps, data flows or SSIS packages and takes care of error handling. The purpose is to extract data from sources, transform it, and load it into a data warehouse and datamodel so the business can base its decisions on facts.

In this last step Analysis Services comes into play as a datamodel that can be used for easy reporting and self-service analytics. Analysis Services should be processed with data from the data warehouse, typically modelled as a star schema.

This blog is about how to process Azure Analysis Services using Logic Apps in combination with Azure Data Factory.

Initiating the processing

Because Azure Data Factory is going to trigger this new Logic App, we create a Logic App from scratch that is triggered by an HTTP request.

The second step is to add an HTTP action so we can call the Azure Analysis Services REST API to start a refresh. All details about the API can be found here.

The base URL should be altered by replacing the region, server name and model name. The /refreshes part initiates the actual refresh.

The base URL: https://[westus].asazure.windows.net/servers/[myserver]/models/[AdventureWorks]/refreshes

Fig 1

Figure 1 shows the Logic App configured thus far. Note the changed URL and the body; both are described in the API details. For now we have configured a full refresh. One very important setting, and super easy to forget, is to change the asynchronous pattern for this request. This setting can be changed by clicking the … on the HTTP action. Be sure it's turned off. If we leave this setting on, we get an immediate response from the API, and that response does not yet contain the location we later need to check the refresh state.
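For reference, the body sent with this request could look like the snippet below. This is a minimal sketch of a full refresh based on the refresh API documentation; properties such as CommitMode, MaxParallelism and RetryCount are optional and should be tuned to your own model.

{
  "Type": "Full",
  "CommitMode": "transactional",
  "MaxParallelism": 2,
  "RetryCount": 2
}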

Authentication can be configured through Active Directory OAuth by providing the app registration details. Other options are possible, but this depends on your approach and policies.
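As a sketch of that configuration, the authentication section of the HTTP action roughly boils down to the values below. The tenant, client id and secret are placeholders for an app registration that has administrator permissions on the Analysis Services server; the audience value is the one documented for the Analysis Services REST API.

{
  "type": "ActiveDirectoryOAuth",
  "tenant": "<tenant id>",
  "audience": "https://*.asazure.windows.net",
  "clientId": "<app registration client id>",
  "secret": "<app registration secret>"
}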

Running this Logic App will trigger Analysis Services to start a full refresh. Triggering this Logic App from Data Factory is as easy as creating a pipeline, dragging a Web activity onto the canvas and configuring it with the Logic App URL and a sample body, as shown in figure 2.

Fig 2
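The first Logic App does not actually read this body, so any small, valid JSON payload will do; a purely hypothetical example:

{
  "caller": "DataFactory",
  "action": "process model"
}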

As we are used to with web requests, we get an immediate response that only tells us the request was successful, not whether the processing succeeded. To find that out we need to check the processing state.

Checking processing state

Compared to checking the state of an Azure SQL database, checking the state of Analysis Services requires the refresh location from the original request. This can be accomplished by returning this location to Data Factory and then using it, in combination with the API, to get the current processing state.

First the Logic App is extended with a Response action, as shown in figure 3. Note that if the asynchronous pattern is still on, this response does not yet contain the location URL.

Fig 3
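How the location ends up in this response depends on how you configure the Response action. One way, assuming the HTTP action is simply named HTTP, is to return a small JSON body that exposes the Location header of the refresh call:

{
  "Location": "@{outputs('HTTP')['headers']['Location']}"
}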

Back in Azure Data Factory, create a variable (unselect all components and navigate to Variables) and drag a Set variable activity onto the canvas. Set the Location variable to the following expression:
@activity('Trigger refresh').output.Location
So far the pipeline triggers the refresh and stores the returned location in the Location variable.

To check the state we need a second Logic App. This app will fire an HTTP request against the Analysis Services API based on the location that was returned by the first Logic App. Because we need this location inside the Logic App, its HTTP request trigger needs to accept some input. This can be done by providing a body for the request, as shown in figure 4.

Fig 4

The next step in the Logic App is to parse the received JSON and fire the actual HTTP request to get the current processing state. To parse the JSON, add a Parse JSON action, use the body as content and provide the schema as shown in figure 5.

Fig 5
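Assuming the incoming body carries nothing but the location, the schema for the Parse JSON action could be as simple as the snippet below (you can also generate it from a sample payload):

{
  "type": "object",
  "properties": {
    "location": {
      "type": "string"
    }
  }
}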

After parsing the JSON we can add an HTTP action that performs a GET on the parsed location to check the current processing state, as shown in figure 6.

Fig 6
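Abbreviated, and based on the refresh API documentation, the response of this call looks something like the snippet below; the status property is the part we are after:

{
  "startTime": "2019-01-01T09:30:00.000Z",
  "type": "full",
  "status": "inProgress",
  "objects": []
}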

The last step is to return this response so we can use it in Data Factory to keep checking until the state is no longer InProgress. The complete Logic App, including the details for the response, is shown in figure 7.

Fig 7

Because this Logic App requires a valid URL to check for the processing state, we cannot run a test on its own. The finishing touch, including testing, can be done from Data Factory.

Putting it all together

In Azure Data Factory, extend the pipeline with an Until activity and use the following expression, as shown in figure 8:

@not(equals('InProgress', coalesce(activity('Check processing state').output.status, 'null')))

It might be smart to configure a reasonable timeout on the Until activity, so that if something goes wrong it does not keep polling forever.

Fig 8
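For reference, in the pipeline JSON the Until activity with this expression and a one hour timeout looks roughly like the sketch below; the name and timeout are just examples, and the Web activity and Wait activity described in the next step go inside the activities array.

{
  "name": "Until processing finished",
  "type": "Until",
  "typeProperties": {
    "expression": {
      "value": "@not(equals('InProgress', coalesce(activity('Check processing state').output.status, 'null')))",
      "type": "Expression"
    },
    "timeout": "0.01:00:00",
    "activities": []
  }
}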

As the activity inside the loop use a Web activity, set the method to POST and provide the Location variable as part of the body, as shown in figure 9. To prevent this Until loop from running every second, a Wait activity of 60 seconds is a smart addition.

Fig 9
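The body of this Web activity only needs to hand the stored location to the second Logic App. With the Location variable from earlier, and assuming the property name location used in the Parse JSON schema above, it could look like:

{
  "location": "@{variables('Location')}"
}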

The pipeline is now complete and can be tested. In the first step the Logic App is triggered, which posts a processing command to the Analysis Services API. It waits for the full response that contains the location, which we can later use to check the current state of this processing command. The Logic App responds with this location.

The next step extracts the location from the response and stores it in a variable. Finally, the Until loop fires a web request to the Logic App that gets the current state for this specific processing command. In Data Factory this process is repeated until the response is no longer inProgress, so either succeeded or failed. Figure 10 shows the full pipeline.

Fig 10

This pipeline can be saved as a "Process analysis services" pipeline and called from, for example, a master pipeline. This way we get small separate pipelines that can be combined from a master / orchestrating pipeline.

Conclusion

Processing Analysis Services can be done in different ways. I prefer Logic Apps over Azure Functions because there is no need for any code, which makes the solution transparent and easy to re-use for other scenarios. It took me some time to figure out that the key thing in this set-up is to change the asynchronous pattern for the HTTP request that fires the process command, but besides that it's super easy to set up.

Jordi
