[Service Fabric] Using the Azure Files Volume driver with multiple volumes

I was recently working with a customer who has an application running in a Windows container, and that application outputs log files into different folders inside of the container. Their log extraction process was to manually remote desktop into the node, then go into the container to get the logs out.

This blog post and the accompanying samples/steps are what I used to help them understand how to use the Service Fabric Azure Files Volume driver.

Prerequisites

To implement the solution, you will need to:

1. Already have an Azure Container Registry with your container images uploaded.

2. Already have a secure Azure Service Fabric cluster. You will need to modify your current cluster configuration to allow the use of the Service Fabric Azure Files Volume driver. This is done by modifying your deployment ARM template.

3. Already have a Service Fabric container application. You will need to modify your Service Fabric application's ApplicationManifest.xml file for the volume driver configuration. By Service Fabric container application, I mean you would use the Visual Studio template for a Service Fabric application and then choose the Container template.

What is not included in this blog post

There are improvements that can be made to this solution prior to releasing it to production. What has not been added to the solution is:

1. How to secure your Azure Storage keys in the ApplicationManifest.xml file.

2. How to secure your Azure Container Registry password in the ApplicationManifest.xml file.

3. How to secure your Azure Files shared drive endpoints.

Creating a storage account with Azure Files shares

Although your Service Fabric deployment already has at least two storage accounts, it is best to create a separate storage account to use for your Azure Files shares. This way, you can secure this storage account separately.

To create a storage account with Azure Files and then create your file shares, follow the steps on this link https://docs.microsoft.com/en-us/azure/storage/files/storage-files-quick-create-use-windows.

In regard to the file shares for this sample application, I created 3 shares:

· webapperror – Will contain log files that represent any trace information classified as LogError level or worse

· webappinfo – Will contain log files that represent any trace information classified as LogInformation or worse. This log file will also contain the container logs.

· webappwarn – Will contain log files that represent any trace information classified as LogWarning or worse
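If you would rather script the storage setup than click through the portal, here is a minimal sketch using the AzureRM and Azure.Storage PowerShell modules (current at the time of this post); the resource group, storage account name, and region are placeholders:

# Log in first if you have not already done so
Login-AzureRmAccount
$rgName = "rg-containerlogs"        # placeholder resource group
$saName = "containerlogstorage01"   # placeholder storage account name (must be globally unique)
New-AzureRmStorageAccount -ResourceGroupName $rgName -Name $saName -Location "eastus" -SkuName Standard_LRS -Kind StorageV2
# Build a storage context from the account key and create the three shares
$key = (Get-AzureRmStorageAccountKey -ResourceGroupName $rgName -Name $saName)[0].Value
$ctx = New-AzureStorageContext -StorageAccountName $saName -StorageAccountKey $key
"webapperror", "webappinfo", "webappwarn" | ForEach-Object { New-AzureStorageShare -Name $_ -Context $ctx }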

Configure your Service Fabric cluster

In order to use the Service Fabric Azure Files Volume driver, you will need to modify your current Service Fabric cluster configuration.

The instructions for all of the setup are located at https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-containers-volume-logging-drivers, but I will cover it here in more detail.

There are 2 ways to do this:

1. If you have an existing ARM deployment template:

a. Modify the fabricSettings section of your ARM template by adding the volume driver settings (a sketch of this addition follows these steps):

image
You may decide that you want to use a different port number, and that is ok.

b. Redeploy the ARM template to update your cluster. Depending on the size of the cluster, this may take a while.

2. Use the Azure resources site.

a. Go to https://resources.azure.com. Log in with your Azure subscription credentials.

b. Click on subscriptions | <your subscription> | resourceGroups | <yourClusterResourceGroupName> | providers | Microsoft.ServiceFabric | clusters | <yourClusterName>.

image

c. Over to the right, click on the Edit button:

image

d. Find the fabricSettings section in your template and update it with the same configuration information (again, see the sketch after these steps). Be careful of where you put your commas!

image

e. Click on the Put button at the top of the template. This will kick off the cluster update process that may take a while.

image
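For reference, whether you edit the ARM template in option 1 or use the Azure Resource Explorer in option 2, the fabricSettings addition looks roughly like this. This fragment follows the Service Fabric volume driver documentation linked earlier; 19100 is only an example port, so substitute whatever port you decided to use:

"fabricSettings": [
  {
    "name": "Hosting",
    "parameters": [
      {
        "name": "VolumePluginPorts",
        "value": "AzureFilesVolumePlugin:19100"
      }
    ]
  }
]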

Deploying the Service Fabric Azure Files Volume driver

1. Download the PowerShell script to install the Azure Files volume driver from https://sfazfilevd.blob.core.windows.net/sfazfilevd/DeployAzureFilesVolumeDriver.zip.

2. Once you have unzipped the package, open PowerShell ISE in the directory where the DeployAzureFilesVolumeDriver.ps1 file is located. Make sure the PowerShell command prompt window is also set to that same directory.

3. Run the following command for Windows:
.\DeployAzureFilesVolumeDriver.ps1 -subscriptionId [subscriptionId] -resourceGroupName [resourceGroupName] -clusterName [clusterName] -windows

Or – this command for Linux

.\DeployAzureFilesVolumeDriver.ps1 -subscriptionId [subscriptionId] -resourceGroupName [resourceGroupName] -clusterName [clusterName] -linux

4. Wait until the deployment completes.

5. Open Service Fabric Explorer to verify that the Azure Files Volume driver application has been installed:

image

Your Service Fabric application (container template) setup

You will need to modify the ApplicationManifest.xml file of your Service Fabric container application, not the actual container image or the application within the container image. This assumes that you have applications running in the containers and you know where the log folders are. The sample application is located here.

Modify your ApplicationManifest.xml file

In my sample application's ApplicationManifest.xml file, I am mapping 3 volume shares to 3 directories where my application (inside the container) drops log files:

image

You will need to add the <Volume> element to your ApplicationManifest.xml file. Pay close attention to these settings (a sketch of the markup follows this list):

Source = the volume name. You can name it anything you want; it does not have to match anything in your storage account or a folder path name.

Destination = the path to your log file location(s) inside of the container.

DriverOption Value = the name of your Azure Files share.
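As a rough illustration only (this is not copied from the sample project), the markup inside your application manifest's ContainerHostPolicies element looks something like the following. The driver name and DriverOption names follow the public Azure Files volume driver documentation, and the Source, Destination, share name, and storage account values are placeholders you must replace with your own:

<Policies>
  <ContainerHostPolicies CodePackageRef="Code">
    <Volume Source="errorlogs" Destination="C:\app\logs\error" Driver="sfazurefile">
      <DriverOption Name="shareName" Value="webapperror" />
      <DriverOption Name="storageAccountName" Value="[your-storage-account-name]" />
      <DriverOption Name="storageAccountKey" Value="[your-storage-account-key]" />
    </Volume>
  </ContainerHostPolicies>
</Policies>

You would repeat the <Volume> element for each share (webappinfo and webappwarn in my case), pointing each one at the matching log folder inside the container.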

Once you modify your ApplicationManifest.xml file, redeploy your Service Fabric container application to your cluster.

Testing the Azure Files share

If everything is working correctly, you should be able to go to your Azure Files share in the Azure portal, and see files in that share:

image

Some notes about the Azure Files share:

· Every file that is dropped into your application's log folder(s) will appear in the share

· You can download the current file from within the Azure portal or simply view its contents

· Your application needs to have a different name for each log file, based on the node name that you have in your cluster. This way, you’ll know which node the information is coming from. Within a Service Fabric application, you can look at this link for some sample code on how to query for this information https://stackoverflow.com/questions/43959312/how-to-get-name-of-node-on-which-my-code-is-executing-in-azure-fabric-service.

If you have a guest executable running inside of the container, chances are, you are not going to have any Service Fabric framework code; therefore, you can’t use the FabricClient to query that kind of information. In this case, you need to come up with a different naming convention to know which machine the log file came from.

You can undoubtedly get the name of the machine by doing System.Environment.MachineName, but you can’t see the actual machine names in the Azure Portal or Service Fabric Explorer, so using the machine name may be too much of a challenge. Remember, if Service Fabric replaces a node for some reason, you could end up with a different machine name.

My sample application, which is a .NET Core 3.1 application (not a Service Fabric framework app), uses Serilog and grabs the machine name.

[Logic Apps] Using BizTalk transform maps in Logic Apps

I was recently working with a customer that has been using BizTalk Server 2016 for quite a while and they wanted to start using Azure Logic Apps for some of their processes. I will note here that they were not in the process of completely moving off of BizTalk, but instead were just moving a few of the simpler processes out of BizTalk and were curious to see what Logic Apps could do for them.

What the customer wanted to do was use their BizTalk mapping file artifacts, as is, and use that artifact inside of a Logic App action to do a transformation to transform an incoming flat-file to an XML file.

The quick, executive-summary answer to this question is no, you can't use the BizTalk mapper file inside of a Logic App. However, all is not lost: if you are on your BizTalk Server machine and have your mapper artifacts, you can still achieve your goal. Otherwise, this would be a very short blog post.

In this blog post, we will go through the steps from BizTalk Server to Logic Apps, to transform a flat-file to an XML document. We won’t go through the process of actually creating the map itself, but we’ll look at it in the BizTalk Flat File Mapper.

Step 1 – Discover your BizTalk artifacts

Shown in the screenshot below is my BizTalk project that contains a BizTalk map (.btm file) and the source and destination schemas (.xsd files). In the case of this example, I am using Visual Studio 2015 because that was installed automatically on my BizTalk Azure VM. Here is a screenshot of my project.

image

To keep things simple, I’m not using Functoids or loops or anything like that. The purpose of this blog post is to discover how to go from what we have above, to Logic Apps. What we want to achieve though, is to use a Logic App to perform the transformation that BizTalk would have done for us.

Step 2 – Getting your XSL

As I said at the beginning of this post, you can't use the BizTalk .btm map file with a Logic App. But really, it isn't the .btm file that you need; it is the XSL that is generated FROM this mapping operation. So how do you get the XSL? It's easy….

1. Right-click on your .btm file in the Visual Studio Solution Explorer and select the Validate Map menu item.

2. In the Visual Studio output window, you will see that the mapping component is going to be invoked (validated) and Visual Studio will report where the output XSL is located:

image

3. Go to the location/path listed in your output window and copy the .xsl file to a location where you can later use it with your logic app. Note that you will also need to copy both source and destination schemas for the Logic App to use too.

Step 3 – Creating your Logic App Integration Account

Normally, one might think that we would just start out right away and create a logic app. In our case, however, we are going to need to use built-in Logic App actions that are designed for 'integration': actions such as transforms, XML validations, flat-file decoding, etc. Generally speaking, BizTalk is used for enterprise integration, and therefore we need enterprise integration functionality that is not available in a normal logic app.

Anytime we are using integration actions, we need to create an integration account. Steps for creating an integration account can be found here. Perform that process now. You can set the Pricing Tier to Free.
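If you prefer scripting over the portal for this step, the AzureRM.LogicApp module can create the account; a minimal sketch with placeholder names (double-check the cmdlet parameters against the module documentation):

New-AzureRmIntegrationAccount -ResourceGroupName "rg-integration" -Name "myIntegrationAccount" -Location "eastus" -Sku "Free"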

Step 4 – Setting up your Integration Account

The simplest form of enterprise integration consists of source and destination schemas with an XSLT transform in the middle to perform some sort of transformation. After your integration account has been created, click on its icon from within your resource group.

What you will see are tiles for Schemas (.xsd), Maps (.xsl/.xslt), Assemblies etc.

1. Click on the Schemas tile.

2. Click on the +Add toolbar icon.

3. Give your schema a name (it doesn’t have to match your actual file name, although it probably should).

4. Browse for your actual schema file.

5. Select Ok to upload your schema file.

6. Repeat steps 2 – 5 for the second schema.

7. Go back to the tiles view in your integration account and click on the Maps tile.

8. Enter your map name and then select the Map type of XSLT. Upload the file.

Step 5 – Creating your Logic App

1. You will need to create a new Logic App. You can find basic instructions on how to do that here. Wait before you actually create the workflow, i.e., before picking a template; you may be pushed right into the Logic Apps Designer. If this is the case, just go back to the main blade for the logic app.

2. Click on the Workflow settings menu item.

image

3. Click on the Integration account dropdown and find the integration account you previously created.

4. Click the Save button at the top of the blade.

5. Click on the Logic app designer menu item.

6. Choose the Blank Logic App template tile.

7. In the logic app design window, type in ‘http’ and then scroll down and choose the ‘When HTTP request is received’ icon. You will be passing in just flat-file data. Save the logic app.

image

8. Click the +New step button.

9. Type in 'flat file' and you should see the Flat File Decoding action. Select this action. The Flat File Decoding action outputs an XML document that you may do further conversions on. It also helps validate that the incoming flat-file structure matches the schema.

image

10. For the Content setting, click your mouse in the content edit field and on the right-hand side, choose ‘See more’. From there, click on Body. This will insert the Body into the content field.

11. Click on the Schema Name drop-down and select the name of your schema, which in my case is BuyerFlatFileSchema. Save the logic app.

image

12. Click on the +New step button.

13. Type ‘xml validation’ into the search field and then select the XML Validation action. In the previous step, the flat file decoding step decoded the flat file document into XML, now you need to validate that XML against the SAME schema as in the last step. This may seem like an unnecessary step, but it is the kind of guarantee that most B2B processes expect as a document flows through the system.

image

14. For the Content, click in the content edit field and then choose Body in the dynamic content window.

15. For the Schema Name, choose the BuyerFlatFileSchema schema. Save the logic app.

16. At this point, you have gotten the flat file into the process, decoded it into XML and validated the XML. The next thing that typically happens is that you need to confirm that the incoming XML document only has one root node. I cannot map one document format to another with the map from my BizTalk project if the document has more than one root node.

To validate this, you will use an XPath expression.

17. Click on the +New step button.

18. Type in Condition into the search field and select the Condition control icon.

image

19. Click the checkbox adjacent to the first edit field and place your cursor in that field. Select the Add dynamic content link if the dynamic content box does not appear.

image

20. Click on Expression in the dynamic content field. Either paste in or type the text below and then select the OK button.

xpath(xml(body('Flat_File_Decoding')),'string(count(/.))')

image

21. Leave the middle field set to ‘is equal to’ and put your cursor in the last edit field on the right-hand side. Go through the same process as before, selecting dynamic content -> Expression and then put in the text:
string('1')

22. Save the logic app.

23. Before going any further, let's look at the full expression that you just created to understand what is going on. Click on the </> Logic app code view menu item.

In the code you will see this:
"equals": ["@xpath(xml(body('Flat_File_Decoding')), 'string(count(/.))')", "@string('1')"]

  • The first part of the expression above takes the output of the Flat File Decoding action and makes sure the body represents an XML document
  • Next, the XPath code says 'get a count of the number of root nodes (/), starting with the first root node as the current node (.)'. All of this, of course, has been turned into a string by XPath.
  • Finally, this output is compared to the string '1' to make sure there is only one root node.

Go back to the Logic App designer.

24. Click on the Add an action link in the If true branch.

25. In the search field type in ‘xml’ and then look for the Transform XML icon. Click on the icon.

image

26. Put your cursor in the Content field and select Body from the Flat File Decoding section of the dynamic content dialog. For the Map, click on the Map drop-down control and select the name of your XSL map, which in my case is BuyerToCustomer. This is the step in the workflow that actually does the full transformation from flat file to XML.

image

27. Save the logic app.

Step 6 – Testing your integration Logic App

1. Scroll to the top of your logic app and click on the Http request action. You should notice that you have a full URL available to you. Copy this URL to your clipboard.

image

2. Open Postman or a tool of your choice that allows you to perform POST commands with data against your logic app.

3. Select a new tab in Postman and paste your URL into the address field. Make sure you select POST as the action you’re going to perform.

4. You do not need to modify the header settings because you will be putting in plain text. Click on the Body menu item and then the ‘raw’ choice button.

5. Put in some sample flat file data that you wish to pass in as shown below.

Make sure that at the end of the last line of text, you press Enter so that there is a carriage return/line feed at that location.

image

6. Within Postman, click the Send button. What you should see is a 202 Accepted response. (If you would rather script this call instead of using Postman, see the sketch at the end of this section.)

7. Go back into the portal to the Overview blade for your logic app. Click on the Refresh toolbar button and then look at the Status in the Runs history section.

8. Click on the status and you will be taken to the ‘Logic app run’ blade where you can see the steps that your logic app took to complete the workflow. You can click on any of the actions and the action will expand and show you the details.

image
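If you prefer to script the test instead of using Postman, you can post the same flat-file text from PowerShell; a minimal sketch, where the URL is the one you copied in step 1 and sample-flatfile.txt is a placeholder file containing your test data:

$uri = "<your-logic-app-http-trigger-url>"
$flatFile = Get-Content -Path ".\sample-flatfile.txt" -Raw   # or paste the flat-file text inline
Invoke-WebRequest -Uri $uri -Method Post -Body $flatFile -ContentType "text/plain"
# A successful call returns a 202 Accepted status, and the run shows up in the Runs history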

[Service Fabric] Why won’t Visual Studio connect to my cluster?

In this blog post, I'll discuss something that has frustrated both me and many others for quite a while: the failure of Visual Studio to connect to an Azure Service Fabric cluster. We'll be using Visual Studio 2017 as an example.

I'm not sure whether it's a problem with the Visual Studio tooling, poor connectivity from Visual Studio out through the internet to the cluster, or something else entirely, but it happens quite frequently.

Requirements:

    1. You have a secure cluster up and running in Azure.
    2. You have your cluster node-to-node certificate downloaded and imported to your development machine. The certificate can be self-signed in this case.

Scenario

I have a simple Service Fabric application up and running in Visual Studio on my dev machine and I’m ready to deploy it to my Azure Service Fabric cluster.

From within Visual Studio, right-click on your Service Fabric application and select the Publish menu item. From the publish dialog box, you will be expected to connect to your Azure account.

After a couple of seconds of a spinning cursor, you should see your cluster listed in the Connection Endpoint. Also, if you expand the Advanced Connection Parameters drop-down, you should see your certificate listed.

SNAGHTML1a175702

You can now publish to your cluster.

Up until now, life is great!

So now you decide to open up a different Service Fabric application within Visual Studio and you want to deploy that application to your cluster also.

Just like you did in the previous steps, right-click on the application name and select Publish. When you do, you see the dreaded 'red-x'.

image

If you hover over the red-x, you will more than likely see the message ‘Failed to contact the server. Please try again later or get help from ‘How to configure secure connections’.

Troubleshooting

Since you had previously connected to the same cluster before, you know you should be able to connect again, however, this time, you are in a different application with a different publish profile.

If you were on a totally different machine trying to connect to the cluster, the first thing to check would be the certificate. You would check to make sure you have the certificate imported and also you can check in the publish dialog box to see if your certificate is listed in the Advanced Connection Parameters drop-down.

In our case, the actual solution is to close the Publish dialog box and open up the Cloud.xml file in the PublishProfiles folder.

If you scroll down in this file, you’ll see a ClusterConnectionParameters element. You need to confirm that all the settings, especially ConnectionEndpoint and ServerCertThumbprint/FindValue have the correct information. I would say most of the time, it will be the certificate thumbprint that is incorrect, but your mileage may vary. That’s what it was in the case of this scenario.

image

Once you correct the certificate thumbprint and return to the Publish dialog box, you should see that you are able to publish to your cluster.

Suppose though that this doesn’t correct your problem. What else can you check?

  • You can open a PowerShell command prompt and run the Connect-ServiceFabricCluster cmdlet with the parameters you have in the Cloud.xml file to see if you actually can connect (see the sketch after this list).
  • Although you see the red-x, that warning in the pop-up really doesn't help much. Go ahead and click the Publish button and attempt to publish. You may be surprised to find that you can actually publish and that Visual Studio for some reason just had not updated the dialog, OR the output window will show the actual error.
  • Another thing that could happen, if you have your client-to-node security set up with Azure Active Directory (AAD), is that you receive an access-denied error. In this case, if the application you are deploying is registered with AAD and your Azure login has no permissions to do anything with the app, you may not be able to publish the app to the cluster.
  • Last but not least, in one of my much earlier blog posts I demonstrated that I ran into an issue where I actually had to physically type the certificate thumbprint into the Advanced Connection Parameters thumbprint fields because of whitespace somewhere in the thumbprint I had pasted in.
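As a reference for the first bullet above, a minimal connection test looks something like this; substitute the endpoint and thumbprint values from your Cloud.xml file:

Connect-ServiceFabricCluster -ConnectionEndpoint "<your-cluster>.eastus.cloudapp.azure.com:19000" `
    -X509Credential -ServerCertThumbprint "<cluster-cert-thumbprint>" `
    -FindType FindByThumbprint -FindValue "<cluster-cert-thumbprint>" `
    -StoreLocation CurrentUser -StoreName My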

Until next time, hope this helps you with your Service Fabric development!

[Service Fabric] Stateful Reliable Collection Restore –The New Way

In my previous blog post about how to do a Service Fabric stateful service backup, you learned how to set up the ARM template and use PowerShell scripts to create, enable, suspend, and delete your backup policy.

I wanted to provide a separate blog post because, although the PowerShell code for performing a restore isn't too complex, knowing what information you need and how to find that information is critical to the success of the restore operation.

Pre-requisites (what I tested this on)

  • Visual Studio 2017 v15.9.4
  • Microsoft Azure Service Fabric SDK – 3.3.622

Code location:

https://github.com/larrywa/blogpostings/tree/master/SFBackupV2

It is assumed that you have read my previous blog post, have your cluster set up, the Voting application installed, and backups already collected in your Azure storage account's blob container. After all, you can't do a restore without a backup, now can you?

In the previous posting, I did an ‘application’ backup, which backed up every stateful service and partition in the application. I only had one stateful service with one partition, so technically, I could have just executed the REST API command for backing up a partition if I wanted to.

What you will see below, is that we will restore to a particular partition in the running stateful service, which should help you understand how you would find the backup of the partition to restore from. The restore operation will automatically look in the storage location specified by the backup policy, but you could also customize where the restore operation gets its data from.

Task 1 – Capturing the Partition ID

The first piece of information you are going to need is the partitionId of the partition you want to restore to.

To find your partitionId, log in to the Azure portal and go to your cluster resource. Open Service Fabric Explorer. In the explorer, you can open the treeview where the fabric:/Voting/VotingData service is and capture/copy the partitionId.

image
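If you are already connected to the cluster from PowerShell (using Connect-ServiceFabricCluster, as shown in the earlier Visual Studio troubleshooting post), you can also list the partitions without opening the Explorer; a quick sketch:

Get-ServiceFabricPartition -ServiceName fabric:/Voting/VotingData
# The partition ID for each partition of the service appears in the output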

Task 2 – Capturing the RestorePartitionDescription information

Now that you have your partitionId, you need to know about the information that is stored in the backup that you want to restore from. This is called the RestorePartitionDescription object. NOTE: If you do not provide this information, you will just get the last backup that took place.

The RestorePartitionDescription information includes the BackupId, BackupLocation and BackupStore (the BackupStore item is optional). But how do you get the BackupId and BackupLocation? You can do this by getting the partition backup list. Here is an example:

GET http://localhost:19080/Partitions/1daae3f5-7fd6-42e9-b1ba-8c05f873994d/$/GetBackups?api-version=6.4&StartDateTimeFilter=2018-01-01T00:00:00Z&EndDateTimeFilter=2018-01-01T23:59:59Z

The only parameters actually required in the GET request are the partitionId and api-version. Leave the api-version at 6.4 until later Service Fabric updates change this feature. When you run the command above, you will get back a list of backups for this partition, and from this list you can choose which backup you want to restore from. The information in this backup is the restore partition description information.
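Against a secure cluster, you can issue the same request from PowerShell; a rough sketch, assuming your client certificate is installed in CurrentUser\My (the ListPartitionBackups.ps1 script in the repo is the authoritative version, and all values below are placeholders):

$clusterName = "<your-cluster>.eastus.cloudapp.azure.com"
$partitionId = "<your-partition-id>"
$thumbprint  = "<client-cert-thumbprint>"
$url = "https://" + $clusterName + ":19080/Partitions/" + $partitionId + "/$/GetBackups?api-version=6.4"
# For a self-signed certificate you may also need to relax certificate validation in your session
Invoke-RestMethod -Uri $url -Method Get -CertificateThumbprint $thumbprint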

Your output from the GetBackups call should look something like this:

{
  "ContinuationToken": "<app-or-service-info>",
  "Items": [
    {
      "BackupId": "3a056ac9-7206-43c3-8424-6f6103003eba",
      "BackupChainId": "3a056ac9-7206-43c3-8424-6f6103003eba",
      "ApplicationName": "fabric:/<your-app-name>",
      "ServiceManifestVersion": "1.0.0",
      "ServiceName": "fabric:/<your-app-name>/<partition-service-name>",
      "PartitionInformation": {
        "LowKey": "-9223372036854775808",
        "HighKey": "9223372036854775807",
        "ServicePartitionKind": "Int64Range",
        "Id": "1daae3f5-7fd6-42e9-b1ba-8c05f873994d"
      },
      "BackupLocation": "<your-app-name>\\<partition-service-name>\\<partitionId>\\<name-of-zip-file-to-restore>",
      "BackupType": "Full",
      "EpochOfLastBackupRecord": {
        "DataLossVersion": "131462452931584510",
        "ConfigurationVersion": "8589934592"
      },
      "LsnOfLastBackupRecord": "261",
      "CreationTimeUtc": "2018-01-01T09:00:55Z",
      "FailureError": null
    }
  ]
}

You can collect the BackupId and BackupLocation information by running the ListPartitionBackups.ps1 file in the root\Assets folder.

Task 3 – Restoring your partition

Now, you will need to run the Restore command and setup the body of your JSON request object with the RestorePartitionDescription information.

1. In the root\Assets folder, open the RestorePartitionBackup.ps1 script.

2. Fill in the parameters. If you scroll down below the parameters you can see how the JSON body is being constructed with the parameters you enter. Note that for the JSON that you are constructing, it is important to have the double-quotes around each JSON attribute as you see below.

image

image
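In case the screenshots are hard to read, here is a rough sketch of the kind of request the script ends up sending. The endpoint follows the RestorePartition REST API linked at the end of this post, and every value is a placeholder that comes from the earlier GetBackups call:

$body = @"
{
  "BackupId": "<backup-id-from-GetBackups>",
  "BackupLocation": "<backup-location-from-GetBackups>"
}
"@
$url = "https://" + $clusterName + ":19080/Partitions/" + $partitionId + "/$/Restore?api-version=6.4"
Invoke-RestMethod -Uri $url -Method Post -Body $body -ContentType "application/json" -CertificateThumbprint $thumbprint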

3. Save the script and then press F5 to execute the restore. What you should see in the PowerShell cmd window is something like this:

image

Task 4 – What happens during the restore

In my test, I first started by running at least 1 full backup operation with 4 incrementals. Each time a backup was performed, I changed the number of votes for Voter1 and Voter2. What I had was something like this:

image
I wanted to let the backups run at least past the 4th incremental backup and then restore back to Incremental-2 (Voter1 = 6, Voter2 = 7).

But first, here are a couple of things to think about when doing a restore:

  • What happens when I choose to restore Incremental-2? In this case, Incremental-1, Incremental-2 and Full are restored to the reliable collection.
  • What happens to the partition while it's being restored? Is it offline/unavailable? Yes, the partition is down. If you have an algorithm in your code that sends particular data to this partition, you are going to have to take this downtime into account.
  • If a backup happens to be taking place when I submit the command to do a restore, what happens? It depends on the timing of the calls, but since a restore from IT/Operations is normally a planned task, you should let any backup that may be running finish before doing a restore.

On with the show…here is what my current screen looks like for the Voting app, I’m on Incremental-4:

image

With my partitionId available, along with the backupId and information specific to Incremental-2, I kick off the RestorePartitionBackup.ps1 script.

Here is what the stateful service looked like while the partition was being restored:

image

After a minute or so (the time will depend on the size of your data restore), this was the result, which is back to the Incremental-2 data values:

image

Summary

I realize this is really a small sample with a single partition being restored, but the exercise is fundamental in how a restore takes place and what happens during this process.

I hope this blog post can help you out with your own restore processes!

For more information on the REST API for the Restore command, go here https://docs.microsoft.com/en-us/rest/api/servicefabric/sfclient-api-restorepartition.

[Service Fabric] Stateful Reliable Collection Backup –The New Way

In my previous article on Service Fabric backup and restore, you saw that setting up the process via APIs was pretty tedious, and that it was almost entirely developer driven (meaning C# coding). It was also a bit confusing to try to figure out how the full and incremental backups were stored in their folder structure(s).

Recently, the Service Fabric team announced general availability of a new method of backup and restore for stateful reliable collections (it requires version 6.4 of the Service Fabric bits). You may see it described as a 'no code' method of performing a backup/restore; I'll explain what that means below.

What this blog post provides is a complete project sample showing how to set up and perform a backup (but not a restore), with PowerShell, an ARM template, and code, to help you understand how to tie this all together. I'll cover restore in a future post.

What do you mean by ‘No Code’?

You may see a description of the new backup/restore procedure that says it is 'no code'. What that actually means is that there are two ways of configuring your backup/restore. One is using a C# API (this is what developers would use to configure/build the backup/restore process). The other is using PowerShell scripts, which is what is called the 'no code' option (even though technically, yes, it is code). No code simply means that developers are not the ones writing code to set things up.

Most customers I have worked with DO NOT want their developers to have control over the backup/restore process since it is considered to be an HA/DR/Operations procedure. Using PowerShell takes the configuration/deployment out of the compiled code and back into the IT operations realm. In fact, the developer may not even know a backup is being taken while the service is running. As I stated earlier, we'll discuss restore in a later post.

I tend to agree with the IT Operations folks on the idea that backup/restore should not be a developer's focus; therefore, this is the way we'll do it in this blog post.

Pre-requisites (what I tested this on)

  • Visual Studio 2017 v15.9.5
  • Microsoft Azure Service Fabric SDK – 3.3.622

Code location:

https://github.com/larrywa/blogpostings/tree/master/SFBackupV2

SFBackupV2 will be known in this post as the ‘root’ folder.

Task 1: Creating your certificate and Azure Key Vault

1. In the root\Assets folder, you will find a PowerShell script named CreateVaultCerts.ps1. Open PowerShell ISE as an administrator and open this file.

2. At the top of the script, you will see several parameters that you need to fill in depending on your subscription/naming conventions.

image

3. After filling in your parameters, log in to Azure using the PowerShell command prompt window in the ISE editor by using the command:
Login-AzureRmAccount

Task 2: Deploy your cluster using the ARM template

We will be building a 3 node secure cluster via an ARM template. The cluster will be secured with a single certificate and this single certificate will also be referenced by the backup configuration. We will not be using Azure Active Directory (AAD) in this sample. So what’s special about this template?

1. Using your editor of choice, open the ServiceFabricCluster.json file in the root\Assets folder. Although the template file is already set up appropriately, it's important to understand some of the required settings.

In order to use the new Backup/Restore service, you need to have it enabled in the cluster. First, you need to be using the correct API version for Microsoft.ServiceFabric/clusters:
{
  "apiVersion": "2018-02-01",
  "type": "Microsoft.ServiceFabric/clusters",
  "name": "[parameters('clusterName')]",
  "location": "[parameters('clusterLocation')]",
  ...
}

2. Next, you need to enable the backup/restore service inside of your addonFeatures section of Microsoft.ServiceFabric/clusters:

"properties": {
  "addonFeatures": [
    "BackupRestoreService"
  ],

3. Next, add a section in the fabricSettings for your X.509 certificate for the encryption of the credentials. Here, we'll just use the same certificate we use for the cluster, to keep things simple.

"properties": {
  "addonFeatures": [ "BackupRestoreService" ],
  "fabricSettings": [
    {
      "name": "BackupRestoreService",
      "parameters": [
        {
          "name": "SecretEncryptionCertThumbprint",
          "value": "[Thumbprint]"
        }
      ]
    }
  ]
}

4. Now open the ServiceFabricCluster.parameters.json file located in the root\Assets folder. There are several parameters that need to be filled in. Any parameter value that is already filled in, leave as is. You will also notice that there are parameter values needed that you should have from running Task 1 (cert thumbprint, source vault resource ID, etc.).

image

NOTE: For the clusterName, you only need to provide the first part of the FQDN; for example, for <clusterName>.eastus.cloudapp.azure.com you would provide just <clusterName>.

One particular parameter to note is the ‘osSkuName’. This is the size/class VM that will be used for the cluster. At minimum, this needs to be a Standard_D2_v2.

5. Once you have entered your parameter values, save the file and then open the root\Assets\deploy.json file in PowerShell ISE. In order to execute this script, you'll need to know your subscriptionId, the resource group name you want your cluster to be created in, and the Azure region (data center). Press F5 to execute the script. It will take approximately 20 minutes to create the cluster.
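If you would rather run the deployment yourself instead of using the provided script (which may differ in its details), the core call is simply the standard ARM deployment cmdlet; this assumes you have already run Login-AzureRmAccount and created the resource group:

New-AzureRmResourceGroupDeployment -ResourceGroupName "<your-resource-group>" `
    -TemplateFile .\ServiceFabricCluster.json `
    -TemplateParameterFile .\ServiceFabricCluster.parameters.json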

Task 3: Review your cluster services

    1. Log in to the Azure portal and go to the resource group where your Service Fabric cluster resides.
    2. Click on the name of your cluster and then in the cluster blade, click on the link to open the Service Fabric Explorer.
    3. Expand the Services item in the treeview and you should see a BackupRestoreService system service listed.

image

Task 4: Deploy the Voting application

1. Open Visual Studio 2017 as an administrator and then open the Voting.sln solution in the root\Voting folder.

2. Rebuild the project to assure that all the NuGet packages have been restored.

3. Right-click on the Voting project and select Publish.

4. In the Publish dialog, pick your subscription and your cluster name. Make sure you have the Cloud.xml profile and parameters file selected. Once you select your cluster name, you should see a green check once the Visual Studio publish mechanism connects to your cluster. If you see a red X instead, you can still try to publish and then look at the output to see what the actual error is. NOTE: If you see the red X, go into the PublishProfiles\Cloud.xml file and make sure your cluster name and certificate thumbprint are listed there:

<ClusterConnectionParameters ConnectionEndpoint="<FQDN-of-your-cluster>:19000" X509Credential="true" ServerCertThumbprint="<cluster-thumbprint>" FindType="FindByThumbprint" FindValue="<cluster-thumbprint>" StoreLocation="CurrentUser" StoreName="My" />

5. Click the Publish button to publish the app to your Service Fabric cluster.

6. Log in to the Azure portal and go to your cluster resource. You should be able to see that the application has been deployed (after a few minutes) and you also want to make sure it is healthy. This can be determined by seeing a green check beside the status.

image

7. Prior to creating and enabling your backup policy, you need to make sure you have an Azure Storage account set up with a blob container to capture the data being backed up. In this example, I am going to use one of the storage accounts that the Service Fabric cluster uses. Normally, this is a bad idea for many reasons (I/O usage, space consumed, etc.), but you can create your own separate storage account in your subscription if you wish. I will create a new blob container named 'blobvotebackup'.

Task 5: Create and enable your backup policy

At this point, you have your cluster created, your app deployed and in a running healthy state. It’s time to create your backup policy and enable it.

1. In PowerShell ISE, open the Backup.ps1 file in the root\Assets folder.

2. There are several parameters to fill in here, all well commented. This script will create the backup policy and then enable it. Fill in your parameters.

image

If you scroll down through the script, you’ll see the configuration information for the backup policy.

# start setting up storage info
$StorageInfo = @{
    ConnectionString = $storageConnString
    ContainerName = $containerName
    StorageKind = 'AzureBlobStore'
}

# backup schedule info, backup every 5 minutes
$ScheduleInfo = @{
    Interval = 'PT5M'
    ScheduleKind = 'FrequencyBased'
}

$retentionPolicy = @{
    RetentionPolicyType = 'Basic'
    RetentionDuration = 'P10D'
}

# backup policy parameters
# After 5 incremental backups, do a full backup
$BackupPolicy = @{
    Name = $backupPolicyName
    MaxIncrementalBackups = 5
    Schedule = $ScheduleInfo
    Storage = $StorageInfo
    RetentionPolicy = $retentionPolicy
}

  • Note that the 'StorageKind' is AzureBlobStore. You could also choose an on-premises file store.
  • Also note the Interval of how often a backup is taken and that it is frequency based. This could also be a scheduled backup for a certain time of day, or an ad-hoc backup. I'm setting mine to 5 minutes just to get the sample code going.
  • There is a retention policy that keeps the data for 10 days.
  • MaxIncrementalBackups tells the policy how many incremental backups to take before doing a new full backup. The backup service always starts with a full backup on a newly enabled policy.

Since we are using a PowerShell script to create and enable the backup policy, we are making calls directly to the BackupRestore REST APIs. Notice where the Create is taking place, how the URL is being built to create the policy, and the API version being used.

$url = "https://" + $clusterName + ":19080/BackupRestore/BackupPolicies/$/Create?api-version=6.4"

Farther down in the script you'll see where the URL is being created for the EnableBackup command. Notice that we are specifying backup at the 'Application' level, meaning that if the app had more than one stateful service, they would all use the same backup policy. You can also enable backup at a partition or service level.

$url = "https://" + $clusterName + ":19080/Applications/" + $appName + "/$/EnableBackup?api-version=6.4"
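To give a sense of how the policy hashtable above actually gets sent, here is a minimal sketch of the POST; this approximates what the provided script does (the script itself is the authoritative version), and $clientCertThumbprint is a placeholder for your cluster certificate thumbprint:

$body = $BackupPolicy | ConvertTo-Json -Depth 10
Invoke-WebRequest -Uri $url -Method Post -Body $body -ContentType "application/json" -CertificateThumbprint $clientCertThumbprint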

3. Press F5 to execute the script and create/enable the backup policy. At this point, after 10 minutes, a backup will be created in the background.

Task 6: Confirm that data is being backed up

  1. Go back to the Azure portal, to your storage account, and drill down to your backup blob container. If you click on the backup blob container name (after waiting at least 5 minutes), you'll see the structure that the full/incremental backup process has created.

image

A couple of things to note:

  • You'll see the blob container name in the upper left-hand corner
  • You'll see the 'Location', which contains the name of the blob container, the name of the app, the name of the service in the app, and then the partitionId.
  • For each backup, there is a .bkmetadata and a .zip file.

2. To get a complete list of the backups, open the ListBackups.ps1 script in the root\Assets folder.

3. Fill in the parameters and press F5 to run the script. You should see a list of all the current backups: names, IDs, partition numbers, etc. This type of information will be important when you are ready to do a restore. Remember that each partition in a stateful service will have its own backup. You can also find a ListPartitionBackups.ps1 script in the root\Assets folder; just add your partitionId to the script parameters.

Below is a snapshot of the type of information you would see from running ListPartitionBackups.ps1:

image

Task 7 – Disable and Delete your backup policy

Now that you've had all the fun of seeing your service's reliable collections being backed up in your blob container, you have a few choices. You can:

  • Suspend – this essentially just stops backups from being taken for a period of time. Suspension is thought of as a temporary action.
  • Resume – resuming a suspended backup
  • Enable – enabling a backup policy
  • Disable – use this when there is no longer a need to back up data from the reliable collection.
  • Delete – deletes the entire backup policy but your data still exists

One example of using a mix of the settings above is where you could enable a backup for an entire application but suspend or disable a backup for a particular service or partition in that application.

  1. The RemoveBackup.ps1 script in the root\Assets folder will handle suspending, disabling, and deleting. Depending on what you want to do at this point, set breakpoints within the PowerShell script to first suspend, then disable the backup policy. You will notice that there will be no more backups taking place.
  2. Once you are finished with your tests, continue the script to delete the backup policy.
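For reference, the endpoints involved are built the same way as the Create and EnableBackup URLs shown earlier. These are assumptions based on the Service Fabric 6.4 REST API; verify the exact paths against the REST documentation (or the script itself) before relying on them:

# Suspend backups for the application (temporary)
$url = "https://" + $clusterName + ":19080/Applications/" + $appName + "/$/SuspendBackup?api-version=6.4"
# Disable backups for the application
$url = "https://" + $clusterName + ":19080/Applications/" + $appName + "/$/DisableBackup?api-version=6.4"
# Delete the backup policy itself
$url = "https://" + $clusterName + ":19080/BackupRestore/BackupPolicies/" + $backupPolicyName + "/$/Delete?api-version=6.4"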

References

For further information on backup and restore, see https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-backuprestoreservice-quickstart-azurecluster

Client library usage https://github.com/Microsoft/service-fabric-client-dotnet/blob/develop/docs/ClientLibraryUsage.md

Upload a file from an Azure Windows Server Core machine to Azure Blob storage

The title for this could be a lot longer, like 'how to upload a file using the Azure CLI to Azure Storage on a Windows Server 2016 Core Datacenter machine', because that's what this blog post is about…but that's a ridiculously long title. The point is, with a Core OS you have no UI and very few of the capabilities and tools you would normally have on a full UI OS. Here's what happened…

I was working on what appeared to be an issue with the new Azure Backup and Restore service on an Azure Service Fabric Cluster. I needed to generate a crash dump from the FabricBRS service and send it to Azure support.

I remote desktopped into the VM in my VM scale set that was hosting the FabricBRS service, ran 'taskmgr' in the command prompt, and then right-clicked on the FabricBRS service and selected 'create dump file'.

image

The crash dump file is written to a location like C:\Users\<yourlogin>\AppData\Local\Temp\2\FabricBRS.DMP. And this is where the fun began. How do I get the file from here to my Azure Storage account?

1. To make it easy to find the file, I copied it from the above directory right to the C:\ drive on the machine.

2. I figured the easiest way to do this was to install the Azure CLI and then use the CLI with my storage account to upload the file to blob storage. The MSI file is downloaded from https://aka.ms/installazurecliwindows. From your current command prompt window you could type 'PowerShell', which starts PowerShell in the current command prompt; however, what you need to do is open a new PowerShell command prompt.

To do this, run the following command:

Invoke-Item C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe

3. Within the new PowerShell command prompt, to download the file, I ran the following commands:
$downloader = New-Object System.Net.WebClient
$downloader.DownloadFile("https://aka.ms/installazurecliwindows", "C:\azure-cli-2.0.54.msi")

You can do a 'dir' on your C:\ drive to make sure it's there.

image

4. To install the Azure CLI, run the command:

Start-Process C:\azure-cli-2.0.54.msi

You will be taken through a wizard that steps you through the install process. When you are finished, you are SUPPOSED to be able to just type 'az' into the command prompt window and get back some response from the Azure CLI. However, I found that I instead had to go out to the Azure portal, restart my VM scale set, and then log back into the machine's command prompt.

5. At this point, the Azure CLI should be available and you know where your crash dump file is; now you need to upload it to a blob container in your Azure Storage account.

6. Make sure you have a storage account and a specific container to upload the file in to. Grab the name of your storage account and then the storage account key.

7. Run the following command using the Azure CLI:

az storage blob upload --account-name <your-storage-account-name> --account-key <your-storage-account-key> --container-name <your-container-name> --file "C:\FabricBRS.DMP" --name "FabricBRS.DMP"

Note that each parameter is prefixed with a double hyphen ('--') with no space in between. Also, the 'name' parameter is the name you want the file to be shown as in your blob storage container.

I hope this blog post saves someone who needs to do the same type of exercise a bit of time!

[Service Fabric] Auto-scaling your Service Fabric cluster–Part II

In Part I of this article, I demonstrated how to set up auto-scaling on the Service Fabric cluster's scale set based on a metric that is part of a VM scale set (Percentage CPU). This setting doesn't have much to do with the applications that are running in your cluster; it's just pure hardware scaling that may take place because of your services' CPU consumption or something else consuming CPU.

There was a recent addition to the auto-scaling capability of a Service Fabric cluster that allows you to use an Application Insights metric, reported by your service, to control the cluster scaling. This gives you more fine-grained control over not just the auto-scaling itself, but also over which metric in which service provides the values.

Creating the Application Insights (AI) resource

Using the Azure portal, the cluster resource group appears to have a few strange issues in how you create your AI resource. To make sure you build the resource correctly, follow these steps.

1. Click on the name of the resource group where your cluster is located.

2. Click on the +Add menu item at the top of the resource group blade.

3. Choose to create a new AI resource. In my case, I created a resource that used a General application type and used a different resource group that I keep all my AI resources in.

clip_image001

4. Once the AI resource has been created, you will need to retrieve and copy the Instrumentation Key. To do this, click on the name of your newly created AI resource and expand the Essentials menu item, you should see your instrumentation key here:

clip_image003

You’ll need the instrumentation key for your Service Fabric application, so either paste it in to Notepad or somewhere where you can access it easily.

Modify your Service Fabric application for Application Insights

In Part I of this article, I was using a simple stateless service app https://github.com/larrywa/blogpostings/tree/master/SimpleStateless and inside of this code, I had commented out code for use in this blog post.

I left my old AI instrumentation key in-place so you can see where to place yours.

1. If you already have the solution from the previous article (or download it if you don’t), the solution already has the NuGet package for Microsoft.ApplicationInsights.AspNetCore added. If you are using a different project, add this NuGet package to your own service.

2. Open appsettings.json and you will see down at the bottom of the file where I have added my instrumentation key. Pay careful attention to where you put this information in this file, brackets, colons etc can really get you messed up here.

3. In the WebAPI.cs file, there are a couple of things to note:

a. The addition of the ‘using Microsoft.ApplicationInsights;’ statement.

b. Line 31 – Uncomment the line of code for our counter.

c. Line 34 – Uncomment the line of code that creates a new TelemetryClient.

d. Lines 43 – 59 – Uncomment the code that creates the custom metric named 'CountNumServices'. Using myClient.TrackMetric, I can create any custom metric of any name that I wish. In this example, I am counting up to 1200 (20 minutes of time) and then counting down from there to 300, 1 second at a time. Here, I am trying to provide ample time for the cluster to react appropriately for the scale-out and scale-in.

4. Rebuild the solution.

5. Deploy your solution to the cluster.

Here is a tricky situation. To actually be able to set up your VM scale set to scale based on an AI custom metric, you have to deploy the code to start generating some of the custom metric data; otherwise, the metric won't show up as a selection. But if you currently have a scaling rule set, as in Part I, the nodes may still be reacting to those rules and not your new rules. That's OK; we will change that once the app is deployed and running in the cluster. Also, you need to let the app run for about 10 minutes to start generating some of these metrics.

Confirming custom metric data in Application Insights

Before considering testing out our new custom metric, you need to make sure the data is being generated.

1. Open your AI resource that you recently created. The key was placed in the appsettings.json file.

2. Make sure you are on the Overview blade, then select the Metrics Explorer menu item.

3. On either of the charts, click on the very small Edit link at the upper right-hand corner of the chart.

4. In the Chart details blade, down toward the bottom, you should see a new ‘Custom’ section. Select CountNumServices.

clip_image004

5. Close the Chart details blade. It is handy to keep this details chart around so you can see what the current count is, to predict whether the cluster should be scaling out or in.

clip_image006

Modifying your auto-scaling rule set

1. Make sure you are on the Scaling blade for your VM scale set.

2. In the current rule set, either click on the current Scale out rule, Scale based on a metric or click on the +Add rule link.

3. In my case, I clicked on Scale based on a metric, so we’ll go with that option. Select +Add a rule.

4. Here is the rule I created:

a. Metric source: Application Insights

b. Resource type: Application Insights

c. Resource: aiServiceFabricCustomMeric ~ this is my recently created AI resource

d. Time aggregation: Average

e. Metric name: CountNumServices ~ remember, if you don’t see it when you try to scroll and find the metric to pick from, you may not have waited long enough for data to appear.

f. Time grain: 1 minute

g. Time grain statistic: Average

h. Operator: Greater than

i. Threshold: 600

j. Duration: 5 minutes

k. Operation: Increase count by

l. Instance count: 2

m. Cool down: 1 minute

Basically, after the count reaches 600, increase by two nodes. Wait 1 minute before scaling up any more nodes (note: in real production, this number should be at least 5 minutes). In practice, the trigger value will be greater than 600 because we are using Average as the time aggregation and statistic.

5. Review your settings in the Scale rule options blade.

6. Select the Add button.

7. To create a scale in rule, I followed the same process but with slightly different settings:

a. Metric source: Application Insights

b. Resource type: Application Insights

c. Resource: aiServiceFabricCustomMeric ~ this is my recently created AI resource

d. Time aggregation: Average

e. Metric name: CountNumServices ~ remember, if you don’t see it when you try to scroll and find the metric to pick from, you may not have waited long enough for data to appear.

f. Time grain: 1 minute

g. Time grain statistic: Average

h. Operator: Less than

i. Threshold: 500

j. Duration: 5 minutes

k. Operation: Decrease count by

l. Instance count: 1

m. Cool down: 1 minute

Notice that while decreasing, I am decreasing at a lesser rate than when increasing. This is to just make sure we don’t make any dramatic changes that could upset performance.

8. Save the new rule set.

9. If you need to add or change who gets notified for any scale action, go to the Notify tab, make your change and save it.

10. At this point, you could wait for an email confirmation of a scale action or go into the Azure portal AI resource and watch the metric values grow. Remember, though, that what you see in the chart are 'averages', not a pure count of the number of incremented counts. Also, when scaling, it uses averages.

In case you are curious what these rules would look like in an ARM template, it looks something like this:

clip_image008
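Since the screenshot may be hard to read, here is a rough sketch of the general shape of a microsoft.insights/autoscalesettings resource that targets an Application Insights custom metric. The resource IDs are placeholders, the apiVersion may differ from what the portal generates, and a matching scale-in rule (LessThan 500, Decrease count by 1) would sit alongside this one in the same rules array:

{
  "type": "Microsoft.Insights/autoscaleSettings",
  "apiVersion": "2015-04-01",
  "name": "sfvmssAutoscale",
  "location": "[resourceGroup().location]",
  "properties": {
    "enabled": true,
    "targetResourceUri": "<resource-id-of-your-VM-scale-set>",
    "profiles": [
      {
        "name": "DefaultProfile",
        "capacity": { "minimum": "5", "maximum": "10", "default": "5" },
        "rules": [
          {
            "metricTrigger": {
              "metricName": "CountNumServices",
              "metricResourceUri": "<resource-id-of-your-Application-Insights-resource>",
              "timeGrain": "PT1M",
              "statistic": "Average",
              "timeWindow": "PT5M",
              "timeAggregation": "Average",
              "operator": "GreaterThan",
              "threshold": 600
            },
            "scaleAction": {
              "direction": "Increase",
              "type": "ChangeCount",
              "value": "2",
              "cooldown": "PT1M"
            }
          }
        ]
      }
    ]
  }
}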

I'll provide the same warning here that I did in Part I. Attempting to scale out or in in any rapid fashion will cause instability in your cluster. When scaling out, you may consider increasing the number of nodes that are added at one time, to keep from having to wait so long for performance improvements (if that is what you are going for).

[Service Fabric] Auto-scaling your Service Fabric cluster–Part I

Like most people, whenever I need to build an ARM template to do something with Service Fabric, I'm browsing around GitHub or wherever else I can find the bits and pieces of JSON that I need.

I was recently working on a project where they needed 3 things:

1. They wanted their Service Fabric cluster to use managed disks instead of Azure Storage accounts.

2. They wanted to have auto-scaling setup for their virtual machine scale set (VMSS). In this case, we started out using the Percentage CPU rule, which is based on the VM scale set hardware and nothing application specific. In Part II, I’ll talk about auto-scaling via custom metrics and Application Insights.

3. They had a stateless service where they wanted to register their service event source to output logging information into the WADETWEventTable.

This template includes those 3 implementations. You can find a link to the files here: https://github.com/larrywa/blogpostings/tree/master/VMSSScaleTemplate. This is a secure cluster implementation, so you'll have to have your primary certificate in Azure Key Vault, etc.

Using managed disks

Some of the newer templates you'll find on GitHub already use managed disks for the Service Fabric cluster, but in case the one you are using doesn't, you need to find the Microsoft.Compute/virtualMachineScaleSets resource provider in your JSON file and make the following modifications.

clip_image002

Another important setting you need to make sure you have is the overProvision = false setting (here placed in a variable)

clip_image003

This variable is actually used in the Microsoft.Compute/virtualMachineScaleSets resource provider properties:

clip_image005

More information about overprovisioning can be found here: https://docs.microsoft.com/en-us/rest/api/compute/virtualmachinescalesets/create-or-update-a-set. If this setting is missing or set to true, you may see more than the requested number of machines and nodes created at deployment, and then the ones that are not in use are turned off. This will cause errors to appear in Service Fabric Explorer. Service Fabric will eventually go behind and clean up after itself, but when you first see the errors, you'll think you did something wrong.

Setting Up Auto-scale on your VMSS

At first my idea was to go to my existing cluster, turn on auto-scaling inside of the VMSS settings, and then export the template from the Azure portal. I then discovered that my subscription did not have permission to use the microsoft.insights resource provider. I'm not sure you'll run into this, but if you do, you can enable it in the portal under Subscriptions -> your subscription -> Resource providers -> Microsoft.Insights.

The ‘microsoft.insights/autoscalesettings’ resource is placed at the same level in the JSON file as the other top-level resources, such as the cluster and the virtual machine scale set. Although auto-scale shows up in the portal as a setting of the scale set, it is not a sub-section of the VMSS resource; it is a separate resource, as shown in the next screenshot.

In the JSON Outline editor, the auto-scale resource will look like this:

clip_image006

There is a rule with conditions for scaling out and scaling in based on the Percentage CPU metric. Before deploying, set your own desired levels in the capacity section (line 838 in this sample). The Percentage CPU thresholds I have set in this sample are ridiculously low just so something happens quickly, and if you’re impatient, you can make them even lower just to see scaling take place.
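Stripped down to the essentials, the autoscale resource looks something like the sketch below. The names, capacity numbers and thresholds here are illustrative only (the GitHub sample has the full version), so set them to values that make sense for your cluster:

    {
        "apiVersion": "2015-04-01",
        "type": "microsoft.insights/autoscalesettings",
        "name": "[concat(parameters('vmNodeType0Name'), 'autoscale')]",
        "location": "[parameters('computeLocation')]",
        "dependsOn": [
            "[concat('Microsoft.Compute/virtualMachineScaleSets/', parameters('vmNodeType0Name'))]"
        ],
        "properties": {
            "enabled": true,
            "targetResourceUri": "[resourceId('Microsoft.Compute/virtualMachineScaleSets', parameters('vmNodeType0Name'))]",
            "profiles": [
                {
                    "name": "DefaultProfile",
                    "capacity": { "minimum": "5", "maximum": "8", "default": "5" },
                    "rules": [
                        {
                            "metricTrigger": {
                                "metricName": "Percentage CPU",
                                "metricResourceUri": "[resourceId('Microsoft.Compute/virtualMachineScaleSets', parameters('vmNodeType0Name'))]",
                                "timeGrain": "PT1M",
                                "statistic": "Average",
                                "timeWindow": "PT5M",
                                "timeAggregation": "Average",
                                "operator": "GreaterThan",
                                "threshold": 10
                            },
                            "scaleAction": { "direction": "Increase", "type": "ChangeCount", "value": "1", "cooldown": "PT5M" }
                        },
                        {
                            "metricTrigger": {
                                "metricName": "Percentage CPU",
                                "metricResourceUri": "[resourceId('Microsoft.Compute/virtualMachineScaleSets', parameters('vmNodeType0Name'))]",
                                "timeGrain": "PT1M",
                                "statistic": "Average",
                                "timeWindow": "PT5M",
                                "timeAggregation": "Average",
                                "operator": "LessThan",
                                "threshold": 5
                            },
                            "scaleAction": { "direction": "Decrease", "type": "ChangeCount", "value": "1", "cooldown": "PT5M" }
                        }
                    ]
                }
            ]
        }
    }

Keep the minimum capacity at or above what your cluster's reliability tier requires (for example, five nodes for a Silver primary node type).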

Registering your Event Source provider

To register an event source provider, you need to add it to the Microsoft.Azure.Diagnostics (diagnostics extension) section of the Microsoft.Compute/virtualMachineScaleSets resource.

clip_image008
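Inside the diagnostics extension of the scale set (publisher Microsoft.Azure.Diagnostics, type IaaSDiagnostics), the relevant piece of WadCfg looks roughly like this. The provider name below is only an example and must match your EventSource name exactly:

    "WadCfg": {
        "DiagnosticMonitorConfiguration": {
            "overallQuotaInMB": "50000",
            "EtwProviders": {
                "EtwEventSourceProviderConfiguration": [
                    {
                        "provider": "MyCompany-MyApplication-Stateless1",
                        "scheduledTransferPeriod": "PT5M",
                        "DefaultEvents": {
                            "eventDestination": "WADETWEventTable"
                        }
                    }
                ]
            }
        }
    }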

You will of course need to make sure that the ‘provider’ listed above matches the EventSource name in your service’s ServiceEventSource.cs file. This is what I have for my stateless service that I’ll deploy to the cluster once it’s up and running.

clip_image010
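If you are working from the Visual Studio template, that name lives in the EventSource attribute at the top of ServiceEventSource.cs. A trimmed, hypothetical version looks like this:

    using System.Diagnostics.Tracing;

    // The Name here must match the 'provider' value in the WadCfg section above.
    [EventSource(Name = "MyCompany-MyApplication-Stateless1")]
    internal sealed class ServiceEventSource : EventSource
    {
        public static readonly ServiceEventSource Current = new ServiceEventSource();

        [Event(1, Level = EventLevel.Informational, Message = "{0}")]
        public void Message(string message)
        {
            if (this.IsEnabled())
            {
                this.WriteEvent(1, message);
            }
        }
    }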

Template Deployment

To deploy the Service Fabric cluster, there is a deploy.ps1 PowerShell script included, but you can deploy the ARM template with whatever script you normally use for your other ARM templates. Make sure you go through the process of creating your cluster certificate and putting it in Azure Key Vault first, though; you will need that information for any secure cluster.

Note that in this example on GitHub, the parameters file is named 'empty-sfmanageddisk.parameters.json', and to get deploy.ps1 to work, you need to remove the 'empty-' prefix from the name.

Once your cluster is up and running….confirm your auto-scale settings

Within your cluster resource group, click on the name of your VM scale set. Then click on the Scaling menu item. What you should see is something like this:

clip_image012

If you want to make changes to the levels for Scale out and Scale in, just click on the ‘When’ rules and an edit blade will appear where you can make those changes. Then click the Update button.

Next, to get notified of an auto-scale event, click on the Notify tab, enter an email address and then click the Save button.

clip_image013

Before moving on to the next step, make sure that your cluster nodes are up and running by clicking on the name of your cluster in your resource group and looking for a visual confirmation of available nodes:

clip_image015

Publish your Service Fabric application

For this example, I have a simple stateless ASP.NET Core 2.0 sample that uses the Kestrel communication listener. You can either choose a dynamic port or specify a fixed port number in your ServiceManifest.xml file. Publish the service to the cluster using Visual Studio 2017.
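For reference, a fixed-port endpoint in ServiceManifest.xml looks like the snippet below; the endpoint name and port are just examples, and leaving the Port attribute off lets Service Fabric assign a dynamic port from the cluster's application port range:

    <Resources>
      <Endpoints>
        <!-- Fixed port; remove the Port attribute to get a dynamically assigned port -->
        <Endpoint Name="ServiceEndpoint" Protocol="http" Port="8080" />
      </Endpoints>
    </Resources>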

In the WebAPI.cs file, you will notice some code that is commented out. I’ll use this in part II of this article when I discuss scaling via custom metrics.

You can find this code sample at https://github.com/larrywa/blogpostings/tree/master/SimpleStateless

 

Email from Azure during auto-scale event

Each time an auto-scale event happens, you will receive an email similar to the one below:

clip_image017

WARNING: The values I have set for my auto-scale ranges are extremely aggressive, just so I can see something happen quickly. In an actual production environment, you cannot expect rapid scale-up and scale-down of Service Fabric nodes. Remember, the VM for a node has to be created first, then the Service Fabric bits are installed on it, and then everything has to be spun up and registered before the rest of the Service Fabric magic happens. You are looking at several minutes for this to take place. If your scale-up and scale-down parameters are too aggressive, you could get the cluster into a situation where it is trying to spin up a node while taking down the same node. The node will then go into an unknown state that you will have to correct manually by turning off auto-scaling until things settle down.

[Service Fabric] Full/Incremental backup and restore sample

I recently worked with a client who had had trouble implementing incremental backup and restore for their stateful services, so I was set to the task of building a sample that does both full and incremental backups to Azure blob storage.

I fully expect great improvements to happen in Service Fabric in the coming months regarding backup and restore, but for now, the customer needed to get moving.

I’ve posted the Visual Studio solution on GitHub for you to look at if you need something like this:

https://github.com/larrywa/blogpostings/tree/master/SFBackup

I am using Visual Studio 2017 15.3.3 with the Service Fabric SDK v2.7.198.

Here are some notes on the code you’ll find in the sample. I’ve put as many comments in the code as seemed logical, but a little more description never hurts.

Stateful1.cs

  • At the top of the file is an int named backupCount – a counter that specifies how many counts to go through before the next full or incremental backup is taken.
  • In RunAsync, there is a bool, takeBackup, that keeps track of whether a backup needs to take place.
  • When takeBackup is true, based on the logic I’m using, a full backup takes place first and incremental backups after that. There is a flag, takeFullBackup, and an incremental counter, incrementalCount, that you will need to replace with your own logic for when you want to do backups and under what conditions (a sketch of this flow appears after these notes).
  • BackupCallbackAsync – this is the primary backup method. This method calls SetupBackupManager (discussed below) and ArchiveBackupAsync in the AzureBackupStore.cs file.  ArchiveBackupAsync is the method that will take the data that needs to be backed up, zip it up and push it out to blob storage. If this is a full backup, any files sitting in your blob storage container will be deleted because you should not need them at this point. (Here, you may want to do archiving if you are not sure of your backup strategy). For a stateful service with multiple partitions, you would generally have multiple blob containers, one for each partition.
  • SetupBackupManager – this method gets settings from the Settings.xml file in .\PackageRoot\Config to determine storage account keys, names etc. Then, an instance of the AzureBlobBackupManager class is created. This method is called right before you perform a backup or a restore just to make sure a connection is available.
  • OnDataLossAsync – this method is called when you execute .\SFBackup\Scripts\EnumeratePartitions.ps1. It calls the backup manager’s RestoreLatestBackupToTempLocation, which is responsible for:
    • Getting a list of the blobs in blob storage that make up the backup
    • Looping through each blob and downloading/extracting it to your hard drive (in a folder you specify).
    • Once unzipped, deleting the zip file from your hard drive
    • Returning the root directory under which you have the unzipped full backup plus all incremental backups. The correct structure of a full + incremental backup should look like this:

        Backup Folder (the root folder)
            Full (name may differ depending on your naming convention)
                SM
                LR
                Metadata
            Incremental0 (name may differ depending on your naming convention)
                LR
                Metadata
            Incremental1 (name may differ depending on your naming convention)
                LR
                Metadata

Next, RestoreAsync is called, which takes your root backup directory and cycles through the full backup and the incremental backups to do the restore. The root directory is then deleted after the restore.
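To make that flow concrete, here is a trimmed, hypothetical sketch of the plumbing in Stateful1.cs. The field names follow the descriptions above, the counts and paths are placeholders, and the blob work that AzureBlobBackupManager does in the real sample is left as comments:

    using System;
    using System.Fabric;
    using System.Threading;
    using System.Threading.Tasks;
    using Microsoft.ServiceFabric.Data;
    using Microsoft.ServiceFabric.Services.Runtime;

    internal sealed class Stateful1 : StatefulService
    {
        private readonly int backupCount = 5;   // counts between backups (placeholder value)
        private bool takeFullBackup = true;     // first backup is full, the rest incremental
        private int incrementalCount = 0;

        public Stateful1(StatefulServiceContext context) : base(context) { }

        protected override async Task RunAsync(CancellationToken cancellationToken)
        {
            int counter = 0;
            while (!cancellationToken.IsCancellationRequested)
            {
                // ... normal service work / reliable collection updates go here ...
                counter++;

                bool takeBackup = counter % this.backupCount == 0;
                if (takeBackup)
                {
                    BackupOption option = this.takeFullBackup ? BackupOption.Full : BackupOption.Incremental;
                    var description = new BackupDescription(option, this.BackupCallbackAsync);

                    // Snapshots the reliable state, then invokes BackupCallbackAsync so the
                    // backup folder can be zipped and shipped to blob storage.
                    await this.BackupAsync(description, TimeSpan.FromMinutes(10), cancellationToken);

                    this.takeFullBackup = false;
                    this.incrementalCount++;
                }

                await Task.Delay(TimeSpan.FromSeconds(1), cancellationToken);
            }
        }

        private Task<bool> BackupCallbackAsync(BackupInfo backupInfo, CancellationToken cancellationToken)
        {
            // backupInfo.Directory is the local folder Service Fabric just wrote; in the sample
            // this is handed to the backup manager (ArchiveBackupAsync) to zip and upload.
            return Task.FromResult(true);
        }

        protected override async Task<bool> OnDataLossAsync(RestoreContext restoreCtx, CancellationToken cancellationToken)
        {
            // In the sample, RestoreLatestBackupToTempLocation downloads and unzips the full +
            // incremental backups and returns the root folder; here it is just a placeholder path.
            string backupFolder = @"C:\SFBackupTemp\partition-id-goes-here";

            var restoreDescription = new RestoreDescription(backupFolder, RestorePolicy.Force);
            await restoreCtx.RestoreAsync(restoreDescription, cancellationToken);
            return true;
        }
    }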

Settings.xml

  • I have commented the parameters in this file so you’ll know what they mean. Note that you will need to add code to the app to programmatically create the directory your backup files are dropped into. I’m not calling this the ‘root’ directory, because the root directory IS created programmatically and its name will change based on the partition name/logic.
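A Settings.xml for this kind of sample usually carries the storage and restore-folder values in a single section, roughly like the sketch below. The section and parameter names are illustrative; use whatever names the constructor in AzureBackupStore.cs actually reads:

    <?xml version="1.0" encoding="utf-8" ?>
    <Settings xmlns:xsd="http://www.w3.org/2001/XMLSchema"
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xmlns="http://schemas.microsoft.com/2011/01/fabric">
      <Section Name="Stateful1.BackupSettings">
        <!-- Storage account that holds the backup blobs -->
        <Parameter Name="StorageAccountName" Value="mystorageaccount" />
        <Parameter Name="StorageAccountKey" Value="your-storage-key-here" />
        <!-- Blob container the zipped backups are pushed to -->
        <Parameter Name="BackupContainerName" Value="statefulbackup" />
        <!-- Local folder you create; the partition id is appended to build the restore root -->
        <Parameter Name="TempRestoreDir" Value="C:\SFBackupTemp" />
      </Section>
    </Settings>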

AzureBackupStore.cs

  • I won’t go into each method here because I’ve commented the utility methods pretty well, but one key thing to watch for is the constructor, which shows how to get the parameters you need out of the Settings.xml file. Note how tempRestoreDir (the folder YOU create on your hard drive) is combined with the partitionId name/number to form the actual ‘root’ folder, which is deleted after you have done the restore.
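Pulling those values out of Settings.xml looks roughly like the sketch below (the section and parameter names match the hypothetical Settings.xml above, and the context/partitionId arguments stand in for what the real constructor receives):

    using System;
    using System.Fabric;
    using System.Fabric.Description;
    using System.IO;

    internal static class BackupSettingsReader
    {
        // Sketch of the configuration read an AzureBlobBackupManager-style class would do.
        public static (string account, string key, string container, string restoreRoot)
            Read(StatefulServiceContext context, Guid partitionId)
        {
            ConfigurationPackage configPackage =
                context.CodePackageActivationContext.GetConfigurationPackageObject("Config");

            ConfigurationSection section = configPackage.Settings.Sections["Stateful1.BackupSettings"];

            string accountName    = section.Parameters["StorageAccountName"].Value;
            string accountKey     = section.Parameters["StorageAccountKey"].Value;
            string containerName  = section.Parameters["BackupContainerName"].Value;
            string tempRestoreDir = section.Parameters["TempRestoreDir"].Value;

            // The per-partition restore root: your temp folder + the partition id; deleted after a restore.
            string restoreRoot = Path.Combine(tempRestoreDir, partitionId.ToString());

            return (accountName, accountKey, containerName, restoreRoot);
        }
    }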

 

So, how do you run this application?

 

  1. Open the solution and build the app to get your NuGet packages pulled down.
  2. Create an Azure storage account with a container to use for blob storage backups.
  3. Create a directory on your hard drive to store the backup files.
  4. Fill in the information in Settings.xml.
  5. Open EnumeratePartitions.ps1 in PowerShell ISE.
  6. Press F5.

 

When you start running the app, you can set breakpoints in various places to see how it runs. For me, I let it go through a few backups and note where the count ends up (from the diagnostics window). Then, roughly halfway through another count run, I execute the PowerShell script. It will take a few seconds to trigger the restore.

The way the app is written, when you trigger a restore, it performs the restore and the next backup afterwards is a full backup, and the cycle starts again. Something you will notice as you do restores is that the service (the particular partition) will seem to freeze while the restore is taking place. This is because a restore is expected to be a disaster-recovery situation, where the service partition would not normally be available. Other partitions may still be running, but not the one being restored.

Hope this helps you in your backup and restore efforts!