[Service Fabric] Why won’t Visual Studio connect to my cluster?

In this blog post, I’ll discuss something that has frustrated me and many others for quite a while: the failure of Visual Studio to connect to an Azure Service Fabric cluster. We’ll be using Visual Studio 2017 as an example.

I’m not sure whether it’s a problem with the Visual Studio tooling, poor connectivity from Visual Studio out to the cluster, or something else entirely, but it happens quite frequently.

Requirements:

    1. You have a secure cluster up and running in Azure.
    2. You have your cluster certificate downloaded and imported to your development machine. The certificate can be self-signed in this case.

Scenario

I have a simple Service Fabric application up and running in Visual Studio on my dev machine and I’m ready to deploy it to my Azure Service Fabric cluster.

From within Visual Studio, right-click on your Service Fabric application and select the Publish menu item. From the publish dialog box, you will be expected to connect to your Azure account.

After a couple of seconds of a spinning cursor, you should see your cluster listed in the Connection Endpoint. Also, if you expand the Advanced Connection Parameters drop-down, you should see your certificate listed.

image

You can now publish to your cluster.

Up until now, life is great!

So now you decide to open up a different Service Fabric application within Visual Studio and you want to deploy that application to your cluster also.

Just like you did in the previous steps, right-click on the application name and select Publish. When you do, you see the dreaded ‘red-x’.

image

If you hover over the red-x, you will more than likely see the message ‘Failed to contact the server. Please try again later or get help from How to configure secure connections’.

Troubleshooting

Since you have connected to this same cluster before, you know you should be able to connect again. This time, however, you are in a different application with a different publish profile.

If you were on a totally different machine trying to connect to the cluster, the first thing to check would be the certificate. You would make sure you have the certificate imported, and you could also check the publish dialog box to see whether your certificate is listed in the Advanced Connection Parameters drop-down.

In our case, the actual solution is to close the Publish dialog box and open up the Cloud.xml file in the PublishProfiles folder.

If you scroll down in this file, you’ll see a ClusterConnectionParameters element. You need to confirm that all the settings, especially ConnectionEndpoint and ServerCertThumbprint/FindValue, have the correct information. I would say most of the time it will be the certificate thumbprint that is incorrect, but your mileage may vary. That’s what it was in this scenario.

image

Once you correct the certificate thumbprint and return to the Publish dialog box, you should see that you are able to publish to your cluster.

Suppose though that this doesn’t correct your problem. What else can you check?

  • You can open a PowerShell command prompt and run the Connect-ServiceFabricCluster cmdlet with the parameters you have in the Cloud.xml file to see if you can actually connect (see the sketch after this list).
  • Although you see the red-x, that warning in the pop-up really doesn’t help much. Go ahead and click the Publish button and attempt to publish. You may be surprised to find that you can actually publish and that Visual Studio for some reason just had not updated the dialog, OR the output window will show the actual error.
  • Another thing that could happen, if you have your client-to-node security set up with Azure Active Directory (AAD), is that you receive an access-denied error. In this case, if the application you are deploying is registered with AAD and your Azure login has no permissions to do anything with the app, you may not be able to publish the app to the cluster.
  • Last but not least, in one of my much earlier blog posts I demonstrated that I ran into an issue where I actually had to physically type the certificate thumbprint into the Advanced Connection Parameters thumbprint fields because of a whitespace character somewhere in the thumbprint I had pasted in.
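Here is a minimal sketch of that Connect-ServiceFabricCluster check, assuming a single cluster certificate imported into CurrentUser\My; the endpoint and thumbprint values are placeholders you would copy from Cloud.xml:

# Connect with X.509 security, using the same values found in Cloud.xml
$endpoint   = '<FQDN-of-your-cluster>:19000'
$thumbprint = '<cluster-cert-thumbprint>'

Connect-ServiceFabricCluster -ConnectionEndpoint $endpoint `
    -X509Credential `
    -ServerCertThumbprint $thumbprint `
    -FindType FindByThumbprint `
    -FindValue $thumbprint `
    -StoreLocation CurrentUser `
    -StoreName My

If the cmdlet connects, the problem is on the Visual Studio side; if it fails, the error message it returns is usually far more specific than the red-x in the dialog.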

Until next time, hope this helps you with your Service Fabric development!

[Service Fabric] Stateful Reliable Collection Restore –The New Way

In my previous blog post about how to do a Service Fabric stateful service backup, you learned how to set up the ARM template and use PowerShell scripts to create, enable, suspend and delete your backup policy.

I wanted to provide a separate blog post because although the PowerShell code for performing a Restore isn’t too complex, the amount of information you need and how to find that information is critical to your success in the Restore operation.

Pre-requisites (what I tested this on)

  • Visual Studio 2017 v15.9.4
  • Microsoft Azure Service Fabric SDK – 3.3.622

Code location:

https://github.com/larrywa/blogpostings/tree/master/SFBackupV2

It is assumed that you have read my previous blog posting and have your cluster set up, the Voting application installed and backups already collected in your Azure storage account’s blob storage container. After all, you can’t do a restore without a backup now, can you?

In the previous posting, I did an ‘application’ backup, which backed up every stateful service and partition in the application. I only had one stateful service with one partition, so technically, I could have just executed the REST API command for backing up a partition if I wanted to.

What you will see below is that we will restore to a particular partition in the running stateful service, which should help you understand how to find the backup of the partition to restore from. The restore operation will automatically look in the storage location specified by the backup policy, but you could also customize where the restore operation gets its data from.

Task 1 – Capturing the Partition ID

The first piece of information you are going to need is the partitionId of the partition you want to restore to.

To find your partitionId, log in to the Azure portal and go to your cluster resource. Open Service Fabric Explorer. In the explorer, expand the treeview to the fabric:/Voting/VotingData service and capture/copy the partitionId.

image
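If you prefer PowerShell to the Explorer, here is a hedged sketch that lists the partition IDs for the service, assuming you connect with the same cluster certificate used throughout these posts (the endpoint and thumbprint are placeholders):

Connect-ServiceFabricCluster -ConnectionEndpoint '<FQDN-of-your-cluster>:19000' `
    -X509Credential -ServerCertThumbprint '<cluster-cert-thumbprint>' `
    -FindType FindByThumbprint -FindValue '<cluster-cert-thumbprint>' `
    -StoreLocation CurrentUser -StoreName My

# Print the Id of every partition in the stateful service
Get-ServiceFabricPartition -ServiceName 'fabric:/Voting/VotingData' |
    ForEach-Object { $_.PartitionInformation.Id }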

Task 2 – Capturing the RestorePartitionDescription information

Now that you have your partitionId, you need to know about the information that is stored in the backup that you want to restore from. This is called the RestorePartitionDescription object. NOTE: If you do not provide this information, you will just get the last backup that took place.

The RestorePartitionDescription information includes the BackupId, BackupLocation and BackupStore (the BackupStore item is optional). But how do you get the BackupId and BackupLocation? You can do this by getting the partition backup list. Here is an example:

GET http://localhost:19080/Partitions/1daae3f5-7fd6-42e9-b1ba-8c05f873994d/$/GetBackups?api-version=6.4&StartDateTimeFilter=2018-01-01T00:00:00Z&EndDateTimeFilter=2018-01-01T23:59:59Z

The only parameters actually required in the GET request are the partitionId and the api-version. Leave the api-version at 6.4 until a later Service Fabric release updates this feature. When you run the command above, you will get back a list of backups for this partition, and from this list you can choose which backup you want to restore. The information in that backup entry is the restore partition description information.

Your output from the GetBackups call should look something like this:

{
  "ContinuationToken": "<app-or-service-info>",
  "Items": [
    {
      "BackupId": "3a056ac9-7206-43c3-8424-6f6103003eba",
      "BackupChainId": "3a056ac9-7206-43c3-8424-6f6103003eba",
      "ApplicationName": "fabric:/<your-app-name>",
      "ServiceManifestVersion": "1.0.0",
      "ServiceName": "fabric:/<your-app-name>/<partition-service-name>",
      "PartitionInformation": {
        "LowKey": "-9223372036854775808",
        "HighKey": "9223372036854775807",
        "ServicePartitionKind": "Int64Range",
        "Id": "1daae3f5-7fd6-42e9-b1ba-8c05f873994d"
      },
      "BackupLocation": "<your-app-name>\\<partition-service-name>\\<partitionId>\\<name-of-zip-file-to-restore>",
      "BackupType": "Full",
      "EpochOfLastBackupRecord": {
        "DataLossVersion": "131462452931584510",
        "ConfigurationVersion": "8589934592"
      },
      "LsnOfLastBackupRecord": "261",
      "CreationTimeUtc": "2018-01-01T09:00:55Z",
      "FailureError": null
    }
  ]
}

You can collect the BackupId and BackupLocation information by running the ListPartitionBackups.ps1 file in the root\Assets folder.
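As a rough illustration of what that script does, here is a hedged sketch of calling GetBackups directly against a secure cluster endpoint, assuming the cluster certificate is in CurrentUser\My (the cluster name, thumbprint and partitionId are placeholders):

$clusterName = '<clusterName>.<region>.cloudapp.azure.com'
$partitionId = '<your-partitionId>'
$cert = Get-Item 'Cert:\CurrentUser\My\<cluster-cert-thumbprint>'

# Escape the $ in the REST path so PowerShell does not try to expand it as a variable
$url = "https://${clusterName}:19080/Partitions/$partitionId/`$/GetBackups?api-version=6.4"
$backups = Invoke-RestMethod -Uri $url -Method Get -Certificate $cert

# BackupId and BackupLocation are the two values you need for the restore
$backups.Items | ForEach-Object { $_.BackupId; $_.BackupLocation }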

Task 3 – Restoring your partition

Now, you will need to run the Restore command and set up the body of your JSON request object with the RestorePartitionDescription information.

1. In the root\Assets folder, open the RestorePartitionBackup.ps1 script.

2. Fill in the parameters. If you scroll down below the parameters you can see how the JSON body is being constructed with the parameters you enter. Note that for the JSON that you are constructing, it is important to have the double-quotes around each JSON attribute as you see below.

image

image
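For reference, here is a hedged sketch of what that script boils down to, reusing the $clusterName, $partitionId and $cert assumptions from the GetBackups sketch above, with the BackupId and BackupLocation values you just captured:

$body = @{
    BackupId       = '<backupId-from-GetBackups>'
    BackupLocation = '<backupLocation-from-GetBackups>'
} | ConvertTo-Json

$url = "https://${clusterName}:19080/Partitions/$partitionId/`$/Restore?api-version=6.4"
Invoke-WebRequest -Uri $url -Method Post -Body $body -ContentType 'application/json' -Certificate $cert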

3. Save the script and then press F5 to execute the restore. What you should see in the PowerShell cmd window is something like this:

image

Task 4 – What happens during the restore

In my test, I started by running at least 1 full backup operation followed by 4 incrementals. Between backups, I changed the number of votes for Voter1 and Voter2. What I had was something like this:

image
I wanted to let the backups run at least past the 4th incremental backup, and then restore back to Incremental-2 (Voter1 = 6, Voter2 = 7).

But first, here are a couple of things to think about when doing a restore:

  • What happens when I choose to restore Incremental-2? In this case, Incremental-1, Incremental-2 and Full are restored to the reliable collection.
  • What happens to the partition when it’s being restored? Is it offline/unavailable? Yes, this partition is down. If you have an algorithm in your code that sends particular data to this partition, you are going to have to take this downtime into account.
  • If a backup happens to be taking place when I submit the command to do a restore, what happens? It depends on the timing of the calls, but since a restore is normally a planned IT/Operations task, you should let any backup that may be running finish before doing the restore.

On with the show…here is what my current screen looks like for the Voting app; I’m on Incremental-4:

image

With my partitionId available, along with the backupId and information specific to Incremental-2, I kick off the RestorePartitionBackup.ps1 script.

Here is what the stateful service looked like while the partition was being restored:

image

After a minute or so (the time will depend on the size of your data restore), this was the result, which is back to the Incremental-2 data values:

image

Summary

I realize this is a small sample with just a single partition being restored, but the exercise demonstrates the fundamentals of how a restore takes place and what happens during the process.

I hope this blog post can help you out with your own restore processes!

For more information on the REST API for the Restore command, go here https://docs.microsoft.com/en-us/rest/api/servicefabric/sfclient-api-restorepartition.

[Service Fabric] Stateful Reliable Collection Backup –The New Way

In my previous article on Service Fabric backup and restore, you saw that setting up the process via the APIs was pretty tedious, and that it was almost entirely developer driven (meaning C# coding). It was also a bit confusing to figure out how the full and incremental backups were stored in their folder structure(s).

Recently, the Service Fabric team announced general availability of a new method of backing up and restoring stateful reliable collections (it requires version 6.4 of the Service Fabric bits). You may see it described as a ‘no code’ method of performing a backup/restore; I’ll explain what that means below.

What this blog post provides is a complete project sample on how to set up and perform a backup (but not a restore), with PowerShell, an ARM template and code, to help you understand how to tie it all together. I’ll cover restore in a future post.

What do you mean by ‘No Code’?

You may see a description of the new backup/restore procedure that says it is ‘no code’. What that actually means is that there are two ways of configuring your backup/restore. One is using a C# API (this is what developers would use to configure/build the backup/restore process). The other is using PowerShell scripts, which is what is called the ‘no code’ option (even though technically it is still code). ‘No code’ simply means that developers are not the ones writing code to set things up.

Most customers I have worked with DO NOT want their developers to have control over the backup/restore process since it is considered to be an HA/DR/Operations procedure. Using PowerShell takes the configuration/deployment out of the compiled code and back into the IT operations realm. In fact, the developer may not even know a backup is being taken while the service is running. As I stated earlier, we’ll discuss Restore in a later post.

I tend to agree with the IT Operations folks on the idea that backup/restore should not be a developer’s focus, so that is the way we’ll do it in this blog post.

Pre-requisites (what I tested this on)

  • Visual Studio 2017 v15.9.5
  • Microsoft Azure Service Fabric SDK – 3.3.622

Code location:

https://github.com/larrywa/blogpostings/tree/master/SFBackupV2

SFBackupV2 will be known in this post as the ‘root’ folder.

Task 1: Creating your certificate and Azure Key Vault

1. In the root\Assets folder, you will find a PowerShell script named CreateVaultCerts.ps1. Open PowerShell ISE as an administrator and open this file.

2. At the top of the script, you will see several parameters that you need to fill in depending on your subscription/naming conventions.

image

3. After filling in your parameters, log in to Azure using the PowerShell command prompt window in the ISE editor by using the command:
Login-AzureRmAccount

Task 2: Deploy your cluster using the ARM template

We will be building a 3-node secure cluster via an ARM template. The cluster will be secured with a single certificate, and that same certificate will also be referenced by the backup configuration. We will not be using Azure Active Directory (AAD) in this sample. So what’s special about this template?

1. Using your editor of choice, open the ServiceFabricCluster.json file in the root\Assets folder. Although the template file is already set up appropriately, it’s important to understand some of the required settings.

In order to use the new Backup/Restore service, you need to have it enabled in the cluster. First, you need to be using the correct API version for Microsoft.ServiceFabric/clusters:
{
  "apiVersion": "2018-02-01",
  "type": "Microsoft.ServiceFabric/clusters",
  "name": "[parameters('clusterName')]",
  "location": "[parameters('clusterLocation')]",
  ...
}

2. Next, you need to enable the backup/restore service inside of your addonFeatures section of Microsoft.ServiceFabric/clusters:

"properties": {
  "addonFeatures": [
    "BackupRestoreService"
  ],

3. Next, add a section in the fabricSettings for the X.509 certificate used to encrypt the credentials. Here, we’ll just use the same certificate we use for the cluster to keep things simple.

"properties": {
  ...
  "addonFeatures": ["BackupRestoreService"],
  "fabricSettings": [{
    "name": "BackupRestoreService",
    "parameters": [{
      "name": "SecretEncryptionCertThumbprint",
      "value": "[Thumbprint]"
    }]
  }]
  ...
}

4. Now open the ServiceFabricCluster.parameters.json file located in the root\Assets folder. There are several parameters that need to be filled in. Any parameter value that is already filled in, leave that as is. You will also notice that there are parameter values needed that you should have from running Task 1 (cert thumbprint, source vault resource id etc).

image

NOTE: For the clusterName, you only need to provide the first part of the FQDN, that is, the <clusterName> portion of <clusterName>.eastus.cloudapp.azure.com.

One particular parameter to note is ‘osSkuName’. This is the size/class of VM that will be used for the cluster. At a minimum, this needs to be a Standard_D2_v2.

5. Once you have entered your parameter values, save the file and then open the deployment script (deploy.ps1) in the root\Assets folder in PowerShell ISE. In order to execute this script, you’ll need to know your subscriptionId, the resource group name you want your cluster to be created in, and the Azure region (data center). Press F5 to execute the script. It will take approximately 20 minutes to create the cluster.
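If you would rather run the deployment by hand, here is a minimal sketch of the underlying cmdlets, assuming the AzureRM module and that you have already run Login-AzureRmAccount (the resource group name and region are placeholders):

New-AzureRmResourceGroup -Name '<your-resource-group>' -Location '<region>'

New-AzureRmResourceGroupDeployment -ResourceGroupName '<your-resource-group>' `
    -TemplateFile '.\ServiceFabricCluster.json' `
    -TemplateParameterFile '.\ServiceFabricCluster.parameters.json' `
    -Verbose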

Task 3: Review your cluster services

    1. Log in to the Azure portal and go to the resource group where your Service Fabric cluster resides.
    2. Click on the name of your cluster and then in the cluster blade, click on the link to open the Service Fabric Explorer.
    3. Expand the Services item in the treeview and you should see a BackupRestoreService system service listed.

image

Task 4: Deploy the Voting application

1. Open Visual Studio 2017 as an administrator and then open the Voting.sln solution in the root\Voting folder.

2. Rebuild the project to ensure that all the NuGet packages have been restored.

3. Right-click on the Voting project and select Publish.

4. In the Publish dialog, pick your subscription and your cluster name. Make sure you have the Cloud.xml profile and parameters file selected. After you select your cluster name, you should see a green check once the VS publish mechanism connects to your cluster. If you see a red X instead, you can still try to publish and then look at the output to see what the actual error is. NOTE: If you see the red X, go into the PublishProfiles\Cloud.xml file and make sure your cluster name and certificate thumbprint are listed there:

<ClusterConnectionParameters ConnectionEndpoint="<FQDN-of-your-cluster>:19000" X509Credential="true" ServerCertThumbprint="<cluster-thumbprint>" FindType="FindByThumbprint" FindValue="<cluster-thumbprint>" StoreLocation="CurrentUser" StoreName="My" />

5. Click the Publish button to publish the app to your Service Fabric cluster.

6. Log in to the Azure portal and go to your cluster resource. You should be able to see that the application has been deployed (after a few minutes) and you also want to make sure it is healthy. This can be determined by seeing a green check beside the status.

image

7. Prior to creating and enabling our backup policy, you need to make sure you have an Azure Storage account set up with a blob container to capture the data being backed up. In this example, I am going to use one of the storage accounts that the Service Fabric cluster uses. Normally, this is a bad idea for many reasons (I/O usage, space consumed, etc.), so you can create your own separate storage account in your subscription if you wish. I will create a new blob container named ‘blobvotebackup’.

Task 5: Create and enable your backup policy

At this point, you have your cluster created, your app deployed and in a running healthy state. It’s time to create your backup policy and enable it.

1. In PowerShell ISE, open the Backup.ps1 file in the root\Assets folder.

2. There are several parameters to fill in here, and they are well commented. This script will create the backup policy and then enable it. Fill in your parameters.

image

If you scroll down through the script, you’ll see the configuration information for the backup policy.

#start setting up storage info
$StorageInfo = @{
    ConnectionString = $storageConnString
    ContainerName    = $containerName
    StorageKind      = 'AzureBlobStore'
}

# backup schedule info, backup every 5 minutes
$ScheduleInfo = @{
    Interval     = 'PT5M'
    ScheduleKind = 'FrequencyBased'
}

$retentionPolicy = @{
    RetentionPolicyType = 'Basic'
    RetentionDuration   = 'P10D'
}

# backup policy parameters
# After 5 incremental backups, do a full backup
$BackupPolicy = @{
    Name                  = $backupPolicyName
    MaxIncrementalBackups = 5
    Schedule              = $ScheduleInfo
    Storage               = $StorageInfo
    RetentionPolicy       = $retentionPolicy
}

  • Note that the ‘StorageKind’ is AzureBlobStore. You could also choose an on-premises file store.
  • Also note the Interval, which controls how often a backup is taken, and that the schedule is frequency-based. This could instead be a schedule for a certain time of day or an ad-hoc backup. I’m setting mine to 5 minutes just to get the sample code going.
  • There is a retention policy that keeps the data for 10 days.
  • MaxIncrementalBackups tells the policy how many incremental backups to take before doing a new full backup. The backup service always starts with a full backup on a newly enabled policy.

Since we are using a PowerShell script to create and enable the backup policy, we are calling the BackupRestore REST APIs directly. Notice where the Create call takes place, how the URL to create the policy is built, and the API version being used.

$url = "https://" + $clusterName + ":19080/BackupRestore/BackupPolicies/$/Create?api-version=6.4"

Farther down in the script you’ll see where the URL is being created for the EnableBackup command. Notice that we are specifying to back up at the ‘Application’ level, meaning that if the app had more than one stateful service, they would all use the same backup policy. You can also enable backup at a partition or service level.

$url = "https://" + $clusterName + ":19080/Applications/" + $appName + "/$/EnableBackup?api-version=6.4"
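To make the pattern concrete, here is a hedged sketch (not the exact code from Backup.ps1) of how the policy body built above could be posted and the backup then enabled, assuming the cluster certificate is in CurrentUser\My and that $clusterName, $appName, $backupPolicyName and $certThumbprint hold the parameter values you filled in:

$cert = Get-Item "Cert:\CurrentUser\My\$certThumbprint"
$body = $BackupPolicy | ConvertTo-Json -Depth 5

# Create the backup policy
$createUrl = "https://" + $clusterName + ":19080/BackupRestore/BackupPolicies/$/Create?api-version=6.4"
Invoke-WebRequest -Uri $createUrl -Method Post -Body $body -ContentType 'application/json' -Certificate $cert

# Enable the policy for the whole application
$enableUrl = "https://" + $clusterName + ":19080/Applications/" + $appName + "/$/EnableBackup?api-version=6.4"
$enableBody = @{ BackupPolicyName = $backupPolicyName } | ConvertTo-Json
Invoke-WebRequest -Uri $enableUrl -Method Post -Body $enableBody -ContentType 'application/json' -Certificate $cert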

3. Press F5 to execute the script and create/enable the backup policy. At this point, after 10 minutes, a backup will be created in the background.

Task 6: Confirm that data is being backed up

  1. Go back to the Azure portal, to your storage account, and drill down to your backup blob container. If you click on the backup blob container name (after waiting at least 5 minutes), you’ll see the folder structure the full/incremental backup process has created.

image

A couple of things to note:

  • You’ll see the blob container name in the upper left hand corner
  • You’ll see the ‘Location’, which contains the name of the blob container, the name of the app, the name of the service in the app and then the partitionId.
  • For each backup, you have a .bkmetadata and .zip file.

2. To get a complete list of the backups, open the ListBackups.ps1 script in the root\Assets folder.

3. Fill in the parameters, and press F5 to run the script. You should see a list of all the current backups: names, IDs, partition numbers, etc. This type of information will be important when you are ready to do a restore. Remember that each partition in a stateful service will have its own backup. You can also find a ListPartitionBackups.ps1 script in the root\Assets folder; just add your partitionId to the script parameters.

Below is a snapshot of the type of information you would see from running ListPartitionBackups.ps1:

image

Task 7 – Disable and Delete your backup policy

Now that you’ve had all the fun of seeing your services reliable collections being backed up in your blob container, you have a few choices. You can:

  • Suspend – this essentially just stops backups from being taken for a period of time. Suspension is thought of as a temporary action.
  • Resume – resuming a suspended backup
  • Enable – enabling a backup policy
  • Disable – use this when there is no longer a need to back up data from the reliable collection.
  • Delete – deletes the entire backup policy but your data still exists

One example of using a mix of the settings above is where you could enable a backup for an entire application but suspend or disable a backup for a particular service or partition in that application.

  1. The RemoveBackup.ps1 script in the root\Assets folder will suspend, disable and then delete the backup policy (the sketch after this list shows the endpoints involved). Depending on what you want to do at this point, set breakpoints within the PowerShell script to first suspend, then disable the backup policy. You will notice that there will be no more backups taking place.
  2. Once you are finished with your tests, continue the script to delete the backup policy.
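For reference, here is a hedged sketch of the endpoints such a script would call, reusing the $clusterName, $appName, $backupPolicyName and $cert assumptions from earlier:

# Suspend, then disable, backup for the whole application
$suspendUrl = "https://" + $clusterName + ":19080/Applications/" + $appName + "/$/SuspendBackup?api-version=6.4"
$disableUrl = "https://" + $clusterName + ":19080/Applications/" + $appName + "/$/DisableBackup?api-version=6.4"
Invoke-WebRequest -Uri $suspendUrl -Method Post -Certificate $cert
Invoke-WebRequest -Uri $disableUrl -Method Post -Certificate $cert

# Finally, delete the backup policy itself
$deleteUrl = "https://" + $clusterName + ":19080/BackupRestore/BackupPolicies/" + $backupPolicyName + "/$/Delete?api-version=6.4"
Invoke-WebRequest -Uri $deleteUrl -Method Post -Certificate $cert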

References

For further information on backup and restore, see https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-backuprestoreservice-quickstart-azurecluster

Client library usage https://github.com/Microsoft/service-fabric-client-dotnet/blob/develop/docs/ClientLibraryUsage.md

Upload a file from an Azure Windows Server Core machine to Azure Blob storage

The title for this could be a lot longer, like ‘how to upload a file using the Azure CLI to Azure storage on a Windows Server 2016 Core DataCenter’, because that’s what this blog post is about…but that’s a ridiculously long title. The point is, with a Core OS, you have no UI and very few of the capabilities and tools you would normally have on a full UI OS. Here’s what happened…

I was working on what appeared to be an issue with the new Azure Backup and Restore service on an Azure Service Fabric Cluster. I needed to generate a crash dump from the FabricBRS service and send it to Azure support.

I remote desktopped into the VM in my VM scale set that was being used to host the FabricBRS service, ran ‘taskmgr’ in the command prompt and then right-clicked on the FabricBRS service and selected ‘create dump file’.

image

The crash dump file is written to a location like C:\Users\<yourlogin>\AppData\Local\Temp\2\FabricBRS.DMP. And this is where the fun began. How do I get the file from here to my Azure Storage account?

1. To make it easy to find the file, I copied it from the above directory right to the C:\ drive on the machine.

2. I figured the easiest way to do this was to install the Azure CLI and then use the CLI with my storage account to upload the file to blob storage. To download the MSI file from https://aka.ms/installazurecliwindows, you could type ‘PowerShell’ in your current command prompt window, which starts PowerShell inside that prompt. However, what you actually need to do is open a new PowerShell window.

To do this, run the following command:

Invoke-Item C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe

3. Within the new PowerShell command prompt, to download the file, I ran the following commands:
$downloader = New-Object System.Net.WebClient
$downloader.DownloadFile("https://aka.ms/installazurecliwindows", "C:\azure-cli-2.0.54.msi")

You can do a ‘dir’ on your C:\ drive to make sure it’s there.

image

4. To install the Azure CLI, run the command:

Start-Process C:\azure-cli-2.0.54.msi

You will be taken through a wizard that should step you through the install process. When you are finished, you are SUPPOSED to be able to just type ‘az’ into the command prompt window and get back some response from the Azure CLI. However, I found that I instead had to go out to the Azure portal, restart my VM scale set and then log back in to the machine’s command prompt.

5. At this point, the Azure CLI should be available and you know where your crash dump file is. Now you need to upload it to your Azure Storage account, into a blob storage container.

6. Make sure you have a storage account and a specific container to upload the file into. Grab the name of your storage account and then the storage account key.
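If you don’t have the key handy, here is a hedged sketch of fetching it with the CLI you just installed (you will need to sign in first; the account and resource group names are placeholders):

az login --use-device-code
az storage account keys list --account-name <your-storage-account-name> --resource-group <your-resource-group> --output table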

7. Run the following command using the Azure CLI:

az storage blob upload --account-name <your-storage-account-name> --account-key <your-storage-account-key> --container-name <your-container-name> --file 'C:\FabricBRS.DMP' --name 'FabricBRS.DMP'

Note that each parameter is prefixed with a double hyphen ‘--’ with no space in between. Also, the ‘name’ parameter is the name you want the file to be shown as in your blob storage container.

I hope this blog post saves someone who needs to do the same type of exercise a bit of time!

[Service Fabric] Auto-scaling your Service Fabric cluster–Part II

In Part I of this article, I demonstrated how to set up auto-scaling on the Service Fabric cluster’s scale set based on a metric that is part of a VM scale set (Percentage CPU). This setting doesn’t have much to do with the applications running in your cluster; it’s just pure hardware scaling that may take place because of your services’ CPU consumption or something else consuming CPU.

There was a recent addition to the auto-scaling capability of a Service Fabric cluster that allows you to use an Application Insights metric, reported by your service, to control the cluster scaling. This capability gives you finer control, letting you choose which metric in which service provides the values that drive the scaling.

Creating the Application Insights (AI) resource

Using the Azure portal, the cluster resource group appears to have a few strange issues in how you create your AI resource. To make sure you build the resource correctly, follow these steps.

1. Click on the name of the resource group where your cluster is located.

2. Click on the +Add menu item at the top of the resource group blade.

3. Choose to create a new AI resource. In my case, I created a resource that used a General application type and used a different resource group that I keep all my AI resources in.

clip_image001

4. Once the AI resource has been created, you will need to retrieve and copy the Instrumentation Key. To do this, click on the name of your newly created AI resource and expand the Essentials menu item; you should see your instrumentation key here:

clip_image003

You’ll need the instrumentation key for your Service Fabric application, so paste it into Notepad or somewhere you can access it easily.

Modify your Service Fabric application for Application Insights

In Part I of this article, I was using a simple stateless service app https://github.com/larrywa/blogpostings/tree/master/SimpleStateless and inside of this code, I had commented out code for use in this blog post.

I left my old AI instrumentation key in-place so you can see where to place yours.

1. If you already have the solution from the previous article (or download it if you don’t), the solution already has the NuGet package for Microsoft.ApplicationInsights.AspNetCore added. If you are using a different project, add this NuGet package to your own service.

2. Open appsettings.json and you will see down at the bottom of the file where I have added my instrumentation key. Pay careful attention to where you put this information in this file, brackets, colons etc can really get you messed up here.

3. In the WebAPI.cs file, there are a couple of things to note:

a. The addition of the ‘using Microsoft.ApplicationInsights;’ statement.

b. Line 31 – Uncomment the line of code for our counter.

c. Line 34 – Uncomment the line of code that creates a new TelemetryClient.

d. Lines 43 – 59 – Uncomment the code that creates the custom metric named ‘CountNumServices’. Using myClient.TrackMetric, I can create a custom metric with any name I wish. In this example, I am counting up to 1200 (20 minutes of time) and then counting down from there to 300, 1 second at a time. Here, I am trying to provide ample time for the cluster to react appropriately to the scale-out and scale-in.

4. Rebuild the solution.

5. Deploy your solution to the cluster.

Here is a catch. To be able to set up your VM scale set to scale based on an AI custom metric, you have to deploy the code and start generating some of the custom metric data, otherwise the metric won’t show up as a selection. But if you currently have a scaling rule set, as in Part I, the nodes may still be reacting to those rules and not your new rules. That’s OK; we will change that once the app is deployed and running in the cluster. Also, you need to let the app run for about 10 minutes to start generating some of these metrics.

Confirming custom metric data in Application Insights

Before testing out our new custom metric, you need to make sure the data is being generated.

1. Open the AI resource that you recently created (the one whose instrumentation key you placed in the appsettings.json file).

2. Make sure you are on the Overview blade, then select the Metrics Explorer menu item.

3. On either of the charts, click on the very small Edit link at the upper right-hand corner of the chart.

4. In the Chart details blade, down toward the bottom, you should see a new ‘Custom’ section. Select CountNumServices.

clip_image004

5. Close the Chart details blade. It is handy to have this details chart so you can see what the current count is, to predict whether the cluster should be scaling out or in.

clip_image006

Modifying your auto-scaling rule set

1. Make sure you are on the Scaling blade for your VM scale set.

2. In the current rule set, either click on the current scale-out rule (Scale based on a metric) or click on the +Add rule link.

3. In my case, I clicked on Scale based on a metric, so we’ll go with that option. Select +Add a rule.

4. Here is the rule I created:

a. Metric source: Application Insights

b. Resource type: Application Insights

c. Resource: aiServiceFabricCustomMeric ~ this is my recently created AI resource

d. Time aggregation: Average

e. Metric name: CountNumServices ~ remember, if you don’t see it when you try to scroll and find the metric to pick from, you may not have waited long enough for data to appear.

f. Time grain: 1 minute

g. Time grain statistic: Average

h. Operator: Greater than

i. Threshold: 600

j. Duration: 5 minutes

k. Operation: Increase count by

l. Instance count: 2

m. Cool down: 1 minute

Basically, after the count reaches 600, increase by two nodes, then wait 1 minute before scaling out any further (note: in real production, this should be at least 5 minutes). In practice, the trigger value will be somewhat above 600 because we are using Average as the time aggregation and statistic.

5. Review your settings in the Scale rule options blade.

6. Select the Add button.

7. To create a scale in rule, I followed the same process but with slightly different settings:

a. Metric source: Application Insights

b. Resource type: Application Insights

c. Resource: aiServiceFabricCustomMeric ~ this is my recently created AI resource

d. Time aggregation: Average

e. Metric name: CountNumServices ~ remember, if you don’t see it when you try to scroll and find the metric to pick from, you may not have waited long enough for data to appear.

f. Time grain: 1 minute

g. Time grain statistic: Average

h. Operator: Less than

i. Threshold: 500

j. Duration: 5 minutes

k. Operation: Decrease count by

l. Instance count: 1

m. Cool down: 1 minute

Notice that while decreasing, I am decreasing at a lesser rate than when increasing. This is to just make sure we don’t make any dramatic changes that could upset performance.

8. Save the new rule set.

9. If you need to add or change who gets notified for any scale action, go to the Notify tab, make your change and save it.

10. At this point, you could wait for an email confirmation of a scale action or go into the Azure portal AI resource and watch the metric values grow. Remember, though, that what you see in the chart are ‘averages’, not a raw count of the increments. Scaling also uses averages.

In case you are curious what these rules would look like in an ARM template, it looks something like this:

clip_image008

I’ll provide the same warning here that I did in Part I. Attempting to scale out or in rapidly will cause instability in your cluster. When scaling out, you might consider increasing the number of nodes that are added at one time to keep from having to wait so long for performance improvements (if that is what you are going for).

[Service Fabric] Auto-scaling your Service Fabric cluster–Part I

Like most people, whenever I need to build an ARM template to do something with Service Fabric, I browse around GitHub or wherever else I can find the bits and pieces of JSON that I need.

I was recently working on a project where they needed 3 things:

1. They wanted their Service Fabric cluster to use managed disks instead of Azure Storage accounts.

2. They wanted to have auto-scaling setup for their virtual machine scale set (VMSS). In this case, we started out using the Percentage CPU rule, which is based on the VM scale set hardware and nothing application specific. In Part II, I’ll talk about auto-scaling via custom metrics and Application Insights.

3. They had a stateless service where they wanted to register their service event source to output logging information in to the WADETWEventTable.

This template includes those 3 implementations. You can find a link to the files here https://github.com/larrywa/blogpostings/tree/master/VMSSScaleTemplate. This is a secure cluster implementation so you’ll have to have your primary certificate in Azure Key Vault etc.

Using managed disks

Some of the newer templates you’ll find out on Github already use managed disks for the Service Fabric cluster, but in case the one you are using doesn’t, you need to find the location in your JSON file, in the Microsoft.Compute/virtualMachineScaleSets resource provider and make the following modifications.

clip_image002

Another important setting you need to make sure you have is the overProvision = false setting (here placed in a variable)

clip_image003

This variable is actually used in the Microsoft.Compute/virtualMachineScaleSets resource provider properties:

clip_image005

More information about overprovisioning can be found here https://docs.microsoft.com/en-us/rest/api/compute/virtualmachinescalesets/create-or-update-a-set, but if this setting is missing or set to true, you may see more than the requested number of machines and nodes created at deployment, and then the ones that are not in use are turned off. This will cause errors to appear in Service Fabric Explorer. Service Fabric will eventually clean up after itself, but when you first see the errors, you’ll think you did something wrong.

Setting Up Auto-scale on your VMSS

At first my idea was to go to my existing cluster, turn on auto-scaling inside of the VMSS settings and then export the template from the Azure portal. I then discovered that my subscription did not have permission to use the microsoft.insights resource provider. I’m not sure you’ll run into this, but if you do, you can enable it in the portal under Subscriptions -> your subscription -> Resource providers -> Microsoft.Insights.

The ‘microsoft.insights/autoscalesettings’ resource is placed at the same level in the JSON file as the other major resources like the cluster and the virtual machine scale set. Although auto-scale shows up as a setting on the scale set, it is not a sub-section of the VMSS resource; it is actually a separate resource, as shown in the next screenshot.

In the JSON outline editor, the auto-scale resource will look like this:

clip_image006

There is a rule with conditions for scaling up and scaling down based on Percentage CPU metrics. Before deployments, go in and set your own desired levels in the capacity area, line 838 in this sample. The values for the Percentage CPU I have set in this sample are ridiculously low just to be able to see something happen, and if you’re impatient, you can make them even lower just to see scaling take place.

Registering your Event Source provider

To register an event source provider, you need to add your event source provider to the Microsoft.Compute/virtualMachineScaleSets resource provider in the Microsoft.Azure.Diagnostics section.

clip_image008

You will of course need to make sure that the ‘provider’ listed above matches the EventSource name in your ServiceEventSource.cs file of your service. This is what I have for my stateless service that I’ll deploy to the cluster once it’s up and running.

clip_image010

Template Deployment

For deployment to create the Service Fabric cluster, there is a deploy.ps1 PowerShell script included, but you can really deploy the ARM template with whatever script you would normally use for your other ARM templates. Make sure you go through the process of creating your cluster certificate and put it in Azure Key Vault first though; you will need that type of information for any secure cluster.

Note that in this example on Github, the parameters file is named ’empty-sfmanageddisk.parameters.json’ and to get the deploy.ps1 to work, you need to get rid of the ’empty-‘ part of the name.

Once your cluster is up and running….confirm your auto-scale settings

Within your cluster resource group, click on the name of your VM scale set. Then click on the Scaling menu item. What you should see is something like this:

clip_image012

If you want to make changes to the levels for Scale out and Scale in, just click on the ‘When’ rules and an edit blade will appear where you can make those changes. Then click the Update button.

Next, to get notified of an auto-scale event, click on the Notify tab, enter an email address and then click the Save button.

clip_image013

Before moving on to the next step, make sure that your cluster nodes are up and running by clicking on the name of your cluster in your resource group and looking for a visual confirmation of available nodes:

clip_image015

Publish your Service Fabric application

For this example, I have a simple stateless ASP.Net Core 2.0 sample that uses the Kestrel communications listener. You can either choose a dynamic port or specify a specific port number in your ServiceManifest.xml file. Publish the service to the cluster using Visual Studio 2017.

In the WebAPI.cs file, you will notice some code that is commented out. I’ll use this in part II of this article when I discuss scaling via custom metrics.

You can find this code sample at https://github.com/larrywa/blogpostings/tree/master/SimpleStateless

 

Email from Azure during auto-scale event

Each time an auto-scale event happens, you will receive an email similar to the one below:

clip_image017

WARNING: The values that I have set for my auto-scale ranges are extremely aggressive. This is just so that initially I can see something happening. In an actual production environment, you cannot expect rapid scale-up and scale-down of Service Fabric nodes. Remember, a node’s machine is first created, then the Service Fabric bits are installed on the VM, and then all of that has to be spun up, registered, plus the rest of the Service Fabric magic. You are looking at several minutes of time for this to take place. If your scale-up and scale-down parameters are too aggressive, you could actually get the cluster into a situation where it is trying to spin up a node while taking down the same node. Your node will then go into an unknown state, which you will have to correct manually by turning off auto-scaling until things settle down.

[Service Fabric] Full/Incremental backup and restore sample

I recently worked with a client who said that they had had trouble implementing incremental backup and restore for their stateful services, so I was set to the task of building a sample that would do both full and incremental backups to Azure blob storage.

I fully expect great improvements to happen in Service Fabric in the coming months regarding backup and restore, but for now, the customer needed to get moving.

I’ve posted the Visual Studio solution on GitHub for you to look at if you are needing something like this:

https://github.com/larrywa/blogpostings/tree/master/SFBackup

I am using Visual Studio 2017 15.3.3 with the Service Fabric SDK v2.7.198.

Here are some notes for the code you’ll find in the sample. I’ve put as many comments as seem logical within the code, but more description is always better.

Stateful1.cs

  • At the top of the file is an int named backupCount – this is a counter that specifies how many counts occur before the next full or incremental backup is taken.
  • In RunAsync, there is a bool, takeBackup, that keeps track of whether a backup needs to take place.
  • When takeBackup is true, based on the logic I’m using, a full backup will take place first and, after that, incremental backups. There is a flag, takeFullBackup, and an incremental counter, incrementalCount, which you will need to replace with your own logic about when you want to do backups and under what conditions.
  • BackupCallbackAsync – this is the primary backup method. This method calls SetupBackupManager (discussed below) and ArchiveBackupAsync in the AzureBackupStore.cs file.  ArchiveBackupAsync is the method that will take the data that needs to be backed up, zip it up and push it out to blob storage. If this is a full backup, any files sitting in your blob storage container will be deleted because you should not need them at this point. (Here, you may want to do archiving if you are not sure of your backup strategy). For a stateful service with multiple partitions, you would generally have multiple blob containers, one for each partition.
  • SetupBackupManager – this method gets settings from the Settings.xml file in .\PackageRoot\Config to determine storage account keys, names etc. Then, an instance of the AzureBlobBackupManager class is created. This method is called right before you perform a backup or a restore just to make sure a connection is available.
  • OnDataLossAsync – this method is called when you execute the .\SFBackup\Scripts\EnumeratePartitions.ps1 script. This method calls the backup manager’s RestoreLatestBackupToTempLocation, which is responsible for:
    • Getting a list of the blobs in blob storage that make up the backup
    • Looping through each blob and downloading/extracting it to your hard drive (in a folder you specify).
    • Once unzipped, deleting the zip file from your hard drive
    • Returning the root directory under which you have the unzipped full backup plus all incremental backups. The correct structure of a full + incremental backup should look like this:

      Backup Folder (the root folder)/
        Full/ (name may be different depending on your naming convention)
          SM/
          LR/
          Metadata
        Incremental0/ (name may be different depending on your naming convention)
          LR/
          Metadata
        Incremental1/ (name may be different depending on your naming convention)
          LR/
          Metadata

Next, the RestoreAsync command is called, which looks at your root backup directory and then cycles through the full backup and the incremental backups to do the restore. The root directory is deleted after the restore.
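For context, OnDataLossAsync only runs when data loss is invoked on a partition. Here is a hedged sketch of triggering it by hand for a single partition (this is not the repo’s EnumeratePartitions.ps1 script; the service name and partition ID are placeholders):

# Connect to the local dev cluster (add certificate parameters for a secure Azure cluster)
Connect-ServiceFabricCluster

# Invoke data loss on one partition, which causes OnDataLossAsync (and the restore) to run
Start-ServiceFabricPartitionDataLoss -OperationId (New-Guid) `
    -ServiceName 'fabric:/<your-app-name>/<your-stateful-service>' `
    -PartitionId '<your-partitionId>' `
    -DataLossMode FullDataLoss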

Settings.xml

  • For the parameters in this file, I have added comments so you’ll know what they mean. Note that the directory your backup files are dropped into is not created for you; you will need to create it yourself (as in the steps below) or add code to the app to create it programmatically. I’m not calling this the ‘root’ directory, because the root directory IS created programmatically and its name will change based on the partition name/logic.

AzureBackupStore.cs

  • I won’t go into each method here because I’ve commented the utility methods pretty well, but one key thing to watch for is the constructor, which shows you how to get the parameters you need out of the Settings.xml file. Note how the tempRestoreDir (the folder YOU create on your hard drive) is combined with the partitionId name/number to form the actual ‘root’ folder, which is later deleted after you have done the restore.

 

So, how do you run this application?

 

  1. Open the solution and build the app to get your NuGet packages pulled down.
  2. Create an Azure storage account with a container to use for blob storage backup
  3. Create a directory on your hard drive to store the backup files
  4. Fill in the information in settings.xml
  5. Open EnumeratePartitions.ps1 in PowerShell ISE
  6. Press F5.

 

When you start running the app, you can set breakpoints in various places to see how it runs. For me, I let it go through a few backups and write down where the count ended up (from the diagnostics window). Then, approximately halfway through another count run, I execute the PS script. It will take a few seconds to trigger the restore.

The way the app is written, when you trigger a restore it does the restore, and the next backup after the restore is a full backup, and so on. Something you will notice as you do restores is that the service (the particular partition) will seem to freeze while the restore is taking place. This is because a restore is expected to be a disaster-recovery situation in which the service partition would not normally be available. Other partitions may still be running, but not the one being restored.

Hope this helps you in your backup and restore efforts!

[Service Fabric] Securing an Azure Service Fabric cluster with Azure Active Directory via the Azure Portal

To make sure appropriate credit is given to a great article on this topic, I did take some of the information for the steps below from this article https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-cluster-creation-via-arm. The article found at the link focuses on using an ARM template to do the deployment and secure setup. Using ARM would be the way you would want to set this up in a production environment.

My post is going to be using a more manual approach where you set the cluster up via the portal, for those of you who are just testing this out and want to learn how things are done via the portal. Some of the steps for setting this up in the portal will be short and will point to another article for portal setup of the cluster, but it’s the security setup we want to focus on.

So we will discuss:

  • Setup of your cluster certificate
  • Setup of Azure AD
  • Azure Cluster creation
  • Testing your Admin and Read-only user login

Setting up your cluster certificate

My purpose for using Azure Active Directory (AAD) was to set up an admin user and a read-only user so they could access Service Fabric Explorer with those permissions. You still need a cluster certificate to secure the cluster, though.

  1. Open PowerShell ISE as an Administrator.
  2. In the PowerShell command window, log in to your Azure subscription using ‘Login-AzureRMAccount’. When you do this, in the command window you will see the subscriptionID. You need to copy the subscriptionID, because you will need that in the PowerShell script to create the Azure AD application. Also copy the tenantID value.
  3. Run the following PS script. This script creates a new resource group for your key vault, a key vault, a self-signed certificate and a secret. You will need to record the information that appears in the PS command prompt output window after the successful execution of this script. Fill in the variables with your own values. Note that it is usually best to save the script code below to a PS file first.
#-----You have to change the variable values-------#
# This script will:
#  1. Create a new resource group for your key vaults
#  2. Create a new key vault
#  3. Create, export and import (to your certificate store) a self-signed certificate
#  4. Create a new secret and put the cert in key vault
#  5. Output the values you will need to supply to your cluster for cluster cert security.
#     Make sure you copy these values before closing the PowerShell window
#--------------------------------------------------#

#Name of the Key Vault service
$KeyVaultName = "<your-vault-name>" 
#Resource group for the Key-Vault service. 
$ResourceGroup = "<vault-resource-group-name>"
#Set the Subscription
$subscriptionId = "<your-Azure-subscription-ID>" 
#Azure data center locations (East US", "West US" etc)
$Location = "<region>"
#Password for the certificate
$Password = "<certificate-password>"
#DNS name for the certificate
$CertDNSName = "<name-of-your-certificate>"
#Name of the secret in key vault
$KeyVaultSecretName = "<secret-name-for-cert-in-vault>"
#Path to directory on local disk in which the certificate is stored  
$CertFileFullPath = "C:\<local-directory-to-place-your-exported-cert>\$CertDNSName.pfx"

#If more than one under your account
Select-AzureRmSubscription -SubscriptionId $subscriptionId
#Verify Current Subscription
Get-AzureRmSubscription -SubscriptionId $subscriptionId


#Creates a new resource group and Key Vault
New-AzureRmResourceGroup -Name $ResourceGroup -Location $Location
New-AzureRmKeyVault -VaultName $KeyVaultName -ResourceGroupName $ResourceGroup -Location $Location -sku standard -EnabledForDeployment 

#Converts the plain text password into a secure string
$SecurePassword = ConvertTo-SecureString -String $Password -AsPlainText -Force

#Creates a new selfsigned cert and exports a pfx cert to a directory on disk
$NewCert = New-SelfSignedCertificate -CertStoreLocation Cert:\CurrentUser\My -DnsName $CertDNSName 
Export-PfxCertificate -FilePath $CertFileFullPath -Password $SecurePassword -Cert $NewCert
Import-PfxCertificate -FilePath $CertFileFullPath -Password $SecurePassword -CertStoreLocation Cert:\LocalMachine\My 

#Reads the content of the certificate and converts it into a json format
$Bytes = [System.IO.File]::ReadAllBytes($CertFileFullPath)
$Base64 = [System.Convert]::ToBase64String($Bytes)

$JSONBlob = @{
    data = $Base64
    dataType = 'pfx'
    password = $Password
} | ConvertTo-Json

$ContentBytes = [System.Text.Encoding]::UTF8.GetBytes($JSONBlob)
$Content = [System.Convert]::ToBase64String($ContentBytes)

#Converts the json content a secure string
$SecretValue = ConvertTo-SecureString -String $Content -AsPlainText -Force

#Creates a new secret in Azure Key Vault
$NewSecret = Set-AzureKeyVaultSecret -VaultName $KeyVaultName -Name $KeyVaultSecretName -SecretValue $SecretValue -Verbose

#Writes out the information you need for creating a secure cluster
Write-Host
Write-Host "Resource Id: "$(Get-AzureRmKeyVault -VaultName $KeyVaultName).ResourceId
Write-Host "Secret URL : "$NewSecret.Id
Write-Host "Thumbprint : "$NewCert.Thumbprint

The information you need to record will appear similar to this:

Resource Id: /subscriptions/<your-subscriptionID>/resourceGroups/<your-resource-group>/providers/Microsoft.KeyVault/vaults/<your-vault-name>

Secret URL : https://<your-vault-name>.vault.azure.net:443/secrets/<secret>/<generated-guid>

Thumbprint : <certificate-thumbprint>

Setting up Azure Active Directory

  1. To secure the cluster with Azure AD, you will need to decide which AD directory in your subscription you will be using. In this example, we will use the ‘default’ directory. In step 2 above, you should have recorded the ‘tenantID’. This is the ID associated with your default Active directory. NOTE: If you have more than one directory (or tenant) in your subscription, you are going to have to make sure you get the right tenantID from your AAD administrator. The first piece of script you need to save to a file named Common.ps1 is:
<#
.VERSION
1.0.3

.SYNOPSIS
Common script, do not call it directly.
#>

if($headers){
    Exit
}

Try
{
    $FilePath = Join-Path $PSScriptRoot "Microsoft.IdentityModel.Clients.ActiveDirectory.dll"
    Add-Type -Path $FilePath
}
Catch
{
    Write-Warning $_.Exception.Message
}

function GetRESTHeaders()
{
	# Use common client 
    $clientId = "1950a258-227b-4e31-a9cf-717495945fc2"
    $redirectUrl = "urn:ietf:wg:oauth:2.0:oob"
    
    $authenticationContext = New-Object Microsoft.IdentityModel.Clients.ActiveDirectory.AuthenticationContext -ArgumentList $authString, $FALSE

    $accessToken = $authenticationContext.AcquireToken($resourceUrl, $clientId, $redirectUrl, [Microsoft.IdentityModel.Clients.ActiveDirectory.PromptBehavior]::RefreshSession).AccessToken
    $headers = New-Object "System.Collections.Generic.Dictionary[[String],[String]]"
    $headers.Add("Authorization", $accessToken)
    return $headers
}

function CallGraphAPI($uri, $headers, $body)
{
    $json = $body | ConvertTo-Json -Depth 4 -Compress
    return (Invoke-RestMethod $uri -Method Post -Headers $headers -Body $json -ContentType "application/json")
}

function AssertNotNull($obj, $msg){
    if($obj -eq $null -or $obj.Length -eq 0){ 
        Write-Warning $msg
        Exit
    }
}

# Regional settings
switch ($Location)
{
    "china"
    {
        $resourceUrl = "https://graph.chinacloudapi.cn"
        $authString = "https://login.partner.microsoftonline.cn/" + $TenantId
    }

    default
    {
        $resourceUrl = "https://graph.windows.net"
        $authString = "https://login.microsoftonline.com/" + $TenantId
    }
}

$headers = GetRESTHeaders

if ($ClusterName)
{
    $WebApplicationName = $ClusterName + "_Cluster"
    $WebApplicationUri = "https://$ClusterName"
    $NativeClientApplicationName = $ClusterName + "_Client"
}

You do not need to execute this script directly; it will be called by the next script, so make sure Common.ps1 is saved in the same folder as the next script.
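
As a side note, if you did not record the tenant ID earlier, you can look it up from the same PowerShell session you used for the Key Vault steps. This is just a sketch, assuming you are still signed in with the AzureRM module (property names vary slightly between AzureRM versions):

#Each subscription lists the AAD tenant (directory) ID that it trusts
Get-AzureRmSubscription | Select-Object TenantId

#The current context also carries the tenant information
(Get-AzureRmContext).Tenant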

2.  Create a new script file and paste in the code below. Name this file SetupApplications.ps1. Note that you will need to record some of the output from the execution of this script (explained below) for later use.

<#
.VERSION
1.0.3

.SYNOPSIS
Setup applications in a Service Fabric cluster Azure Active Directory tenant.

.PREREQUISITE
1. An Azure Active Directory tenant.
2. A Global Admin user within tenant.

.PARAMETER TenantId
ID of tenant hosting Service Fabric cluster.

.PARAMETER WebApplicationName
Name of web application representing Service Fabric cluster.

.PARAMETER WebApplicationUri
App ID URI of web application.

.PARAMETER WebApplicationReplyUrl
Reply URL of web application. Format: https://<Domain name of cluster>:<Service Fabric Http gateway port>

.PARAMETER NativeClientApplicationName
Name of native client application representing client.

.PARAMETER ClusterName
A friendly Service Fabric cluster name. Application settings generated from cluster name: WebApplicationName = ClusterName + "_Cluster", NativeClientApplicationName = ClusterName + "_Client"

.PARAMETER Location
Used to set metadata for specific region: china. Ignore it in global environment.

.EXAMPLE
. Scripts\SetupApplications.ps1 -TenantId '4f812c74-978b-4b0e-acf5-06ffca635c0e' -ClusterName 'MyCluster' -WebApplicationReplyUrl 'https://mycluster.westus.cloudapp.azure.com:19080'

Setup tenant with default settings generated from a friendly cluster name.

.EXAMPLE
. Scripts\SetupApplications.ps1 -TenantId '4f812c74-978b-4b0e-acf5-06ffca635c0e' -WebApplicationName 'SFWeb' -WebApplicationUri 'https://SFweb' -WebApplicationReplyUrl 'https://mycluster.westus.cloudapp.azure.com:19080' -NativeClientApplicationName 'SFnative'

Setup tenant with explicit application settings.

.EXAMPLE
. $ConfigObj = Scripts\SetupApplications.ps1 -TenantId '4f812c74-978b-4b0e-acf5-06ffca635c0e' -ClusterName 'MyCluster' -WebApplicationReplyUrl 'https://mycluster.westus.cloudapp.azure.com:19080'

Setup and save the setup result into a temporary variable to pass into SetupUser.ps1
#>

Param
(
    [Parameter(ParameterSetName='Customize',Mandatory=$true)]
    [Parameter(ParameterSetName='Prefix',Mandatory=$true)]
    [String]
    $TenantId,

    [Parameter(ParameterSetName='Customize')]
    [String]
    $WebApplicationName,

    [Parameter(ParameterSetName='Customize')]
    [String]
    $WebApplicationUri,

    [Parameter(ParameterSetName='Customize',Mandatory=$true)]
    [Parameter(ParameterSetName='Prefix',Mandatory=$true)]
    [String]
    $WebApplicationReplyUrl,

    [Parameter(ParameterSetName='Customize')]
    [String]
    $NativeClientApplicationName,

    [Parameter(ParameterSetName='Prefix',Mandatory=$true)]
    [String]
    $ClusterName,

    [Parameter(ParameterSetName='Prefix')]
    [Parameter(ParameterSetName='Customize')]
    [ValidateSet('china')]
    [String]
    $Location
)

Write-Host 'TenantId = ' $TenantId

. "$PSScriptRoot\Common.ps1"

$graphAPIFormat = $resourceUrl + "/" + $TenantId + "/{0}?api-version=1.5"
$ConfigObj = @{}
$ConfigObj.TenantId = $TenantId

$appRole = 
@{
    allowedMemberTypes = @("User")
    description = "ReadOnly roles have limited query access"
    displayName = "ReadOnly"
    id = [guid]::NewGuid()
    isEnabled = "true"
    value = "User"
},
@{
    allowedMemberTypes = @("User")
    description = "Admins can manage roles and perform all task actions"
    displayName = "Admin"
    id = [guid]::NewGuid()
    isEnabled = "true"
    value = "Admin"
}
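
# Delegated permission requested on the AAD Graph API (resourceAppId 00000002-0000-0000-c000-000000000000);
# 311a71cc-e848-46a1-bdf8-97ff7156d8e6 is the "Sign in and read user profile" (User.Read) permission.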

$requiredResourceAccess =
@(@{
    resourceAppId = "00000002-0000-0000-c000-000000000000"
    resourceAccess = @(@{
        id = "311a71cc-e848-46a1-bdf8-97ff7156d8e6"
        type= "Scope"
    })
})

if (!$WebApplicationName)
{
	$WebApplicationName = "ServiceFabricCluster"
}

if (!$WebApplicationUri)
{
	$WebApplicationUri = "https://ServiceFabricCluster"
}

if (!$NativeClientApplicationName)
{
	$NativeClientApplicationName =  "ServiceFabricClusterNativeClient"
}

#Create Web Application
$uri = [string]::Format($graphAPIFormat, "applications")
$webApp = @{
    displayName = $WebApplicationName
    identifierUris = @($WebApplicationUri)
    homepage = $WebApplicationReplyUrl #Not functionally needed. Set by default to avoid AAD portal UI displaying error
    replyUrls = @($WebApplicationReplyUrl)
    appRoles = $appRole
}

switch ($Location)
{
    "china"
    {
        $oauth2Permissions = @(@{
            adminConsentDescription = "Allow the application to access " + $WebApplicationName + " on behalf of the signed-in user."
            adminConsentDisplayName = "Access " + $WebApplicationName
            id = [guid]::NewGuid()
            isEnabled = $true
            type = "User"
            userConsentDescription = "Allow the application to access " + $WebApplicationName + " on your behalf."
            userConsentDisplayName = "Access " + $WebApplicationName
            value = "user_impersonation"
        })
        $webApp.oauth2Permissions = $oauth2Permissions
    }
}

$webApp = CallGraphAPI $uri $headers $webApp
AssertNotNull $webApp 'Web Application Creation Failed'
$ConfigObj.WebAppId = $webApp.appId
Write-Host 'Web Application Created:' $webApp.appId

#Service Principal
$uri = [string]::Format($graphAPIFormat, "servicePrincipals")
$servicePrincipal = @{
    accountEnabled = "true"
    appId = $webApp.appId
    displayName = $webApp.displayName
    appRoleAssignmentRequired = "true"
}
$servicePrincipal = CallGraphAPI $uri $headers $servicePrincipal
$ConfigObj.ServicePrincipalId = $servicePrincipal.objectId

#Create Native Client Application
$uri = [string]::Format($graphAPIFormat, "applications")
$nativeAppResourceAccess = $requiredResourceAccess +=
@{
    resourceAppId = $webApp.appId
    resourceAccess = @(@{
        id = $webApp.oauth2Permissions[0].id
        type= "Scope"
    })
}
$nativeApp = @{
    publicClient = "true"
    displayName = $NativeClientApplicationName
    replyUrls = @("urn:ietf:wg:oauth:2.0:oob")
    requiredResourceAccess = $nativeAppResourceAccess
}
$nativeApp = CallGraphAPI $uri $headers $nativeApp
AssertNotNull $nativeApp 'Native Client Application Creation Failed'
Write-Host 'Native Client Application Created:' $nativeApp.appId
$ConfigObj.NativeClientAppId = $nativeApp.appId

#Service Principal
$uri = [string]::Format($graphAPIFormat, "servicePrincipals")
$servicePrincipal = @{
    accountEnabled = "true"
    appId = $nativeApp.appId
    displayName = $nativeApp.displayName
}
$servicePrincipal = CallGraphAPI $uri $headers $servicePrincipal

#OAuth2PermissionGrant

#AAD service principal
$uri = [string]::Format($graphAPIFormat, "servicePrincipals") + '&$filter=appId eq ''00000002-0000-0000-c000-000000000000'''
$AADServicePrincipalId = (Invoke-RestMethod $uri -Headers $headers).value.objectId

$uri = [string]::Format($graphAPIFormat, "oauth2PermissionGrants")
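# Grant 1: pre-consent for the native client to sign users in and read their profile (User.Read) against the AAD Graph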
$oauth2PermissionGrants = @{
    clientId = $servicePrincipal.objectId
    consentType = "AllPrincipals"
    resourceId = $AADServicePrincipalId
    scope = "User.Read"
    startTime = (Get-Date).ToUniversalTime().ToString("yyyy-MM-ddTHH:mm:ss.fffffff")
    expiryTime = (Get-Date).AddYears(1800).ToUniversalTime().ToString("yyyy-MM-ddTHH:mm:ss.fffffff")
}
CallGraphAPI $uri $headers $oauth2PermissionGrants | Out-Null
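# Grant 2: pre-consent for the native client to call the cluster web application on behalf of the signed-in user (user_impersonation)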
$oauth2PermissionGrants = @{
    clientId = $servicePrincipal.objectId
    consentType = "AllPrincipals"
    resourceId = $ConfigObj.ServicePrincipalId
    scope = "user_impersonation"
    startTime = (Get-Date).ToUniversalTime().ToString("yyyy-MM-ddTHH:mm:ss.fffffff")
    expiryTime = (Get-Date).AddYears(1800).ToUniversalTime().ToString("yyyy-MM-ddTHH:mm:ss.fffffff")
}
CallGraphAPI $uri $headers $oauth2PermissionGrants | Out-Null

$ConfigObj

#ARM template
Write-Host
Write-Host '-----ARM template-----'
Write-Host '"azureActiveDirectory": {'
Write-Host ("  `"tenantId`":`"{0}`"," -f $ConfigObj.TenantId)
Write-Host ("  `"clusterApplication`":`"{0}`"," -f $ConfigObj.WebAppId)
Write-Host ("  `"clientApplication`":`"{0}`"" -f $ConfigObj.NativeClientAppId)
Write-Host "},"

3.  Execute the following command from the PS command prompt window:

.\SetupApplications.ps1 -TenantId '<your-tenantID>' -ClusterName '<your-cluster-name>.<region>.cloudapp.azure.com' -WebApplicationReplyUrl 'https://<your-cluster-name>.<region>.cloudapp.azure.com:19080/Explorer/index.html'

The ClusterName is used to prefix the AAD applications created by the script. It does not need to match the actual cluster name exactly; it is only intended to make it easier for you to map AAD artifacts to the Service Fabric cluster they are used with. This can be a bit confusing because you haven’t created your cluster yet. But, if you know what name you plan to give your cluster, you can use it here.

The WebApplicationReplyUrl is the default endpoint that AAD returns your users to after they complete the sign-in process. You should set this to the Service Fabric Explorer endpoint for your cluster, which by default is:

https://<cluster_domain>:19080/Explorer

For the full set of AAD helper scripts, see http://servicefabricsdkstorage.blob.core.windows.net/publicrelease/MicrosoftAzureServiceFabric-AADHelpers.zip.

Record the information at the bottom of the command prompt window. You will need this information when you deploy your cluster from the portal. The information will look similar to what you see below.

"azureActiveDirectory": {
  "tenantId":"<Your-AAD-tenantID>",
  "clusterApplication":"1xxxxxxxx-x68e-490a-89c8-2894e4b8686a",
  "clientApplication":"xxxxxxx-7825-4e1e-a586-f0ff8d9e679e"
},

NOTE: You may receive a warning about a missing assembly when the script runs. You can ignore this warning.

4.  After you run the script in step 3, log in to the classic Azure portal at https://manage.windowsazure.com. For now you need to use the classic Azure portal because the Azure Active Directory management features in the production portal are still in preview.

5.  Find your Azure Active Directory in the list and click on it.

clip_image001

6.  Add two new users to your directory. Name them whatever you want, as long as you know which one will be the Admin user and which one will be the read-only user. Make sure to record the password that is initially generated for each user, because the first time you log in to the portal as that user, you will be asked to change the password.

7.  Within your AAD, click on the Applications menu. In the Show drop-down box, pick Applications My Company Owns and then click the check button to the right to run the search.

clip_image002

8.  You should see two applications listed: one of type Native client application and one of type Web application. Click on the name of the web application. Since our connectivity test will connect to the Service Fabric Explorer web UI, this is the application we need to set user permissions on.

clip_image003

9.  Click on the Users menu.

10.  Click on the user name that should be the administrator and then select the Assign button at the bottom of the portal window.

11.  In the Assign Users dialog box, pick Admin from the dropdown box and select the check button.

clip_image004

12.  Repeat steps 10 and 11, but this time select the read-only user and pick ReadOnly from the Assign Users drop-down. This completes what you need to do in the classic portal, so you can close the classic portal window.

13.  You now have all the information you need to create your cluster in the portal. Log in to the Azure Portal at https://portal.azure.com.

Creating your Service Fabric cluster

  1. Create a new resource group and then, within that resource group, start the process of adding a new Service Fabric cluster. As you step through creating the cluster, you will work through four core blades that need information from you:
    • Basics – unique name for the cluster, operating system, username/password for RDP access, etc.
    • Cluster configuration – node type count, node configuration, diagnostics, etc.
    • Security – this is where we want to focus in the next step.
    • Summary – validation and review before the cluster is created.

NOTE: If you want more details about creating your Service Fabric cluster via the portal, see https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-cluster-creation-via-portal. Be aware that the procedures at that link use certificates for Admin and Read-only user setup, not AAD.

2.  In the cluster Security blade, make sure the Security mode is set to Secure. It is by default.

3.  The output from the first PowerShell script you executed (the Key Vault resource ID, secret URL, and certificate thumbprint) contains the values you need for the Primary certificate section. After you enter your recorded information, make sure to select the Configure advanced settings checkbox.

clip_image005

4.  Selecting the Configure advanced settings checkbox expands the blade vertically; scroll down in the blade to find the area where you enter the Azure Active Directory information. This is where the values you recorded when you executed SetupApplications.ps1 are used.

clip_image006

5.  Select the Ok button in the Security blade.

6.  Complete the Summary blade after the portal validates your settings.

Testing your Admin and Read-only user access

  1. Once the cluster creation completes, make sure you log out of the Azure portal. This ensures that when you attempt to log in as the Admin or Read-only user, you will not accidentally log in as the subscription administrator.
  2. Log in to the portal as either the Admin or the Read-only user. You will need to change the temporary password you were provided earlier, and then the login will complete.
  3. Open a new browser window and log in to https://<yourfullclustername>:19080/Explorer/. Test the Explorer functionality. (If you prefer to test from PowerShell, see the snippet below.)
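
A quick way to verify connectivity from PowerShell is the Connect-ServiceFabricCluster cmdlet with the -AzureActiveDirectory switch. The sketch below assumes the Service Fabric SDK is installed on your machine and that you substitute your own cluster address and the server certificate thumbprint you recorded earlier; a browser window will prompt you to sign in as one of the AAD users you created.

#Connects using Azure Active Directory; a sign-in prompt will appear for the Admin or Read-only user
Connect-ServiceFabricCluster -ConnectionEndpoint '<yourfullclustername>:19000' `
    -AzureActiveDirectory `
    -ServerCertThumbprint '<certificate-thumbprint>'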

 

Hope this helps you in your work with Azure Service Fabric!

[Service Fabric] Azure Service Fabric with Azure Diagnostics and NLog log retrieval

I was recently working with a customer where we were migrating some of their Azure Cloud Services over to Azure Service Fabric.

Within their current Cloud Services, there is a process where NLog will drop a log file to a Local Resource folder. Azure Diagnostics is then used to pick up this log file and place it in a container in Azure storage for another process to pick up for processing.

With Service Fabric, we do not have the same concept of a Local Resource folder. Also, with Service Fabric (or, more precisely, with Azure Resource Manager deployments), the only way to configure Azure Diagnostics settings is through an ARM template.

In order for the NLog process of dropping the file to a folder to work end to end, the following JSON code has to be added to the ARM template:

clip_image001

To find where to put the above code, locate the Service Fabric VM scale set in the ARM template (shown below in Visual Studio), select it, and then scroll down in the JSON to where you see "publisher": "Microsoft.Azure.Diagnostics":

clip_image002

The blob storage account, blob container name, node folder and VM Scale set node name can be parameterized for more flexibility.

At this point in time, what cannot be parameterized or modified is how Azure Diagnostics places the file into the blob container. Azure Diagnostics sets up a specific folder structure under the container that looks like this:

clip_image003
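
Because Azure Diagnostics always writes into that fixed folder hierarchy, the downstream process simply needs to enumerate the blobs underneath it. Here is a minimal sketch of listing and downloading the transferred log files with the Azure storage cmdlets; the storage account name, key, container name, and local destination folder are placeholders you would replace with your own values:

#Builds a storage context for the diagnostics storage account
$ctx = New-AzureStorageContext -StorageAccountName "<your-storage-account>" -StorageAccountKey "<your-storage-key>"

#Lists the log files Azure Diagnostics has transferred into the container
$blobs = Get-AzureStorageBlob -Container "<your-container-name>" -Context $ctx

#Downloads each blob locally, preserving the folder structure shown above
foreach ($blob in $blobs)
{
    Get-AzureStorageBlobContent -Container "<your-container-name>" -Blob $blob.Name -Destination "C:\Temp\Logs" -Context $ctx -Force | Out-Null
}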

For more information on some of the differences between Cloud Services and Service Fabric, check out:

https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-cloud-services-migration-differences

 

Hope this helps you in your work with Azure Service Fabric!