Monday, March 31, 2014

Azure Outage Recovery

We use Microsoft Azure for our Integration and UAT environments for a variety of reasons. Recently we've been hit by updates to the host servers that leave the VMs on that host unavailable. Most of the time the outage lasts around 20 minutes, but sometimes the servers don't come back online and end up stuck in a "starting" state.

When they are stuck like this, you can't do anything with them in the management portal, so your only options are to wait it out or to get in via the Azure PowerShell command line. You can read about how to get started with that here.

Once you've done that, you can get started. If you're like us, you might have multiple subscriptions, and chances are the default one won't be the one that contains the problematic VM.

Show your list of VMs to validate that you're in the right subscription:

   > Get-AzureVM

If you need to change your subscription, run the command below:

   > Select-AzureSubscription -SubscriptionName "Name of Subscription here"
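If you can't remember what your subscriptions are called, something like the following should help. This is only a rough sketch using the classic (Service Management) Azure PowerShell cmdlets. The "ReadyRole" status value reflects my understanding of how that module reports a healthy, running VM, so treat it as an assumption and check it against the output you actually see.

```powershell
# List the subscriptions known to this PowerShell session.
Get-AzureSubscription | Select-Object SubscriptionName, SubscriptionId

# Switch to the subscription that holds the problematic VM (placeholder name).
Select-AzureSubscription -SubscriptionName "Name of Subscription here"

# Show only the VMs that are not reporting as ready (assumed status value: "ReadyRole").
Get-AzureVM |
    Where-Object { $_.Status -ne "ReadyRole" } |
    Select-Object ServiceName, Name, Status
```

A VM stuck in the "starting" state should show up in that last list with its service name, which you need for the next step.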

From there, it's pretty simple: just force the VM to stop.

   > Stop-AzureVM -Name NameofVM -ServiceName NameofService -Force

It will take up to 30 seconds and then come back and tell you it succeeded. From there you can go into the portal, refresh, and start the VM manually. Alternatively, you can start it with the Start-AzureVM command, but I recommend doing it in the portal for the peace of mind of seeing that it has stopped and then starting it by hand. In my experience, it can take up to 20 minutes to get the VM back online.
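If you would rather stay on the command line, the stop, start, and wait steps could be scripted roughly as follows. This is a sketch, not something battle-tested: NameofVM and NameofService are placeholders, the 30-second polling interval is an arbitrary choice, the 20-minute cap simply mirrors the recovery time mentioned above, and "ReadyRole" is again my assumption for the status of a running VM.

```powershell
# Placeholder names; replace with your own VM and cloud service.
$serviceName = "NameofService"
$vmName      = "NameofVM"

# Force the stuck VM to stop, then start it again.
Stop-AzureVM  -ServiceName $serviceName -Name $vmName -Force
Start-AzureVM -ServiceName $serviceName -Name $vmName

# Check the status every 30 seconds for up to 20 minutes.
$deadline = (Get-Date).AddMinutes(20)
do {
    Start-Sleep -Seconds 30
    $vm = Get-AzureVM -ServiceName $serviceName -Name $vmName
    Write-Host "$(Get-Date -Format T) $vmName status: $($vm.Status)"
} while ($vm.Status -ne "ReadyRole" -and (Get-Date) -lt $deadline)
```

If the loop ends without the status ever reaching the ready value, the VM is probably still stuck and it is worth checking the portal before trying again.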

I hope this is helpful to someone else.