Monday, March 31, 2014

Azure Outage Recovery

We use Microsoft Azure for our Integration and UAT environments for a variety of reasons. We've recently been hit by updates to the host servers that leave the VMs on that host unavailable. Most of the time this is an outage that lasts around 20 minutes, but sometimes the servers don't come back online and end up stuck in a "starting" state.

When they are stuck like this you can't do anything with them in the management portal, so your only options are to wait it out or get in via the Azure PowerShell command line. You can read on how to get started with that here.

Once you've done that you can now get started. If you're like us, you might have multiple subscriptions and chances are, the default one won't be the one that has the VM that is being problematic.

Show your list of VMs to validate you're in the right subscription.

   > Get-AzureVM

If you need to change your subscription run the command below.

   > Select-AzureSubscription -SubscriptionName "Name of Subscription here"

From there it's pretty simple to just force the stuck VM to stop.

   > Stop-AzureVM -Name NameofVM -ServiceName NameofService -Force

It will take up to 30 seconds and then come back and tell you it succeeded. From there you can go into the portal, refresh, and then start it manually. Alternatively you can start it with the Start-AzureVM command, but I recommend doing it in the portal for the peace of mind of seeing it stopped and then starting it by hand. In my experience it can take up to 20 minutes to get the VM back online.

I hope that this is helpful to anyone else.

Wednesday, January 25, 2012

Nerding Out: File system issues

I've been chasing down an issue where my backup software (Vembu Storegrid, who provide terrific support) would hang and become unresponsive. Their support team logged in and helped me figure out that part of the disk appears to become unresponsive in high I/O (reading from and writing to the disk) situations, like when doing backups. They suggested running a repair.

So I logged in and wanted to shut down things and unmount the drive to prevent corruption.

(First I check the device name for the mount I want to unmount: /data, which is /dev/sdb1)

[root@sys-util-1 /]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda2              68G   45G   21G  69% /
tmpfs                 2.0G     0  2.0G   0% /dev/shm
/dev/sdb1             2.8T  2.6T  195G  94% /data


(Now I try and unmount it, but it's showing as busy)
[root@sys-util-1 /]# umount /data
umount: /data: device is busy
umount: /data: device is busy

(So now I try and force unmount it with no luck)
[root@sys-util-1 /]# umount -f /data
umount2: Device or resource busy
umount: /data: device is busy
umount2: Device or resource busy
umount: /data: device is busy

(Next I ran a "lazy" unmount, which detaches the mount point right away and finishes the unmount once it's no longer in use)
[root@sys-util-1 /]# umount -l /data
[root@sys-util-1 /]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda2              68G   45G   21G  69% /
tmpfs                 2.0G     0  2.0G   0% /dev/shm

(Now df shows the device as unmounted, so I wanted to run the repair. Because the lazy unmount doesn't finish until the processes using the drive let go, it fails with the error below)
[root@sys-util-1 /]# xfs_repair /dev/sdb1
xfs_repair: /dev/sdb1 contains a mounted filesystem
fatal error -- couldn't initialize XFS library

(I decided to try a basic check first instead but the result was a fail as well)
[root@sys-util-1 /]# xfs_check /dev/sdb1
xfs_check: /dev/sdb1 contains a mounted and writable filesystem
fatal error -- couldn't initialize XFS library

(So I mounted /data back up again and ran the fuser command to find out which applications were trying to hold open connections to the drive and then I killed them and confirmed that they went peacefully)
[root@sys-util-1 /]# mount /data
[root@sys-util-1 /]# fuser -vm /dev/sdb1
                     USER        PID ACCESS COMMAND
/dev/sdb1:           root       4567 f.... nautilus
                     root       4590 f.... trashapplet
                     root       4890 ..c.. bash
[root@sys-util-1 /]# kill 4567
[root@sys-util-1 /]# kill 4590
[root@sys-util-1 /]# kill 4890
[root@sys-util-1 /]# fuser -vm /dev/sdb1

(Next I unmounted the drive again and ran the repair. We are in business.)
[root@sys-util-1 /]# umount /data
[root@sys-util-1 /]# xfs_repair /dev/sdb1
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
I won't bother you with the rest. After googling around, I didn't find anyone who had clearly laid out how to deal with these errors, so I wanted to put something good out in the universe to hopefully help some others.
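
For anyone who wants the sequence above in one place, here is a minimal sketch rather than a finished tool. It assumes the same mount point and device as in this post (/data and /dev/sdb1). The run and repair_steps helpers and the DRY_RUN switch are made up for illustration. By default it only prints the commands so nothing is touched until you set DRY_RUN=0 as root.

```shell
#!/usr/bin/env bash
# Sketch only: show what holds the mount open, unmount it, then repair it.
# MOUNT_POINT and DEVICE default to the values used in this post.
# DRY_RUN=1 (the default) prints each command instead of running it.

MOUNT_POINT="${MOUNT_POINT:-/data}"
DEVICE="${DEVICE:-/dev/sdb1}"
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: run a command, or just print it in dry-run mode.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

repair_steps() {
    # 1. List every process holding files open on the filesystem.
    #    Review the list and stop those processes by hand before continuing.
    run fuser -vm "$MOUNT_POINT"

    # 2. Plain unmount. If it still reports "busy", stop here instead of
    #    falling back to a lazy unmount, which leaves the filesystem in use.
    run umount "$MOUNT_POINT" || return 1

    # 3. Repair the now-unmounted XFS filesystem.
    run xfs_repair "$DEVICE"
}

repair_steps
```

The sketch deliberately does not kill anything on its own. In my case the holders were only a file browser, the trash applet, and a stray shell, but on another box it could be something you care about.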

Tuesday, October 25, 2011

Battlefail!

I love first person shooter games. They were my real gateway drug into gaming and I can't resist a good session of left clicking on face to get stress out. I played and loved Battlefield 2 so when Battlefield 3 was announced, I was pumped. The launch trailer looked amazing and I couldn't wait especially because I have the Nvidia NVision 3D glasses and a great Acer GD245HQ 3D Monitor to play it in.

I pre-ordered through the new EA Origin system so I could get some perks, and then I waited. The release was scheduled for 2am on October 25th, 2011. To my surprise, I received an email saying it would be released a day early, that same day. I tried to play to no avail, and it turned out that I still had to wait until midnight Eastern time. No biggy....



So, the time comes, and the irony all sets in. I've got Origin which is similar to the awesome Steam engine for managing games, licenses, purchases, accounts, etc. This was a familiar concept to have an app to manage that, and then the app for the game itself. Here is where it gets funky. I double click to play right? Well it kicked up another app that had to validate that the time to play had been reached before it would unlock it. Now we are at 3 apps.

So it was now the official release time and that app went away and my web browser launched to the "Battlelog" page. I closed it thinking that it was weird but nothing happened. So I waited a few, then told Origin to go again. Again Firefox pulled up the Battlelog site. Confused, I start reading. It turns out that I now have to install an ActiveX plugin to use the site. Make that 4 apps including my browser, plus ActiveX = 5 applications to play a game.

After doing this and trying to get into multi-player (because I like to left-click on real people's faces) I just get a prompt that spins saying it's trying to match-make. After about 10 minutes of waiting, I gave up and went into the campaign mode. This works like a champ from my browser, through my ActiveX plugin, to Origin to validate my license, to start the game.

The game content is amazing, beautiful, and impressive. I have fun for about 20 minutes in breathtaking, terrific 3D and decide to try and play online again. This time, it works and I'm off to play. Again, the physics, graphics, and gameplay are terrific. I enjoy myself for another 20 minutes and then go to bed. Keep in mind, at this point it's about 12:10am central time and I'm a grown man of the average gamer age who has a job and real life responsibilities and needs to be professional. I can't just drag in there and half-ass it all day.

The day goes by and I get home, feed the kids, clean, and do all my grown up duties. It's now 7pm and I'm thinking I'll get in half an hour or so of gaming before I catch back up on email and all the other things I have to do. I'm assuming I'm pretty normal in this regard. Below is a screenshot of what I was greeted with upon trying to launch for the first time.

Ummm, excuse me? I just tried to play for the first time since I logged out and went to bed last night. This is a situation where a business is not aligned. The game is great, but the various silos and business units apparently didn't work together very well to implement all of this. I like the ideas, but 5 different applications to play a game is a mess. You have the content and game creators, then the licensing people with Origin, then you have the marketing and social people working on the web interface.

I'm on a PC but I'm following dozens of others on the other platforms who can't play and are just as pissed about the Battlefail launch. With an open beta for all the pre-paid users, a timed release schedule and launch, you would think that this would be smoother. It's a great idea of how they are trying to bring in all the channels but it's a #battlefail in execution.

Those who know me know that I'm a very forgiving person, and I don't like to complain. I'm not even pissed, but this is something that I had to point out as a very poor execution. I love competition and in that spirit, I don't even mind having to log in to something other than Steam for my games. You gave me a great way to import my friends through Facebook connect, etc. Coke wouldn't be Coke without Pepsi, but don't make us go through 5 layers of applications just to play our game and get access to the content we just paid $59.99 for.

Now I'm off to figure out how to get my account unlocked so I can left click on some face. :)

Friday, October 21, 2011

SSD Hard Drive Benchmarking Testing and Tuning - Part II

So, it's been a whole 30 minutes since I published my last blog and I already have the answers I was seeking. As many of you fellow bloggers know, the act of writing something down, thinking it through, and revisiting it even a few minutes later grants great clarity. 


I had the idea after posting and tweeting that the bus speed of the SATA 3 controller card might not be enough. The market on Amazon and NewEgg is littered with controllers that are PCI Express 1x. I started looking at posts, and speeds on this port and sure enough, that is what my new drive was topping out at. 


The on board SATA 2 controller didn't have that limitation so it was able to go faster. My options are to find a controller card that is more than 1x so it has more bandwidth or to update my motherboard to one that has SATA 3 on board. If I'm going to do that, I might as well upgrade everything. Given that I'm an upgrade-every-three-years kind of guy, I'm going to sit tight, live with it, and wait until this time next year.
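
As a rough check on the bus-speed theory above, here is a back-of-the-envelope sketch of the theoretical ceilings involved. It assumes only the published line rates and the 8b/10b encoding that SATA 2, SATA 3, PCI Express 1.x, and PCI Express 2.0 all use. The link_mb_per_s helper is made up for illustration. I haven't confirmed which PCI Express generation my 1x slot and card actually negotiated, so both are shown, and real-world throughput lands below these numbers.

```shell
#!/usr/bin/env bash
# Theoretical payload bandwidth of an 8b/10b-encoded serial link, in MB/s.
# $1 = raw line rate in megabits per second.
link_mb_per_s() {
    # 8b/10b puts 10 bits on the wire for every 8 bits of data,
    # so each data byte costs 10 line bits.
    echo $(( $1 / 10 ))
}

printf '%-26s %4d MB/s\n' "SATA 2 (3 Gbit/s)"        "$(link_mb_per_s 3000)"
printf '%-26s %4d MB/s\n' "SATA 3 (6 Gbit/s)"        "$(link_mb_per_s 6000)"
printf '%-26s %4d MB/s\n' "PCIe 1.x, 1x (2.5 GT/s)"  "$(link_mb_per_s 2500)"
printf '%-26s %4d MB/s\n' "PCIe 2.0, 1x (5 GT/s)"    "$(link_mb_per_s 5000)"
```

A single first-generation lane works out to 250 MB/s, which is below the 300 MB/s ceiling of SATA 2, and even a second-generation lane at 500 MB/s can't feed the full 600 MB/s of SATA 3. Either way, a 1x card sits between the drive and the speed on the box.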


Standards are always changing and technology is always evolving. I'm patient enough with my money to know that I'll have many more options and the costs will be cheaper if I wait. For now, I'm going to appreciate my disk score in WEI going from 5.9 to 7.8. That is after all a pretty huge leap. 


I went down the road of reviewing how many slots my motherboard has and while there are three of the PCI Express 2.0 slots, they can only work in the mode of 16x16x0, 16x0x16 (my configuration), or 16x8x8. This means that if I wanted to tap into that third slot, I'd hurt my video card performance. While I don't game on all four of my monitors at once, I do want to keep them all optimal as I do occasionally flip on SLI and turn on 3D.


The cards that I could put into my machine were in the price range of over $200 and up to $800. That is just too much even for this nerd and enthusiast. I wish I had better news to report but it is what it is. I'm pretty happy but in reality, I could have saved about 30 bucks getting the normal version and not worrying about GT. I wish that the drive manufacturers would release more real world information but I also understand why they wouldn't.


As a side note, I'll be honest and say that I didn't know that the total bandwidth among the PCI Express 2.0 ports was limited and you couldn't do 16x per port. I had an eSATA controller card plugged into my middle one, which was robbing one of my video cards and keeping it at 8x instead of the full 16x it should have been. I swapped that into the 1x slot that I vacated by ganking the SATA 3.0 card, to keep the video card at max. I see no use in the SATA 3.0 card if the 2.0 has more bandwidth and I have the available slots. I'm sure you would agree.

SSD Hard Drive Benchmarking Testing and Tuning - Part I

To many people, I’m just a nerd and most of them don’t know what an SSD drive is and why they are taking over. In short, Solid State Drives are a much faster technology for hard drives that is more expensive but also smaller. They have no moving parts and are used in many portable devices such as iPods, smart phones, and tablets. Throughout this article, I’ve put links to key terms in case you want to learn more.

Starting out with my old Patriot Torqx 60GB drive, my Windows Experience Index (WEI) was at a 5.9 as my slowest feature for disk performance. I was determined to bring this up.

The reason I selected this drive and purchased it from NewEgg is that it had just a bit extra performance, was red and matched the red Corsair Dominator memory that I have, and was much cheaper at NewEgg than elsewhere. The drive also won a performance award for “Fastest/Best Performance”.

If you are wondering why I’m writing this, it’s because I knew that getting the kind of performance that is promised wouldn’t be easy, but I’m determined to do my best trying. :) Upon base installation of the drive my WEI score for disk went from 5.9 to 7.4.

Enabling AHCI Mode
This is the most important part, as you will get blue screens and other issues if you just try to enable it. The Start value for the default Microsoft driver (msahci) needs to be set to 0. If it’s set to 3 it’s in IDE mode. You can see the path in registry in the screenshot. If you have an Intel chip, you also have to change the iaStorV entry the same way.


If you’re wondering what AHCI mode is, it’s an acronym that stands for “Advanced Host Controller Interface”, which is designed to work with faster equipment and is used instead of the standard IDE interface. Integrated Drive Electronics is the legacy technology with some limits that AHCI gets past.

Trim Enabled
Trim is a new technology created for SSD drives so that their speed can be increased and their lives extended. SSD drives have a finite number of writes available to them, so the Trim command handles deletes and garbage collection differently. If you are running Windows 7 and have an SSD, you should enable this.

Modern drives will automatically be detected and have this enabled but it doesn’t hurt to double check.
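Running the fsutil commands below from an elevated command prompt will check the current setting and enable Trim in Windows 7.

fsutil behavior query DisableDeleteNotify
fsutil behavior set DisableDeleteNotify 0

DisableDeleteNotify = 1 (Windows TRIM commands are disabled)
DisableDeleteNotify = 0 (Windows TRIM commands are enabled)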




Test #1


SATA 3 - Marvell Controller (I/O Connectivity Syba SY-PEX40032)
SATA 3 – Green SATA Cable



Test #2
In test two, I simply swapped out the green “SATA 3” cable for a standard run of the mill red SATA cable that comes with most hard drives. This cable was pre-SATA 3 and I wanted to see if that would make a difference. I had heard that the cables didn’t matter and this was simply a marketing gimmick. It turns out the performance was almost identical.



Test #3
In this test, I switched over to my onboard SATA 2 controller. As an FYI, I kept blue screening and was unable to boot until I switched to the bottom set of SATA ports that were previously for RAID only. After scratching my head on this and staring at the BIOS screen, I noticed that the only place I could change things to AHCI was on the SATA-RAID setting, which gave me the clue to switch to the bottom set of ports.

I’ve got a MSI MF980-G65 motherboard which I also bought at NewEgg. I don’t believe they make it any more as it’s ancient at two years old. It was near top of the line when I bought it. I always buy right under the top of the line because I’m not made of money. This board has been solid since I bought it and was the best choice for me when I got two Nvidia MSI GeForce FTX 360 Core video cards. Many of the motherboards were either Intel/Nvidia or AMD/Radeon.



As you can see, my performance really jumped by quite a bit. If you notice, each of the tests starts to top out. What I am beginning to think is that my SATA 3 controller is a piece of junk and I’ll need to go premium to get the most performance. I decided to re-run the WEI and I was very happy to see the disk performance jump to a 7.8 out of 7.9.



Next Steps
From what I can tell here, I’m still capped at the top speeds SATA 2 is capable of. I need to get a better SATA 3 controller to take full advantage of this drive. I’ll update this post as soon as I have more information. I’m on the hunt for a new controller. :)

-Paul @SolarCurve Drew

References
Googling helped for most of this but the Corsair forums were very helpful with a consistent voice from Yellowbeard who is the community manager. I am impressed with their products and their support of their users. 

Saturday, March 26, 2011

Ignite Dallas #3

I recently spoke at Ignite Dallas #3 about intelligence and learning. The video was released and can be seen here or below. I've heard from quite a few people about this and the feedback was interesting. It seems to have resonated with most people but some took issue with it.

The people who get it understand that my speech reflected traits in ALL of us. We meet expectations and only rise to the challenges presented to us unless we get inspired and self-motivated about something. Most of the time, that is NOT work. It is what it is, and that is why I said a few times to apply this "student for life" mentality to things you need to do and not just what you want to do.

I've been told by others who used to work for me, with me, or were just friends that they felt as if I was picking on them specifically. I wasn't talking about ANYONE in particular and in fact the speech was built to touch on EVERYONE because we are all guilty of many of these things even if we never would admit it to others.

I'm very proud of the experience and still stand firmly behind my message. I'm curious if any part resonated with you, and if you have any comments about it. I'm also curious if you've come across any systems or ways to keep yourself motivated and learning. Am I just an idealist or do others agree with my message?



Links:
Toastmasters (Public Speaking)
MIT Free Learning
Wikipedia (Online learning encyclopedia)
Google (Everything else, yes Google works when you want to learn and understand)

Thursday, March 17, 2011

Testing Brightcove

I picked up a Brightcove account and I'm playing around with embedding a video into my blog. I'm very impressed with their platform and after evaluating many others, this is the best that I've seen. It's very affordable for a business that wants to do more than just YouTube. While there is nothing wrong with YouTube, it has limitations and this allows you to monetize and control access to your videos.