So I logged in, intending to shut things down and unmount the drive to prevent corruption.
(First I check the device name behind the mount point I want to unmount: /data, which is /dev/sdb1)
[root@sys-util-1 /]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 68G 45G 21G 69% /
tmpfs 2.0G 0 2.0G 0% /dev/shm
/dev/sdb1 2.8T 2.6T 195G 94% /data
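If you'd rather not read the device name off the df output by eye, the lookup can be scripted. This is only a sketch: device_for_mountpoint is a name I made up, and it assumes a Linux-style mounts table (device in the first column, mount point in the second) at /proc/mounts.

```shell
#!/bin/sh
# device_for_mountpoint MOUNTPOINT [MOUNTS_FILE]
# Hypothetical helper: print the device mounted at MOUNTPOINT.
# MOUNTS_FILE defaults to /proc/mounts (an assumption about the system).
# Mount points containing spaces are escaped in that file and are not handled.
device_for_mountpoint() {
    awk -v mp="$1" '
        $2 == mp { dev = $1 }          # keep the last match: later mounts shadow earlier ones
        END {
            if (dev == "") exit 1      # nothing mounted at that path
            print dev
        }
    ' "${2:-/proc/mounts}"
}

# Example (not run here):
#   device_for_mountpoint /data
```

On the box above this should print /dev/sdb1 for /data, and it returns a non-zero status when nothing is mounted at the path.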
(Now I try to unmount it, but it's showing as busy)
[root@sys-util-1 /]# umount /data
umount: /data: device is busy
umount: /data: device is busy
(So now I try to force the unmount, with no luck)
[root@sys-util-1 /]# umount -f /data
umount2: Device or resource busy
umount: /data: device is busy
umount2: Device or resource busy
umount: /data: device is busy
(Next I ran a "lazy" unmount, which detaches the filesystem from the directory tree right away and finishes the cleanup once nothing is using it anymore)
[root@sys-util-1 /]# umount -l /data
[root@sys-util-1 /]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 68G 45G 21G 69% /
tmpfs 2.0G 0 2.0G 0% /dev/shm
(df no longer lists the device, so I tried to run the repair, but it fails with the error below. As long as something still holds files open on it, the filesystem is not really released.)
[root@sys-util-1 /]# xfs_repair /dev/sdb1
xfs_repair: /dev/sdb1 contains a mounted filesystem
fatal error -- couldn't initialize XFS library
(I decided to try a basic check first instead, but that failed as well)
[root@sys-util-1 /]# xfs_check /dev/sdb1
xfs_check: /dev/sdb1 contains a mounted and writable filesystem
fatal error -- couldn't initialize XFS library
(So I mounted /data again and ran fuser to find out which processes were holding the drive open. In the ACCESS column, f means an open file and c means the process has its current directory there. Then I killed them and ran fuser again to confirm that they went peacefully.)
[root@sys-util-1 /]# mount /data
[root@sys-util-1 /]# fuser -vm /dev/sdb1
USER PID ACCESS COMMAND
/dev/sdb1: root 4567 f.... nautilus
root 4590 f.... trashapplet
root 4890 ..c.. bash
[root@sys-util-1 /]# kill 4567
[root@sys-util-1 /]# kill 4590
[root@sys-util-1 /]# kill 4890
[root@sys-util-1 /]# fuser -vm /dev/sdb1
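The kill-and-recheck routine above can be wrapped in a small function. Treat this as a sketch rather than a finished tool: terminate_pids is a hypothetical name, and the five-second wait is an arbitrary choice of mine. It sends a plain SIGTERM, the same as the kill commands above, and reports whether anything survived.

```shell
#!/bin/sh
# terminate_pids PID...
# Hypothetical helper: ask each process to exit with SIGTERM, then poll
# for up to about 5 seconds. Returns 0 once none of the PIDs is running,
# 1 if at least one is still alive after the wait.
terminate_pids() {
    for pid in "$@"; do
        kill -TERM "$pid" 2>/dev/null || true   # already gone is fine
    done
    tries=0
    while [ "$tries" -lt 5 ]; do
        alive=0
        for pid in "$@"; do
            # kill -0 sends no signal; it only tests that the PID exists
            kill -0 "$pid" 2>/dev/null && alive=1
        done
        [ "$alive" -eq 0 ] && return 0
        sleep 1
        tries=$((tries + 1))
    done
    return 1
}

# Example (not run here). As far as I know, fuser prints the bare PIDs on
# stdout and everything else on stderr:
#   terminate_pids $(fuser -m /data 2>/dev/null)
```

Don't run it from a shell that is sitting inside /data. That shell will be in the list too, just like the bash process in the fuser output above.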
(Next I unmounted the drive again and ran the repair. We are in business.)
[root@sys-util-1 /]# umount /data
[root@sys-util-1 /]# xfs_repair /dev/sdb1
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- zero log...
- scan filesystem freespace and inode maps...
- found root inode chunk
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
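Before a real repair pass like the one above, it can be worth doing a dry run with xfs_repair -n, which checks the filesystem without modifying it. Here is a rough sketch of a wrapper that first looks the device up in the mounts table. The function name and the REPAIR_CMD variable are made up for illustration, and /proc/mounts is an assumption about the system. The lookup only reflects what the mounts table reports, so xfs_repair's own "contains a mounted filesystem" check still has the final word.

```shell
#!/bin/sh
# check_then_dry_run DEVICE [MOUNTS_FILE]
# Hypothetical helper: refuse to continue if DEVICE appears in the mounts
# table, otherwise run xfs_repair in no-modify mode (-n).
check_then_dry_run() {
    dev="$1"
    mounts="${2:-/proc/mounts}"
    # awk exits 0 when the device is listed, 1 when it is not
    if awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$mounts"; then
        echo "$dev is still mounted; unmount it first" >&2
        return 1
    fi
    # REPAIR_CMD is a made-up override so the function can be tried
    # without touching a real disk (for example REPAIR_CMD=echo).
    ${REPAIR_CMD:-xfs_repair} -n "$dev"
}

# Example (not run here):
#   check_then_dry_run /dev/sdb1
```

If the dry run reports problems, the real repair is the same command without -n, as in the transcript above.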
I won't bore you with the rest of the output. After googling around, I couldn't find anyone who had clearly laid out how to deal with these errors, so I wanted to put something good out into the universe and hopefully help some others.