So we have this computer that has two 250 GB disks. In the interest of keeping the data for evah, I set them up as RAID1 using Linux MD RAID. This was cool. Much as a proper technical person does not keep backups of his personal stuff, having two copies seemed cool. The partitions were something like:
/dev/sda1 + /dev/sdb1 = /dev/md0 <= /boot
/dev/sda2 + /dev/sdb2 = /dev/md1 <= /dev/vg volume group
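A setup like that can be built roughly as follows. This is a sketch, not the commands I actually ran back then: the device names come from the layout above, the volume group name "vg" is taken from the mount note, and mdadm option defaults vary between versions. It is destructive and needs root, so don't paste it anywhere you care about.

```shell
# Sketch: mirror the two small partitions for /boot, and the two big ones for LVM.
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2

# Put LVM on the big mirror: a physical volume, then a volume group named vg.
pvcreate /dev/md1
vgcreate vg /dev/md1
```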
On Friday one of the disks died (when it woke up in the morning it had SMART errors and 4093 reallocated sectors, and didn't work so well). To make our life complete, the system wouldn't boot. Ubuntu booted up to a recovery shell, which I used to try to make the RAID run - which was stupid, since I didn't actually know which disk was faulty. After this exercise, it booted up to "grub rescue>", which means that the system is poked, and you're about to work through supper time to fix it.
I got out my trusty usb-creator and booted a rescue system, and found... um, that all the data was gone. The partitions were there, but the system refused to mount them, saying that they were RAID partitions. Now I suppose at this point I should have wiped the RAID signature, but I didn't. I couldn't start the RAID, I couldn't open the volume group, I could do zip. I installed mdadm, I installed lvm. I ran pvscan and lvscan with odd options. Nothing worked. My RAID was unraidable, and the LVM physical volume it contained was unusable.
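For the record, the recovery attempts went something along these lines. A sketch only - the exact incantations and their output depend on what survives on the disk, and all of these need root on the rescue system:

```shell
# Inspect whatever MD superblocks are still readable on the partitions.
mdadm --examine /dev/sda1 /dev/sda2

# Try to start the arrays, even degraded (--run starts an incomplete array).
mdadm --assemble --scan --run

# Then look for the LVM layer on top of whatever came up.
pvscan              # search for LVM physical volumes
vgchange -ay        # activate any volume groups that were found
lvscan              # list logical volumes and their state
```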
So what I did was have a look to see whether I could find the filesystem by brute force:
for ((a=0;a<2048;a++)); do dd if=/dev/sda2 bs=$((1024*1024)) count=1 skip=$a 2>/dev/null | file -; done
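Incidentally, the same scan can be rehearsed on a plain file instead of a real disk, which needs no root access at all. This is a hypothetical self-test, not part of the original rescue: it plants the ext superblock magic (0xEF53, little-endian, at offset 1080 within the block) at the 3 MiB mark of a scratch image, then runs the same dd-plus-file loop over it:

```shell
# Make an 8 MiB scratch image of zeros.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=$((1024*1024)) count=8 2>/dev/null

# Plant the ext magic bytes 0x53 0xEF at 3 MiB + 1080 bytes.
printf '\x53\xef' | dd of="$img" bs=1 seek=$((3*1024*1024 + 1080)) conv=notrunc 2>/dev/null

# The same brute-force loop as above, but against the image file.
out=$(for ((a=0; a<8; a++)); do
    printf 'offset %d MiB: ' "$a"
    dd if="$img" bs=$((1024*1024)) count=1 skip=$a 2>/dev/null | file -
done)
rm -f "$img"
echo "$out"
```

Only the line for offset 3 should be identified as a filesystem; the rest is just a megabyte of zeros each.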
That said that there were bits of ext4 filesystem (the superblock, perhaps?) at 1 Mbyte, 7 Mbytes and some other odd position, so I tried to mount them - the theory being that when something gets mounted it stays mounted:
for ((a=0;a<2048;a++)); do losetup -d /dev/loop2; losetup /dev/loop2 /dev/sda2 --offset $((a * 1024 * 1024)); mount /dev/loop2 /mnt && break; done
Sadly, that didn't work. So then I decided to go ahead with the reinstallation of the system on the good disk, since the data on the disks was not actually so important. The installer complained though: it said it couldn't partition the disks, since /dev/sda2 was in use.
That's odd.
Wanting to get the system installed, I had a look, and discovered that the LVM volume groups from the lost system had reappeared. That's very odd, but without pausing to complain, I made a copy of the original filesystem, and happiness ensued. I got the complete filesystem data, made a tar file, and then nuked the half-confused system with a new installation to frustrate the cause of science.
So what happened? When the system was making loopback devices and attempting to mount them as filesystems, the device mapper woke up and said "aha, I see thee, thou LVM physical volume /dev/loop2". Once the kernel and/or udev had automatically attached the LVM subsystem to the /dev/loop2 device, subsequent attempts to unmap the device failed - so the first thing that worked became permanent.
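You can actually see this holding in action. Once device-mapper has claimed a loop device, the kernel records the relationship in sysfs, and detaching fails with "device or resource busy" until the volume group is deactivated. A sketch of the checks, assuming the same device names as above and a volume group called vg:

```shell
# Who is sitting on /dev/loop2? Active device-mapper targets show up here.
ls /sys/block/loop2/holders/

# List the device-mapper devices (the activated LVM logical volumes).
dmsetup ls

# This fails with EBUSY while the volume group holds the device...
losetup -d /dev/loop2

# ...so deactivate the volume group first, then the detach succeeds.
vgchange -an vg
losetup -d /dev/loop2
```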
So now, it's official. Brute force mounting doesn't necessarily work, but for accidentally finding LVM volume groups on intransigent physical devices, it seems to work just fine. When all else fails, try it:
for ((a=0;a<2048;a++)); do losetup -d /dev/loop2; losetup /dev/loop2 /dev/sda2 --offset $((a * 1024 * 1024)); mount /dev/loop2 /mnt && break; done
(Which reminds me: for years I have used "a" and "b" as loop variables, but recently I have learned that "i" and "j" are more popular. I don't care. I think that the supposed advantages of "i" and "j" are purely imaginary.)