FreeBSD Notes

Restoring Partition Tables

The following occurred on 16-17 Feb 2013.

This section describes the restoration of the partition table on zg-1.softxs.ch, a FreeBSD 9.0 system located at the Datawire data center in Cham. The partition table was corrupted after the system was shut down and restarted.

The reason for the restart was that the IP addresses of the machine were changed:

  1. 94.231.80.205 --> 94.231.88.100

  2. 94.231.80.206 --> 94.231.88.101

Events on zg-1.softxs.ch leading to the problem:

  1. The system was shut down without manually unmounting the encrypted vol002 file system
  2. After more than 60 seconds the power was turned off. We did not have a monitor connected and could not see whether the system had really stopped
  3. The system was turned on again
  4. The system booted but reported a file system error and went into single user mode
  5. Edited /etc/fstab and commented out the entry for vol002
  6. Rebooted the system from the console
  7. The system came up without problems, but the vol002 and vol003 file systems were not mounted

I left the data center and, once back online, logged in remotely to zg-1.softxs.ch. Discovered the following:

  1. vol001 could be mounted
  2. There were no /dev files present for the vol002 or vol003 file systems

The /etc/fstab contains the following:

  • cat /etc/fstab
    # Device                Mountpoint      FStype  Options         Dump    Pass#
    
    /dev/mirror/gm0s1b      none            swap    sw              0       0
    
    /dev/mirror/gm0s1a      /               ufs     rw              1       1
    /dev/mirror/gm0s1e      /tmp            ufs     rw              2       2
    /dev/mirror/gm0s1f      /usr            ufs     rw              2       2
    /dev/mirror/gm0s1d      /var            ufs     rw              2       2
    
    /dev/mirror/gm0s1g      /e/vol001       ufs     rw              2       2
    
    # Note: vol002 is gbde encrypted and must be mounted manually
    
    #/dev/mirror/gm0s2d     /e/vol002       ufs     rw              2       2
    /dev/mirror/gm0s2e      /e/vol003       ufs     rw              2       2
    
    # -- CD-ROM
    
    /dev/acd0               /cdrom          cd9660  ro,noauto       0       0
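
A quick way to see which fstab entries have no device node is sketched below. It is an illustration only: missing_devices is a hypothetical helper name, not something that was run during the incident, and it simply tests each uncommented /dev path for existence.

```sh
#!/bin/sh
# Sketch: list fstab entries whose device node does not exist.
# missing_devices is a hypothetical helper, not part of FreeBSD.
# usage: missing_devices FSTAB_FILE
missing_devices() {
    while read -r dev mnt rest; do
        case $dev in
            # Commented entries start with "#" and never match this pattern.
            /dev/*) [ -e "$dev" ] || echo "$dev ($mnt)" ;;
        esac
    done < "$1"
}

# Example: missing_devices /etc/fstab
```

With the fstab above, this would be expected to report the gm0s2e entry for /e/vol003, plus the CD-ROM device if no drive is attached.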
    
    

Tried to mount the encrypted vol002 file system:

  • root@zg-1 # ./mount_gbde.sh vol002
    ./mount_gbde.sh: mount point: /e/vol002
    ./mount_gbde.sh: device file: gm0s2d
    ./mount_gbde.sh: attaching: gm0s2d
    Enter passphrase:
    gbde: Attach to mirror/gm0s2d failed: Provider not found: "mirror/gm0s2d"
    ./mount_gbde.sh: ERROR: file not found: /dev/mirror/gm0s2d.bde

The /dev entry for the file system was missing, as was that for vol003. Display the disk partitions:

  • root@zg-1 # gpart show
    =>        63  3907029042  mirror/gm0  MBR  (1.8T)
              63  1363148577           1  freebsd  [active]  (650G)  <--- This is vol001
      1363148640  2543880528           2  freebsd  [active]  (1.2T)  <--- Is this the vol002 & vol003?

See what's in /dev

  • root@zg-1 # ls -1 /dev/mirror/
    gm0
    gm0s1         <-- has a label that seems OK
    gm0s1a
    gm0s1b
    gm0s1d
    gm0s1e
    gm0s1f
    gm0s1g
    gm0s2         <-- has no label. And no file systems found. This is where vol002 and vol003 should be

Notes:

  1. gm0 is the entire disk, which is actually a mirrored device supported by two physical disks ad0 and ad2

  2. gm0s1 and gm0s2 are FreeBSD slices (which are actually DOS partitions)

  3. gm0s1a to gm0s1g are FreeBSD partitions, which each contain a file system. E.g. the root, swap, /var, /tmp, /usr and /e/vol001 file systems

  4. There should also be two partitions with device files under gm0s2, namely gm0s2d and gm0s2e. But these devices were not present in /dev

Check the BSD disk labels:

  • root@zg-1 # bsdlabel /dev/mirror/gm0s1
    # /dev/mirror/gm0s1:
    8 partitions:
    #          size    offset    fstype   [fsize bsize bps/cpg]
      a:    2097152         0    4.2BSD        0     0     0
      b:    8388608   2097152      swap
      c: 1363148577         0    unused        0     0         # "raw" part, don't edit
      d:   20971520  10485760    4.2BSD        0     0     0
      e:   20971520  31457280    4.2BSD        0     0     0
      f:   52428800  52428800    4.2BSD        0     0     0
      g: 1258290977 104857600    4.2BSD        0     0     0
    
    root@zg-1 # bsdlabel /dev/mirror/gm0s2
    bsdlabel: /dev/mirror/gm0s2: no valid label found

Note that gm0s2 appears to have no partitions defined on it. This is where the vol002 and vol003 file systems should be.

Install scan_ffs to investigate what's there and see if anything can be recovered.

  • cd /usr/ports/sysutils/scan_ffs
    make install

This didn't work as the ports tree was not up to date. Had to download the distfile scan_ffs-1.2.tar.bz2 by hand, which I found at:

Copied the downloaded file to /usr/ports/distfiles, and ran the make install again.

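Run scan_ffs to look for the file systems of the gm0s2 slice. The scan was run on ad2, one of the two physical disks behind the mirror, so the reported offsets count from the start of the disk. It took a long time (nearly 8 hours):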

  • root@zg-1 # date
    Sat Feb 16 00:05:15 CET 2013
    root@zg-1 # scan_ffs -l /dev/ad2 ; date
    X: 2097152 63 4.2BSD 2048 16384 0 # /
    X: 20971520 10485823 4.2BSD 2048 16384 0 # /var
    X: 20971520 31457343 4.2BSD 2048 16384 0 # /tmp
    X: 52428800 52428863 4.2BSD 2048 16384 0 # /usr
    X: 1258290976 104857663 4.2BSD 2048 16384 0 # /e/vol001
    X: 1275100000 2631925600 4.2BSD 2048 16384 0 # /e/vol003
    scan_ffs: read: Input/output error
    Sat Feb 16 07:52:12 CET 2013

Notes

  1. The partition vol003 was detected, but not the partition vol002
  2. According to the scan_ffs man page, the output is in a format that can be read as input by the bsdlabel command

This result indicates that at least some of the data in the vol002 and vol003 file systems is probably recoverable.

Searching in my notes I found a copy of the output from the bsdlabel command, which I had made when I was reconfiguring the vol003 partition (to reformat it and make it a non-encrypted partition).

  • root@zg-1 # bsdlabel mirror/gm0s1
    # /dev/mirror/gm0s1:
    8 partitions:
    #        size     offset    fstype   [fsize bsize bps/cpg]
    a:    2097152          0    4.2BSD        0     0     0
    b:    8388608    2097152      swap
    c: 1363148577          0    unused        0     0         # "raw" part, don't edit
    d:   20971520   10485760    4.2BSD        0     0     0
    e:   20971520   31457280    4.2BSD        0     0     0
    f:   52428800   52428800    4.2BSD        0     0     0
    g: 1258290977  104857600    4.2BSD        0     0     0   #   <-- /e/vol001
    
    root@zg-1 # bsdlabel mirror/gm0s2
    # /dev/mirror/gm0s2:
    8 partitions:
    #        size     offset    fstype   [fsize bsize bps/cpg]
    c: 2543880528          0    unused        0     0         # "raw" part, don't edit
    d: 1268776960          0    4.2BSD        0     0     0   #    <-- /e/vol002
    e: 1275103568 1268776960    4.2BSD        0     0     0   #    <-- /e/vol003
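
As a cross-check, the scan_ffs result can be compared with this saved label. The scan_ffs offsets count from the start of the scanned disk, while bsdlabel offsets count from the start of the slice, so the slice start reported by gpart (1363148640 for gm0s2) has to be subtracted. The sketch below does that arithmetic; slice_relative is a hypothetical helper name, and the input lines are copied from the scan_ffs output above.

```sh
#!/bin/sh
# Sketch: convert `scan_ffs -l` lines (offsets counted from the start of the
# scanned disk) into offsets relative to a slice, as used in a bsdlabel.
# slice_relative is a hypothetical helper, not part of scan_ffs.
# usage: slice_relative SLICE_START < scan_ffs_output
slice_relative() {
    slice_start=$1
    while read -r tag size offset fstype fsize bsize cpg hash mnt; do
        # Skip anything that is not a partition line inside the slice.
        [ "$tag" = "X:" ] || continue
        [ "$offset" -ge "$slice_start" ] || continue
        echo "$mnt: size=$size offset=$((offset - slice_start))"
    done
}

# Slice start from gpart show, partition lines from scan_ffs.
slice_relative 1363148640 <<'EOF'
X: 1258290976 104857663 4.2BSD 2048 16384 0 # /e/vol001
X: 1275100000 2631925600 4.2BSD 2048 16384 0 # /e/vol003
EOF
# prints: /e/vol003: size=1275100000 offset=1268776960
```

The computed offset 1268776960 matches the offset of partition e in the saved label, which supports using the saved label to rebuild gm0s2. The size found by scan_ffs (1275100000) is slightly smaller than the size in the label (1275103568).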

As mentioned above, the output of the scan_ffs command can be used as input for updating (re-setting) the disk label. This is explained further in the following web page:

I prepared a text file with the offset and size information for partitions vol002 and vol003, taken from the output of the bsdlabel command listed above:

  • root@zg-1 # cat /tmp/vol-label
    d: 1268776960          0    4.2BSD        0     0     0       # /e/vol002 (605 GB)
    e: 1275103568 1268776960    4.2BSD        0     0     0       # /e/vol003 (614 GB)
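
Before writing a label it is worth checking that every partition ends inside the slice. The sketch below is one way to do this, assuming a label file in the format shown above. check_label is a hypothetical helper name, and the slice size 2543880528 comes from the gpart output.

```sh
#!/bin/sh
# Sketch: check that the partitions in a label file fit inside the slice.
# check_label is a hypothetical helper. It reads lines of the form
# "x: size offset ..." and ignores comments, headers and the raw c: entry.
# usage: check_label SLICE_SIZE < label_file
check_label() {
    slice_size=$1
    rc=0
    while read -r name size offset rest; do
        case $name in
            [abd-h]:) ;;
            *) continue ;;
        esac
        end=$((offset + size))
        if [ "$end" -gt "$slice_size" ]; then
            echo "$name ends at $end, past the slice end $slice_size"
            rc=1
        else
            echo "$name ends at $end, ok"
        fi
    done
    return $rc
}

check_label 2543880528 <<'EOF'
d: 1268776960          0    4.2BSD        0     0     0       # /e/vol002 (605 GB)
e: 1275103568 1268776960    4.2BSD        0     0     0       # /e/vol003 (614 GB)
EOF
# prints:
# d: ends at 1268776960, ok
# e: ends at 2543880528, ok
```

Partition d ends exactly where e begins, and e ends exactly at the end of the slice, so the two partitions tile gm0s2 with no gap or overlap.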

Update the disk label:

  • root@zg-1 # bsdlabel -R /dev/mirror/gm0s2 /tmp/vol-label

The command completed without error and the device files gm0s2d and gm0s2e immediately appeared in the /dev directory:

  • gm0s2d - vol002 encrypted file system

  • gm0s2e - vol003 normal file system

Try and mount vol003, but before mounting it, run a file system check:

  • root@zg-1 # fsck /dev/mirror/gm0s2e
    fsck: Could not determine filesystem type

This failed. But I tried mounting the vol003 file system anyway:

  • root@zg-1 # mount /e/vol003

It worked! The file system was there, but empty. Which was correct.

Try mounting the encrypted file system vol002:

  • root@zg-1 # ./mount_gbde.sh vol002
    ./mount_gbde.sh: mount point: /e/vol002
    ./mount_gbde.sh: device file: gm0s2d
    ./mount_gbde.sh: attaching: gm0s2d
    Enter passphrase:
    ./mount_gbde.sh: found file: /dev/mirror/gm0s2d.bde
    ./mount_gbde.sh: checking: gm0s2d.bde
    /dev/mirror/gm0s2d.bde: 338185 files, 269310313 used, 35520820 free
    (42276 frags, 4434818 blocks, 0.0% fragmentation)
    ./mount_gbde.sh: mounting: gm0s2d.bde
    ./mount_gbde.sh: /e/vol002 mounted successfully from gm0s2d.bde
    Filesystem                Size    Used   Avail Capacity  Mounted on
    /dev/mirror/gm0s2d.bde    581G    514G     62G    89%    /e/vol002

It worked! But it took a long time for the file system check to complete (about 20 minutes).

Lessons learned:

  1. We should probably unmount the vol002 partition by hand before shutting down the system
  2. Make sure we have a copy of the output of the bsdlabel command for all disks on all servers saved in a safe place
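
The second lesson could be automated along the following lines. This is a minimal sketch, not a tested procedure: backup_labels and the LABEL_CMD override are hypothetical names introduced for this example, and the backup directory is an assumption.

```sh
#!/bin/sh
# Sketch: save the bsdlabel output of each slice to its own text file.
# backup_labels and LABEL_CMD are hypothetical names for this example;
# LABEL_CMD exists only so the function can be tried without bsdlabel.
# usage: backup_labels DEST_DIR SLICE...
backup_labels() {
    dest=$1
    shift
    mkdir -p "$dest" || return 1
    for dev in "$@"; do
        # mirror/gm0s1 -> DEST_DIR/mirror_gm0s1.label
        out="$dest/$(echo "$dev" | tr '/' '_').label"
        ${LABEL_CMD:-bsdlabel} "$dev" > "$out" || echo "failed: $dev" >&2
    done
}

# Example (as root, directory name is an assumption):
#   backup_labels /root/labels mirror/gm0s1 mirror/gm0s2
```

The resulting files only help if they are also copied off the server, since a label backup stored on the affected disk may be unreachable when it is needed.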

FreeBsdNotes (last edited 2013-02-22 11:23:54 by 10)

Copyright 2008-2014, SoftXS GmbH, Switzerland