One Monday morning I discovered something scary:
mik17@sol1:~$ zpool status
  pool: rpool
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c4t0d0s0  ONLINE       0     0     0

errors: No known data errors

  pool: zfs_pool_zones
 state: FAULTED
status: The pool metadata is corrupted and the pool cannot be opened.
action: Recovery is possible, but will result in some data loss.
        Returning the pool to its state as of October 13, 2016 03:10:03 AM MSK
        should correct the problem. Approximately 20 minutes of data
        must be discarded, irreversibly. After rewind, several
        persistent user-data errors will remain. Recovery can be attempted
        by executing 'zpool clear -F zfs_pool_zones'. A scrub of the pool
        is strongly recommended after recovery.
   see: http://illumos.org/msg/ZFS-8000-72
  scan: none requested
config:

        NAME            STATE     READ WRITE CKSUM
        zfs_pool_zones  FAULTED      0     0     1  corrupted data
          raidz1-0      DEGRADED     0     0     8
            c4t1d0      DEGRADED     0     0     0  too many errors
            spare-1     DEGRADED     0     0     0
              c4t2d0    DEGRADED     0     0     0  too many errors
              c4t6d0    ONLINE       0     0     0
            c4t3d0      DEGRADED     0     0     0  too many errors
            c4t4d0      DEGRADED     0     0     1  too many errors
            c4t5d0      DEGRADED     0     0     0  too many errors
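
A check like the one above is easy to script so that a faulted pool does not wait until Monday morning to be noticed. A minimal sketch, assuming the tab-separated output of `zpool list -H -o name,health` is fed on stdin; the function name `unhealthy_pools` is made up for this illustration.

```shell
#!/bin/sh
# Print the names of pools whose health is anything other than ONLINE.
# Input: lines of "name<TAB>health", as printed by
# `zpool list -H -o name,health` (hypothetical helper, not part of ZFS).
unhealthy_pools() {
    awk '$2 != "ONLINE" { print $1 }'
}

# Typical use on a live system (not executed here):
#   zpool list -H -o name,health | unhealthy_pools
```

With the state shown above, this would print only zfs_pool_zones, since rpool is ONLINE.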
The zones won't start, all is lost )))
First I tried to clear the pool, since there is no doubt about the disks themselves (they are LUNs from a storage array):
mik17@sol1:~$ sudo zpool clear -F zfs_pool_zones
cannot clear errors for zfs_pool_zones: I/O error
Now let's try to import the pool:
mik17@sol1:~$ sudo zpool import -F zfs_pool_zones
cannot import 'zfs_pool_zones': a pool with that name is already created/imported,
and no additional pools with that name were found
Well, at least the system still holds metadata for it, so first let's mark the pool as exported:
mik17@sol1:~$ sudo zpool export zfs_pool_zones
It swallowed that.
Now the import again:
mik17@sol1:~$ sudo zpool import -F zfs_pool_zones
mik17@sol1:~$ sudo zpool status zfs_pool_zones
  pool: zfs_pool_zones
 state: ONLINE
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Oct 17 11:15:29 2016
        306M scanned out of 4.94G at 43.8M/s, 0h1m to go
        60.8M resilvered, 6.06% done
config:

        NAME            STATE     READ WRITE CKSUM
        zfs_pool_zones  ONLINE       0     0     0
          raidz1-0      ONLINE       0     0     0
            c4t1d0      ONLINE       0     0     0
            spare-1     ONLINE       0     0     0
              c4t2d0    ONLINE       0     0     0
              c4t6d0    ONLINE       0     0     0  (resilvering)
            c4t3d0      ONLINE       0     0     0
            c4t4d0      ONLINE       0     0     0
            c4t5d0      ONLINE       0     0     0

errors: 1 data errors, use '-v' for a list
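
The export and recovery-import steps above can be collected into one small function. This is only a sketch of what was typed by hand, not a general recovery tool. The `ZPOOL` override and the name `recover_pool` are inventions for this example, and the extra `import -F -n` step (a dry run of the recovery) was not part of the original session. Treating the dry run's exit status as a go/no-go signal is also an assumption.

```shell
#!/bin/sh
# ZPOOL is a hypothetical override: set ZPOOL="echo zpool" to preview the
# commands instead of running them against a real pool.
ZPOOL=${ZPOOL:-zpool}

recover_pool() {
    pool=$1
    # Export first: import refused to touch a pool still listed as imported.
    $ZPOOL export "$pool" || return 1
    # -F -n: check whether recovery mode could make the pool importable,
    # without actually performing the recovery.
    $ZPOOL import -F -n "$pool" || return 1
    # -F: recovery mode; may discard the last few transactions
    # to bring the pool back to an importable state.
    $ZPOOL import -F "$pool" || return 1
    $ZPOOL status "$pool"
}
```

Running it with `ZPOOL="echo zpool"` first shows exactly which four commands would be issued, which is a cheap sanity check before letting it discard data.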
The pool came back, but with errors on the file system, so it needs a scrub:
mik17@sol1:~$ sudo zpool scrub zfs_pool_zones
mik17@sol1:~$ sudo zpool status -v zfs_pool_zones
  pool: zfs_pool_zones
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub in progress since Mon Oct 17 11:38:01 2016
        225M scanned out of 4.94G at 37.5M/s, 0h2m to go
        0 repaired, 4.45% done
config:

        NAME            STATE     READ WRITE CKSUM
        zfs_pool_zones  ONLINE       0     0     4
          raidz1-0      ONLINE       0     0     8
            c4t1d0      ONLINE       0     0     0
            c4t2d0      ONLINE       0     0     0
            c4t3d0      ONLINE       0     0     0
            c4t4d0      ONLINE       0     0     0
            c4t5d0      ONLINE       0     0     0
        spares
          c4t6d0        AVAIL

errors: Permanent errors have been detected in the following files:

        /zfs_pool_zones/zoneSokolov/root/var/cron/log
        /zfs_pool_zones/zonehdscci1/root/var/cron/log
        /zfs_pool_zones/zonepostgres1/root/var/svc/log/application-database-postgresql_84:default_64bit.log
        /zfs_pool_zones/zonepostgres1/root/var/cron/log
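
When the list of damaged files is longer than four log files, it helps to pull the paths out of the status output so they can be fed to whatever restores them. A rough sketch, assuming `zpool status -v` output on stdin; `damaged_files` is a made-up name, and only entries that are absolute paths are handled.

```shell
#!/bin/sh
# Print the absolute paths listed after the "Permanent errors" line
# in `zpool status -v` output read from stdin (hypothetical helper).
damaged_files() {
    awk '
        # A new "pool:" header ends the previous pool error list.
        /^[ \t]*pool:/              { listing = 0 }
        /^errors: Permanent errors/ { listing = 1; next }
        # Inside the list, keep lines that start with "/" after indentation.
        listing && /^[ \t]*\//      { sub(/^[ \t]+/, ""); print }
    '
}

# Typical use on a live system (not executed here):
#   zpool status -v zfs_pool_zones | damaged_files
```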
After the scrub finished:
mik17@sol1:~$ sudo zpool status -v zfs_pool_zones
  pool: zfs_pool_zones
 state: ONLINE
  scan: resilvered 57.2M in 0h0m with 0 errors on Mon Oct 17 12:08:47 2016
config:

        NAME            STATE     READ WRITE CKSUM
        zfs_pool_zones  ONLINE       0     0     0
          raidz1-0      ONLINE       0     0     0
            c4t1d0      ONLINE       0     0     0
            c4t2d0      ONLINE       0     0     0
            c4t3d0      ONLINE       0     0     0
            c4t4d0      ONLINE       0     0     0
            c4t5d0      ONLINE       0     0     0
        spares
          c4t6d0        AVAIL

errors: No known data errors
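
Instead of re-running the status command by hand until the scan is over, the wait can be scripted. A small sketch, assuming the `scan:` line format shown in the outputs above; `scan_in_progress` is a made-up helper name and the 60-second interval is arbitrary.

```shell
#!/bin/sh
# Succeed (exit 0) if `zpool status` output on stdin reports a scrub
# or resilver that is still running (hypothetical helper).
scan_in_progress() {
    grep -Eq '^[[:space:]]*scan: (scrub|resilver) in progress'
}

# Typical use on a live system (not executed here):
#   while zpool status zfs_pool_zones | scan_in_progress; do sleep 60; done
#   zpool status -v zfs_pool_zones
```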
On another machine things are more complicated: there rpool is in the DEGRADED state as well. But that is another story.