A namespace may have one or more media errors, either known to the kernel or in a latent state. These
error locations, or badblocks, can cause poison consumption events if read in an unsafe manner.
Moreover, these badblocks also indicate that due to media corruption, any data that may have been in
these locations has been unrecoverably lost.
Normally, in the presence of such errors, the administrator is expected to recover the data through
out-of-band means (such as backups), destroy the namespace, recreate it, and then restore the data. When the
data is re-written, the writes will allow any errors to be cleared as they are encountered. In such a
workflow, one should never need to use the clear-errors command.
However, there may be special use cases, where the data currently on the namespace does not matter - for
example, if a devdax mode namespace is being prepared for use as system-ram. In such cases, it may be
desirable to clear any errors on the namespace prior to switching its mode to prevent disruptive machine
checks due to poison consumption.
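The ordering described above (clear errors first, then switch the device over) can be sketched as a small helper that only builds the command lines. This is an illustrative sketch, not a prescribed procedure: the function name, `namespace0.0`, and `dax0.0` are hypothetical, and it assumes the mode switch is done with `daxctl reconfigure-device --mode=system-ram`.

```python
from typing import List


def system_ram_prep_commands(namespace: str, dax_device: str) -> List[List[str]]:
    """Return argv lists in the order they would be run (nothing is executed)."""
    return [
        # Step 1: clear known media errors while the data is still immaterial.
        ["ndctl", "clear-errors", namespace],
        # Step 2: hand the device to the kernel as system RAM (assumed daxctl step).
        ["daxctl", "reconfigure-device", "--mode=system-ram", dax_device],
    ]


if __name__ == "__main__":
    # Hypothetical device names, printed for review rather than executed.
    for argv in system_ram_prep_commands("namespace0.0", "dax0.0"):
        print(" ".join(argv))
```

Printing the commands instead of running them keeps the sketch safe to try, since the first step is destructive.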
Note
Only use this command when the data on the namespace is immaterial. For any blocks that are cleared
via this command, any data on the blocks in question will be lost, and replaced with content that is
implementation (platform) defined, and unpredictable.
Warning
This is a DANGEROUS command, and should only be used after fully understanding its implications and
consequences. This WILL erase your data.
For namespaces in one of fsdax or devdax modes, this command will only consider the data area for error
clearing. Namespace metadata, such as info-blocks, will not be touched. For namespaces in raw mode, the
full available capacity of the namespace is considered for error clearing. Namespaces that are in sector
mode are not supported, and will be skipped.
Note
It is expected that the command is run with the namespace enabled. A namespace in the disabled state
appears as, and is treated as, a raw namespace, so error clearing is performed for the
full available capacity of the namespace, including any potential metadata areas. If there happen to
be errors in the metadata area, clearing them may result in unpredictable outcomes. You have been
warned!
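The per-mode rules above, including the disabled-namespace caveat, can be summarized in a short model. This is a sketch for illustration only and not how ndctl is implemented: the function name and parameters are hypothetical, and it assumes the data area of an fsdax or devdax namespace runs from a `data_offset` to the end of the namespace.

```python
from typing import Optional, Tuple


def clearing_range(mode: str, enabled: bool, size: int,
                   data_offset: int) -> Optional[Tuple[int, int]]:
    """Return (start, length) in bytes considered for clearing, or None if skipped."""
    if not enabled:
        # A disabled namespace is treated as raw: the full capacity is
        # considered, including any metadata areas.
        return (0, size)
    if mode == "sector":
        # Sector mode is not supported and is skipped.
        return None
    if mode in ("fsdax", "devdax"):
        # Only the data area is considered; metadata such as info-blocks
        # is left untouched (assumed to sit before data_offset).
        return (data_offset, size - data_offset)
    if mode == "raw":
        return (0, size)
    raise ValueError(f"unknown namespace mode: {mode!r}")


if __name__ == "__main__":
    # Hypothetical sizes, chosen only to show the difference between states.
    print(clearing_range("fsdax", True, 1 << 30, 2 << 20))
    print(clearing_range("fsdax", False, 1 << 30, 2 << 20))
```

The second call shows why running against a disabled namespace is risky: the range grows to cover the region that would otherwise be protected as metadata.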
Known errors are ones that the kernel has encountered before, either via a previous scrub, or by an
attempted read from those locations. These can be listed by running ndctl list --media-errors for a given
namespace. Latent errors, as the name indicates, are unknown to the kernel. These can be found by running
a scrub operation on the NVDIMMs in question. By default, the ndctl-clear-errors command only clears
known errors. This can be overridden using the --scrub option to clear all errors.
Note
If a scrub is in progress when the command is called, it will unconditionally wait for it to
complete.
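The difference between clearing known errors and clearing all errors comes down to whether `--scrub` is passed. A minimal sketch of the two invocations follows, built as argument lists rather than executed. The helper names and `namespace0.0` are hypothetical, and selecting the namespace for the listing with `--namespace=` is an assumption about how one would scope `ndctl list`.

```python
from typing import List


def list_errors_argv(namespace: str) -> List[str]:
    """Command line to list the known media errors of one namespace."""
    return ["ndctl", "list", "--media-errors", f"--namespace={namespace}"]


def clear_errors_argv(namespace: str, include_latent: bool = False) -> List[str]:
    """Command line to clear errors; --scrub extends this to latent errors."""
    argv = ["ndctl", "clear-errors"]
    if include_latent:
        # Without --scrub, only errors already known to the kernel are cleared.
        argv.append("--scrub")
    argv.append(namespace)
    return argv


if __name__ == "__main__":
    ns = "namespace0.0"  # hypothetical namespace name
    print(" ".join(list_errors_argv(ns)))
    print(" ".join(clear_errors_argv(ns)))
    print(" ".join(clear_errors_argv(ns, include_latent=True)))
```

Listing the known errors first gives a record of what the default invocation will touch before anything is erased.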