When using this metric for GPU errors, the following attribute requirements apply:
- hw.type MUST be set to "gpu" to indicate that the errors are from a GPU.
- error.type SHOULD be set to one of the following values to indicate the type of error:
  - "corrected": Errors that were detected and corrected by the GPU.
  - "uncorrected": Errors that were detected but could not be corrected by the GPU.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
| --- | --- | --- | --- | --- | --- |
| hw.errors | Counter | {error} | Number of errors encountered by the component. | | |
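As a rough illustration of how the hw.errors counter and the GPU notes above fit together, the sketch below tallies errors per attribute set. It uses a plain in-memory `collections.Counter` as a stand-in for a real metrics SDK; the names `record_gpu_errors` and `hw_errors`, and the identifier `gpu_0`, are assumptions made for this example only.

```python
from collections import Counter

# Values the GPU notes above name for error.type.
GPU_ERROR_TYPES = ("corrected", "uncorrected")

# Hypothetical in-memory stand-in for the hw.errors Counter instrument:
# maps a frozen attribute set to a monotonically increasing error count.
hw_errors: Counter = Counter()


def record_gpu_errors(hw_id: str, error_type: str, count: int = 1) -> None:
    """Add `count` errors to the hw.errors series for one GPU."""
    if count < 0:
        raise ValueError("a Counter only accepts non-negative increments")
    if error_type not in GPU_ERROR_TYPES:
        # Stricter than the text, which only says SHOULD; a choice of this sketch.
        raise ValueError(f"unexpected error.type for a GPU: {error_type!r}")
    attributes = frozenset(
        {"hw.id": hw_id, "hw.type": "gpu", "error.type": error_type}.items()
    )
    hw_errors[attributes] += count


# "gpu_0" is a made-up hw.id used only for illustration.
record_gpu_errors("gpu_0", "corrected", 3)
record_gpu_errors("gpu_0", "corrected")
record_gpu_errors("gpu_0", "uncorrected")
```

Each distinct combination of hw.id, hw.type, and error.type becomes its own series, which is why error.type needs to stay low-cardinality.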
| Attribute | Type | Description | Examples | Requirement Level | Stability |
| --- | --- | --- | --- | --- | --- |
| hw.id | string | An identifier for the hardware component, unique within the monitored host | win32battery_battery_testsysa33_1 | Required | |
| hw.type | string | Type of the component [1] | battery; cpu; disk_controller | Required | |
| error.type | string | The type of error encountered by the component. [2] | uncorrected; zero_buffer_credit; crc; bad_sector | Conditionally Required if and only if an error has occurred | |
| hw.name | string | An easily-recognizable name for the hardware component | eth0 | Recommended | |
| hw.parent | string | Unique identifier of the parent component (typically the hw.id attribute of the enclosure, or disk controller) | dellStorage_perc_0 | Recommended | |
| network.io.direction | string | Direction of network traffic for network errors. [3] | receive; transmit | Recommended | |
[1] hw.type: Describes the category of the hardware component for which hw.state is being reported. For example, hw.type=temperature along with hw.state=degraded would indicate that the temperature of the hardware component has been reported as degraded.
[2] error.type: The error.type SHOULD match the error code reported by the component, the canonical name of the error, or another low-cardinality error identifier. Instrumentations SHOULD document the list of errors they report.
[3] network.io.direction: This attribute SHOULD only be used when hw.type is set to "network" to indicate the direction of the error.
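The requirement levels and notes above can be read as a small set of checks on an attribute set. The sketch below is one possible way to express them, not an official validator; the function name `check_hw_errors_attributes` and the sample identifiers are hypothetical.

```python
# Attributes the table above marks as Required for hw.errors.
REQUIRED_ATTRIBUTES = ("hw.id", "hw.type")


def check_hw_errors_attributes(attrs: dict[str, str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    for key in REQUIRED_ATTRIBUTES:
        if not attrs.get(key):
            problems.append(f"missing required attribute: {key}")
    # Note [3]: network.io.direction is meant for hw.type "network" only.
    if "network.io.direction" in attrs and attrs.get("hw.type") != "network":
        problems.append(
            "network.io.direction should only be used when hw.type is network"
        )
    return problems


# "eth0_nic" and "gpu_0" are made-up identifiers for illustration.
ok = check_hw_errors_attributes(
    {"hw.id": "eth0_nic", "hw.type": "network", "network.io.direction": "receive"}
)
bad = check_hw_errors_attributes(
    {"hw.id": "gpu_0", "hw.type": "gpu", "network.io.direction": "receive"}
)
```

The SHOULD-level rule is reported as a problem here for simplicity; an instrumentation may prefer to log it as a warning instead.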
error.type has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
| Value | Description | Stability |
| --- | --- | --- |
| _OTHER | A fallback error value to be used when the instrumentation doesn’t define a custom value. | |
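One way to apply the _OTHER fallback is to map raw device codes to a short, documented list of error.type values and fall back when no mapping exists. The sketch below assumes a hypothetical mapping table; the raw codes `ECC_SBE` and `ECC_DBE` are invented for illustration and do not come from any particular driver.

```python
# Hypothetical mapping from raw device error codes to low-cardinality
# error.type values; the raw codes are invented for this example.
ERROR_TYPE_BY_RAW_CODE = {
    "ECC_SBE": "corrected",
    "ECC_DBE": "uncorrected",
}


def to_error_type(raw_code: str) -> str:
    """Return the documented error.type for a raw code, or the _OTHER fallback."""
    return ERROR_TYPE_BY_RAW_CODE.get(raw_code, "_OTHER")
```

Keeping the mapping in one table also gives the instrumentation a natural place to document the list of errors it reports.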
hw.type has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
| Value | Description | Stability |
| --- | --- | --- |
| battery | Battery | |
| cpu | CPU | |
| disk_controller | Disk controller | |
| enclosure | Enclosure | |
| fan | Fan | |
| gpu | GPU | |
| logical_disk | Logical disk | |
| memory | Memory | |
| network | Network | |
| physical_disk | Physical disk | |
| power_supply | Power supply | |
| tape_drive | Tape drive | |
| temperature | Temperature | |
| voltage | Voltage | |
network.io.direction has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
| Attribute | Type | Description | Examples | Requirement Level | Stability |
| --- | --- | --- | --- | --- | --- |
| hw.id | string | An identifier for the hardware component, unique within the monitored host | win32battery_battery_testsysa33_1 | Required | |
| network.io.direction | string | The network IO operation direction. | receive; transmit | Required | |
| hw.driver_version | string | Driver version for the hardware component | 10.2.1-3 | Recommended | |
| hw.firmware_version | string | Firmware version of the hardware component | 2.0.1 | Recommended | |
| hw.model | string | Descriptive model name of the hardware component | PERC H740P; Intel(R) Core(TM) i7-10700K; Dell XPS 15 Battery | Recommended | |
| hw.name | string | An easily-recognizable name for the hardware component | eth0 | Recommended | |
| hw.parent | string | Unique identifier of the parent component (typically the hw.id attribute of the enclosure, or disk controller) | dellStorage_perc_0 | Recommended | |
| hw.serial_number | string | Serial number of the hardware component | CNFCP0123456789 | Recommended | |
| hw.vendor | string | Vendor name of the hardware component | Dell; HP; Intel; AMD; LSI; Lenovo | Recommended | |
network.io.direction has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
| Attribute | Type | Description | Examples | Requirement Level | Stability |
| --- | --- | --- | --- | --- | --- |
| hw.id | string | An identifier for the hardware component, unique within the monitored host | win32battery_battery_testsysa33_1 | Required | |
| hw.driver_version | string | Driver version for the hardware component | 10.2.1-3 | Recommended | |
| hw.firmware_version | string | Firmware version of the hardware component | 2.0.1 | Recommended | |
| hw.gpu.task | string | Type of task the GPU is performing | decoder; encoder; general | Recommended | |
| hw.model | string | Descriptive model name of the hardware component | PERC H740P; Intel(R) Core(TM) i7-10700K; Dell XPS 15 Battery | Recommended | |
| hw.name | string | An easily-recognizable name for the hardware component | eth0 | Recommended | |
| hw.parent | string | Unique identifier of the parent component (typically the hw.id attribute of the enclosure, or disk controller) | dellStorage_perc_0 | Recommended | |
| hw.serial_number | string | Serial number of the hardware component | CNFCP0123456789 | Recommended | |
| hw.vendor | string | Vendor name of the hardware component | Dell; HP; Intel; AMD; LSI; Lenovo | Recommended | |
hw.gpu.task has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
Operational status: 1 (true) or 0 (false) for each of the possible states.
When using this metric for GPU status, the following attributes MUST be set:
- hw.type MUST be set to "gpu" to indicate that the status is for a GPU.
- hw.state MUST be set to one of the following values to indicate the GPU state:
  - "ok": The GPU is operating normally.
  - "degraded": The GPU is operating with reduced functionality or performance.
  - "failed": The GPU has failed and is not operational.
  - "predicted_failure": The GPU is currently operational but is predicted to fail soon.
| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
| --- | --- | --- | --- | --- | --- |
| hw.status | UpDownCounter | 1 | Operational status: 1 (true) or 0 (false) for each of the possible states. [1] | | |
[1] hw.status: hw.status is currently specified as an UpDownCounter but would ideally be represented using a StateSet as defined in OpenMetrics. This semantic convention will be updated once StateSet is specified in OpenTelemetry. This planned change is not expected to have any consequence on the way users query their timeseries backend to retrieve the values of hw.status over time.
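The "1 or 0 for each of the possible states" encoding means one data point per state, with exactly one of them set to 1. The sketch below illustrates that expansion using the well-known hw.state values listed in this section; the helper name `hw_status_points` and the identifier `gpu_0` are assumptions for the example, and no SDK is involved.

```python
# Well-known hw.state values from this section.
HW_STATES = ("degraded", "failed", "needs_cleaning", "ok", "predicted_failure")


def hw_status_points(hw_id, hw_type, current_state, states=HW_STATES):
    """Expand one observed state into (attributes, value) pairs, one per state.

    The pair for `current_state` carries 1 and every other pair carries 0.
    Callers that report custom states can pass their own `states` tuple.
    """
    if current_state not in states:
        raise ValueError(f"unknown hw.state: {current_state!r}")
    return [
        (
            {"hw.id": hw_id, "hw.type": hw_type, "hw.state": state},
            1 if state == current_state else 0,
        )
        for state in states
    ]


# "gpu_0" is a made-up hw.id used only for illustration.
points = hw_status_points("gpu_0", "gpu", "degraded")
```

Emitting every state, including the zeros, lets a backend query such as "hw.status where hw.state=failed" return a continuous series rather than one with gaps.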
| Attribute | Type | Description | Examples | Requirement Level | Stability |
| --- | --- | --- | --- | --- | --- |
| hw.id | string | An identifier for the hardware component, unique within the monitored host | win32battery_battery_testsysa33_1 | Required | |
| hw.state | string | The current state of the component | degraded; failed; needs_cleaning | Required | |
| hw.type | string | Type of the component [1] | battery; cpu; disk_controller | Required | |
| hw.name | string | An easily-recognizable name for the hardware component | eth0 | Recommended | |
| hw.parent | string | Unique identifier of the parent component (typically the hw.id attribute of the enclosure, or disk controller) | dellStorage_perc_0 | Recommended | |
[1] hw.type: Describes the category of the hardware component for which hw.state is being reported. For example, hw.type=temperature along with hw.state=degraded would indicate that the temperature of the hardware component has been reported as degraded.
hw.state has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
| Value | Description | Stability |
| --- | --- | --- |
| degraded | Degraded | |
| failed | Failed | |
| needs_cleaning | Needs Cleaning | |
| ok | OK | |
| predicted_failure | Predicted Failure | |
hw.type has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.