BMC SEL Log

Current Revision posted to Systems Management - Wiki by Peter Tsai on 11/7/2011 6:17:18 PM
What are These Baseboard Management Controller (BMC) System Event Log (SEL) Records Logged by Windows?
By Steven Grigsby
iDRAC6 SEL Log
Introduction
Have you ever noticed entries in the Baseboard Management Controller (BMC) System Event Log (SEL) that seem to come from Windows? This is not the Windows system event log, but the SEL that is accessible through the BMC.

It may seem that this log is only intended for hardware events and errors, but there is an OS driver that also logs events to it. Starting with Windows Server 2003 R2, Microsoft began shipping the Intelligent Platform Management Interface (IPMI) driver. One of its functions is to log certain OS events to the SEL. The IPMI driver writes multiple records to the SEL for these three types of events: boot up, shutdown and bugcheck events.

The BMC SEL is accessible via the sideband BMC interface using tools like ipmitool or ipmish. It is also accessible from OpenManage Server Administrator (OMSA), the Dell Remote Access Controller (DRAC) console, and <CTRL-E> during boot. Many different types of events, which can be useful in troubleshooting, are logged in the SEL. Each SEL record packs a lot of information into only 16 bytes, and given that limited size, records can be hard to decode.

Decoding the SEL Record


Dell BMC SEL record, pulled by racadm command line tool
The SEL record above was pulled using the Dell racadm command line tool available on the Dell Systems Management Tools and Documentation DVD.


Each SEL record is 16 bytes in length, and its format is defined in the IPMI spec. The table below summarizes that format. At the highest level, the Record Type field indicates whether it is a System Event record or an OEM-defined event record. System Event records are defined in the IPMI specification, and OEM records are defined by the OEM.

Each event logged by the IPMI driver (whether boot up, shutdown, or bugcheck) will cause one system event record and one or more OEM records to be logged to the SEL. The type 02h system event record format is defined in the table below.

Bytes  Field           Description
1:0    SEL Record ID   16-bit record identifier
                         0000h: Reserved
                         0001h to FFFEh: SEL Record ID
                         FFFFh: Reserved
2      Record Type     02h: System Event Record
                       C0h to DFh: OEM Time stamped Record (bytes 7-15 are OEM defined)
                       E0h to FFh: OEM Non-Time stamped Record (bytes 3-15 are OEM defined)
6:3    SEL Timestamp   Time when the event was logged
8:7    Generator ID    Byte 7: System Software ID or IPMB Slave Address
                         [7:1]: System Software ID, or 7-bit I2C slave address
                         [0]: 0b means bits 7-1 are the IPMB Slave Address
                              1b means bits 7-1 are the System Software ID
                       Byte 8: Channel/LUN
                         [7:4]: Channel Number. Must be 0000b if the event was received via the system interface, primary IPMB, or internally generated by the BMC
                         [3:2]: Reserved. Must be 00b
                         [1:0]: IPMB device LUN if Byte 7 holds a slave address. 00b if Byte 7 holds a System Software ID
9      EvMRev          03h: IPMI 1.0
                       04h: IPMI 2.0
10     Sensor Type     Sensor type code as specified in the IPMI spec
11     Sensor Number   Number of the sensor that generated the event
12     Event Type      [7]: Event Direction
                         0b: Assertion Event
                         1b: De-assertion Event
                       [6:0]: 7-bit Event/Reading Type Code
                         00h: Event/Reading Type unspecified
                         01h: Threshold
                         02h-0Ch: Generic Discrete
                         6Fh: Sensor-Specific Discrete
                         70h-7Fh: OEM Discrete. Indicates the discrete state info is specific to the OEM identified by the Manufacturer ID for the IPMI device that is providing access to the sensor
13     Event Data 1    Event Field Contents
14     Event Data 2    Event Field Contents
15     Event Data 3    Event Field Contents

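
To make the layout concrete, here is a minimal Python sketch that splits a 16-byte type 02h record into the fields listed in the table. It assumes the multi-byte fields are stored least-significant byte first and that the timestamp counts seconds since 1970-01-01, as IPMI defines them. The function name, the dictionary keys, and the sample record are made up for illustration, and treating the BMC clock as UTC is an assumption that may not hold on a given system.

```python
import struct
from datetime import datetime, timezone

# Little-endian layout of a 16-byte type 02h record:
# record ID (2), record type (1), timestamp (4), generator ID (2),
# EvMRev, sensor type, sensor number, event type, event data 1-3 (1 each).
SYSTEM_EVENT_LAYOUT = "<HBIHBBBBBBB"


def parse_system_event(record: bytes) -> dict:
    """Split a 16-byte type 02h SEL record into named fields."""
    if len(record) != 16:
        raise ValueError("a SEL record is exactly 16 bytes long")
    (record_id, record_type, timestamp, generator_id, evm_rev, sensor_type,
     sensor_number, event_type, data1, data2, data3) = struct.unpack(
        SYSTEM_EVENT_LAYOUT, record)
    if record_type != 0x02:
        raise ValueError("not a type 02h system event record")
    source = generator_id & 0xFF  # low byte of the Generator ID
    return {
        "record_id": record_id,
        "timestamp": timestamp,
        # Assumption: the BMC clock is interpreted as UTC here.
        "logged_at": datetime.fromtimestamp(timestamp, tz=timezone.utc),
        "is_system_software": bool(source & 0x01),
        "source_id": source >> 1,
        "channel": generator_id >> 12,
        "evm_rev": evm_rev,
        "sensor_type": sensor_type,
        "sensor_number": sensor_number,
        "assertion": not (event_type & 0x80),
        "event_type_code": event_type & 0x7F,
        "event_data": (data1, data2, data3),
    }


# Hypothetical record assembled from the values this article describes
# for a Windows boot-up event; the record ID and timestamp are invented.
sample = struct.pack(SYSTEM_EVENT_LAYOUT, 0x0001, 0x02, 1320689838, 0x0041,
                     0x04, 0x1F, 0x00, 0x6F, 0x01, 0xFF, 0xFF)
fields = parse_system_event(sample)
print(hex(fields["source_id"]), hex(fields["sensor_type"]), fields["assertion"])
```

With a Generator ID of 0041h, the low byte 41h has bit 0 set, so the remaining seven bits (20h) are read as a System Software ID rather than an IPMB slave address.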
Boot Up Events
During a system boot, two records are written to the SEL. The first is a type 02h System Event, and the second is an OEM event.

Type 02h Boot Up Record

The first 10 bytes of the type 02h System boot up record are general header information, and the boot-relevant information starts at byte 10.

Bytes  Value  Field Name                     Meaning of Value
1:0           SEL Record ID                  Record identifier
2      02h    Record Type                    02h indicates a system event record
6:3           SEL Timestamp
8:7    0041h  Generator ID                   0041h indicates the event comes from system software whose ID is 20h
9      04h    EvMRev                         IPMI 2.0 Event Message Revision
10     1Fh    Sensor Type: OS Boot           Sensor type 1Fh is OS Boot as specified in the IPMI spec
11     00h    Sensor Number
12     6Fh    Event Type                     [7]: 0b indicates an assertion event
                                             [6:0]: 6Fh indicates a Sensor-specific Discrete type code. This means use the sensor-specific offsets to interpret the event data bytes
13     01h    Event Data 1                   01h: C: boot completed
                                             [7:6]: 00b = unspecified byte 2 event data
                                             [5:4]: 00b = unspecified byte 3 event data
                                             [3:0]: Offset from Event/Reading Code
                                               0h: A: boot completed
                                               1h: C: boot completed
                                               2h: PXE boot completed
                                               3h: Diagnostic boot completed
                                               4h: CD-ROM boot completed
                                               5h: ROM boot completed
                                               6h: boot completed, boot device not specified
14     FFh    Event Data 2                   Not specified per bits 7-6 in event data 1
15     FFh    Event Data 3                   Not specified per bits 5-4 in event data 1
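
As a rough illustration of how the Event Data 1 byte of an OS Boot record breaks down, the sketch below masks out the three bit fields and looks the offset up in a small table. The function and dictionary names are hypothetical; the offset meanings are copied from the table above.

```python
# Sensor-specific offsets for sensor type 1Fh (OS Boot), from the table above.
OS_BOOT_OFFSETS = {
    0x0: "A: boot completed",
    0x1: "C: boot completed",
    0x2: "PXE boot completed",
    0x3: "Diagnostic boot completed",
    0x4: "CD-ROM boot completed",
    0x5: "ROM boot completed",
    0x6: "boot completed, boot device not specified",
}


def decode_boot_event_data1(data1: int) -> dict:
    """Split Event Data 1 into its flag bits and sensor-specific offset."""
    offset = data1 & 0x0F
    return {
        # Bits 7-6 and 5-4 are 00b when event data 2 and 3 are unspecified.
        "data2_specified": ((data1 >> 6) & 0x03) != 0,
        "data3_specified": ((data1 >> 4) & 0x03) != 0,
        "offset": offset,
        "meaning": OS_BOOT_OFFSETS.get(offset, "unknown offset"),
    }


decoded = decode_boot_event_data1(0x01)
print(decoded["meaning"])
```

For the value 01h that the IPMI driver logs, both flag fields are zero, which is why Event Data 2 and 3 carry the filler value FFh.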

Type DCh Boot Up OEM Event Record

The OEM Event record for bootup does not provide much more information. Nevertheless, the type DCh OEM Boot up event record looks similar to this:

Bytes  Value  Field Name            Description
1:0           SEL Record ID
2      DCh    Record Type           OEM Time stamped Record (bytes 7-15 are OEM defined)
                                    DCh: OS Boot up
6:3           SEL Timestamp
9:7    137h   IPMI Manufacturer ID  137h (311d) is the IANA enterprise number for Microsoft
10            Sequence Number       Sequence number used to concatenate the OEM data bytes from multiple SEL entries
14:11         Boot Time             The OS boot time
15     00h    Reserved              Reserved
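
The Microsoft OEM records share one layout, so a single parser can cover them. The following is a sketch only: it assumes little-endian byte order for the multi-byte fields, leaves the four data bytes uninterpreted because this article does not specify their encoding, and uses made-up names and sample values.

```python
import struct

MICROSOFT_IANA_ID = 0x137  # 311 decimal, per the table above

# Record ID (2), record type (1), timestamp (4), manufacturer ID (3),
# sequence number (1), OEM data (4), final byte (1).
OEM_LAYOUT = "<HBI3sB4sB"


def parse_oem_record(record: bytes) -> dict:
    """Split a 16-byte OEM time stamped SEL record into named fields."""
    if len(record) != 16:
        raise ValueError("a SEL record is exactly 16 bytes long")
    (record_id, record_type, timestamp, manufacturer, sequence,
     data, last_byte) = struct.unpack(OEM_LAYOUT, record)
    if not 0xC0 <= record_type <= 0xDF:
        raise ValueError("not an OEM time stamped record")
    return {
        "record_id": record_id,
        "record_type": record_type,
        "timestamp": timestamp,
        "manufacturer_id": int.from_bytes(manufacturer, "little"),
        "sequence": sequence,
        "data": data,            # meaning depends on the record type
        "last_byte": last_byte,  # reserved, or architecture for type DEh
    }


# Hypothetical type DCh record; the data bytes are placeholders.
sample_oem = struct.pack(OEM_LAYOUT, 0x0002, 0xDC, 1320689838,
                         bytes([0x37, 0x01, 0x00]), 0x00,
                         bytes([0x2E, 0x1E, 0xB8, 0x4E]), 0x00)
oem = parse_oem_record(sample_oem)
print(oem["manufacturer_id"] == MICROSOFT_IANA_ID)
```

Checking the manufacturer ID first is a cheap way to tell records written by the Windows IPMI driver from other vendors' OEM records that happen to use the same record type range.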
Shutdown Events
System shutdown events can cause many records to be logged to the SEL. There is one type 02h System Event record, one OEM type DDh record for the shutdown reason code, and zero or more OEM type DDh records for the shutdown comment.

Whenever a user shuts down a system using shutdown.exe, the reason and comment entered by the user are saved in the registry. The IPMI driver then reads them from the registry and logs them to the SEL with OEM type DDh records. The shutdown reason is a 4-byte code, which fits within a single SEL entry. The shutdown comment, however, is a user-entered string of variable length.

The IPMI driver saves the comment string to the SEL in multiple OEM type DDh records, concatenated using the sequence number. This can be problematic because there are only 4 bytes available for the comment per SEL entry, and the comment is saved as a Unicode string, so each record carries only about two characters. A long comment can quickly fill the SEL.

During a manual shutdown (Start | Shutdown), even though the user is required to enter a reason and comment, the comment is never saved to the registry and is therefore not logged to the SEL. Furthermore, an incorrect reason code is saved. See Microsoft KB2001061 (http://support.microsoft.com/kb/2001061).
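
To get a feel for how fast comments consume SEL space, the sketch below estimates the number of type DDh records a comment needs and shows how the 4-byte pieces could be put back together by sequence number. It assumes the Unicode string is UTF-16LE (the usual Windows encoding), that sequence numbers increase with each piece, and that the last piece is zero-padded; none of these details are confirmed by this article, and all function names are illustrative.

```python
import math

PAYLOAD_BYTES = 4  # comment bytes available in each type DDh record


def records_needed(comment: str) -> int:
    """Estimate how many type DDh records a comment would occupy."""
    return math.ceil(len(comment.encode("utf-16-le")) / PAYLOAD_BYTES)


def split_comment(comment: str) -> list:
    """Cut a comment into (sequence number, 4-byte payload) pieces."""
    raw = comment.encode("utf-16-le")
    pieces = []
    for sequence, start in enumerate(range(0, len(raw), PAYLOAD_BYTES)):
        payload = raw[start:start + PAYLOAD_BYTES].ljust(PAYLOAD_BYTES, b"\x00")
        pieces.append((sequence, payload))
    return pieces


def reassemble_comment(pieces) -> str:
    """Join payloads in sequence order and decode them back to text."""
    raw = b"".join(payload for _, payload in sorted(pieces))
    return raw.decode("utf-16-le").rstrip("\x00")


comment = "Planned maintenance"
pieces = split_comment(comment)
print(records_needed(comment), reassemble_comment(reversed(pieces)))
```

Under these assumptions a 19-character comment already needs 10 records in addition to the system event record and the reason code record.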

Type 02h System Shutdown Event Record

Again, the first 10 bytes of the type 02h System Event Record are all header information. The information relevant to the system shutdown starts at byte 10.

Bytes  Value  Field Name                     Description
1:0           SEL Record ID                  Record identifier
2      02h    Record Type                    02h indicates a system event record
6:3           SEL Timestamp
8:7    0041h  Generator ID                   0041h indicates the event comes from system software whose ID is 20h
9      04h    EvMRev                         IPMI 2.0 Event Message Revision
10     20h    Sensor Type: OS Stop/Shutdown  Sensor type 20h is OS Stop/Shutdown as specified in the IPMI spec
11     00h    Sensor Number
12     6Fh    Event Type                     [7]: 0b indicates an assertion event
                                             [6:0]: 6Fh indicates a Sensor-specific Discrete type code. This means use the sensor-specific offsets to interpret the event data bytes
13     03h    Event Data 1                   03h: OS Graceful Shutdown
                                             [7:6]: 00b = unspecified byte 2 event data
                                             [5:4]: 00b = unspecified byte 3 event data
                                             [3:0]: Offset from Event/Reading Code
                                               0h: Critical stop during OS load/initialization. Unexpected error during system startup. Stopped waiting for input or power cycle/reset
                                               1h: Run-time critical stop (aka ‘core dump’ or ‘blue screen’)
                                               2h: OS Graceful stop (system powered up, but normal OS operation has shut down and system is awaiting reset pushbutton, power-cycle or other external input)
                                               3h: OS Graceful Shutdown (system graceful power down by OS)
                                               4h: Soft Shutdown initiated by PEF
                                               5h: Agent Not Responding. Graceful shutdown request to agent via BMC did not occur due to missing or malfunctioning local agent
14     FFh    Event Data 2                   Not specified per bits 7-6 in event data 1
15     FFh    Event Data 3                   Not specified per bits 5-4 in event data 1

Type DDh Shutdown Reason OEM Event Record

The IPMI driver reads the shutdown reason code from the DWORD registry value at

HKLM/Software/Microsoft/Windows/CurrentVersion/Reliability/shutdown/ReasonCode (DWORD)

The reason code is then logged to the SEL with the following OEM Type DDh record:

Bytes  Value  Field Name            Description
1:0           SEL Record ID
2      DDh    Record Type           OEM Time stamped Record (bytes 7-15 are OEM defined)
                                    DDh: OS Shutdown
6:3           SEL Timestamp
9:7    137h   IPMI Manufacturer ID  137h (311d) is the IANA enterprise number for Microsoft
10            Sequence Number       Sequence number used to concatenate the OEM data bytes from multiple SEL entries
14:11         Shutdown Reason       Shutdown reason, which is read from the registry: HKLM/Software/Microsoft/Windows/CurrentVersion/Reliability/shutdown/ReasonCode (DWORD)
15     00h    Reserved

Type DDh Shutdown Comment OEM Event Record

Similarly, the shutdown comment is a REG_SZ value read from the same registry key.

HKLM/Software/Microsoft/Windows/CurrentVersion/Reliability/shutdown/Comment (REG_SZ)

Bytes  Value  Field Name            Description
1:0           SEL Record ID
2      DDh    Record Type           OEM Time stamped Record (bytes 7-15 are OEM defined)
                                    DDh: OS Shutdown
6:3           SEL Timestamp
9:7    137h   IPMI Manufacturer ID  137h (311d) is the IANA enterprise number for Microsoft
10            Sequence Number       Sequence number used to concatenate the OEM data bytes from multiple SEL entries
14:11         Shutdown Comment      Shutdown comment, which is read from the registry: HKLM/Software/Microsoft/Windows/CurrentVersion/Reliability/shutdown/Comment (REG_SZ)
15     00h    Reserved

Disabling the Type DDh Shutdown Comment records

Because the shutdown comment can take so many records to log, it can cause the SEL to become full very quickly. Microsoft released a hotfix to allow for disabling the shutdown log comments for Windows Server 2008 SP1. This hotfix can be found at:

http://support.microsoft.com/kb/962920

Starting with Windows Server 2008 R2, the functionality is present in the OS without applying the hotfix. So for Windows Server 2008 SP1, you must apply the hotfix before modifying the registry, but for Windows Server 2008 SP2 and later you can modify the registry without applying the hotfix.

1. Open regedit and navigate to HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\IPMI
2. Create a new DWORD Value named DisableSELShutdownComment
3. Right-click DisableSELShutdownComment, and then click Modify.
4. In the Value data box, type 1, and then click OK.
5. Close regedit and reboot
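
If the change has to be applied to several servers, the regedit steps above could be scripted. The following is one possible sketch using Python's standard winreg module, which only exists on Windows; the function name is made up, the script must run with administrative rights, and the reboot from step 5 is still required afterwards.

```python
# Key and value names taken from the manual steps above.
IPMI_KEY = r"System\CurrentControlSet\Control\IPMI"
VALUE_NAME = "DisableSELShutdownComment"


def disable_sel_shutdown_comment() -> None:
    """Create the DWORD value and set it to 1 (Windows only)."""
    import winreg  # imported here so the file still loads on other systems

    with winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE, IPMI_KEY, 0,
                            winreg.KEY_SET_VALUE) as key:
        winreg.SetValueEx(key, VALUE_NAME, 0, winreg.REG_DWORD, 1)
```

Calling disable_sel_shutdown_comment() from an elevated Python prompt performs steps 1 through 4. On Windows Server 2008 SP1 the hotfix still has to be installed first, as described above.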

Bugcheck Events
A system bugcheck (bluescreen) will cause six records to be written to the SEL. A type 02h System Event record is written, followed by five type DEh OEM event records: one for the bugcheck code, and one for each of the four parameters to the bugcheck.

Type 02h Bugcheck System Event Record

Like the other Type 02h System event records logged by the IPMI driver, the relevant information starts after the header at byte 10. In the case of the bugcheck, the sensor type is 20h which is the same as the Shutdown system event record, but the event data byte at byte 13 indicates runtime critical stop.

Bytes  Value  Field Name                     Description
1:0           SEL Record ID                  Record identifier
2      02h    Record Type                    02h indicates a system event record
6:3           SEL Timestamp
8:7    0041h  Generator ID                   0041h indicates the event comes from system software whose ID is 20h
9      04h    EvMRev                         IPMI 2.0 Event Message Revision
10     20h    Sensor Type: OS Stop/Shutdown  Sensor type 20h is OS Stop/Shutdown as specified in the IPMI spec
11     00h    Sensor Number
12     6Fh    Event Type                     [7]: 0b indicates an assertion event
                                             [6:0]: 6Fh indicates a Sensor-specific Discrete type code. This means use the sensor-specific offsets to interpret the event data bytes
13     01h    Event Data 1                   01h: Run-time critical stop
                                             [7:6]: 00b = unspecified byte 2 event data
                                             [5:4]: 00b = unspecified byte 3 event data
                                             [3:0]: Offset from Event/Reading Code
                                               0h: Critical stop during OS load/initialization. Unexpected error during system startup. Stopped waiting for input or power cycle/reset
                                               1h: Run-time critical stop (aka ‘core dump’ or ‘blue screen’)
                                               2h: OS Graceful stop (system powered up, but normal OS operation has shut down and system is awaiting reset pushbutton, power-cycle or other external input)
                                               3h: OS Graceful Shutdown (system graceful power down by OS)
                                               4h: Soft Shutdown initiated by PEF
                                               5h: Agent Not Responding. Graceful shutdown request to agent via BMC did not occur due to missing or malfunctioning local agent
14     FFh    Event Data 2                   Not specified per bits 7-6 in event data 1
15     FFh    Event Data 3                   Not specified per bits 5-4 in event data 1
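
Since the three kinds of type 02h records written by the IPMI driver differ only in the sensor type and the Event Data 1 offset, telling them apart takes just a few comparisons. A minimal sketch, with a hypothetical function name and labels:

```python
SENSOR_OS_BOOT = 0x1F
SENSOR_OS_STOP_SHUTDOWN = 0x20


def classify_os_event(sensor_type: int, event_data1: int) -> str:
    """Label a type 02h record as boot, shutdown, bugcheck, or other."""
    offset = event_data1 & 0x0F
    if sensor_type == SENSOR_OS_BOOT:
        return "boot"
    if sensor_type == SENSOR_OS_STOP_SHUTDOWN:
        if offset == 0x1:  # run-time critical stop
            return "bugcheck"
        if offset == 0x3:  # OS graceful shutdown
            return "shutdown"
        return "other OS stop"
    return "other"


for sensor, data1 in [(0x1F, 0x01), (0x20, 0x03), (0x20, 0x01)]:
    print(hex(sensor), hex(data1), classify_os_event(sensor, data1))
```

The three sample pairs are the sensor type and Event Data 1 values shown in the boot up, shutdown, and bugcheck tables of this article.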

Type DEh Bugcheck Code OEM Event Record

The bugcheck code is the stop code displayed on the bluescreen. It is used to indicate the type of failure that caused the bugcheck. The OEM Type DEh event record saves the bugcheck code to the SEL.

Bytes  Value  Field Name            Description
1:0           SEL Record ID
2      DEh    Record Type           OEM Time stamped Record (bytes 7-15 are OEM defined)
                                    DEh: OS Bugcheck
6:3           SEL Timestamp
9:7    137h   IPMI Manufacturer ID  137h (311d) is the IANA enterprise number for Microsoft
10            Sequence Number       Sequence number used to concatenate the OEM data bytes from multiple SEL entries
14:11         Bugcheck Stop Code    Stop code listed on the BSOD
15            System Architecture   00h: 32-bit OS
                                    01h: 64-bit OS

Type DEh Bugcheck Parameter OEM Event Record

When the system encounters a condition or error it can’t handle, a system call is made to bugcheck the system. In addition to the bugcheck code described above, four parameters are passed to further describe the failure. This information can be used to debug the system and figure out why the crash occurred. The OEM Type DEh Bugcheck Parameter Event record is described below:

Bytes  Value  Field Name            Description
1:0           SEL Record ID
2      DEh    Record Type           OEM Time stamped Record (bytes 7-15 are OEM defined)
                                    DEh: OS Bugcheck
6:3           SEL Timestamp
9:7    137h   IPMI Manufacturer ID  137h (311d) is the IANA enterprise number for Microsoft
10            Sequence Number       Sequence number used to concatenate the OEM data bytes from multiple SEL entries
14:11         Bugcheck Parameter    Argument to the bugcheck. These are the same arguments listed on the BSOD
15            System Architecture   00h: 32-bit OS
                                    01h: 64-bit OS
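
The five type DEh records can be combined into something resembling the stop line of a bluescreen. The sketch below is speculative in several respects: it assumes the record with the lowest sequence number holds the stop code and the next four hold the parameters in order, and that each 4-byte value is little-endian. This article does not say how 64-bit parameters fit into 4 bytes, so the values may be truncated on 64-bit systems. The function name and the sample values are invented.

```python
def assemble_bugcheck(records) -> dict:
    """Combine five (sequence, 4 data bytes, architecture byte) tuples."""
    ordered = sorted(records)
    if len(ordered) != 5:
        raise ValueError("expected one stop code record and four parameters")
    # Assumption: each 4-byte field is stored least-significant byte first.
    values = [int.from_bytes(data, "little") for _, data, _ in ordered]
    return {
        "stop_code": values[0],
        "parameters": values[1:],
        "is_64bit": ordered[0][2] == 0x01,
    }


# Hypothetical data bytes for the five records of one bugcheck.
sample_records = [
    (0, (0x0000007E).to_bytes(4, "little"), 0x00),
    (1, (0xC0000005).to_bytes(4, "little"), 0x00),
    (2, (0x8054A1B2).to_bytes(4, "little"), 0x00),
    (3, (0xF78DA9A0).to_bytes(4, "little"), 0x00),
    (4, (0xF78DA69C).to_bytes(4, "little"), 0x00),
]
bugcheck = assemble_bugcheck(sample_records)
print("STOP 0x%08X (%s)" % (
    bugcheck["stop_code"],
    ", ".join("0x%08X" % p for p in bugcheck["parameters"])))
```

Comparing the result against the stop code and arguments of a known bluescreen on the same system would be a quick way to confirm or correct the ordering and byte-order assumptions.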


This document is also available in Microsoft Word format.

