Exadata

Kdump – Kernel Crashdump

Many people agree that the Linux kernel is quite stable and rarely crashes. Still, there are times when you hit a bug and it crashes suddenly. In that case, kdump is useful for investigating the cause.
Kdump is the Linux kernel crash dump mechanism. It gathers diagnostic information when a Linux system crashes: if the kernel crashes, a dump file (a memory image called vmcore) is generated.
This core file can be very large, depending on memory usage. To investigate the root cause, you need to extract a stack trace from this file. For more information, refer to “How to get the stack trace of kdump in Exadata (Doc ID 1363245.1)”.

Example :

crash /u01/crashfiles/127.0.0.1-2012-05-12-02:31:16/crashcore  /usr/lib/debug/lib/modules/"kernel version"/vmlinux 
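The "kernel version" placeholder above is usually the release string of the kernel that crashed. A minimal sketch of filling it in, assuming the matching kernel-debuginfo package is installed and that the running kernel is the same one that produced the dump (the helper name vmlinux_path is made up for illustration):

```shell
# Build the path to the debug vmlinux for a given kernel release.
# Assumption: kernel-debuginfo for that release is installed under /usr/lib/debug.
vmlinux_path() {
    printf '/usr/lib/debug/lib/modules/%s/vmlinux\n' "$1"
}

# Assumption: the running kernel is the one that crashed; otherwise set KVER by hand.
KVER="$(uname -r)"

# Dump path taken from the example above.
VMCORE="/u01/crashfiles/127.0.0.1-2012-05-12-02:31:16/crashcore"

# Only prints the command; remove 'echo' to start the crash utility.
echo crash "$VMCORE" "$(vmlinux_path "$KVER")"
```

If the node was rebooted into a different kernel after the crash, uname -r will not match the dump and the version has to be supplied manually.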

After a system crash, a Linux machine may take a long time to start when the kdump service is active. You can TEMPORARILY disable the service. It is NOT advised to leave it disabled permanently, since you may lose valuable diagnostic information.

Disable/Enable on Start

[root@host1 ~]# dcli -g dbs_group  -l root "chkconfig kdump off  "
[root@host1 ~]# dcli -g dbs_group  -l root "chkconfig --list kdump "
host1: kdump              0:off   1:off   2:off   3:off   4:off   5:off   6:off
host2: kdump              0:off   1:off   2:off   3:off   4:off   5:off   6:off

[root@host1 ~]# dcli -g dbs_group  -l root "chkconfig kdump on   "
[root@host1 ~]# dcli -g dbs_group  -l root "chkconfig --list kdump "
host1: kdump              0:off   1:off   2:on    3:on    4:on    5:on    6:off
host2: kdump              0:off   1:off   2:on    3:on    4:on    5:on    6:off
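On a rack with many nodes it is easy to miss one that is still switched off. The following sketch filters the dcli output shown above and prints only the hosts where kdump is off in runlevel 3. The function name kdump_disabled_hosts is hypothetical, and runlevel 3 is an assumption about the default runlevel of the nodes:

```shell
# Print the hosts whose kdump service is off in runlevel 3.
# Input (stdin): output of  dcli -g dbs_group -l root "chkconfig --list kdump"
kdump_disabled_hosts() {
    # $1 is the "hostN:" prefix added by dcli; strip the trailing colon.
    awk '/kdump/ && /3:off/ { sub(/:$/, "", $1); print $1 }'
}

# Usage on a database node (not run here):
#   dcli -g dbs_group -l root "chkconfig --list kdump" | kdump_disabled_hosts
```

An empty result means kdump is enabled at boot on every host in the group.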

Stop/Start Kdump Service

[root@host1 ~]#  dcli -g dbs_group  -l root " service kdump stop  "
host1: Stopping kdump:[  OK  ]
host2: Stopping kdump:[  OK  ]
[root@host1 ~]#  dcli -g dbs_group  -l root " service kdump status  "
host1: Kdump is not operational
host2: Kdump is not operational

[root@host1 ~]#  dcli -g dbs_group  -l root " service kdump start   "
host1: Starting kdump:[  OK  ]
host2: Starting kdump:[  OK  ]
[root@host1 ~]#  dcli -g dbs_group  -l root " service kdump status  "
host1: Kdump is operational
host2: Kdump is operational
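As a cross-check of the service status, the kernel itself reports whether a crash kernel is currently loaded through /sys/kernel/kexec_crash_loaded (1 means loaded). A small sketch, with the helper name kdump_kernel_loaded made up for illustration:

```shell
# Succeed if the kernel reports a loaded crash kernel.
# $1: optional path to the flag file; defaults to the sysfs entry.
kdump_kernel_loaded() {
    [ "$(cat "${1:-/sys/kernel/kexec_crash_loaded}" 2>/dev/null)" = "1" ]
}

if kdump_kernel_loaded; then
    echo "crash kernel is loaded"
else
    echo "crash kernel is NOT loaded"
fi
```

To run the same check on all nodes, the cat command can be wrapped in dcli in the same way as the service commands above.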

For more information, see http://docs.oracle.com/cd/E37670_01/E37355/html/ol_kdump_diag.html

Files Used :

/boot/grub/grub.conf
Holds the crashkernel option, appended to the kernel line, which specifies the amount of reserved memory and any offset value.
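As an illustration, the option has the form crashkernel=size@offset. The sketch below pulls it out of a kernel line. The sample line is hypothetical (the root device and the 128M@16M values are placeholders, not Exadata defaults), and crashkernel_arg is a helper name invented here:

```shell
# Print the crashkernel=... token from a kernel command line (exit status 1 if absent).
crashkernel_arg() {
    printf '%s\n' "$1" | tr ' ' '\n' | grep '^crashkernel='
}

# Hypothetical grub.conf kernel line; version, root device and sizes are placeholders.
SAMPLE='kernel /vmlinuz-<kernel version> ro root=/dev/sda1 crashkernel=128M@16M'
crashkernel_arg "$SAMPLE"    # prints crashkernel=128M@16M

# After a reboot, the running kernel should carry the same option:
crashkernel_arg "$(cat /proc/cmdline 2>/dev/null)" \
    || echo "crashkernel= is not set on the running kernel"
```

A change in grub.conf only takes effect after the next reboot, so comparing the file with /proc/cmdline shows whether the reservation is already active.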

/etc/kdump.conf
Sets the location where the dump file can be written, the filtering level for the makedumpfile command, and the default behavior to take if the dump fails.
See the comments in the file for information about the supported parameters.
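Because the shipped file is mostly comments, it helps to list only the active settings. The sketch below does that and runs against an illustrative configuration. The three directives (path, core_collector, default) are standard kdump.conf parameters, but the values shown are examples only and the helper name kdump_settings is made up:

```shell
# Print the active (non-comment, non-blank) lines of a kdump configuration file.
# $1: optional path; defaults to /etc/kdump.conf
kdump_settings() {
    grep -Ev '^[[:space:]]*(#|$)' "${1:-/etc/kdump.conf}"
}

# Illustrative configuration in a temporary file; values are examples, not recommendations.
SAMPLE_CONF="$(mktemp)"
cat > "$SAMPLE_CONF" <<'EOF'
# directory where the vmcore is written
path /var/crash

# compress the dump (-c) and filter pages at dump level 31 (-d 31)
core_collector makedumpfile -c --message-level 1 -d 31

# action to take if the dump cannot be saved
default reboot
EOF

kdump_settings "$SAMPLE_CONF"
rm -f "$SAMPLE_CONF"
```

On a real node, calling kdump_settings without an argument reads /etc/kdump.conf directly.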
