The system corefile is helpful during problem analysis on a Sun Solaris computer.
A system corefile is produced when the panic() routine calls
vfs_syncall() and dumpsys() to sync physical memory to the appropriate
disks and write the current kernel image to the dump device. When savecore runs
during bootup, it scans the top end of the primary swap partition and creates a
unix.0 file and a corresponding vmcore.0 file. The numeric suffix is
incremented automatically as additional corefiles are captured. The bounds
file in the savecore directory keeps track of the current increment. panic() is
called when a situation occurs that would compromise the data integrity of the
running system. The philosophy is that continuing would be worse than stopping and
rebooting.
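The numbering scheme described above can be inspected with a few lines of shell. The following is only a sketch: the function names are hypothetical, it assumes a POSIX shell (on older Solaris releases that means /usr/xpg4/bin/sh or ksh rather than /bin/sh), and it assumes the counter file is named bounds and contains a single integer.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Solaris.
# Assumption: the counter file is "bounds" (override with BOUNDS_FILE)
# and holds the suffix that the next saved dump will receive.
BOUNDS_FILE=${BOUNDS_FILE:-bounds}

# Print the suffix the next unix.N / vmcore.N pair would get.
next_dump_suffix() {
    dir=$1
    if [ -r "$dir/$BOUNDS_FILE" ]; then
        # Keep only the first line and strip anything that is not a digit.
        n=$(head -n 1 "$dir/$BOUNDS_FILE" | tr -cd '0-9')
    else
        n=
    fi
    # A missing or empty counter file means no dump has been saved yet.
    echo "${n:-0}"
}

# List every complete unix.N / vmcore.N pair found in a directory.
list_dump_pairs() {
    dir=$1
    for u in "$dir"/unix.*; do
        [ -f "$u" ] || continue
        n=${u##*.}
        if [ -f "$dir/vmcore.$n" ]; then
            echo "$n unix.$n vmcore.$n"
        fi
    done
}
```

Typical use would be next_dump_suffix /var/crash/`uname -n` followed by list_dump_pairs on the same directory, to see which dumps are already on disk before opening a case.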
These are the steps to take if a Sun Solaris system hangs:
[Stop]-[A]
OK>
OK> sync
Successfully capturing a corefile depends on the patch level and on the type of device
used for primary swap space. Sun Support has a utility named core_check.sh
that reports whether the system is at the proper patch level and is configured
properly to capture a corefile. It is available on request from Sun Support.
All kernel memory pages are saved, active pages in the kernel segment map are saved,
and running user process stacks are saved. By default, the kernel memory pages of
active processes are saved. Setting the appropriate switches with dumpadm -u -c
all forces all memory pages to be captured; however, most of this data is not
useful, and capturing it creates extremely large corefiles. Our advice is not to
enable this feature unless directed by Sun Support. See the dumpadm manpage
for more details.
A system corefile is a snapshot of kernel memory at the moment of the panic. This
data shows which threads are running on each CPU, the process table, the current
threads on the dispatch queue, and the kernel memory structures. Through corefile
analysis, Sun is able to reconstruct the events that led to the panic.
Based on this information, Sun can usually determine whether the problem was caused by
hardware or software, which part caused the panic, and what code the CPU was running when
the panic condition occurred, and can then search for an existing bug and patch. Just
because a CPU reported the panic does not mean that the CPU was the cause.
It's important to extract the panic strings and provide this information when opening a case
with Sun. If this is a known problem, it could save hours of effort in finding a solution.
# strings vmcore.* | head
In Solaris 2.5 through 2.6, savecore is normally not enabled. The system
administrator must enable it by editing the /etc/init.d/sysetup
file. If the system panics, the /var/adm/messages file will show 'dumping
pages....', which indicates that the system has captured a corefile. If savecore has
not been enabled, it may be run manually shortly after reboot: cd into a
directory with sufficient space to hold the system corefile and type the command
savecore -v . (the trailing dot is the current directory), which tells the system to
save the corefile 'here' and prints a verbose status message if it was able to
process the dump.
In Solaris 7 and above, savecore is enabled by default and is
controlled by the dumpadm command. You can run the dumpadm command
without arguments to get the current configuration. Starting with Solaris 7, the
system corefile is automatically compressed to conserve room in the primary swap
partition.
# dumpadm
Dump content: kernel pages
Dump device: /dev/dsk/c0t0d0s3 (swap)
Savecore directory: /var/crash/diamond
Savecore enabled: yes
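When the configuration has to be checked from a script, the output shown above can be split into key and value pairs. The function below is a minimal sketch, not a Sun-supplied tool: the name dumpadm_field is hypothetical, the "Key: value" layout is assumed to match the sample output, and an awk that accepts -v is assumed (nawk or /usr/xpg4/bin/awk on older Solaris releases).

```shell
#!/bin/sh
# Hypothetical helper: print the value of one field from
# dumpadm-style "Key: value" output read on standard input.
dumpadm_field() {
    key=$1
    awk -v key="$key" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)            # ignore leading blanks
            if (index(line, key ":") == 1) {    # line starts with "key:"
                val = substr(line, length(key) + 2)
                sub(/^[ \t]+/, "", val)         # trim blanks around the value
                sub(/[ \t]+$/, "", val)
                print val
                exit
            }
        }'
}

# Example (requires root on a real system):
#   dumpadm | dumpadm_field "Savecore enabled"
```

Matching on the whole key followed by a colon keeps "Savecore directory" and "Savecore enabled" apart even though they share a prefix.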
If you are using something other than a raw primary swap partition, there is a risk
that a savecore may not be produced. For instance, if the 'vxfs' driver
caused the panic, the savecore may not work if swap is under 'vxfs' control.
The fewer layers of drivers involved, the better the chance of capturing a useful
corefile.
It's critical that the directory where the /etc/init.d/sysetup puts the
corefile is:
Savecore normally runs as part of /etc/rc2.d/SXXsysetup. /var is
normally mounted right away, before run level 2, so that should be OK if there is
enough room on /var for the core file.
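Whether there is enough room can be checked before savecore is run. The sketch below assumes a df that supports the POSIX -P option (on older Solaris releases this may mean /usr/xpg4/bin/df) and a POSIX shell. The function name has_free_kb is hypothetical, and the required size is the caller's own estimate, since the size of a dump depends on the system.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the filesystem holding directory $1
# has at least $2 kilobytes available.
has_free_kb() {
    dir=$1
    need_kb=$2
    # -P forces one output line per filesystem; with -k, column 4
    # is the available space in kilobytes.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge "$need_kb" ]
}

# Example with a placeholder size; substitute your own estimate:
#   has_free_kb /var/crash/`uname -n` 1048576 || echo "not enough room"
```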
| /etc/init.d/sysetup | Check if savecore is enabled |
| /etc/dumpadm.conf | Configuration File for dumpadm |
| /var/crash/`uname -n` | Location of Crash Dump Directory |
| /usr/bin/savecore | Save a crash dump |