Wednesday, December 19, 2007

The OpenBoot PROM

In this chapter, you will:
Analyze host setup details using OpenBoot commands

Change the default boot device

Test system hardware

Create device aliases using nvalias

Remove custom devices using nvunalias

Diagnose and troubleshoot booting problems

Halt a hung system

One of the main hardware differences between SPARC systems that run Solaris and PC systems that run Linux or Microsoft Windows is that SPARC systems have an OpenBoot PROM monitor program, which can be used to modify firmware settings prior to booting. In this chapter, we examine how the monitor can be used to boot a system and troubleshoot hardware problems.

The OpenBoot PROM Monitor
The OpenBoot PROM monitor is based on the Forth programming language, and can be used to run Forth programs that perform the following functions:

Booting the system, by using the boot command

Performing diagnostics on hardware devices by using the diag command

Testing network connectivity by using the watch-net command

The OpenBoot monitor has two prompts from which commands can be issued: the ok prompt and the > prompt. To switch from the > prompt to the ok prompt, type n:

> n
ok

Commands are typically issued from the ok prompt. These commands include boot, which boots a system from the default system boot device, or from an optional device specified at the prompt. Thus, if a system is at run level 0, and needs to be booted, the boot command with no options specified will boot the system:

ok boot
SPARCstation 20, Type 5 Keyboard
ROM Rev. 2.4, 256 MB memory installed, Serial #456543
Ethernet address 5:2:12:c:ee:5a HostID 456543
Rebooting with command:
Boot device: /iommu@f,e0000000/sbus@f,e0001000/espdma@f,400000/esp@f,8...
SunOS Release 5.9 Version Generic 32-bit
Copyright (c) 1983-2002 by Sun Microsystems, Inc.
configuring IPv4 interfaces: hme0.
Hostname: Winston
The system is coming up. Please wait.
checking ufs filesystems
/dev/rdsk/c0t0d0s1: is clean.
NIS domainname is Cassowary.Net.
starting rpc services: rpcbind keyserv ypbind done.
Setting netmask of hme0 to 255.255.255.0
Setting default IPv4 interface for multicast: add net 224.0/
4: gateway Winston
syslog service starting.
Print services started.
volume management starting.
The system is ready.
winston console login:
Alternatively, if you have modified your hardware configuration since the last boot and you want the new devices to be recognized, you should always reboot using this command:

ok boot -r
This is equivalent to performing a reconfiguration boot using the following command sequence in a shell as the superuser:

# touch /reconfigure; sync; init 6
or

# reboot -- -r
So far, we’ve looked at automatic booting. However, sometimes it is desirable to perform a manual boot, using the command boot -a, where parameters at each stage of the booting process can be specified. These parameters include:

The path to the kernel that you wish to boot

The path to the kernel’s modules directory

The path to the system file

The type of the root file system

The name of the root device

For example, if we wished to use a different kernel, such as an experimental kernel, we would enter the following parameters during a manual boot:

Rebooting with command: boot -a
Boot device: /pci@1f,0/pci@1,2/ide@1/disk@0,1:a File and args: -a
Enter filename [kernel/sparcv9/unix]: kernel/experimental/unix
Enter default directory for modules [/platform/SUNW,Sparc-20/kernel
/platform/sun4m/kernel /kernel /usr/kernel]:
Name of system file [etc/system]:
SunOS Release 5.9 Version Generic 64-bit
Copyright (c) 1983-2002 by Sun Microsystems, Inc.
root filesystem type [ufs]:
Enter physical name of root device
[/pci@1f,0/pci@1,2/ide@1/disk@0,1:a]:
To accept the default parameters, simply press ENTER when prompted. Thus, to only change the path to the experimental kernel, we would enter kernel/experimental/unix at the Enter filename prompt.
Analyzing System Configuration
To view the OpenBoot release information for your firmware, as well as the system configuration, use the following command:

ok banner
SPARCstation 20, Type 5 Keyboard
ROM Rev. 2.4, 256 MB memory installed, Serial #456543
Ethernet address 5:2:12:c:ee:5a HostID 456543
Here, we can see that the system is a SPARCstation 20 with a Type 5 keyboard, and that the OpenBoot release level is 2.4. The system has 256MB of RAM installed and a hostid of 456543. Finally, the Ethernet address of the primary Ethernet device is 5:2:12:c:ee:5a.

Changing the Default Boot Device
To boot from the default boot device (usually the primary hard drive), you would enter the following:

ok boot
However, it is also possible to boot from the CD-ROM by using this command:

ok boot cdrom
The system may be booted from a host on the network by using this command:

ok boot net
Alternatively, if you have a boot floppy, the following command may be used:

ok boot floppy
Because many early Solaris distributions were made on magnetic tape, it’s also possible to boot using a tape drive with the following command:

ok boot tape
Instead of specifying a different boot device each time you want to reboot, it is possible to set an environment variable within the OpenBoot monitor, so that a specific device is booted by default. For example, to set the default boot device to be the primary hard disk, you would use the following command:

ok setenv boot-device disk
boot-device = disk
To verify that the boot device has been set correctly to disk, the following command can be used:

ok printenv boot-device
boot-device disk
To reset the system so that the new setting takes effect, use the reset command:

ok reset
To set the default boot device to be the primary network device, you would use the following command:

ok setenv boot-device net
boot-device = net
This configuration is commonly used for diskless clients, which use RARP and NFS to boot across the network. To verify that the boot device has been set correctly to net, the following command can be used:

ok printenv boot-device
boot-device net disk
To set the default boot device to be the primary CD-ROM device, you would use the following command:

ok setenv boot-device cdrom
boot-device = cdrom
To verify that the boot device has been set correctly to cdrom, the following command can be used:

ok printenv boot-device
boot-device cdrom disk
To set the default boot device to be the primary floppy drive, you would use the following command:

ok setenv boot-device floppy
boot-device = floppy
To verify that the boot device has been set correctly to floppy, the following command can be used:

ok printenv boot-device
boot-device floppy disk
To set the default boot device to be the primary tape drive, you would use the following command:

ok setenv boot-device tape
boot-device = tape
To verify that the boot device has been set correctly to tape, the following command can be used:

ok printenv boot-device
boot-device tape disk
Testing System Hardware
The test command is used to test specific hardware devices, such as the primary network interface, which is checked with an internal and an external loopback test. This device could be tested by using the following command:

ok test net
Internal Loopback test - (OK)
External Loopback test - (OK)
This indicates that the network interface passed both loopback tests. Alternatively, the watch-clock command is used to test the clock device:

ok watch-clock
Watching the 'seconds' register of the real time clock chip.
It should be ticking once a second.
Type any key to stop.
1
2
3
Tip: Timing results can be cross-checked against a reliable timing device for accuracy.


If the system is meant to boot across the network, but a boot attempt does not succeed, it is possible to test network connectivity using the watch-net program. This determines whether or not the system’s primary network interface is able to read packets from the network it is connected to. The output from the watch-net program looks like this:

Internal Loopback test - succeeded
External Loopback test - succeeded
Looking for Ethernet packets.
'.' is a good packet. 'X' is a bad packet.
Type any key to stop
......X.........XXXX.........XX............
In this case, a number of packets are marked as bad, even though the system has been connected successfully to the network.
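
To put a rough number on output like this, the marks can be tallied. The following is a minimal shell sketch, not part of OpenBoot itself: it assumes the line of dots and X characters has been copied from the console into a string on another machine, and the function name packet_stats is hypothetical.

```shell
#!/bin/sh
# packet_stats LINE: count good ('.') and bad ('X') packet marks in a line
# of watch-net output that has been copied from the console.
# Hypothetical helper for illustration; it is not an OpenBoot command.
packet_stats() {
    good=$(printf '%s' "$1" | tr -cd '.' | wc -c)
    bad=$(printf '%s' "$1" | tr -cd 'X' | wc -c)
    # Arithmetic expansion strips any padding that wc may add.
    echo "good=$((good)) bad=$((bad))"
}

# Example: seven good marks and three bad ones.
packet_stats '....X..XX.'
```

A consistently high share of bad packets would point toward cabling or interface problems rather than a boot server issue, but the threshold for concern is a judgment call.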

In addition to the watch-net command, the OpenBoot monitor can perform a number of other diagnostic tests. For example, the probe-scsi command detects and displays all of the SCSI devices attached to the system. The output of probe-scsi looks like this:

ok probe-scsi
Target 1
Unit 0 Disk SUN0104 Copyright (C) 1995 Sun Microsystems All rights reserved
Target 1
Unit 0 Disk SUN0207 Copyright (C) 1995 Sun Microsystems All rights reserved
Here, we can see that two SCSI disks have been detected. If any other disks or SCSI devices were attached to the chain, they have not been detected, indicating a misconfiguration or hardware error.

Tip: If you are using a PCI system, then SCSI devices may or may not appear.

Creating and Removing Device Aliases
The OpenBoot monitor is able to store custom device aliases in nonvolatile RAM (NVRAM) by using the nvalias command, so that they persist from boot to boot. For example, to define the net alias so that the network device uses RARP for booting, we would use the following command:

ok nvalias net /pci@1f,4000/network@1,1:rarp

This means that booting using the net device, as shown in the following example, would use the /pci@1f,4000/network@1,1 device to boot the system across the network:

ok boot net
However, if we wanted to use the Dynamic Host Configuration Protocol (DHCP) to retrieve the host’s IP address when booting, instead of using RARP, we would use the following command:

ok boot net:dhcp
To remove the alias from NVRAM, you simply use the nvunalias command:

ok nvunalias net
This would restore the default value of net.
Troubleshooting Booting Problems
If a system fails to start correctly in multiuser mode, it’s likely that one of the scripts being run in /etc/rc2.d is the cause. In order to prevent the system from going multiuser, it is possible to boot directly into single-user mode from the ok prompt:

ok boot -s
...
INIT: SINGLE USER MODE
Type Ctrl-d to proceed with normal startup,
(or give root password for system maintenance):
At this point, the root password can be entered, and the user will be given a root shell. Not all file systems will be mounted in this mode, but the startup scripts can then be checked one at a time to find the misbehaving application.

If the system will not boot into single-user mode, the solution is more complicated because the default boot device cannot be used. For example, if an invalid entry has been made in the /etc/passwd file for the root user, the system will not boot into single- or multiuser mode. To recover the installed system, the host needs to be booted from the installation CD-ROM into single-user mode. At this point, the default root file system can be mounted on a separate mount point, the /etc/passwd file edited, and the system rebooted with the default boot device. This sequence of steps is shown next, assuming that /etc is located on /dev/dsk/c0t0d0s1:

ok boot cdrom -s
...
INIT: SINGLE USER MODE
Type Ctrl-d to proceed with normal startup,
(or give root password for system maintenance):
# mkdir /temp
# mount /dev/dsk/c0t0d0s1 /temp
# vi /temp/etc/passwd
# sync; init 6
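
Before the final reboot in a sequence like this, the edited file can be sanity-checked so that the same mistake is not carried back onto the disk. The following is a minimal sketch rather than a complete validator: the function name check_root_entry is hypothetical, it only checks that a root line exists with seven colon-separated fields and a UID of 0, and it assumes a POSIX-style awk is available.

```shell
#!/bin/sh
# check_root_entry FILE: succeed only if FILE has a root line with the
# seven colon-separated fields of a passwd entry and a UID of 0.
# Hypothetical helper; assumes a POSIX awk (on Solaris, nawk or
# /usr/xpg4/bin/awk rather than the older /usr/bin/awk).
check_root_entry() {
    awk -F: '
        $1 == "root" { found = 1; if (NF != 7 || $3 != 0) bad = 1 }
        END { exit (!found || bad) }
    ' "$1"
}

# Example against the copy mounted in the steps above:
#   check_root_entry /temp/etc/passwd && echo "root entry looks sane"
```

The check is deliberately narrow. It does not verify that the login shell or home directory exists, so a passing result only rules out the most basic formatting damage.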
Using eeprom
Solaris provides an easy way to modify the values of variables stored in the PROM through the eeprom command. The eeprom command can be used by the root user when the system is running in either single- or multiuser mode. The following variables can be set, as shown next with their default values:

# /usr/sbin/eeprom
tpe-link-test?=true
scsi-initiator-id=7
keyboard-click?=false
keymap: data not available.
ttyb-rts-dtr-off=false
ttyb-ignore-cd=true
ttya-rts-dtr-off=false
ttya-ignore-cd=true
ttyb-mode=9600,8,n,1,-
ttya-mode=9600,8,n,1,-
pcia-probe-list=1,2,3,4
pcib-probe-list=1,2,3
mfg-mode=off
diag-level=max
#power-cycles=50
system-board-serial#: data not available.
system-board-date: data not available.
fcode-debug?=false
output-device=screen
input-device=keyboard
load-base=16384
boot-command=boot
auto-boot?=true
watchdog-reboot?=false
diag-file: data not available.
diag-device=net
boot-file: data not available.
boot-device=disk net
local-mac-address?=false
ansi-terminal?=true
screen-#columns=80
screen-#rows=34
silent-mode?=false
use-nvramrc?=false
nvramrc: data not available.
security-mode=none
security-password: data not available.
security-#badlogins=0
oem-logo: data not available.
oem-logo?=false
oem-banner: data not available.
oem-banner?=false
hardware-revision: data not available.
last-hardware-update: data not available.
diag-switch?=false
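
The eeprom command also accepts a single variable name to display, or a name=value pair to set. The sketch below lists those forms as comments and adds a small helper for pulling one value out of a saved listing like the one above. The helper name prom_value is hypothetical, and the sample input is made up from lines in that listing; treat it as an illustration, not as part of Solaris.

```shell
#!/bin/sh
# On a live Solaris system, eeprom is typically run as root in these forms:
#   eeprom boot-device             # show one variable
#   eeprom boot-device=disk        # set one variable
#   eeprom 'auto-boot?=false'      # quote names containing ? so the shell
#                                  # does not treat them as a pattern
#
# prom_value NAME: read eeprom-style "name=value" lines on stdin and print
# the value of NAME; fail if NAME is absent or has no value
# (for example "keymap: data not available.").
# Hypothetical helper for illustration only.
prom_value() {
    awk -F= -v name="$1" '
        $1 == name { print substr($0, length(name) + 2); found = 1 }
        END { exit !found }
    '
}

# Example with a few lines from a saved listing; prints "disk net".
printf '%s\n' 'auto-boot?=true' 'boot-device=disk net' \
    'diag-file: data not available.' | prom_value boot-device
```

Because the helper only reads standard input, it can be tried on a saved copy of the listing before being pointed at the live command with eeprom | prom_value boot-device.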
Halting a Hung System
If a system is hung, and commands cannot be entered into a shell on the console, then the key combination STOP-A can be used to halt the system and access the OpenBoot PROM monitor.

Caution: If the system is halted and rebooted in this way, all data that has not been written to disk will be lost, unless the go command is used to resume the system’s normal operation.


An alternative method of accessing a system whose console is locked is to telnet to the system as an unprivileged user, use the su command to obtain superuser status, and kill whatever process is hanging the system. Normal operation can then be resumed.

STOP Commands
The STOP commands are executed on the SPARC platform by holding down the special STOP key, located on the left-hand side of the keyboard, together with another key that specifies the operation to be performed. The following functions are available:

STOP
Bypasses the POST (power-on self-test).

STOP-A
Enters the PROM monitor environment.

STOP-D
Performs diagnostic tests.

STOP-F
Enters a program in the Forth language.

STOP-N
Initializes the NVRAM contents to their default values.

Boot Commands
You can use the boot command with any one of the following device specifiers:

net
Boots from a network interface.

cdrom
Boots from a local CD-ROM drive.

disk
Boots from a local hard disk.

tape
Boots from a local tape drive.


In addition, you can specify the name of the kernel to boot by including its relative path after the device specifier. Alternatively, you can pass the -a option on the command line so that the operator is prompted to enter the path to the kernel on the boot device.
