You can use any ASCII text editor to look at log files, so long as the log files are not open (that is, in use by the Advanced Server). Even if open, most log files can be read using the TYPE command. A convenient way to view the end of most log files is to include the /TAIL and /PAGE qualifiers with the TYPE command, as in the following example, where nodename is the name of the server node:
$ TYPE/TAIL=50/PAGE PWRK$LMLOGS:PWRK$LMSRV_nodename.LOG |
The log files record messages that have occurred during server operation. Not all the messages in the log need your attention. Many messages are caused by communication problems from which the server recovers automatically. If the server fails to recover from a problem, log files can provide you with information about the cause of the problem.
You can examine messages recorded in any log file. Each line in a log file provides information about logged entries, including a date and time stamp. For example, the PWRK$LMSRV_nodename.LOG file might contain information about cache exhaustion conditions.
To examine log files that are in use, use the OpenVMS DCL command BACKUP/IGNORE=INTERLOCK to copy them to a text file, as in the following example:
$ BACKUP/IGNORE=INTERLOCK PWRK$LOGS:NETBIOS_DOROTHY.LOG; -
_$ PWRK$LOGS:NETBIOS_DOROTHY.TXT |
The Advanced Server provides its own common event log for recording events that cannot be recorded in the System, Security, or Application event logs. These events include process startup and shutdown, autoshare errors, problems caused by underlying OpenVMS errors (such as disk quota exceeded), and failed attempts to connect because of licensing problems.
The Advanced Server provides the ADMINISTER/ANALYZE utility for viewing events in Advanced Server common event log files. The events are logged in the file PWRK$COMMON:EVTLOG.DAT on each server.
To view output or to purge the EVTLOG.DAT file, enter the following command:
$ ADMINISTER/ANALYZE |
Table 6-6, Event Logger Command Qualifiers, lists the qualifiers you can use with the ADMINISTER/ANALYZE command.
Qualifier | Description |
---|---|
/AFTER=dd-mmm-yy hh:mm:ss.cc | Restricts the report or the purge operation to events after the specified time. |
/BEFORE=dd-mmm-yy hh:mm:ss.cc | Restricts the report or the purge operation to events before the specified time. |
/CLASS=event_class | Filters, by event class, the logged events that are written to the report or purged from the EVTLOG.DAT file. |
/FULL or /BRIEF | The /FULL qualifier generates a report that includes all information logged for each event. The /BRIEF qualifier outputs only the event header and is the default. |
/INPUT=event_log_file | Specifies the name of the event log file. The default file is SYS$SYSDEVICE:[PWRK$ROOT]EVTLOG.DAT. |
/OUTPUT=report_file | Specifies the name of the output file you want the report written to. The default output is written to SYS$OUTPUT. |
/PID=pid | Specifies the process ID whose events you want to display. |
/PURGE | Purges entries from the EVTLOG.DAT file on the local server. If you use the /PURGE qualifier without other qualifiers, all entries are purged and the EVTLOG.DAT file is left empty. You can use /PURGE with other qualifiers to specify which entries you want to purge. For example, you can purge all events in the EVTLOG.DAT file on the server that are classed as ERROR and were written to the file before June 1, 2001, by combining /PURGE with the /CLASS and /BEFORE qualifiers. |
/SOURCE=event_source | Filters, by event source, the logged events that are written to the report or purged from the EVTLOG.DAT file. |
Example 6-1, ADMINISTER/ANALYZE Command and Display, shows a sample report from the Event logger generated by the following command executed on the server TINMAN.
Example 6-1 ADMINISTER/ANALYZE Command and Display

    $ ADMINISTER/ANALYZE/INPUT=EVTLOG.DAT

    :::::::::: PATHWORKS Error Log Report ::::::::::
    DATE: 25-JUN-2001 15:52:06.88

    ================= EVENT #1 ==================
    Event Time: 18-JUN-2001 17:14:09.04  Node: TINMAN  Process Id: 000001DB
    Event: Master Process starting
    Event Source: Master Process
    Event Class: Audit
    Process Id: 000001DB(X)

    ================= EVENT #2 ==================
    Event Time: 18-JUN-2001 17:14:19.57  Node: TINMAN  Process Id: 000001DB
    Event: NetBEUI Daemon process starting
    Event Source: Master Process
    Event Class: Audit
    Process Id: 000002DE(X)

    ================= EVENT #3 ==================
    Event Time: 18-JUN-2001 17:14:23.26  Node: TINMAN  Process Id: 000001DB
    Event: NetBEUI Daemon process shutting down
    Event Source: Master Process
    Event Class: Audit
    Process Id: 000002DE(X)
    Status: SYSTEM-S-NORMAL, normal successful completion

    ================= EVENT #4 ==================
    Event Time: 18-JUN-2001 17:14:29.04  Node: TINMAN  Process Id: 000001DB
    Event: NetBIOS transport process starting
    Event Source: Master Process
    Event Class: Audit
    Process Id: 00000262(X)

    ================= EVENT #5 ==================
    Event Time: 18-JUN-2001 17:14:37.19  Node: TINMAN  Process Id: 000001DB
    Event: LANman Controller process starting
    Event Source: Master Process
    Event Class: Audit
    Process Id: 00000282(X)

    ================= EVENT #6 ==================
    Event Time: 18-JUN-2001 17:14:50.93  Node: TINMAN  Process Id: 000001DB
    Event: License Registrar process starting
    Event Source: Master Process
    Event Class: Audit
    Process Id: 000002D1(X)
    .
    .
    .
    ================= EVENT #19 ==================
    Event Time: 19-JUN-2001 09:23:34.63  Node: TINMAN  Process Id: 000003DE
    Event: No license for client - access denied
    Event Source: LAN Manager Server
    Event Class: Warning
    Client: PCGURU
    .
    .
    .
    =============== EVENT #25 ===================
    Event Time: 19-JUN-2001 10:38:11.85  Node: TINMAN  Process Id: 555749340
    Event: Unexpected System Error Encountered
    Event Source: PATHWORKS Printing Services
    Event Class: Error
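A report like the one in Example 6-1 can be written to a file with the /OUTPUT qualifier and then summarized offline. The following Python sketch is one possible approach and is not part of the Advanced Server. It assumes that each event starts with an EVENT #n separator line and that each labeled field begins its own line, which may not match the exact layout of a real report. The names parse_report and SAMPLE_REPORT are hypothetical, and the sample text is an excerpt of the example above.

```python
import re
from collections import Counter

# Excerpt of Example 6-1; the line layout is an assumption.
SAMPLE_REPORT = """\
================= EVENT #1 ==================
Event Time: 18-JUN-2001 17:14:09.04  Node: TINMAN  Process Id: 000001DB
Event: Master Process starting
Event Source: Master Process
Event Class: Audit
================= EVENT #19 ==================
Event Time: 19-JUN-2001 09:23:34.63  Node: TINMAN  Process Id: 000003DE
Event: No license for client - access denied
Event Source: LAN Manager Server
Event Class: Warning
Client: PCGURU
=============== EVENT #25 ===================
Event Time: 19-JUN-2001 10:38:11.85  Node: TINMAN  Process Id: 555749340
Event: Unexpected System Error Encountered
Event Source: PATHWORKS Printing Services
Event Class: Error
"""

# A separator line such as "======= EVENT #19 =======".
EVENT_SPLIT = re.compile(r"^=+ EVENT #(\d+) =+$", re.MULTILINE)


def parse_report(text):
    """Split a report into one dictionary per event."""
    # re.split with one group yields [preamble, number, body, number, body, ...]
    parts = EVENT_SPLIT.split(text)
    events = []
    for number, body in zip(parts[1::2], parts[2::2]):

        def field(pattern):
            match = re.search(pattern, body, re.MULTILINE)
            return match.group(1).strip() if match else None

        events.append({
            "number": int(number),
            "time": field(r"Event Time: (\S+ \S+)"),
            "node": field(r"Node: (\S+)"),
            "event": field(r"^Event: (.+)$"),
            "source": field(r"^Event Source: (.+)$"),
            "class": field(r"^Event Class: (.+)$"),
        })
    return events


events = parse_report(SAMPLE_REPORT)
# Count events per class and list the ones that are not routine audits.
by_class = Counter(e["class"] for e in events)
needs_attention = [e["number"] for e in events if e["class"] in ("Warning", "Error")]
```

Here needs_attention holds the event numbers whose class is Warning or Error, which is usually a smaller set to review than the full list of Audit entries.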
Example 6-2, ADMINISTER/ANALYZE/FULL Command and Display, shows a portion of the more detailed report generated when you use the /FULL qualifier.
Example 6-2 ADMINISTER/ANALYZE/FULL Command and Display

    $ ADMINISTER/ANALYZE/FULL/INPUT=EVTLOG.DAT

    :::::::::: PATHWORKS Error Log Report ::::::::::
    DATE: 25-JUN-2001 15:52:06.88

    ================= EVENT #1 ==================
    Event Time: 18-JUN-2001 17:14:09.04  Node: TINMAN  Process Id: 555749340
    Event: PATHWORKS Lock Database is 90% full
    Event Source: Common Services PLM
    Event Class: Warning
    0x00000032 Total Database Resources: 50
    0x0000002D Current Resources in Use: 45
    0x00000019 Currently open Streams: 25
    0x00000017 Currently unique Opens: 23
    0x00000004 Currently Locked Ranges: 4
    Decode information unavailable (Hex. output):
    0x62426141 0x64446343 0x66466545 0x68486747 0x00006949
    .
    .
    .
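When a /FULL report prints data as raw hexadecimal longwords, as at the end of Example 6-2, it can help to check whether the values contain text. The following Python sketch makes two assumptions that the report itself does not state: that each value is a 32-bit longword stored least significant byte first (as on VAX and Alpha systems), and that the payload is ASCII. The function name longwords_to_text is hypothetical.

```python
import struct

# The undecoded longwords shown at the end of Example 6-2.
RAW_WORDS = ["0x62426141", "0x64446343", "0x66466545", "0x68486747", "0x00006949"]


def longwords_to_text(words):
    """Lay each 32-bit value out in little-endian byte order and show
    printable ASCII characters; other bytes are shown as '.'."""
    data = b"".join(struct.pack("<I", int(word, 16)) for word in words)
    data = data.rstrip(b"\x00")  # drop trailing padding bytes
    return "".join(chr(b) if 32 <= b < 127 else "." for b in data)


decoded = longwords_to_text(RAW_WORDS)

# The labeled counters in the same report are the hexadecimal form of
# the decimal values printed beside them, for example 0x00000032 is 50.
total_resources = int("0x00000032", 16)
resources_in_use = int("0x0000002D", 16)
```

Under these assumptions the five longwords decode to the string AaBbCcDdEeFfGgHhIi, and the two counters evaluate to 50 and 45, matching the decimal values in the report.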
To troubleshoot server problems, you should be familiar with the following topics:
The following sections describe how to determine the cause of a server problem and solve it if possible. Problem resolution includes determining whether the problem is caused by the Advanced Server software. To solve client-based problems, hardware problems, and application-specific problems, see the documentation for the specific products involved.
Troubleshooting a server problem involves three stages: collecting information about the problem, analyzing the problem, and solving the problem.
The following sections describe each stage in more detail.
6.2.1.1 Stage 1: Collecting Information About the Problem
When you first detect a server problem, or when the problem is reported, collect as much information as possible immediately. Record the following information:
If you are investigating a recurring or ongoing problem, you should, if
possible, implement an immediate solution that allows the client to
continue working. Record server problems and save a dump file, if one
was generated, and save associated log files and data files before
restarting the server or changing the server configuration. You can use
the information gathering command procedure
SYS$STARTUP:PWRK$GATHER_INFO.COM to save these files.
6.2.1.2 Stage 2: Analyzing the Problem
When you analyze the server problem, you should also look for the solution to the problem. Therefore, you must isolate the component that needs to be modified, replaced, removed, or enhanced.
Advanced Server software provides information in log files and tools to
help you determine the cause of a server problem. These tools keep
records of activities and errors. You can use them to isolate problem
areas and to help solve problems. You may be able to solve the problem
using the Advanced Server commands and utilities.
6.2.1.3 Stage 3: Solving the Problem
The cause of a server problem may be within your ability to correct. In the best case, you can identify a configuration or definition change that corrects the problem. Otherwise, you may be able to modify a server parameter or disable a service until the problem is solved more satisfactorily.
The procedure for solving a server problem depends on your ability to capture information about the problem and the state of the server at the time of the problem. If a problem is reported to be intermittent and is difficult to reproduce at will, the procedure for analysis and solution will take longer and be more difficult. Thus, it is particularly important to collect detailed information as soon as the problem is reported.
The following sections show how to use the Advanced Server tools in the problem-solving process. Using these tools, you can modify the server to report on network activity and events, providing more detailed investigation of problems that you have already determined to be caused by the server or its network resources.
If you cannot determine the cause of a server problem, or if you cannot solve the problem, report the problem to your software specialist and keep the Advanced Server data structure PWRK$LMROOT and the log files for future analysis.
To help you report the information required for analyzing a server
problem, the Advanced Server software includes a procedure you can run to
gather server information.
6.2.1.3.1 Gathering Information About Server Status
To invoke the procedure provided by the server to gather server status information, enter the following commands:
$ SET DEFAULT SYS$STARTUP
$ @PWRK$GATHER_INFO.COM |
The resulting file (PATHWORKS_AS_INFO.BCK) is a BACKUP saveset containing copies of the Advanced Server database, logs, and, if present, process dump files.
If the problem you are investigating causes a systemwide failure,
create a dump file for the system. The system dump file captures system
information. Be sure to verify that your system dump file size is
sufficient to capture a full system dump.
6.2.2 The Problem Analysis Process
Problem analysis is a process of elimination. Given little information to start, you must begin at the general level and use the information-gathering tools described in this chapter to determine the area from which the problem originates. If you have sufficient information at the beginning to isolate the problem area, if the problem is ongoing, or if you can reproduce the problem, you can proceed directly to the section in this chapter that addresses the type of problem you are investigating.
The problem-solving procedure differs depending on the type of problem reported. The following sections describe several types of problems, in analytical order, from the generic characteristics of server problems to the more specific.
Problem types are characterized by behavior or source as follows:
Intermittent problems are those that are not easily reproducible. Unlike ongoing problems, they may not prevent server operation, but they can be difficult to analyze and solve. For these types of problems, your analysis depends heavily on the log files and messages reported before and during the time the problem occurred. To help locate such problems, you can use network traces, both when the problem can be reproduced and when it occurs only intermittently.
Table 6-7, Procedure for Solving Intermittent Problems, describes how to determine the cause of an intermittent problem and what to do about it.
Step | Stage 1: Collect Information | Stage 2: Analyze the Problem | Stage 3: Solve the Problem |
---|---|---|---|
1 | Record the time and date when the problem occurred, the nature of the symptoms, and the computer name of the client, if any. Related information can include applications that have connections to the server, server shares, and resources consumed by the client. | Check for alerts around the time the problem occurred. Attempt to reproduce the problem on the same client and on other clients in the domain. | You can enable and modify the Alerter service to provide more specific, immediate error notification, as described in Table 6-1, Alerter Configuration Parameters. If the problem circumstances can be reproduced, use the Alerter service to watch the messages during the occurrence of the problem. |
2 | If the problem is unique to a specific group or one client, see Analyze the Problem in the next column of this table. If the problem is continuous, or if you can reproduce the problem at will, continue to the section Domain and Computer Problems. | Use the SHOW EVENTS command to see the event messages that were recorded for the time the problem occurred. Enable additional event/audit tracking to get more detailed information. See Section 6.1.3 in this guide for more information. Check Advanced Server log files for additional messages, as described in Section 6.1.4, Advanced Server Log Files. | Review events and log files to isolate the cause of the problem and address it accordingly. Intermittent problems that do not prevent use of the server may be due to faulty hardware. Check the connections to the client, the client configuration, and the network hardware. |
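Checking for messages around the time a problem occurred amounts to a time-window filter over the logged entries. The following Python sketch illustrates the idea for entries that have already been copied out of a report. It assumes time stamps in the form shown in the sample reports (dd-MMM-yyyy hh:mm:ss.cc); the names parse_vms_time and events_near, and the ten-minute window, are hypothetical choices rather than anything defined by the Advanced Server.

```python
from datetime import datetime, timedelta

# Month abbreviations as they appear in the sample reports.
MONTHS = {name: index for index, name in enumerate(
    ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
     "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"], start=1)}


def parse_vms_time(stamp):
    """Convert a stamp such as '18-JUN-2001 17:14:09.04' to a datetime."""
    date_part, time_part = stamp.split()
    day, month, year = date_part.split("-")
    clock, _, hundredths = time_part.partition(".")
    hour, minute, second = (int(value) for value in clock.split(":"))
    # The fraction is in hundredths of a second; datetime wants microseconds.
    microseconds = int(hundredths or 0) * 10000
    return datetime(int(year), MONTHS[month.upper()], int(day),
                    hour, minute, second, microseconds)


def events_near(events, incident, minutes=10):
    """Return the (stamp, text) pairs logged within `minutes` of the incident."""
    center = parse_vms_time(incident)
    window = timedelta(minutes=minutes)
    return [(stamp, text) for stamp, text in events
            if abs(parse_vms_time(stamp) - center) <= window]


# Entries taken from the sample report in Example 6-1.
LOGGED = [
    ("18-JUN-2001 17:14:09.04", "Master Process starting"),
    ("19-JUN-2001 09:23:34.63", "No license for client - access denied"),
    ("19-JUN-2001 10:38:11.85", "Unexpected System Error Encountered"),
]

# A user reports a failed connection at about 09:30 on 19 June.
nearby = events_near(LOGGED, "19-JUN-2001 09:30:00.00", minutes=10)
```

With the sample entries, only the license warning logged at 09:23:34.63 falls inside the window. Widening the window trades precision for a better chance of catching a related earlier event.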
6.2.2.2 Domain and Computer Problems
The domainwide functions of the server depend on its role in the domain
and on the other servers in the domain. The Advanced Server command-line
interface lets you display information about the domain and modify
server activity in the domain.
Table 6-8, Procedure for Solving Domain and Computer Problems, describes how to determine the cause of domain and computer problems and what to do about them.
Step | Stage 1: Collect Information | Stage 2: Analyze the Problem | Stage 3: Solve the Problem |
---|---|---|---|
1 | Determine whether users of other computers in the domain receive error messages when attempting to connect to a server, or whether server administrators receive error messages using ADMINISTER commands. | If so, the problem may be due to a server's relationship to the other servers in the domain. Use the SHOW COMPUTERS command to determine the status of other computers in the domain. | Use the REMOVE COMPUTER command to take the computer off the domain. Use the SET COMPUTER/ACCOUNT_SYNCH command to synchronize the security accounts database across the domain. Use the SET COMPUTER/ROLE command to change the server role of a server in the domain, as described in Section 2.1.1.1, Changing a Server's Role in a Domain. |
2 | Determine whether domain problems require changes on multiple servers in the domain. | Use the SHOW ADMINISTRATION command to display the server and domain name of the server currently being administered. | Use the SET ADMINISTRATION command to set the server and domain name of the server to be managed, as described in Section 2.1.4, Administering Another Domain. |
3 | When setting up trusts between domains, you receive the error message "Could not find domain controller for this domain." | Check that each domain has a running domain controller. Check that both domains are running the same transport protocol (TCP/IP, DECnet, or NetBEUI). | Start at least one server in each domain. Use the Configuration Manager to enable the same transport on both domains, as described in Section 7.2, Using the Configuration Manager. |
6.2.2.3 Server Operation Problems
If the server fails to complete routine operations, the log files and
error messages from the software usually indicate the nature and source
of the problem.
Table 6-9, Procedure for Solving Server Operation Problems, describes how to determine the cause of a problem in server operation and what to do about it.
Step | Stage 1: Collect Information | Stage 2: Analyze the Problem | Stage 3: Solve the Problem |
---|---|---|---|
1 | Check the error messages seen during failing procedures and operations. | Use Advanced Server log files to display messages about problems during software startup and operation. | Use the Configuration Manager to modify server parameters that affect the way the server runs, as described in Section 7.2, Using the Configuration Manager, or modify server configuration parameters, as described in Section 7.3, Using the LANMAN.INI File. |
2 | Check service startup failures, which are logged in the system event log files. | Use the SHOW EVENTS command to display system events. | Use the START SERVICES and STOP SERVICES commands to manage services, as described in Section 2.3.4, Managing Services. |