HP Advanced Server for OpenVMS
Server Administrator's Guide



Example 6-2, ADMINISTER/ANALYZE/FULL Command and Display, shows a portion of the more detailed report generated when you use the /FULL qualifier.

Example 6-2 ADMINISTER/ANALYZE/FULL Command and Display

$ ADMINISTER/ANALYZE/FULL/INPUT=EVTLOG.DAT

  :::::::::: PATHWORKS Error Log Report ::::::::::
           DATE: 25-OCT-2000 15:52:06.88

================= EVENT #1 ==================

Event Time:   18-OCT-2000 17:14:09.04       Node:  TINMAN
Process Id:  555749340
Event:        PATHWORKS Lock Database is 90% full
Event Source: Common Services PLM
Event Class:  Warning

0x00000032     Total Database Resources:   50
0x0000002D     Current Resources in Use:   45
0x00000019       Currently open Streams:   25
0x00000017       Currently unique Opens:   23
0x00000004      Currently Locked Ranges:   4

Decode information unavailable (Hex. output):
0x62426141
0x64446343
0x66466545
0x68486747
0x00006949
   .
   .
   .
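The labelled values in this report are hexadecimal longwords printed beside their decimal equivalents, and the undecoded block can sometimes be read by hand. The following Python sketch is not part of Advanced Server. It assumes that each line of the hex output is a 32-bit little-endian longword and that the bytes are ASCII text; the function names are hypothetical.

```python
import struct

def decode_longwords(hex_lines):
    """Treat each 0x... line as a 32-bit little-endian longword (an
    assumption) and return the ASCII text, dropping trailing NUL padding."""
    raw = b"".join(struct.pack("<I", int(line, 16)) for line in hex_lines)
    return raw.rstrip(b"\x00").decode("ascii", errors="replace")

def percent_in_use(total_hex, used_hex):
    """Return resource usage as a whole percentage."""
    total = int(total_hex, 16)
    used = int(used_hex, 16)
    return round(100 * used / total)

# Hex output shown for Event #1 in Example 6-2 (the report is truncated there)
dump = ["0x62426141", "0x64446343", "0x66466545", "0x68486747", "0x00006949"]
print(decode_longwords(dump))                       # AaBbCcDdEeFfGgHhIi
print(percent_in_use("0x00000032", "0x0000002D"))   # 90
```

The second result matches the event text: 45 of 50 lock database resources in use is 90%.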

6.2 Troubleshooting Server Problems

To troubleshoot server problems, you should be familiar with the following topics:

  • OpenVMS system administration and troubleshooting
    The OpenVMS log files, system administration procedures, and parameter settings are described in the OpenVMS operating system documentation.
  • Advanced Server concepts
    The Advanced Server concepts are described in the HP Advanced Server for OpenVMS Concepts and Planning Guide.
  • Site-specific network configuration
    Advanced Server provides data-gathering tools that are useful for describing the server and the network environment; in addition, each server system should have a log of the installation and configuration setup, including client requirements and shared resources, network administration accounts, and domain trust information.
  • Client environment
    You should be familiar with the software running on the client computers that access the server, including their server requirements and their network capabilities. For clients running PATHWORKS client software, see the extensive PATHWORKS client documentation that describes client configuration, modification, and error messages.

6.2.1 Troubleshooting Overview

The following sections describe how to determine the cause of a server problem and solve it if possible. Problem resolution includes determining whether or not the problem is caused by the Advanced Server software. To solve client-based problems, hardware problems, and application-specific problems, see the documentation for the specific products involved.

Troubleshooting a server problem requires the following steps:

  1. Collecting information about the problem
  2. Analyzing the problem to determine its characteristics and to isolate the cause of the problem
  3. Solving the problem

The following sections describe each step in more detail.

6.2.1.1 Step 1: Collecting Information About the Problem

When you first detect a server problem, or when the problem is reported, collect as much information as possible immediately. Record the following information:

  • The time and date that the problem occurred
  • The type of work that the user was performing when the problem occurred, including applications running, shares accessed, and resources used
  • Specific information about the network transport the client uses to connect to the server, the server name that the client uses to connect, whether the user account is currently logged on, and the physical location of the client connection to the network

If you are investigating a recurring or ongoing problem, implement, if possible, an immediate solution that allows the client to continue working. Before restarting the server or changing the server configuration, record the problem and save any dump file that was generated, along with the associated log files and data files. You can use the information-gathering command procedure SYS$STARTUP:PWRK$GATHER_INFO.COM to save these files.

6.2.1.2 Step 2: Analyzing the Problem

When you analyze the server problem, you should also look for the solution to the problem. Therefore, you must isolate the component that needs to be modified, replaced, removed, or enhanced.

Advanced Server software provides information in log files and tools to help you determine the cause of a server problem. These tools keep records of activities and errors. You can use them to isolate problem areas and to help solve problems. You may be able to solve the problem using the Advanced Server commands and utilities.

6.2.1.3 Step 3: Solving the Problem

The cause of a server problem may be within your ability to correct. At best, you may determine a configuration or definition change that will correct the problem. Or, you may be able to modify a server parameter or disable a service until the problem is solved more satisfactorily.

The procedure for solving a server problem depends on your ability to capture information about the problem and the state of the server at the time of the problem. If a problem is reported to be intermittent and is difficult to reproduce at will, the procedure for analysis and solution will take longer and be more difficult. Thus, it is particularly important to collect detailed information as soon as the problem is reported.

The following sections show how to use the Advanced Server tools in the problem-solving process. Using these tools, you can modify the server to report on network activity and events, providing more detailed investigation of problems that you have already determined to be caused by the server or its network resources.

If you cannot determine the cause of a server problem, or if you cannot solve the problem, report the problem to your software specialist and keep the Advanced Server data structure PWRK$LMROOT and the log files for future analysis.

To help you report the information required for analyzing a server problem, the Advanced Server software includes a procedure you can run to gather server information.

6.2.1.3.1 Gathering Information About Server Status

To invoke the procedure provided by the server to gather server status information, enter the following commands:


$ SET DEFAULT SYS$STARTUP

$ @PWRK$GATHER_INFO.COM

The resulting file (ADVANCED_SERVER_AS_INFO.BCK) is a BACKUP saveset containing copies of the Advanced Server database, logs, and, if present, process dump files.

If the problem you are investigating causes a systemwide failure, create a dump file for the system. The system dump file captures system information. Be sure to verify that your system dump file size is sufficient to capture a full system dump.

6.2.2 The Problem Analysis Process

Problem analysis is a process of elimination. If you have little information to start with, begin at the general level and use the information-gathering tools described in this chapter to determine the area from which the problem originates. If you have sufficient information at the beginning to isolate the problem area, if the problem is ongoing, or if you can reproduce the problem, you can proceed directly to the section in this chapter that addresses the type of problem you are investigating.

The problem-solving procedure differs depending on the type of problem reported. The following sections describe several types of problems, in analytical order, from the generic characteristics of server problems to the more specific.

Problem types are characterized by behavior or source as follows:

  • Intermittent
  • Domain and computer
  • Server operation
  • Services
  • Client connection
  • Share access
  • Printer
  • User account
  • Privileged user
  • Advanced Server connection
  • License acquisition

6.2.2.1 Intermittent Problems

Intermittent problems are those that are not easily reproducible. Unlike ongoing problems, they may not prevent server operation, but they can be difficult to analyze and solve. For these types of problems, your analysis depends heavily on the log files and messages reported before and during the time the problem occurred. To help locate such problems, you can use network traces, both when the problem can be reproduced and when it occurs only intermittently.

Table 6-7, Procedure for Solving Intermittent Problems, describes the steps you may take to determine the cause of an intermittent problem.

Table 6-7 Procedure for Solving Intermittent Problems
  • Step 1: Collect Information
    Record the time and date when the problem occurred, the nature of the symptoms, and the computer name of the client, if any. Related information can include applications that have connections to the server, server shares, and resources consumed by the client.
  • Step 2: Analyze the Problem
    Check for alerts around the time the problem occurred. Attempt to reproduce the problem on the same client and on other clients in the domain.
  • Step 3: Solve the Problem
    You can enable and modify the Alerter service to provide more specific, immediate error notification, as described in Table 6-1, Alerter Configuration Parameters. If the problem circumstances can be reproduced, use the Alerter service to watch the messages during the occurrence of the problem.

  • Step 1: Collect Information
    If the problem is unique to a specific group or one client, see Step 2: Analyze the Problem in this row of the table.
    If the problem is continuous, or if you can reproduce the problem at will, continue to Section 6.2.2.2, Domain and Computer Problems.
  • Step 2: Analyze the Problem
    Use the SHOW EVENTS command to see the event messages that were recorded for the time the problem occurred. Enable additional event/audit tracking to get more detailed information. See Section 6.1.3 in this guide for more information.
    Check Advanced Server log files for additional messages, as described in Section 6.1.4, Advanced Server Log Files.
  • Step 3: Solve the Problem
    Review events and log files to isolate the cause of the problem and address it accordingly.
    Intermittent problems that do not prevent use of the server may be due to faulty hardware. Check the connections to the client, the client configuration, and the network hardware.

6.2.2.2 Domain and Computer Problems

The domainwide functions of the server depend on its role in the domain and on the other servers in the domain. The Advanced Server command-line interface lets you display information about the domain and modify server activity in the domain.

Table 6-8, Procedure for Solving Domain and Computer Problems, describes how to determine the cause of server and domain problems and what to do about them.

Table 6-8 Procedure for Solving Domain and Computer Problems
  • Step 1: Collect Information
    Determine whether users of other computers in the domain receive error messages when attempting to connect to a server, or whether server administrators receive error messages using ADMINISTER commands. If so, the problem may be due to a server's relationship to the other servers in the domain.
  • Step 2: Analyze the Problem
    Use the SHOW COMPUTERS command to determine the status of other computers in the domain.
  • Step 3: Solve the Problem
    Use the REMOVE COMPUTER command to take the computer off the domain.
    Use the SET COMPUTER /ACCOUNT_SYNCH command to synchronize the security accounts database across the domain.
    Use the SET COMPUTER/ROLE command to change the server role of a server in the domain, as described in Section 2.1.1.1, Changing a Server's Role in a Domain.

  • Step 1: Collect Information
    Determine whether domain problems require changes on multiple servers in the domain.
  • Step 2: Analyze the Problem
    Use the SHOW ADMINISTRATION command to display the server and domain name of the server currently being administered.
  • Step 3: Solve the Problem
    Use the SET ADMINISTRATION command to set the server and domain name of the server to be managed, as described in Section 2.1.4, Administering Another Domain.

  • Step 1: Collect Information
    When setting up trusts between domains, you receive the error message "Could not find domain controller for this domain."
  • Step 2: Analyze the Problem
    Check that each domain has a running domain controller.
    Check that both domains are running the same transport protocol (TCP/IP, DECnet, or NetBEUI).
  • Step 3: Solve the Problem
    Start at least one server in each domain.
    Use the Configuration Manager to enable the same transport on both domains, as described in Section 7.2, Managing File Server Parameters Affecting System Resources.

6.2.2.3 Server Operation Problems

If the server fails to complete routine operations, the log files and error messages from the software usually indicate the nature and source of the problem.

Table 6-9, Procedure for Solving Server Operation Problems, describes how to determine the cause of a problem in server operation and what to do about it.

Table 6-9 Procedure for Solving Server Operation Problems
  • Step 1: Collect Information
    Check the error messages seen during failing procedures and operations.
  • Step 2: Analyze the Problem
    Use Advanced Server log files to display messages about problems during software startup and operation.
  • Step 3: Solve the Problem
    Use the Configuration Manager to modify server parameters that affect the way the server runs, as described in Section 7.2, Managing File Server Parameters Affecting System Resources, or modify server configuration parameters, as described in Section 7.3, Managing Server Configuration Parameters Stored in the OpenVMS Registry.

  • Step 1: Collect Information
    Check service startup failures, which are logged in the system event log files.
  • Step 2: Analyze the Problem
    Use the SHOW EVENTS command to display system events.
  • Step 3: Solve the Problem
    Use the START SERVICE and STOP SERVICE commands to manage services, as described in Section 2.3.4, Managing Services.

6.2.2.3.1 Monitoring Data Cache Use by the File Server

Advanced Server uses its data cache for caching the security databases, in addition to client file data. To ensure a balance of cache usage, the file server periodically monitors its use of the data cache, as follows:

  • Total security databases utilization
    The file server monitors the total utilization of the data cache by the security databases. If the file server detects that the utilization of the data cache for these files exceeds thirty-five percent (35%), a warning message is posted to the file server log file indicating that the current cache configuration may not be adequate for the current load imposed on the file server.
    For example:


    BlobCache Warning: Sum of Blob file control areas
                       is 950272 bytes (45% of data cache).
    

    The condition reported by this warning message does not prevent the file server from processing requests associated with the security databases. The message (shown above) indicates that you should increase the size of the data cache.
  • Individual security database utilization
    The file server monitors utilization of the data cache by individual security database files. When the database expands in size, more cache resources are required to continue operating. If the file server detects that an operation will cause a database file expansion, and that expanding the database file will cause it to utilize more than fifty percent (50%) of the data cache, error messages are recorded in the file server log, as in the following example:


    BlobCache Error: The largest single Blob file control area
                     is 1187840 bytes (57% of data cache).
    
    BlobCache Error: The largest single Blob file control area
                     is PWRK$LMROOT:[LANMAN.DOMAINS]DOMAIN1.
    

    In addition to recording the problem in the file server log, the software generates an operator message and raises a server alert.
    These messages indicate that the operation will prevent the file server from completing the current and future operations. In this case, you should use the Configuration Manager (ADMIN/CONFIG), as described in Section 7.2, Managing File Server Parameters Affecting System Resources, to increase the size of the data cache so that utilization of the data cache by a single database file remains under 50%. The change to the data cache size takes effect the next time you start the server.

You can use the ADMIN/ANALYZE command to monitor these warning messages and error messages, as described in Section 6.1.4.2, The Advanced Server Common Event Log.
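The percentages in these messages can be checked by hand. The sketch below is illustrative Python, not an Advanced Server tool. It assumes a data cache of 2048 KB (2,097,152 bytes), a value inferred here only because it reproduces the 45% and 57% figures in the example messages; the actual size on your server is the one set in the Configuration Manager. The function names are hypothetical.

```python
import math

WARNING_PERCENT = 35   # threshold for total security database utilization
ERROR_PERCENT = 50     # threshold for a single database file

def utilization_percent(control_area_bytes, cache_bytes):
    """Return data cache utilization rounded to a whole percentage."""
    return round(100 * control_area_bytes / cache_bytes)

def min_cache_kb(largest_area_bytes, limit_percent=ERROR_PERCENT):
    """Smallest cache size in KB at which one database file uses no
    more than limit_percent of the data cache."""
    needed_bytes = largest_area_bytes * 100 / limit_percent
    return math.ceil(needed_bytes / 1024)

cache_bytes = 2048 * 1024  # assumed cache size, not taken from the guide

print(utilization_percent(950272, cache_bytes))    # 45, above the 35% warning level
print(utilization_percent(1187840, cache_bytes))   # 57, above the 50% error level
print(min_cache_kb(1187840))                       # 2320
```

Under these assumptions, a cache of 2320 KB puts the largest database file at exactly 50%, so a larger value is needed to leave room for further growth.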

6.2.2.4 Problems with Services

Advanced Server software includes several optional services. For example, Auditing is a service useful for analyzing server problems. However, a service must be enabled before you can use it.

Table 6-10, Procedure for Solving Service Problems, describes how to determine whether a problem is caused by network service problems and what to do about them.

Table 6-10 Procedure for Solving Service Problems
  • Step 1: Collect Information
    Check whether the services are running.
  • Step 2: Analyze the Problem
    Use the SHOW SERVICES command to display the services that are running.
  • Step 3: Solve the Problem
    Use the following commands to control the operation of the services:
    START SERVICE
    STOP SERVICE
    PAUSE SERVICE
    CONTINUE SERVICE
    (See Section 2.3.4, Managing Services, for more information.)

6.2.2.5 Client Connection Problems

Clients may be individually or collectively reporting a failure to connect to the server or reporting slow response time in connecting to the server or the share.

Table 6-11, Procedure for Solving Client Connection Problems, describes the causes behind many typical client connection problems and what to do about them. For information about problems connecting to shares or specific files, see Section 6.2.2.6, Share Access Problems.

Table 6-11 Procedure for Solving Client Connection Problems
  • Step 1: Collect Information
    If a client cannot end a session or there are too many sessions, you can control the user sessions.
  • Step 2: Analyze the Problem
    Use the SHOW SESSIONS command to display current Advanced Server client sessions.
  • Step 3: Solve the Problem
    Use the CLOSE SESSION command to close unneeded sessions.

  • Step 1: Collect Information
    If more than one client reports a lost connection to the server or slow response time, the problem may be caused by too many connections to the same server.
  • Step 2: Analyze the Problem
    Use the SHOW CONNECTIONS command to display the connections that clients have established to Advanced Server shares.
  • Step 3: Solve the Problem
    Use the CLOSE CONNECTION command to end one or more connections.

  • Step 1: Collect Information
    When a client tries to log on over a WAN, the following message is received: "You were logged on, but have not been validated by a server."
  • Step 2: Analyze the Problem
    Clients may use NetBIOS broadcasts to send logon requests, and these requests do not go over the router.
  • Step 3: Solve the Problem
    To locate domain controllers capable of authenticating logons, use a WINS Server or LMHOSTS entries that include the #DOM directive.
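
For reference, an LMHOSTS entry that uses the #DOM directive generally takes the form of an address and a computer name followed by #PRE and #DOM:domain. The Python sketch below only formats such a line. The address, computer name, and domain name are placeholder values, and the function name is hypothetical; check the client documentation for the location of the LMHOSTS file and the directives it supports.

```python
def lmhosts_dom_entry(ip_address, computer_name, domain_name):
    """Format one LMHOSTS line that identifies computer_name as a domain
    controller for domain_name. #PRE asks the client to preload the entry
    into its name cache and is normally paired with #DOM."""
    return f"{ip_address:<16}{computer_name:<16}#PRE  #DOM:{domain_name}"

# Placeholder values for illustration only
print(lmhosts_dom_entry("192.0.2.10", "TINMAN", "LANGROUP"))
```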

6.2.2.6 Share Access Problems

Clients may fail to connect to shares or lose existing connections. The shares must be set to permit client access. Share setup includes:
  • Allowing access to users who are members of user groups that have access to the share
  • Setting permissions to allow access to the share such as read access
  • Setting OpenVMS file and directory protections, if the Advanced Server and OpenVMS security model is in use
  • Setting the maximum connection limit to allow the required number of connections

Table 6-12, Procedure for Solving Share Access Problems, describes the causes behind some typical share access problems and what to do about them.

Table 6-12 Procedure for Solving Share Access Problems
  • Step 1: Collect Information
    Determine whether the client is connected but failing to access resources in the shares. For example, the client computer displays the connection to the server but is unable to list all the files and directories to which the client requires access.
  • Step 2: Analyze the Problem
    Use the SHOW USER command to display the groups to which the user belongs.
    Use the SHOW SHARE command to display the groups allowed to access the share.
  • Step 3: Solve the Problem
    To add the user to a group, use the MODIFY GROUP command to add the user name. To let the user's group access a share, use the MODIFY SHARE/PERMISSIONS command, as described in Section 4.3.4, Changing Share Properties.

  • Step 2: Analyze the Problem
    Use the SHOW FILE command to display access permissions on the resources. If the OpenVMS and Advanced Server security model is enabled, use the OpenVMS command DIRECTORY/SECURITY to display the OpenVMS owner and protection information.
  • Step 3: Solve the Problem
    Use the server SET FILE/PERMISSIONS command, as described in Section 4.3.5.2, Setting Permissions on a File or Directory, to modify the permissions on the file to give the user or group access to the specific resource. Use the OpenVMS SET FILE/PROTECTION command to modify the RMS protections on a directory or file.

  • Step 2: Analyze the Problem
    Use the Advanced Server SHOW HOSTMAP command to display host mapped user accounts.
  • Step 3: Solve the Problem
    Use the ADD HOSTMAP command, as described in Section 3.1.16.2, Establishing User Account Host Mapping, to associate a network user account with an OpenVMS user account.

  • Step 1: Collect Information
    If some clients report problems connecting to a share, the problem may be caused by too many connections.
  • Step 2: Analyze the Problem
    Use the SHOW SHARES command to display information about the connection limit on the share.
  • Step 3: Solve the Problem
    Use the MODIFY SHARE command to change the connection limit on the share, as described in Section 4.3.4, Changing Share Properties.

  • Step 1: Collect Information
    If clients report failure to access a specific file, the problem may be caused by incorrect permission settings on the file.
  • Step 2: Analyze the Problem
    Use the SHOW FILE command to display files that are open, clients who have the files open, and the permissions granted to the clients.
  • Step 3: Solve the Problem
    Use the SET FILE/PERMISSIONS command, as described in Section 4.3.6, Specifying File and Directory Access Permissions, to set the file permissions correctly.

