OpenVMS System Manager's Manual
13.11.3.1 Investigating the Problem
Check the operator log that was current at the time the queue manager
started up or failed over. Search the log for operator messages from
the queue manager.
On systems with multiple queue managers, also search for messages
displayed by additional queue managers by including their process names
in the search string. To display information about queue managers
running on your system, use the SHOW QUEUE/MANAGERS command, as
explained in Section 13.4.
For more information about multiple queue managers and their process
names, see Section 13.8.1.
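One way to perform this search is with the SEARCH command; the /WINDOW qualifier displays the lines surrounding each match for context (QUEUE_MANAGE is the process name of the default queue manager; substitute the process names used on your system):

```
$ SEARCH SYS$MANAGER:OPERATOR.LOG/WINDOW=5 QUEUE_MANAGE
```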
The following messages indicate that the queue database is not in the
specified location:
%%%%%%%%%%% OPCOM 4-FEB-2000 15:06:25.21 %%%%%%%%%%%
Message from user QUEUE_MANAGE on MANGLR
%QMAN-E-OPENERR, error opening CLU$COMMON:[SYSEXE]SYS$QUEUE_MANAGER.QMAN$QUEUES;
%%%%%%%%%%% OPCOM 4-FEB-2000 15:06:27.29 %%%%%%%%%%%
Message from user QUEUE_MANAGE on MANGLR
-RMS-E-FNF, file not found
%%%%%%%%%%% OPCOM 4-FEB-2000 15:06:27.45 %%%%%%%%%%%
Message from user QUEUE_MANAGE on MANGLR
-SYSTEM-W-NOSUCHFILE, no such file
The following messages indicate that the queue database disk is not
mounted:
%%%%%%%%%%% OPCOM 4-FEB-2000 15:36:49.15 %%%%%%%%%%%
Message from user QUEUE_MANAGE on MANGLR
%QMAN-E-OPENERR, error opening DISK888:[QUEUE_DATABASE]SYS$QUEUE_MANAGER.QMAN$QUEUES;
%%%%%%%%%%% OPCOM 4-FEB-2000 15:36:51.69 %%%%%%%%%%%
Message from user QUEUE_MANAGE on MANGLR
-RMS-F-DEV, error in device name or inappropriate device type for operation
%%%%%%%%%%% OPCOM 4-FEB-2000 15:36:52.20 %%%%%%%%%%%
Message from user QUEUE_MANAGE on MANGLR
-SYSTEM-W-NOSUCHDEV, no such device available
13.11.3.2 Cause
The queuing system does not work correctly under the following
circumstances:
- If the dirspec parameter specified with the
START/QUEUE/MANAGER command (which specifies the location of the queue
and journal files) does not translate identically on all nodes, and the
queue manager starts on one of the affected nodes. This problem
typically occurs in an OpenVMS Cluster environment after you add a
system disk or move the queue database.
- If the queue database disk is not mounted for the node on which the
queue manager attempts to run.
In general, the queuing system shuts down completely if the queue
manager encounters a serious error and crashes or fails over twice
within two minutes on the same node. Therefore, the queuing system may
have stopped, or it may continue to run if, after the original failed
startup, the queue manager moves to yet another node from which it can
access the database.
13.11.3.3 Correcting the Problem
Perform the following steps:
- If the queue manager is stopped, enter START/QUEUE/MANAGER and
include the following information:
- An appropriate list of nodes with the /ON qualifier.
- The appropriate dirspec parameter (to specify the location
of the queue and journal files). All the nodes included in the node
list with the /ON qualifier must be able to access this directory.
- On all nodes specified in the node list (except on any nodes that
boot from the disk where the queue database files are stored), add a
MOUNT command to the SYLOGICALS.COM procedure to mount the disk that
holds the master file. You do not need to explicitly mount the disk on
a node where it is the system disk.
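The steps above might be sketched as follows. The node names and volume label are hypothetical examples, not values from this manual; DISK888 is the disk name used in the earlier error messages:

```
$ ! Restart the queue manager; every node in the /ON list must be
$ ! able to access the directory given as the dirspec parameter.
$ START/QUEUE/MANAGER/ON=(NODEA,NODEB,*) DISK888:[QUEUE_DATABASE]
$ ! Line to add to SYLOGICALS.COM on each node that does not boot
$ ! from DISK888 (QUEUE_VOL is a hypothetical volume label):
$ MOUNT/SYSTEM DISK888: QUEUE_VOL
```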
13.11.4 If the Queue Manager Becomes Unavailable
The queue manager becomes unavailable if it does not start or has
stopped running.
13.11.4.1 Investigating the Problem
To investigate the problem, enter SHOW CLUSTER to see whether the nodes
in the queue manager's failover list are available.
13.11.4.2 Cause
An insufficient failover node list might have been specified for the
queue manager, so that none of the nodes in the failover list is
available to run the queue manager.
13.11.4.3 Correcting the Problem
Make sure the queue manager list contains a sufficient number of nodes
by entering START/QUEUE/MANAGER with the /ON qualifier to specify a
node list appropriate for your configuration.
If you are in doubt about what nodes to specify, Compaq recommends that
you specify an asterisk (*) wildcard character as the last node in the
list; the asterisk indicates that any remaining node in the cluster can
run the queue manager. Specifying the asterisk prevents your queue
manager from becoming unavailable because of an insufficient node list.
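For example, the following command names two preferred nodes and then falls back to any remaining cluster member (the node names are hypothetical):

```
$ ! The queue manager prefers NODEA, then NODEB, then any other
$ ! available node in the cluster.
$ START/QUEUE/MANAGER/ON=(NODEA,NODEB,*)
```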
13.11.5 If the Queuing System Does Not Work on a Specific OpenVMS Cluster Node
Use this section if the queuing system does not work on a specific node
when it starts up.
13.11.5.1 Investigating the Problem
Perform the following steps:
- Search the operator log that was current when the problem existed
for the following messages. These messages are broadcast every 30
seconds after the affected node boots.
%%%%%%%%%%% OPCOM 4-FEB-2000 15:36:49.15 %%%%%%%%%%%
Message from user QUEUE_MANAGE on ZNFNDL
%QMAN-E-COMMERROR, unexpected error #5 in communicating with node CSID 000000
%%%%%%%%%%% OPCOM 4-FEB-2000 15:36:49.15 %%%%%%%%%%%
Message from user QUEUE_MANAGE on ZNFNDL
-SYSTEM-F-WRONGACP, wrong ACP for device
- Compare the node's value for the system address parameters SCSNODE
and SCSSYSTEMID with the values for the DECnet node name and node ID,
as follows:
$ RUN SYS$SYSTEM:SYSMAN
SYSMAN> PARAMETERS SHOW SCSSYSTEMID
Parameter Name Current Default Min. Max. Unit Dynamic
-------------- ------- ------- ------- ------- ---- -------
SCSSYSTEMID 19941 0 -1 -1 Pure-number
SYSMAN> PARAMETERS SHOW SCSNODE
Parameter Name Current Default Min. Max. Unit Dynamic
-------------- ------- ------- ------- ------- ---- -------
SCSNODE "RANDY " " " " " "ZZZZ" Ascii
SYSMAN> EXIT
$ RUN SYS$SYSTEM:NCP
NCP> SHOW EXECUTOR SUMMARY
Node Volatile Summary as of 5-FEB-2000 15:50:36
Executor node = 19.45 (DREAMR)
State = on
Identification = DECnet for OpenVMS V7.2
NCP> EXIT
$ WRITE SYS$OUTPUT 19*1024+45
19501
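The final WRITE command applies the rule that a DECnet Phase IV address of the form area.node corresponds to an SCSSYSTEMID value of area * 1024 + node. The calculation can also be written with DCL symbols (the symbol names are for illustration only):

```
$ ! For DECnet address 'area.node', the matching SCSSYSTEMID is
$ ! area * 1024 + node. Here, address 19.45 yields 19501, which
$ ! does not match the node's actual SCSSYSTEMID of 19941.
$ area = 19
$ node = 45
$ WRITE SYS$OUTPUT area * 1024 + node
19501
```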
13.11.5.2 Cause
If the DECnet node name and node ID do not match the SCSNODE and
SCSSYSTEMID system address parameters, IPC (interprocess communication,
an operating system internal mechanism) cannot work properly and the
affected node will not be able to participate in the queuing system.
13.11.5.3 Correcting the Problem
Perform the following steps:
- Modify the system address parameters SCSNODE and SCSSYSTEMID or
modify the DECnet node name and node ID, so the values match.
For more information about these system parameters, refer to the
OpenVMS System Management Utilities Reference Manual. For more
information about the DECnet node name and node ID, refer to the
DECnet for OpenVMS Guide to Networking.
- Reboot the system.
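A minimal sketch of the parameter change with SYSMAN, using the values from the preceding example (your node's correct name and address will differ); it is also good practice to add the same values to MODPARAMS.DAT so that AUTOGEN preserves them:

```
$ RUN SYS$SYSTEM:SYSMAN
SYSMAN> PARAMETERS USE CURRENT
SYSMAN> PARAMETERS SET SCSNODE "DREAMR"
SYSMAN> PARAMETERS SET SCSSYSTEMID 19501
SYSMAN> PARAMETERS WRITE CURRENT
SYSMAN> EXIT
```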
13.11.6 If You See Inconsistent Queuing Behavior on Different OpenVMS Cluster Nodes
Use this section if you see the following symptoms:
- After submitting a print job, you can display the job with a SHOW
ENTRY command on the same node, but not on other nodes in the OpenVMS
Cluster environment.
- After defining or modifying a queue, the changes appear in a SHOW
QUEUE display on some nodes, but not on others.
- You can successfully submit or print a job on some nodes, but on
other nodes, you receive a JOBQUEDIS error.
13.11.6.1 Investigating the Problem
Perform the following steps:
- Enter SHOW LOGICAL to translate the QMAN$MASTER logical name within
the environment of each node in the cluster. If the logical name is not
defined on a given node, the default value SYS$COMMON:[SYSEXE] applies;
translate that value instead. If the SHOW LOGICAL translations show a
different physical disk name on one or more nodes, you have identified
the problem.
- Check the operator log files that were current at the time that one
of the affected nodes booted. Search for an OPCOM message similar to
the following one from the process JOB_CONTROL:
%%%%%%%%%%% OPCOM 4-FEB-2000 14:41:20.88 %%%%%%%%%%%
Message from user JOB_CONTROL on MANGLR
%JBC-E-OPENERR, error opening BOGUS:[QUEUE_DIR]QMAN$MASTER.DAT;
%%%%%%%%%%% OPCOM 4-FEB-2000 14:41:21.12 %%%%%%%%%%%
Message from user JOB_CONTROL on MANGLR
-RMS-E-FNF, file not found
13.11.6.2 Cause
This problem may be caused by different definitions for the logical
name QMAN$MASTER on different nodes in the cluster, causing multiple
queuing environments. You typically find this problem in OpenVMS
Cluster environments when you have just added a system disk or moved
the queuing database.
13.11.6.3 Correcting the Problem
Perform the following steps:
- If only one queue manager and queue database exist, skip this step.
If more than one queue manager and queue database exist, perform
the following substeps:
- Enter a command in the following format on one of the nodes where
the QMAN$MASTER logical name is incorrectly defined:
STOP/QUEUE/MANAGER/CLUSTER/NAME_OF_MANAGER=name
where /NAME_OF_MANAGER specifies the name of the queue
manager to be stopped.
- Delete all three files for the invalid queue database. (On systems
with multiple queue managers, you might have more than three invalid
files.)
- Reassign the logical name QMAN$MASTER on the affected systems and
correct the definition in the startup procedure where the logical name
is defined (usually SYLOGICALS.COM).
- Enter STOP/QUEUE/MANAGER/CLUSTER on an unaffected node to stop the
valid queue manager.
- Enter START/QUEUE/MANAGER on any node and verify that the queuing
system is working properly.
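Assuming the invalid database lives in the hypothetical directory BOGUS:[QUEUE_DIR] shown in the earlier error messages, the invalid queue manager is named PRINT_MANAGER (a hypothetical name), and the correct location is the default CLU$COMMON:[SYSEXE], the sequence might look like this:

```
$ ! On a node where QMAN$MASTER is incorrectly defined:
$ STOP/QUEUE/MANAGER/CLUSTER/NAME_OF_MANAGER=PRINT_MANAGER
$ DELETE BOGUS:[QUEUE_DIR]QMAN$MASTER.DAT;*
$ ! ...also delete the invalid queue and journal files...
$ DEFINE/SYSTEM/EXECUTIVE_MODE QMAN$MASTER CLU$COMMON:[SYSEXE]
$ ! (Correct the matching DEFINE in SYLOGICALS.COM as well, so the
$ ! definition survives a reboot.)
$ ! On an unaffected node:
$ STOP/QUEUE/MANAGER/CLUSTER
$ START/QUEUE/MANAGER
```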
13.12 Reporting a Queuing System Problem to Compaq
If you encounter problems with the queuing system that you need to
report to a Compaq support representative, provide the information
described below. This information will help Compaq support
representatives diagnose your problem. Please provide as much of the
information as possible.

Summary of the problem
Include the following information:
- The environment in which the problem occurred. For example, does
the problem occur only on certain nodes, from certain user accounts, or
when using certain layered products?
- How this problem affects your operations. What site operations are
being affected (for example, printing checks or submitting crucial
batch jobs)? How often does the problem occur (for example, one
printout per month, several printouts per day)?
- What events occurred on the system between the time the queuing
system operated correctly and the time the problem appeared.
- Any workarounds you are currently using.

Steps for reproducing the problem
Specify the exact steps and include a list of any special hardware or
software required to reproduce the problem.

Configuration information
For example:
- Is the configuration an OpenVMS Cluster system, and does it have
multiple system disks?
- Do you intend the queue database to be located in the default
location (SYS$COMMON:[SYSEXE])? Do you intend the master file to be
included in a different location than the queue and journal files?

Output from the SHOW QUEUE/MANAGERS/FULL command
Use SYSMAN to enter the command on all nodes, as follows:
$ RUN SYS$SYSTEM:SYSMAN
SYSMAN> SET ENVIRONMENT/CLUSTER
SYSMAN> DO/OUTPUT SHOW QUEUE/MANAGERS/FULL
SYSMAN> EXIT
$ TYPE SYSMAN.LIS
Type the output file SYSMAN.LIS to verify that the output for all
nodes matches.

Location of the queue and journal files
If possible, find out the most recent value that was specified in the
dirspec parameter of the START/QUEUE/MANAGER command (to
specify the location of the queue and journal files). If none was
specified, the default is SYS$COMMON:[SYSEXE].

Translation of the QMAN$MASTER logical name
Verify that the translation is the same on all nodes.
Enter the following commands, and include the resulting output:
$ RUN SYS$SYSTEM:SYSMAN
SYSMAN> SET ENVIRONMENT/CLUSTER
SYSMAN> DO SHOW LOGICAL QMAN$MASTER
If the translations returned from the SHOW LOGICAL command are not
physical disk names, repeat the SHOW LOGICAL command within the
environment of each node to translate the returned value until you
reach a translation that includes the physical device name.
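The repeated translation can also be automated with the F$TRNLNM lexical function. This is a sketch rather than part of the documented procedure; a complete version would strip the directory portion of each result before retranslating:

```
$ ! Follow QMAN$MASTER through successive logical name translations.
$ name = "QMAN$MASTER"
$ loop:
$ trans = F$TRNLNM(name)
$ IF trans .EQS. "" THEN GOTO done
$ WRITE SYS$OUTPUT name, " -> ", trans
$ name = trans
$ GOTO loop
$ done:
$ EXIT
```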

Operator log file output
Enter the following commands to search the operator log for any message
output by the job controller or queue manager:
$ SEARCH SYS$MANAGER:OPERATOR.LOG/WINDOW=5 -
_$ JOB_CONTROL,QUEUE_MANAGE
On systems with multiple queue managers, for queue managers other
than the default, specify the first 12 characters of the queue manager
name of any additional queue manager. For example, for a queue manager
named PRINT_MANAGER, specify PRINT_MANAGE as follows:
$ SEARCH SYS$MANAGER:OPERATOR.LOG/WINDOW=5 -
_$ JOB_CONTROL,QUEUE_MANAGE,PRINT_MANAGE

Information returned from relevant DCL commands
Include this information if entering a DCL command shows evidence of
the problem.

A copy of the journal file of the queue database
Use the Backup utility (BACKUP) with the /IGNORE=INTERLOCK qualifier to
create a copy of the file SYS$QUEUE_MANAGER.QMAN$JOURNAL, and provide
this copy to Compaq.
On systems with multiple queue managers, include copies of journal
files for all queue managers. Journal files for queue managers other
than the default are named in the format
name_of_manager.QMAN$JOURNAL.

Copies of any process dumps that might have been created
Enter the following commands to find any related process dumps, and
provide copies of the files to Compaq:
$ RUN SYS$SYSTEM:SYSMAN
SYSMAN> SET ENVIRONMENT/CLUSTER
SYSMAN> DO DIRECTORY/DATE SYS$SPECIFIC:[SYSEXE]JBC$*.DMP, -
_SYSMAN> QMAN$*.DMP,PRTSMB.DMP,LATSYM.DMP
If the problem involves an execution queue using a symbiont other
than PRTSMB or LATSYM, also include process dump files from the
symbiont. The file name has the format
image_file_name.DMP.

Output from the SHOW QUEUE command
If your problem affects individual queues, enter the SHOW QUEUE command
to show each affected queue.

Any other relevant information
For example:
- When was the queue database last created or modified? Was it
created or modified since the last reboot of the node or nodes?
- Does the IPCACP process exist on the affected nodes? If not, try to
determine whether the process existed earlier. For example, check the
system accounting records.
Chapter 14 Setting Up and Maintaining Queues
If you have a printer connected to your system, or if you want to use
batch processing, you must use queues. A queue allows
users to submit requests for printing or batch processing at any time;
the system then prints or processes jobs as resources allow.
Before setting up queues, you need to understand how the queue manager
and the queue database operate and how to create them for the OpenVMS
queuing system. These are explained in Chapter 13.
Information Provided in This Chapter
This chapter describes the tasks and explains the concepts involved in
setting up and maintaining queues.
Note
This chapter contains many references to DCL commands. You can find
additional information about all DCL commands in the OpenVMS DCL Dictionary.
14.1 Understanding Queuing
A batch or print job submitted either by entering the DCL command
SUBMIT or PRINT or through an application is sent to a queue for
processing. Information about the user's queue request, including the
type of job, the file name or names, the name of the queue, and any
special options,
is sent to the queue manager. The queue manager stores and retrieves
appropriate information from the queue database to print or execute the
job.
The queue manager places the job in the appropriate queue to await its
turn for processing. Only one print job can be printed on a printer at
a single time. However, more than one batch job can execute
simultaneously in a batch queue.
For more information about the queue manager and queue database, and
the operation of batch and print queues, including print symbionts, see
Chapter 13.
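For example, a user might send jobs to the system's default queues with the PRINT and SUBMIT commands (SYS$PRINT and SYS$BATCH are customary default queue names; the file names are illustrative):

```
$ PRINT/QUEUE=SYS$PRINT REPORT.TXT
$ SUBMIT/QUEUE=SYS$BATCH NIGHTLY_BUILD.COM
```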
14.1.1 Managing Queues on Small Systems
Many features available for queues are not required on small systems
with minimal queuing needs (for example, on workstations). If you are
managing a small system, you probably need only a subset of the
information in this chapter.
14.1.2 Understanding Classes and Types of Queues
In general, queues can be divided into two classes:
Execution queues
    Queues that accept batch or print jobs for processing.
Generic queues
    Queues that hold jobs until an appropriate execution queue becomes
    available. The queue manager then requeues the job to the available
    execution queue.
The following sections provide more details about execution and generic
queues.
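As a sketch of the distinction, the following commands create a batch execution queue and a generic print queue that feeds two execution queues (all queue names here are hypothetical examples):

```
$ ! An execution queue that processes batch jobs directly:
$ INITIALIZE/QUEUE/BATCH/START SYS$BATCH
$ ! A generic queue that requeues jobs to whichever of the two
$ ! execution queues becomes available:
$ INITIALIZE/QUEUE/GENERIC=(PRINT_A,PRINT_B)/START CLUSTER_PRINT
```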