D I G I T A L

Software Product Description
_________________________________________________________

PRODUCT NAME: DIGITAL Network Process Failover, Version 1.0
SPD 70.43

DESCRIPTION

Digital Network Process Failover Version 1.0 is a software foundation for High Availability (HA) applications that fills the gap between cluster technology and Fault Tolerant systems. Like clusters (loose coupling), it relies on standard computers; like Fault Tolerant systems, it ensures run-time data reliability. Digital Network Process Failover makes it easier to build applications that provide continuous service, where logic and data seamlessly survive a software failure.

The Digital Network Process Failover product is composed of the Network Process Failover Core Middleware and optional modules which provide platform administration agents. The Network Process Failover Core Middleware comprises a set of software components and an Application Programming Interface (API, provided at the user level) for each of those components.

Digital Network Process Failover provides a process survival implementation based on distributed technology that uses standby (shadow) processes running on different nodes. For each active (running) process configured on a given node, service continuity is provided by configuring a duplicate standby process on a different node. For this reason, a Network Process Failover process is said to be "highly available".

The Digital Network Process Failover environment brings process failover capabilities to Digital UNIX configurations (that is, a Network Process Failover domain which comprises multiple node configurations interconnected through a local area network). Digital Network Process Failover software components provide corresponding distributed services for configuring and managing high-availability processes, data, and communication means.
In addition, Network Process Failover provides applications with distribution facilities which enable transparent access to location-independent services. Tools and modules necessary for configuring and managing the Network Process Failover domain are provided as optional modules.

NETWORK PROCESS FAILOVER CORE MIDDLEWARE FEATURES

The Network Process Failover Core Middleware consists of three service components:

. the Data Manager (DM), which handles all in-memory user data-related requests and ensures data integrity among all Network Process Failover software components configured throughout the interconnected nodes,

. the Communication System (CS), which provides a high-availability communication means between the components running in the Network Process Failover environment on the configured nodes of the platform, and

. the UNIX Middleware Administrator (XMA), which provides configuration and management capabilities for achieving real-time failover detection in the Network Process Failover environment.

Each component is controlled at the user level by means of a corresponding API.

Data Management

Data Management (DM) offers a complete set of facilities for managing shared or private data sets, including:

. access rights,
. record identification and selection,
. access by value and pointer,
. update integrity, and
. back-up.

In DM, record selection can be direct or sequential (that is, sequential selection performed amongst a specified number of allocated records, with or without primary or secondary key criteria). DM provides a transaction mechanism for uninterrupted atomic execution of read or write operations throughout the platform. DM also offers trigger management for notifying a given process whenever a data set or record is updated.

Data Management ensures coherency between active reference data sets on a given node and the corresponding replica on a different node of the platform.
The user can request updates of the replica implicitly (the update occurs at transaction close) or explicitly. Users can also configure Data Management to handle disk backups of data set synchronization. Disk backup can be implicit or explicit.

Communication Services

Communication Services (CS) can provide communication between high-availability processes either platform-wide or within a single node. CS provides three communication methods: datagram, question/answer, and session-based. It also supports two addressing methods: functional and organic.

A high-availability process uses functional addressing to request a function without specifying its provider. Functional addressing provides:

. an independent routing scheme for location transparency, which enables flexible distribution throughout the platform, and

. a method for service distribution using load-sharing between nodes. The load may be evenly or preferentially distributed (through percentile load).

A high-availability process uses organic addressing to request a specific service from a specific node.

Configuration and Management

At the node level, Digital Network Process Failover Core Middleware uses the UNIX Middleware Administrator (XMA) component to install and configure the Network Process Failover environment. At the shell level, an interface provides basic facilities for:

. active and standby configuration of Network Process Failover processes, and
. failover management.

If a highly available process fails, Network Process Failover management provides capabilities that are transparent to users at the process level. For node-level failures, advanced configuration and real-time management facilities are available to the user through the XMA API. This API offers the following configuration services:

. high-availability process setup,
. data management setup, and
. communication services setup.
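
The active and standby configuration and node-level failover described above can be pictured with a small model. The sketch below is illustrative only: it is written in Python rather than against the XMA API, and every name in it (HAProcess, Domain, configure, node_failed) is hypothetical and does not come from the product. It assumes one standby per active process and shows only the bookkeeping of promoting a standby when the node hosting the active process fails.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class HAProcess:
    """Hypothetical model: an active process plus one standby on another node."""
    name: str
    active_node: str
    standby_node: Optional[str]
    events: List[str] = field(default_factory=list)

    def fail_over(self) -> bool:
        """Promote the standby to active; return False if no standby is left."""
        if self.standby_node is None:
            self.events.append(f"{self.name}: no standby available")
            return False
        failed = self.active_node
        self.active_node, self.standby_node = self.standby_node, None
        self.events.append(
            f"{self.name}: {failed} failed, now active on {self.active_node}"
        )
        return True


class Domain:
    """Hypothetical model of a domain holding the configured processes."""

    def __init__(self) -> None:
        self.processes: Dict[str, HAProcess] = {}

    def configure(self, name: str, active_node: str, standby_node: str) -> None:
        # The standby must live on a different node than the active process.
        if active_node == standby_node:
            raise ValueError("standby must run on a different node")
        self.processes[name] = HAProcess(name, active_node, standby_node)

    def node_failed(self, node: str) -> List[str]:
        """React to a node-level failure; return the processes failed over."""
        moved = []
        for proc in self.processes.values():
            if proc.active_node == node:
                if proc.fail_over():
                    moved.append(proc.name)
            elif proc.standby_node == node:
                # The active copy keeps running but has lost its standby.
                proc.standby_node = None
                proc.events.append(f"{proc.name}: standby on {node} lost")
        return moved


domain = Domain()
domain.configure("billing", active_node="nodeA", standby_node="nodeB")
domain.configure("routing", active_node="nodeB", standby_node="nodeA")
moved = domain.node_failed("nodeA")
```

In this model a process that has already failed over runs without a standby until one is configured again; how the real product restores redundancy is not described here.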
The services offered for supervision, failure detection, and failover strategy of the Network Process Failover environment are:

. constant surveillance of process state,
. alarm notifications, and
. reconfiguration control.

DIGITAL NETWORK PROCESS FAILOVER ADMINISTRATION AGENT FEATURES (OPTIONAL)

The optional Digital Network Process Failover administration agent features consist of the following elements:

. a Global Agent (npfmgr), which manages the entire Digital Network Process Failover domain. The high-availability characteristics of npfmgr are founded upon TruCluster ASE technology. (For any given Digital Network Process Failover domain, only one npfmgr can be up and running at a given time.) In a given domain, two members must be configured with TruCluster ASE; these are named pilot members.

. a Local Agent (npfagent), which is located on each system in the Digital Network Process Failover domain. The npfagent maintains the link between the local system and the platform npfmgr. It updates the npfmgr when any changes in the local configuration occur and dynamically starts any local reconfiguration action, either automatically or as instructed by npfmgr.

. a set of command line interface (CLI) utilities, which help with the configuration and management of high-availability processes on the systems in the Network Process Failover domain.

SOFTWARE INFORMATION AND REQUIREMENTS

Required Software

Network Process Failover Core Middleware for Digital UNIX requires the Digital UNIX operating system Version 4.0d, which is a separately licensed product. Please see SPD 41.61.20 for further details.

Optional Software

If the Network Process Failover application requires the use of the optional Network Process Failover administration agent features, TruCluster ASE (Available Server Software) is mandatory. Please see SPD 44.17.XX for further details.

Software Configuration Requirements

The Network Process Failover Core Middleware software configuration requires:

. a minimum of 128 MB of memory on each member system, and
. 25 MB of free disk space (permanent).

Note: These requirements refer to the disk space required on the system disk. Sizes given here are approximations. Actual sizes can vary depending on the system environment, configuration, and software options.

SOFTWARE LICENSING

Digital Network Process Failover is provided only under license as a standard Digital UNIX software layered product. Each component in the Network Process Failover environment requires a separate license. Please contact your local Digital representative or dealer for more information about Digital licensing terms and policies.

This product supports the Digital UNIX License Management Facility (LMF). For further information on the LMF, please consult the Digital UNIX Operating System Software Product Description (SPD 41.61.20) or the Digital UNIX Operating System documentation.

SOFTWARE PRODUCT SERVICES

A variety of service options are available from Digital. For further information, please contact your local Digital service representative.

HARDWARE REQUIREMENTS

Supported Hardware

The initial release of Digital Network Process Failover will be supported on AlphaServer 800 and AlphaServer 4100 SMP (symmetric multiprocessing) systems.

Redundant network capabilities can be provided by FDDI. Network Process Failover will support FDDI links as an interconnection means to its domain. When network redundancy is not required, the Network Process Failover domain supports standard Ethernet and Fast Ethernet.

Shared disk management may be required to handle disk storage redundancy. The Network Process Failover domain uses TruCluster ASE technology to support SCSI device sharing. Primary and backup networks must be installed between pilot members running npfmgr.

GROWTH CONSIDERATIONS

The minimum hardware and software requirements for any future version of this product may be different from the requirements for the current version.

DISTRIBUTION MEDIA

Network Process Failover is an individually licensed product which is distributed as part of the Digital UNIX Layered Products CD-ROM.

ORDERING INFORMATION

DIGITAL Network Process Failover Development

  QA-66RAA-GZ   Network Process Failover Development Documentation Kit
  QL-66RA*-*A   Network Process Failover Software Development Licenses

DIGITAL Network Process Failover Runtime

  QA-66SAA-GZ   Network Process Failover Runtime Documentation Kit
  QL-66SA*-**   Network Process Failover Runtime Licenses