(--------------------------------------------------------------------) (Instances and Environments\cr_instances_and_environments)

WASD (instances) and (environments) are two distinct mechanisms for supporting multiple WASD server processes on a single system.

Server instances are multiple, cooperating server processes providing the same set of configured resources.

Server environments are multiple, independent server processes providing differently configured resources. (....................................................................) (Server Instances\hd_server_instances)

The term (instance) is used by WASD to describe an autonomous server process. WASD supports multiple server processes running on a single system, alone or in combination with multiple server processes running across a cluster. This is (not) the same as supporting multiple virtual servers (see (HREF=[-.config]config.sdml!../config/#hd_virtual_services)([-.CONFIG]DOC_config.sdml)). When multiple instances are configured on a single system they cooperate to distribute the request load between themselves and share certain essential resources such as accounting and authorization information. (WARNING) Versions earlier than Compaq TCP/IP Services v5.3 and some TCPware v5.(n) releases (at least) have a problem with socket listen queuing that can cause services to (hang) (should this happen, just disable instances and restart the server). Ensure you have the requisite version/ECO/patch installed before activating multiple instances on production systems! (VMS Clustering Comparison\hd_instances_compare)

The approach WASD has used in providing multiple instance serving may be compared in many ways to VMS clustering.

A cluster is often described as a loosely-coupled, distributed operating environment where autonomous processors can join, process and leave (even fail) independently, participating in a single management domain and communicating with one another for the purposes of resource sharing and high availability.

Similarly WASD instances run in autonomous, detached processes (across one or more systems in a cluster) using a common configuration and management interface, aware of the presence and activity of other instances (via the Distributed Lock Manager and shared memory), sharing processing load and providing rolling restart and automatic (fail-through) as required. (Load Sharing\hd_instances_load)

On a multi-CPU system there are performance advantages to having processing available for scheduling on each CPU. WASD employs AST (I/O) based processing and was not originally designed to support VMS kernel threading. Benchmarking has shown this to be quite fast and efficient, even when compared to a kernel-threaded server (OSU) across 2 CPUs. The advantage of multiple CPUs for a single multi-threaded server also diminishes where a site frequently activates scripts, since these (potentially) require a CPU each. Where a system has many CPUs (and to a lesser extent with only two and few script activations) WASD's single-process, AST-driven design scales more poorly. Running multiple WASD instances addresses this.

(Of course load sharing is not the only advantage of multiple instances \BOLD) (Restart\hd_instances_restart)

When multiple WASD instances are executing on a node and a restart is initiated, only one process shuts down at a time. Others remain available for requests until the one restarting is again fully ready to process them itself, at which point the next commences restart. This has been termed a (rolling restart). Such behaviour allows server reconfiguration on a busy site without even a small loss of availability.
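For illustration, a rolling restart is typically initiated from the command line. This is a minimal sketch only, assuming the HTTPD foreign verb has been defined by the site's WASD startup procedures.

$! a sketch only; assumes the HTTPD verb from the WASD startup
$ HTTPD /DO=RESTART    ! each instance restarts in turn (rolling restart)
(Fail-Through\hd_instances_fail)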

When multiple instances are executing on a node and one of these exits for some reason (resource exhaustion, bugcheck, etc.) the other(s) will continue to process requests. Of course requests in-progress by the particular instance at the time of instance failure are disconnected (this contrasts with the rolling restart behaviour described above). If the former process has actually exited (in contrast to just the image) a new server process will automatically be created after a few seconds.

The term (fail-through) is used rather than (failover) because one server does not commence processing as another ceases. All servers are constantly active, with those remaining immediately and automatically taking all requests in the absence of any one (or more) of them. (Considerations\hd_instances_cost)

Of course (there is no such thing as a free lunch) and supporting multiple instances is no exception to this rule. To coordinate activity between themselves, and access to shared resources, multiple instances use low-level mutexes and the VMS Distributed Lock Manager (DLM). This does add some system overhead and a little latency to request processing; however, as the benchmarks indicate, increases in overall request throughput on a multi-CPU system easily offset these costs. On single CPU systems the advantages of rolling restart and fail-through need to be assessed against the small cost on a per-site basis. It is to be expected that many low-activity sites will not require multiple instances to be active at all.

When managing multiple instances on a single node it is important to remember that requests are distributed between the processes in round-robin fashion, and that this needs to be taken into account when debugging scripts, using the Server Administration page, the likes of WATCH, etc. (see (hd_watch_instances)). (Configuration\hd_instances_config)

If not explicitly configured only one instance is created. The configuration directive [InstanceMax] allows multiple instances to be specified (see (HREF=[-.config]config.sdml!../config/#cr_config_directives)([-.CONFIG]DOC_config.sdml)). When this is set to an integer that many instances are created and maintained. If set to (CPU) then one instance per system CPU is created. If set to (CPU-(integer)) then that many fewer instances than CPUs are created (for example, (CPU-1) creates one instance for all but one CPU), and so on. The current limit on instances is eight, although this is somewhat arbitrary. As with all requests, Server Administration page access is automatically shared between instances. There are occasions when consistent access to a single instance is desirable. This is provided via an (admin service) (see (HREF=[-.config]config.sdml!../config/#cr_service_directives)([-.CONFIG]DOC_config.sdml)).
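As an illustration, the following is a minimal sketch of how the directive described above might appear in the site's global configuration file (the comment conventions shown are those commonly used in WASD configuration files; check the configuration documentation referenced above for the file applicable to the site).

# create and maintain two instances
[InstanceMax] 2
# alternatively, one instance per system CPU
#[InstanceMax] CPU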

When executing, the instance number is appended to the (WASD) portion of the server process name. Associated scripting processes are named accordingly. This example shows such a system: ()
  Pid    Process Name    State Pri     I/O        CPU        Page flts  Pages
21600801 SWAPPER         HIB    16        0   0 00:06:53.65          0      0
21600807 CLUSTER_SERVER  HIB    12     1879   0 00:01:14.51         91    112
21600808 CONFIGURE       HIB    10       30   0 00:00:01.46         47     23
21600816 ACME_SERVER     HIB    10    71525   0 00:01:28.08        508    713 M
21600818 SMISERVER       HIB     9    11197   0 00:00:02.29        158    231
21600819 TP_SERVER       HIB     9  1337711   0 00:05:55.78         80    105
216421F1 WASD1:80        HIB     5  5365731   0 00:23:12.86      37182   7912
2164523F WASD2:80        HIB     5  5347938   0 00:23:31.41      38983   7831
2162BA5D WASD_WOTSUP     HIB     3     2111   0 00:00:00.47        735    518
2164ABCF WASD1:80-651    LEF     6    57884   0 00:00:16.71       3562   3417
2164CBDB WASD2:80-612    LEF     4    19249   0 00:00:04.16       3153   3116
21631BDC WASD2:80-613    LEF     5    18663   0 00:00:07.19       3745   3636
2164BBE6 WASD1:80-658    LEF     5     3009   0 00:00:00.94       2359   2263
(Status\hd_instance_status)

The instance management infrastructure distributes basic status data to all instances on the node and/or cluster. The intent is to provide an easily comprehended snapshot of multi-instance/multi-node WASD processing status. The data comprises: (UNNUMBERED)
instance name (e.g. "KLAATU::WASD:443")
date/time the instance status was last updated + how long (ago) this was (seconds, minutes, hours, or days)
date/time the instance last started + how long (ago) this was (seconds, minutes, hours, or days)
number of times the instance has started up
date/time the instance last exited + how long (ago) this was (seconds, minutes, hours, or days)
the VMS status at the last exit
instance WASD version (e.g. "11.2.0")
number of requests processed during the preceding minute
number of requests processed during the preceding sixty minutes

The data are constrained to these items due to the need to accommodate them within a 64 byte lock value block for cluster purposes. Single node environments do not utilise the DLM, each instance updating its table entry directly.

Each node has a table with an entry for every other instance in that WASD environment. Instance data are updated once every minute, so any instance with data older than one minute is no longer behaving correctly. This could be due to some internal error, or because the instance no longer exists (e.g. it has been stopped, exited or is otherwise no longer executing). An entry for an instance that no longer exists is retained indefinitely, or until a /DO=STATUS=PURGE is performed, removing all such (expired) entries, or a /DO=STATUS=RESET, removing all entries (and allowing those currently executing to repopulate the instance data over the next minute).
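For illustration, the following is a minimal command-line sketch of the housekeeping just described, again assuming the HTTPD verb defined by the WASD startup.

$ HTTPD /DO=STATUS          ! display the instance status report
$ HTTPD /DO=STATUS=PURGE    ! remove expired (stale) entries
$ HTTPD /DO=STATUS=RESET    ! remove all entries; executing instances repopulate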

These status data are accessible via command-line and in-browser reports, intended for larger WASD installations, primarily those operating across multiple nodes in a cluster. With the data held in common, any of the other nodes can provide a per-cluster history even if one or more nodes become completely non-operational.

This is an example report on a 132 column terminal display. Due to screen width constraints the date/time omits the year field of the date.
$ httpd/do=status
  Instance          Ago   Up               Ago   Count  Exit             Ago   Status      Version  /Min  /Hour
  ~~~~~~~~~~~~~~~~  ~~~~  ~~~~~~~~~~~~~~~  ~~~~  ~~~~~  ~~~~~~~~~~~~~~~  ~~~~  ~~~~~~~~~~  ~~~~~~~  ~~~~  ~~~~~
1 KLAATU::WASD:80   41s   18-DEC 23:27:57  54m      21  18-DEC 23:27:57  54m   %X00000001  11.2.0      2     17
  KLAATU::WASD1:80---1d-17-DEC-02:49:21---1d-----5-17-DEC-02:50:03---1d-%X00000001-11.2.0----3-----15
  KLAATU::WASD2:80---1d-17-DEC-02:49:25---1d-----5-17-DEC-02:50:07---1d-%X00000001-11.2.0----0-----10
  KLAATU::WASD3:80---1d-17-DEC-02:49:29---1d-----6-17-DEC-02:50:11---1d-%X00000001-11.2.0----0------3
as at 19-DEC-2017 00:22:41

This example CLI report shows a single node where a single instance was started, the configuration was changed to three instances, and the server restarted so that the three instances began processing. The configuration was then returned to a single instance and the existing three instances restarted the previous day, resulting in the original single instance returning to processing. That instance was last (re)started some 54 minutes ago (a normal exit status showing) and its status was last updated some 41 seconds ago. Note that the three instances showing white-space struck-through with hyphens are stale, having last been updated 1 day ago. Entries older than three minutes are displayed in this format to differentiate them from current entries.

The same report on an 80 column terminal. Note that the explicit date/time columns have been omitted, leaving only the period (ago) since each event happened.
$ httpd/do=status
  Instance          Ago   Up    Count  Exit  Status      Version  /Min  /Hour
  ~~~~~~~~~~~~~~~~  ~~~~  ~~~~  ~~~~~  ~~~~  ~~~~~~~~~~  ~~~~~~~  ~~~~  ~~~~~
1 KLAATU::WASD:80   5s    58m      21  58m   %X00000001  11.2.0      1     18
  KLAATU::WASD1:80---1d---1d-----5---1d-%X00000001-11.2.0----3-----15
  KLAATU::WASD2:80---1d---1d-----5---1d-%X00000001-11.2.0----0-----10
  KLAATU::WASD3:80---1d---1d-----6---1d-%X00000001-11.2.0----0------3
as at 19-DEC-2017 00:25:05

Where multiple instances exist, or have existed, and the terminal page size is greater than 24 lines, HTTPMON displays an equivalent of the 80 column report at the bottom of the display.

Similarly, the Server Admin report ((cr_server_admin)) shows an HTML equivalent of the 80 column report immediately below the control and time panels. (Using Instance Status\hd_instance_status_using) (UNNUMBERED)
The strike-through (hyphens) of an instance line immediately indicates the instance is no longer updating (after 3 minutes).
Clear stale entries using $ HTTPD/DO=STATUS=PURGE.
The instance name (Ago) shows how long ago it was last updated.
If the exit (Ago) is more recent than the startup (Ago) the instance has exited but not restarted.
The exit (Status) can show a non-normal status (i.e. not %X00000001).
An excessive startup (Count) suggests something amiss.
Per-minute and/or per-hour request counts that seem atypically low, while instance status seems otherwise normal, suggest a networking issue, perhaps up-stream. (....................................................................) (Server Environments\hd_server_environments)

WASD server environments allow multiple, distinctly configured environments to execute on a single system. Generally, WASD's unlimited virtual servers and multiple account scripting eliminate the need to kludge such requirements with multiple execution environments. However there may be circumstances that make this desirable; regression and forward-compatibility testing come to mind.

First some general comments on the design of WASD. (UNNUMBERED)

WASD creates and populates its own logical name table (see (HREF=[-.config]config.sdml!../config/#hd_logical_names)([-.CONFIG]DOC_config.sdml)). It also adds the WASD_FILE_DEV[(n)] and WASD_ROOT[(n)] logical names to the SYSTEM logical name table.

WASD creates and uses rights identifiers. Installation creates and associates specific rights identifiers with separate accounts for server and script execution. Some specifically named identifiers have functional meaning to the server. Server startup can create and associate rights identifiers used to manage the server run-time environment.

WASD makes extensive use of the DLM to coordinate WASD activities system- and cluster-wide. All executing server images are aware of all other executing server images on the same system and within the same cluster. This supports all manner of coordination (e.g. instance recovery, instantiated services) and data exchange (e.g. $HTTPD/DO=MAP/ALL) activities.

WASD uses global sections to accumulate data and for communication between WASD instances. Some of these are by default permanent and remain on a system unless explicitly removed.

WASD uses detached scripting processes. As it is possible to $STOP a server process (thereby preventing its run-down handlers from cleaning up those detached processes), the server needs to be able to recognise such processes as its 'own' and clean up any 'orphaned' ones the next time it starts. It does this by having a rights identifier associated with the server process name (e.g. WASD:80 grants its scripting processes WASD_PRC_WASD_80, a second instance WASD2:80, WASD_PRC_WASD2_80, etc.). A command sketch for inspecting these names and identifiers follows.
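The following is an illustrative sketch only of examining the system-level logical names and per-environment rights identifiers described above. The wildcarded names are assumptions based on the naming conventions given in this section and may differ between sites and versions.

$! system-level logical names added by WASD (one set per environment number)
$ SHOW LOGICAL /SYSTEM WASD_FILE_DEV*, WASD_ROOT*
$! rights identifiers associating scripting processes with a server process
$ MCR AUTHORIZE
UAF> SHOW /IDENTIFIER WASD_PRC_*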

All of these mechanisms support multiple, independent environments on a single system. Due to design and implementation considerations there are fifteen such environments available per system. The primary (default) environment is number one. Environments two to fifteen are available for site usage. (Demonstration mode, /DEMO, uses environment zero.) Server (instances) ((hd_server_instances)) share a single environment.

There are two approaches to provisioning such multiple, independent environments. (Ad Hoc Server Wrapper\hd_instances_adhoc)

This is a DCL procedure that allows virtually any WASD release HTTP server to be executed in a detached process, either by itself or concurrently with a full release or other ad hoc detached server. The server image and associated configuration files used by this process can be specified within the procedure allowing completely independent versions and environments to be fully supported.

Full usage instructions may be found in the example procedure(s) in (HTML=WASD_ROOT:[EXAMPLE]*ADHOC*.COM) (HTML/OFF) WASD_ROOT:[EXAMPLE]*ADHOC*.COM (HTML/ON)

Two versions are provided, one for pre-v10 and one for post-v10 (due to changes in logical naming schema). (Formal Environments\hd_instances_formal)

Although the basic infrastructure for supporting multiple environments (i.e. the 0..15 environment number) has been in place since version 8, formal support in server CLI qualifiers and DCL procedures has only been available since version 10. To support version 9 or earlier environments the (hd_instances_adhoc) must be used.

WASD version 10 startup and other run-time procedures have been modified to support running multiple WASD environments simply from independent WASD file-system trees. The standard (HTML= STARTUP.COM ) (HTML/OFF) STARTUP.COM (HTML/ON) procedure accepts the WASD_ENV parameter to specify which environment (1..15) the server should execute within (primary/default is 1). The procedure then derives the WASD_ROOT logical name from the location of the startup procedure.

For example:
$! start current release
$ WASD_STARTUP = "/SYSUAF=(ID,SSL)/PERSONA"
$ @DKA0:[WASD_ROOT.STARTUP]STARTUP.COM
$! start previous release in environment 2
$ WASD_ENV = 2
$ @DKA0:[WASD_ROOT_MINUS1.STARTUP]STARTUP.COM
(Considerations\hd_instances_formal_consider)

WASD environments each fully support all WASD features and facilities (including multiple server instances), with the exception of DECnet scripting where, because of DECnet objects' global (per-system) definition, the one object definition must be shared between environments.

Per-environment configuration must be done in its own WASD_ROOT part of the file-system and logical names must be defined in the environment's associated logical name table. The site administrator must keep track of which environment is to be accessed from the command-line and set the process logical name search list using the appropriate $ @WASD_FILE_DEV([n]), where (n) can be a non-primary environment number (see (HREF=[-.config]config.sdml!../config/#hd_logical_names)([-.CONFIG]DOC_config.sdml)).
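As an illustration only (assuming environment 2, as in the startup example above), setting the process logical name search list might look like:

$! primary (default) environment
$ @WASD_FILE_DEV
$! non-primary environment 2
$ @WASD_FILE_DEV2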

It is not possible to have multiple environments bind their services to the same IP address and port (for fundamental networking reasons). Unless the network interface is specifically multi-homed for the purpose, services provided by separate environments must be configured to use unique IP ports.

Non-primary environments (2..15) prefix the environment number as a (hex) digit before the (WASD) in the process name. The above example, when executing, each environment with a single scripting process, would appear in the system as follows (the second environment providing a service on port 2280): ()
  Pid    Process Name    State Pri     I/O        CPU        Page flts  Pages
00000101 SWAPPER         HIB    16        0   0 00:00:11.98          0      0
00000111 ACME_SERVER     HIB    10     6247   0 00:00:12.63        540    611 M
00000112 QUEUE_MANAGER   HIB    10      328   0 00:00:00.18        136    175
00000122 TCPIP$INETACP   HIB    10  1249419   0 00:07:33.95        401    326
00000123 TCPIP$ROUTED    LEF     6  3495839   0 00:01:15.49        166    165 S
00000468 WASD:80         HIB     6   132924   0 00:01:29.26      17868   2856
0000046D 2WASD:2280      HIB     6   129344   0 00:01:29.26      17712   2840
0000049D WASD:80-8       LEF     4     4449   0 00:00:00.67        934    194
00000503 2WASD:2280-2    LEF     4      565   0 00:00:00.28        732    102
(Cleaning Up\hd_instances_formal_cleanup)

As described earlier, each environment creates and maintains logical name table(s) and system-level name(s), detached scripting processes, lock resources and permanent global sections. Lock resources disappear with the server processes. Logical names, global sections, rights identifiers and occasionally detached scripting processes may require some cleaning up when a non-primary environment's use is concluded.
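As a rough illustration only, the following sketch checks for and removes the system-level logical names of a concluded non-primary environment. Environment 2 is assumed here and the names follow the conventions described in this section; global sections, rights identifiers and any orphaned processes are examined and removed with the usual VMS management tools and privileges.

$! check what remains for environment 2
$ SHOW LOGICAL /SYSTEM WASD_FILE_DEV2, WASD_ROOT2
$! remove the system-level logical names (requires suitable privileges)
$ DEASSIGN /SYSTEM WASD_FILE_DEV2
$ DEASSIGN /SYSTEM WASD_ROOT2
(--------------------------------------------------------------------)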