The HTML pre-processor is used to provide dynamic information inside an otherwise static HTML (HyperText Markup Language) document. The HTTPd server provides this as internal functionality, scanning the input document for special pre-processor directives, which are replaced by dynamic information based upon the particular directive.
As of version 5.1 WASD SSI has been enhanced to provide flow-control statements, allowing blocks of the document to be conditionally processed, see 4.7 Flow Control. These extensions allow quite versatile documents to be created without resorting to script processing.
Two documents are provided as examples of SSI processing.
|WASD_ROOT:[WASDOC.ENV]SSI.SHTML||(access as SSI)|
|WASD_ROOT:[WASDOC.ENV]XSSI.SHTML||(access as SSI)|
By default the HTML pre-processor is invoked when the document file's extension is ".SHTML". As there is a significant overhead with pre-processed HTML compared to normal HTML, it should only be used when it serves a useful documentary purpose, and not just for the novelty.
Essential compatibility with OSU Server Side Includes is provided. This may ease any transition between the two. See 4.11 OSU Compatibility for further information.
One effective use for pre-processed HTML is the creation of single virtual documents from two or more physical documents. That is, the pre-processed document is used to include multiple physical documents, which may even be independently administered, to return a composite document to the client. This is a relatively low-overhead activity as SSI goes, but because it is a dynamic document no last-modified information is supplied without some extra considerations (see 4.2 Last-Modified Information).
This provides an example of the efficient use of SSI processing to create virtual documents. Each page will comprise a header (containing the body tag and page header, etc.), the document proper and a footer (containing the end-of-page information, modification date, and end-body tag, etc.).
A more efficient variant places the document proper in its own, plain HTML file which is then #included (it is much, much, much more efficient for the server to throw a file at the network, than parse every character in one ;^)
This example provides a seemingly more convoluted, but very much more powerful configuration, that uses recursion to greatly simplify maintenance of common-layout documents for the end-user.
File 1; the document accessed via the browser URL. It doesn't matter what its name is, as this configuration is completely naming independent.
File 2; the TEMPLATE.SHTML referred to by the first include above.
File 3; the DOCUMENT.HTML referred to by the second include in file 1.
This is just a bunch of HTML!
This is an explanation of how it works …
The following link provides an example of such a virtual document.
SSI documents generally contain dynamic elements, that is those that may change with each access to the document (e.g. current date/time). This makes evaluation of any document modification date difficult and so by default no "Last-Modified: timestamp" information is supplied against an SSI document. The potential efficiencies of having document timestamps, so that requests can be made for a document to be returned only if modified after a certain date/time ("If-Modified-Since: timestamp"), are significant against the CPU overheads of processing SSI documents.
WASD allows the document author to determine whether or not a last-modified header field should be generated for a particular document and which contributing file(s) should be used to determine it. This is done using the #modified directive. If a virtual document is made up of multiple source documents (files) each can be assessed using multiple virtual= or file= tags, the most recently modified will be used to determine if the virtual document has been modified, and also to generate the last-modified timestamp.
The if-modified-since tag compares the determined revision date/time of the document file(s) with any "If-Modified-Since:" timestamp supplied with the request. If the virtual document's revision date/time is the same or older than the request's then a not-modified (304 status) header is generated and sent to the client and document processing ceases. If more recent an appropriate "Last-Modified:" header field is added to the document and it continues to be processed.
If a request has a "Pragma: no-cache" field (as with Navigator's reload function) the document is always generated (this is consistent with general WASD behaviour). The following example illustrates the essential features.
This construct should be placed at the very beginning of the SSI document, and certainly before there is any chance of output being sent to the browser. Once output to the client has occurred there can be no change to the response header information (not unreasonably).
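The decision described above can be sketched in Python. This is only a model of the documented behaviour, not the server's code: the function name `modified_check` and its parameters are hypothetical, and the contributing files are represented simply by their revision date/times.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def modified_check(file_times, if_modified_since=None, no_cache=False):
    """Return (status, last_modified_header) for a virtual document.

    file_times: revision date/times (aware UTC datetimes) of the contributing files.
    if_modified_since: the raw request header value, or None if absent.
    no_cache: True when the request carried "Pragma: no-cache".
    """
    # The most recently modified contributing file dates the whole document.
    newest = max(file_times).replace(microsecond=0)
    last_modified = format_datetime(newest, usegmt=True)

    if if_modified_since and not no_cache:
        requested = parsedate_to_datetime(if_modified_since)
        # Same as or older than the client's copy: send not-modified and stop.
        if newest <= requested:
            return 304, None
    # Otherwise the document is generated with a Last-Modified field.
    return 200, last_modified


header = datetime(2000, 1, 1, tzinfo=timezone.utc)
body = datetime(2000, 6, 1, 12, 0, 0, tzinfo=timezone.utc)
print(modified_check([header, body], "Sat, 01 Jan 2000 00:00:00 GMT"))
```

Taking the maximum of the file times mirrors the rule that the most recently modified contributing file determines the timestamp of the virtual document.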
SSI preprocessed documents are dynamic in the sense that the information presented can be different every time the document is generated (e.g. if time directives are included). If it is important that each time the document is accessed it is regenerated then an HTML META tag can be included in the HTML header to cause the document to expire. This will result in the document being reloaded with each access. This can be accomplished two ways.
The syntax follows closely that used by the other implementations, but some directives are tailored to the WASD and VMS environment. The directive is enclosed within an HTML comment and takes the form:
A tag provides parameter information to the directive. A directive may have zero, one or more parameters. Values supplied with any tag may be literal or via variable substitution (see 4.6 Variables). A value must be enclosed by quotation marks if it contains white-space.
A directive can be split over multiple lines provided the new line begins naturally on white-space within the directive. For example, this is correctly split
Directive and tag keywords are case insensitive. The tag value may or may not be case sensitive, depending upon the command/tag. Generally the effect of a command is to produce additional text to be inserted in the document, although it is possible to control the flow of processing in a document with decision structures.
|#accesses||document access count||4.5.1 #ACCESSES|
|#config||document processing options||4.5.2 #CONFIG|
|#dir||directory listing||4.5.3 #DIR|
|#dcl||DCL command processing||4.5.4 #DCL|
|#echo||output information||4.5.5 #ECHO|
|#elif||flow control||4.5.6 #ELIF|
|#else||flow control||4.5.7 #ELSE|
|#endif||flow control||4.5.8 #ENDIF|
|#exec||same as "#dcl"||4.5.9 #EXEC|
|#exit||flow control, stop current document processing||4.5.10 #EXIT|
|#fcreated||output file creation date/time||4.5.11 #FCREATED|
|#flastmod||output file last modification date/time||4.5.12 #FLASTMOD|
|#fsize||output file size||4.5.13 #FSIZE|
|#if||flow control||4.5.14 #IF|
|#include||include a text file or another SSI document||4.5.15 #INCLUDE|
|#modified||HTTP response control||4.5.16 #MODIFIED|
|#orif||flow control||4.5.17 #ORIF|
|#printenv||list document variables||4.5.18 #PRINTENV|
|#set||assign value to a document variable||4.5.19 #SET|
|#ssi||block of SSI statements||4.5.20 #SSI|
|#stop||stop SSI processing completely||4.5.21 #STOP|
The #accesses directive allows the number of times the document has been accessed to be included. It does this by creating a counter file in the same location and using the same name with a dollar symbol appended to the type (extension). The count may be reset by deleting the file. This is an expensive function (in terms of file system activity) and so should be used appropriately. It can be disabled by server configuration. Three tags provide additional functionality:
Provides the count as 1st, 2nd, 3rd, 4th, 5th … 10th, 11th, 12th … 120th, 121st, 122nd, etc.
This tag includes the specified text immediately after the access count is displayed, then adds the creation date of the counter file.
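The ordinal form of the count follows the usual English rule, where 11, 12 and 13 take "th" even though they end in 1, 2 and 3. A minimal Python sketch of that rule (the function name `ordinal` is purely illustrative and is not part of the server):

```python
def ordinal(count):
    """Format a non-negative access count as 1st, 2nd, 3rd, 4th ... 11th ... 121st."""
    # 11, 12 and 13 (and 111, 112, 113, ...) take "th" despite their last digit.
    if 11 <= count % 100 <= 13:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(count % 10, "th")
    return f"{count}{suffix}"


print(", ".join(ordinal(n) for n in (1, 2, 3, 4, 11, 12, 120, 121, 122)))
```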
The #config directive allows time and file size formats to be specified for all subsequent directives providing these values. Optional specifications for individual directives may still be made; these override, but do not supersede, any specification made using a config directive. A config directive may be made once, or any number of times in a document, and applies until another is made, or until the end of the document.
This directive allows the error message generated if a problem occurs processing the SSI document (e.g. a misspelled directive) to be specified in the document.
Switches document processing trace on or off, intended for use when debugging more complex or flow-controlled SSI documents.
Output from a trace is colour-coded.
The following link provides an example of a document trace.
The #dir directive generates an Index of … directory listing inside an HTML document. Apart from not generating a title (it is up to the pre-processed document to title, or otherwise caption, the listing) it provides all the functionality of the WASD HTTPd directory listing (see 3. Directory Listing), including query string format control via the "par=" parameter (note that the "?httpd=index" introducer used with directory listings is not necessary from SSI). It is a WASD HTTPd extension to pre-processed HTML.
Listing specified using a VMS file path.
Listing specified using URL-style syntax.
The #dcl directive executes a DCL command and incorporates the output into the processed document. It is a WASD HTTPd extension to the more common exec directive, which is also included.
By default, output from the DCL command has all HTML-forbidden characters (e.g. "<", "&") escaped before inclusion in the processed document. Thus command output cannot interfere with document markup, but nor can the DCL command provide HTML markup. This behaviour may be changed by appending the following tag to the directive:
Some #dcl directives are for privileged documents only, documents defined as those being owned by the SYSTEM account, and not being world-writeable. The reason for this should be obvious. There are implicit security concerns about any document being able to execute any DCL command(s), even if it is being executed in a completely unprivileged process. Hence only innocuous commands are allowed in standard documents.
Execute the DCL "WRITE SYS$OUTPUT" command, using the specified parameter.
Execute the DCL "SHOW" command, using the specified parameter.
Execute the DCL "DIRECTORY" command, using the supplied file specification. Qualifiers may be included in the optional "par" tag to control the format of the listing.
Execute the specified DCL command.
Execute the DCL command procedure specified as a VMS file path, with any specified parameters applied to the procedure.
Execute the DCL command procedure specified in URL-style syntax, with any specified parameters applied to the procedure.
Execute the specified CGI script. The CGI response header is suppressed and only the response body is included in the document.
The #echo directive incorporates the specified information into the processed document. Multiple tags may be used within the one directive.
Any SSI variable (e.g. CREATED), CGI variable (e.g. HTTP_USER_AGENT), or document assigned variable (e.g. EXAMPLE1), see 4.6 Variables.
The date/time of the current document's creation.
Include the current date/time.
Include the current Greenwich Mean Time (UTC) date/time.
The current document's URL-style path.
The current document's VMS file path.
Append the specified string to the response header (with correct carriage control). Should be used as early as possible in the SSI document.
The date/time of the current document's last modification.
The #elif directive (else-if) allows blocks of HTML markup and SSI directives to be conditionally processed, see 4.7 Flow Control and 4.5.14 #IF. This directive effectively allows a case statement to be constructed.
The #else directive allows blocks of HTML markup and SSI directives to be conditionally processed, see 4.7 Flow Control. It is the default block after an "#if", "#orif" or "#elif".
The #endif directive marks the end of a block of document text being conditionally processed, see 4.7 Flow Control.
The #exec directive executes a DCL command and incorporates the output into the processed document. It is the VMS equivalent of the exec shell directive of some Unix implementations. It is implemented in the same way as the #DCL directive, and so the general detail of that directive applies. It supports both the cmd tag and the cgi tag, allowing execution of CGI scripts (the response header is absorbed).
The #exec directive is for privileged documents only, documents defined as those being owned by the SYSTEM account, and not being world-writeable. The reason for this should be obvious. There are implicit security concerns about any document being able to execute any DCL command(s), even if it is being executed in a completely unprivileged process.
The #exit directive causes the server to stop processing the current SSI file. If the current file was an #included SSI file, processing continues with the parent file. Note that the #stop directive is also available; it stops processing of the entire virtual document.
The #fcreated directive incorporates the creation date/time of a specified file/document into the processed document.
Document specified using a VMS file path.
Document specified using URL-style syntax.
The #flastmod directive incorporates the last modification date/time of a specified file/document into the processed document.
Document specified using a VMS file path.
Document specified using URL-style syntax.
The #fsize directive incorporates the size, in bytes, kbytes or Mbytes, of a specified file/document into the processed document.
Document specified using a VMS file path.
Document specified using URL-style syntax.
The #if directive allows blocks of HTML markup and SSI directives to be conditionally processed, see 4.7 Flow Control.
Variable the decision will be based upon.
Is the string the same as in the variable?
If the variable is a number is it the same as this?
If the variable is a number is it greater than this?
If the variable is a number is it less than this?
Search the variable for this string. May contain the * (asterisk) wildcard, matching one or more characters, and the % (percentage), matching any single character.
As in the following examples:
The #include directive incorporates the contents of a specified file/document into the processed document.
Include the contents of the document specified using a VMS file specification.
Include the contents of the document specified using URL-style syntax.
The contents of the specified file are included differently depending on the MIME content-type of the file. Files of text/html content-type (HTML documents) are included directly, and any HTML tags within them contribute to the markup of the document. Files of text/plain content-type (plain-text documents) are encapsulated in "<pre></pre>" tags and have all HTML-forbidden characters (e.g. "<", "&") escaped before inclusion in the processed document. An HTML file can be forced to be included as plain-text by using the following syntax:
To "force" a file to be considered as text regardless of the actual content (as determined by the server from the file type), use one of the following depending on whether it should be rendered as plain or HTML text.
Other SSI files may be included and their content dynamically included in the resulting document. To prevent a recursive inclusion of documents the nesting level of SSI documents is limited to five.
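The content-type handling described for #include can be modelled in a few lines of Python. This is a sketch of the documented behaviour only; `render_include` is a hypothetical name, and rejecting other content types is an assumption made here because the text does not say how they are treated.

```python
import html


def render_include(content, content_type):
    """Return the markup that including `content` would contribute."""
    if content_type == "text/html":
        # HTML is included directly and its tags contribute to the markup.
        return content
    if content_type == "text/plain":
        # Plain text is escaped and wrapped so it cannot disturb the markup.
        return "<pre>" + html.escape(content, quote=False) + "</pre>"
    # Assumption: anything else is refused in this sketch.
    raise ValueError(f"unsupported content type: {content_type}")


print(render_include("if a < b & c", "text/plain"))
```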
The #modified directive allows a document author to control the "Last-Modified:"/"If-Modified-Since:"/"304 Not modified" behaviour of an SSI document. See 4.1 Virtual Documents.
Get the last-modified date/time of the current document.
Get the last-modified date/time of the document specified using VMS file specification.
Get the last-modified date/time of the document specified using URL-style syntax.
Compares any "If-Modified-Since:" request header timestamp to the revision date/time obtained using file or virtual (most recent if multiple). If the document timestamp is more recent (has been modified) an appropriate "Last-Modified:" response header field is generated and added to the response, and document processing continues. If it has not been modified a "304" response header is returned (document not modified) and document processing stops.
Adds a "Last-Modified:" response header field using a timestamp retrieved using file or virtual (note: unnecessary if the if-modified-since tag is used).
Adds an "Expires:" response header field. The string literal should be a legitimate RFC-1123 date string. This can be used for pre-expiring documents (so they are always reloaded) by setting it to a date in the not-too-distant past (as in the example below). Of course it could also be used for setting the legitimate future expiry of documents.
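An RFC-1123 date string has a fixed layout that is easy to get wrong by hand. As a hedged illustration, the Python standard library can produce one; the helper name `http_date` is hypothetical and the date chosen is arbitrary.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def http_date(moment):
    """Format an aware datetime as an RFC-1123 date string in GMT."""
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


# A date safely in the past, suitable for pre-expiring a document.
print(http_date(datetime(2000, 1, 1, tzinfo=timezone.utc)))
# prints: Sat, 01 Jan 2000 00:00:00 GMT
```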
The #orif directive (or-if) allows blocks of HTML markup and SSI directives to be conditionally processed, see 4.7 Flow Control and 4.5.14 #IF. In the absence of any real expression parser this directive allows a block to be processed if any one of multiple conditions is met.
The #printenv directive prints a plain-text list of all SSI-specific, then CGI, then document-assigned variables (see 4.6 Variables). This directive is intended for use when debugging flow-controlled SSI documents.
The following link uses the example SSI document WASD_ROOT:[WASDOC.ENV]XSSI.SHTML to demonstrate this.
The #set directive allows a user variable to be assigned or modified, see 4.6 Variables.
Variables are always stored as strings and have a finite but generally usable length. Some comparison tags provided in the flow-control directives treat the contents of variables as numbers. A numeric conversion is done at evaluation time.
The #ssi directive allows multiple SSI directives to be used without the requirement to enclose them in the normal HTML comment tags (i.e. <!-- -->). This helps reduce the clutter in an SSI document that uses the extended capabilities of variable assignment and flow control. Document HTML cannot be included between the opening and closing comment elements of the "#ssi" tag, although of course document output can be generated using the "#echo" tag.
The example SSI document WASD_ROOT:[WASDOC.ENV]XSSI.SHTML will demonstrate this concept.
The #stop directive causes the server to stop processing the virtual document. It can be used with flow control structures to conditionally process only part of a virtual document. Note that the #exit directive is also available; it stops processing of the current file (for nested #includes, etc.).
The SSI processor maintains information about the server, date and time, request path, request parameters, etc., accessible via variable name. Although these server variables cannot be modified by the document the processor also allows the author to create and assign new document variables by name. SSI variables have global scope, with a small number of exceptions listed below. That is, the same set of variables are shared with the parent document by any other SSI documents #included, and any included by those, etc.
One other special-purpose variable is THE_FILE_NAME, see 4.9.1 THE_FILE_NAME.
Server assigned variables comprise some SSI-specific as well as the same CGI variables available to CGI scripts. These may be found listed in the CGI Variables in WASD Scripting document.
The following link provides a list of the SSI and CGI variables available to SSI documents.
Whenever a directive uses information from a tag (see 4.4 Directive Syntax) values from variables may be substituted as a whole or partial value. This is done using curly braces to delimit the variable name. For example
Variables are considered numeric when they begin with a digit. Those beginning with an alphabetic are considered to have a numeric value of zero.
Variables are considered to be boolean false if empty and true when not empty.
It is also possible to extract substrings from variables using the following syntax,
where the start-index begins with the zeroth character and numbers up to the last character in the string, and count may be zero or any positive number. If only one number is supplied it is regarded as a count and the string is extracted from the zeroth character.
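The substitution and substring rules can be modelled in Python as below. This is a sketch under stated assumptions, not the server's parser: the comma-separated notation `{NAME,start,count}` and `{NAME,count}` is assumed here for illustration, undefined variables are assumed to expand to an empty string, and the names `substitute` and `EXAMPLE1` are illustrative.

```python
import re

# Assumed notation: {NAME}, {NAME,count} or {NAME,start,count}.
_REFERENCE = re.compile(r"\{(\w+)(?:,(\d+))?(?:,(\d+))?\}")


def substitute(text, variables):
    """Replace {NAME...} references in a tag value with variable contents."""
    def expand(match):
        name, first, second = match.groups()
        value = variables.get(name, "")      # assumption: undefined is empty
        if first is None:
            return value                     # whole value
        if second is None:
            return value[:int(first)]        # one number: a count from index 0
        start, count = int(first), int(second)
        return value[start:start + count]    # zero-based start, then count
    return _REFERENCE.sub(expand, text)


variables = {"EXAMPLE1": "abcdefgh"}
print(substitute("whole={EXAMPLE1} first3={EXAMPLE1,3} mid={EXAMPLE1,2,3}", variables))
```

With the value "abcdefgh" this prints `whole=abcdefgh first3=abc mid=cde`.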
The example SSI document WASD_ROOT:[WASDOC.ENV]XSSI.SHTML can demonstrate these concepts.
WASD SSI allows blocks of document to be conditionally processed. This uses constructs in a similar way to any programming language. The emphasis has been on simplicity and speed of processing. No complex expression parser is provided. Despite this, complex document constructs can be implemented. Flow control structures may be nested up to eight levels.
The "#if", "#orif" and "#elif" directives must provide an evaluation. This can be a single variable, which if numeric is considered true when non-zero and false when zero, or if a string is false when empty and true when not empty. Tests can be made against the variable which when evaluated return true or false. Multiple tests may be made against the one variable, or against more than one variable. Multiple tests act as a logical AND of the results and terminate when the first fails.
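The evaluation rules can be expressed compactly in Python. The sketch below is a model of the description, not the server's implementation; all names are hypothetical, and treating a value such as "12abc" as the number 12 is an assumption drawn from the statement that variables beginning with a digit are numeric.

```python
DIGITS = "0123456789"


def numeric_value(value):
    """Leading digits as an integer; anything not starting with a digit is zero."""
    digits = ""
    for ch in value:
        if ch not in DIGITS:
            break
        digits += ch
    return int(digits) if digits else 0


def is_true(value):
    """Truth of a bare variable: numbers by non-zero, strings by non-empty."""
    if value and value[0] in DIGITS:
        return numeric_value(value) != 0
    return value != ""


def evaluate(tests):
    """Logical AND over (negate, predicate) pairs, stopping at the first failure."""
    for negate, predicate in tests:
        result = predicate()
        if negate:
            result = not result
        if not result:
            return False
    return True


count = "3"
print(evaluate([(False, lambda: is_true(count)),
                (False, lambda: numeric_value(count) < 10)]))
```

Passing the tests as callables lets the sketch stop at the first failing test without evaluating the rest, matching the terminate-on-first-failure rule.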
Any evaluation can have the result negated by prefixing it with an exclamation point. For instance, the first of these examples would produce a false result, the second true.
The following is a simple illustration of variable setting, use of variable substrings, and conditional processing of document blocks.
The example SSI document WASD_ROOT:[WASDOC.ENV]XSSI.SHTML further illustrates these concepts.
A query string may be passed to an SSI document in much the same way as to a CGI script. In this way the behaviour of the document can be varied according to information explicitly passed to it when accessed. To prevent the server's default query engine being given the request, precede any query string with "?httpd=ssi". The server detects this and passes the request instead to the SSI processor. Just append the desired query string components to this as if they were form elements. For example:
The following link uses the example SSI document WASD_ROOT:[WASDOC.ENV]XSSI.SHTML to demonstrate this. Look for the "(FORM_TEST1=one)", etc.
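A URL of this kind can be assembled as in the following Python sketch. The helper name `ssi_query_url`, the document path and the field names are hypothetical, and joining the fields with "&" is an assumption based on their being treated as form elements.

```python
from urllib.parse import urlencode


def ssi_query_url(path, fields):
    """Build a URL that hands `fields` to an SSI document, not the query engine."""
    # "httpd=ssi" comes first; the fields follow as ordinary form elements.
    query = urlencode(fields)
    return path + "?httpd=ssi" + ("&" + query if query else "")


print(ssi_query_url("/docs/example.shtml", {"test1": "one", "test2": "two"}))
# prints: /docs/example.shtml?httpd=ssi&test1=one&test2=two
```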
Documents may be specified using either the "FILE" or "VIRTUAL" tags.
The "FILE" tag expects an absolute VMS file specification.
The "VIRTUAL" tag expects an URL-style path to a document. This can be an absolute or relative path. See 2.3 Document Specification for further details.
Generally, when an error is encountered document processing halts and an error report is generated. Some common circumstances, in particular the existence or not of a particular file, may require an alternative action. For file activities (e.g. #include, #flastmod, #fcreated, #fsize) the optional fmt="" tag provides some measure of control over error behaviour. If the format string begins with a "?" files not found are not reported as errors and processing continues. Other file system errors, such as directory not found, syntax errors, etc., are always reported.
Every time a file is accessed (e.g. #include, #flastmod) the server variable THE_FILE_NAME gets set to that name if successful, or reset to empty if unsuccessful. This variable can be checked to determine success or otherwise.
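The interaction between the "?" format prefix and THE_FILE_NAME can be modelled as follows. This Python sketch uses an in-memory mapping in place of the file system; the function name `access_file` and its parameters are hypothetical.

```python
def access_file(path, files, variables, fmt=""):
    """Model a file access (e.g. an include) against the `files` mapping."""
    if path in files:
        variables["THE_FILE_NAME"] = path   # set to the name on success
        return files[path]
    variables["THE_FILE_NAME"] = ""         # reset to empty on failure
    if fmt.startswith("?"):
        return ""                           # not found is tolerated, carry on
    raise FileNotFoundError(path)           # otherwise reported as an error


files = {"/site/news.html": "<p>Latest news</p>"}
variables = {}
text = access_file("/site/missing.html", files, variables, fmt="?")
# An empty THE_FILE_NAME tells the document the file was not found.
print(repr(text), repr(variables["THE_FILE_NAME"]))
```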
Whenever a time directive is used an optional tag can be included to specify the format of the output. The default looks a little VMS-ish. If a format specification is made it must conform to the C programming language function strftime().
The format specifier follows a similar syntax to the C standard library printf() family of functions, where conversion specifiers are introduced by percentage symbols. Here are some example uses:
A problem with any supplied time formatting specification will be reported.
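Format strings can be tried out away from the server, since Python's strftime() accepts the same style of conversion specifiers as the C function. The sketch below uses an arbitrary fixed date and numeric specifiers only, so the result does not depend on the locale.

```python
from datetime import datetime

moment = datetime(1999, 12, 31, 23, 59, 58)

print(moment.strftime("%d/%m/%Y"))       # day/month/year with century
print(moment.strftime("%H:%M:%S"))       # 24-hour time
print(moment.strftime("day %j of %Y"))   # day of the year
```

These print `31/12/1999`, `23:59:58` and `day 365 of 1999` respectively.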
The following table provides the general conversion specifiers. For further information on the formatting process refer to a C programming library document on the strftime() function.
|a||The locale's abbreviated weekday name|
|A||The locale's full weekday name|
|b||The locale's abbreviated month name|
|B||The locale's full month name|
|c||The locale's appropriate date and time representation|
|C||The century number (the year divided by 100 and truncated to an integer) as a decimal number (00 - 99)|
|d||The day of the month as a decimal number (01 - 31)|
|D||Same as %m/%d/%y|
|e||The day of the month as a decimal number (1 - 31) in a 2 digit field with the leading space character fill|
|Ec||The locale's alternative date and time representation|
|EC||The name of the base year (period) in the locale's alternative representation|
|Ex||The locale's alternative date representation|
|EX||The locale's alternative time representation|
|Ey||The offset from the base year (%EC) in the locale's alternative representation|
|EY||The locale's full alternative year representation|
|h||Same as %b|
|H||The hour (24-hour clock) as a decimal number (00 - 23)|
|I||The hour (12-hour clock) as a decimal number (01 - 12)|
|j||The day of the year as a decimal number (001 - 366)|
|m||The month as a decimal number (01 - 12)|
|M||The minute as a decimal number (00 - 59)|
|n||The newline character|
|Od||The day of the month using the locale's alternative numeric symbols|
|Oe||The day of the month using the locale's alternative numeric symbols|
|OH||The hour (24-hour clock) using the locale's alternative numeric symbols|
|OI||The hour (12-hour clock) using the locale's alternative numeric symbols|
|Om||The month using the locale's alternative numeric symbols|
|OM||The minutes using the locale's alternative numeric symbols|
|OS||The seconds using the locale's alternative numeric symbols|
|Ou||The weekday as a number in the locale's alternative representation (Monday=1)|
|OU||The week number of the year (Sunday as the first day of the week) using the locale's alternative numeric symbols|
|OV||The week number of the year (Monday as the first day of the week) as a decimal number (01 - 53) using the locale's alternative numeric symbols. If the week containing January 1 has four or more days in the new year, it is considered as week 1. Otherwise, it is considered as week 53 of the previous year, and the next week is week 1.|
|Ow||The weekday as a number (Sunday=0) using the locale's alternative numeric symbols|
|OW||The week number of the year (Monday as the first day of the week) using the locale's alternative numeric symbols|
|Oy||The year without the century using the locale's alternative numeric symbols|
|p||The locale's equivalent of the AM/PM designations associated with a 12-hour clock|
|r||The time in AM/PM notation|
|R||The time in 24-hour notation (%H:%M)|
|S||The second as a decimal number (00 - 61)|
|t||The tab character|
|T||The time (%H:%M:%S)|
|u||The weekday as a decimal number between 1 and 7 (Monday=1)|
|U||The week number of the year (the first Sunday as the first day of week 1) as a decimal number (00 - 53)|
|V||The week number of the year (Monday as the first day of the week) as a decimal number (01 - 53). If the week containing January 1 has four or more days in the new year, it is considered as week 1. Otherwise, it is considered as week 53 of the previous year, and the next week is week 1.|
|w||The weekday as a decimal number (0 [Sunday] - 6)|
|W||The week number of the year (the first Monday as the first day of week 1) as a decimal number (00 - 53)|
|x||The locale's appropriate date representation|
|X||The locale's appropriate time representation|
|y||The year without century as a decimal number (00 - 99)|
|Y||The year with century as a decimal number|
|Z||Timezone name or abbreviation. If timezone information is not available, no character is output.|
Essential compatibility with OSU Server Side Includes directives is provided. This is intended to ease any transition to WASD, as existing SSI documents will not need to be changed unless any of the WASD capabilities are required. To provide transparent processing of OSU .HTMLX files ensure the following WASD configuration is in place.
In HTTPD$CONFIG file:
Note that the content description must contain the string "OSU" to activate some compliancy behaviours.
In HTTPD$MAP file:
This provides a mechanism for the OSU part-document facility. (Yes, the "__part" has two leading underscores!)
The following OSU directives are provided specifically for OSU compatibility, although there is no reason why most of these may not also be deployed in general WASD SSI documents if there is a requirement. Note that these are OSU-specifics, other OSU directives are provided by the standard WASD SSI engine.
|#begin label [label]||delimit a part-document (see ‘OSU "Part"s’ in 4.11 OSU Compatibility)|
|#config verify=1||enable commented-tag trace output|
|#echo accesses||document access count|
|#echo accesses_ordinal||document access count|
|#echo getenv=""||output logical or symbol|
|#echo hw_name||system hardware name|
|#echo server_name||HTTPd server host name|
|#echo server_version||HTTPd software version|
|#echo vms_version||HTTPd system version of VMS|
|#end label [label]||delimit a part-document (see ‘OSU "Part"s’ in 4.11 OSU Compatibility)|
|#include [file|virtual]="" part="label"||include only part of a virtual document|
If WASD is configured for OSU SSI compatibility the following link provides an online demonstration as well as further explanation of the OSU SSI engine using an OSU preprocessor document from the distribution (included within copyright compliance).
How do we know WASD is processing it? Look for the #echo var="GETENV=SYS$REM_ID" towards the end of the document. It should indicate "[VARIABLE_DOES_NOT_EXIST!]" because it's attempting to output a DECnet-related logical name!
The OSU processor allows for delimited subsections of an #included document, or a URL referenced document for that matter, to be included in the output. This is supported, but only for compatibility. It is only enabled for ".HTMLX" documents and if otherwise used may interact unexpectedly with WASD SSI flow-control.
It is possible to have script output passed back through the SSI engine for markup. This approach might allow script output to be automatically wrapped in standard site headers and footers, for example. Essentially the script must output an SSI-markup response body and include in the otherwise standard CGI response header a field containing "Script-Control: X-content-handler=SSI". The following example in DCL shows the essential elements of such a script.
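The same elements can also be sketched in Python, purely as an illustration of the response a script would need to produce. The function name `ssi_script_response` and the header and footer paths are hypothetical placeholders; only the "Script-Control: X-content-handler=SSI" field is taken from the description above.

```python
def ssi_script_response(body):
    """Build a CGI response whose body is to be handed to the SSI engine."""
    header = (
        "Content-Type: text/html\n"
        "Script-Control: X-content-handler=SSI\n"
        "\n"                                  # blank line ends the header
    )
    return header + body


# Hypothetical body: the include paths are placeholders, not real site files.
body = (
    '<!--#include virtual="/site/header.shtml" -->\n'
    "<p>Output generated by the script.</p>\n"
    '<!--#include virtual="/site/footer.shtml" -->\n'
)

if __name__ == "__main__":
    print(ssi_script_response(body), end="")
```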